Big science discovery requires an efficient computing framework in the high performance computing architecture. Traditional scientific data analysis relies on Message Passing Interface (MPI) and MPI-IO to achieve fast computing and low I/O bottleneck. Among them, two-phase collective I/O is commonly used to reduce data movement by optimizing the non-contiguous I/O pattern. However, the inherent constraint of collective I/O prevents it from having a flexible combination with computing and lacks an efficient non-blocking I/O-Computing framework in current HPC. In this work, we propose Collective Computing, a framework that breaks the constraint of the two-phase collective I/O and provides an efficient non-blocking computing paradigm with runtime support. The fundamental idea is to move the analysis stage in advance and insert the computation into the two-phase I/O, such that the data in the first I/O phase can be computed in place and the second shuffle phase is minimized with a reduce operation. We motivate this idea by profiling the I/O and CPU usage. With both theoretical analysis and evaluation on real application and benchmarks, we show that the collective computing can achieve 2.5X speedup and is promising in big scientific data analysis.