This paper deals with city-wide air quality estimation with limited air quality monitoring stations which are geographically sparse. Since air pollution is influenced by urban dynamics (e.g., meteorology and traffic) which are available throughout the city, we can infer the air quality in regions without monitoring stations based on such spatial-temporal (ST) heterogeneous urban big data. However, big data-enabled estimation poses three challenges. The first challenge is data diversity, i.e., there are many different categories of urban data, some of which may be useless for the estimation. To overcome this, we extend Granger causality to the ST space to analyze all the causality relations in a consistent manner. The second challenge is the computational complexity due to processing the massive volume of data. To overcome this, we introduce the non-causality test to rule out urban dynamics that do not “Granger” cause air pollution, and the region of influence (ROI), which enables us to only analyze data with the highest causality levels. The third challenge is to adapt our grid-based algorithm to non-grid-based applications. By developing a flexible grid-based estimation algorithm, we can decrease the inaccuracies due to grid-based algorithm while maintaining computation efficiency.