Improving Viability of Electric Taxis by Taxi Service Strategy Optimization: A Big Data Study of New York City
Electrification of transportation is critical for a low-carbon society. In particular, public vehicles (e.g., taxis) provide a crucial opportunity for electrification. Despite the benefits of eco-friendliness and energy efficiency, adoption of electric taxis faces several obstacles, including constrained driving range, long recharging duration, limited charging stations, and low gasoline prices, all of which discourage taxi drivers from switching to electric taxis. On the other hand, the popularity of ride-hailing mobile apps facilitates the computerization and optimization of taxi service strategies, which can provide computer-assisted navigation and roaming decisions that help taxi drivers locate potential customers. This paper examines the viability of electric taxis with the assistance of taxi service strategy optimization, in comparison with conventional taxis with internal combustion engines. A big data study is provided using a large data set of real-world taxi trips in New York City (NYC). Our methodology is to first model the computerized taxi service strategy as a Markov decision process, and then obtain the optimized taxi service strategy from the NYC taxi trip data set. The profitability of electric taxi drivers is studied empirically under various battery capacities and charging conditions. Consequently, we shed light on solutions that can improve the viability of electric taxis.
In this approach, every pair of values in the dataset must be compared. The dominant value observed in each comparison can later be used to find the final top-k dominant values. Identifying the top-k dominant values can be challenging with big data: because the number of comparisons grows quadratically with the size of the dataset, the processing time can become astronomical even if the cost of each individual comparison is negligible. Therefore, this approach may not be the best way to solve this problem.
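The pairwise comparison described above can be sketched as follows. This is a minimal illustration rather than a reference implementation: it assumes complete (no missing) values, assumes that smaller values are preferred in every dimension, and uses the hypothetical names `dominates` and `naive_top_k_dominating`.

```python
from typing import List, Sequence, Tuple


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """Return True if p dominates q.

    Assumption: smaller is better, so p must be no larger than q in every
    dimension and strictly smaller in at least one.
    """
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def naive_top_k_dominating(
    points: List[Sequence[float]], k: int
) -> List[Tuple[int, int]]:
    """Score every point by how many other points it dominates.

    Returns the k best (index, score) pairs. Every ordered pair of points
    is compared, so the running time is quadratic in len(points).
    """
    scored = []
    for i, p in enumerate(points):
        score = sum(
            1 for j, q in enumerate(points) if i != j and dominates(p, q)
        )
        scored.append((i, score))
    # Highest score first; ties are broken by the smaller index.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:k]
```

For the four points `(1, 1)`, `(2, 2)`, `(3, 1)`, `(2, 3)`, the first point dominates the other three and the second dominates only the last one, so a query with `k = 2` returns those two points with scores 3 and 1.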
Several algorithms have been proposed to answer top-k dominating (TKD) queries more efficiently than the naive approach. Prior work shows that top-k dominating queries can be applied to incomplete datasets using Skyline query processing. The Skyline algorithm separates the values into buckets, dividing the dataset into chunks whose members have the same missing-value dimensions. The buckets are easier and faster to process. By computing the top-k dominant values for each bucket separately and then combining them, it becomes possible to determine the top-k values of the whole dataset. The Upper Bound Based algorithm is another method, which finds the top scores using bit-wise comparisons between values; it is covered later in the paper.
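The bucketing step can be illustrated with a short sketch. It assumes that a missing value is represented by `None` and covers only the grouping of points by their missing-value pattern; the per-bucket processing and the combination of bucket results are not shown. The function name `bucket_by_missing_pattern` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple


def bucket_by_missing_pattern(
    points: List[Sequence[Optional[float]]],
) -> Dict[Tuple[bool, ...], List[int]]:
    """Group point indices by which dimensions are missing.

    Assumption: a missing value is encoded as None. The bucket key is a
    tuple of booleans, one per dimension, where True marks a missing value.
    """
    buckets: Dict[Tuple[bool, ...], List[int]] = defaultdict(list)
    for index, point in enumerate(points):
        pattern = tuple(value is None for value in point)
        buckets[pattern].append(index)
    return dict(buckets)
```

Within one bucket all points are observed on the same dimensions, so they can be compared directly on those dimensions without special handling of missing values.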
In this paper, we employ a Markov Decision Process to model the computerized taxi service strategy and optimize the strategy for taxi drivers under the operational constraints of electric taxis. We evaluate the effectiveness of the optimal policy of the Markov Decision Process using a big data study of real-world taxi trips in New York City. The optimal policy can be implemented in an intelligent recommender system for taxi drivers, which becomes especially viable with the advent of autonomous vehicles. Our evaluation shows that, with a battery capacity of at least 50 kWh, computerized service strategy optimization allows electric taxi drivers to earn net revenues comparable to those of drivers of internal combustion engine (ICE) taxis who also employ computerized service strategy optimization. Hence, this sheds light on the viability of electric taxis.
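To make the notion of an optimal policy concrete, the following toy sketch applies value iteration to a deliberately simplified Markov Decision Process. It is not the model evaluated in this paper: the state is only a discrete battery level, the two actions (`serve` and `charge`) are deterministic, and the parameters `fare`, `charge_gain`, and `gamma` are illustrative assumptions rather than values estimated from the NYC data set.

```python
from typing import List, Tuple


def value_iteration(
    capacity: int,
    fare: float = 10.0,      # assumed revenue of one served trip
    charge_gain: int = 2,    # assumed battery units gained per charging step
    gamma: float = 0.9,      # discount factor
    tol: float = 1e-6,
) -> Tuple[List[float], List[str]]:
    """Solve a toy battery-level MDP by value iteration.

    State: battery level 0..capacity.
    serve:  earns `fare`, consumes one battery unit (needs level >= 1).
    charge: earns nothing, adds `charge_gain` units, capped at capacity.
    """

    def action_values(level: int, values: List[float]) -> Tuple[float, float]:
        charge = gamma * values[min(capacity, level + charge_gain)]
        if level >= 1:
            serve = fare + gamma * values[level - 1]
        else:
            serve = float("-inf")  # cannot serve with an empty battery
        return serve, charge

    values = [0.0] * (capacity + 1)
    while True:
        new_values = [max(action_values(b, values)) for b in range(capacity + 1)]
        delta = max(abs(n - v) for n, v in zip(new_values, values))
        values = new_values
        if delta < tol:
            break

    # Extract the greedy policy from the converged value function.
    policy = []
    for b in range(capacity + 1):
        serve, charge = action_values(b, values)
        policy.append("serve" if serve > charge else "charge")
    return values, policy
```

Calling `value_iteration(5)` returns one value and one recommended action per battery level. In this toy model the recommendation at an empty battery is necessarily `charge`, since serving is not possible there.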
 Y. Wang, X. Li, X. Li, and Y. Wang, “A survey of queries over uncertain data,” Knowledge and Information Systems, vol. 37, no. 3, pp. 485–530, 2013.
 M. E. Khalefa, M. F. Mokbel, and J. J. Levandoski, “Skyline Query Processing for Incomplete Data,” in Proceedings of the 2008 IEEE 24th International Conference on Data Engineering, ser. ICDE ’08. Washington, DC, USA: IEEE Computer Society, 2008, pp. 556–565.
 D. Papadias, G. Fu, and B. Seeger, Progressive Skyline Computation in Database Systems, 2005, vol. 30, no. 1.
 X. Miao, Y. Gao, B. Zheng, G. Chen, and H. Cui, “Top-k dominating queries on incomplete data,” 2016 IEEE 32nd International Conference on Data Engineering, ICDE 2016, vol. 28, no. 1, pp. 1500–1501, 2016.
 X. Lian and L. Chen, “Probabilistic top-k dominating queries in uncertain databases,” Information Sciences, vol. 226, pp. 23–46, 2013.
 S. Ge, U. Leong Hou, N. Mamoulis, and D. W. L. Cheung, “Dominance relationship analysis with budget constraints,” Knowledge and Information Systems, pp. 1–32, 2013.
 M. L. Yiu and N. Mamoulis, “Multi-dimensional top-k dominating queries,” The VLDB Journal, vol. 18, no. 3, pp. 695–718, 2009.
 N. Mamoulis, K. H. Cheng, M. L. Yiu, and D. W. Cheung, “Efficient aggregation of ranked inputs,” Proceedings – International Conference on Data Engineering, vol. 2006, no. X, p. 72, 2006.
 X. Lian and L. Chen, “Top-k Dominating Queries in Uncertain Databases,” in Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology, ser. EDBT ’09. New York, NY, USA: ACM, 2009, pp. 660–671.
 M. Hua, J. Pei, W. Zhang, and X. Lin, “Efficiently answering probabilistic threshold top-k queries on uncertain data,” Proceedings – International Conference on Data Engineering, pp. 1403–1405, 2008.
 X. Han, J. Li, and H. Gao, “TDEP: efficiently processing top-k dominating query on massive data,” Knowledge and Information Systems, vol. 43, no. 3, pp. 689–718, 2015.
 E. Tiakas and G. Valkanas, “Metric-Based Top-k Dominating Queries,” Proc. 17th International Conference on Extending Database Technology (EDBT), pp. 415–426, 2014.
 B. J. Santoso and G. M. Chiu, “Close dominance graph: An efficient framework for answering continuous top-k dominating queries,” IEEE Transactions on Knowledge and Data Engineering, vol. 26, no. 8, pp. 1853–1865, 2014.
 C. Lofi, K. El Maarry, and W.-T. Balke, “Skyline Queries in Crowd-enabled Databases,” in Proceedings of the 16th International Conference on Extending Database Technology, ser. EDBT ’13. New York, NY, USA: ACM, 2013, pp. 465–476.
 X. Han, X. Liu, J. Li, and H. Gao, “TKAP: Efficiently processing top-k query on massive data by adaptive pruning,” Knowledge and Information Systems, vol. 47, no. 2, pp. 301–328, 2016.