Auditing Big Data Storage in Cloud Computing Using Divide and Conquer Tables
Cloud computing has emerged as the mainstream platform for the utility computing paradigm, offering reliable and robust infrastructure for storing data remotely and providing on-demand applications and services. Organizations that produce huge volumes of sensitive data now leverage data outsourcing to reduce the burden of local data storage and maintenance. However, outsourced data in the cloud are not always trustworthy, because data owners lack physical control over them. To address this issue, researchers have focused on mitigating the security threats by designing remote data checking (RDC) techniques. However, the majority of these techniques are inapplicable to big data storage because they incur huge computation cost on the user and cloud sides. Existing schemes also suffer from the data dynamicity problem in two respects. First, some are applicable only to static archive data and cannot audit dynamically updated outsourced data. Second, although some existing methods support dynamic data updates, increasing the number of update operations imposes high computation and communication cost on the auditor because of the maintenance of the underlying data structure, i.e., the Merkle hash tree. This paper presents an efficient RDC method based on algebraic properties of the outsourced files in cloud computing, which incurs minimal computation and communication cost. The main contribution of this paper is a new data structure, called the Divide and Conquer Table (D&CT), which efficiently supports dynamic data for normal file sizes. Moreover, this data structure makes our method applicable to large-scale data storage with minimum computation cost. A one-way analysis of variance shows significant differences between the proposed method and the existing methods in terms of computation and communication cost on the auditor and the cloud.
The existing RDC methods use diverse types of data structures (e.g., binary trees) to prove that the outsourced data remain intact. However, such data structures are applicable only to auditing files of normal size. This is because updating even a small number of data blocks of a large-scale file requires re-balancing a huge number of data blocks, which incurs noticeable computation overhead on the auditor. Moreover, in contrast to archival data, which are rarely updated, some large-scale data sets in specific applications (e.g., Twitter) are intrinsically subject to frequent updates from users. Supporting such frequent update operations with traditional data structures imposes considerable computation cost on the auditor, because a large number of data blocks must be re-arranged many times.
We propose an efficient RDC method to audit the integrity of data in cloud computing on the basis of algebraic properties of the outsourced files. We also design a new data structure (D&CT), which allows the auditor to efficiently handle frequent updates to the outsourced data. The D&CT makes our method applicable to auditing big data storage efficiently. Furthermore, we prove the security of our scheme, implement it to evaluate its performance for normal file sizes (10 MB–50 MB) and large-scale files (1 GB–1 TB), and compare it with well-known and state-of-the-art RDC methods in terms of computation and communication cost.
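To illustrate why dividing the table can reduce update cost, the following Python sketch models a block index table split into sub-tables. It is a simplified illustration under our own assumptions, not the exact D&CT layout: the class name `DivideAndConquerTable`, the entry fields `logical_number` and `version`, and the parameter `k` (number of sub-tables) are all hypothetical. The point it shows is that inserting or deleting a block shifts entries only inside one sub-table rather than across the whole table.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    logical_number: int  # assumed: original index assigned to the block
    version: int = 1     # assumed: incremented on every modification


class DivideAndConquerTable:
    """Hypothetical sketch: a block index split into up to k sub-tables."""

    def __init__(self, n_blocks: int, k: int):
        if n_blocks <= 0 or k <= 0:
            raise ValueError("n_blocks and k must be positive")
        size = -(-n_blocks // k)  # ceiling division: entries per sub-table
        self.parts = [
            [Entry(ln) for ln in range(start, min(start + size, n_blocks))]
            for start in range(0, n_blocks, size)
        ]
        self.next_ln = n_blocks  # logical number for the next new block

    def _locate(self, position: int):
        """Map a global block position to (sub-table index, local index)."""
        for p, part in enumerate(self.parts):
            if position < len(part):
                return p, position
            position -= len(part)
        raise IndexError("position out of range")

    def insert_after(self, position: int) -> int:
        """Insert a new block; return how many entries had to be shifted."""
        p, i = self._locate(position)
        part = self.parts[p]
        shifted = len(part) - (i + 1)  # only entries in this sub-table move
        part.insert(i + 1, Entry(self.next_ln))
        self.next_ln += 1
        return shifted

    def modify(self, position: int) -> None:
        """Record a block modification by bumping its version number."""
        p, i = self._locate(position)
        self.parts[p][i].version += 1

    def delete(self, position: int) -> int:
        """Delete a block; return how many entries had to be shifted."""
        p, i = self._locate(position)
        part = self.parts[p]
        del part[i]
        return len(part) - i


flat = DivideAndConquerTable(1000, k=1)      # one table: behaves like a flat index
divided = DivideAndConquerTable(1000, k=10)  # ten sub-tables of 100 entries each
print(flat.insert_after(5), divided.insert_after(5))
```

In this toy model, the number of shifted entries per insertion is bounded by the sub-table size (about n/k) instead of the table size n, which is the intuition behind using a divided table for large files with frequent updates.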
This paper presented an efficient RDC scheme to ensure data storage security in cloud computing. To achieve this goal, we employed the properties of algebraic signatures, which enable our method to validate the integrity of the outsourced data while reducing the computation cost on the auditor and cloud sides. By introducing the D&CT as a new data structure, our RDC method supports dynamic block update operations. The D&CT also allows the verifier to audit large-scale files and perform a large number of update operations with minimal computation cost on the verifier and the server. The security and performance analysis demonstrated the efficiency and provable security of our scheme. The majority of single-server data auditing methods can only detect data corruption, because they lack the capability to recover data. Therefore, as part of future work, we will extend our scheme to distributed cloud servers, which will help the data owner (DO) restore corrupted data by using the remaining healthy servers.
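The algebraic property that such schemes rely on can be illustrated with a small Python sketch. It assumes a common formulation of algebraic signatures, sig(b_0, ..., b_{n-1}) = b_0 + b_1·α + ... + b_{n-1}·α^{n-1} over a Galois field; the choice of GF(2^8) with reduction polynomial 0x11D and α = 2, as well as the function names, are our assumptions for illustration and are not taken from the scheme itself. The sketch shows that the signature of the XOR of two blocks equals the XOR of their signatures, which is what allows signatures of several blocks to be combined and checked without processing each block separately.

```python
def gf_mul(a: int, b: int, poly: int = 0x11D) -> int:
    """Multiply two elements of GF(2^8) (assumed field, polynomial 0x11D)."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= poly        # reduce modulo the field polynomial
        b >>= 1
    return result


def algebraic_signature(block: bytes, alpha: int = 2) -> int:
    """Compute sum_i block[i] * alpha^i over GF(2^8)."""
    sig = 0
    power = 1                # alpha^0
    for byte in block:
        sig ^= gf_mul(byte, power)
        power = gf_mul(power, alpha)
    return sig


def xor_blocks(x: bytes, y: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(a ^ b for a, b in zip(x, y))


x = bytes([1, 2, 3, 4])
y = bytes([9, 8, 7, 6])

# Linearity: the signature of the combined block equals the combined signatures.
combined = algebraic_signature(xor_blocks(x, y))
separate = algebraic_signature(x) ^ algebraic_signature(y)
print(combined == separate)

# Changing a single byte always changes the signature in this model.
tampered = bytes([1, 2, 3, 5])
print(algebraic_signature(tampered) != algebraic_signature(x))
```

A one-byte signature is used here only to keep the example short; a practical scheme would use a larger field or several signature components to make undetected corruption unlikely.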