EFFICIENT CLUE-BASED ROUTE SEARCH ON ROAD NETWORKS

 

ABSTRACT

With the advances in geo-positioning technologies and location-based services, it is nowadays quite common for roadnetworks to have textual contents on the vertices. Previous work on identifying an optimal route that covers a sequence of querykeywords has been studied in recent years. However, in many practical scenarios, an optimal route might not always be desirable.For example, a personalized route query is issued by providing some clues that describe the spatial context between PoIs along theroute, where the result can be far from the optimal one. Therefore, in this paper, we investigate the problem of clue-based route search(CRS), which allows a user to provide clues on keywords and spatial relationships. First, we propose a greedy algorithm and a dynamicprogramming algorithm as baselines. To improve efficiency, we develop a branch-and-bound algorithm that prunes unnecessary verticesin query processing. In order to quickly locate candidate, we propose an AB-tree that stores both the distance and keyword informationin tree structure. To further reduce the index size, we construct a PB-tree by utilizing the virtue of 2-hop label index to pinpoint thecandidate. Extensive experiments are conducted and verify the superiority of our algorithms and index structures.

PROPOSED SYSTEM:

The principal contributions of this paper can be summarized asfollows._ We propose a greedy clue search algorithm (GCS) toanswer the CRS query approximately with no indexinvolved. In GCS, we adopt the network expansion approachto greedily select the current best candidate at eachstep to construct feasible paths._ We also develop a clue-based dynamic programming algorithm(CDP) that attempts to enumerate all feasible pathsand finally returns the optimal result. In CDP, distanceoracle is used to compute the network distance betweencandidates._ We further propose a branch-and-bound algorithm (BAB)by applying filter-and-refine paradigm such that only asmall portion of vertices are visited, hence improves thesearch efficiency. In order to quickly locate the candidatevertices, we develop AB-tree and PB-tree structures tospeed up the tree traversal, as well as a semi-dynamicindex updating mechanism to keep the index maintainablewhen growing bigger._ Our experimental evaluation demonstrates the efficiencyof our algorithms and index structures for processingthe CRS queries on real-world datasets. We show thesuperiority of our algorithms in answering CRS whencompared with the baseline algorithms.

EXISTING SYSTEM:

SEarching geo-textual objects with query location and keywordshas gained increasing attention recently due to the popularityof location-based services. In Euclidean space, IR-tree integrates signature files and R-tree to answer boolean keywordqueries. IR-tree is an R-tree augmented with inverted filesthat supports the ranking of objects based on a score functionof spatial distance and text relevancy. Cao et al. proposes alocation-aware top-k prestige-based text retrieval (LkPT) query,to retrieve the top-k spatial web objects ranked according toboth prestige-based text relevance (PR) and location proximity.provides an all-round survey of 12 state-of-art geo-textualindices and proposes a benchmark that enables the comparison ofthe spatial keyword query performance. Zhang et al. proposes the m closet keyword query (mCK query) which aimsto find the closest objects that match the query keywords andtheir distance diameter is minimized. Recently, Guo et al.propose approximation algorithms to solve the mCK query witha ratio of (2p+ _). Cao et al. propose a collective spatialkeyword query, in which a different semantics is taken such that3the group of objects in the result covers the query keywords andhas the lowest cost. Li et al. studies the problem of directionawarespatial keyword search, which aims at finding the k nearestneighbours to the query that contain all input keywords and satisfythe direction constraint. Rocha et al. address the problemof processing top-k spatial keyword queries on road networkswhere the distance between the query location and the spatialobject is the length of shortest path. ROAD organizes theroad network as a hierarchy of subgraphs, and connects themby adding shortcuts.

CONCLUSION AND FUTURE DIRECTIONS

In this paper, we study the problem of CRS on road networks,which aims to find an optimal route such that it covers a setof query keywords in a given specific order, and the matchingdistance is minimized. To answer the CRS query, we first proposea greedy clue-based algorithm GCS with no index where the networkexpansion approach is adopted to greedily select the currentbest candidates to construct feasible paths. Then, we devise anexact algorithm, namely clue-based dynamic programming CDP,to answer the query that enumerates all feasible paths and finallyreturns the optimal result. To further reduce the computationaloverhead, we propose a branch-and-bound algorithm BAB byapplying filter-and-refine paradigm such that only a small portionof vertices are visited, thus improves the search efficiency. In orderto quickly locate the candidate vertices, we develop AB-tree andPB-tree structures to speed up the tree traversal, as well as a semidynamicindex updating mechanism. Results of empirical studiesshow that all the proposed algorithms are capable of answeringCRS query efficiently, while the BAB algorithm runs much faster,and the index size of PB-tree is much smaller than AB-tree.Several directions for future research are promising. First,users may prefer a more generic preference model, which combinesPoI rating, PoI average menu price, etc, in the query clue.Second, it is of interest to take temporal information into accountand further extend the CRS query. Each PoI is assigned witha opening hours time interval [T], and each clue containsa visiting time t, where the resulting query aims to find a pathsuch that the time interval of each matched PoI covers the visitingtime. Third, requiring users to provide exact keyword match isdifficult sometimes as they are just providing “clue”, which maybe inaccurate in nature. Thus, it is of interest to extend our modelto support the approximate keyword match. Hence, the matchingdistance can be modified by incorporating both spatial distanceand textual distance together through a linear combination.

REFERENCES

[1] I. Abraham, D. Delling, A. V. Goldberg, and R. F. Werneck. Hierarchicalhub labelings for shortest paths. In ESA, pages 24–35. Springer, 2012.

[2] T. Akiba, Y. Iwata, K.-i. Kawarabayashi, and Y. Kawata. Fast shortestpathdistance queries on road networks by pruned highway labeling. InALENEX, pages 147–154. SIAM, 2014.

[3] T. Akiba, Y. Iwata, and Y. Yoshida. Fast exact shortest-path distancequeries on large networks by pruned landmark labeling. In SIGMOD,pages 349–360. ACM, 2013.

[4] T. Akiba, Y. Iwata, and Y. Yoshida. Dynamic and historical shortestpathdistance queries on large evolving networks by pruned landmarklabeling. In WWW, pages 237–248. ACM, 2014.

[5] J. L. Bentley and J. B. Saxe. Decomposable searching problems i. staticto-dynamictransformation. Journal of Algorithms, 1(4):301–358, 1980.

[6] X. Cao, L. Chen, G. Cong, and X. Xiao. Keyword-aware optimal routesearch. PVLDB, 5(11):1136–1147, 2012.

[7] X. Cao, G. Cong, and C. S. Jensen. Retrieving top-k prestige-basedrelevant spatial web objects. PVLDB, 2010.

[8] X. Cao, G. Cong, C. S. Jensen, and B. C. Ooi. Collective spatial keywordquerying. In SIGMOD, pages 373–384. ACM, 2011.

[9] H. Chen, W.-S. Ku, M.-T. Sun, and R. Zimmermann. The multi-rulepartial sequenced route query. In SIGSPATIAL, page 10. ACM, 2008.

[10] L. Chen, G. Cong, C. S. Jensen, and D. Wu. Spatial keyword queryprocessing: an experimental evaluation. PVLDB, 2013.

[11] N. Christofides. Worst-case analysis of a new heuristic for the travellingsalesman problem. Technical report, DTIC Document, 1976