Ubicc-submitted 206 206

  • Uploaded by: Usman Tariq
  • 0
  • 0
  • November 2019
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Ubicc-submitted 206 206 as PDF for free.

More details

  • Words: 6,567
  • Pages: 11
Continuous Reverse Nearest Neighbor Search Lien-Fa Lin*, Chao-Chun Chen Department of Computer Science and Information Engineering National Cheng-Kung University, Tainan, Taiwan, R.O.C. Department of Information Communication Southern Taiwan University of Technology, Tainan, Taiwan, R.O.C. [email protected],[email protected]

ABSTRACT The query service for the location of an object is called Location Based Services (LBSs), and Reverse Nearest Neighbor (RNN) queries are one of them. RNN queries have diversified applications, such as decision support system, market decision, query of database document, and biological information. Studies of RNN in the past, however, focused on inquirers in immobile status without consideration of continuous demand for RNN queries in moving conditions. In the environment of wireless network, users often remain in moving conditions, and sending a query command while moving is a natural behavior. Availability of such service therefore becomes very important; we refer to this type of issue as Continuous Reverse Nearest Neighbor (CRNN) queries. Because an inquirer’s location changes according to time, RNN queries will return different results according to different locations. For a CRNN query, executing RNN search for every point of time during a continuous query period will require a tremendously large price to pay. In this work, an efficient algorithm is designed to provide precise results of a CRNN query in just one execution. In addition, a large amount of experiments were conducted to verify the above-mentioned method, of which results of the experiments showed significant enhancement in efficiency. Keywords: Location Based Services, Location-Dependent Query, Continuous Query, Reverse Nearest Neighbor Query, Continuous Reverse Nearest Neighbor Query 1

INTRODUCTION

As wireless network communications and mobile device technology develop vigorously and positioning technology matures gradually, LBS is becoming a key development in the industrial as well as academic circles [2, 5, 13, 21, 26, 27]. According to the report of “IT Roadmap to a Geospatial Future” [6], LBSs will embrace pervasive computing and transform mass advertising media, marketing, and different societal facets in the upcoming decade. Despite the fact that LBSs have been existing in the traditional calculation environment (such as Yahoo! Local), its greatest development potential lies in the domain of mobile computing that provides freedom of mobility and access to information anywhere possible. LBSs shall become an indispensable application in mobile network as its required technology has matured and 3G wireless communication infrastructure is expected to be deployed everywhere. The query that answers to LBSs is referred to as

Location-Dependent Query (LDQ), of which applications include Range Query, Nearest Neighbor (NN) query, K-Nearest Neighbor (KNN) query, and Reverse Nearest Neighbor (RNN) query. There are plenty of studies about NN [14, 22, 26], KNN [4, 9, 14, 23, 25], CNN [17, 3, 12, 20], and CKNN [17, 20] queries, and issues pertaining to Reverse Nearest Neighbor (RNN) Query [10, 11, 16, 18, 19, 22, 24] have been receiving attention in recent years. RNN query means finding a collection of nearest neighbor objects for S, a given collection of objects, with q, a given query object. Practical examples of RNN query are provided in [10]. If a bank is planning to open a new branch, and its clients prefer a branch on a nearest possible location, then such new branch should be established on a location where the distance to the majority of its clients is shorter than that of other banks. Taxi cabs selecting passengers is another good example. If a taxi cab uses wireless devices to find out the location of its customer, then RNN queries will be far more advantageous than NN queries from the aspect of

Ubiquitous Computing and Communication Journal

competition. Figure 1 illustrates that Customer c is the nearest neighbor for Taxi a, but that does not necessarily mean Taxi a can capture Customer c because Taxi b is even closer to Customer c. On the contrary, the best option for Taxi a should be Customer d because Taxi a is the nearest neighbor for Customer d. That is, d is the RNN for a, and a may reach d faster than any other taxi. This is an example of CRNN query for that the query object, the taxi, changes location according to time. Mobile users will be mobile in a wireless environment, and that is why the continuous query is an important issue in the wireless environment. As far as the knowledge available to the researchers is concerned, there is not yet any researcher working on this issue. Because an inquirer changes location constantly according to time, changes of location will cause RNN queries to return different results. For a CRNN query, executing RNN search for every point of time during a continuous query period will require a tremendously large price to pay. The larger the number of query objects and the shorter the time segment are, the longer the calculation time will be. In addition, due to the continuance nature of time, defining the appropriate time segment for RNN search will be a concern; if the interval between RNN searches is too short, then more CRNN queries need to be executed to complete the query, and vice versa. If a RNN search is repeated over a longer period of time to reduce the number of execution, the RNN query result for the whole time segment will lose accuracy due to insufficient frequency of sampling. In this paper, a more efficient algorithm is designed to replace processing of each and every point of time for RNN search; just one execution of CRNN query is all it takes to properly define the segment for the query time that a user is interested in, and find out the segments that share the same answer and the RNN for each of the intervals. Other than that, an index is also used to filter out unnecessary objects to reduce search space and improve CRNN search efficiency. The experiment result suggests that using index provides efficiency 20 times better than not using index when the number of objects is 1000. This Study provides major contribution in three ways: z This Study pioneers into continuous query processing methods opposite to static query regarding RNN issues. z A CRNN search algorithm is proposed; just one execution will return all CRNN results. z The proposed method allows the index which was only applicable to finding RNN for a single query point to support CRNN query to improve CRNN search efficiency. The structure of the other sections in this work:

Related works about RNN search are introduced in Section 2. Concerned issues are defined and assumptions made are described in Section 3. The proposed CRNN search algorithm is introduced in Section 4 The experiment environment and evaluation parameters for experimental efficacy are described in Section 5. In the end, a conclusion and future study directions are provided in Section 6.

Figure 1: Example of RNN query.

2

RELATED WORK

RNN search concerns about finding q, a query that is the NN for some objects. Related works of study about RNN search are introduced and summarized in this section: z Index methods that support RNN search The number of objects can be infinite; if one must first find out the distance from query q to each object for identifying the RNN for query q, then the efficiency may be unacceptably low due to overwhelmingly large computation cost. To accelerate processing speed, most of studies adopt the index methods. Major index methods are introduced in this section. z RNN search of different types RNN searches in different scenarios are described and categorized according to static and moving situations of query q and the objects. 2.1

Index Methods for RNN Query

RNN search concerns about finding q, a query that is the NN for some objects, and it is necessary to find out the distance between query q and each object, or the distance from the coordinate of query q to the coordinate of an object. For a given q, not every object is its RNN, and these objects which can not be RNN may be practically left out of consideration to reduce the number of objects to be taken into consideration and accelerate processing speed for RNN search. Many studies were dedicated to the designing of an effective indexing structure for coordinates of an object. The most famous ones are R-Tree proposed by [8] and Rdnn-Tree proposed by

Ubiquitous Computing and Communication Journal

[10]. These two index methods are described below.

these child trees to its NN will not exceed MaxDnn.

2.1.1R-Tree R-Tree is an index structure developed in early years for spatial database and was used by [10] to accelerate RNN search processing. All objects are grouped and then placed on leaf nodes according to the closeness of their coordinates. That is, objects at similar coordinates are put in one group. Next, each group of objects is contained in a smallest possible rectangle, which is called Minimum Bounding Rectangle (MBR). Next, MBRs are grouped in clusters, which are contained inside a larger MBR until all objects are contained in the same MBR. What is stored on an internal node of a R-Tree is an MBR, in which all nodes underneath are contained, and the root of the R-Tree contains all objects. The size and range of an MBR is defined by its lower left coordinate (Ml,Md) and upper right coordinate (Mr, Mu). Figure 2 is an example of R-Tree. From a to l, there are total 12 objects; (a , b , c , d) belong to MBR b1, and (e,f,g) belong to MBR b2. MBR b1 and b2 belong to MBR B1, and MBR R contains all objects..

Figure 3. Data structure of Rdnn-tree 2.2 Categories of Rnn queries Depending on the static or moving status of query q and the query objects, related studies can be summarized into 4 categories. 1. If query q and the query objects are both static, then this category is called static query vs. static objects. 2. If query q is moving and the query objects are static, then this category is called moving query vs. static objects. 3. If query q is static and the query objects are moving, then this category is called static query vs. moving objects. 4. If both query q and the query objects are moving, then this category is called moving query vs. moving objects. 2.2.1 Static query vs. static objects

Figure 2: Example of R-tree Indexing 2.1.2 Rdnn-Tree Rdnn-tree (R-tree containing Distance of Nearest Neighbors) [22] improves the method of [10]. The author proposes a single index structure (Rdnn-tree) to provide solutions for NN queries and RNN queries at the same time. Rdnn-tree differs from standard Rtree structure by storing extra information about nearest neighbor of the points in each node. Information of (ptid,dnn) is stored on the leaf node of Rdnn-tree, as shown in Figure 3. ptid means an object of which the data concentrate on the dimension, denoted as d, and dnn means the distance from such object to its NN. Information of (ptr , Rect , MaxDnn) is stored on a non-leaf node, where ptr points to the address of a child node, Rect contains the MBR of all child nodes subordinate to this node, and MaxDnn means the maximum value of dnn of all objects in the child trees subordinate to this node. The maximum distance from any object contained in

The scenario that both query q and query objects are static is first discussed because the query and query objects are immobile and are therefore easier for processing than other scenarios. The method proposed in [10] is now introduced. For static database, the author adopts a special R-tree, called RNN-tree, for answering RNN queries. For static database that requires being frequently updated, the author proposes a combined use of NN-tree and RNN-tree. NN of every object is stored in the RNNtree, and what are stored in the NN-tree are the objects themselves and their respective collections of NN. The author uses every object as the center of a circle, of which the radius is the distance from the object to its NN, to make a circle, and then examines every circle that contains query q to find out the answers of RNN queries. Such method, however, is very inefficient for dynamic database because the structures of NN-tree and RNN-tree must be changed whenever the database is updated. In [22], the method proposed by [10] is therefore improved. The author proposes a single index structure, Rdnn-tree, for answering NN queries and RNN queries at the same

Ubiquitous Computing and Communication Journal

time. It differs from normal R-tree; it separately stores the information of NN of every object (i.e. Distance of Nearest Neighbor), and NN of every object must be calculated in advance. 2.2.2 Static query vs. moving objects Studies mentioned above primarily assume a monochromatic situation that all objects, including query q and query objects, are of the same type. In [18], the researcher addresses this type of issues in a bichromatic situation that objects are divided into two different types; one is inquirer, and the other is query object. NN and range query techniques are used in this Paper to handle RNN issues.

q; instead, i segments of time, such as segment1, segment2 …segmenti, that have the same result, are first identified among the entire CRNN query time period. RNN result of each segment of time, such as RNN1, RNN2, …, is calculated separately, and the result is returned in the format of (q , [segment1])={RNN1 result} , … , (q , [segmenti])={RNNi result} back to the inquirer.

2.2.3 Moving query Figure 4. Example of a CRNN query This subsection discusses the situation when an inquirer is no longer static but changes his or her location according to time, and the query object can be either static or moving. That is, two categories of query: moving query with static objects and moving query with moving objects, are involved. Because the inquirer is moving, these two categories of query will return different results for the identical RNN search at different points of time. This type of issue is obviously more complicate than the issues previously discussed. As far as the knowledge available to the researcher is concerned, no related study has ever discussed about the issues of these two categories. In this Study, solutions for a moving query with static objects are pursued.

3 Problem Formulation CRNN query concerns about a period of continuing time where adjacent points in such period of time may have the identical RNN. That is, a period of time may have the same RNN unless the query q has moved beyond this period of time. Please refer to Figure 4. When a user executes CRNN query q, the time segment of the continuous query is [Ps,Pe], and the query objects are {a,b,c,d}. If time point P1 can be identified, and any given point of time in the time segment from Ps to P1, or [Ps,P1], has the same RNN result, then one-time execution of RNN search is all it needs for the time segment of [Ps ,P1]. If points of time, P2, P3, and P4, are also identified, and any given point of time in the time segment of [P1, P2] has the identical RNN result, while any given point of time in the time segment of [P2,P3] has the identical result, and any given point of time in the time segment of [P3,P4] has the identical result, then the entire CRNN query needs only one-time execution of RNN search at each time segment. For processing CRNN query, it is not necessary to execute RNN search for every point of time and return the RNN of every point of time back to query

Based on the description above, CRNN query, one issue that this Study concerns, may be stated as below: Given: A collection of static objects S={O1,O2,…,On} A query point q, its current position (q.x,q.y), and moving velocity (v.x,v.y) A continuous query time [Ps , Pe]; where Ps represents the coordinate of the point of time when the query begins, and Pe represents the coordinate of the point of time when the query ends Find: The RNNs of q between any two adjacent points of time of {P1 ,P2 ,…,Pi} within [Ps ,Pe] remain constant. Such that: RNN(q,[Ps ,P1]) = {RNN1},RNN(q,[P1 ,P2]) ={RNN2},…, RNN(q,[Pi ,Pe]) = {RNNi}, where [Pi,Pe] ⊆ [Ps,Pe],{RNN1} ∈ {O1,O2,…, On}. Under the assumptions: 1. The moving direction of query q is fixed. 2. All query objects are static. As described above, two adjacent points of time may share the same RNN, or a segment of time has only one RNN unless query q moves to another segment of time that has a different RNN. The CRNN search algorithm proposed in this Study uses exactly this concept. First, the points of time that produce different RNN results within a query time are identified. These points of time divide the query time into several segments that have different RNN results, and then the RNN results are identified for each of the segments. The detailed algorithm of CRNN Search is explained in the next section.

4. CRNN Search Algorithm The detailed procedure of CRNN Search algorithm is introduced in this section. CRNN Search

Ubiquitous Computing and Communication Journal

algorithm is divided into two steps. Step 1: Finding segment points of CRNNq Points of time that produce different RNN results are identified. Based on these points of time, CRNN query is divided into several time segments that require execution of RNN search. The RNN result for any given point of time within one segment will remain constant, and different segments have different RNN results. Step 2: Calculating RNN result of each segment Separately calculate the RNN results for each of the segments that have been divided in the previous step. The entire procedure for processing CRNN Search is illustrated in Figure 5. On top of the necessary query objects and continuous query (query path), it is divided into two steps: finding segment points of CRNNq and calculating RNN result of each segment; each of the steps is described below:

As illustrated in Figure 6, if the NN of object a is b, and a circle is made using ab as the radius with a as the center point, then the distance from query q to a must be shorter than the distance from a to its NN, or object b, as long as query q falls within this circle. Therefore, during the period of time when query q remains within this circle, RNNs of object a must include a, unless query q moves out of this circle. Because the moving direction of query q is assumed to be fixed, CRNN query will form a query line (qline) from its beginning to its end. The point to which this CRNN query begins to leave this circle is the intersection S of this circle and the query line formed by CRNN query. Before intersection S, the result of RNN query must include object a; beyond intersection S, the result of RNN query will not include object a; the RNN results will be different. This intersection is referred to as a segment point. This explains why the intersection of the circle with NN as its radius and the query line is the point of time where RNN query produces different results. Making a circle by using an object itself as the center and the distance to its NN as the radius will enable all of the intersections of the circle and the query line of CRNN query to cut CRNN query into several time segments that have different results of RNN query.

Figure 5: Flow chart of CRNN query processing 4.1 Finding segment points of CRNN What CRNN query pursues is a period of continuous time; the moving distance of query objects is very short among some adjacent points of time for the query, thus possibly resulting in the same RNN result. That is, the entire period of continuous query is divided into several segments, and the RNN results in each segment are the same. If these points of time share the same RNN result, then it is not necessary to execute RNN search for each of the points of time; one-time calculation is enough. Therefore, CRNN query does not require executing RNN search for all points of time. Instead, points of time that share the same RNN result are grouped into time segments, and one-time RNN search is executed for each of the segments. RNN of query q is a collection of the objects of which the NN is query q. If the distance, or N, is realized in advance, then these objects are the RNN for query q when the distances from query q to the objects are shorter than the distances from the objects to their respective NN.

Figure 6. Finding segment point of CRNN search Figure 7 illustrates the time segmentation process described above. For object a, b, and c, their respective NNs are identified first: NN(a)=b, NN(b)=a, and NN(c)=b. Next, use each object as the center of a circle, and the distance to its respective NN as the radius to make circles of a, b, and c. Then, intersections of the circles and qlines, Ps, P1, P2, P3, P4, and Pe , are sorted according to time, and every two intersection points define a time segment. The entire CRNN query is cut into five time segments, [Ps, P1] , [ P1, P2] , [P2,P3] , [P3,P4] , and [P4,Pe]. Every segment has a unique RNN query result.

Ubiquitous Computing and Communication Journal

Figure 8: Calculating RNN result of each segment. 1.

CRNN Algorithm with Index Not every object will be an answer in the processing of CRNN query. To improve RNN query efficiency, it is preferred that the objects that can not be answers are filtered out in advance to greatly reduce search space for CRNN query, size of data that requires CRNN query, and consequently, computation cost. The process that further improves CRNN query efficiency dramatically is referred to as pruning process. Figure 9 illustrates the flowchart of CRNN query processing with a pruning process added.

Figure 7: Segmenting of the CRNN query 4.2 Calculating RNN result of each segment In the previous section, intersections of qlines and the circles with the distances between the objects and their respective NNs as the radiuses are defined. With these intersections, CRNN query is cut into several time segments. The next step is to find RNNs for each of the time segments. Because the distances from query objects to their respective NNs are used as the radiuses to make circles which are coded by the objects’ numbers, if a segment falls within a certain circle, then the resulting RNN of this time segment for the CRNN query is the object collection represented by such circle. This is illustrated in Figure 8. First, intersections of qlines that represent the CRNN query and the circles of the objects are sorted by time; every two intersection points define a time segment, and there are five segments, [Ps,P1] , [ P1 , P2] , [P2 , P3] , [P3 , P4] , and [P4 , Pe]. Segment [Ps , P1] is contained only by circle a, therefore: RNN(q,[Ps ,P1]) ={a}. Next, examine segment [Ps,P1]; this segment is contained by circle a and circle b. Therefore: RNN(q, [Ps,P1]) = {a, b}. If this process is repeated, then the obtained results will be RNN(q , [P2 , P3]) = {a , b , c}, RNN(q,[P3,P4]) ={b,c}, and RNN(q,[P4,P5]) ={c}.

Figure 9. Flow chart of CRNN query with index Step 2 and 3 are identical to Step 1 and 2 in CRNN search algorithm, which have been described in the previous sections, and they will not be reiterated again here. For step 1, the pruning process, an index structure for Rdnn-tree is designed to effectively execute the pruning process. The three steps of CRNN query with index are illustrated in Figure 9. The pruning process is described below. For every internal node of Rdnn-tree, the distance from query q to its node will be computed for every separation, and the distance is denoted as D(q,Rect). If D(q,Rect) of a node is larger than MaxDnn of the node, then all the objects beneath it will not be considered because the distance from query q to Rect Node will be equal to or larger than the distances from query q to all the objects underneath Rect node. When the distance from query q to Rect node is longer than MaxDnn, it is impossible that query q is closer to its NN than any other object underneath Rect node, and no object underneath can be the RNN result for query q. On the contrary, if D(q,Rect) equals the MaxDnn of such node, then the distances from some objects underneath Rect node to their respective NNs are shorter than the distance from query q to Rect. That is, some objects are the RNN results for query q. The examination continues along

Ubiquitous Computing and Communication Journal

the branch all the way to the lead node. All entries underneath such leaf node are recorded as the candidate objects for RNN query result. The collection of these candidate objects is referred to as RNNCanSet, which means the possible results for RNN query must exist within this collection, and the objects outside of RNNCanSet can not possibly be RNN query results. All that are needed to be considered when finding segment point of CRNNq of CRNN search algorithm are the objects inside RNNCanSet. This will greatly reduce the quantity of objects needed to be handled and enhance CRNN search algorithm efficiency. Figure 3 explains the pruning process. It begins with root node R. Because D(q,R) ≦MaxDnn of R, child nodes of B1 and B2 must be examined. Because the MaxDnn of MBR B1 ≦D(q,B1), all child nodes underneath B1 can be pruned. Next, D(q , B2)≦MaxDnn of B2, so child nodes b3 and b4 of B2 must be examined. D(q,b3) is equal to or smaller than the MaxDnn of b3, which is also a leaf node;

therefore, h and i are placed inside RNNCanSet. Next, b4 is examined. MaxDnn of b4 is equal to or smaller than D(q , b4); therefore, b4 can be pruned. The entire pruning process then ends. However, the CRNN query to be processed is not a RNN query of a single query point; therefore, the pruning process in [22] can not be directly used. To ensure that no possible RNN result is deleted, the criteria of pruning is changed from the condition that D(q,Rect), the distance from query point to Rect, must be longer than MaxDnn to the condition that MinD(q,Rect)>MaxDnn, where MinD(qline,Rect) represents the minimum distance from qline to Rect node. The reason why the shortest distance is selected is that if the minimum distance from the entire qline to Rect node is larger than MaxDnn, then the distance from any given point of time on the qline to Rect node must be longer than MaxDnn. Therefore, all the objects underneath Rect node can not be RNN for qline, and pruning is out of consideration. Details of the pruning algorithm are exhibited in Algorithm 1:

Algorithm 1: Pruning Algorithm. and comparison of experiment results. 5. Performance Study 5.1 Experiment Settings To evaluate the improvement which the method proposed in this Study has made in CRNN query efficiency, some experiments are designed, and this section provides descriptions of experiment environments, experimental parameters and settings,

The coordinates of the objects disperse in an experiment environment of [0 , 1]×[0 , 1] plane. Because distribution density of the objects may influence efficiency, it should be taken into

Ubiquitous Computing and Communication Journal

consideration in the experiment. In the experiment, three different types of distribution are used in the generation of objects’ coordinates. The three different types of distribution are Uniform distribution, Gaussian distribution, and Zipf distribution. In uniform distribution, the objects are evenly distributed on the plane, as shown in Figure 10(a). In Gaussian distribution, most of the objects concentrate on the center of the plane, as shown in Figure 10(b). In Zipf distribution, most of the objects will distribute at the extreme left and extreme bottom of the plane. In the experiment, skew factor is set at 0.8, as shown in Figure 10(c). In addition, 30 queries are generated randomly in a [0.4 , 0.6]×[0.4 , 0.6] plane by

referring to [15], and the velocity vector of each query falls between [-0.01 , 0.01]. Because the influence of different types of object distribution on efficiency is concerned in this experiment, the queries are generated as close to the center of the plane as possible. Having executed 30 queries, the average cost of executing one CRNN search is used in determining which method is more favorable. As to the program coding of Rdnn-tree in the CRNN search algorithm, R*-tree code of GIST[7] is used in perfecting Rdnn-tree to make it match with the requirement of this experiment. 1

0.8

0.8

0.8

0.6

0.6

0.6

Y axis

Y axis

1

Y axis

1

0.4

0.4

0.4

0.2

0.2

0.2

0

0

0 0

0.2

0.4

0.6

0.8

1

0

0.2

0.4

0.6

0.8

1

0

0.2

X axis

X axis

0.4

0.6

0.8

1

X axis

Figure 10 Data sets of experiment evaluation In addition to the object distribution described above, the influences that the amount of query time (qline) and the number of objects may impose on efficiency are also considered. Three data sets of Uniform, Gaussian, and Zipf are considered in object distribution. The amount of query time (qline) changes from query length 1 to query length 10. The number of objects changes from 1K to 10K. Parameters and settings used in the experiment are listed in Table 1. Table 1: Parameter settings of experiment Parameter distribution

Description Data distribution

interval

Time interval of Query Number of Data Objects

object-no

Settings Uniform, Gaussian, Zipf 1, 2, 5, 8, 10 1, 10. 30, 50 , 100(k)

CRNN query is executing RNN algorithm for every point of time which is continuous, and it is impossible to calculate the required count of execution. Therefore, the CRNN query time must be segmented before the total execution time required for CRNN query may be calculated. The more the time is segmented, the more executions of RNN are required. If a period of time is segmented into m segments, then time complexity will be O(m×n3), and if time is not adequately segmented, then the RNN result may be erroneous. These make it an inefficient CRNN search algorithm, and it will not be compared in this experiment. Efficiency of two methods is compared in this experiment: one uses Rdnn-tree as the index, and the other uses no index. To evaluate these two methods, comparison of the time required for one CRNN search execution can be used, and this comparison is referred to as total cost in this Study. 5.4 Performance Results and Discussion

5.2 Compared Algorithms and Performance Metrics The most intuitive method for finding RNN is looking for the NN of every object. If the number of query objects is N, then time Complexity is O(n2). Next, determine which objects’ NNs are query points. If the NNs are the query points, then the objects will be the RNNs for the query points. The required time complexity for the RNN algorithm is O(n3). However, the most intuitive method for finding

Based on the changes of metrics (distribution, interval, and object-no), different types of experiments have been conducted. Results are summarized by object-no and query interval in the next section. 5.4.1 The effect of object-no parameter First, the fixed query interval is set at 5. The influence imposed on efficiency by object-no

Ubiquitous Computing and Communication Journal

parameter, which is the number of objects, under different types of object distribution, will be discussed. The experiment result is shown in Figure 11. X axle represents the total cost of time required for executing one CRNN search, and Y axle represents the number of query objects. Total cost increases as the number of query objects increases. In addition, when the number of objects is 1K, the efficiency of the CRNN search that uses Rdnntree is about 300 seconds, and that of the CRNN search using no Rdnn-tree index is about 15 seconds; about 20 times faster. It is obvious that pruning some unnecessary objects by adopting Rdnn-tree as the index to reduce CRNN search space provides much higher efficiency than not adopting Rdnn-tree. The 7

107

7

10

10 CRNN without index CRNN with index

106

6

5

10

5

5

10

3

10

2

Total time (sec. )

Total time (sec. )

10

104

104 3

10

103 10

10

10

104

2

2

10

10

10

1 1K

10K

30K object-no

50K

1

1

100K

CRNN without index CRNN with index

6

CRNN without index CRNN with index

10

10

Total time (sec. )

comparison of influence from object distribution on efficiency clearly suggests Zipf distribution offers the best efficiency, followed by Uniform distribution, and Gaussian distribution offers the worst. This result can be explained as such: because Zipf distribution is located at the far left and the lowest bottom, most of the data will be pruned, and the number of objects that are included in RNNCanSet without being pruned is very small, offering the lowest total cost. On the opposite, data in Gaussian distribution concentrate in the center of the plane and very few data can be pruned, allowing many objects to remain, and causing a large RNNCanSet, thus resulting in the highest total cost.

1K

10K

30K object-no

50K

1K

100K

10K

30K object-no

50K

100K

Figure 11: Influences on different types of data distribution from changing object-no distance of MinD(qline , Rect) decreases, causing pruning efficiency to reduce. On the contrary, when the interval increases, the number of time segmentation by CRNN query increases. Consequently, the number of RNN searches for every segment increases, and total cost of CRNN query increases as well.

5.4.2 The effect of query interval parameter This section focuses on the influence from the length of query interval on each method under different types of object distribution. Results of the experiment are shown in Figure 12. Generally speaking, when the query interval is lengthened, the 105

105

104

5

10

104

4

10

3

10

102

10

Total Time (sec. )

CRNN without index CRNN with index

Total time (sec. )

Total Time (sec. )

CRNN without index CRNN with index

103

2

10

10

1 1

2

5 Query Interval

8

10

103

CRNN without index CRNN with index

2

10

10

1 1

2

5

8

10

1 1

2

Query Interval

5

8

10

Query Interval

Figure12: Effect of query interval parameter for different data distribution 6. Conclusions and Future Works An efficient CRNN search algorithm is proposed in this Study. Such algorithm requires only one execution to find out RNN results from all continuous RNN searches. The diversified experiments in this

Study also prove the efficiency of the proposed method. As wireless communication and mobile device technology become mature, more and more users access information from wireless information systems through mobile devices. To process requests from more and more mobile users, data dissemination

Ubiquitous Computing and Communication Journal

through broadcast is an effective solution for scalability. The future goal of this Study is extending the issues of CRNN search to the wireless broadcasting environment. 7. References [1] AmitSingh,H.F. and Tosun,A.S. (2003) High dimensional reverse nearest neighbor queries. Proceedings of the 20th International Conference on Information and Knowledge Management (CIKM’03), NewOrleans, LA, USA, pp.91–98. [2] Barbara,D. (1999) Mobile computing and databases-a survey. IEEE Transactions on Knowledge and Data Engineering, 11, 108–117. [3] Benetis,R.,Jensen, C.S.,Karciauskas,G., and Saltenis,S. (2002) Nearest neighbor and reverse nearest neighbor queries for moving objects. International Database Engineering and Applications Symposium, Canada, July17-19, pp.44–53. [4] Chaudhuri,S. and Gravano,L. (1999) Evaluating top-k selection queries. Proceedings of the 25th IEEE International Conference on Very Large Data Bases, pp.397–410. [5] Civilis,A., Jensen,C.S., and Pakalnis,S. (2005) Techniques for efficient road-network-based tracking of moving objects. IEEE Transactions on Knowledge and Data Engineering, 17, 698– 712. [6] Computer Science and Telecommunication Board.IT Roadmap to a geospatial future, the national academies press,2003. [7] http://gist.cs.berkeley.edu/. [8] Guttman,A. (1984) R-trees:A dynamic index structure for spatial searching. Proceedings of the 1984 ACM SIGMOD international conference on Management of data, pp.47–57. [9] Hjaltason,G.R. and Samet,H. (1999) Distance browsing in spatial data bases. ACM Transactions on Database Systems (TODS), 24, 265–318. [10] Korn,F. and Muthukrishnan,S. (2000) Influence sets based on reverse nearest neighbor queries. Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, Texas, USA, May16-18, pp.201– 212. [11] Korn,F.,Muthukrishnan, S.,and Srivastava.,D. (2002) Reverse nearest neighbor aggregates over data streams. Proceedings of the International Conference on Very Large DataBases (VLDB’02), Hong Kong, China, August, pp.91–98. [12] Korn,F., Sidiropoulos, N.,Faloutsos, C.,Siegel,E., and Protopapas,Z. (1996) Fast nearest neighbor search in medical image database. In Proceedings of the 22th International Conference on Very Large Data Bases

(CLDB’96), pp. 215–226. [13] Lee,D.L., chien Lee, W.,Xu,J., and Zheng, B. (2002) Data management in location dependent services. IEEE Pervasive Computing,1, 65–72. [14] Roussopoulos,N., Kelley,S. ,and Vincent, F.(1995) Nearest neighbor queries. Proceedings of ACM Sigmod International Conference on Management of Data , Illinois, USA, June, pp.71–79. [15] Ross,S.(2000) Introduction to Probability and Statistics for Engineers and Scientists. [16] Stanoi,I., Agrawal,D., and Abbadi,A.E. (2000) Reverse nearest neighbor queries for dynamic databases. ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, pp.44–53. [17] Song,Z. and Roussopoulos,N. (2001) K-nearest neighbor search for moving query point. Proceedings of 7th International Symposium on Advances in Spatial and Temporal Databases, LNCS2121, RedondoBeach, CA, USA, July1215, pp.79–96. [18] Stanoi,I.,Riedewald, M.,Agrawal,D., and Abbadi,A.E. (2001) Discovery of influence sets in frequently updated databases. Proceedings of the 27th VLDB Conference, Roma, Italy, pp.99– 108. [18] Tao,Y., Papadias,D., and Lian,X. (2004) Reverse knn search in arbitrary dimensionality. Proceedings of 30th Very Large Data Bases, Toronto, Canada, August29-September3, pp.279–290. [20]Tao,Y., Papadias, D., and Shen,Q. (2002) Continuous nearest neighbor search. International Conference on Very Large Data Bases, Hong Kong, China, August 20-23, pp.279–290. [21] Xu,J., Zheng,B., Lee,W.-C.,, and Lee,D.L. (2003) Energy efficient index for energy query location-dependent data in mobile environments. In Proceedings of the 19th IEEE International Conference on Data Engineering (ICDE’03), Bangalore, India, March, pp.239– 250. [22] Yang,C. and Lin,K.-I. (2001) An index structure for efficient reverse nearest neighbor queries. Proceedings of the 17th International Conference on Data Engineering, pp.485–492. [23] Yiu,M.L., Papadias,D., Manoulis,N., and Tao,Y. (2005) Reverse nearest neighbors in large graphs. Proceedings of 21st IEEE International Conference on Data Engineering (ICDE), Tokyo, Japan, April5-8, pp.186–187. [24] Yu,C.,Ooi,B.C., Tan,K.-L., and Jagadish,H.V. (2001) Indexing the distance: An efficient method to knn processing. Proceedings of the 27th VLDB Conference, Roma, Italy, pp. 421– 430. [25] Zheng,B.,Lee, W.-C., and Lee,D.L. (2003)

Ubiquitous Computing and Communication Journal

Search k nearest neighbors on air. Proceedings of the 4th International Conferenceon Mobile Data Management, Melbourne, Australia, January, pp.181–195. [26] Zheng,B., Xu,J., chien Lee, W., and Lee,D.L. (2004) Energy conserving air indexes for nearest neighbor search. Proceedings of the 9th International Conference on Extending Database Technology (EDBT’04), Heraklion,

Crete, Greece,March, pp.48–66. [27] Zhang,J., Zhu,M., Papadias,D., Tao,Y., and Lee,D.L. (2003) Location-based spatial queries. In Proceedings of the 2003 ACM SIGMOD international conference on Management of data, SanDiego, California, USA, June9-12, pp.443–454.

Ubiquitous Computing and Communication Journal

Related Documents

206
November 2019 43
206
May 2020 36
206
May 2020 39
206
October 2019 47
206
June 2020 23
Ubicc-submitted 206 206
November 2019 29

More Documents from "Usman Tariq"

Ubicc Journal 2007 Study 8
November 2019 17
Mpeg-2 Pocket Guide
June 2020 17
Ubicc008_268
November 2019 22
Md Ali Ahsan Razib Id57 57
November 2019 29
Crc-_ubicc_tcp_21_21_21
November 2019 25