This Is AuburnElectronic Theses and Dissertations

Approximate Evaluation of Queries for Spatial, Uncertain and Probabilistic Databases




Chen, Haiquan

Type of Degree



Computer Science


In modern geographic information systems, route search represents an important class of queries. In route search related applications, users may want to define a number of traveling rules (traveling preferences) when they plan their trips. However, these traveling rules are not considered in most existing techniques. In this dissertation we propose a novel spatial query type, the multi-rule partial sequenced route (MRPSR) query, which enables efficient trip planning with user defined traveling rules. The MRPSR query provides a unified framework that subsumes the well-known trip planning query (TPQ) and the optimal sequenced route (OSR) query. The difficulty in answering MRPSR queries lies in how to integrate multiple choices of points-of-interest (POI) with traveling rules when searching for satisfying routes. We prove thatMRPSR query is NP-hard and then provide three algorithms by mapping traveling rules to an activity on vertex network. Afterwards, we extend all the proposed algorithms to road networks. By utilizing both real and synthetic POI datasets, we investigate the performance of our algorithms. The results of extensive simulations show that our algorithms are able to answer MRPSR queries effectively and efficiently with underlying road networks. Compared to the Light Optimal Route Discoverer (LORD) based brute-force solution, the response time of our algorithms is significantly reduced while the distances of the computed routes are only slightly longer than the shortest route. The past few years have witnessed the emergence of an increasing number of applications for tracking and tracing based on Radio Frequency Identification (RFID) technologies. However, raw RFID readings are usually of low quality and may contain numerous anomalies. An ideal solution for RFID data cleansing should address the following issues. First, in many applications, duplicate readings of the same object are very common. The solution should take advantage of the resulting data redundancy for data cleaning. Second, prior knowledge about the environment (e.g., prior object distribution, false negative rates of readers) may help improve data quality, and a desired solution must be able to take into account such knowledge. Third, the solution should take advantage of physical constraints in target applications (e.g., the number of objects in a same location cannot exceed a given value) to elevate the accuracy of data cleansing. There are several existing RFID data cleansing techniques. However, none of them support all the aforementioned features. In this dissertation we propose a Bayesian inference based framework for cleaning RFID raw data. We first design an n-state detection model and formally prove that the 3-state model can maximize the system performance. In addition, we devise a Metropolis-Hastings sampler with Constraints (MH-C), which incorporates constraint management to clean RFID data with high efficiency and accuracy. Moreover, to support real-time object monitoring, we present the Streaming Bayesian Inference (SBI) method to cope with real-time RFID data streams. Finally, we evaluate the performance of our solutions through extensive experiments.