Data Science with A Focus on Spatial Domain
Date: 2020-11-19
Type of Degree: PhD Dissertation
Department: Computer Science and Software Engineering

Abstract
Data science focuses on solving data-driven tasks using a variety of techniques, including but not limited to machine learning, neural networks, mathematics, and statistics. In this dissertation, I work on two tasks within the scope of data science: contextual query understanding in natural language and data-intensive query processing, with particular attention to the spatial domain. For query understanding, I focus on natural language interfaces to databases, since data management systems are powerful and widely used in industry. However, a natural language interface to databases is often customized to a particular domain and can hardly be applied to other domains directly. I propose a transfer-learnable strategy to address this domain transfer challenge and devise a complete system that translates natural language questions into SQL queries. I also design a natural language interface for the spatial domain (SpatialNLI), as the idiosyncrasies of spatial semantics pose greater challenges. For data-intensive query processing, I focus on the Spatial Skyline Query, since the skyline problem suffers from quadratic running time and many researchers have put considerable effort into accelerating it. I propose to address this challenge through parallelization and devise a scalable system that works for both small-scale and large-scale inputs. I work on both query understanding and query processing in an effort to help users make informed, data-driven decisions and take full advantage of their data.
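
To make the quadratic cost of the skyline problem concrete, the following is a minimal Python sketch of a naive spatial skyline computation. It is not the parallel system proposed in the dissertation. It assumes the commonly used definition of spatial dominance (a data point dominates another if it is at least as close to every query point and strictly closer to at least one), which the abstract does not spell out. The function names `spatially_dominates` and `spatial_skyline` and the sample coordinates are hypothetical and chosen only for illustration.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def spatially_dominates(p, other, query_points):
    """Return True if p is at least as close as `other` to every query
    point and strictly closer to at least one (assumed definition)."""
    strictly_closer = False
    for q in query_points:
        d_p, d_other = dist(p, q), dist(other, q)
        if d_p > d_other:
            return False  # p is farther from q, so it cannot dominate
        if d_p < d_other:
            strictly_closer = True
    return strictly_closer


def spatial_skyline(data_points, query_points):
    """Naive skyline: keep every point that no other point dominates.
    Compares all pairs, so the work grows quadratically with the
    number of data points."""
    return [
        p for p in data_points
        if not any(spatially_dominates(o, p, query_points) for o in data_points)
    ]


# Hypothetical sample data: four candidate locations, two query locations.
data = [(1, 1), (5, 5), (0, 3), (2, 0.5)]
queries = [(0, 0), (2, 0)]
print(spatial_skyline(data, queries))  # [(1, 1), (2, 0.5)]
```

The all-pairs comparison in `spatial_skyline` is the source of the quadratic running time mentioned above, and it is the kind of work that a parallel approach could split across workers.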