Query Processing over Incomplete Databases (Synthesis Lectures on Data Management)
Yunjun Gao, Xiaoye Miao
- Publisher: Morgan & Claypool
- Publication date: 2018-08-20
- Price: $2,570
- VIP price: 5% off, $2,442
- Language: English
- Pages: 122
- Binding: Hardcover
- ISBN: 1681734222
- ISBN-13: 9781681734224
Related categories:
Databases
Overseas special-order books (separate checkout required)
Product description
Incomplete data is a part of life and of almost every area of scientific study. Users tend to skip certain fields when they fill out online forms; participants choose to ignore sensitive questions on surveys; sensors fail, resulting in the loss of certain readings; publicly viewable satellite map services used in many mobile applications have missing data; and in privacy-preserving applications, data is deliberately left incomplete in order to protect sensitive attribute values.
Query processing is a fundamental problem in computer science and is useful in a variety of applications. In this book, we focus mostly on query processing over incomplete databases, which involves finding a set of qualified objects from a specified incomplete dataset in order to support a wide spectrum of real-life applications. We first elaborate on the three general approaches to handling incomplete data: (i) discarding the data with missing values, (ii) imputing the missing values, and (iii) relying only on the observed data values. For the third approach, we introduce the semantics of k-nearest neighbor (kNN) search, skyline queries, and top-k dominating queries on incomplete data. For these three representative queries, we investigate advanced techniques for processing them over incomplete data, including indexing, pruning, and crowdsourcing.
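To make the three handling approaches named in the description concrete, here is a minimal Python sketch. It is not taken from the book: the toy dataset, the function names, the choice of mean imputation for (ii), and the distance used for (iii) (Euclidean distance over the dimensions observed in both objects) are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean

# Hypothetical toy dataset: None marks a missing attribute value.
DATA = {
    "a": (1.0, 2.0, None),
    "b": (4.0, None, 6.0),
    "c": (2.0, 3.0, 4.0),
    "d": (None, 8.0, 9.0),
}


def discard_incomplete(data):
    """(i) Keep only the objects that have no missing values."""
    return {name: obj for name, obj in data.items() if None not in obj}


def impute_mean(data):
    """(ii) Fill each missing value with the mean of the observed
    values in the same dimension (one simple imputation choice)."""
    dims = len(next(iter(data.values())))
    means = [
        mean(obj[d] for obj in data.values() if obj[d] is not None)
        for d in range(dims)
    ]
    return {
        name: tuple(means[d] if x is None else x for d, x in enumerate(obj))
        for name, obj in data.items()
    }


def observed_distance(p, q):
    """(iii) Euclidean distance over the dimensions observed in both
    objects; infinity if they share no observed dimension.
    This definition is an assumption of this sketch."""
    shared = [(x, y) for x, y in zip(p, q) if x is not None and y is not None]
    if not shared:
        return float("inf")
    return sqrt(sum((x - y) ** 2 for x, y in shared))


def knn(data, query, k):
    """Return the names of the k objects closest to `query`,
    comparing only on commonly observed dimensions."""
    return sorted(data, key=lambda name: observed_distance(data[name], query))[:k]


if __name__ == "__main__":
    print(discard_incomplete(DATA))
    print(impute_mean(DATA))
    print(knn(DATA, (2.0, 2.0, 5.0), k=2))
```

Note that the unnormalized distance in this sketch tends to favor objects with fewer observed dimensions, since fewer terms enter the sum; a practical definition would likely need to compensate for that.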