Tapping into Unstructured Data: Integrating Unstructured Data and Textual Analytics into Business Intelligence
William H. Inmon, Anthony Nesavich
- Publisher: Prentice Hall
- Publication date: 2007-12-21
- Price: $1,750
- VIP price: 5% off, $1,663
- Language: English
- Pages: 264
- Binding: Paperback
- ISBN: 0132360292
- ISBN-13: 9780132360296
Related categories:
Big Data, Text Mining, Data Science
Product Description
“The authors, the best minds on the topic, are breaking new ground. They show how every organization can realize the benefits of a system that can search and present complex ideas or data from what has been a mostly untapped source of raw data.”
--Randy Chalfant, CTO, Sun Microsystems
The Definitive Guide to Unstructured Data Management and Analysis--From the World’s Leading Information Management Expert
A wealth of invaluable information exists in unstructured textual form, but organizations have found it difficult or impossible to access and utilize it. This is changing rapidly: new approaches finally make it possible to glean useful knowledge from virtually any collection of unstructured data.
William H. Inmon--the father of data warehousing--and Anthony Nesavich introduce the next data revolution: unstructured data management. Inmon and Nesavich cover all you need to know to make unstructured data work for your organization. You’ll learn how to bring it into your existing structured data environment, leverage existing analytical infrastructure, and implement textual analytic processing technologies to solve new problems and uncover new opportunities. Inmon and Nesavich introduce breakthrough techniques covered in no other book--including the powerful role of textual integration, new ways to integrate textual data into data warehouses, and new SQL techniques for reading and analyzing text. They also present five chapter-length, real-world case studies--demonstrating unstructured data at work in medical research, insurance, chemical manufacturing, contracting, and beyond.
This book will be indispensable to every business and technical professional trying to make sense of a large body of unstructured text: managers, database designers, data modelers, DBAs, researchers, and end users alike.
Coverage includes
- What unstructured data is, and how it differs from structured data
- First generation technology for handling unstructured data, from search engines to ECM--and its limitations
- Integrating text so it can be analyzed with a common, colloquial vocabulary: integration engines, ontologies, glossaries, and taxonomies
- Processing semistructured data: uncovering patterns, words, identifiers, and conflicts
- Novel processing opportunities that arise when text is freed from context
- Architecture and unstructured data: Data Warehousing 2.0
- Building unstructured relational databases and linking them to structured data
- Visualizations and Self-Organizing Maps (SOMs), including Compudigm and Raptor solutions
- Capturing knowledge from spreadsheet data and email
- Implementing and managing metadata: data models, data quality, and more
William H. Inmon is the founder, president, and CTO of Inmon Data Systems. He is the father of the data warehouse concept, the corporate information factory, and the government information factory. Inmon has written 47 books on data warehouse, database, and information technology management, as well as more than 750 articles for trade journals such as Data Management Review, Byte, Datamation, and ComputerWorld. His b-eye-network.com newsletter currently reaches 55,000 people.
Anthony Nesavich worked at Inmon Data Systems, where he developed multiple reports that successfully query unstructured data.
Preface
1 Unstructured Textual Data in the Organization
2 The Environments of Structured Data and Unstructured Data
3 First Generation Textual Analytics
4 Integrating Unstructured Text into the Structured Environment
5 Semistructured Data
6 Architecture and Textual Analytics
7 The Unstructured Database
8 Analyzing a Combination of Unstructured Data and Structured Data
9 Analyzing Text Through Visualization
10 Spreadsheets and Email
11 Metadata in Unstructured Data
12 A Methodology for Textual Analytics
13 Merging Unstructured Databases into the Data Warehouse
14 Using SQL to Analyze Text
15 Case Study--Textual Analytics in Medical Research
16 Case Study--A Database for Harmful Chemicals
17 Case Study--Managing Contracts Through an Unstructured Database
18 Case Study--Creating a Corporate Taxonomy (Glossary)
19 Case Study--Insurance Claims
Glossary
Index