Instant Apache Solr for Indexing Data How-to

Alexandre Rafalovitch

  • Publisher: Packt Publishing
  • Publication date: 2013-06-18
  • List price: $960
  • Member price: $912 (5% off)
  • Language: English
  • Pages: 90
  • Binding: Paperback
  • ISBN: 1782164847
  • ISBN-13: 9781782164845
  • Category: Full-text search engines
  • Stocked on order (approx. 3 to 4 weeks)

Product description

Nobody pretends indexing data with Apache Solr is a walk in the park, but this book eases the path with plain-language explanations and engaging projects. Perfect for developers with sophisticated indexing ambitions.

Overview

  • Learn something new in an Instant! A short, fast, focused guide delivering immediate results
  • Take the most basic schema and extend it to support multi-lingual, multi-field searches
  • Make Solr pull data from a variety of existing sources
  • Discover different pathways to acquire and normalize data and content

In Detail

Content and data searching is a very important part of the modern user experience, and before something can be searched, it has to be indexed. Indexing is a hidden part of the process that has a surprisingly strong impact on the overall user experience. From speed, to faceting, to multilingual support, everything depends on correct indexing.

Instant Apache Solr for Indexing Data How-to is an example-driven guide that will take you on a journey from the basic collection of data to a multi-lingual, multi-field, multi-type schema. By the end of the book, you will know how to get your data ready for searches and how to tune the process to support the search use cases you need.

Instant Apache Solr for Indexing Data How-to is a friendly, practical guide that will show you how to index your data with Solr. This book will explain how Solr’s basic building blocks actually work and fit together. You will then explore additional settings, pipelines, and configuration changes to achieve ever more complex goals. Next, you will cover how to push data into Solr and when to get Solr to pull the data. Finally, you will master indexing textual and binary content before enabling multilingual content to be searched.

What you will learn from this book

  • Produce a basic Solr schema ready for experimentation and exploration
  • Run several collections on one Solr server
  • Import, search, and facet simple and multi-valued fields
  • Create your own field type analyzer chains for ultimate indexing flexibility
  • Detect, index, and partition multi-lingual content
  • Use CSV, XML, JSON, and binary formats to get data into Solr
  • Pull data from external files and databases using DataImportHandler
  • Write a Java client using the SolrJ library in both remote and embedded mode
  • Change data already indexed using atomic updates
  • Reshape incoming data with UpdateRequestProcessors
  • Control the visibility of data with soft and hard commits

Approach

This book is filled with practical, step-by-step instructions and clear explanations for the most important and useful tasks. It is written in a friendly manner, with recipes covering important indexing techniques and methods in Apache Solr.

Who this book is written for

This book is for developers who want to dive deeper into Solr. Regardless of whether you are just starting with Solr or have already built your first collection by copying and modifying examples, this book will take you through the complicated steps of indexing your data with Solr.
