Hadoop Cluster Deployment

Danil Zburivsky

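  • Publisher: Packt Publishing
  • Publication date: 2013-11-25
  • List price: $1,670
  • VIP price: $1,587 (5% off)
  • Language: English
  • Pages: 126
  • Binding: Paperback
  • ISBN: 1783281715
  • ISBN-13: 9781783281718
  • Related category: Hadoop
  • Overseas special-order title (requires separate checkout)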

Product Description

Construct a modern Hadoop data platform effortlessly and gain insights into how to manage clusters efficiently

Overview

  • Choose the hardware and Hadoop distribution that best suits your needs
  • Get more value out of your Hadoop cluster with Hive, Impala, and Sqoop
  • Learn useful tips for performance optimization and security

In Detail

Big Data is the hottest trend in the IT industry at the moment. Companies are realizing the value of collecting, retaining, and analyzing as much data as possible. They are therefore rushing to implement the next generation of data platform, and Hadoop is the centerpiece of these platforms.

This practical guide is filled with examples which will show you how to successfully build a data platform using Hadoop. Step-by-step instructions will explain how to install, configure, and tie all major Hadoop components together. This book will allow you to avoid common pitfalls, follow best practices, and go beyond the basics when building a Hadoop cluster.

This book will walk you through the process of building a Hadoop cluster from the ground up. By using practical examples and command samples, you will be able to get a cluster up and running in no time, and you will also gain a deep understanding of how various Hadoop components work and interact with each other.

You will learn how to pick the right hardware for different types of Hadoop clusters and how the various Hadoop distributions differ. By the end of this book, you will be able to install and configure several of the most popular Hadoop ecosystem projects, including Hive, Impala, and Sqoop, and you will also get a brief look at the pros and cons of using Hadoop in the cloud.

What you will learn from this book

  • Choose the optimal hardware configuration for your Hadoop cluster
  • Decipher the differences between various Hadoop versions and distributions
  • Make your cluster crash-proof with NameNode High Availability
  • Learn tips and tricks for the JobTracker, TaskTrackers, and DataNodes
  • Discover the most important Hadoop ecosystem projects
  • Get more value out of your cluster by using SQL with Hive and real-time query processing with Impala
  • Set up a proper permissions model for your cluster
  • Secure Hadoop with Kerberos
  • Deploy a Hadoop cluster in a cloud environment

Approach

This book is a step-by-step tutorial filled with practical examples that show you how to build and manage a Hadoop cluster and handle its intricacies.

Who this book is written for

This book is ideal for database administrators, data engineers, and system administrators, and it will serve as a valuable reference if you are planning to use the Hadoop platform in your organization. You are expected to have basic Linux skills, since all the examples in this book use that operating system. It also helps to have access to test hardware or virtual machines so that you can follow the examples in the book.
