Hadoop Cluster Deployment

Danil Zburivsky

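  • Publisher: Packt Publishing
  • Publication date: 2013-11-25
  • List price: $1,560
  • Member price: $1,482 (5% off)
  • Language: English
  • Pages: 126
  • Binding: Paperback
  • ISBN: 1783281715
  • ISBN-13: 9781783281718
  • Related categories: Hadoop
  • Ordered from the supplier once you place your order (about 3 to 4 weeks)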

Product Description

Construct a modern Hadoop data platform effortlessly and gain insights into how to manage clusters efficiently

Overview

  • Choose the hardware and Hadoop distribution that best suits your needs
  • Get more value out of your Hadoop cluster with Hive, Impala, and Sqoop
  • Learn useful tips for performance optimization and security

In Detail

Big Data is the hottest trend in the IT industry at the moment. Companies are realizing the value of collecting, retaining, and analyzing as much data as possible. They are therefore rushing to implement the next generation of data platform, and Hadoop is the centerpiece of these platforms.

This practical guide is filled with examples that show you how to build a data platform using Hadoop. Step-by-step instructions explain how to install, configure, and tie together all major Hadoop components. This book will help you avoid common pitfalls, follow best practices, and go beyond the basics when building a Hadoop cluster.

This book will walk you through the process of building a Hadoop cluster from the ground up. By using practical examples and command samples, you will be able to get a cluster up and running in no time, and you will also gain a deep understanding of how various Hadoop components work and interact with each other.

You will learn how to pick the right hardware for different types of Hadoop clusters and about the differences between various Hadoop distributions. By the end of this book, you will be able to install and configure several of the most popular Hadoop ecosystem projects including Hive, Impala, and Sqoop, and you will also be given a sneak peek into the pros and cons of using Hadoop in the cloud.

What you will learn from this book

  • Choose the optimal hardware configuration for your Hadoop cluster
  • Decipher the differences between various Hadoop versions and distributions
  • Make your cluster crash-proof with NameNode High Availability
  • Learn tips and tricks for the JobTracker, TaskTrackers, and DataNodes
  • Discover the most important Hadoop ecosystem projects
  • Get more value out of your cluster by using SQL with Hive and real-time query processing with Impala
  • Set up a proper permissions model for your cluster
  • Secure Hadoop with Kerberos
  • Deploy a Hadoop cluster in a cloud environment

Approach

This book is a step-by-step tutorial filled with practical examples that show you how to build and manage a Hadoop cluster, including its intricacies.

Who this book is written for

This book is ideal for database administrators, data engineers, and system administrators, and it will serve as an invaluable reference if you are planning to use the Hadoop platform in your organization. You are expected to have basic Linux skills, since all the examples in this book use Linux. Access to test hardware or virtual machines is also useful for following the examples in the book.
