Databricks Data Intelligence Platform: Unlocking the GenAI Revolution
Gupta, Nikhil; Yip, Jason
- Publisher: Apress
- Publication date: 2024-10-13
- Price: $1,710
- VIP price: 5% off, $1,625
- Language: English
- Pages: 473
- Binding: Quality Paper - also called trade paper
- ISBN: 9798868804434
- ISBN-13: 9798868804434
Overseas special-order book (must be checked out separately)
Product description
This book is your comprehensive guide to building robust Generative AI solutions using the Databricks Data Intelligence Platform. Databricks is the fastest-growing data platform offering unified analytics and AI capabilities within a single governance framework, enabling organizations to streamline their data processing workflows, from ingestion to visualization. Additionally, Databricks provides features to train a high-quality large language model (LLM), whether you are looking for Retrieval-Augmented Generation (RAG) or fine-tuning.
Databricks offers a scalable and efficient solution for processing large volumes of both structured and unstructured data, facilitating advanced analytics, machine learning, and real-time processing. In today's GenAI world, Databricks plays a crucial role in empowering organizations to extract value from their data effectively, driving innovation and gaining a competitive edge in the digital age. This book will not only help you master the Data Intelligence Platform but also help power your enterprise to the next level with a bespoke LLM unique to your organization.
Beginning with foundational principles, the book starts with a platform overview and explores features and best practices for ingestion, transformation, and storage with Delta Lake. Advanced topics include leveraging Databricks SQL for querying and visualizing large datasets, ensuring data governance and security with Unity Catalog, and deploying machine learning and LLMs using Databricks MLflow for GenAI. Through practical examples, insights, and best practices, this book equips solution architects and data engineers with the knowledge to design and implement scalable data solutions, making it an indispensable resource for modern enterprises.
Whether you are new to Databricks and trying to learn a new platform, a seasoned practitioner building data pipelines, data science models, or GenAI applications, or even an executive who wants to communicate the value of Databricks to customers, this book is for you. With its extensive feature and best practice deep dives, it also serves as an excellent reference guide if you are preparing for Databricks certification exams.
What You Will Learn
- Foundational principles of Lakehouse architecture
- Key features including Unity Catalog, Databricks SQL (DBSQL), and Delta Live Tables
- Databricks Data Intelligence Platform and its key functionalities
- Building and deploying GenAI Applications from data ingestion to model serving
- Databricks pricing, platform security, DBRX, and many more topics
Who This Book Is For
Solution architects, data engineers, data scientists, Databricks practitioners, and anyone who wants to deploy their GenAI solutions with the Data Intelligence Platform. This is also a handbook for senior execs who need to communicate the value of Databricks to customers. People who are new to the Databricks Platform and want comprehensive insights will find the book accessible.
About the authors
Nikhil Gupta is a seasoned data professional with over 18 years of experience in big data technologies, driving innovation and strategic growth in the field. As a Solution Architect at Databricks, he leverages his expertise to help customers across various industries, including retail, CPG, financial services, banking, and manufacturing, modernize their data and AI implementations on the Databricks platform. His expertise spans a range of big data technologies, including data warehousing, data lakes, and real-time data processing, making him a trusted advisor for Fortune 500 companies.
Jason Yip is a data and machine learning architect. He currently serves as Director of Data and AI at Tredence, a leading data science and analytics company. He advises Fortune 500 companies on implementing data and Generative AI strategies on the cloud. He serves on multiple advisory boards at Databricks, including the Partner Product Advisory Board and the Solution Architect Champion Advisory Board. He is a top voice on Databricks and a former Microsoft employee who successfully led the Microsoft Corporate Finance big data transformation using Databricks.