Data Contracts: Developing Production-Grade Pipelines at Scale

Chad Sanderson, Mark Freeman, B. E. Schmidt

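  • Publisher: O'Reilly
  • Publication date: 2025-12-09
  • Price: $2,840
  • VIP price: $2,698 (5% off)
  • Language: English
  • Pages: 346
  • Binding: Quality Paper - also called trade paper
  • ISBN: 109815763X
  • ISBN-13: 9781098157630
  • Related categories: Data-mining
  • Overseas special-order title (must be checked out separately)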

Product description

Poor data quality can cause major problems for data teams, from breaking revenue-generating data pipelines to losing the trust of data consumers. Despite the importance of data quality, many data teams still struggle to avoid these issues, especially when their data is sourced from upstream workflows outside of their control. The solution: data contracts. Data contracts enable high-quality, well-governed data assets by documenting expectations of the data, establishing ownership of data assets, and then automatically enforcing these constraints within the CI/CD workflow.
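
To make the three ideas in that description concrete (documented expectations, a named owner, automated enforcement), here is a minimal Python sketch. It is not taken from the book and does not use any particular data contract tool; the `FieldSpec`, `DataContract`, and `validate` names, the `orders` dataset, and the `checkout-team` owner are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """One documented expectation about a column (hypothetical structure)."""
    name: str
    dtype: type
    nullable: bool = False


@dataclass(frozen=True)
class DataContract:
    """Expectations plus an explicit owner for a dataset (hypothetical structure)."""
    dataset: str
    owner: str
    fields: tuple


def validate(contract, rows):
    """Return a list of readable violations; an empty list means the rows conform."""
    violations = []
    for i, row in enumerate(rows):
        for spec in contract.fields:
            # In this sketch a field must be present even if it is nullable.
            if spec.name not in row:
                violations.append(f"row {i}: missing field '{spec.name}'")
                continue
            value = row[spec.name]
            if value is None:
                if not spec.nullable:
                    violations.append(f"row {i}: '{spec.name}' must not be null")
            elif not isinstance(value, spec.dtype):
                violations.append(
                    f"row {i}: '{spec.name}' expected {spec.dtype.__name__}, "
                    f"got {type(value).__name__}"
                )
    return violations


# Assumed example contract; the dataset, owner, and fields are made up.
ORDERS_CONTRACT = DataContract(
    dataset="orders",
    owner="checkout-team",
    fields=(
        FieldSpec("order_id", str),
        FieldSpec("amount_cents", int),
        FieldSpec("coupon_code", str, nullable=True),
    ),
)
```

In a CI/CD job, a script along these lines could run `validate` against sample rows from an upstream change and exit with a nonzero status when the returned list is not empty, so that a contract-breaking change fails the build instead of reaching consumers.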

This practical book introduces data contract architecture with a clear definition of data contracts, explains why the data industry needs them, and shares real-world use cases of data contracts in production. In addition, you'll learn how to implement components of the data contract architecture and understand how they're used in the data lifecycle. Finally, you'll build a case for implementing data contracts in your organization.

Authors Chad Sanderson and Mark Freeman will help you:

  • Explore real-world applications of data contracts within the industry
  • Understand how to apply each component of this architecture, such as CI/CD, monitoring, version control data, and more
  • Learn how to implement data contracts using open source tools
  • Examine ways to resolve data quality issues using data contract architecture
  • Measure the impact of implementing a data contract in your organization
  • Develop a strategy to determine how data contracts will be used in your organization
