Veridical Data Science: The Practice of Responsible Data Analysis and Decision Making (Hardcover)

Yu, Bin; Barter, Rebecca L.

  • Publisher: MIT Press
  • Publication date: 2024-10-15
  • List price: $2,160
  • VIP price: $2,117 (2% off)
  • Language: English
  • Pages: 526
  • Binding: Hardcover (also called cloth, retail trade, or trade)
  • ISBN: 0262049198
  • ISBN-13: 9780262049191
  • Related categories: Data Science
  • Ships immediately (in stock: 1)

Product description

Using real-world data case studies, this innovative and accessible textbook introduces an actionable framework for conducting trustworthy data science.

Most textbooks present data science as a linear analytic process involving a set of statistical and computational techniques without accounting for the challenges intrinsic to real-world applications. Veridical Data Science, by contrast, embraces the reality that most projects begin with an ambiguous domain question and messy data; it acknowledges that datasets are mere approximations of reality while analyses are mental constructs.
Bin Yu and Rebecca Barter employ the innovative Predictability, Computability, and Stability (PCS) framework to assess the trustworthiness and relevance of data-driven results relative to three sources of uncertainty that arise throughout the data science life cycle: the human decisions and judgment calls made during data collection, cleaning, and modeling. By providing real-world data case studies, intuitive explanations of common statistical and machine learning techniques, and supplementary R and Python code, Veridical Data Science offers a clear and actionable guide for conducting responsible data science. Requiring little background knowledge, this lucid, self-contained textbook provides a solid foundation and principled framework for future study of advanced methods in machine learning, statistics, and data science.

 

  • Presents the Predictability, Computability, and Stability (PCS) methodology for producing trustworthy data-driven results
  • Teaches how a data science project should be conducted from beginning to end, including extensive discussion of the data scientist's decision-making process
  • Cultivates critical thinking throughout the entire data science life cycle
  • Provides practical examples and illuminating case studies of real-world data analysis problems with associated code, exercises, and solutions
  • Suitable for advanced undergraduate and graduate students, domain scientists, and practitioners

About the authors

Bin Yu is Chancellor's Distinguished Professor and Class of 1936 Second Chair in Statistics, EECS, and Computational Biology at the University of California, Berkeley, a 2006 Guggenheim Fellow, and a member of the US National Academy of Sciences and the American Academy of Arts and Sciences.

Rebecca L. Barter is Research Assistant Professor in Epidemiology at the University of Utah.
