Introduction to Data Science: Data Analysis and Prediction Algorithms with R

Irizarry, Rafael A.

Product Description

Introduction to Data Science: Data Analysis and Prediction Algorithms with R introduces concepts and skills that can help you tackle real-world data analysis challenges. It covers concepts from probability, statistical inference, linear regression, and machine learning. It also helps you develop skills such as R programming, data wrangling, data visualization, predictive algorithm building, file organization with UNIX/Linux shell, version control with Git and GitHub, and reproducible document preparation.

This book is a textbook for a first course in data science. No previous knowledge of R is necessary, although some experience with programming may be helpful. The book is divided into six parts: R, data visualization, statistics with R, data wrangling, machine learning, and productivity tools. Each part has several chapters, and each chapter is meant to be presented as one lecture.

The author uses motivating case studies that realistically mimic a data scientist's experience. He starts by asking specific questions and answers them through data analysis, so concepts are learned as a means of answering the questions. The case studies include US murder rates by state, self-reported student heights, trends in world health and economics, the impact of vaccines on infectious disease rates, the financial crisis of 2007-2008, election forecasting, building a baseball team, image processing of hand-written digits, and movie recommendation systems.

The statistical concepts used to answer the case study questions are only briefly introduced, so complementing this book with a probability and statistics textbook is highly recommended for an in-depth understanding of these concepts. If you read and understand the chapters and complete the exercises, you will be prepared to learn the more advanced concepts and skills needed to become an expert.

A complete solutions manual is available to registered instructors who require the text for a course.

About the Author

Rafael A. Irizarry is professor of data sciences at the Dana-Farber Cancer Institute, professor of biostatistics at Harvard, and a fellow of the American Statistical Association. Dr. Irizarry is an applied statistician who has worked over the last 20 years in diverse areas, including genomics, sound engineering, and public health. He disseminates solutions to data analysis challenges as open source software, and these tools are widely downloaded and used. Prof. Irizarry has also developed and taught several data science courses at Harvard, as well as popular online courses.
