Learning Predictive Analytics with R

Eric Mayor

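  • Publisher: Packt Publishing
  • Publication date: 2015-09-25
  • Price: $2,170
  • VIP price: $2,062 (5% off)
  • Language: English
  • Pages: 321
  • Binding: Paperback
  • ISBN: 1782169350
  • ISBN-13: 9781782169352
  • Related categories: Machine Learning
  • Overseas special-order title (requires separate checkout)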

Product Description

Get to grips with key data visualization and predictive analytic skills using R

About This Book

  • Acquire predictive analytic skills using various tools of R
  • Make predictions about future events by discovering valuable information from data using R
  • Comprehensible guidelines that focus on predictive model design with real-world data

Who This Book Is For

If you are a statistician, chief information officer, data scientist, ML engineer, ML practitioner, quantitative analyst, or student of machine learning, this is the book for you. Basic knowledge of R is helpful, but readers without previous experience of programming in R will also be able to use the tools in the book.

What You Will Learn

  • Customize R by installing and loading new packages
  • Explore the structure of data using clustering algorithms
  • Turn unstructured text into ordered data, and acquire knowledge from the data
  • Classify your observations using Naive Bayes, k-NN, and decision trees
  • Reduce the dimensionality of your data using principal component analysis
  • Discover association rules using Apriori
  • Understand how statistical distributions can help retrieve information from data using correlations, linear regression, and multilevel regression
  • Use PMML to deploy the models generated in R

In Detail

R is statistical software used for data analysis. There are two main types of learning from data: unsupervised learning, where the structure of the data is extracted automatically; and supervised learning, where a labeled part of the data is used to learn the relationship between the other attributes and the classes or scores of a target attribute. Because important information is often hidden in large volumes of data, R helps to extract it with its many standard and cutting-edge statistical functions.

This book is packed with easy-to-follow guidelines that explain the workings of the many key data mining tools of R, which are used to discover knowledge from your data.

You will learn how to perform key predictive analytics tasks using R, such as training and testing predictive models for classification and regression tasks and scoring new data sets. All chapters will guide you in acquiring the skills in a practical way. Most chapters also include a theoretical introduction that will sharpen your understanding of the subject matter and invite you to go further.

The book familiarizes you with the most common data mining tools of R, such as k-means, hierarchical clustering, linear regression, association rules, principal component analysis, multilevel modeling, k-NN, Naive Bayes, decision trees, and text mining. It also describes visualization techniques using the basic visualization tools of R, as well as lattice for visualizing patterns in grouped data. This book is invaluable for anyone fascinated by the data mining opportunities offered by GNU R and its packages.

Style and approach

This is a practical book that uses tutorials to analyze compelling data about life, health, and death. The approach to interpreting data is illustrated with the book's own data sets, but it can also be applied to other data.
