Deep Reinforcement Learning with Python: RLHF for Chatbots and Large Language Models

Name: Deep Reinforcement Learning with Python: RLHF for Chatbots and Large Language Models
Price: 2043 TWD
Availability: InStock
Author: Sanghi, Nimish
ISBN: 9798868802720

Publisher: Apress
Publication date: 2024-07-15
List price: $2,150
Member price: 5% off, $2,043
Language: English
Pages: 634
Binding: Quality Paper (also called trade paper)
ISBN: 9798868802720
ISBN-13: 9798868802720
Related categories: Chatbot, LangChain, Python, Programming Languages, Reinforcement, DeepLearning

Ships immediately (stock: 1)

Product Description

Gain a theoretical understanding of the most popular libraries in deep reinforcement learning (deep RL). This new edition focuses on the latest advances in deep RL using a learn-by-coding approach, allowing readers to assimilate and replicate the latest research in this field.

New agent environments, ranging from games and robotics to finance, are explained to help you try different ways to apply reinforcement learning. A chapter on multi-agent reinforcement learning covers how multiple agents compete, while another chapter focuses on the widely used deep RL algorithm proximal policy optimization (PPO). You'll see how reinforcement learning from human feedback (RLHF) has been used to improve the conversational capabilities of chatbots built on Large Language Models, such as ChatGPT.

You'll also review the steps for using the code on multiple cloud systems and deploying models on platforms such as Hugging Face Hub. The code is provided as Jupyter notebooks, which can be run on Google Colab and other similar deep learning cloud platforms, allowing you to tailor the code to your own needs.

Whether it's for applications in gaming, robotics, or Generative AI, Deep Reinforcement Learning with Python will help keep you ahead of the curve.

What You'll Learn

- Explore Python-based RL libraries, including StableBaselines3 and CleanRL
- Work with diverse RL environments like Gymnasium, Pybullet, and Unity ML
- Understand instruction finetuning of Large Language Models using RLHF and PPO
- Study training and optimization techniques using HuggingFace, Weights and Biases, and Optuna

Who This Book Is For

Software engineers and machine learning developers eager to sharpen their understanding of deep RL and acquire practical skills in implementing RL algorithms from scratch.

About the Author

Nimish is a seasoned entrepreneur and angel investor with a rich portfolio of tech ventures in SaaS and AI-driven automation across India, the US, and Singapore. He has over 30 years of work experience. Nimish ventured into entrepreneurship in 2006 after holding leadership roles at global corporations like PwC, IBM, and Oracle.

Nimish holds an MBA from the Indian Institute of Management, Ahmedabad, India (IIMA), and a Bachelor of Technology in Electrical Engineering from the Indian Institute of Technology, Kanpur, India (IITK).