Data Clustering with Python: From Theory to Implementation
Gan, Guojun
- Publisher: CRC
- Publication date: 2025-09-15
- Price: $4,120
- VIP price: $3,914 (5% off)
- Language: English
- Pages: 248
- Binding: Hardcover (also called cloth, retail trade, or trade)
- ISBN: 1032971568
- ISBN-13: 9781032971568
- Related categories: Python, Data-mining
Overseas special-order title (requires separate checkout)
Product Description
Data clustering, an interdisciplinary field with diverse applications, has gained increasing popularity since its origins in the 1950s. Over the past six decades, researchers from various fields have proposed numerous clustering algorithms. In 2011, I wrote a book on implementing clustering algorithms in C++ using object-oriented programming. While C++ offers efficiency, its steep learning curve makes it less suitable for rapid prototyping. Since then, Python has surged in popularity and has been the most widely used programming language since 2022. Its simplicity and extensive scientific libraries make it an excellent choice for implementing clustering algorithms.
Features:
- Introduction to Python programming fundamentals
- Overview of key concepts in data clustering
- Implementation of popular clustering algorithms in Python
- Practical examples of applying clustering algorithms to datasets
- Access to associated Python code on GitHub
This book extends my previous work by implementing clustering algorithms in Python. Unlike the object-oriented approach in C++, this book uses a procedural programming style, as Python allows many clustering algorithms to be implemented concisely. The book is divided into two parts: the first introduces Python and key libraries like NumPy, Pandas, and Matplotlib, while the second covers clustering algorithms, including hierarchical and partitional methods. Each chapter includes theoretical explanations, Python implementations, and practical examples, with comparisons to scikit-learn where applicable.
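To give a flavor of the procedural style described above, here is a minimal sketch of k-means, a standard partitional method. It is not taken from the book or its GitHub repository: the function names `squared_distance` and `kmeans`, the toy data, and the evenly spaced initialization are assumptions made for illustration, and plain Python lists stand in for NumPy arrays to keep the example dependency-free.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100):
    """Lloyd's k-means on a list of points; returns (centers, labels)."""
    n = len(points)
    # Assumption for illustration: take initial centers at evenly spaced
    # positions in the input. Random or k-means++ seeding is more common.
    centers = [list(points[i * n // k]) for i in range(k)]
    labels = [-1] * n
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # Assignments are stable, so the algorithm has converged.
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # Keep the old center if a cluster is empty.
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return centers, labels


# Toy data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centers, labels = kmeans(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

On real data, a result like this would typically be checked against a library implementation such as scikit-learn's `KMeans`, keeping in mind that cluster label numbering can differ between implementations.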
This book is ideal for anyone interested in clustering algorithms, with no prior Python experience required. The complete source code is available at https://github.com/ganml/dcpython.
About the Author
Guojun Gan is an Associate Professor in the Department of Mathematics at the University of Connecticut, where he has been since August 2014. Prior to that, he worked at a large life insurance company in Toronto, Canada for six years and a hedge fund in Oakville, Canada for one year. He earned a BS degree from Jilin University, Changchun, China, in 2001 and MS and PhD degrees from York University, Toronto, Canada, in 2003 and 2007, respectively.