Foundations of Data Science - Microsoft Research_Algorithms

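This book introduces the mathematical and algorithmic foundations of data science, including machine learning, high-dimensional geometry, and the analysis of large networks. Topics include the counterintuitive properties of high-dimensional data; important linear algebra techniques such as singular value decomposition; the theory of random walks and Markov chains; the fundamentals of machine learning and its important algorithms; clustering algorithms and their analysis; probabilistic models for large networks; and representation learning, including topic modelling and non-negative matrix factorization, wavelets, and compressed sensing. The book develops important probabilistic techniques, including the law of large numbers, tail inequalities, the analysis of random projections, generalization guarantees in machine learning, and moment methods for analyzing phase transitions in large random graphs. It also discusses important structural and complexity measures such as matrix norms and VC-dimension. The book is suitable for both undergraduate and graduate courses on the design and analysis of algorithms for data.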


Product description

This book is aimed at both undergraduate and graduate courses in computer science on the design and analysis of algorithms for data. The material will give students the mathematical background they need for further study and research in machine learning, data mining, and data science more generally.

Reviews and abstract

This book provides an introduction to the mathematical and algorithmic foundations of data science, including machine learning, high-dimensional geometry, and analysis of large networks. Topics include the counterintuitive nature of data in high dimensions, important linear algebraic techniques such as singular value decomposition, the theory of random walks and Markov chains, the fundamentals of and important algorithms for machine learning, algorithms and analysis for clustering, probabilistic models for large networks, representation learning including topic modelling and non-negative matrix factorization, wavelets and compressed sensing. Important probabilistic techniques are developed including the law of large numbers, tail inequalities, analysis of random projections, generalization guarantees in machine learning, and moment methods for analysis of phase transitions in large random graphs. Additionally, important structural and complexity measures are discussed such as matrix norms and VC-dimension. This book is suitable for both undergraduate and graduate courses in the design and analysis of algorithms for data.

How to get the e-book

Follow the WeChat official account "知识图谱AI大本营" and reply science01 in the account's chat to get the Baidu Netdisk link for the e-book.