
1.1 Example

 - Database mining

       Large datasets from the growth of automation and the web.

       E.g., web click data, medical records, biology, engineering.

 - Applications that can't be programmed by hand

       E.g., autonomous helicopters, handwriting recognition, most of Natural Language Processing (NLP), computer vision.

 - Self-customizing programs

       E.g., Amazon and Netflix product recommendations.

 - Understanding human learning (brain, real AI)

1.2 What is machine learning?


       1. Arthur Samuel (1959). Machine Learning: Field of study that gives computers the ability to learn without being explicitly programmed.

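       2. Tom Mitchell (1998). Well-posed Learning Problem: A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E. (Taking spam filtering as an example: T is classifying emails as spam or not spam, E is the experience of seeing emails that have been labeled as spam or not spam, and P is the fraction of emails classified correctly. In other words, the program learns if experience E makes it better at T, and that improvement shows up as an increase in P.)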

1.3 Supervised Learning

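        A supervised learning algorithm needs a dataset in which the correct answers are already given ("right answers given"). In housing price prediction, for example, the algorithm knows the correct price for every house in the dataset, and its job is to produce correct prices for further houses.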



1.3.1 Regression

        Predict a continuous-valued output (e.g., price). For example, given the size of a house, predict its price.

1.3.2 Classification

        Predict a discrete-valued output (e.g., 0 or 1). For example, classifying a breast tumor as malignant or benign.

1.4 Unsupervised Learning


2 Linear Regression with One Variable

2.1 Model and Cost Function

2.1.1 Model Representation





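x denotes the input (feature)

y denotes the output (target value)

m denotes the number of examples in the training set

(x, y) denotes a single training example

(x^(i), y^(i)) denotes the i-th example in the training set

h denotes the hypothesis function, a mapping from input to output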
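The notation above can be made concrete with a small sketch. It assumes the usual form of the hypothesis for one-variable linear regression, h(x) = theta0 + theta1 * x; the parameter names `theta0` and `theta1` and the training data are illustrative and do not come from the original notes.

```python
def hypothesis(theta0: float, theta1: float, x: float) -> float:
    """Assumed linear hypothesis h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


# Made-up training set of (x, y) pairs, for illustration only.
training_set = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

# m is the number of training examples.
m = len(training_set)

# (x^(i), y^(i)) is the i-th example; the notation is 1-based,
# so the 2nd example sits at Python index 1.
x_2, y_2 = training_set[1]

# Prediction of h for x^(2) with hypothetical parameters theta0=1, theta1=2.
prediction = hypothesis(1.0, 2.0, x_2)
```

With these particular parameters the prediction for the second example equals its target value, which is the situation a well-fitted h aims for on every example.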


Figure 2.1 The linear regression process


2.1.2 Cost Function





Figure 2.2 Modeling error

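        The choice of parameters determines the gap between the model's predictions and the actual values in the training set. (The blue lines in the figure are the modeling error.)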
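As a rough illustration of how parameter choice drives the modeling error, the sketch below assumes the conventional squared-error cost for one-variable linear regression, J(theta0, theta1) = (1 / (2m)) * sum over i of (h(x^(i)) - y^(i))^2, with h(x) = theta0 + theta1 * x. The function name `cost` and the data are hypothetical.

```python
def cost(theta0, theta1, xs, ys):
    """Assumed squared-error cost J = (1 / (2m)) * sum((h(x) - y)^2)."""
    m = len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        # Modeling error for one example: prediction minus actual value.
        error = (theta0 + theta1 * x) - y
        total += error ** 2
    return total / (2 * m)


# Made-up training data lying exactly on the line y = x.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

# Parameters matching the data leave no modeling error.
cost_good = cost(0.0, 1.0, xs, ys)

# A flatter line misses every point, so the cost is larger.
cost_worse = cost(0.0, 0.5, xs, ys)
```

Comparing `cost_good` and `cost_worse` shows the idea in the paragraph above: the further the predictions are from the training targets, the larger the cost.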

Figure 2.3 Plot of the cost function
