Over Fitting Problem

Over fitting means that the hypothesis can get good predictions on the traning set but can not get good predictions outside the tranining set. The figure looks like the below second one.


Andrew Ng

In order to solve this problem , we use the algorithm regularization,

Regularization 

Regularization means we penalize some theta values (make theta values small enough) 

We discuss 2 situations , 

(1)Linear Regression

We redefine the cost function like this:


Andrew Ng

If we increase lambda which means we penalize theta more.Be careful ! lambda can not be too large ,which may cause under fitting.

To minimize the above cost function , we can use either gradient decent or normal equation shown below:


Andrew Ng

gradient decent


Andrew Ng

normal equation

(2)logistic regression

we redefine the cost function:


Andrew Ng

 

Also we can use gradient decent to solve this 


Andrew Ng