In this part, I am going to talk about how to minimize the cost function.

In Andrew's class, he talked about a method called gradient descent.


The gradient descent update rule is

theta_j := theta_j - alpha * (∂/∂theta_j) J(theta0, theta1)    (repeat for j = 0 and j = 1 until convergence)

where

J(theta0, theta1) = (1 / (2m)) * Σ_{i=1}^{m} (h(x^(i)) - y^(i))^2

is the cost function. Of course, there can be more variables than theta0 and theta1.
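As a small sketch of computing this cost, here is the squared-error cost from linear regression, assuming the hypothesis h(x) = theta0 + theta1 * x (the function name and the sample data are made up for illustration):

```python
def cost(theta0, theta1, xs, ys):
    # Squared-error cost J(theta0, theta1) for linear regression,
    # assuming the hypothesis h(x) = theta0 + theta1 * x.
    m = len(xs)
    total = sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys))
    return total / (2 * m)

# Example: data on the line y = 2x is fit perfectly by theta0 = 0, theta1 = 2,
# so the cost is zero there.
print(cost(0.0, 2.0, [1, 2, 3], [2, 4, 6]))  # -> 0.0
```

Gradient descent moves theta0 and theta1 downhill on this cost surface until the cost stops decreasing.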

Alpha is the learning rate. Alpha cannot be too big, or theta will not converge. Alpha can be a fixed value; it does not need to decrease over time for gradient descent to converge. When calculating, theta0 and theta1 must be updated simultaneously.
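To make the simultaneous update concrete, here is a minimal gradient descent sketch for the two-parameter case. It assumes the squared-error cost with hypothesis h(x) = theta0 + theta1 * x; the function name, default alpha, step count, and sample data are all made up for illustration:

```python
def gradient_descent(xs, ys, alpha=0.1, steps=1000):
    # Minimize the squared-error cost for h(x) = theta0 + theta1 * x.
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(steps):
        # Compute BOTH partial derivatives from the current thetas first...
        grad0 = sum(theta0 + theta1 * x - y for x, y in zip(xs, ys)) / m
        grad1 = sum((theta0 + theta1 * x - y) * x for x, y in zip(xs, ys)) / m
        # ...then update both parameters at once: the simultaneous update.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1

# On data from the line y = 2x, the thetas should approach 0 and 2.
t0, t1 = gradient_descent([1, 2, 3], [2, 4, 6])
print(t0, t1)
```

Note the tuple assignment at the end of the loop: if theta0 were updated first and then used to compute grad1, the two updates would no longer be simultaneous, which is exactly the mistake this update order avoids.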