In this part, I am going to talk about how to minimize the cost function.
In Andrew's class, he talked about a method called gradient descent.
J(theta 0, theta 1) is the cost function; of course, there can be more variables than just theta 0 and theta 1.
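The standard gradient descent update rule (as taught in the class) can be written as:

```latex
\text{repeat until convergence: } \quad
\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)
\qquad (j = 0, 1)
```

Each parameter moves a small step in the direction that decreases the cost the fastest.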
Alpha is the learning rate. Alpha cannot be too big, or theta will not converge; alpha can simply be a fixed value. When computing the updates, theta 0 and theta 1 must be updated simultaneously: compute both gradients first, then update both parameters.
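Here is a minimal sketch of batch gradient descent for one-variable linear regression with the squared-error cost. The function name and defaults are my own choices, not from the course; note how both gradients are computed before either theta is changed, which is what "simultaneous update" means.

```python
def gradient_descent(xs, ys, alpha=0.1, iterations=1000):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent.

    alpha is the (fixed) learning rate; a value too large makes
    theta diverge instead of converging.
    """
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        # Prediction errors for the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Compute BOTH partial derivatives of the squared-error cost
        # before touching theta0 or theta1 -- simultaneous update.
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1
```

For example, fitting points generated by y = 2x + 1 should drive theta 0 toward 1 and theta 1 toward 2.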