# Coursera: Machine Learning - Andrew Ng (Week 1) Quiz - Linear Regression with One Variable

## These solutions are for reference only. Try to solve the quiz on your own first; refer to these solutions only if you get stuck.

The quiz has different sets of questions; we have provided the variations of particular questions at the end. Read each question carefully before marking your answer.

--------------------------------------------------------------------

## Linear Regression with One Variable

## TOTAL POINTS 5

Explanation: Since J(θ0, θ1) = 0, every training example satisfies y = hθ(x) = θ0 + θ1x. Using any two rows of the table, solve for θ0 and θ1.

Explanation: Setting x = 2, we have hθ(x) = θ0 + θ1x = 0 + (1.5)(2) = 3.
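A minimal sketch of that calculation, assuming two made-up table rows (1, 1.5) and (2, 3) that lie on the fitted line (they are stand-ins for whatever rows your version of the table shows):

```python
# Sketch: recover theta0 and theta1 from any two training points that
# lie on the line y = theta0 + theta1 * x. The points (1, 1.5) and
# (2, 3) are made-up stand-ins for two rows of the quiz's table.
def solve_line(p1, p2):
    (x1, y1), (x2, y2) = p1, p2
    theta1 = (y2 - y1) / (x2 - x1)   # slope from the two points
    theta0 = y1 - theta1 * x1        # intercept
    return theta0, theta1

theta0, theta1 = solve_line((1.0, 1.5), (2.0, 3.0))
print(theta0, theta1)         # 0.0 1.5
print(theta0 + theta1 * 2)    # h_theta(2) = 3.0, matching the explanation
```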

Question 4 explanation:

| Answer | Statement | Explanation |
|---|---|---|
| True | If θ0 and θ1 are initialized at a local minimum, then one iteration will not change their values. | At a local minimum the derivative (gradient) is zero, so gradient descent will not change the parameters. |
| False | Setting the learning rate to be very small is not harmful, and can only speed up the convergence of gradient descent. | If the learning rate is small, gradient descent takes an extremely small step on each iteration, so this would actually slow down (rather than speed up) convergence. |
| True | If the first few iterations of gradient descent cause f(θ0, θ1) to increase rather than decrease, then the most likely cause is that we have set the learning rate α to too large a value. | If α were small enough, gradient descent should always take a tiny step downhill and decrease f(θ0, θ1) at least a little. If gradient descent instead increases the objective value, α is too large (or you have a bug in your code!). |
| False | No matter how θ0 and θ1 are initialized, so long as the learning rate is sufficiently small, we can safely expect gradient descent to converge to the same solution. | This is not true: depending on the initial condition, gradient descent may end up at different local optima. |
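As a sketch of these gradient-descent facts, here is a minimal batch implementation; the dataset, the α values, and the iteration count are all made-up assumptions for illustration:

```python
import numpy as np

# A minimal sketch of batch gradient descent for h(x) = theta0 + theta1*x.
# The data and learning rates below are illustrative, not from the quiz.
X = np.array([1.0, 2.0, 3.0])
y = np.array([1.5, 3.0, 4.5])      # lies exactly on y = 1.5 * x

def step(theta0, theta1, alpha):
    """One simultaneous gradient-descent update on the squared-error cost."""
    m = len(X)
    h = theta0 + theta1 * X
    grad0 = np.sum(h - y) / m          # dJ/dtheta0
    grad1 = np.sum((h - y) * X) / m    # dJ/dtheta1
    # Both updates use the old (theta0, theta1): a simultaneous update.
    return theta0 - alpha * grad0, theta1 - alpha * grad1

theta0, theta1 = 0.0, 0.0
for _ in range(2000):
    theta0, theta1 = step(theta0, theta1, alpha=0.1)
print(round(theta0, 3), round(theta1, 3))   # approaches (0.0, 1.5)

# At the minimum the gradient is zero, so one more step changes nothing:
print(step(theta0, theta1, alpha=0.1))
# A too-large alpha overshoots the minimum and the cost increases instead:
print(step(0.0, 0.0, alpha=5.0))   # parameters jump far past the minimum
```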

Other options that can appear in question 4:



| Answer | Statement | Explanation |
|---|---|---|
| True | If the learning rate is too small, then gradient descent may take a very long time to converge. | If the learning rate is small, gradient descent takes an extremely small step on each iteration, and can therefore take a long time to converge. |
| False | Even if the learning rate α is very large, every iteration of gradient descent will decrease the value of f(θ0, θ1). | If the learning rate is too large, one step of gradient descent can vastly "overshoot" and actually increase the value of f(θ0, θ1). |
| False | If θ0 and θ1 are initialized so that θ0 = θ1, then by symmetry (because we do simultaneous updates to the two parameters), after one iteration of gradient descent we will still have θ0 = θ1. | The updates to θ0 and θ1 are different (even though we're doing simultaneous updates), so there is no particular reason for them to remain equal after one iteration of gradient descent. |

Question 5 explanation:

| Answer | Statement | Explanation |
|---|---|---|
| False | For this to be true, we must have y(i) = 0 for every value of i = 1, 2, …, m. | So long as all of our training examples lie on a straight line, we will be able to find θ0 and θ1 so that J(θ0, θ1) = 0. It is not necessary that y(i) = 0 for all of our examples. |
| False | Gradient descent is likely to get stuck at a local minimum and fail to find the global minimum. | - |
| False | For this to be true, we must have θ0 = 0 and θ1 = 0 so that hθ(x) = 0. | If J(θ0, θ1) = 0, the line y = θ0 + θ1x perfectly fits all of our data. There is no particular reason to expect the values of θ0 and θ1 that achieve this to both be 0 (unless y(i) = 0 for all of our training examples). |
| True | Our training set can be fit perfectly by a straight line, i.e., all of our training examples lie perfectly on some straight line. | - |

Other options that can appear in question 5:

| Answer | Statement | Explanation |
|---|---|---|
| False | We can perfectly predict the value of y even for new examples that we have not yet seen (e.g., we can perfectly predict prices of even new houses that we have not yet seen). | - |
| False | This is not possible: by the definition of J(θ0, θ1), it is not possible for there to exist θ0 and θ1 so that J(θ0, θ1) = 0. | - |
| True | For these values of θ0 and θ1 that satisfy J(θ0, θ1) = 0, we have that hθ(x(i)) = y(i) for every training example (x(i), y(i)). | - |
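A minimal sketch of the question 5 reasoning, assuming made-up collinear points (they lie on y = 2 + 0.5x):

```python
import numpy as np

# If all training examples lie on one straight line, some (theta0, theta1)
# makes J = 0, and those parameters need not themselves be zero.
X = np.array([0.0, 2.0, 4.0])
y = 2.0 + 0.5 * X          # made-up collinear data

def J(theta0, theta1):
    m = len(X)
    return np.sum((theta0 + theta1 * X - y) ** 2) / (2 * m)

print(J(2.0, 0.5))   # 0.0: perfect fit, h(x_i) == y_i for every example
print(J(0.0, 0.0))   # > 0: theta0 = theta1 = 0 is NOT what gives J = 0 here
```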

------------------------------------------------------

Variations in the 2nd question:

2. For this question, assume that we are using the training set from Q1. Recall our definition of the cost function:

J(θ0, θ1) = (1/(2m)) Σᵢ₌₁ᵐ (hθ(x(i)) − y(i))²

What is J(θ0, θ1) for the parameter values given in the question? In the box below, please enter your answer (simplify fractions to decimals when entering your answer, and use '.' as the decimal delimiter, e.g., 1.5).
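A minimal sketch of this cost function; the training set below is a made-up placeholder, so plug in the table and parameter values from your version of the question:

```python
import numpy as np

# Squared-error cost J(theta0, theta1) for one-variable linear regression.
X = np.array([1.0, 2.0, 3.0])   # made-up inputs
y = np.array([1.0, 2.0, 3.0])   # made-up targets (here y = x exactly)

def cost(theta0, theta1):
    m = len(X)
    residuals = theta0 + theta1 * X - y   # h_theta(x_i) - y_i
    return np.sum(residuals ** 2) / (2 * m)

print(cost(0.0, 1.0))   # 0.0 for this toy set, since y = x exactly
print(cost(0.0, 0.5))   # (1/6) * (0.25 + 1.0 + 2.25) ≈ 0.5833
```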

Variations in the 3rd question:

3. Suppose we set θ0 = −1, θ1 = 0.5. What is hθ(4)?

Explanation: Setting x = 4, we have hθ(x) = θ0 + θ1x = −1 + (0.5)(4) = 1.

3. Suppose we set θ0 = −2, θ1 = 0.5. What is hθ(6)?

Explanation: Setting x = 6, we have hθ(x) = θ0 + θ1x = −2 + (0.5)(6) = 1.
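A quick arithmetic check of both variants, where h below is just the hypothesis hθ(x) = θ0 + θ1x:

```python
# Quick check of the two variants above.
def h(theta0, theta1, x):
    return theta0 + theta1 * x

print(h(-1, 0.5, 4))   # 1.0
print(h(-2, 0.5, 6))   # 1.0
```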

Variations in the 4th question: see "Other options that can appear in question 4" above.

Reference: Coursera