These solutions are for reference only.

Try to solve the questions on your own first; if you get stuck, you can refer to these solutions.

The quiz draws from different sets of questions, so the variations of particular questions are provided at the end.

-----------------------------------------------------------------------------------------

## Regularization

TOTAL POINTS 5

EXPLANATION:

Adding a new feature to the model always results in equal or better performance on the training set. (True)
=>Adding new features gives us more expressive models which are able to fit our training set at least as well as before, since the model can always set the new feature's parameter to 0. If too many new features are added, this can lead to overfitting of the training set.

Introducing regularization to the model always results in equal or better performance on the training set. (False)
=>If we introduce too much regularization, we can underfit the training set and have worse performance on the training set.

Adding many new features to the model helps prevent overfitting on the training set. (False)
=>Adding many new features gives us more expressive models which are able to better fit our training set. If too many new features are added, this can lead to overfitting of the training set.

Introducing regularization to the model always results in equal or better performance on examples not in the training set. (False)
=>If we introduce too much regularization, we can underfit the training set and this can lead to worse performance even for examples not in the training set.
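The first statement can be checked numerically. The sketch below is only an illustration on made-up data, not part of the quiz: it fits unregularized least squares with polynomial features of increasing degree and prints the training error each time. The helper names (`solve`, `poly_features`, `train_mse`) and the data are assumptions chosen for this example.

```python
def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix [A | b]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot row.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        # Zero out this column in every other row.
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= factor * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]


def poly_features(x, degree):
    """Map a scalar x to the feature vector [1, x, x^2, ..., x^degree]."""
    return [x ** p for p in range(degree + 1)]


def train_mse(xs, ys, degree):
    """Fit least squares via the normal equations and return the training MSE."""
    X = [poly_features(x, degree) for x in xs]
    n = degree + 1
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(n)] for i in range(n)]
    Xty = [sum(row[i] * y for row, y in zip(X, ys)) for i in range(n)]
    theta = solve(XtX, Xty)
    preds = [sum(t * f for t, f in zip(theta, row)) for row in X]
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


# Made-up training set (hypothetical values, for illustration only).
xs = [-1.0, -0.6, -0.2, 0.2, 0.6, 1.0]
ys = [1.0, 0.2, -0.1, 0.3, 1.1, 2.4]

degrees = (1, 2, 3)
errors = [train_mse(xs, ys, d) for d in degrees]
for d, e in zip(degrees, errors):
    print(f"degree {d}: training MSE = {e:.6f}")
```

Each extra feature can only keep the training error the same or lower it. This says nothing about error on new examples, which is where overfitting shows up.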

EXPLANATION:

When λ is set to 1, we use regularization to penalize large values of θ. Thus, the parameters θ obtained will in general have smaller values.
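To see this shrinking effect, here is a minimal sketch that trains a one-feature logistic regression twice with batch gradient descent, once with λ = 0 and once with λ = 1. The data, the function name `fit_logistic`, the learning rate and the iteration count are all assumptions made for this example. As in the course convention, the intercept θ0 is not penalized.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lam, alpha=0.5, iters=5000):
    """Batch gradient descent for one-feature regularized logistic regression.

    theta[0] is the intercept (not penalized); theta[1] is penalized by lam.
    """
    m = len(xs)
    theta = [0.0, 0.0]
    for _ in range(iters):
        errs = [sigmoid(theta[0] + theta[1] * x) - y for x, y in zip(xs, ys)]
        grad0 = sum(errs) / m
        grad1 = sum(e * x for e, x in zip(errs, xs)) / m + lam * theta[1] / m
        theta = [theta[0] - alpha * grad0, theta[1] - alpha * grad1]
    return theta


# Made-up data (hypothetical): the two classes overlap near x = 0,
# so the unregularized fit also converges to finite parameters.
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

theta_unreg = fit_logistic(xs, ys, lam=0.0)
theta_reg = fit_logistic(xs, ys, lam=1.0)
print("lambda = 0:", theta_unreg)
print("lambda = 1:", theta_reg)
```

The penalized parameter θ1 comes out smaller in magnitude for λ = 1 than for λ = 0, which is the pattern the quiz asks you to recognize.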

Variations of the 2nd question above are provided at the end.

Variations of the 3rd question above are provided at the end.

EXPLANATION:

The hypothesis follows the data points very closely and is highly complicated, indicating that it is overfitting the training set.

EXPLANATION:
The hypothesis does not predict many data points well, and is thus underfitting the training set.

------------------------------------------------------------------------

### Variations of the 3rd question:

EXPLANATION:

Using a very large value of λ cannot hurt the performance of your hypothesis; the only reason we do not set λ to be too large is to avoid numerical problems. (False)
=>Using a very large value of λ can lead to underfitting of the training set.

Because regularization causes J(θ) to no longer be convex, gradient descent may not always converge to the global minimum (when λ > 0, and when using an appropriate learning rate α). (False)
=>Regularized logistic regression and regularized linear regression are both convex, and thus gradient descent will still converge to the global minimum.
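One way to see why the regularized costs stay convex is to look at their Hessians. The sketch below uses notation that is not in these notes and is assumed here: X is the m × (n+1) design matrix whose rows are the training examples, L is the identity matrix with its top-left entry set to 0 (so θ0 is not penalized), and S is the diagonal matrix with entries hθ(x⁽ⁱ⁾)(1 − hθ(x⁽ⁱ⁾)), each of which lies between 0 and 1/4.

$$
\begin{aligned}
\text{Linear:}\quad J(\theta) &= \frac{1}{2m}\left[\sum_{i=1}^{m}\left(\theta^{T}x^{(i)} - y^{(i)}\right)^{2} + \lambda\sum_{j=1}^{n}\theta_j^{2}\right],
& \nabla^{2}J(\theta) &= \frac{1}{m}\left(X^{T}X + \lambda L\right) \\
\text{Logistic:}\quad J(\theta) &= -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_\theta(x^{(i)}) + \left(1-y^{(i)}\right)\log\left(1-h_\theta(x^{(i)})\right)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^{2},
& \nabla^{2}J(\theta) &= \frac{1}{m}\left(X^{T}SX + \lambda L\right) \\
\text{For any } v:\quad v^{T}\,\nabla^{2}J(\theta)\,v &= \frac{1}{m}\left(\left\lVert S^{1/2}Xv\right\rVert^{2} + \lambda\sum_{j=1}^{n}v_j^{2}\right) \;\ge\; 0 \quad (\lambda \ge 0)
\end{aligned}
$$

The last line is written for the logistic case; the linear case is the same with S replaced by the identity. In both cases the Hessian is positive semidefinite for every θ, so J(θ) is convex and adding the penalty term cannot create local minima.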

Using too large a value of λ can cause your hypothesis to underfit the data. (True)
=>A large value of λ results in a large regularization penalty and thus a strong preference for simpler models, which can underfit the data.
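As a rough illustration of underfitting with large λ, the sketch below fits a one-feature regularized linear regression in closed form for several values of λ. The data and the helper names (`fit_ridge_1d`, `mse`) are assumptions made for this example. The intercept θ0 is left unpenalized, so the closed form reduces to θ1 = Sxy / (Sxx + λ), where Sxx and Sxy are the centered sums of squares and cross-products.

```python
def fit_ridge_1d(xs, ys, lam):
    """Closed-form fit of h(x) = theta0 + theta1 * x that minimizes
    sum((h(x) - y)^2) + lam * theta1^2, with theta0 not penalized."""
    m = len(xs)
    x_mean = sum(xs) / m
    y_mean = sum(ys) / m
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    theta1 = sxy / (sxx + lam)
    theta0 = y_mean - theta1 * x_mean
    return theta0, theta1


def mse(xs, ys, theta0, theta1):
    """Mean squared error of the fitted line on the given examples."""
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up, nearly linear training data (hypothetical values).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

lambdas = (0.0, 1.0, 100.0, 1e6)
results = {}
for lam in lambdas:
    theta0, theta1 = fit_ridge_1d(xs, ys, lam)
    results[lam] = (theta1, mse(xs, ys, theta0, theta1))
    print(f"lambda = {lam:>9}: theta1 = {theta1:.5f}, training MSE = {results[lam][1]:.4f}")
```

As λ grows, θ1 is pushed toward 0 and the hypothesis flattens into a horizontal line at the mean of y, so the training error rises even though the data is almost perfectly linear.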

Because logistic regression outputs values 0 ≤ hθ(x) ≤ 1, its range of output values can only be "shrunk" slightly by regularization anyway, so regularization is generally not helpful for it. (False)
=>None needed

### Variations of the 2nd question:

---------------------------------------------------------------------------------

Reference: Coursera
