
# Data Science from Scratch: First Principles with Python (2015)

### Chapter 14. Simple Linear Regression

Art, like morality, consists in drawing the line somewhere.

G. K. Chesterton

# The Model

If `x_i` is a person's number of friends and `y_i` is the number of minutes that person spends on the site each day, the model hypothesizes that `y_i = beta * x_i + alpha + epsilon_i`, where `alpha` and `beta` are constants and `epsilon_i` is a (hopefully small) error term. Given candidate values of `alpha` and `beta`, we can make predictions and measure how badly they miss:

```python
def predict(alpha, beta, x_i):
    return beta * x_i + alpha

def error(alpha, beta, x_i, y_i):
    """the error from predicting beta * x_i + alpha
    when the actual value is y_i"""
    return y_i - predict(alpha, beta, x_i)

def sum_of_squared_errors(alpha, beta, x, y):
    return sum(error(alpha, beta, x_i, y_i) ** 2
               for x_i, y_i in zip(x, y))
```

The *least squares* solution chooses the `alpha` and `beta` that make `sum_of_squared_errors` as small as possible. Using `correlation`, `standard_deviation`, and `mean` from earlier chapters, the minimizing values have a closed form:

```python
def least_squares_fit(x, y):
    """given training values for x and y,
    find the least-squares values of alpha and beta"""
    beta = correlation(x, y) * standard_deviation(y) / standard_deviation(x)
    alpha = mean(y) - beta * mean(x)
    return alpha, beta

alpha, beta = least_squares_fit(num_friends_good, daily_minutes_good)
```

A common way to measure how well the model fits is the *coefficient of determination* (or *R-squared*): the fraction of the total variation in `y` that the model captures.

```python
def total_sum_of_squares(y):
    """the total squared variation of y_i's from their mean"""
    return sum(v ** 2 for v in de_mean(y))

def r_squared(alpha, beta, x, y):
    """the fraction of variation in y captured by the model, which equals
    1 - the fraction of variation in y not captured by the model"""
    return 1.0 - (sum_of_squared_errors(alpha, beta, x, y) /
                  total_sum_of_squares(y))

r_squared(alpha, beta, num_friends_good, daily_minutes_good)      # 0.329
```
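The functions above lean on `mean`, `de_mean`, `standard_deviation`, and `correlation` from the statistics chapter, and on the `num_friends_good` / `daily_minutes_good` data. As a self-contained sketch, here is the same pipeline with those helpers written out inline and run on made-up noiseless data (the helper implementations and the toy data are assumptions for illustration, not this chapter's code):

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def de_mean(xs):
    """translate xs so that its mean is 0"""
    x_bar = mean(xs)
    return [x - x_bar for x in xs]

def standard_deviation(xs):
    n = len(xs)
    return math.sqrt(sum(d ** 2 for d in de_mean(xs)) / (n - 1))

def correlation(xs, ys):
    n = len(xs)
    covariance = sum(dx * dy for dx, dy in zip(de_mean(xs), de_mean(ys))) / (n - 1)
    return covariance / (standard_deviation(xs) * standard_deviation(ys))

def predict(alpha, beta, x_i):
    return beta * x_i + alpha

def error(alpha, beta, x_i, y_i):
    return y_i - predict(alpha, beta, x_i)

def sum_of_squared_errors(alpha, beta, x, y):
    return sum(error(alpha, beta, x_i, y_i) ** 2 for x_i, y_i in zip(x, y))

def least_squares_fit(x, y):
    beta = correlation(x, y) * standard_deviation(y) / standard_deviation(x)
    alpha = mean(y) - beta * mean(x)
    return alpha, beta

def total_sum_of_squares(y):
    return sum(v ** 2 for v in de_mean(y))

def r_squared(alpha, beta, x, y):
    return 1.0 - (sum_of_squared_errors(alpha, beta, x, y) /
                  total_sum_of_squares(y))

# noiseless data generated by y = 2x + 1, so the fit
# should recover alpha = 1 and beta = 2 (and R-squared = 1)
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
alpha, beta = least_squares_fit(x, y)
```

On perfectly linear data the correlation is 1 and all the error is explained, which makes it easy to sanity-check the closed-form fit before trusting it on noisy data.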

# Using Gradient Descent

If we write `theta = [alpha, beta]`, we can also solve the problem with stochastic gradient descent, minimizing the squared error one data point at a time:

```python
def squared_error(x_i, y_i, theta):
    alpha, beta = theta
    return error(alpha, beta, x_i, y_i) ** 2

def squared_error_gradient(x_i, y_i, theta):
    alpha, beta = theta
    return [-2 * error(alpha, beta, x_i, y_i),       # alpha partial derivative
            -2 * error(alpha, beta, x_i, y_i) * x_i] # beta partial derivative

# choose random values to start
random.seed(0)
theta = [random.random(), random.random()]
alpha, beta = minimize_stochastic(squared_error,
                                  squared_error_gradient,
                                  num_friends_good,
                                  daily_minutes_good,
                                  theta,
                                  0.0001)
print(alpha, beta)
```
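`minimize_stochastic` is defined back in the gradient descent chapter and isn't shown here. As a rough, self-contained stand-in, plain per-point SGD is enough to see the idea (the name `minimize_stochastic_sketch`, the fixed step size, the epoch count, and the toy data are all assumptions for illustration, not the book's implementation):

```python
import random

def predict(alpha, beta, x_i):
    return beta * x_i + alpha

def error(alpha, beta, x_i, y_i):
    return y_i - predict(alpha, beta, x_i)

def squared_error_gradient(x_i, y_i, theta):
    alpha, beta = theta
    return [-2 * error(alpha, beta, x_i, y_i),        # alpha partial derivative
            -2 * error(alpha, beta, x_i, y_i) * x_i]  # beta partial derivative

def minimize_stochastic_sketch(gradient_fn, x, y, theta_0,
                               step=0.001, epochs=10000):
    """hypothetical stand-in for the book's minimize_stochastic:
    one gradient step per data point, reshuffling each epoch,
    with a fixed step size"""
    theta = list(theta_0)
    data = list(zip(x, y))
    for _ in range(epochs):
        random.shuffle(data)
        for x_i, y_i in data:
            gradient = gradient_fn(x_i, y_i, theta)
            theta = [t - step * g for t, g in zip(theta, gradient)]
    return theta

random.seed(0)
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]              # noiseless y = 2x + 1
theta = [random.random(), random.random()]
alpha, beta = minimize_stochastic_sketch(squared_error_gradient, x, y, theta)
print(alpha, beta)                # should approach alpha = 1, beta = 2
```

The book's version is more careful than this sketch — it also takes the objective (`squared_error`) so it can shrink the step size when an epoch fails to improve — which is why the call above passes both functions.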
