Thu Apr 05 2018

Let us say that we have characteristics X1, X2, X3. If we want to predict the value of a house, these could be X1 = age of the house, X2 = location, X3 = amenities, etc.

Now we want to find f, our model, such that f(X1, X2, X3) produces the value of the house. Since the value is a **number**, we can use a technique named **regression**.
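To make the idea of finding f concrete, here is a minimal sketch of the simplest kind of regression: a straight line fitted by least squares. It is simplified to a single characteristic (the age of the house), and the numbers and the names `fit_line` and `f` are made up for illustration.

```python
# Toy data, made up for illustration: house age in years, price in thousands.
ages = [5.0, 10.0, 20.0, 30.0]
prices = [300.0, 280.0, 240.0, 200.0]


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the common 1/n factor cancels).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(ages, prices)


def f(age):
    """The fitted model: predicts a price from the age of the house."""
    return slope * age + intercept


print(f(15.0))
```

With more characteristics the same idea applies, only with one coefficient per characteristic instead of a single slope.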

Since we are using supervised learning, we now split our data: some of it is used to create (train) the model, and the rest is held back to verify the model.
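The split can be sketched as follows. The helper name `train_test_split`, the 25% test fraction, the fixed seed and the rows themselves are assumptions chosen for illustration, not a prescribed recipe.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold back a fraction of them for verification."""
    rows = list(rows)                      # copy so the caller's list is untouched
    random.Random(seed).shuffle(rows)      # fixed seed keeps the split repeatable
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]    # (train, test)


# Made-up (age, price) rows, for illustration only.
data = [(5, 300), (8, 290), (10, 280), (15, 262),
        (20, 240), (25, 221), (30, 200), (40, 165)]

train, test = train_test_split(data)
print(len(train), len(test))
```

Shuffling before splitting matters when the rows are stored in some order (for example sorted by age), since otherwise the test set would not look like the training set.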

Residuals = |(predicted labels or Y values) - (actual labels or Y values)|

Now let us measure the error in the model.

RMSE (root-mean-square error) = √(1/n SUM((predicted - actual)^2))

MAE (mean absolute error) = 1/n SUM(ABS(predicted - actual))
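As a small sketch, both errors can be computed in a few lines. Here both are averaged over the n data points (RMSE takes the square root of the mean squared difference), and the predicted and actual values are made up for illustration.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error: square root of the mean squared difference."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)


def mae(predicted, actual):
    """Mean absolute error: mean of the absolute differences."""
    n = len(actual)
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / n


# Made-up values, for illustration only.
predicted = [210.0, 250.0, 290.0, 330.0]
actual = [200.0, 260.0, 300.0, 310.0]

print(rmse(predicted, actual))
print(mae(predicted, actual))
```

Both are in the same units as the value being predicted. RMSE squares the differences before averaging, so one large miss weighs more in RMSE than in MAE.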

If we want a more general error that does not depend on the units of the value, we can use relative metrics, which normally fall between 0 and 1.

For the following, the closer to 0 the better:

RAE (relative absolute error) = SUM(ABS(predicted - actual)) / SUM(ABS(actual - mean(actual)))

RSE (relative squared error) = SUM((predicted - actual)^2) / SUM((actual - mean(actual))^2)
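A minimal sketch of the two relative errors, assuming the common definition in which the model's error is divided by the error of a baseline that always predicts the mean of the actual values. The function names and the numbers are made up for illustration.

```python
def rae(predicted, actual):
    """Relative absolute error: absolute error relative to a mean-only baseline."""
    mean_actual = sum(actual) / len(actual)
    model_error = sum(abs(p - a) for p, a in zip(predicted, actual))
    baseline_error = sum(abs(a - mean_actual) for a in actual)
    return model_error / baseline_error


def rse(predicted, actual):
    """Relative squared error: squared error relative to a mean-only baseline."""
    mean_actual = sum(actual) / len(actual)
    model_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    baseline_error = sum((a - mean_actual) ** 2 for a in actual)
    return model_error / baseline_error


# Made-up values, for illustration only.
predicted = [210.0, 250.0, 290.0, 330.0]
actual = [200.0, 260.0, 300.0, 310.0]

print(rae(predicted, actual))
print(rse(predicted, actual))
```

Under this definition a value of 0 means perfect predictions, and a value of 1 means the model is no better than always predicting the mean. A model worse than that baseline scores above 1.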

For the following, the closer to 1 the better:

CoD (coefficient of determination, R^2) = 1 - SUM((predicted - actual)^2) / SUM((actual - mean(actual))^2)
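A small sketch of the coefficient of determination, assuming the usual form of 1 minus the ratio of the squared prediction errors to the squared deviations of the actual values from their mean. The function name and the numbers are made up for illustration.

```python
def r_squared(predicted, actual):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up values, for illustration only.
predicted = [210.0, 250.0, 290.0, 330.0]
actual = [200.0, 260.0, 300.0, 310.0]

print(r_squared(predicted, actual))
```

A value of 1 means the predictions match the actual values exactly, and 0 means the model does no better than always predicting the mean. The value can go below 0 for a model that does worse than that.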