Ordinary Least Squares
In linear regression, the model specifies the dependent variable as a linear combination of the parameters (it need not be linear in the independent variables). The residual is the difference between the observed value of the dependent variable and the value predicted by the model. Ordinary least squares obtains parameter estimates that minimize the sum of squared residuals, abbreviated SSE (also denoted RSS).
The ordinary least squares (OLS) estimator is consistent when the independent variables are exogenous and there is no perfect multicollinearity, and it is optimal in the class of linear unbiased estimators when the errors are homoscedastic and serially uncorrelated. Under these conditions, OLS provides minimum-variance mean-unbiased estimation when the errors have finite variances.
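To make the least-squares criterion concrete, the following sketch fits a straight line to five points using the closed-form OLS solution for a single predictor. It is plain Python and does not use miml; the data values and the helper `sse` are hypothetical and chosen only for illustration.

```python
# Hypothetical data, made up for illustration (not the 2dplanes dataset).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def sse(intercept, slope):
    """Sum of squared residuals for the line y = intercept + slope * x."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n

# Closed-form OLS estimates for one predictor:
#   slope = S_xy / S_xx,  intercept = mean(y) - slope * mean(x)
s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
s_xx = sum((x - x_mean) ** 2 for x in xs)
slope = s_xy / s_xx
intercept = y_mean - slope * x_mean

# SSE at the OLS estimates, and at two nearby parameter settings.
best = sse(intercept, slope)
shifted_intercept = sse(intercept + 0.1, slope)
shifted_slope = sse(intercept, slope - 0.1)

print("slope=%.4f intercept=%.4f SSE=%.4f" % (slope, intercept, best))
print("SSE after nudging intercept: %.4f" % shifted_intercept)
print("SSE after nudging slope:     %.4f" % shifted_slope)
```

Moving either parameter away from the OLS estimates increases the SSE, which is what "minimize the sum of squared residuals" means in practice. With several predictors the same idea applies, but the estimates come from solving a linear system rather than from these two formulas.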
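import os

from miml import datasets
from miml.regression import OLS

# Path to the sample ARFF file in the miml data directory
fn = os.path.join(datasets.get_data_home(), 'weka', 'regression',
                  '2dplanes.arff')
# Load the dataset; ds.x holds the features and ds.y the response
ds = datasets.load_arff(fn, 10)
x = ds.x
y = ds.y

# Fit the OLS model, then predict for the first 10 rows of x
model = OLS()
model.fit(x, y)
r = model.predict(x[:10,:])
print r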
>>> run script...
array([5.073347387304948])
>>> model
Linear Model:

Residuals:
       Min        1Q    Median        3Q       Max
   -8.5260   -1.6514   -0.0049    1.6755    7.8116

Coefficients:
            Estimate  Std. Error   t value  Pr(>|t|)
Intercept    -0.0148      0.0118   -1.2503    0.2112
Var 1         2.9730      0.0118  251.7998    0.0000 ***
Var 2         1.5344      0.0145  105.8468    0.0000 ***
Var 3         1.0357      0.0144   71.7815    0.0000 ***
Var 4         0.5281      0.0145   36.4827    0.0000 ***
Var 5         1.4766      0.0144  102.2472    0.0000 ***
Var 6         1.0044      0.0144   69.5380    0.0000 ***
Var 7         0.5238      0.0145   36.1696    0.0000 ***
Var 8        -0.0011      0.0145   -0.0750    0.9402
Var 9         0.0024      0.0145    0.1649    0.8690
Var 10       -0.0278      0.0145   -1.9239    0.0544 .
---------------------------------------------------------------------
Significance codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.3838 on 40757 degrees of freedom
Multiple R-squared: 0.7056, Adjusted R-squared: 0.7055
F-statistic: 9766.9504 on 10 and 40757 DF, p-value: 0.000
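
Several figures in the summary can be cross-checked by hand. The sketch below assumes the summary follows the usual linear-model conventions (residual degrees of freedom = observations - predictors - 1, and the textbook formulas for adjusted R-squared, the F-statistic and t values); the source does not state this explicitly, and the variable names are illustrative. Because the inputs are the rounded values printed above, the results agree with the summary only approximately.

```python
# Rounded values copied from the model summary above.
r_squared = 0.7056
n_predictors = 10
residual_df = 40757

# Assumed convention: residual df = n_obs - n_predictors - 1 (intercept).
n_obs = residual_df + n_predictors + 1

# Adjusted R-squared penalizes R-squared for the number of predictors.
adj_r_squared = 1.0 - (1.0 - r_squared) * (n_obs - 1) / residual_df

# F-statistic: explained variance per predictor over residual variance.
f_stat = (r_squared / n_predictors) / ((1.0 - r_squared) / residual_df)

# t value of a coefficient: estimate divided by its standard error (Var 1).
t_var1 = 2.9730 / 0.0118

print("observations:       %d" % n_obs)
print("adjusted R-squared: %.4f" % adj_r_squared)
print("F-statistic:        %.1f" % f_stat)
print("t value (Var 1):    %.1f" % t_var1)
```

The recomputed adjusted R-squared rounds to the reported 0.7055, while the F-statistic and t value differ slightly from the reported 9766.9504 and 251.7998 because R-squared, the estimate and the standard error are only given to four decimals. The same ratio explains why Var 8 and Var 9, whose estimates are small relative to their standard errors, carry no significance code.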