# Simple Linear Regression

For simple linear regression we follow the conventions below.

## Observations and Estimates

The `Forecasting` library takes as input observations of the independent
variable and the dependent variable. It provides estimates for the
coefficients of the simple linear regression line.

| Symbol | Description |
|--------|-------------|
| \(N\) | number of observations |
| \(x_i,\ i \in \{1\ldots N\}\) | observations of the independent variable |
| \(y_i,\ i \in \{1\ldots N\}\) | observations of the dependent variable |
| \(\bar{x}=(1/N)\sum_{i=1}^{N}x_{i}\) | average of the independent observations |
| \(\bar{y}=(1/N)\sum_{i=1}^{N}y_{i}\) | average of the dependent observations |
| \(\hat{y}_i,\ i \in \{1\ldots N\}\) | predictions of the dependent variable |
| \(\beta_{0}, \beta_{1}\) | coefficients of the linear relationship (random) |
| \(\hat{\beta}_{0}, \hat{\beta}_{1}\) | coefficients of the linear regression line (estimates) |
| \(e_i,\ i \in \{1\ldots N\}\) | errors (residuals) for the observed data points |

## Linear Relationship

The linear relationship between \(x_i\) and \(y_i\) is modeled by the equation:

\[y_i = \beta_0 + \beta_1 x_i + \epsilon_i,\]

where \(\epsilon_i\) is an error term with expected value 0 for every \(i\).

## Linear Regression

The random \(\beta_{0}\) and \(\beta_{1}\) are estimated by \(\hat{\beta}_{0}\) and \(\hat{\beta}_{1}\), such that the prediction for \(y_i\) is given by the equation:

\[\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i.\]

Thus, the predictions based on simple linear regression corresponding to the observed data points \((x_i,y_i)\) are given by \(\hat{y}_i,\ i \in \{1\ldots N\}\).
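As an illustrative sketch of the underlying least-squares estimates (plain Python, not the library's own API), the coefficients follow from the closed-form formulas \(\hat{\beta}_1 = \sum_i(x_i-\bar{x})(y_i-\bar{y}) \,/\, \sum_i(x_i-\bar{x})^2\) and \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x}\):

```python
def simple_linear_regression(x, y):
    """Return (beta0_hat, beta1_hat) for the least-squares regression line.

    Illustrative only; the Forecasting library provides its own functions.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sum of cross-deviations over sum of squared x-deviations.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta1 = s_xy / s_xx
    # Intercept: the line passes through (x_bar, y_bar).
    beta0 = y_bar - beta1 * x_bar
    return beta0, beta1

# Points lying exactly on the line y = 1 + 2x.
b0, b1 = simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1)  # 1.0 2.0
```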

## Residuals

The error (residual) \(e_i\) for the data point \(i\) is the difference between the observed \(y_i\) and the predicted \(\hat{y}_i\), so \(e_i = y_i - \hat{\beta}_0 - \hat{\beta}_1x_i\). In order to obtain the residuals, the user will need to provide a one-dimensional parameter declared over the set of observations.
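A minimal sketch of the residual computation (plain Python for illustration, not the library's parameter mechanism), fitting the coefficients and then subtracting the predictions from the observations:

```python
def residuals(x, y):
    """Return e_i = y_i - (beta0_hat + beta1_hat * x_i) for each observation.

    Illustrative sketch of the residual definition, not the library API.
    """
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    beta1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) \
        / sum((xi - x_bar) ** 2 for xi in x)
    beta0 = y_bar - beta1 * x_bar
    return [yi - (beta0 + beta1 * xi) for xi, yi in zip(x, y)]

e = residuals([1, 2, 3], [1, 2, 4])
print([round(ei, 4) for ei in e])  # [0.1667, -0.3333, 0.1667]
# A property of the least-squares fit: the residuals sum to (numerically) zero.
```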

## Variation Components

Given the values of the observations, the estimates, and the
residuals, several components of variation can be computed: the
**sum of squares total** (SST), the **sum of squares error** (SSE), and the
**sum of squares regression** (SSR), which are defined as follows:

\[SST = \sum_{i=1}^{N}(y_i - \bar{y})^2, \quad SSE = \sum_{i=1}^{N}(y_i - \hat{y}_i)^2 = \sum_{i=1}^{N}e_i^2, \quad SSR = \sum_{i=1}^{N}(\hat{y}_i - \bar{y})^2.\]

These components of variation satisfy the relation \(SST = SSE + SSR\).

Furthermore, it is also possible to compute the **coefficient of
determination** \(R^2\), the **sample linear correlation**
\(r_{xy}\), and the **standard error of the estimate**
\(s_e\), which are defined as follows:

\[R^2 = \frac{SSR}{SST}, \quad r_{xy} = \operatorname{sign}(\hat{\beta}_1)\sqrt{R^2}, \quad s_e = \sqrt{\frac{SSE}{N-2}}.\]
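The formulas above can be sketched in plain Python (illustrative only; the key names mirror the `vcs` identifiers described below but this is not the library's own implementation):

```python
import math

def variation_components(x, y):
    """Return SST, SSE, SSR, Rsquare, MultipleR and Se for a simple
    linear regression fit. Illustrative sketch, not the library API."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    beta1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) \
        / sum((xi - x_bar) ** 2 for xi in x)
    beta0 = y_bar - beta1 * x_bar
    y_hat = [beta0 + beta1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)           # total variation
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)       # explained
    return {
        "SST": sst,
        "SSE": sse,
        "SSR": ssr,
        "Rsquare": ssr / sst,
        # r_xy carries the sign of the slope estimate.
        "MultipleR": math.copysign(math.sqrt(ssr / sst), beta1),
        "Se": math.sqrt(sse / (n - 2)),
    }

vc = variation_components([1, 2, 3], [1, 2, 4])
# The identity SST = SSE + SSR holds up to floating-point rounding.
```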

## Predeclared Indices `co` and `vcs`

The linear regression functions return the values of the line
coefficients in a parameter declared over the index `forecasting::co`,
which is declared as follows:

```
Set LRcoeffSet {
    Index: co;
    Definition: {
        data {
            0, ! Intercept Coefficient of Regression Line
            1  ! Slope Coefficient of Regression Line
        }
    }
}
```

Whenever one of the linear regression functions communicates back
components of variation, it uses identifiers declared over the index
`forecasting::vcs`, which is declared as follows:

```
Set VariationCompSet {
    Index: vcs;
    Definition: {
        data {
            SST,       ! Sum of Squares Total
            SSE,       ! Sum of Squares Error
            SSR,       ! Sum of Squares Regression
            Rsquare,   ! Coefficient of Determination
            MultipleR, ! Sample Linear Correlation Rxy
            Se         ! Standard Error
        }
    }
}
```

In order to obtain the variation components, the user will need to
provide a parameter indexed over `forecasting::vcs` to the linear
regression functions.