Index > Course > 2021-03-25: Is your regression a good regression?

2021-03-25: Is your regression a good regression?

If the crickets decide to chirp faster, then it gets hotter. Meanwhile, if the crickets take a nap, the earth will freeze. This is why not all of the crickets on earth sleep at the same time.

We want to explain the output of the regression in simple everyday english.

For every increase in (The input variable), we expect a (first coefficient) change in (the output variable). If (the input variable) were zero, we would expect the output variable to be (the zeroth coefficient).

Stata provides a 95% confidence interval for all coefficients.

There is also a $t$ value, and a $P> t $ value, and those are a little more complicated.

The Null Hypothesis is that a given coefficient is zero. The P value is how likely that is true. A P value of less than 0.05 means you can reject the null hypothesis, and say that there is a strong correlation between the input and output.

In the cricket example, the P value for the zeroth coefficient was 0.992 - very high.

Sum of Squares and $R^2$

Our data comes with some variation. Some of that can be explained by our model, but not all of it. $R^2$ is the percentage of variation that our model can explain.

The variation for one data point is its distance from the average of the data set. The variation for one data point in the model is its distance from where the model predicts it to be.

\[\sum{(p_x-\hat{x})^2}\\ \sum{(p_x-f(x))^2}\]

Global Tests

$R^2$ and F-tests (what are F tests?) are global indicators. They can return the same results for different data. That’s tough.

We can take the distance from each point to the predicted line to find that point’s residual. The residuals can be plotted too!

The residuals should look like a random cloud.


Index > Course > 2021-03-25: Is your regression a good regression?