Two-parameter linear regression (a slope on x plus a constant term): y = ax + b.
What is error? => OLS implies squared errors (distances) and that the error is in the y-variable. The line minimizes the sum of the squares of these residuals.
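A minimal sketch of that fit, using numpy's least-squares solver (the data here is synthetic, made up purely for illustration):

```python
import numpy as np

# Synthetic data for illustration: y = 2x + 1 plus noise
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)

# Design matrix: one column for x, one constant column
X = np.column_stack([x, np.ones_like(x)])

# OLS solves min ||y - X @ [a, b]||^2 over a and b
(a, b), *_ = np.linalg.lstsq(X, y, rcond=None)

# The residuals are the (vertical) distances the line failed to explain
residuals = y - (a * x + b)
```

The same fit could be done with `np.polyfit(x, y, 1)`; the explicit design matrix just makes the "x plus a constant term" structure visible.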
Goodness-of-fit: R². Conceptually, this compares the size of your residuals (in explaining y) to the total variance in y: R² = 1 − SS_res/SS_tot.
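That ratio can be computed directly from the residuals. A sketch (again on made-up data):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 100)
y = 3.0 * x - 2.0 + rng.normal(scale=1.0, size=x.size)

a, b = np.polyfit(x, y, 1)           # least-squares line
residuals = y - (a * x + b)

ss_res = np.sum(residuals ** 2)       # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)  # total variation of y around its mean
r_squared = 1.0 - ss_res / ss_tot     # fraction of variance explained
```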
Model reliability / parameter reliability [e.g. bootstrapping]
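A sketch of the bootstrap idea for parameter reliability: resample the data points with replacement many times, refit each time, and look at the spread of the resulting slope estimates (all numbers here illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 60)
y = 1.5 * x + rng.normal(scale=1.0, size=x.size)

n_boot = 1000
slopes = np.empty(n_boot)
for i in range(n_boot):
    idx = rng.integers(0, x.size, x.size)       # resample points with replacement
    slopes[i] = np.polyfit(x[idx], y[idx], 1)[0]

# The spread of bootstrap slopes estimates the slope's standard error
slope_se = slopes.std()
ci_low, ci_high = np.percentile(slopes, [2.5, 97.5])
```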
Model selection [e.g. cross-validation, statistical significance of higher order terms, AIC/BIC]. One approach is to build nested models...
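One way to sketch the nested-model idea with AIC: fit polynomials of increasing order and penalize each extra parameter. This uses the Gaussian-error form AIC = n·ln(SS_res/n) + 2k; the data is synthetic (truly quadratic), so higher orders buy little:

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(-3, 3, 200)
y = x ** 2 + rng.normal(scale=0.5, size=x.size)   # true model is quadratic

def aic(order):
    coeffs = np.polyfit(x, y, order)
    resid = y - np.polyval(coeffs, x)
    n, k = x.size, order + 1                       # k = number of fitted parameters
    return n * np.log(np.sum(resid ** 2) / n) + 2 * k

# Nested models: each order contains the ones below it
aics = {order: aic(order) for order in (1, 2, 3, 4)}
best_order = min(aics, key=aics.get)
```

Lower AIC is better; the linear model should lose badly here because it cannot capture the curvature, while orders above 2 improve the residuals only slightly and pay the 2-per-parameter penalty.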
Alternatives that go beyond vanilla line fitting:
Robust regression (median-based rather than mean-based criteria) [e.g. minimizing the median absolute deviation]
Error in two dimensions (see below)
Fit relationships that are more complex than straight lines
You can go nonparametric (binned scatter plot, local regression)
Mixed effects models
Bayesian parameter estimation
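As one concrete example of the robust option above, here is a Theil–Sen estimator (slope = median of all pairwise slopes), a standard median-based robust line fit. One gross outlier barely moves it, while the OLS slope gets dragged upward (synthetic data for illustration):

```python
import numpy as np
from itertools import combinations

x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0
y[9] = 100.0                         # one gross outlier

# Theil-Sen: slope is the median of the slopes over all point pairs
pair_slopes = [(y[j] - y[i]) / (x[j] - x[i])
               for i, j in combinations(range(x.size), 2)]
slope = np.median(pair_slopes)
intercept = np.median(y - slope * x)

ols_slope = np.polyfit(x, y, 1)[0]   # for comparison: pulled up by the outlier
```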
The issue of independence
Are the data points in your scatter plot independent, and if so, what does that independence reflect? One way to think about it: does each dot reflect a fresh sample of noise?
Note that a separate issue is the independence (or dependence) of the two variables being plotted. That relationship is usually what we are trying to understand when we make a scatter plot.
Errors in two dimensions
Fundamentally different!
Known as "Deming regression".
Might be useful.
One caveat: the general errors-in-variables fit can require iterative methods, but when the ratio of the x- and y-error variances is known, Deming regression has a closed-form solution.
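A sketch of the symmetric special case (equal error variance in x and y), where the total-least-squares line has a closed form via the SVD, with no iteration: the line passes through the centroid along the first principal direction, which minimizes summed squared perpendicular distances. Synthetic data:

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.linspace(0, 10, 200)
x = t + rng.normal(scale=0.3, size=t.size)           # noise in x too
y = 2.0 * t + 1.0 + rng.normal(scale=0.3, size=t.size)

# Center the data, then take the first right singular vector:
# the direction of largest variance, i.e. the orthogonal-regression line
X = np.column_stack([x - x.mean(), y - y.mean()])
_, _, vt = np.linalg.svd(X, full_matrices=False)
vx, vy = vt[0]
slope = vy / vx
intercept = y.mean() - slope * x.mean()
```

Note OLS would underestimate the slope here (attenuation), because it assumes all the noise lives in y.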
Moving to nonlinear relationships. What are our strategies?
Binned scatter plot. (Requires some choice of bin size)
Higher-order polynomials or other nonlinear transformations of your predictors
A Fourier basis is one way to parameterize the x-axis. Smoothing your data (i.e. deleting high frequencies) can then be viewed as fitting a nonlinear function to your data, using a basis set of sinusoids restricted to low frequencies.
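A minimal sketch of the binned scatter plot mentioned above: average y within bins of x. The number of bins is the key choice (synthetic data, nonlinear by construction):

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(0, 10, 500)
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)

n_bins = 20                              # the bin-size choice
edges = np.linspace(0, 10, n_bins + 1)
centers = 0.5 * (edges[:-1] + edges[1:])
which = np.digitize(x, edges[1:-1])      # bin index (0..n_bins-1) per point
bin_means = np.array([y[which == b].mean() for b in range(n_bins)])
```

Plotting `bin_means` against `centers` reveals the sinusoidal shape without assuming any functional form.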
Local regression (a.k.a. LOWESS = locally weighted scatterplot smoothing)
CPU-intensive, data-driven method to flexibly allow any shape of model.
Basically, you fit a simple model (e.g. linear model) to local windows of your data.
Window size is the major choice, and it trades off bias against variance: small windows track the data closely (low bias, high variance); large windows smooth aggressively (high bias, low variance).
Pros: minimal assumptions, elegant
Cons: CPU intensive, breaks down (poor performance) in high-dimensional situations.
fieldmap regularization is an example of local regression in 3D
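A bare-bones version of the local-regression idea: at each query point, fit a weighted linear model to the k nearest neighbors using a tricube weight. The window fraction `frac` is the bias–variance knob discussed above. This is a crude sketch, not a full LOWESS implementation (no robustness iterations); data synthetic:

```python
import numpy as np

def loess_1d(x, y, frac=0.3):
    """Crude LOWESS: one tricube-weighted linear fit per query point."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    n = x.size
    k = max(2, int(frac * n))            # points per local window
    fitted = np.empty(n)
    for i in range(n):
        d = np.abs(x - x[i])
        idx = np.argsort(d)[:k]          # the k nearest neighbors
        w = (1 - (d[idx] / d[idx].max()) ** 3) ** 3   # tricube weights
        W = np.sqrt(w)                   # weighted least squares via sqrt-weights
        A = np.column_stack([x[idx], np.ones(k)]) * W[:, None]
        coef, *_ = np.linalg.lstsq(A, y[idx] * W, rcond=None)
        fitted[i] = coef[0] * x[i] + coef[1]
    return x, fitted

rng = np.random.default_rng(6)
xs = np.sort(rng.uniform(0, 2 * np.pi, 300))
ys = np.sin(xs) + rng.normal(scale=0.2, size=xs.size)
xf, yf = loess_1d(xs, ys, frac=0.2)
```

In practice you would reach for a library version (e.g. `statsmodels.nonparametric.lowess`) rather than this sketch; the loop over query points is what makes the method CPU-intensive.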