Least squares analysis returns the coefficients of the Nth-degree polynomial that minimizes the sum of the squared vertical distances between the fitted curve and the data points.

Does anybody know of an elegant mathematical way to solve for a linear expression that yields equal amounts of area above and below the line? For example, if you visualize a perfect sine wave, a horizontal line running through the midpoint of the peak and the trough ((Ypeak+Ytrough)/2) of the sine wave would be the line I'd be looking for.

I can use brute force and stupidity and draw a bunch of lines and choose the one with the ratio AreaAbove/AreaBelow that is closest to 1, but I thought I'd see if there is a speedier, more elegant approach before going down that road.
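For reference, the brute-force approach can be sketched in a few lines of Python/NumPy. This is just an illustration of the idea above, not a recommended method; the helper name `area_ratio` and the uniform-grid assumption are mine:

```python
import numpy as np

def area_ratio(x, y, slope, intercept):
    """Ratio of the area where the data sits above the candidate line
    to the area where it sits below (simple Riemann-sum approximation,
    assuming x is a uniform grid)."""
    r = y - (slope * x + intercept)        # signed gap between data and line
    dx = x[1] - x[0]
    above = np.sum(np.clip(r, 0, None)) * dx
    below = np.sum(np.clip(-r, 0, None)) * dx
    return above / below if below > 0 else np.inf

# Brute force: sweep horizontal lines through one period of a sine wave
# and keep the one whose AreaAbove/AreaBelow ratio is closest to 1.
x = np.linspace(0, 2 * np.pi, 1000)
y = np.sin(x)
candidates = np.linspace(-1, 1, 201)
best = min(candidates, key=lambda c: abs(area_ratio(x, y, 0.0, c) - 1.0))
# best lands near (Ypeak + Ytrough) / 2 = 0, as described above
```

The cost is one pass over the data per candidate line, which is exactly why a more direct formulation is attractive.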

## Least Squares But For Equal Area Above/Below the Line?

f(coefficients) = Sum from i=1 to N of (FittedPolynomial(x_i) - MeasuredData(x_i))^2
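In code, that objective is just the residual sum of squares. A minimal sketch in Python/NumPy (the function name `sse_objective` and the sample data are mine):

```python
import numpy as np

def sse_objective(coeffs, x, y):
    """Sum of squared fitting errors for a polynomial with the given
    coefficients (highest degree first, as numpy.polyval expects)."""
    return float(np.sum((np.polyval(coeffs, x) - y) ** 2))

# np.polyfit minimizes exactly this objective; e.g. for a degree-1 fit:
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.1, 0.9, 2.1, 2.9])
best = np.polyfit(x, y, 1)
```

Any perturbation of the fitted coefficients should give a larger value of `sse_objective`, which is a quick sanity check on the objective.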

You could use an augmented objective function which is the sum of the squares of the fitting errors, PLUS some big whopper constant times the difference between the sum of positive errors and the sum of negative errors.

The augmented objective function "penalizes" the solution when the positive errors differ from the negative errors. It encourages the optimizer to find solutions whose positive errors equal their negative errors. In the mathematical optimization literature, this is called a penalty function.

To use it, you will need software that performs nonlinear minimization upon user-supplied objective functions. In my experience the Nelder-Mead algorithm (called "amoeba" and "downhill simplex" in Numerical Recipes), the Hooke-Jeeves algorithm, and the Levenberg-Marquardt algorithm all work quite well.
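A sketch of this penalty-function approach, assuming Python/NumPy. The derivative-free minimizer below is a bare-bones compass search in the Hooke-Jeeves family, not a library routine; the function names and the penalty weight of 1e6 are my own choices:

```python
import numpy as np

def augmented_objective(coeffs, x, y, penalty=1e6):
    """Sum of squared errors plus a large penalty times the imbalance
    between the sum of positive errors and the sum of negative errors."""
    r = np.polyval(coeffs, x) - y
    sse = np.sum(r ** 2)
    imbalance = abs(np.sum(r[r > 0]) + np.sum(r[r < 0]))  # = |sum of residuals|
    return sse + penalty * imbalance

def pattern_search(f, x0, step=0.5, tol=1e-8, shrink=0.5):
    """Minimal compass/pattern search: probe each coordinate in both
    directions, accept improving moves, shrink the step when stuck."""
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x.copy()
                trial[i] += delta
                ft = f(trial)
                if ft < fx:
                    x, fx, improved = trial, ft, True
                    break
        if not improved:
            step *= shrink
    return x

# Fit a line (slope, intercept) to one period of a sine wave; the penalty
# drives the sum of positive errors to match the sum of negative errors.
x = np.linspace(0, 2 * np.pi, 200)
y = np.sin(x)
line = pattern_search(lambda c: augmented_objective(c, x, y), [0.0, 0.5])
```

The same objective can be handed to any of the minimizers named above; only the `f(coeffs)` callback changes.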

User satisfaction with the Numerical Recipes implementation: given a sufficiently bumpy surface, every third or fourth solution returned by Nelder-Mead is hopeless junk.

Speed: the Numerical Recipes implementation is faster than simulated annealing but slow compared to more modern iterative search heuristics.

Try some of Sluggo's alternate suggestions.
