Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FitMaker #49

Closed
jonathanolson opened this issue Aug 13, 2015 · 9 comments
Closed

FitMaker #49

jonathanolson opened this issue Aug 13, 2015 · 9 comments

Comments

@jonathanolson
Copy link
Contributor

What type of fit does it do, minimizing least squares via linear regression? (This should be documented).

Besides creation of the specific type of augmented matrix used for the regression, it seems like it's duplicating code from dot's Matrix (ported from JAMA). Is there a benefit to having this re-implemented? (Is it doing anything besides just solving the augmented matrix?)

Also it may not be relevant, but if FitMaker is a performance bottleneck, the use of two-dimensional non-typed arrays, and repeated references to matrix element lookups could be optimized (dot uses one-dimensional typed arrays).

Review: #31

@jonathanolson
Copy link
Contributor Author

I'll handle this.

@veillette
Copy link
Contributor

We will look into using DOT matrices instead.

@veillette
Copy link
Contributor

One can leverage solve in DOT.Matrix to do the heavy lifting.
Of course that simplifies the code base but it will come at the expense of introducing many instances of Matrix. The Dubson implementation made a clever reuse of matrices which is not possible to achieve using Dot.Matrix (unless one uses scratch Matrices).

In this new implementation, all the heavy lifting is done by squareMatrix.solve( columnMatrix );

For the record, the previous implementation of FitMaker was not a performance bottleneck in the simulation.

// Copyright 2015-2016, University of Colorado Boulder

//TODO this needs documentation and cleanup
/**
 * Fit maker for 'Curve Fitting' simulation.
 *
 * @author Andrey Zelenkov (Mlearner)
 */
define( function( require ) {
  'use strict';

  // modules
  var curveFitting = require( 'CURVE_FITTING/curveFitting' );
  var CurveFittingConstants = require( 'CURVE_FITTING/curve-fitting/CurveFittingConstants' );
  var inherit = require( 'PHET_CORE/inherit' );
  var Matrix = require( 'DOT/Matrix' );

  // constants
  var solutionArrayLength = CurveFittingConstants.MAX_ORDER_OF_FIT + 1;

  /**
   * @constructor
   */
  function FitMaker() {

    // @private
    this.solutionArray = [];
  }

  curveFitting.register( 'FitMaker', FitMaker );

  return inherit( Object, FitMaker, {

    /**
     * Returns a numerical array of the coefficients of the polynomial
     * @param {Array.<Point>} points
     * @param {number} order
     * @returns {Array}
     * @public
     */
    getFit: function( points, order ) {

      // column matrix with the coefficients
      var solutionMatrix = this.getSolutionMatrix( points, order );

      // size of the column matrix
      var numberOfRows = solutionMatrix.getRowDimension();

      // the solutionArray has solutionArrayLength elements, regardless of the dimensions of the solution matrix
      // the missing elements are padded with zeros
      for ( var k = 0; k < solutionArrayLength; ++k ) {
        this.solutionArray[ k ] = (k < numberOfRows) ? solutionMatrix.get( k, 0 ) : 0;
      }
      return this.solutionArray;
    },

    /**
     * Returns a column solution matrix, A, containing the coefficients of the polynomial
     * The solution matrix is found by solving the equation, Y = X A
     * where X is a square matrix, and Y is a column matrix.
     *
     * the number of rows of the solution matrix is given by the number of points, or
     * the order +1, whichever is smaller.
     *
     * see http://mathworld.wolfram.com/LeastSquaresFittingPolynomial.html 
     *
     * @param {Array.<Point>} points
     * @param {number} order
     * @returns {Matrix} the column matrix A
     * @private
     */
    getSolutionMatrix: function( points, order ) {

      // the rank of the matrix cannot be larger than the number of points
      var m = Math.min( order + 1, points.length );

      var squareMatrix = new Matrix( m, m ); // matrix X
      var columnMatrix = new Matrix( m, 1 ); // matrix Y

      // fill out the elements of the matrices Y and X
      points.forEach( function( point ) {
        var deltaSquared = point.delta * point.delta;
        var x = point.position.x;
        var y = point.position.y;

        for ( var j = 0; j < m; ++j ) {
          for ( var k = 0; k < m; ++k ) {
            squareMatrix.set( j, k, squareMatrix.get( j, k ) + Math.pow( x, j + k ) / deltaSquared );
          }
          columnMatrix.set( j, 0, columnMatrix.get( j, 0 ) + Math.pow( x, j ) * y / deltaSquared );
        }
      } );

      // the solution matrix, A, is X^-1 * Y
      return squareMatrix.solve( columnMatrix );
    }
  } );
} );

Assigning to @pixelzoom to decide how to proceed.

@veillette veillette assigned pixelzoom and unassigned veillette Aug 5, 2016
@pixelzoom
Copy link
Contributor

pixelzoom commented Aug 5, 2016

@veillette please clarify:

(1) Is the code that you pasted into the above comment the new proposed implementation (which doesn't appear to be pushed to GitHub), or what you are referring to as "the previous implementation"?

(2) Have you profiled the performance of the new implementation? Did you say "the previous implementation of FitMaker was not a performance bottleneck" because the new implementation is a bottleneck?

@pixelzoom pixelzoom assigned veillette and unassigned pixelzoom Aug 5, 2016
@veillette
Copy link
Contributor

Jonathan Olson's comment implied that using dot.Matrix might be a performance booster.
I was pointing out that the Dubson's implementation was not a performance bottleneck in the first place. Dubson's implementation works well and is accurate.

The code I pasted above is the new implementation that makes uses of dot.Matrix and Matrix.solve. It simplifies the codebase. I did not push it to GitHub.
Numerically, it gives the same result (solutionArray) as Dubson's implementation, even for edge cases.

So both approaches are equivalent mathematically (although dot matrix appears to use a different type of solver)

The dot.matrix implementation generates three instances of Matrix for every getFit call so one might think that there would be more garbage collection events.

However, on my laptop, I did not register any performance differences and the number of garbage collection events appears to be similar ( using fuzzMouse=100)

@pixelzoom
Copy link
Contributor

If the results are the same, performance is the same or better, and there's less code, then I would opt for the new implementation. That said, I'm not at all familiar with FitMaker yet, so your call.

@veillette
Copy link
Contributor

The new implementation of fitMaker that relies on DOT was pushed. It will simplify a bit the codebase.
One side benefit of using DOT is that we can use methods of DOT to more easily find edge cases .
For instance issue #105 is due to the fact that the inverse matrix does not exist. This easier to do in DOT.

Running fuzzMouse for five minutes does not reveal any issues.

@SaurabhTotey
Copy link
Member

I believe this issue can be closed. Marking as ready for review.

@veillette
Copy link
Contributor

This issue is so old that its title refers to a module that no longer exists. The current method getBestFitCoefficients in Curve.js fulfills the same role as FitMaker.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants