
[HELP] Linear regression with errors #103

Open
mpound opened this issue Mar 4, 2021 · 5 comments
mpound commented Mar 4, 2021

Hi Samuel. I have run through the notebooks modopt_example.ipynb, inverse_problems_1.ipynb, and sparsity_1.ipynb in the CosmoStat/Tutorials ada branch, playing with the parameters to improve my understanding. I am wondering how to take the next step of fitting the curve when there are known random errors in y (i.e. a weighted fit). Is it a term added to the cost function? Is there a ModOpt class to handle this?
thanks,
Marc

mpound commented Mar 5, 2021

OK, H changes to H = (X^T W X)^{-1} X^T W, where W is the (diagonal) matrix of weights.
I'm not yet reproducing the correct answer with W = I, but I think the above is correct.
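A quick sanity check of that formula in plain NumPy (my own toy data, not from the notebooks): with W set to the identity, the weighted estimator should reduce to the ordinary normal equation:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
t = np.linspace(0, 1, n)
X = np.column_stack([np.ones(n), t])        # design matrix for y = a + b*t
y = 2.0 + 3.0 * t + rng.normal(0, 0.1, n)   # noisy line

# Ordinary least squares: (X^T X)^{-1} X^T y
beta_ols = np.linalg.pinv(X.T @ X) @ X.T @ y

# Weighted least squares with W = I should give the same answer
W = np.eye(n)
beta_wls = np.linalg.pinv(X.T @ W @ X) @ X.T @ W @ y

assert np.allclose(beta_ols, beta_wls)
```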

sfarrens commented Mar 8, 2021

Hi @mpound, apologies for the delay in addressing this issue. I was not available last week.

I have not really thought about a weighted-fit example. I mainly use these basic regression problems as a very simple way to illustrate how convex optimisation works. I assume that, given the points in x and y and the weights, the problem can still be solved analytically, in which case it would simply be a matter of defining the gradient appropriately in ModOpt. I will try to find some time to play with this.
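For reference, the weighted cost would be 0.5 * ||W^{1/2} (X w - y)||^2, whose gradient is X^T W (X w - y). A rough plain-NumPy sketch of a gradient-descent solver (not the ModOpt API, just to show the idea; the data here are made up):

```python
import numpy as np

def wls_grad(X, W, y, w):
    # Gradient of 0.5 * ||W^{1/2} (X @ w - y)||^2 with respect to w
    return X.T @ W @ (X @ w - y)

def wls_gradient_descent(X, W, y, n_iter=3000):
    # Fixed step size from the Lipschitz constant of the gradient
    step = 1.0 / np.linalg.norm(X.T @ W @ X, 2)
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        w -= step * wls_grad(X, W, y, w)
    return w

# Toy heteroscedastic data: the second half of the points is noisier
rng = np.random.default_rng(1)
t = np.linspace(0, 1, 20)
X = np.column_stack([np.ones_like(t), t])
sigma = np.where(t > 0.5, 0.3, 0.1)
y = 1.0 + 2.0 * t + rng.normal(0, sigma)
W = np.diag(1.0 / sigma**2)

w_gd = wls_gradient_descent(X, W, y)
# Should converge to the analytic weighted solution
w_exact = np.linalg.pinv(X.T @ W @ X) @ X.T @ W @ y
```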

mpound commented Mar 8, 2021

Hi Samuel. Yes, the simple regression problem could be solved in other ways. I am trying to teach myself with the simple, understandable case before moving on to more complex ones. I am not used to thinking of things in linear-algebra terms, so I have to familiarize myself with that. Ultimately, my goal is to fit a combination of linear functions at every pixel in a spatial map, adding regularization to ensure a spatially smooth solution.

The science case is fitting an H_2 excitation diagram with two linear functions of temperature at every pixel, given spatial maps of line intensity. The fit can be sensitive to measurement errors (it is ill-conditioned), so I'm hoping the ModOpt package can help.
Example: (attached image: two-temperature excitation-diagram fit)

sfarrens self-assigned this Mar 8, 2021
sfarrens commented Mar 8, 2021

@mpound That makes sense. I will try to find some time to add a simple example using weighted data points so that you can have a better idea of how to manipulate the appropriate ModOpt objects for your application. This could be helpful for other users also.

mpound commented Mar 9, 2021

@sfarrens Here is a sample method for applying weights to a fit. I tested it on the linear and polynomial fit examples in inverse_problems_1.ipynb.

    import numpy as np

    # Implements the weighted normal equation for a design matrix H,
    # a diagonal weight matrix W, and data points y,
    # e.g. W = np.diag(1 / sigma_y ** 2) for measurement errors sigma_y.
    def normal_eq_wt(H, W, y):
        q = H.T @ W @ H
        # pinv handles the (possibly ill-conditioned) inversion
        qinv = np.linalg.pinv(q)
        return qinv @ H.T @ W @ y
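And a usage sketch on made-up data (repeating the function so the snippet is self-contained): weighting down the noisy points recovers the underlying line, and with W = I the result matches np.linalg.lstsq:

```python
import numpy as np

def normal_eq_wt(H, W, y):
    # Weighted normal equation: (H^T W H)^{-1} H^T W y
    return np.linalg.pinv(H.T @ W @ H) @ H.T @ W @ y

rng = np.random.default_rng(42)
t = np.linspace(0, 1, 50)
H = np.column_stack([np.ones_like(t), t])   # design matrix for y = a + b*t
sigma_y = np.full_like(t, 0.05)
sigma_y[40:] = 0.5                          # last ten points are much noisier
y = 1.0 + 2.0 * t + rng.normal(0, sigma_y)

W = np.diag(1.0 / sigma_y**2)
coeffs = normal_eq_wt(H, W, y)              # should be close to [1, 2]

# Unweighted case agrees with numpy's least-squares solver
unweighted = normal_eq_wt(H, np.eye(len(t)), y)
assert np.allclose(unweighted, np.linalg.lstsq(H, y, rcond=None)[0])
```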
