logging for estimate_msm not working #468
Comments
Hi @ksarnoff, Thank you for opening this issue. I will try to look into this soon. If you want, you can help me localize the problem in the meantime. To me it looks like this is actually a problem with the dashboard and not with logging itself. Could you confirm that by trying out other means of reading the log file? For example: `em.criterion_plot("path/to/your/log.db")` or
Both are described in more detail and with examples here
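One more way to check the log independently of the dashboard is to open the file directly. This is only a minimal sketch, assuming the log is an SQLite file (the logging argument expects a path to one); the helper name `summarize_log` is made up for illustration and no particular table names are assumed:

```python
import sqlite3
from pathlib import Path


def summarize_log(path):
    """Return {table_name: row_count} for every table in an SQLite file."""
    path = Path(path)
    if not path.exists():
        raise FileNotFoundError(f"{path} does not exist yet")
    # Open read-only so that a running optimization is not disturbed.
    con = sqlite3.connect(path.resolve().as_uri() + "?mode=ro", uri=True)
    try:
        tables = [
            row[0]
            for row in con.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        return {
            name: con.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in tables
        }
    finally:
        con.close()
```

Calling `summarize_log("log.db")` twice a few minutes apart should show growing row counts if new evaluations are being logged.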
Hi Kim, Thanks for your patience. I tried to replicate your problem with an example from the documentation. However, everything works fine there. Does the following code work on your machine?
# imports
import estimagic as em
import numpy as np
import pandas as pd
# create random number generator
rng = np.random.default_rng(seed=0)
# simulate the data
def simulate_data(params, n_draws, rng):
    x = rng.normal(0, 1, size=n_draws)
    e = rng.normal(0, params.loc["sd", "value"], size=n_draws)
    y = params.loc["intercept", "value"] + params.loc["slope", "value"] * x + e
    return pd.DataFrame({"y": y, "x": x})
true_params = pd.DataFrame(
    data=[[2, -np.inf], [-1, -np.inf], [1, 1e-10]],
    columns=["value", "lower_bound"],
    index=["intercept", "slope", "sd"],
)
data = simulate_data(true_params, n_draws=100, rng=rng)
# calculate moments
def calculate_moments(sample):
    moments = {
        "y_mean": sample["y"].mean(),
        "x_mean": sample["x"].mean(),
        "yx_mean": (sample["y"] * sample["x"]).mean(),
        "y_sqrd_mean": (sample["y"] ** 2).mean(),
        "x_sqrd_mean": (sample["x"] ** 2).mean(),
    }
    return pd.Series(moments)
empirical_moments = calculate_moments(data)
# calculate moments_cov
moments_cov = em.get_moments_cov(
    data, calculate_moments, bootstrap_kwargs={"n_draws": 5_000, "seed": 0}
)
# define simulation function
def simulate_moments(params, n_draws=10_000, seed=0):
    rng = np.random.default_rng(seed)
    sim_data = simulate_data(params, n_draws, rng)
    sim_moments = calculate_moments(sim_data)
    return sim_moments
# run estimation
start_params = true_params.assign(value=[100, 100, 100])
res = em.estimate_msm(
    simulate_moments,
    empirical_moments,
    moments_cov,
    start_params,
    optimize_options="scipy_lbfgsb",
    logging="log.db",
    log_options={"fast_logging": True, "if_table_exists": "replace"},
)
# create criterion plot
em.criterion_plot("log.db")
If so, it should produce a figure similar to this:
By now estimagic is compatible with sqlalchemy 2.x. I produced the above output using:
Even if the above example runs on your computer, I would like to help you find the cause of your problem.
That code does run correctly. I just re-ran my actual code for about 10 minutes and the log.db is not getting generated. Does it only generate once the optimization is done? I thought that wasn't the case, but if it is, it would explain the issue. Otherwise, I can give you the code I am actually feeding into estimate_msm if that's helpful.
Ok, that is good news, even though it makes the debugging harder. The log file is generated after one evaluation of your objective function and then updated after each further evaluation. It would be super helpful to have your code or a small self-contained example that produces the issue for you. If you don't want to post it publicly, you can find my email on my GitHub profile.
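Whether the log is still being written can also be checked from the file timestamps. A minimal sketch using only the standard library; `seconds_since_last_write` is a hypothetical helper, and the sidecar file names assume SQLite's usual `-wal` and `-journal` naming:

```python
import time
from pathlib import Path


def seconds_since_last_write(path, now=None):
    """Seconds since the log file (or an SQLite sidecar file) was last modified."""
    path = Path(path)
    # In write-ahead-logging mode SQLite appends to "<name>-wal" first, so the
    # timestamp of the main file alone can look stale.
    candidates = [path, Path(f"{path}-wal"), Path(f"{path}-journal")]
    mtimes = [p.stat().st_mtime for p in candidates if p.exists()]
    if not mtimes:
        return None  # nothing has been written yet
    now = time.time() if now is None else now
    return now - max(mtimes)
```

A return value of `None` means no file exists yet; a value that keeps growing over several checks suggests that no new evaluation has finished in the meantime.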
Hi Kim, Thanks for sending your code. My feeling is that you just did not wait long enough for something to be logged. Estimagic logs the parameter vector and criterion value after each iteration. For the optimizer you chose (
Switching to a gradient-free optimizer (I tried "nlopt_neldermead") shows that the logging works perfectly. After about 10 minutes the database is created and then there is an update every 5 minutes. The criterion_plot also works. If you don't have a closed-form gradient, a gradient-free optimizer (e.g. "nlopt_neldermead", "nlopt_bobyqa", "nag_pybobyqa") is probably a good choice. Since MSM problems are nonlinear least-squares problems, you could also try a specialized optimizer like "nag_dfols" or "pounders". I would also strongly recommend trying to speed up your
Let me know if this helped and we can close the issue.
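To judge how long to wait for the first log entry, it can help to time a single criterion evaluation. A rough sketch, under the assumption that a gradient-based optimizer approximates the gradient with forward differences, which costs about one extra evaluation per parameter; `time_one_evaluation` and `rough_seconds_per_gradient` are hypothetical helpers, not part of estimagic:

```python
import time


def time_one_evaluation(func, *args, **kwargs):
    """Return the wall-clock seconds taken by a single call to func."""
    start = time.perf_counter()
    func(*args, **kwargs)
    return time.perf_counter() - start


def rough_seconds_per_gradient(seconds_per_evaluation, n_params):
    """Forward differences need one base evaluation plus one per parameter."""
    return seconds_per_evaluation * (n_params + 1)
```

With the documentation example above, `time_one_evaluation(simulate_moments, start_params)` would give the cost of one evaluation, and multiplying up with `len(start_params)` gives a rough feel for how long one numerical gradient takes.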
Yes, switching to a gradient-free optimizer was much faster. The logging works totally fine. Thanks!
Hi,
Bug description
I am using estimate_msm and want to log the output. The log file is being created, but I am not sure it is updating and I can't open it.
To reproduce
Here's my `estimate_msm` code:
Screenshots/Error messages
After some time, log.db did get generated. I wanted to check it, so I ran `estimagic dashboard log.db` from the terminal. I got the following message:
and the page itself had the following message in the console:
Failed to load resource: the server responded with a status of 404 (Not Found)
I am not sure log.db is updating (my computer says no changes have occurred in the past hour). I saw in issue 431 that estimagic is only compatible with sqlalchemy 1.4, so I downgraded to that and it didn't fix it.
System
Mac OS 12.4
estimagic 0.4.6
sqlalchemy 1.4
Thanks,
Kim