path_prefix does not persist through search.json #918

Closed
Jammy2211 opened this issue Feb 1, 2024 · 9 comments · Fixed by rhayes777/PyAutoConf#52

@Jammy2211
Collaborator

The following script:

https://github.com/Jammy2211/autofit_workspace_test/blob/main/scripts/database/directory/general.py

Raises the following error:

<autofit.non_linear.search.nest.dynesty.search.static.DynestyStatic object at 0x7f9c056b15d0>
Traceback (most recent call last):
  File "/mnt/c/Users/Jammy/Code/PyAuto/autofit_workspace_test/scripts/database/directory/general.py", line 158, in <module>
    assert path.join("database", "directory", unique_tag) in search.paths.output_path
  File "/usr/lib/python3.10/posixpath.py", line 90, in join
    genericpath._check_arg_types('join', a, *p)
  File "/usr/lib/python3.10/genericpath.py", line 152, in _check_arg_types
    raise TypeError(f'{funcname}() argument must be str, bytes, or '
TypeError: join() argument must be str, bytes, or os.PathLike object, not 'AttributePredicate'

This is because I added an assertion checking that the path_prefix of the search is loaded correctly:

assert path.join("database", "directory", unique_tag) in search.paths.output_path

The problem is that it is not, which can be seen in the search.json file, where path_prefix is serialised as a pathlib.PosixPath with empty arguments:

{
    "type": "instance",
    "class_path": "autofit.non_linear.search.nest.nautilus.search.Nautilus",
    "arguments": {
        "number_of_cores": 1,
        "name": "species[x1]",
        "unique_tag": null,
        "n_networks": 4,
        "n_like_new_bound": null,
        "seed": null,
        "n_live": 100,
        "n_eff": 500,
        "iterations_per_update": 500,
        "split_threshold": 100,
        "n_points_min": null,
        "path_prefix": {
            "type": "instance",
            "class_path": "pathlib.PosixPath",
            "arguments": {}
        },
        "n_shell": null,
        "n_update": null,
        "enlarge_per_dim": 1.1
    }
}
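
The empty arguments entry matters because a path built with no arguments is just the current directory. Below is a minimal sketch of that, using PurePosixPath so it runs on any platform. The helpers path_to_dict and path_from_dict, and the dictionary layout they use, are hypothetical illustrations and not autoconf's API.

```python
from pathlib import PurePosixPath

# Rebuilding from the JSON above amounts to calling the class with no arguments.
reloaded = PurePosixPath()
print(reloaded)  # the original prefix is gone


def path_to_dict(p):
    """Hypothetical helper: store the path as a plain string."""
    return {"type": "path", "path": str(p)}


def path_from_dict(d):
    """Hypothetical helper: rebuild the path from its stored string."""
    return PurePosixPath(d["path"])


original = PurePosixPath("database") / "directory"
round_tripped = path_from_dict(path_to_dict(original))
print(round_tripped == original)
```

Storing the string form is one possible way to keep the prefix through a round trip; the fix actually merged for this issue may differ.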
@rhayes777
Owner

I've added a fix, but I think the specific exception TypeError: join() argument must be str, bytes, or os.PathLike object, not 'AttributePredicate' is raised because you called aggregator.unique_tag, which gives you an object used for writing queries rather than the value of a unique tag.
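
For illustration, the TypeError can be reproduced without autofit, since os.path.join rejects any argument that is not a str, bytes, or os.PathLike object. The QueryPlaceholder class below is a hypothetical stand-in for the aggregator's AttributePredicate, not the real class.

```python
from os import path


class QueryPlaceholder:
    """Hypothetical stand-in for a query-building object such as AttributePredicate."""


def try_join(tag):
    """Return the joined path, or the TypeError message if `tag` is not path-like."""
    try:
        return path.join("database", "directory", tag)
    except TypeError as error:
        return f"TypeError: {error}"


print(try_join("my_tag"))            # a plain string joins normally
print(try_join(QueryPlaceholder()))  # a query object does not
```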

@Jammy2211
Collaborator Author

I tested this with the following example code:

"""
Searches: DynestyStatic
=======================

This example illustrates how to use the nested sampling algorithm DynestyStatic.

Information about Dynesty can be found at the following links:

 - https://github.com/joshspeagle/dynesty
 - https://dynesty.readthedocs.io/en/latest/
"""
# %matplotlib inline
# from pyprojroot import here
# workspace_path = str(here())
# %cd $workspace_path
# print(f"Working Directory has been set to `{workspace_path}`")

import matplotlib.pyplot as plt
import numpy as np
from os import path

import autofit as af

"""
__Data__

This example fits a single 1D Gaussian, so we load and plot data containing one Gaussian.
"""
dataset_path = path.join("dataset", "example_1d", "gaussian_x1")
data = af.util.numpy_array_from_json(file_path=path.join(dataset_path, "data.json"))
noise_map = af.util.numpy_array_from_json(
    file_path=path.join(dataset_path, "noise_map.json")
)

plt.errorbar(
    x=range(data.shape[0]),
    y=data,
    yerr=noise_map,
    color="k",
    ecolor="k",
    elinewidth=1,
    capsize=2,
)
plt.show()
plt.close()

"""
__Model + Analysis__

We create the model and analysis, which in this example is a single `Gaussian` and therefore has dimensionality N=3.
"""
model = af.Model(af.ex.Gaussian)

model.centre = af.UniformPrior(lower_limit=0.0, upper_limit=100.0)
model.normalization = af.LogUniformPrior(lower_limit=1e-2, upper_limit=1e2)
model.sigma = af.UniformPrior(lower_limit=0.0, upper_limit=30.0)

analysis = af.ex.Analysis(data=data, noise_map=noise_map)

"""
__Search__

We now create and run the `DynestyStatic` object which acts as our non-linear search. 

We manually specify all of the Dynesty settings, descriptions of which are provided at the following webpages:

 https://dynesty.readthedocs.io/en/latest/api.html
 https://dynesty.readthedocs.io/en/latest/api.html#module-dynesty.nestedsamplers
"""
search = af.DynestyStatic(
    path_prefix=path.join("searches"),
    name="DynestyStatic",
    nlive=50,
    bound="multi",
    sample="auto",
    bootstrap=None,
    enlarge=None,
    update_interval=None,
    walks=25,
    facc=0.5,
    slices=5,
    fmove=0.9,
    max_move=100,
    iterations_per_update=2500,
    number_of_cores=1,
)

result = search.fit(model=model, analysis=analysis)



print(search.paths.output_path)

from autoconf.dictable import from_json

search = from_json(file_path=search.paths._files_path / "search.json")

print(search.paths.output_path)

However, the two output_path values printed at the end have different unique ids:

/mnt/c/Users/Jammy/Code/PyAuto/autofit_workspace/output/searches/DynestyStatic/a14161122b532586a68b69acdf10895c
/mnt/c/Users/Jammy/Code/PyAuto/autofit_workspace/output/searches/DynestyStatic/1953c1b63794441f3007ec2103d4e06f

I am not sure if this is related to this issue or something else, but it would be good to understand why this occurred.

@Jammy2211 Jammy2211 reopened this Feb 8, 2024
@rhayes777
Owner

When paths is serialised and then reloaded, the model attribute is dropped, so when the identifier is regenerated it gives a different value. I guess we could serialise the identifier to ensure it remains the same?
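
A minimal sketch of why a dropped model changes the identifier, assuming the identifier is a hash over the search settings and a description of the model. The make_identifier function and its inputs are hypothetical and much simpler than autofit's own identifier logic.

```python
import hashlib
import json


def make_identifier(search_settings, model_description):
    """Hypothetical identifier: an MD5 hex digest of everything that defines the fit."""
    payload = json.dumps(
        {"search": search_settings, "model": model_description},
        sort_keys=True,
    )
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


settings = {"name": "DynestyStatic", "nlive": 50}
model = {"centre": "UniformPrior(0.0, 100.0)"}

before = make_identifier(settings, model)  # model attached
after = make_identifier(settings, None)    # model dropped on reload
print(before != after)
```

The same settings with a missing model hash to a different value, which would place output under a different directory.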

@rhayes777
Owner

Oh, actually that does happen. The issue is that Search does not retain Paths when serialised because of the use of kwargs.
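
One way this can happen, sketched under the assumption that the serializer records only the named parameters of __init__. That is an assumption about the mechanism, not a description of autoconf's code, and ToySearch and to_dict are hypothetical.

```python
import inspect


class ToySearch:
    """Hypothetical search whose `paths` arrives through **kwargs."""

    def __init__(self, name, **kwargs):
        self.name = name
        self.paths = kwargs.get("paths")


def to_dict(obj):
    """Hypothetical serializer: keep only named constructor parameters."""
    parameters = inspect.signature(type(obj).__init__).parameters
    names = [
        key
        for key, parameter in parameters.items()
        if key != "self" and parameter.kind is not inspect.Parameter.VAR_KEYWORD
    ]
    return {key: getattr(obj, key) for key in names}


search = ToySearch(name="DynestyStatic", paths="output/searches")
serialised = to_dict(search)
print(serialised)  # `paths` is missing, so a reload cannot restore it
```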

@Jammy2211
Collaborator Author

I just tested the script above on those two branches, but I still get different unique ids at the end:

/mnt/c/Users/Jammy/Code/PyAuto/autofit_workspace/output/searches/DynestyStatic/a14161122b532586a68b69acdf10895c
/mnt/c/Users/Jammy/Code/PyAuto/autofit_workspace/output/searches/DynestyStatic/1953c1b63794441f3007ec2103d4e06f

@rhayes777
Owner

Ah ok, so whenever search or model is set, the existing identifier is removed so that it must be regenerated. I guess this is for safety, so we don't accidentally generate output under the wrong directory. I think the model never gets attached back to the paths object, which gives a different output path when the identifier is regenerated.

How important is this issue? I could remove the mechanism that clears the identifier?
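
The clearing mechanism described in this comment can be sketched as a property setter that invalidates a lazily generated identifier. ToyPaths is a hypothetical class written for illustration; autofit's real paths object is more involved.

```python
import hashlib


class ToyPaths:
    """Hypothetical paths object with a lazily generated identifier."""

    def __init__(self, identifier=None):
        self._model = None
        self._identifier = identifier

    @property
    def model(self):
        return self._model

    @model.setter
    def model(self, value):
        self._model = value
        self._identifier = None  # setting the model discards any existing identifier

    @property
    def identifier(self):
        if self._identifier is None:
            # Regenerated from whatever model is attached at this moment.
            self._identifier = hashlib.md5(repr(self._model).encode()).hexdigest()
        return self._identifier


paths = ToyPaths()
paths.model = "Gaussian(centre, normalization, sigma)"
original_id = paths.identifier

reloaded = ToyPaths(identifier=original_id)  # identifier restored from disk
reloaded.model = None                        # any assignment clears it again
print(reloaded.identifier == original_id)
```

In this toy version, a restored identifier survives only until the next assignment to model, after which it is rebuilt from whatever model is attached at that point.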

@Jammy2211
Collaborator Author

It's unlikely to come up, but when it does it is pretty toxic, so it is probably quite important.

@rhayes777
Owner

Cool, I've thought of a fairly simple fix.

@rhayes777
Owner

The issue is now resolved
