Update various libraries #2407

Open
wants to merge 29 commits into
base: devel

@joshua-cogliati-inl (Contributor) commented Dec 6, 2024


Pull Request Description

What issue does this change request address?

Closes #2411

What are the significant changes in functionality due to this change request?

Besides the changes that are also in #2250:

  • Drop labelEncoder when pickling KerasBase to support tensorflow 2.14
  • Convert to tuple before using np.hstack to support numpy 1.26
  • Explicitly call .squeeze() where needed to support xarray 2024.7
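
As a rough illustration of the np.hstack item, here is a minimal sketch. It assumes the original call passed a non-sequence iterable (a generator, map object, or dict view), which the PR description does not state. The helper name stack_columns is hypothetical and not part of RAVEN.

```python
import numpy as np

def stack_columns(chunks):
    """Horizontally stack an iterable of arrays (hypothetical helper)."""
    # np.hstack expects a sequence such as a list or tuple, so generators,
    # map objects, and dict views are materialized before the call.
    return np.hstack(tuple(chunks))

pieces = (np.array([i, i + 1]) for i in (1, 3))  # a generator, not a sequence
stacked = stack_columns(pieces)
print(stacked)  # [1 2 3 4]
```

Converting to a tuple is harmless when the input is already a list or tuple, so the same call works across numpy versions.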


For Change Control Board: Change Request Review

The following review must be completed by an authorized member of the Change Control Board.

  • 1. Review all computer code.
  • 2. If any changes occur to the input syntax, there must be an accompanying change to the user manual and xsd schema. If the input syntax change deprecates existing input files, a conversion script needs to be added (see Conversion Scripts).
  • 3. Make sure the Python code and commenting standards are respected (camelBack, etc.) - See on the wiki for details.
  • 4. Automated Tests should pass, including run_tests, pylint, manual building and xsd tests. If there are changes to Simulation.py or JobHandler.py the qsub tests must pass.
  • 5. If significant functionality is added, there must be tests added to check this. Tests should cover all possible options. Multiple short tests are preferred over one large test. If new development on the internal JobHandler parallel system is performed, a cluster test must be added setting, in XML block, the node <internalParallel> to True.
  • 6. If the change modifies or adds a requirement or a requirement based test case, the Change Control Board's Chair or designee also needs to approve the change. The requirements and the requirements test shall be in sync.
  • 7. The merge request must reference an issue. If the issue is closed, the issue close checklist shall be done.
  • 8. If an analytic test is changed/added, is the analytic documentation updated/added?
  • 9. If any test used as a basis for documentation examples (currently found in raven/tests/framework/user_guide and raven/docs/workshop) has been changed, the associated documentation must be reviewed to ensure the text matches the example.

@moosebuild

Job Mingw Test on f03cef1 : invalidated by @joshua-cogliati-inl

failed in fetch

@moosebuild

Job Test qsubs sawtooth on 3d876f7 : invalidated by @joshua-cogliati-inl

failed in fetch

xarray versions before 2024.7 automatically squeezed groupby results, so
squeeze() now needs to be called explicitly.
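
A minimal sketch of what this looks like in practice. The array, dimension names, and the rows dictionary are made up for illustration and are not taken from the RAVEN code.

```python
import numpy as np
import xarray as xr

data = xr.DataArray(
    np.arange(6).reshape(3, 2),
    dims=("sample", "time"),
    coords={"sample": [0, 1, 2], "time": [0.0, 1.0]},
)

rows = {}
for label, group in data.groupby("sample"):
    # Newer xarray keeps "sample" as a length-1 dimension in each group.
    # squeeze() drops length-1 dimensions, so the result is 1-D either way.
    rows[int(label)] = group.squeeze().values

print(rows[1])  # [2 3]
```

Calling squeeze() with no arguments is a no-op on groups that were already squeezed, so the loop should behave the same on older releases.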
@moosebuild

Job Mingw Test on b7b22a0 : invalidated by @joshua-cogliati-inl

failed in fetch

@moosebuild

Job Test Fedora 31 on 6cccb51 : invalidated by @joshua-cogliati-inl

failed in set python environment

@moosebuild

Job Test qsubs sawtooth on 6cccb51 : invalidated by @joshua-cogliati-inl

failed in set python environment with segfault

@moosebuild

Job Test qsubs sawtooth on 6cccb51 : invalidated by @joshua-cogliati-inl

Timeout tests/framework/Optimizers/BayesianOptimizer/Matyas

@moosebuild

Job Mingw Test on 6cccb51 : invalidated by @joshua-cogliati-inl

failed in fetch

@wangcj05 (Collaborator) left a comment

Some minor comments for you to consider. In addition, could you explain the regolded tests? Some are minor changes that I assume are due to the library updates, but some may indicate a problem. Please comment on each regolded test.

@@ -1628,7 +1628,10 @@ def getFundamentalFeatures(self, requestedFeatures, featureTemplate=None):
## IND
#most probabble index
if len(group['Ind']):
modeInd = stats.mode(group['Ind'])[0][0]
try:
Collaborator

Could you add a comment here explaining why we need the try/except?

Contributor Author

Hm, depending on the version of SciPy, we get a differently shaped array because the default for keepdims in stats.mode changed: https://docs.scipy.org/doc/scipy/release/1.11.0-notes.html#expired-deprecations

So probably we should just specify keepdims here and use only one of these options.

Contributor Author

I switched it to use an explicit keepdims.
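
For readers following along, a minimal sketch of an explicit keepdims call. The sample array and variable names are illustrative; the exact line committed in the PR is not shown in this thread.

```python
import numpy as np
from scipy import stats

indices = np.array([3, 1, 3, 2, 3, 1])

# keepdims=True keeps the reduced axis, so .mode is a length-1 array
# on every SciPy release that accepts the keepdims argument (1.9+).
result = stats.mode(indices, keepdims=True)
mode_ind = result.mode[0]
print(mode_ind)  # 3
```

Pinning keepdims means the result shape no longer depends on the SciPy default, so one indexing expression is enough and the try/except is not needed.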

2.0,200.0,1,0.005
2.0,400.0,1,0.005
2.0,600.0,1,0.005
2.0,-1000.0,0,0.0025
Collaborator

Could you explain why all values of z are zeros instead of ones?

Contributor Author

I think it is because the OVO classifier uses:

    <ROM name="estimator" subType="LinearRegression">
      <Features>X,Y</Features>
      <Target>Z</Target>
      <fit_intercept>True</fit_intercept>
      <normalize>True</normalize>
    </ROM>

underneath, and a linear regression is not a good fit for modeling this.
As I mentioned in the commit, this is because of this change: scikit-learn/scikit-learn#22604

Contributor Author

I changed the estimator to Gaussian naive Bayes and the values are now a mixture of zeros and ones.

@joshua-cogliati-inl
Contributor Author

Some minor comments for you to consider. In addition, could you explain the regolded tests? Some are minor changes that I assume are due to the library updates, but some may indicate a problem. Please comment on each regolded test.

  • eb93d44 Regolds tests/framework/PostProcessors/Metrics/gold/scipyMetricsBoolean/pp1_print.csv due to the pending removal of the kulsinski metric.
  • e3955f2 Regolds tests/framework/ROM/SKLearn/gold/data/outOVO.csv due to a fix in scikit-learn for OVO: "FIX Fixes OneVsOneClassifier.predict for Estimators with only predict_proba" (scikit-learn/scikit-learn#22604).
  • e352a56 Regolds several files in tests/framework/ROM/TimeSeries/PolyExponential/gold/ because of scipy changes. (The relative errors were 1e-03 or less, so the results generally agree to 3 significant figures.)
  • 200c8f4 Regolds tests/framework/PostProcessors/Validation/gold/DSS/pp1_print_0.csv and tests/framework/PostProcessors/Validation/gold/DSS/pp3_print_0.csv. I am guessing this comes from the change from scipy.integrate.simps to scipy.integrate.simpson.
  • 938e8be Forces data to have uniform timestep spacing because HAVOK DMD now requires uniform spacing. This changes the results, so tests/framework/ROM/TimeSeries/DMD/gold/HAVOK/outputHAVOKDMD/outputDMD_0.csv was regolded.
  • 1851ed2 Previously we checked Fourier__signal_f__period10.0__phase, but it comes out as either +pi or -pi depending on OS, library version, and other minor changes. This regolds tests/framework/PostProcessors/TSACharacterizer/gold/Basic/chz.csv and tests/framework/PostProcessors/TSACharacterizer/gold/Basic/windowsChz.csv to remove Fourier__signal_f__period10.0__phase.
  • 9544362 Regolds tests/framework/ROM/TimeSeries/DMD/gold/BOPDMD/outputBOPDMD/outputDMD_0.csv because of library changes. The new values match the old ones to 1 significant figure.
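
On the +pi versus -pi phase in 1851ed2: the sketch below illustrates why the two values describe the same phase and how a tiny sign difference flips the reported one. It is only an illustration. The PR removes the column from the gold files rather than comparing modulo 2*pi, and phases_equal is a hypothetical helper, not RAVEN code.

```python
import numpy as np

def phases_equal(a, b, tol=1e-8):
    """Compare two angles in radians modulo 2*pi (hypothetical helper)."""
    # Map both angles onto the unit circle; +pi and -pi land on the same point.
    return bool(abs(np.exp(1j * a) - np.exp(1j * b)) < tol)

# The sign of a zero imaginary part decides which branch np.angle returns.
plus = np.angle(complex(-1.0, 0.0))
minus = np.angle(complex(-1.0, -0.0))
print(plus, minus)
print(phases_equal(plus, minus))  # True
```

A plain numeric diff treats these two values as about 6.28 apart, which is why a gold-file comparison on this column is unstable.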

@wangcj05 (Collaborator) previously approved these changes Dec 18, 2024

Changes look good to me.

@wangcj05
Collaborator

@joshua-cogliati-inl This PR looks good to me. I have two questions:

  1. Have you tested parallel execution on our HPC with this PR?
  2. When do you want to merge it?

@moosebuild

Job Mingw Test on aa1ebb7 : invalidated by @joshua-cogliati-inl

failed in fetch

Successfully merging this pull request may close these issues.

[TASK] update libraries so Python 3.11 is usable
3 participants