Create release notes for version 10.0.0 #7770

Closed
eivindjahren opened this issue Apr 25, 2024 · 0 comments · Fixed by #7875

Changes to the Manage experiments tool

The Manage experiments tool is now easier to work with. Previously, it showed only text information about each experiment:
Screenshot from 2024-04-30 08-49-37

Now there is a separate panel for viewing the information:

image

Longer retries for license error

For ECL100 jobs, the license server may be overloaded by too many concurrent realizations. To mitigate this, we have extended the retry period for license errors from a maximum of 6 minutes to between 19 and 32 minutes (depending on randomized values). To check whether your realization is waiting for the license server to respond while the experiment is running, click on "show details":
screenshot_from_2024-04-29_15-57-24_720

Click on the square for the long-running realization (realization 0 in the image):
screenshot_from_2024-04-29_15-57-37_720

Now click the "OPEN" button in the STDERR column of the ECLIPSE100 row:

screenshot_from_2024-04-29_15-57-45_720

If the message contains "Eclipse failed due to license failure, retrying in XXs", the license server is busy, and we will automatically retry running Eclipse after the specified number of seconds.

screenshot_from_2024-04-29_15-57-51_720
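
The same check can be run on a saved copy of the STDERR text. The following is a minimal sketch and not part of ert: the helper name `license_retry_delay` is hypothetical, and it assumes that the `XX` in the message is a plain number of seconds.

```python
import re
from typing import Optional

# Message text as quoted in these notes; "XX" is ASSUMED to be a number of seconds.
LICENSE_RETRY_PATTERN = re.compile(
    r"Eclipse failed due to license failure, retrying in (\d+(?:\.\d+)?)s"
)


def license_retry_delay(stderr_text: str) -> Optional[float]:
    """Return the delay from the last license retry message, or None if absent."""
    matches = LICENSE_RETRY_PATTERN.findall(stderr_text)
    if not matches:
        return None
    # The last message reflects the most recent retry attempt.
    return float(matches[-1])


if __name__ == "__main__":
    sample = "Eclipse failed due to license failure, retrying in 42s\n"
    print(license_retry_delay(sample))
```

A return value of `None` means that no license retry message was found in the text, so the realization is probably slow for another reason.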

Breaking changes for plugins, forward models and API endpoints

There are a few breaking changes, which only affect users of the storage API and plugins and ertscripts that use LibresFacade:

  • The storage API endpoint "/ensembles/{ensemble_id}/responses/{response_name}/data" is removed.
  • The deprecated methods grid, gen_data_keys, gen_kw_keys, all_data_type_keys, observation_keys of LibresFacade are removed.
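
To find out whether a plugin or ertscript is affected, its source can be scanned for the removed method names. The following is a rough sketch, not an ert tool: the helper `find_removed_method_uses` is hypothetical, and it matches on attribute names only, so it cannot tell whether the object is actually a LibresFacade. Common names such as `grid` may therefore produce false positives.

```python
import ast

# Method names listed above as removed from LibresFacade in 10.0.0.
REMOVED_FACADE_METHODS = {
    "grid",
    "gen_data_keys",
    "gen_kw_keys",
    "all_data_type_keys",
    "observation_keys",
}


def find_removed_method_uses(source: str) -> list[tuple[int, str]]:
    """Return sorted (line number, attribute name) pairs for matching attribute accesses."""
    tree = ast.parse(source)
    hits = [
        (node.lineno, node.attr)
        for node in ast.walk(tree)
        if isinstance(node, ast.Attribute) and node.attr in REMOVED_FACADE_METHODS
    ]
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical plugin code, used only to illustrate the check.
    example = "keys = facade.gen_kw_keys()\nobs = facade.observation_keys\n"
    for lineno, name in find_removed_method_uses(example):
        print(f"line {lineno}: uses removed attribute '{name}'")
```

Each reported line should be reviewed by hand before upgrading to 10.0.0.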

List of new commits in 10.0.0

BREAKING CHANGES

aea1d7c Remove storage API endpoint "/ensembles/{ensemble_id}/responses/{response_name}/data" by Øyvind Eide (#7566)
b090537 Remove deprecated method LibresFacade.grid by Eivind Jahren (#7567)
c656cb2 Remove method gen_data_keys from LibresFacade by Øyvind Eide (#7520)
4b0382a Remove method gen_kw_keys from LibresFacade by Øyvind Eide (#7520)
be986e7 Remove method all_data_type_keys from LibresFacade by Øyvind Eide (#7520)
7812d6c Remove method observation_keys from LibresFacade by Øyvind Eide (#7520)

GUI CHANGES

c86ab2d Write response as yaml by Frode Aarstad (#7692)
1065153 Rename Jobs to Forward Model in GUI by Jonathan Karlsen (#7652)
82db76d Remove duplicate data by Frode Aarstad (#7623)
01c3c30 Remove resdata version from about page by Eivind Jahren (#7460)
53daee4 Add an experiment info widget by Frode Aarstad (#7525)

GUI BUG FIXES

2ba5143 Make sure that rerun jobs do not show an error message in the GUI by Dan Sava (#7696)
fc699f6 Check for selected ensemble in evaluate ensemble panel by Frode Aarstad (#7680)
bb9b46b Fix GUI RunDialog not expanding details header by Jonathan Karlsen (#7617)
e5aa0f3 Fix memory being reported without units in GUI by Jonathan Karlsen (#7656)

BUG FIXES

82a2d12 Fixed an issue where local queue would error due to it being killed before starting by Håvard Berland (#7710)
10601d4 Ensure that opening an older version of ert with a newer version of storage shows an informative error message by Frode Aarstad (#7719)
8336e51 Fixed formatting of file open error on bsub (LSF) in by Håvard Berland (#7671)
bd0d082 Ensure that the storage lock is acquired before writing by Øyvind Eide (#7570)
fc8a405 Fixed an issue where empty storage directories were migrated by Øyvind Eide (#7570)
4ababea Establish connection and empty the event queue before cancelling tasks by Julius Parulek (#7562)
d441394 Fix bug where all ensembles would show instead of just the initialized ones by Øyvind Eide (#7538)

DOCUMENTATION

907f0b8 Fix documentation for LSF memory booking by Jonathan Karlsen (#7654)
2600d20 Document storage by Aron Høyer (#7254)
48a15b7 Document how delete_directory works on symlinks by Håvard Berland (#7444)
3fdff88 Split release notes in highlighted changes and change log by Øyvind Eide (#7542)

SCHEDULER LSFDRIVER

606a4e0 Have scheduler bkill not retry on error Job already finished by Jonathan Karlsen (#7714)
0217bfd Add -o option to bsub for stdout from LSF by Håvard Berland (#7724)
8548bcd Retry bkill on intermittent ssh failures by Håvard Berland
120de54 Let jobs failing in LSF be resubmitted by Håvard Berland (#7684)
4e7300e Fix LSF driver logging wrong message when killing by Jonathan Karlsen (#7672)
8082875 Implement EXCLUDE_HOST for scheduler lsf driver by Jonathan Karlsen (#7543)
70122c9 Solve race condition in lsf_driver for job_ids by Håvard Berland (#7581)
ade18be Revert "Implement EXCLUDE_HOST for scheduler lsf driver" by Håvard Berland (#7547)
5bb33aa Implement EXCLUDE_HOST for scheduler lsf driver by Jonathan Karlsen (#7441)
89f5179 Implement LSF_RESOURCE queue option for LSF_DRIVER by Jonathan Karlsen (#7441)
1c1badd Make fallback mechanism for failing bjobs by Håvard Berland (#7299)

REFACTORING

108a888 Remove unused code ensemble_evaluator_utils by Jonathan Karlsen (#7682)
e4d8f7f Rename JobRunner to ForwardModelRunner by Jonathan Karlsen (#7628)
29d0fe4 Move duplicate ev_types to ert/event_type_constants.py by Jonathan Karlsen (#7683)
1eaed7c Have async functions use async_utils eventloop by Jonathan Karlsen (#7624)
c600efb Remove start sync event from scheduler job.call by Jonathan Karlsen (#7616)
d55e0ef Rename async_utils get_event_loop() to get_running_loop() by Jonathan Karlsen (#7618)
20768fa Fix py3.12 asyncio warning in async_utils by Jonathan Karlsen (#7597)
bc992ec Rename Job.call to job.run in scheduler by Jonathan Karlsen (#7615)
afd2437 Remove pass statement by Håvard Berland (#7607)
b4dc96d Solve py312 warning on pkgutil.get_loader (#7597)
7ba3fb5 Solve deprecation warning from datetime on UTC by Håvard Berland (#7584)
1186971 Refactor _get_obs_and_measure_data by Feda Curic (#7480)
83b3dd2 Refactor storage migrations by Øyvind Eide (#7570)
b21fab1 Use ruff also as formatter by Lars Evje (#7503)
b71fdde Fix ErtPlugin interface staticmethods by Jonathan Karlsen (#7540)
15ff246 Fix ruff preview rule PLR6301 by Jonathan Karlsen (#7529)
85fd3cd Move ignore check one level higher by Øyvind Eide (#7570)

NON-USERFACING CHANGES

5817450 Improve scheduler driver execute_with_retries method to log exit code, stdout, and stderr by Håvard Berland (#7769)
0eb422f Log stdout when bhist fails by Håvard Berland (#7697)
b9b60b3 Changed how PARALLEL is read from DATAFILE by Eivind Jahren (#7460)
0e3f9b5 Use macOS-14 runners for ert by Andreas Eknes (#7592)
5428155 Make resdata a dev-dependency by Eivind Jahren (#7460)
5f167f4 Remove logging of util_abort, it can no longer happen by Eivind Jahren (#7460)
3be64fd Add logging to LocalDriver by Håvard Berland (#7549)
3bc1f30 Have lsf driver bkill with SIGKILL after SIGTERM by Jonathan Karlsen (#7482)
a71a6df Have lsf_driver specify SIGKILL when bkilling by Jonathan Karlsen (#7433)
66e7fe5 Improved log message when killing jobs on local queue by Håvard Berland (#7556)

TESTING

4ec192a Add SIGKILL to possible returncodes from kill by Håvard Berland (#7586)
b885579 Fix dark storage performance benchmark test not compatible with async_utils event loop by Jonathan Karlsen (#7737)
998751b Increase tolerance in truncated_normal test by Eivind Jahren (#7726)
2586a4f Fix a lifetime issue of GUILogHandler by Eivind Jahren (#7730)
c093e12 Bump ruff to v0.4.0 by Lars Evje (#7707)
d6da821 Make flakiness test for qstat faster by Håvard Berland (#7688)
57b4398 Reduce timeout in order to speed up tests by Håvard Berland (#7678)
21387c1 Allow pre-commit to run without failure in CI after merge by Håvard Berland (#7631)
dbafb2c Add test to confirm that the job does not retry when actively cancelled by Julius Parulek (#7625)
61046df Increase timeout for cluster integration tests by Håvard Berland (#7620)
880886e Fix storage instance used outside of storage context by Jonathan Karlsen (#7614)
076b6d8 Fix test_small_time_mismatches_are_ignored by Sondre Sortland (#7613)
6f49e28 Protect main branch from git commit with pre-commit by Håvard Berland (#7585)
f4b0b24 Fix hanging cli integration test by Jonathan Karlsen (#7601)
cfad65e Let mocked bkill support specific kill signal by Håvard Berland (#7599)
27a9060 Increase timeout in unresponsiveness test by Håvard Berland (#7552)
b5a72e3 Add --disable-monitor to Ert client runs in integration tests by Håvard Berland (#7552)
414bea7 Fix storage instance used outside of storage context by Jonathan Karlsen (#7579)
7145a29 Test that pbs driver ignores qstat flakiness (#7414)
06bc54e Fix string arguments to driver.submit() by Håvard Berland (#7549)
cdc90ba Fix integration test lsf driver invalid resource requirement by Jonathan Karlsen (#7553)
9c51f1c Ensure no underflow in parameter_example_test by Eivind Jahren (#7580)
