Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PwBaseWorkChain: Add report on disabling bands sanity check #629

Merged
merged 2 commits into from
Dec 10, 2020

Conversation

mbercx
Copy link
Member

@mbercx mbercx commented Dec 8, 2020

Fixes #628

When the PwCalculation finishes with exit code 0, the PwBaseWorkChain still executes a sanity check to make sure the highest energy band is not occupied, which is an indication that the calculation was run with insufficient bands in case the input structure is metallic. If the user is running an insulator, however, the check is still performed unless the occupations input of the SYSTEM namelist is specifically set to 'fixed'. Since this is also the default value of pw.x, it might be confusing to users that the number of bands is increased.

Here we add a message to the report logs to explain this to the user in case the SYSTEM.occupations input is not specified, along with a recommendation that the check can be disabled when the input is specifically set to 'fixed'.

@mbercx mbercx requested a review from sphuber December 8, 2020 15:34
@sphuber
Copy link
Contributor

sphuber commented Dec 8, 2020

Just thinking of a potential downside of this: since for the high-throughput work, not explicitly setting the occupations is the default behavior, means that we will have a lot of these report messages. Essentially every PwCalculation iteration in every PwBaseWorkChain. This will give a non-negligible load on the database with these duplicated messages. Question is then if we should have this in the report or simply in the documentation of the workchain? @giovannipizzi

@mbercx
Copy link
Member Author

mbercx commented Dec 8, 2020

This will give a non-negligible load on the database with these duplicated messages.

If this is true, then I'm a little worried about all the paramiko-related error messages due to the connectivity issues we're experiencing connecting with piz daint. Some of the more problematic PwCalculations have 34 ERROR messages in the log, which corresponds to 2317 lines. 😅

@sphuber
Copy link
Contributor

sphuber commented Dec 8, 2020

If this is true, then I'm a little worried about all the paramiko-related error messages due to the connectivity issues we're experiencing connecting with piz daint. Some of the more problematic PwCalculations have 34 ERROR messages in the log, which corresponds to 2317 lines. sweat_smile

Fair point. At some point we will need to think of "archiving" nodes that can either clean the database content, like logs, or even store them in compressed more efficient form.

@giovannipizzi
Copy link
Member

Aren't you using smearing rather than fixed in the first recoinnassance SCF? So the warning should not pop up.

And once you determine it's an insulator, I would suggest to set occupations to fixed otherwise we run with 2x CPU time as the first calculation will always "fail" and need to be rerun with more bands?

To further reduce the messages one could think not to print the message if # bands != # electrons (times spin degeneracy where appropriate, one needs to consider nspin == 1, 2, 4), i.e. after if the # bands has already been explicitly increased

@sphuber
Copy link
Contributor

sphuber commented Dec 9, 2020

And once you determine it's an insulator, I would suggest to set occupations to fixed otherwise we run with 2x CPU time as the first calculation will always "fail" and need to be rerun with more bands?

That is not trivial though, because you will need to introduce a threshold above which a material is considered and insulator and then potentially some functionality to prevent this automatic change of treating the system. I am not convinced the base workchain should automatically do these changes.

@giovannipizzi
Copy link
Member

What I'm saying is indeed that it's the outer caller, after a reconnaissance SCF, can decide to set fixed for insulators (with some threshold).
Therefore, for this type of runs, there will not be warning messages written by the base workchain (the issue discussed above).

@sphuber
Copy link
Contributor

sphuber commented Dec 10, 2020

What I'm saying is indeed that it's the outer caller, after a reconnaissance SCF, can decide to set fixed for insulators (with some threshold).
Therefore, for this type of runs, there will not be warning messages written by the base workchain (the issue discussed above).

Ok perfect, I think we agree then. It is to the caller to suppress this warning by updating the parameters accordingly if they know what they're doing. Would this be ok to merge for you then @giovannipizzi ?

@giovannipizzi
Copy link
Member

Sure! I approved

@sphuber sphuber merged commit 109d100 into aiidateam:develop Dec 10, 2020
@mbercx mbercx deleted the fix/628/report_bands_check branch December 11, 2020 06:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Inform users of how to disable the automatic increase of bands
3 participants