-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
XspectraParser
: Parse parameters from CalcJob inputs
#853
Conversation
@@ -87,6 +87,28 @@ def parse(self, **kwargs): | |||
xy_data.set_y(y_arrays, y_names, y_units) | |||
|
|||
parsed_data, logs = parse_stdout_xspectra(filecontent=out_file, codename='XSpectra') | |||
|
|||
# Parse some additional info which the stdout does not reliably report | |||
input_params = self.node.get_incoming().get_node_by_label('parameters').get_dict() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To get an input node, node.inputs
is easier to read and more elegant.
input_params = self.node.get_incoming().get_node_by_label('parameters').get_dict() | |
parameters = self.node.inputs.parameters.get_attribute('INPUT_XSPECTRA') |
Since you also only ever use the INPUT_XSPECTRA
, might as well get that subdictionary straight away.
input_params = self.node.get_incoming().get_node_by_label('parameters').get_dict() | ||
|
||
epsilon_vector = [ | ||
float(input_params['INPUT_XSPECTRA'][f'xepsilon({n})']) for n in [1, 2, 3] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
float(input_params['INPUT_XSPECTRA'][f'xepsilon({n})']) for n in [1, 2, 3] | |
float(parameters[f'xepsilon({n})']) for n in [1, 2, 3] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should maybe consider parameters.get(f'xepsilon({n})', default)
here to prevent KeyError
if the key is not defined. But don't know what a sensible default could be here.
if 'xonly_plot' in input_params['INPUT_XSPECTRA']: | ||
parsed_data['only_plot_spectrum'] = input_params['INPUT_XSPECTRA']['xonly_plot'] | ||
else: | ||
parsed_data['only_plot_spectrum'] = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This common pattern can be simplified with the get(key, default)
paradigm:
if 'xonly_plot' in input_params['INPUT_XSPECTRA']: | |
parsed_data['only_plot_spectrum'] = input_params['INPUT_XSPECTRA']['xonly_plot'] | |
else: | |
parsed_data['only_plot_spectrum'] = False | |
parsed_data['only_plot_spectrum'] = parameters.get('xonly_plot', False) |
if 'xcoordcrys' in input_params['INPUT_XSPECTRA']: | ||
if input_params['INPUT_XSPECTRA']['xcoordcrys']: | ||
parsed_data['vector_coord_system'] = 'crystal' | ||
else: | ||
parsed_data['vector_coord_system'] = 'cartesian' | ||
else: | ||
parsed_data['vector_coord_system'] = 'crystal' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic seems a bit weird to me, is it correct? Basically vector_coord_system
is getting set to crystal
if xcoordcrys
is not set or xcoordcrys
is "truthy" (i.e. not None
, empty string etc.). Otherwise it is cartesian
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see in the input doc that this is a boolean. Why convert to a string? Why not simply do
parsed_data['xcoordcrys'] = parameters.get('xcoordcrys', True)
Actually, now that I read all this. These are all input parameters defined by the user. Why do we "parse" them and add them to the output parameters? If they are already in the input, why not get them from there if needed?
# We need to assign this before the loop over each line due to a bug spotted in the code | ||
# when `only_plot=True`. Since the epsilon vector is reported before the value of | ||
# `xonly_plot`, we need a starting value before the loop begins. | ||
# parsed_data['plot_only'] = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code seems commented out? Is the comment actually correct? Or can this be removed?
tests/parsers/test_xspectra.py
Outdated
'xepsilon(1)': 1, | ||
'xepsilon(2)': 0, | ||
'xepsilon(3)': 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these keys now guaranteed to be in the parameters always? Because if not, the newly added line 95 in the parser will throw a KeyError
if one of these is not defined.
Hi @sphuber, thanks for the review. I'll add an explanation here since there are a number of comments directed at the idea of parsing of the vector coordinate system and the polarisation vectors from the inputs - particularly:
The main reason for the changes proposed are for user convenience, as knowing the polarisation vector and its coordinate system used for the calculation is crucial information that (I felt, at least) should be added to the output parameters in order to make them easier to access. This is particularly useful for the polarisation vector since the Parser automatically compiles each value of Ultimately, the two options to address the issues with parsing were: (1) look in the inputs to get the parameters, to avoid the problems stated in parsing from the stdout, or (2) remove the parsing of these parameters altogether and leave it to the user to search for them manually. Unless I've missed a trick somewhere, is there a way to access these inputs as easy as one could access the output parameter dictionary? |
Thanks for the reply @PNOGillespie that helps a lot.
Well, if the user defines them in the input parameters, then it would be as easy as But as I understand now, these parameters are not required to be defined in the input parameters because a lot of them have defaults that are automatically set by However, that means you have to a be a lot more careful and I would recommend the following rules:
So to summarize I would change the parsing code to: parameters = self.node.inputs.parameters.get_attribute('INPUT_XSPECTRA')
xepsilon_defaults = {
'1': 0,
'2': 0,
'3': 1,
}
parsed_data['xepsilon'] = [float(parameters.get(f'xepsilon({n})', xepsilon_defaults[n]) for n in (1, 2, 3)]
parsed_data['xcoordcrys'] = parameters.get('xcoordcrys', True)
parsed_data['xonly_plot'] = parameters.get('xonly_plot', False |
Thanks, I think these are reasonable suggestions for the Parser. I'll bundle these into a commit with your other suggestions and push it shortly.
Each of these do have defaults, though one thing to bear in mind is that most analysis using XSpectra involves computing the spectra for different polarisation vectors - particularly in the case of crystal systems with low symmetry (e.g. Monoclinic) or ones which contain defects since the spectrum will not be invariant of polarisation vector (see past works on SiO2 & La2CuO4 as well as NiO) - so it would be reasonable to expect the user to always define this themselves. In particular, calculations for different polarisation vectors and analysing polarisation dependence is part of the In any case, I'll make sure the defaults are set properly |
Requested for aiidateam#853 For parameters Parsed from the calculation input, the commit changes their keys to those used in the input parameters: * `epsilon_vectors` => `xepsilon` * `vector_coord_system` => `xcoordcrys` * `only_plot_spectrum` => `xonly_plot` Also adds default values for these keys which match those set by `xspectra.x` and updates the relevant parser tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @PNOGillespie , just one final problem that would actually cause exceptions. Once you fix this, I will merge.
I just want to still note that with this new method of parsing the parameters from the inputs and resorting to default if not defined, it means that the parser can set the wrong defaults if the defaults in xspectra.x
change. For example if QE 7.0 changes default of xcoordcrys
to False, then the Parser would still set True
if not explicitly defined by the user. Again, not sure how likely the defaults are subject to change, but something to be aware of.
Thanks again @sphuber.
I understand, it's something I bear in mind while I'm working on the implementation. If it's any comfort, development work on |
2cd742b
to
fedf953
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @PNOGillespie
Updates the method for the `XspectraParser` class to parse the following parameters from the input of the `CalcJob` class rather than the stdout: * The polarisation (epsilon) vectors used in the calculation * The coordinate system used for the calculation (either `crystallographic` or `cartesian`) * Whether the calculation was a re-plot or a full calculation (`xonly_plot`) The keyword `plot_only` in the `output_parameters` is also changed to `only_plot_spectrum` in order to clarify its meaning. The changes were made necessary as, during further testing involving calculation restarts, it was found that XSpectra stores the polarisation and k-vector used for the calculation in its save file using a cartesian basis regardless of the setting used in the calculation itself. When performing a re-start or re-plot calculation the code reports the polarisation/k-vector stored in the save file, on a cartesian basis, which causes confusion for the Parser - particularly in the case of a restarted calculation as the polarisation vector is then reported twice: once at the initial read of the input file and then again when the save file is read (and in potentially different coordinate systems, depending on calculation settings). It was therefore decided to move the parsing of these parameters from the raw output to just taking them directly from the `CalcJob` inputs.
fedf953
to
4db2759
Compare
Addresses a request made in PR: #852
Updates the method for the
XspectraParser
class to parse the following parameters from the input of the CalcJob class, rather than the stdout:xonly_plot
)Keyword
plot_only
inoutput_parameters
is also changed toonly_plot_spectrum
in order to clarify its meaning.The changes were made necessary as, during further testing involving calculation restarts, it was found that XSpectra stores the polarisation and k-vector used for the calculation in its save file using a cartesian basis regardless of the setting used in the calculation itself. When performing a re-start or re-plot calculation the code reports the polarisation/k-vector stored in the save file, on a cartesian basis, which causes confusion for the Parser - particularly in the case of a restarted calculation as the polarisation vector is then reported twice: once at the initial read of the input file and then again when the save file is read (and in potentially different coordinate systems, depending on calculation settings). It was therefore decided to move the parsing of these parameters from the raw output to just taking them directly from the CalcJob inputs.