Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the ElectronicBandsData data plugin #5033

Conversation

sphuber
Copy link
Contributor

@sphuber sphuber commented Aug 1, 2021

Fixes #5032

This data plugin is a subclass of the BandsData plugin. It adds
support to define a fermi energy, which is set as an attribute. This is
a typical use case for electronic band structures, which so far were
using the BandsData type, but the latter was also being used for other
band structures, such as phonons, which do not share this property.

By making it a subclass, existing processes that accept BandsData
instances will automatically also accept ElectronicBandsData instances
and queries for BandsData will also match the subclasses, unless the
user turns of subclassing explicitly.

@sphuber
Copy link
Contributor Author

sphuber commented Aug 1, 2021

Probably need to add a fermi_energy_unit property as well, but first wanted to gauge if this solution might cover all use cases that we know off.

@codecov
Copy link

codecov bot commented Aug 1, 2021

Codecov Report

Merging #5033 (e473aac) into develop (a468519) will increase coverage by 0.03%.
The diff coverage is 96.56%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #5033      +/-   ##
===========================================
+ Coverage    80.23%   80.25%   +0.03%     
===========================================
  Files          515      517       +2     
  Lines        36757    36784      +27     
===========================================
+ Hits         29489    29519      +30     
+ Misses        7268     7265       -3     
Flag Coverage Δ
django 74.75% <96.56%> (+0.03%) ⬆️
sqlalchemy 73.65% <96.56%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
aiida/cmdline/commands/cmd_data/cmd_show.py 16.67% <0.00%> (ø)
aiida/orm/nodes/data/array/bands/__init__.py 100.00% <100.00%> (ø)
aiida/orm/nodes/data/array/bands/bands.py 76.55% <100.00%> (ø)
aiida/orm/nodes/data/array/bands/electronic.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a468519...e473aac. Read the comment docs.

Copy link
Member

@mbercx mbercx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sphuber! Just left a few questions/comments, and still have a more general one that's a bit of semantic nit-picking. The terms Fermi energy and Fermi level are often used interchangeably, but I think it's more precise to use Fermi energy for "the energy difference between the highest and lowest occupied single-particle states in a quantum system of non-interacting fermions at absolute zero temperature":

https://en.wikipedia.org/wiki/Fermi_energy

Whereas the Fermi level is the one we are adding to the BandsData here, I think:

https://en.wikipedia.org/wiki/Fermi_level

aiida/orm/nodes/data/array/bands/electronic.py Outdated Show resolved Hide resolved

KEY_FERMI_ENERGY = 'fermi_energy'

def __init__(self, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I must admit I do not like this constructor using **kwargs from a UX point of view. ^^ Basically it's impossible to tell for a user that checks the docstring how to use it. What's the advantage of this, and does it outweigh the benefit of the user simply seeing the input arguments and having them documented in the docstring?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the interface. However, it does bring other potential use problems with it. See the commit message for an explanation. Using the / marker to make them positional only "solves" the problem somewhat.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, the / positional argument only marker is only supported for Python 3.8 and up. Not sure if there is a backport solution for this, but I doubt it since it concerns the basic syntax. @chrisjsewell do you happen to know?

@sphuber
Copy link
Contributor Author

sphuber commented Aug 2, 2021

Thanks for the review @mbercx . Fine for me to change to fermi_level. Maybe it would be good to raise this point during a CWF meeting and see that others agree. Would also be a good moment to see if people think we should add an attribute for the energy unit or whether we enforce eV as has been the norm for data types in aiida-core so far. I think being explicit and storing the unit is the better option even though that means we'd be breaking with current "standard" practice.

@sphuber sphuber force-pushed the feature/5032/bands-data-fermi-energy branch 2 times, most recently from a1ac9f6 to 6001483 Compare August 3, 2021 08:30
This data plugin is a subclass of the `BandsData` plugin. It adds
support to define a fermi energy, which is set as an attribute. This is
a typical use case for electronic band structures, which so far were
using the `BandsData` type, but the latter was also being used for other
band structures, such as phonons, which do not share this property.

By making it a subclass, existing processes that accept `BandsData`
instances will automatically also accept `ElectronicBandsData` instances
and queries for `BandsData` will also match the subclasses, unless the
user turns of subclassing explicitly.

The constructor would have used the `/` marker to make the `kpoints`,
`bands` and `fermi_energy` arguments positional only. The reason is that
the constructor needs to allow for arbitrary keyword arguments to be
passed to the parent constructor. This is necessary to be able to change
the `user` of the node for example. Without making the other arguments
positional only, a caller could define them as a keyword argument,
causing them to be passed to the constructor of the parent class, which
will raise an exception since it does not define those arguments.

However, we unfortunately still support Python 3.7 and `/` was not
added until Python 3.8, so for now we cannot use it yet.
@sphuber sphuber force-pushed the feature/5032/bands-data-fermi-energy branch from 6001483 to e473aac Compare August 3, 2021 09:02
@giovannipizzi
Copy link
Member

If the units in BandsData are always eV (I think this should be the case now that we are defining a Electronic type), I would vote to also force the fermi_level to be in eV and document it. I think it just makes our life too complex if we need to then define a new layer to convert units between various ElectronicBandsData. If we really want to have multiple units, I still think the units of the Fermi level should be the same as for the bands, so we would define only one unit for bands and the Fermi level.

Also, if we want to go down this route of defining an ElectronicBandsData type, we should move in the new type all functionality that is specific to electronic bands (two bands channels = spin up/down, the find_bandgap function, ...)?
I'm still wondering if this is really useful, in the sense that for me it's a bit different than the case of Kpoints, where we really were storing two different types of kpoints (grid vs. explicit points), and some methods just except if the other type has been set. Here there's just the definition of a "zero", and one could give a more general name (energy_zero for instance). So I think that having a subclass is less critical. But anyway, I'm not against it if we properly move electron-specific methods to this subclass and the users won't notice when the plugin change type.

@mbercx
Copy link
Member

mbercx commented Aug 11, 2021

Also, if we want to go down this route of defining an ElectronicBandsData type, we should move in the new type all functionality that is specific to electronic bands (two bands channels = spin up/down, the find_bandgap function, ...)?

I'm definitely in favor of implementing the ElectronicBandsData type, and adding the methods you mention. The find_bandgap method would really be useful, I think. How would I now obtain a band gap from a BandsData node? This should be very straightforward for users.

@giovannipizzi
Copy link
Member

giovannipizzi commented Aug 11, 2021

How would I now obtain a band gap from a BandsData node?

Ok, my fault - I was under the false impression that there was a band.find_bandgap() method, instead there is only a function. I would suggesting to move this to aiida.tools.data. To make it easy, I think we should add also the datanode.tools.* extensions similar to what I think we have for calcjobs already, but I don't think we have yet for data nodes? So one can just add things to aiida.tools rather than to the class itself, that probably should just focus son how to store the data internally, but still you will be able to do bandnode.tools.find_bandbap rather than

from aiida.tools.data.array.bands import find_bandgap
find_bandgap(bandnode)

(with the additional benefit that one can do node.tools.<TAB> to discover which tools exist for that node type.

I can't find the issue on data.tools, maybe @sphuber remembers? I think we discussed on the train back from the 2019 coding week.

EDIT: this is the corresponding .tools for calcjobs:

def tools(self) -> 'CalculationTools':

@sphuber
Copy link
Contributor Author

sphuber commented Aug 11, 2021

I can't find the issue on data.tools, maybe @sphuber remembers? I think we discussed on the train back from the 2019 coding week.

The reason that there is no data.tools analogue to the calculation.tools is because they do not have the same problem. The reason that we implemented the calculation tools concept was because the process node classes in AiiDA are not subclassable. This means that people cannot add their own custom functionality such as utility methods that operate on the various process nodes. The same does not go for Data nodes. These are subclassable and so any utility functions can be added directly on those classes. If the ElectronicBandsData needs a get_band_gap method, then we should just implement on that class. I don't see the added value of implementing this elsewhere through some complicated entry point system.

@mbercx
Copy link
Member

mbercx commented Aug 11, 2021

Regarding placing these in e.g. BandsData.tools.<method_name>: Although I do understand that adding these here will make them easier to find considering the vast number of methods available for all node classes, we do need to make sure that this is documented very well, and also introduced at the tutorial.

However, I would be more in favor of not having this kind of workaround, which again requires us to teach the users something to just find the utility methods they need. Instead, it would be better if we could hide methods that users typically do not need, so they can more easily find the methods that we would place in .tools. I think I already discussed this with @chrisjsewell at some point, and he mentioned a way to do this elegantly which I now forgot. (this was in the context of #4975)

@giovannipizzi
Copy link
Member

OK to keep it in the main class, but indeed for a user it's overwhelming to find all node methods (that they might not need) + the methods acting on the nodes. Also this makes data classes quite complex, again with a design principle that I would like to push forward that we keep data classes as stable as possible over time, just focusing on storing data in the DB in a consistent format, and move code that processes it outside.

As a historical note we had tried in AiiDA 0.x to hide methods (prefixing with underscore) that might not be needed by a end user (e.g. set_attribute), but then we agreed that this was not the right approach, as it was actually hiding methods that, from a python point of view, were not really private, as e.g. a plugin or some other AiiDA-core class might want to call them to set an attribute.
If there is some other way to do the "right" thing I'm all ears (maybe it's overriding the __dir__? But I don't know if I would do it, the user would be confused of not finding a method in dir but still be able to use it. Namespacing feels better to me.

But anyway this is also a matter of taste, and it's not crucial for this issue.

@mbercx
Copy link
Member

mbercx commented Oct 17, 2021

Fine for me to change to fermi_level. Maybe it would be good to raise this point during a CWF meeting and see that others agree.

Just a note here: This was discussed during a recent CWF meeting, and I think it was agreed that Fermi level is a slightly more correct terminology. However, I did not get the impression that everyone was as passionate about this sematic nit-picking as I. 😅

@unkcpz unkcpz added the topic/materials-science-related Issues that relate to the materials science label Nov 11, 2021
@sphuber
Copy link
Contributor Author

sphuber commented Jan 25, 2022

Will close this PR for now since it has been inactive for a long time and I don't anticipate having time soon to address the changes. Will reopen when there is a new design incorporating the suggested changes.

@sphuber sphuber closed this Jan 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic/materials-science-related Issues that relate to the materials science
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add support for fermi energy to the BandsData
4 participants