Reading net charges from outSileSiesta #307
Great, good idea!
To make things easier, you could use the function
It is more complicated to generalize for Mulliken, because it can be orbital resolved, and the structure of how it's written (even if only atom resolved) is very different! :( Coupling that with the fact that sisl can already generate mulliken from the density matrix, I thought it made sense to only parse hirshfeld and voronoi, which are formatted exactly in the same way. That's also why I called it
Yes, I thought about that too. Although there's a subtle difference. All the information of the scf is always written and in
Geez, there's always an edge case for non-collinear spin hahaha. Are voronoi and hirshfeld charges written differently? Or this is just regarding mulliken?
Nice! :)
Codecov Report
@@            Coverage Diff             @@
##           master     #307      +/-   ##
==========================================
+ Coverage   86.87%   86.89%   +0.02%
==========================================
  Files         269      271       +2
  Lines       38806    39095     +289
==========================================
+ Hits        33713    33973     +260
- Misses       5093     5122      +29
I agree it is much more difficult. But I think it would still fit here. So as of now, you can just stick with Hirshfeld and voronoi, and then we'll add mulliken later.
Agreed, but then errors should be raised, i.e. if they are requested but do not exist. In any case, H+V are only written at the end, while Mulliken can be written during SCF.
Hehe, yeah NC changes everything ;)
Do note that it jumps from the current position. So calling it repeatedly will still work.
You'd have to include mulliken-only arguments in the method though, wouldn't you?
Hirshfeld and Voronoi can also be written during scf with
Ok, I will try to add arguments to ask for specific charges. What I find most difficult is to distinguish between:
Maybe it would be useful to actually know what the input of the user was. Would it be reasonable to have a method to get the input info from the output file (as written at the top of the file)?
Ok, I will check that.
I.e.
ret = PropertyDict()
ret.mulliken = PropertyDict()
ret.mulliken.orbital = ..
ret.mulliken.atom = ..sum of orbital...
ret.hirshfeld = ...
ret.voronoi = ...
Hmm... Good question... In principle the Hirshfeld and Voronoi charges could also be done for orbitals. But it isn't implemented...
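A minimal sketch of the nested layout proposed above, for illustration only. The `PropertyDict` class here is a hypothetical stand-in (a dict with attribute access), `build_charges` is a made-up helper, and the orbital charges and orbitals-per-atom counts are invented numbers. None of this is the parser's actual code.

```python
import numpy as np


class PropertyDict(dict):
    """Hypothetical stand-in: a dict whose keys are also attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


def build_charges(orbital_q, orbitals_per_atom):
    """Pack orbital-resolved Mulliken charges and their per-atom sums."""
    orbital_q = np.asarray(orbital_q, dtype=float)
    # First orbital index of every atom, e.g. counts [2, 3] -> [0, 2]
    first = np.concatenate(([0], np.cumsum(orbitals_per_atom)[:-1]))
    ret = PropertyDict()
    ret.mulliken = PropertyDict()
    ret.mulliken.orbital = orbital_q
    # Atom-resolved charge is the sum over that atom's orbitals
    ret.mulliken.atom = np.add.reduceat(orbital_q, first)
    return ret


# Two atoms with 2 and 3 orbitals respectively (invented values)
charges = build_charges([0.5, 1.5, 1.0, 2.0, 1.0], [2, 3])
print(charges.mulliken.atom)
```

Keeping `atom` as a plain sum over `orbital` means the two views can never disagree, and `hirshfeld` and `voronoi` entries could sit next to `mulliken` in the same container.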
Yeah, true.
I think the shapes of the returned arrays should adhere to what the user asks for. If scf is requested, then the first dimension should be the SCF iteration. Much like the
No, this is opening up a can of worms. Some users pipe in stuff, some put the fdf on the input line.
Knowing this, maybe it could be interesting to modify SIESTA itself to ensure that all inputs that define a run are logged in the output? 🤔 I see that SIESTA issues the following message:
But I don't have an out.fdf file. Anyway, couldn't this all be written in the output?
Ok, current status:
I hope I made it clear enough in the docstring how the method behaves. Can you check it and let me know what you think? Thanks!
Hmm. You can open an issue, or talk to Alberto about this... I am not so sure about this...
Hmm, that should probably be deleted. :)
I've fixed an error reading the scf charges (I was reading the charges after the Also, I now read just the first MD step at the beginning to understand when (if) charges are printed, and then proceed to read the whole file from the beginning again. In this way I can distinguish between scf-wise and MD-step-wise charge writing, and also detect if both are turned on. To find the separation between each MD step, I use
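As a rough illustration of the "peek at the first MD step, then rewind" idea, here is a small sketch. The function name and the marker strings are placeholders invented for this example; they are not the separators used in the actual parser or the text SIESTA prints.

```python
import io


def count_in_first_step(fh, step_marker, charge_marker):
    """Count charge blocks in the first MD step, then rewind the file.

    `step_marker` and `charge_marker` are hypothetical substrings that
    identify the start of an MD step and of a charge block.
    """
    n_steps = 0
    n_charges = 0
    for line in fh:
        if step_marker in line:
            n_steps += 1
            if n_steps == 2:  # the second MD step starts: stop peeking
                break
        elif charge_marker in line and n_steps == 1:
            n_charges += 1
    fh.seek(0)  # the full read starts again from the beginning
    return n_charges


# Synthetic log with placeholder markers, not real SIESTA output
log = io.StringIO(
    "<md step>\n<scf>\n<charges>\n<scf>\n<charges>\n<charges>\n"
    "<md step>\n<scf>\n<charges>\n"
)
n = count_in_first_step(log, "<md step>", "<charges>")
print(n)  # -> 3
```

How that count maps onto scf-wise versus MD-step-wise writing depends on the real log layout, so it is deliberately left out here.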
Did you prepare some out files containing both the old and new format? If so, could you mail them to me?
PS: working on some other edits, please hold ;)
Here's a bunch of different outputs I was using for testing. The ones that end with "4.1" are the old format.
Also added tests. To accommodate details of the input, we now have an enum that contains *options*. Opt.ANY can thus be used as arguments. This needs to be put in read_scf as well.
@pfebrer could you please have a look at these changes? Basically, I re-wrote everything. The solution is that there now exists an "option" enum object which holds Opt.ANY, Opt.ALL and Opt.NONE. They can be combined bit-wise, which allows one to distinguish between different quantities. I also fixed the dataframe handling and added the tests you provided. Is this fine for you?
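To make the idea of bit-wise combinable options concrete, here is a small sketch built on Python's `enum.Flag`. The member names come from the comment above; the `select` helper and the semantics attached to each flag are assumptions made for this example and may differ from what the PR implements.

```python
from enum import Flag, auto


class Opt(Flag):
    """Option flags; members can be combined with `|`."""
    NONE = auto()
    ANY = auto()
    ALL = auto()


def select(blocks, opt, name="charges"):
    """Pick parsed data according to `opt` (hypothetical helper).

    Assumed semantics, for this sketch only:
      Opt.NONE -> the quantity is not requested, return None
      Opt.ALL  -> the quantity must be present, otherwise raise
      Opt.ANY  -> return whatever is present, possibly an empty list
    """
    if Opt.NONE in opt:
        return None
    if Opt.ALL in opt and not blocks:
        raise ValueError(f"{name} requested but not found in the output")
    return list(blocks)


print(select([0.1, -0.1], Opt.ANY))       # -> [0.1, -0.1]
print(Opt.ANY in (Opt.ANY | Opt.ALL))     # -> True
```

Raising only when a quantity was explicitly required follows the earlier remark in this thread that errors should be raised when charges are requested but do not exist.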
Have you tested the dataframes? I can't use it, I get the error:
Invalid to pass a non-int64 dtype to RangeIndex
sisl/io/siesta/out.py
# first line is the header
header = (self.readline()
          .replace("Qatom", "q") # Qatom in 4.1, dQatom in master
Does this result in "q" for 4.1 and "dq" for master? I thought it was better to unify them with the same names. Otherwise sisl code needs to be different depending on the version of SIESTA that generated the log :(
True, I missed this, will fix this.
The point was that that column refers to dq, so I want the column to be named dq, and the population to be named e.
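One way to get the same column names regardless of the SIESTA version is sketched below. Only the labels "Qatom" (4.1) and "dQatom" (master) come from the code comment in this review; the full header strings, the "Atom pop" label and the `normalize_header` name are assumptions for illustration.

```python
import re


def normalize_header(line):
    """Return unified column names for a charge-table header line."""
    # Both "Qatom" (4.1) and "dQatom" (master) become "dq"
    line = re.sub(r"\bd?Qatom\b", "dq", line)
    # Assumed population label, renamed to "e"
    line = line.replace("Atom pop", "e")
    # Assumed index column label, dropped because it becomes the index
    line = line.replace("Atom #", "")
    return line.split()


# Illustrative header lines, not copied from real output files
print(normalize_header("Atom #    Qatom  Species"))
# -> ['dq', 'Species']
print(normalize_header("Atom #     dQatom  Atom pop         S  Species"))
# -> ['dq', 'e', 'S', 'Species']
```

Matching the optional leading "d" in a single pattern avoids the situation raised above, where chained `replace` calls produce "q" for one version and "dq" for the other.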
Which pandas version are you using? Yes, I did test it ;)
Could you have a go now? Thanks!
Ok, I realised I may have a too old pandas (0.25.3), although it's been only about a year and a half since it was released 😅 Now it works even with my pandas version. One thing that I miss with respect to how it was before is that
Don't you think this makes sense?
While that may in some cases be the case, I don't think it is a good idea. The charges could potentially change between the SCF and MD charges due to some mixing schemes etc.
How can I ask for the final charges now then?
Yeah, I probably need to go through the documentation again.
Could you have a look at the documentation now? I think it is ready?
Great, I also think this is good to go! |
Instead of generating Voronoi charges as in my question 2 days ago, I decided to let SIESTA do it. I then implemented a method to parse the charges from SIESTA's output. Is it useful enough to go in?
If you agree this could be useful, we can discuss the API. Right now it doesn't let the user decide which charges they want; it returns different things depending on how often charges have been written. In this zip you can find different output files of a 3-step MD of graphene, and below you can see how the method behaves in the different situations.
Charges at the end of the calculation
Charges at every MD step
Charges at every SCF step
Cheers!