WIP: Towards Type Stability #101

cortner · 2024-05-18T18:32:26Z

This PR is to address a number of itches I have with the current implementation of the interface and its implementation. ~~So far the goal is to keep it entirely backward compatible. I think this may be possible.~~ I believe I am still very close to backward compatible in the interface, though no longer in the internals.

The plan is to address the following issues:

Introduce Cell types #99
Boundary Conditions #97
Chemical Element #49
Organizing boundary conditions #5
some discussions elsewhere about having a type-stable and isbits atoms type.
an ad-hoc decision to make FlexibleSystem mutable since it is supposed to be flexible? This will be good for AtomsBuilder
Avoid use of getters with symbols #102

This is now ready for comments.

Bugs and TODOs

make all tests pass
write tests to show equivalent when using FastAtom
write tests that show performance when using FastAtom
cannot reinterpret ChemicalElement as UInt8
I would like to delete or deprecate DirichletZero
I would like to replace AtomView with FastAtom for fast_system[i]. The latter is "isbits" and hence performant. Also much much easier to use and implement, basically working like an SVector.

cortner · 2024-05-18T18:59:22Z

Recording changes

Introduce PCell type to encode a computational cell with pbc flags
Introduce SystemWithCell{D, TCELL} to help with dispatch
Adapt FlexibleSystem and FastSystem to accept PCell instead of bounding_box and boundary_conditions
Introduce FastAtom to be compatible with FastSystem
Made FlexibleSystem mutable.
make Bounding box and bc tuples
introduce OpenBC to replace DirichletZero

More details:

The P in PCell can stand for periodic or for "parallelepiped" (cc Rachel). I don't care really. But I didn't want to call it Cell. If somebody has a better name I will be happy to change it.
bounding box and bc should be tuples instead of vectors since (i) inferrable length and (ii) there should be no arithmetic performed on them.
The behaviour of isinfinite has changed, it now returns a D-tuple since the system can be infinite in some dimension but not others.

cortner · 2024-05-18T20:18:32Z

Question for the community: If I define a system that has open bc in some or all directions (i.e. is infinite in those directions) then the cell-vector pointing in that direction technically doesn't really specify a bounding box anymore. Should the call to bounding box in that case compute the actual extent of the system as specified by the positions of the particles?

My intuition is that this is something that the neighbourlist should do and not the system. I would consider returning the cell vector or NaN or Inf in those cases. But I do not have a strong view yet.

cortner · 2024-05-19T04:07:09Z

@mfherbst Do you mind explaining this test to me:

@testset "Chemical formula with system" begin
    lattice     = [12u"bohr" * rand(3) for _ in 1:3]
    atoms       = [Atom(6, randn(3)u"Å"; atomic_symbol=:C1),
                   Atom(6, randn(3)u"Å"; atomic_symbol=:C2),
                   Atom(1, randn(3)u"Å"; atomic_symbol=:D),
                   Atom(1, randn(3)u"Å"; atomic_symbol=:D),
                   Atom(1, randn(3)u"Å"; atomic_symbol=:D),
                  ]
    system      = periodic_system(atoms, lattice)
    @test atomic_symbol(system) == [:C1, :C2, :D, :D, :D]
    @test element_symbol(system) == [:C, :C, :H, :H, :H]
    @test chemical_formula(system) == "C₂H₃"
end

I take it the point is to test use of isotopes. I thought we had agreed this needed to be done differently. But I may well misremember that. This seems to be the only thing that prevents me from completing this PR.

Should there be another particle type Isotope?

struct Isotope
   atomic_number::UInt8
   nneutron::Int
end

How many people in this community care about isotopes?

EDIT: I think there are two alternative solutions:

one would be to simply add the nneutron field to ChemicalElement
the second would be to maintain a list of isotopes in chem.jl and give those isotopes ids starting with 1000. E.g. 1001 => D. Then the chemical element type gets modified to become

struct ChemicalElement
    id::Int 
end

with id no longer specifying atomic number but an index into the list of possible atom types. This isn't without danger. One would need to maintain permanent backward compatibility of ids I think and I probably prefer the isotope .

cortner · 2024-05-19T04:22:51Z

My last question for now: Are the printing tests really needed? Why would we expect the ASCII output to remain unchanged when we make changes to the code?

tjjarvinen · 2024-05-20T21:35:57Z

I would add isotopes to ChemicalElement.

Other issue that is related to isotopes is that in some cases you want to flag atoms to be different. E.g. you could say that these atoms are can be transformed to each other with symmetry operations and you then flag them with somehow. Another option is to flag atoms based on different chemical environment.

One way to add this would add extra field to chemical element. Albeit it could be moved to somewhere else too.

stuct ChemicalElement
   Z::UInt16
   N::UInt16 
   extra::UInt32
end

Note, I prefer using number of nucleons over number of neutrons, as then number 0 could be interpret, as field ignored. While with number of neutrons this is not possible.

The total bit size of the structure should be 64, 32, 16 or 128. For 64-bit system the structure is machine level will always be multiple of 64, so that size would be good target.

cortner · 2024-05-20T22:11:43Z

I am not too keen to make ChemicalElement too rich - that goes against the idea of a simple prototype implementation. We could have an extended chemical element type. Anyway, I'm happy to go with whatever the majority thinks is useful.

I'm not too sure the size matters too much at this stage, since it will normally be paired with positions, velocities, mass, etc. ... but it's an interesting suggestion for performance tweaks down the road?

tjjarvinen · 2024-05-20T22:27:54Z

Cache line is 64 bytes. So, 3*Float64 for positions, mass is Float64, velosity is another 3*Float64. If on top of this chemical element is 64-bits then it totals 64 bytes. Meaning that it would very efficient to move in memory. I would definitely go for this.

We could simply implement

 stuct ChemicalElement
   Z::UInt64
end

stuct ChemicalElementExt
   Z::UInt16
   N::UInt16 
   extra::UInt32
end

With these you can reinterpret the simple ChemicalElement to ChemicalElementExt that has option to have the extra features. Thus we have the default option to go for the simple ChemicalElement. While still have the option to define isotopes and additional features.

cortner · 2024-05-20T22:30:41Z

I understood the idea. I think it is interesting and I'm happy to be convinced that this is the way to go if enough people agree ... I doubt though that this will be the bottleneck for many interesting applications. Ie anything beyond pair interactions...

mfherbst · 2024-06-26T08:09:58Z

I take it the point is to test use of isotopes.

Yes and the fact that some input files actually do not store atom types not just based on the chemical symbols, but add numbers in the end, e.g. in cif files you frequently have C1, C2 etc. instead of just C, which not necessarily means that the carbons are different isotopes. This needs to be somehow dealt with (e.g. by separating element from atomic symbol).

Other issue that is related to isotopes is that in some cases you want to flag atoms to be different. E.g. you could say that these atoms are can be transformed to each other with symmetry operations and you then flag them with somehow. Another option is to flag atoms based on different chemical environment.

Yes that is exactly the case that we need to cover. The isotopes are not so relevant indeed.

In that sense the most simple implementation would be perhaps

struct c
   Z::UInt16
   N::UInt16
   id::UInt32
end

or even without the N property ?

My idea here is that the id is used to render otherwise identical objects different. What distinguish them is sort of out of scope at this simplest interface and is up to the implementation (e.g. by having a Dict{ChemicalElement,Any} to store additional information ?)

Are the printing tests really needed?

Not necessarily to test the precise display from my point of view but to ensure the code runs and does not crash. Crashing code on printing is super annoying and should not happen.

cortner · 2024-06-27T07:08:51Z

In view of discussions in other channels I think this PR will likely be closed unmerged and some of the points raised here brought back after the interface update.

cortner · 2024-08-13T04:01:11Z

closing since #107 is now at about the same stage as this old PR

steps towards type stability

1e515b0

Christoph Ortner added 2 commits May 18, 2024 13:33

cleanup, import without errors

9ccd41f

test/interface passes

2dbdea2

cortner mentioned this pull request May 19, 2024

Avoid use of getters with symbols #102

Closed

Christoph Ortner added 2 commits May 18, 2024 21:36

most tests pass now

6123d35

revived some tests

10dce3c

cortner closed this Aug 13, 2024

mfherbst deleted the co/atom branch October 23, 2024 17:51

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WIP: Towards Type Stability #101

WIP: Towards Type Stability #101

cortner commented May 18, 2024 •

edited

Loading

cortner commented May 18, 2024 •

edited

Loading

cortner commented May 18, 2024 •

edited

Loading

cortner commented May 19, 2024 •

edited

Loading

cortner commented May 19, 2024

tjjarvinen commented May 20, 2024 •

edited

Loading

cortner commented May 20, 2024 •

edited

Loading

tjjarvinen commented May 20, 2024 •

edited

Loading

cortner commented May 20, 2024 •

edited

Loading

mfherbst commented Jun 26, 2024

cortner commented Jun 27, 2024

cortner commented Aug 13, 2024

WIP: Towards Type Stability #101

WIP: Towards Type Stability #101

Conversation

cortner commented May 18, 2024 • edited Loading

Bugs and TODOs

cortner commented May 18, 2024 • edited Loading

Recording changes

cortner commented May 18, 2024 • edited Loading

cortner commented May 19, 2024 • edited Loading

cortner commented May 19, 2024

tjjarvinen commented May 20, 2024 • edited Loading

cortner commented May 20, 2024 • edited Loading

tjjarvinen commented May 20, 2024 • edited Loading

cortner commented May 20, 2024 • edited Loading

mfherbst commented Jun 26, 2024

cortner commented Jun 27, 2024

cortner commented Aug 13, 2024

cortner commented May 18, 2024 •

edited

Loading

cortner commented May 18, 2024 •

edited

Loading

cortner commented May 18, 2024 •

edited

Loading

cortner commented May 19, 2024 •

edited

Loading

tjjarvinen commented May 20, 2024 •

edited

Loading

cortner commented May 20, 2024 •

edited

Loading

tjjarvinen commented May 20, 2024 •

edited

Loading

cortner commented May 20, 2024 •

edited

Loading