Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement stresses without unfold_bz #511

Merged
merged 9 commits into from
Jul 30, 2021

Conversation

jaemolihm
Copy link
Contributor

I implemented calculating stresses without unfolding the BZ as discussed in #506.

I also increased kgrid for the stresses test from [2, 2, 1] to [3, 3, 1] because with the [2, 2, 1] grid, the stresses were same with and without symmetrization. This slightly increased the runtime of test/stresses.jl (from 31 sec to 36 sec).

closes #506

Copy link
Member

@mfherbst mfherbst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much! I have a small remark.

src/symmetry.jl Outdated
Comment on lines 318 to 319
# symmetrize a rank-2 tensor in reduced coordinates
function symmetrize_tensor(tensor, symmetries, lattice)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this file we generally follow the convention (system arguments, data arguments), i.e. it should be lattice, symmetries, tensor.

Also can you add ::Matrix or even better ::Mat3 to indicate this only works for matrices instead of general tensors.
In fact ::Mat3 would even allow to rename this function to symmetrize (same as the symmetrize we use for the densities) and add another method symmetrize(model, tensor::Mat3) and/or symmetrize(model, tensor::Mat3).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not too sure about this, for two reasons. The first is that this routine is specific to stresses-like tensors, ie that transform in a specific way, being the derivative of the energy wrt the lattice, which transforms in another way (I would imagine the term is that lattices are covariant, while stresses are contravariant? I'm hesitant here, this is definitely not my zone of confort). The second is that is that it's also specific to the fact that our stresses are in reduced coordinates. So I would call it symmetrize_stresses.

Unless it's canonical that "tensors" transform in this particular way? In which call calling it symmetrize_tensor is OK by me, since things are in reduced coordinates by default (we use _cart for non-reduced)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify with "about this" you mean the renaming to symmetrize, but not the rest.

Yeah ok, I might have been a bit quick here. From the way stresses are formed I agree with your argument that they could be special (likely the only contravariant thing we deal with in the foreseeable time). I also never looked into this in detail, however.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify with "about this" you mean the renaming to symmetrize, but not the rest.

I agree this takes Mat3 as inputs, but I'm not sure we should aim for a general symmetrize function that deduces what it should do by dispatch, since we generally don't use types a lot (eg a density is just an array). And if we keep explicit names there's no reason to type arguments. I do agree about the order though

Copy link
Member

@mfherbst mfherbst Jul 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then we should rename to symmetrize_ρ or symmetrize_density?

Edit: I think we had this initially.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, real is a bit generic... I think we already use "density" quite generically to denote "density or potential" at some other places in the code. I think it's better to stick with "density" even though it's more general, unless we can come up with another name to mean "real-space quantity"?

Copy link
Member

@mfherbst mfherbst Jul 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"symmetrize_densitylike" or "symmetrize_ρlike" ??

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at that point it's simpler to call it density, no ? :-p

Copy link
Member

@mfherbst mfherbst Jul 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. symmetrize_density it is.

@jaemolihm Sorry about that bikeshedding. If you want do the renaming, otherwise I'll do it in a followup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll leave it as your followup. 😄

@antoine-levitt
Copy link
Member

Great PR, thanks a lot!

We should do the same for the forces in nonlocal.jl: get rid of the explicit loop over symmetries, and just symmetrize the forces at the end.

I would feel better if I could convince myself of why this is OK and we could write it up as a comment somewhere in symmetry.jl (the same question also comes up for the density). Essentially we want to compute sum_{k reducible} f(k) = sum_{k irreducible} sum_{S for all S mapping k to a reducible kpoint} sym(S, f(k)), and we are saying that this is equal to global_symmetrization(sum_{k reducible} weight(k) f(k)) where weight counts the number of reducible kpoints a particular reducible kpoint maps to. Is it clear why we can make this step?

@jaemolihm
Copy link
Contributor Author

A simple argument is as follows. First, in your notation, sym(S, f(k)) = f(Sk).
Then, sum_{k irreducible} sum_{S for all S mapping k to a reducible kpoint} f(Sk) = sum_{k irreducible} 1/(Number of S that maps k to k) sum_{all S} f(Sk) holds.
Finally, we use f(Sk) = apply_S(f(k)) and that apply_S is a linear operation.
Then, we get sum_{k reducible} f(k) = 1 / (Number of S) * sum_{all S} apply_S(sum_{k irreducible} kweight * f(k)) where kweight = (Number of S) /(Number of S that maps k to k).
(Here, "all S" means "all S in basis.symmetries".)

@antoine-levitt
Copy link
Member

I'm with you except on

Then, sum_{k irreducible} sum_{S for all S mapping k to a reducible kpoint} f(Sk) = sum_{k irreducible} 1/(Number of S that maps k to k) sum_{all S} f(Sk) holds.

Why is that exactly? Is this because all the S are a full group?

@jaemolihm
Copy link
Contributor Author

Yes, that's because all the S's form a group, and all the S's that map k to k form a subgroup.

@mfherbst
Copy link
Member

That makes sense. I think this should be written up a bit more formally in https://juliamolsim.github.io/DFTK.jl/stable/advanced/symmetries/, because I will definitely forget the details.

@antoine-levitt
Copy link
Member

Yeah, it'd be nice to write up this argument in that page, and refer to it from the code. This definitely has confused me before and will confuse me again in the future. @jaemolihm if you feel like it feel free to do it; otherwise I'll do it at some point

@jaemolihm
Copy link
Contributor Author

I added the doc on symmetrization. I also fixed the stresses test for MPI that the random number was not broadcasted.

@@ -42,7 +43,7 @@ include("testcases.jl")
stresses = compute_stresses(scfres)
@test isapprox(stresses, compute_stresses(scfres_nosym), atol=1e-10)

dir = randn(3, 3)
dir = MPI.bcast(randn(3, 3), 0, MPI.COMM_WORLD)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! We should probably do this in PlaneWaveBasis, make sure all parameters (including Model) are the same on all processors...

@mfherbst mfherbst merged commit 70fc3c8 into JuliaMolSim:master Jul 30, 2021
calculated by first summing over the irreducible ``k`` points and then symmetrizing.
In this subsection, we denote by ``S`` the combined rotation and fractional transformation.
If ``S`` is the symmetry of the system, ``f(Sk) = S(f(k))`` holds, where ``S(f)`` is the
symmetry transformation of ``f`` which is a linear function and depends on the quantity
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it mean for f to be linear in this context? I would just say "consider a k-dependent quantity of interest (energy, density, force...). f is assumed to transform in a particular way under the symmetry: f(S(k)) = S(f(k)) where the action of S on f depends on the particular f."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your writing is clear. What I meant was that S is a linear function.

@mfherbst
Copy link
Member

Upps ... too quickly on my end, sorry.

@antoine-levitt
Copy link
Member

No worries, I'll bikeshed the docs on my end and propose the changes in another PR

@mfherbst
Copy link
Member

Can you also do the rename?

@antoine-levitt
Copy link
Member

antoine-levitt commented Jul 30, 2021

Sure.

@jaemolihm I think there was a bug in your equations, I corrected to

$$\begin{aligned} \sum_{k\ \mathrm{reducible}} f(k) &= \sum_{k\ \mathrm{irreducible}} \sum_{S\text{ mapping $k$ to a reducible point}} S(f(k)) \\\ &= \sum_{k\ \mathrm{irreducible}} \frac{1}{N_{S,k}} \sum_{S \in \mathcal S} S(f(k)) \\\ &= \frac 1 {N_S} \sum_{S \in \mathcal S} \left(\sum_{k\ \mathrm{irreducible}} \frac{N_S}{N_{S,k}} f(k) \right) \end{aligned}$$

Does this look OK? Also I'm still not sure how you go from the second to the third line. Wikipeding a bit gives me https://en.wikipedia.org/wiki/Lagrange%27s_theorem_(group_theory) which looks related, but I'm still not sure...

@antoine-levitt
Copy link
Member

Ooops... I pushed to master... I'll make sure tests pass locally and make another PR...

@jaemolihm jaemolihm deleted the stresses_sym branch July 31, 2021 07:58
@jaemolihm
Copy link
Contributor Author

jaemolihm commented Jul 31, 2021

Yes, the mathematics is related to Lagrange theorem. The "S\text{ mapping $k$ to a reducible point}" is mathematically translated to summing over different cosets of the little group in the group of S (and taking any one element from each coset), because Sk=S'k if S and S' belongs to the same coset. All cosets have N_{S,k} elements. Thus, one can change the sum into the sum over all elements in all cosets, divided by N_{S,k}.

Thank you for fixing the bug. In your equations, I think an "S" between \sum_S and \left( is missing in the last line.

@jaemolihm
Copy link
Contributor Author

jaemolihm commented Jul 31, 2021

The last line of the symmetrization docs, that N_S / N_S,k is inversely proportional to the number of reducible k corresponding to this irreducible k, is the result of the Lagrange theorem.

@antoine-levitt
Copy link
Member

@jaemolihm I'm revisiting this (to compute forces using symmetrization) and I still don't understand it. I've tried to check that the symmetries returned by spglib are actually a group, but it doesn't appear to be the case. Below a simple attempt at making the group structure explicit:

# Represents a symmetry (S,τ)
struct SymOp
    S::Mat3{Int}
    τ::Vec3{Float64}
    function SymOp(S, τ)
        τ = τ .- floor.(τ)
        @assert all(0 .≤ τ .< 1)
        new(S, τ)
    end
    SymOp(Sτ::Tuple) = SymOp(Sτ...) # compatibility with old stuff, doesn't hurt
end

Base.:(==)(op1::SymOp, op2::SymOp) = op1.S == op2.S && isapprox(op1.τ, op2.τ; atol=1e-8)
Base.one(::Type{SymOp}) = SymOp(Mat3{Int}(I), Vec3(zeros(3)))

function Base.:*(op1, op2)
    #(U u)(x) = u(S^-1(x-τ))
    #(U1 U2 u)(x) = (U2u)(S1^-1(x-τ1))
    #(U1 U2 u)(x) = u(S2^-1((S1^-1(x-τ1))-τ2))
    #(U1 U2 u)(x) = u(S2^-1(S1^-1(x-τ1-S1 τ2)))
    S = op1.S * op2.S
    τ = op1.τ + op1.S * op2.τ
    SymOp(S, τ)
end

import Base.inv
# inverse of y=S^-1(x-τ) is x=Sy+τ=T^-1(x-t) with T = S^-1, t=-Tτ
Base.inv(op) = SymOp(inv(op.S), -op.S\op.τ)

function check_group(symops::Vector)
    for s in symops
        inv(s)  symops || error("check_group: symop $s with inverse $(inv(s)) is not in the group")
        end
        for s2 in symops
            (s*s2  symops && s2*s  symops) || error("check_group: product is not stable")
        end
    end
    symops
end

This fails, just because in the symmetries of silicon

SymOp([1 0 0; 0 1 0; 0 0 1], [0.0, 0.0, 0.0])
SymOp([1 0 -1; 1 0 0; 1 -1 0], [0.0, 0.5, 0.0])
SymOp([0 1 -1; 1 0 -1; 0 0 -1], [0.0, 0.0, 0.5])
SymOp([0 1 0; 0 1 -1; -1 1 0], [0.5, 0.0, 0.0])
SymOp([-1 0 0; -1 0 1; -1 1 0], [0.5, 0.0, 0.0])
SymOp([0 -1 0; -1 0 0; 0 0 -1], [0.0, 0.0, 0.0])
SymOp([0 -1 1; 0 -1 0; 1 -1 0], [0.0, 0.5, 0.0])
SymOp([-1 0 1; 0 -1 1; 0 0 1], [0.0, 0.0, 0.5])
SymOp([0 1 0; 0 0 1; 1 0 0], [0.0, 0.0, 0.0])
SymOp([-1 1 0; 0 1 0; 0 1 -1], [0.0, 0.5, 0.0])
SymOp([-1 0 1; -1 1 0; -1 0 0], [0.0, 0.0, 0.5])
SymOp([0 0 1; -1 0 1; 0 -1 1], [0.5, 0.0, 0.0])
SymOp([0 -1 0; 1 -1 0; 0 -1 1], [0.5, 0.0, 0.0])
SymOp([0 0 -1; 0 -1 0; -1 0 0], [0.0, 0.0, 0.0])
SymOp([1 0 -1; 0 0 -1; 0 1 -1], [0.0, 0.5, 0.0])
SymOp([1 -1 0; 1 0 -1; 1 0 0], [0.0, 0.0, 0.5])
SymOp([0 0 1; 1 0 0; 0 1 0], [0.0, 0.0, 0.0])
SymOp([0 -1 1; 0 0 1; -1 0 1], [0.0, 0.5, 0.0])
SymOp([1 -1 0; 0 -1 1; 0 -1 0], [0.0, 0.0, 0.5])
SymOp([1 0 0; 1 -1 0; 1 0 -1], [0.5, 0.0, 0.0])
SymOp([0 0 -1; 0 1 -1; 1 0 -1], [0.5, 0.0, 0.0])
SymOp([-1 0 0; 0 0 -1; 0 -1 0], [0.0, 0.0, 0.0])
SymOp([-1 1 0; -1 0 0; -1 0 1], [0.0, 0.5, 0.0])
SymOp([0 1 -1; -1 1 0; 0 1 0], [0.0, 0.0, 0.5])
SymOp([-1 0 0; 0 -1 0; 0 0 -1], [0.0, 0.0, 0.0])
SymOp([-1 0 1; -1 0 0; -1 1 0], [0.0, 0.5, 0.0])
SymOp([0 -1 1; -1 0 1; 0 0 1], [0.0, 0.0, 0.5])
SymOp([0 -1 0; 0 -1 1; 1 -1 0], [0.5, 0.0, 0.0])
SymOp([1 0 0; 1 0 -1; 1 -1 0], [0.5, 0.0, 0.0])
SymOp([0 1 0; 1 0 0; 0 0 1], [0.0, 0.0, 0.0])
SymOp([0 1 -1; 0 1 0; -1 1 0], [0.0, 0.5, 0.0])
SymOp([1 0 -1; 0 1 -1; 0 0 -1], [0.0, 0.0, 0.5])
SymOp([0 -1 0; 0 0 -1; -1 0 0], [0.0, 0.0, 0.0])
SymOp([1 -1 0; 0 -1 0; 0 -1 1], [0.0, 0.5, 0.0])
SymOp([1 0 -1; 1 -1 0; 1 0 0], [0.0, 0.0, 0.5])
SymOp([0 0 -1; 1 0 -1; 0 1 -1], [0.5, 0.0, 0.0])
SymOp([0 1 0; -1 1 0; 0 1 -1], [0.5, 0.0, 0.0])
SymOp([0 0 1; 0 1 0; 1 0 0], [0.0, 0.0, 0.0])
SymOp([-1 0 1; 0 0 1; 0 -1 1], [0.0, 0.5, 0.0])
SymOp([-1 1 0; -1 0 1; -1 0 0], [0.0, 0.0, 0.5])
SymOp([0 0 -1; -1 0 0; 0 -1 0], [0.0, 0.0, 0.0])
SymOp([0 1 -1; 0 0 -1; 1 0 -1], [0.0, 0.5, 0.0])
SymOp([-1 1 0; 0 1 -1; 0 1 0], [0.0, 0.0, 0.5])
SymOp([-1 0 0; -1 1 0; -1 0 1], [0.5, 0.0, 0.0])
SymOp([0 0 1; 0 -1 1; -1 0 1], [0.5, 0.0, 0.0])
SymOp([1 0 0; 0 0 1; 0 1 0], [0.0, 0.0, 0.0])
SymOp([1 -1 0; 1 0 0; 1 0 -1], [0.0, 0.5, 0.0])
SymOp([0 -1 1; 1 -1 0; 0 -1 0], [0.0, 0.0, 0.5])

if you take the second one SymOp([1 0 -1; 1 0 0; 1 -1 0], [0.0, 0.5, 0.0]) and compute its inverse, you get SymOp([0 1 0; 0 1 -1; -1 1 0], [0.5, 0.5, 0.5]) which is not in the symmetries (not even its S). I don't see what could be wrong with this (problems with symmetries in cartesian vs reduced or tilde vs non tilde shouldn't matter here). Did I miss something?

@jaemolihm
Copy link
Contributor Author

jaemolihm commented Jan 31, 2022

Hi @antoine-levitt ,

First, S = [0 1 0; 0 1 -1; -1 1 0] is there on the fourth row, as it should be.

Second, I do not know how you generated the list, but did you took care of the Spglib convention that the translation is taken after the rotation? http://spglib.github.io/spglib/definition.html#symmetry-operation-boldsymbol-w-boldsymbol-w

Third, the 48 SymOps by themselves do not form a group under the simple equivalence relation. The actual group includes translation symmetry by the lattice vector. So, one should use isapprox(mod(op1.τ - op2.τ, 1); atol=1e-8) (I'm not sure this is the correct syntax) to define ==. Mathematically, the SymOps should be understood to represent cosets {(S, tau + R) | R = (n1, n2, n3)}.
But still there is no SymOp equivalent to [0.5, 0.5, 0.5] mod 1, so I guess the convention problem in my second point is there.

@antoine-levitt
Copy link
Member

Thanks for the quick response!

First, S = [0 1 0; 0 1 -1; -1 1 0] is there on the fourth row, as it should be.

OK, I'm just blind :-) Thanks, OK so that just leaves out the tau part to figure out.

Second, I do not know how you generated the list, but did you took care of the Spglib convention that the translation is taken after the rotation? http://spglib.github.io/spglib/definition.html#symmetry-operation-boldsymbol-w-boldsymbol-w

I based the derivation on the docs we have in DFTK in symmetry.jl, which is that a symmetry acts on a function u as (Uu)(x) = u(S^-1(x-τ)). We use this a bunch in the code, so I hoped this would be debugged and tested by now, but I'll check again.

Third, the 48 SymOps by themselves do not form a group. The actual group includes translation symmetry by the lattice vector. So, one should use isapprox(mod(op1.τ - op2.τ, 1); atol=1e-8) (I'm not sure this is correct syntax) to define ==.

This is taken into account already by the constructor to SymOp which normalizes the tau

@jaemolihm
Copy link
Contributor Author

jaemolihm commented Jan 31, 2022

Oh, I see. Maybe the issue with tau is that it is displacements in cartesian coordinates, so the lattice vector is [0.0, 0.5, 0.5], not an integer-valued one.

@antoine-levitt
Copy link
Member

Both spglib and DFTK are in reduced coordinates. This is consistent with what we already have, eg https://github.com/JuliaMolSim/DFTK.jl/blob/master/src/external/spglib.jl#L107

@antoine-levitt
Copy link
Member

OK, so it appears (by reverse engineering...) that the right formulas for the product are τ = op1.τ + op1.S' \ op2.τ and for the inverse -op.S'*op.τ. This has to do with the fact that we work in reciprocal space (see defs at https://docs.dftk.org/dev/advanced/symmetries/: our Stilde is spglib's W, our tautilde is spglib's w), I'll write up the details in the docs when I understand it properly...

God I hate symmetries.

@mfherbst
Copy link
Member

I think I worked this out at some point, see this file:

https://github.com/JuliaMolSim/DFTK.jl/blob/177c43afc51d06ef9995070ce7f953328bd50c30/docs/latex/brillouin_zone_symmetry/symmetry.tex

got removed ages ago, so might be wrong, but might also be helpful.

@mfherbst
Copy link
Member

ah no, sorry it does not contain the products, I misremembered.

@antoine-levitt
Copy link
Member

Yes that essentially got merged into the docs and symmetry.jl. Our notations S and tau and Stilde and tautilde are super bad. If we want to keep S and tau, can we use W and w (like spglib) for Stilde and tautilde? Is there any standard name for the S and tau?

@antoine-levitt
Copy link
Member

antoine-levitt commented Jan 31, 2022

The main thing that was confusing me is that we have in the docs the equation
# (Uu)(x) = u(S^-1(x-τ))
which is valid in cartesian coordinates (where S^-1 = S') but not in reduced; the proper equation is the one in https://docs.dftk.org/dev/advanced/symmetries/, (Uu)(x) = u(Stilde x + tautilde). It's misleading to have this equation anyway (S is a reciprocal space transformation, it has no business with doing S^-1 x), so I'll just change it with the proper one once we decide on better names for these.

@antoine-levitt antoine-levitt mentioned this pull request Feb 2, 2022
@antoine-levitt
Copy link
Member

Hi @jaemolihm, sorry to bother you again but I'm trying to implement symmetrization of the forces now and I don't understand the symmetrize_stresses routine you wrote. 1) since both the stress tensor and the symmetry operation are in reduced coordinates, why does the lattice even appear? 2) why do you use S (which is in reciprocal space) rather than W (which is in real space)? Did you copy the formula from somewhere or did you derive it yourself?

@jaemolihm
Copy link
Contributor Author

Hi @antoine-levitt , no worries, and thank you for looking this symmetry stuff in detail.
In fact, I found there are several problems with my implementation. I took the formula from Quantum ESPRESSO paper https://arxiv.org/abs/0906.2569 (Eq. (A.28)), but I got confused with the notations.

Regarding the appearance of lattice, I think there is an issue regarding the definition of stress tensor.

I opened a PR: #593

@antoine-levitt
Copy link
Member

Thank you! That paper is nice, in particular there's a good and concise description of the symmetrization for forces, stresses and phonon properties, might be useful for later. CC @epolack

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

stresses without unfold_bz
3 participants