-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix issue #100: get_val with HAVE_MPI #112
Conversation
The gentlest of bumps @stevengj: thoughts on this? |
mpb/fields.c
Outdated
if (local_iy >= 0 && local_iy < local_ny) { | ||
val = data[(((local_iy * nx) + ix) * nz + iz) * stride]; /* note transposition in x and y indices */ | ||
} | ||
mpi_allreduce_1(&val, real, SCALAR_MPI_TYPE, MPI_SUM, mpb_comm); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could actually call this in interp_val
on the result of the bilinear interpolation (this is the only function that calls get_val
), rather than in get_val
, to save a factor of 8 in the number of allreduce
calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, nice. Just to check my understanding, you mean we could instead reduce over the operations here,
Lines 628 to 631 in 0d886cf
return(((D(x,y,z)*(1.0-dx) + D(x2,y,z)*dx) * (1.0-dy) + | |
(D(x,y2,z)*(1.0-dx) + D(x2,y2,z)*dx) * dy) * (1.0-dz) + | |
((D(x,y,z2)*(1.0-dx) + D(x2,y,z2)*dx) * (1.0-dy) + | |
(D(x,y2,z2)*(1.0-dx) + D(x2,y2,z2)*dx) * dy) * dz); |
and then return the reduced value from
interp_val
? I.e. get_val
would then be returning a process-specific value when run with MPI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Do the reduce on the interpolated value.
This gives the same result because the interpolation is a linear operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, thanks! (my uncertainty was just whether MPI allows a function to return a process-specific value or if it has to be reduced before returning; makes sense that it doesn't need that)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Functions can do whatever they want. MPI is just a library providing subroutines for a bunch of running processes to communicate with one another — each process is an independent program that can do independent computations.
The implementation looks correct to me. Would be nice to test a case with mpbi-mpi just to be sure (especially in 2d vs 3d), but on a quick skim I think you have that right. Nice job! |
I implemented the suggestion about moving the I also compiled and checked the 2D example: (cylinders on a square lattice; .ctl file below):
3D example: (spheres on in a cubic lattice; .ctl file from original post)
The agreement seems to be pretty good across the board, so I reckon the combination of 2D-example .ctl file(set! resolution 64)
(set! num-bands 3)
(set! output-epsilon (lambda () (print "skipping output-epsilon\n")))
; define a simple geometry
(set! geometry-lattice (make lattice (size 1 1 no-size)
(basis1 1 0) (basis2 0 1)))
(set! geometry (list (make cylinder
(center 0 0 0) (radius 0.25) (height infinity)
(material (make dielectric (epsilon 13) )))))
; set a k-point
(set! k-points (list (vector3 0.3532 0.1512 0)))
(run) ; run the calculation
; compute some interpolating values at an arbitrary real-space point
(define r (vector3 0.13 0.12 0))
(define band-idx 3)
(print "\n\n")
(get-hfield band-idx)
(print "|H|: " (vector3-norm (get-field-point r)) "\n")
(compute-field-energy)
(print "H-energy: " (get-energy-point r) "\n")
(get-dfield band-idx)
(print "|D|: " (vector3-norm (get-field-point r)) "\n")
(compute-field-energy)
(print "D-energy: " (get-energy-point r) "\n") |
Seems CI failed on building libctl: not sure why? Should be unrelated to the latest changes and this PR? |
CI is good now after the libctl fix. |
Thanks! |
* WIP: Initial commit for calculating symmetry transformed overlaps between Bloch states of H * Symmetry operations act on the full wave, not just the Bloch part; fix that * Minor comment improvements and an error message for detW ≠ 1 * generalize so that nontrivial mu can be used, and also allow passing in the d-field instead of the b-field * swap a few cnumbers for scalar_complex to hopefully avoid copying things unnecessarily * Fix bug in vector transformation and account for the fact that vector fields are in a Cartesian basis * Add band-index function interfaces to transformed_overlap without invoking get_bfield/get_dfield and add a clearer method description: - compute_symmetry (band function) - compute_symmetries (all bands) Also minor revisions to comments and micro-refactoring. * Fix erroneous operation of {W|w}^-1 on coordinate p - The translation component of {W|w}^-1 contains a factor of W^-1 - Previously, nonsymmorphic operations were consequently wrongly implemented * Remove MPI guards/warnings (#112 has been merged) * throw when used with mpbi - also add a comment to remind ourselves why the current implementation doesn't work with mpbi + a likely explanation for the origin of the issue * put guard-rails back on MPI: doesn't seem to work as is. - add a note describing the problem * improve consistency of comments * add Scheme documentation * comment nits * add tests and improve `get_bloch_field_point_` function parameter signature * Update mpb/fields.c Revert use of `static` for C89 compatibility. Co-authored-by: Steven G. Johnson <[email protected]> Co-authored-by: Steven G. Johnson <[email protected]>
Hi Steven,
I finally found sufficient incentive to attempt solving #100, i.e. making real-space interpolation work with MPI.
I pretty much went about it in the way we discussed there (after finally understanding, I think, the transposition of x/y indices that you mentioned).
I've compiled this with and without MPI and verified that the interpolation produces good agreement for a simple 3D test case (see below) run with
mpb
vs.mpb-mpi
:mpb
(res=32)mpb-mpi
(res=32)mpb
(res=64)mpb-mpi
(res=64)evaluated at an arbitrary point. The results for
mpb-mpi
remain unchanged (up to tiny convergence variations) when run with different number processes.3D test example (click to expand)
Does this seem like a correct implementation? One concern I have is I didn't really think about the case of
HAVE_MPI
and noSCALAR_COMPLEX
: would that need additional special-casing?I also snuck in a tiny addition to
.gitignore
for working in VSCode (can be dropped, of course).