Non-reproducible outputs from adjoint solver (memory leak?) #1262

rhamerly · 2020-06-24T03:54:14Z

Under certain conditions, when I create an an OptimizationProblem instance opt = mpa.OptimizationProblem(...) and then run opt([x]) multiple times, the result can be very different for the first run (often nan), e.g. if I run

x1 = np.ones(dv.Nx*dv.Ny) * (n_SiO2**2 + n_air**2)/2
print("Run 1")
(J0_a, dJ_a, dg_a) = opt([x1])
print("Run 2")
(J0_b, dJ_b, dg_b) = opt([x1])
print("Run 3")
(J0_c, dJ_c, dg_c) = opt([x1])

I find that dJ_a differs from dJ_b, dJ_c and is often very large, inf, or nan. It's only the calculation of dJ that gets messed up; the actual simulations look fine (figures below). One possibility is meep.adjoint is reading from the array memory that hasn't been initialized or is already freed (I've run into this with autograd elsewhere).

Note14b-dJ.pdf
Note14b-sims.pdf

Full code below:

import numpy as np
import meep as mp
import sys
sys.path.append("/Users/ryan.hamerly/Documents/Research/Code/meep/python")
import adjoint as mpa
import autograd.numpy as npa
import matplotlib.pyplot as plt

# Global Constants
lam = 0.532;     f0 = 1/lam;      df = 0.2*f0;       res = 20

n_SiO2 = 1.45;   SiO2 = mp.Medium(index=n_SiO2)
n_air  = 1.00;   air  = mp.Medium(index=n_air)
n_Si   = 3.50;   Si   = mp.Medium(index=n_Si)

dpml = 1
(sx, sy) = (10, 7.7)
(Sx, Sy) = (sx+2*dpml, sy+2*dpml)

#%%

dr_res = 2*res
dr = mp.Volume(center=mp.Vector3(0, 2.4), size=mp.Vector3(sx, 0.9))
dv = mpa.BilinearInterpolationBasis(volume=dr, resolution=dr_res)

geom = [mp.Block(size=dr.size, center=dr.center, epsilon_func=dv.func())]
src_gauss = mp.Source(mp.GaussianSource(f0, fwidth=df),
                      center=mp.Vector3(0, sy/2-0.1), size=mp.Vector3(sx, 0), component=mp.Ez)

sim = mp.Simulation(cell_size=mp.Vector3(Sx, Sy),
                    resolution=res,
                    boundary_layers=[mp.PML(dpml)],
                    geometry=geom,
                    sources=[src_gauss],
                    default_material=SiO2,
                    eps_averaging=False)

TE0 = mpa.EigenmodeCoefficient(sim, mp.Volume(mp.Vector3(0, -sy/2+1), size=mp.Vector3(lam*0.7, 0)), mode=1,
                               eig_parity=mp.ODD_Z, k0=mp.Vector3(0, -1))
ob_list = [TE0]
def J(alpha):
    return npa.abs(alpha)**2

opt = mpa.OptimizationProblem(sim, J, ob_list, [dv], fcen=f0, df=0, nf=1, decay_fields=[mp.Ez])

#%%  Now run the simulation 3 times with the same data.

print("************************* First run of opt([x1]) *************************")
x1 = np.ones(dv.Nx*dv.Ny) * (n_SiO2**2 + n_air**2)/2
(J0_a, dJ_a, dg_a) = opt([x1])

print("************************* Second run of opt([x1]) *************************")
x1 = np.ones(dv.Nx*dv.Ny) * (n_SiO2**2 + n_air**2)/2
(J0_b, dJ_b, dg_b) = opt([x1])

print("************************* Third run of opt([x1]) *************************")
x1 = np.ones(dv.Nx*dv.Ny) * (n_SiO2**2 + n_air**2)/2
(J0_c, dJ_c, dg_c) = opt([x1])

#%%  Show the derivatives dJ/d_epsilon for the three (identical) trials

(f, axList) = plt.subplots(3, 1, sharex=True, sharey=True)
for (ax, dJ, ttl) in zip(axList, [dJ_a, dJ_b, dJ_c], ["Run 1", "Run 2", "Run 3"]):
    cax = ax.imshow(np.squeeze(dJ).T, origin='lower', aspect='auto', cmap='seismic');
    ax.set_xticks([]); ax.set_yticks([]); ax.set_title(ttl)
    f.colorbar(cax, ax=ax)
plt.savefig("Note14b-dJ.pdf")
#plt.show()

#%%  Simulation setup, forward and adjoint runs

print("************************* Forward run *************************")
(f, (ax1, ax2)) = plt.subplots(1, 2, figsize=(8, 4), sharex=True, sharey=True)
opt.prepare_forward_run()
opt.sim.run(until=20)
opt.sim.plot2D(ax=ax1, fields=mp.Ez, eps_parameters=dict(cmap='binary_r'))
print("************************* Adjoint run *************************")
opt.adjoint_run()
opt.sim.reset_meep()
opt.sim.run(until=20)
opt.sim.plot2D(ax=ax2, fields=mp.Ez, eps_parameters=dict(cmap='binary_r'))
ax1.set_title("Forward Run, $t=20$"); ax2.set_title("Adjoint Run, $t=20$"); ax2.set_ylabel("")
f.tight_layout(); f.savefig("Note14b-sims.pdf")
#plt.show()

The text was updated successfully, but these errors were encountered:

rhamerly · 2020-06-24T15:25:10Z

Workaround: call opt() before doing the optimization. If running from notebook cells (e.g. PyCharm), put opt() at the head of the cell. Would be nice to have a real fix, though.

smartalecH · 2020-06-25T01:14:32Z

Weird. This should be resolved with #1242. But I'll test it just to make sure.

Thanks for the heads up and the detailed bug report.

smartalecH · 2020-06-25T12:55:54Z

I've traced the issue down to this line:

meep/python/adjoint/optimization_problem.py

Line 229 in 1c5bdbf

    
           self.gradient = [[2*np.sum(np.real(self.a_E[ar][nb]*self.d_E[nb]),axis=(3)) for nb in range(self.num_design_regions)] for ar in range(len(self.objective_functions))]

What's weird is that the actual forward fields and adjoint fields themselves are calculated and stored properly. But you occasionally get weird results once you multiply.

Since this routine is completely changed in #1242 (all of this will be done in C) I don't think it's worth looking into right now. I will make sure this issue doesn't happen with the new routine, however.

rhamerly · 2020-06-25T16:51:04Z

Yes, I traced some warnings down to that line too. The value of self.d_E is not consistent between the runs. Probably related: I've been finding gradients with odd offsets from zero, e.g. in cases where you know dJ/d_eps should be an interference pattern that oscillates around zero, but instead if oscillates around a nonzero constant (figure below). I think this is the same bug since the offset is different one run to the next. Maybe it's best to wait for the code update from #1242 to see if that fixes this. Will that be soon?

Note14c-phase.pdf

Raw code:

import numpy as np
import meep as mp
import sys
sys.path.append("/Users/ryan.hamerly/Documents/Research/Code/meep/python")
import adjoint as mpa   # Note unconventional way of loading meep.adjoint
import autograd.numpy as npa
import matplotlib.pyplot as plt
import nlopt

# Constants
lam = 0.532;     f0 = 1/lam;      df = 0.2*f0;       res = 20
n_SiO2 = 1.45;   SiO2 = mp.Medium(index=n_SiO2)
spc = 1;         dpml = 1
(sx, sy) = (10, 7)
(Sx, Sy) = (sx+2*dpml, sy+2*dpml)

# Design region parameters
dr_res = 10
dr = mp.Volume(center=mp.Vector3(0, -2.4), size=mp.Vector3(sx, 0.2))
dv = mpa.BilinearInterpolationBasis(volume=dr, resolution=dr_res)
(Nx, Ny) = (dv.Nx, dv.Ny)

# Define optimization problem.
src_gauss = [mp.Source(mp.GaussianSource(f0, fwidth=df),
                       center=mp.Vector3(0, -(sy/2-0.1)), size=mp.Vector3(sx, 0), component=mp.Ez)]
geom = [mp.Block(center=dr.center, size=dr.size, epsilon_func=dv.func())]
sim = mp.Simulation(cell_size=mp.Vector3(Sx, Sy), resolution=res, boundary_layers=[mp.PML(dpml)],
                    geometry=geom, sources=src_gauss, default_material=SiO2, eps_averaging=False)
TE0 = mpa.EigenmodeCoefficient(sim, mp.Volume(center=mp.Vector3(0, 2.6), size=mp.Vector3(0.7*lam, 0)), mode=1,
                               eig_parity=mp.ODD_Z)
def J(alpha): return npa.abs(alpha)**2
opt = mpa.OptimizationProblem(sim, J, [TE0], [dv], fcen=f0, df=0, nf=1, decay_fields=[mp.Ez])

# Show problem geometry and optimization region.
x0 = n_SiO2**2 * np.ones(Nx*Ny)
opt.update_design([x0])
opt.plot2D(True, eps_parameters=dict(cmap='binary_r'))
plt.savefig("Note14c-setup.pdf", format="pdf")
plt.show()

# Run optimizer twice.
mp.quiet(True)
(J0_1, dJ_1, dg_1) = opt([x0])
(J0_2, dJ_2, dg_2) = opt([x0])

# Plot dJ and phase curve that it *should* follow
# In addition, the two results *should* be the same but aren't (off by a constant offset).
(f, (ax1, ax2)) = plt.subplots(2, 1, figsize=(8, 4.5))
for (ax, dJ, n) in zip([ax1, ax2], [dJ_1, dJ_2], [1, 2]):
    X = np.linspace(-sx/2, sx/2, 501)
    Y = -(-sy/2+spc-0.1) - (dr.center.y)
    ax.plot(X, np.cos(n_SiO2*2*np.pi/lam*(np.sqrt(X**2 + Y**2) - Y)))
    cax = ax.imshow(np.squeeze(dJ).T, extent=[-sx/2, sx/2, -0.1, 0.1], origin='lower')
    ax.set_xlabel("X"); ax.set_ylabel("Y"); ax.set_title(f"[Run {n}]: " + r"Phase $e^{i k r}$ (line), Gradient $dJ/d\epsilon$ (fill)")
    f.colorbar(cax, ax=ax, orientation='vertical').set_label(r"$dJ/d\epsilon$")
plt.tight_layout(); plt.savefig("Note14c-phase.pdf", format="pdf"); plt.show()

# Plot forward and adjoint runs to convince ourselves the simulation is set up right
(f, (ax1, ax2)) = plt.subplots(1, 2, figsize=(9, 4))
opt.prepare_forward_run()
opt.sim.run(until=20)
opt.sim.plot2D(ax=ax1, fields=mp.Ez, eps_parameters=dict(alpha=0)); ax1.set_title("Forward, $t=20$")
opt.adjoint_run()
opt.sim.reset_meep()
opt.sim.run(until=20)
opt.sim.plot2D(ax=ax2, fields=mp.Ez, eps_parameters=dict(alpha=0)); ax2.set_title("Adjoint, $t=20$")
plt.tight_layout(); plt.savefig("Note14c-sims.pdf", format="pdf"); plt.show()

smartalecH · 2020-06-25T20:30:05Z

Okay, I think I got it.

The source you are specifying is polarized out of plane (mp.Ez). It seems that meep doesn't store DFT fields for in-plane components when only out of plane sources are specified (and vice-versa).

All of my test cases involved eigenmode sources, which have components in-plane and out-of-plane. So I never picked up on this. We should modify the field storage routines to account for this (currently _get_dft_array() inside meep.i returns garbage for this corner case).

In the meantime, a clever hack is to specify a source in-plane with zero amplitude:

src_gauss = [mp.Source(mp.GaussianSource(f0, fwidth=df),
                       center=mp.Vector3(0, -(sy/2-0.1)), size=mp.Vector3(sx, 0), component=mp.Ez),
                       mp.Source(mp.GaussianSource(f0, fwidth=df),
                       center=mp.Vector3(0, -(sy/2-0.1)), size=mp.Vector3(sx, 0),amplitude=0, component=mp.Ex)]

rhamerly · 2020-06-27T23:28:10Z

Yes, I can confirm that the workaround works. Thanks!

oskooi added the bug label Jun 25, 2020

smartalecH mentioned this issue Jun 26, 2020

Fix dft fields memory leak #1264

Closed

oskooi closed this as completed Jan 3, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Non-reproducible outputs from adjoint solver (memory leak?) #1262

Non-reproducible outputs from adjoint solver (memory leak?) #1262

rhamerly commented Jun 24, 2020

rhamerly commented Jun 24, 2020

smartalecH commented Jun 25, 2020

smartalecH commented Jun 25, 2020

rhamerly commented Jun 25, 2020

smartalecH commented Jun 25, 2020

rhamerly commented Jun 27, 2020

Non-reproducible outputs from adjoint solver (memory leak?) #1262

Non-reproducible outputs from adjoint solver (memory leak?) #1262

Comments

rhamerly commented Jun 24, 2020

rhamerly commented Jun 24, 2020

smartalecH commented Jun 25, 2020

smartalecH commented Jun 25, 2020

rhamerly commented Jun 25, 2020

smartalecH commented Jun 25, 2020

rhamerly commented Jun 27, 2020