
Add minimum linewidth constraint to waveguide mode converter topology optimization using Meep #27

Merged: 1 commit merged into NanoComp:main from mode_converter_linewidth on Nov 1, 2022

Conversation

@oskooi (Collaborator) commented Nov 1, 2022

Adds a minimum linewidth constraint to the Meep topology optimization of the waveguide mode converter based on A.M. Hammond et al., Optics Express, Vol. 29, pp. 23916-23938 (2021). Thanks to @smartalecH for help in setting this up.

Here is an example for a minimum linewidth of 50 nm. The feature seems to be working, although more work is needed to improve the performance of the final design.

mode_converter_optimize_minlen50_1

optimized_mode_converter_minlen50_1_refl_tran

optimal_design_minlen50_1_eval_hist

@oskooi (Collaborator, Author) commented Nov 1, 2022

CSV file of design to check against ruler.

optimal_design_minlen50_1.csv

@stevengj merged commit fc388d6 into NanoComp:main on Nov 1, 2022

@smartalecH left a comment


Would be good to split up the transmission and reflection as separate objective functions, as we discussed. Should be trivial since you are already doing a minimax.
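
A sketch of what that split could look like, assuming the NLopt vector-constraint convention used elsewhere in this script and a hypothetical forward_and_adjoint helper that returns the reflectance/transmittance spectra and their Jacobians:

    import numpy as np

    nfrq = 6  # number of wavelengths

    def c_split(result, x, gradient, eta, beta):
        """Epigraph constraints with R and 1-T separated: 2*nfrq
        constraints R_k - t < 0 and (1 - T_k) - t < 0, instead of
        nfrq constraints (R_k + 1 - T_k) - t < 0."""
        t = x[0]   # epigraph (dummy) variable
        v = x[1:]  # design variables
        # hypothetical helper: spectra and their gradients w.r.t. v
        R, T, dRdv, dTdv = forward_and_adjoint(v, eta, beta)
        result[:nfrq] = R - t
        result[nfrq:] = (1 - T) - t
        if gradient.size > 0:
            gradient[:, 0] = -1          # d/dt of every constraint
            gradient[:nfrq, 1:] = dRdv   # dR_k/dv
            gradient[nfrq:, 1:] = -dTdv  # d(1 - T_k)/dv

The constraint would then be registered with solver.add_inequality_mconstraint(..., [1e-6] * (2 * nfrq)).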

I think the most important thing, however, will be fixing the dummy parameter update.

Review context:

    lb = np.insert(lb, 0, 0)   # lower bound: cannot be less than 0
    ub = np.insert(ub, 0, 2)   # upper bound: cannot be more than 2
    x = np.insert(x, 0, 1.2)   # initial guess for the worst error
    lb = np.insert(lb, 0, 0.)  # lower bound: cannot be less than 0.


Don't put any bounds on the dummy parameter. As @stevengj has said before, just let the optimizer do its thing (especially during the inner iterations).
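
A minimal sketch of the unbounded insertion (NLopt treats -inf/+inf bounds as unbounded; the revised script later in this thread does exactly this):

    x = np.insert(x, 0, 1.2)        # initial guess for the worst error
    lb = np.insert(lb, 0, -np.inf)  # no lower bound on the dummy parameter
    ub = np.insert(ub, 0, np.inf)   # no upper bound on the dummy parameter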

Review context:

    solver.add_inequality_mconstraint(
        lambda r, x, g: glc(r, x, g, beta),
        [1e-8] * opt.nf,
    )
    x[:] = solver.optimize(x)


So we really need to recalibrate the dummy parameter (t) before starting the optimization. You can see the optimizer is spending most of its time adjusting this during the second epoch.

You can do one of three things (in order of increasing difficulty and expected performance); see the sketch after this list:

  1. Reset t to the same default at the start of each epoch. Heuristically, it's much easier for the optimizer to bring it down (when all the constraints are satisfied) than to bring it up (when none of the constraints are satisfied).
  2. Run a single forward run before starting each new optimization epoch, and manually set t as the maximum value.
  3. Modify the CCSA algorithm to update t analytically (as described in the original Svanberg paper).
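
For example, option 1 could be as simple as this sketch (using the fact that $F = R + (1-T) \leq 2$, so $t = 2$ makes every constraint $F_k - t \leq 0$ satisfied at the start of an epoch):

    for beta in betas:
        # Option 1 (sketch): recalibrate the dummy variable before each
        # epoch; F = R + (1 - T) <= 2, so t = 2 starts the epoch feasible.
        x[0] = 2.0
        # ... set up the solver and call solver.optimize(x) as before ...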

Review context (diff):

    @@ -37,13 +39,14 @@
     wvls = (1.265, 1.270, 1.275, 1.285, 1.290, 1.295)
     frqs = [1/wvl for wvl in wvls]

    -minimum_length = 0.09  # minimum length scale (μm)
    +minimum_length = 0.05  # minimum length scale (μm)


Did you drop the minimum feature size because this is what's done in the other paper and you want to be consistent? You should still get great results with 90 nm.

@oskooi deleted the mode_converter_linewidth branch on November 1, 2022 at 15:52

@mawc2019 (Collaborator) commented Nov 1, 2022

> CSV file of design to check against ruler.
>
> optimal_design_minlen50_1.csv

The minimum length scale estimated by ruler is 0.0625 μm (62.5 nm).

@oskooi (Collaborator, Author) commented Nov 3, 2022

Thanks again for your additional suggestions, @smartalecH. I did the following: (1) removed the constraints on the epigraph variable $t$ and (2) following your second recommendation, executed a single forward run before starting each optimization epoch and manually set $t$ to the maximum value of $R+1-T$ over all six wavelengths. I switched to a minimum length of 50 nm (from 90 nm) to see whether the length scale constraint is working properly and also to (hopefully) improve the final design performance. (A design with a 50 nm minimum length is measured to actually be 62.5 nm according to @mawc2019's ruler. This is a large discrepancy which would be good to address.)

Here are two designs generated using these modifications. Unfortunately, the final performance has gotten worse. The convergence history shows that at the start of the final epoch, when the minimum linewidth constraint is applied, the performance gets significantly worse and does not change throughout the epoch. I wonder whether I need to keep adjusting the hyperparameter a1 to obtain a better result. We can also try changing the objective function to $\min\max(R,1-T)$.

    # insert epigraph variable and _no_ bounds as first element of 1d arrays                                              
    x = np.insert(x, 0, 1.2)   # initial guess for the worst error (ignored)                                            
    lb = np.insert(lb, 0, -np.inf)
    ub = np.insert(ub, 0, np.inf)

    evaluation_history = []
    cur_iter = [0]

    betas = [8, 16, 32]
    max_eval = 50
    tol = np.array([1e-6] * opt.nf)
    for beta in betas:
        solver = nlopt.opt(algorithm, n + 1)
        solver.set_lower_bounds(lb)
        solver.set_upper_bounds(ub)
        solver.set_min_objective(f)
        solver.set_maxeval(max_eval)
        solver.add_inequality_mconstraint(
            lambda r, x, g: c(r, x, g, eta_i, beta),
            tol,
        )
        # apply the minimum linewidth constraint                                                                        
        # only in the final "epoch"                                                                                     
        if beta == betas[-1]:
            solver.add_inequality_mconstraint(
                lambda r, x, g: glc(r, x, g, beta),
                [1e-8] * opt.nf,
            )

        # run a single forward run before each epoch                                                                    
        # and manually set the epigraph variable to                                                                     
        # the largest value of the objective function                                                                   
        # over the six wavelengths                                                                                      
        t0 = opt(
            [mapping(x[1:], eta_i, beta)],
            need_gradient=False,
        )
        x[0] = np.amax(t0[0])

        x[:] = solver.optimize(x)

1. a1 = 1e-4

mode_converter_optimize_minlen50_4

optimized_mode_converter_minlen50_4_refl_tran

optimal_design_minlen50_4_eval_hist

2. a1 = 1e-5

mode_converter_optimize_minlen50_5

optimized_mode_converter_minlen50_5_refl_tran

optimal_design_minlen50_5_eval_hist

@smartalecH commented

We're getting closer...

So something now seems wrong with that first epoch. It shouldn't take 40+ iterations for it to start making progress on the FOM... Are you sure you are pulling the right value for the dummy parameter (t)?

An easy way to check what's going on is to print the value of t and the broadband FOM after each iteration. Is t really the max here? If so, why is the optimizer taking so long to work on the constraints? Is it instead trying to wiggle t around?
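
For instance, a minimal logging sketch inside the epigraph objective (reusing the cur_iter counter from the script above):

    def f(x, grad):
        """Epigraph objective: minimize the dummy variable t = x[0]."""
        t = x[0]
        if grad.size > 0:
            grad[0] = 1.0   # df/dt
            grad[1:] = 0.0  # f does not depend on the design variables
        print(f"iteration {cur_iter[0]}: t = {t:.6f}")
        cur_iter[0] += 1
        return t

The per-wavelength FOM values can be printed the same way inside c, where they are computed.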

But the second epoch is looking better.

> The convergence history shows that during the start of the final epoch when the minimum linewidth constraint is applied, the performance gets significantly worse and does not change throughout the epoch. I wonder whether I need to keep adjusting the hyperparameter a1 to obtain a better result.

This is classic behavior indicating that the hyperparameters aren't well chosen. The optimizer is struggling to satisfy the GLC (i.e., the constraint is too stiff).

Be sure to print the two values of the GLC constraints after each iteration. If they are still positive, then the optimizer is stuck trying to bring those down. It's useful to plot the gradient of the GLC constraints too.
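
One way to do that without touching glc itself is a thin wrapper (sketch; glc fills result in place per the NLopt mconstraint convention):

    def glc_logged(result, x, gradient, beta):
        """Evaluate glc and print its constraint values (sketch)."""
        glc(result, x, gradient, beta)
        # positive entries mean the linewidth constraints are still violated
        print(f"GLC constraints: {result}")

It would be registered as lambda r, x, g: glc_logged(r, x, g, beta) in place of the current lambda.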

When you do play with a1, do so by powers of 10. As $a_1 \to 0$, the constraint becomes independent of the dummy parameter (i.e., the optimizer won't even consider $t$ when trying to bring the GLC constraint functions below 0). As $a_1 \to 1$, the constraint starts to share the same weighting as the FOMs themselves.

> (A design with a 50 nm minimum length is measured to actually be 62.5 nm according to @mawc2019's ruler. This is a large discrepancy which would be good to address.)

What is this error relative to the resolution? As things go subpixel, the ruler is only so accurate...

@mawc2019 (Collaborator) commented Nov 3, 2022

> A design with a 50 nm minimum length is measured to actually be 62.5 nm according to @mawc2019's ruler. This is a large discrepancy which would be good to address.

The results of the method are affected by implementation details. I have other versions based on cv2 and scipy.ndimage, which are packages that can perform morphological transformations. The result obtained with these versions is 56.25 nm. However, cv2 does not support 3d calculation, and scipy.ndimage is costly for 3d even if the size of the 3d array is not large. Because no genuine 3d design patterns are involved in this testbed project, maybe we should change ruler.py to the version based on cv2 or scipy.ndimage?
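
For reference, a simplified sketch of the morphological-opening idea behind such a ruler (not the actual ruler.py; it only measures solid features, and a real ruler would also check the void phase via the complement):

    import numpy as np
    from scipy import ndimage

    def disk(radius_px):
        """Disk-shaped structuring element with the given pixel radius."""
        y, x = np.ogrid[-radius_px:radius_px + 1, -radius_px:radius_px + 1]
        return x**2 + y**2 <= radius_px**2

    def min_solid_length(design, pixel_size, threshold=0.5):
        """Estimate the minimum solid feature size as the diameter of the
        smallest disk whose morphological opening alters the design."""
        binary = design > threshold
        for radius_px in range(1, min(binary.shape) // 2):
            opened = ndimage.binary_opening(binary, structure=disk(radius_px))
            if not np.array_equal(opened, binary):
                return 2 * radius_px * pixel_size
        return min(binary.shape) * pixel_size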

> What is this error relative to the resolution? As things go subpixel, the ruler is only so accurate...

The resolution is 100, which means the pixel size is 10 nm. So the error of 62.5 - 50 = 12.5 nm is slightly above the size of one pixel. Such an error is not unexpected.

@stevengj (Contributor) commented Nov 3, 2022

> run a single forward run before each epoch and manually set the epigraph variable to the largest value of the objective function over the six wavelengths

Note that it is always better for CCSA to start with a feasible starting point (one which satisfies the constraints). (Technically, it is not guaranteed to converge otherwise, although we do attempt to minimize the constraints to reach the feasible region.) In the case of an epigraph formulation, this means that you typically want to start each epoch with a dummy variable that equals the min/max of the functions you are maximizing/minimizing.
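
In the present minimax setup, that means initializing $t_0 = \max_\lambda \left[ R(\lambda) + (1 - T(\lambda)) \right]$ (optionally plus a small margin) at the start of each epoch, which is what the forward-run calibration above implements.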

@oskooi (Collaborator, Author) commented Nov 7, 2022

> So something now seems wrong with that first epoch. It shouldn't take 40+ iterations for it to start making progress on the FOM... Are you sure you are pulling the right value for the dummy parameter (t)?

The range of the objective function $F=R+(1-T)$ is $[0,2.0]$ (i.e., min: $R=0$, $T=1$; max: $R=1$, $T=0$). Based on this, it would seem that the optimizer should drive the epigraph variable $t$ to lie in the range $[\Delta, 2.0+\Delta]$, where $\Delta$ is a small number (e.g., 0.1), in order to satisfy the constraint $F_{\max}(\lambda) - t < 0$.

However, what I observe in my runs is that when the bounds on $t$ are removed (lb = np.insert(lb, 0, -np.inf), ub = np.insert(ub, 0, np.inf) in the script): (1) $t$ becomes negative during certain iterations of the first epoch, and (2) in the final epoch when glc is activated, $t > 5.0$ and remains practically unchanged regardless of the value of the glc hyperparameter a1 (varied from 1e-3 to 1e-6). A negative $t$ is a clear violation of the constraint, since $F \geq 0$. Note that we are forcing the initial $t$ of each of the three epochs to satisfy the constraint; in this case, these values are 1.20, 0.33, and 0.11, which are decreasing.

There does not seem to be a bug in the setup of the minimax optimization using the epigraph formulation. Yet the results from (1) and (2) are unexpected.

@mawc2019 (Collaborator) commented Nov 7, 2022

> So something now seems wrong with that first epoch. It shouldn't take 40+ iterations for it to start making progress on the FOM... Are you sure you are pulling the right value for the dummy parameter (t)?

This behavior does not seem to be related to the dummy variable, but rather to small and incorrect adjoint gradients. As mentioned here, the adjoint gradient is small and incorrect at the uniform structure, which is x = 0.5*np.ones(Nx*Ny) in that test example. If the optimization starts from a random structure, the flat curve does not appear in the initial stage, as shown below.
(image: convergence history for an optimization started from a random structure)
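
A sketch of such a random initial design (the grid dimensions and seed here are placeholders):

    import numpy as np

    Nx, Ny = 100, 100  # placeholder design-grid dimensions
    rng = np.random.RandomState(12345)  # arbitrary seed, for reproducibility
    x = rng.rand(Nx * Ny)  # random structure instead of 0.5*np.ones(Nx*Ny)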

@smartalecH commented

> As mentioned in #22 (comment), the adjoint gradient is small and incorrect at the uniform structure, which is x = 0.5*np.ones(Nx*Ny)

Like we've discussed before, the current subpixel averaging code for material grids grows increasingly ill-conditioned as the norm of the spatial gradient of the material grid approaches zero (which is the case here, when u=0.5 everywhere).

We should turn off subpixel smoothing for the material grid here. You aren't taking advantage of its features anyway.
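
A sketch of what that might look like, assuming Meep's MaterialGrid and its do_averaging flag (dimensions and materials are placeholders):

    import meep as mp
    import numpy as np

    Nx, Ny = 100, 100  # placeholder design-grid dimensions
    matgrid = mp.MaterialGrid(
        mp.Vector3(Nx, Ny),
        mp.air,
        mp.Medium(index=3.5),            # placeholder material
        weights=0.5 * np.ones((Nx, Ny)),
        do_averaging=False,              # disable subpixel smoothing
    )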

@mawc2019 (Collaborator) commented Nov 7, 2022

As mentioned here, I turned off the smoothing, but the adjoint gradient was still incorrect.

@oskooi (Collaborator, Author) commented Nov 7, 2022

Here is a plot for a typical run of the objective function $F_{\max}(\lambda)$ and epigraph variable $t$ as a function of the iteration number. Each epoch consists of 50 iterations. Something is clearly wrong.

Epoch 1 and 2

  • In the first epoch (iterations 1-50), $F_{\max}(\lambda)$ and $t$ do not start to change until iteration ~35.
  • For most of the iterations, particularly in Epoch 2 starting at iteration 51, $F_{\max}(\lambda) - t > 0$, which violates the constraint.

optimal_design_minlen50_11_eval_hist

Epoch 3: apply minimum linewidth constraint

  • Starting at iteration 106 (the third epoch starts at iteration 101), $t$ becomes much larger than $F_{\max}(\lambda)$.

optimal_design_minlen50_11_eval_hist_b
