Skip to content

Generates code for products of spherical harmonics

License

Notifications You must be signed in to change notification settings

sjb3d/sh_product

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

sh_product

An implementation of Code Generation and Factoring for Fast Evaluation of Low-order Spherical Harmonic Products and Squares (paper by John Snyder).

The program does the following:

  • Numerically integrates spherical harmonic (dual) products for all unique combinations of basis functions
    • Checks that the results are close enough to 1 or 0 as expected
    • If not you may need to increase the step count (see PHI_STEPS in code)
  • Numerically integrates spherical harmonic triple products for all unique combinations of basis functions
    • This is parallelised using rayon but may take a while for higher orders!
  • Runs the algorithm from the paper to greedily pick basis function pairs that are most common
    • Emits C-like code in blocks for each picked pair, until all coefficients are done

Due to factoring out shared parts of these basis function pairs, the generated code has fewer multiplies compared with code that just sums all the non-zero parts of the triple product.

The output is order 2 (9 terms) spherical harmonics by default, but can emit higher order by changing the MAX_ORDER constant. (I have tried up to order 6, which generates code with 5351 multiplies in agreement with the paper, but I have not tried running the output!)

Here is the output for an order 2 multiply pasted into a C function:

// compute c = a * b for functions represented as 9-term SH
void sh_product(float *c, const float *a, const float *b)
{
    float ta, tb, t;

    const float C0 = 0.2820948f;
    const float C1 = 0.12615182f;
    const float C2 = 0.21850969f;
    const float C3 = 0.25228918f;
    const float C4 = 0.18022375f;
    const float C5 = 0.15607835f;
    const float C6 = 0.09013593f;

    // [0,0]: 0
    t = a[0]*b[0];
    c[0] = C0*t;

    // [1,1]: 0,6,8
    ta = C0*a[0] + (-C1)*a[6] + (-C2)*a[8];
    tb = C0*b[0] + (-C1)*b[6] + (-C2)*b[8];
    c[1] = ta*b[1] + tb*a[1];
    t = a[1]*b[1];
    c[0] += C0*t;
    c[6] = (-C1)*t;
    c[8] = (-C2)*t;

    // [1,2]: 5
    ta = C2*a[5];
    tb = C2*b[5];
    c[1] += ta*b[2] + tb*a[2];
    c[2] = ta*b[1] + tb*a[1];
    t = a[1]*b[2] + a[2]*b[1];
    c[5] = C2*t;

    // [2,2]: 0,6
    ta = C0*a[0] + C3*a[6];
    tb = C0*b[0] + C3*b[6];
    c[2] += ta*b[2] + tb*a[2];
    t = a[2]*b[2];
    c[0] += C0*t;
    c[6] += C3*t;

    // [1,3]: 4
    ta = C2*a[4];
    tb = C2*b[4];
    c[1] += ta*b[3] + tb*a[3];
    c[3] = ta*b[1] + tb*a[1];
    t = a[1]*b[3] + a[3]*b[1];
    c[4] = C2*t;

    // [2,3]: 7
    ta = C2*a[7];
    tb = C2*b[7];
    c[2] += ta*b[3] + tb*a[3];
    c[3] += ta*b[2] + tb*a[2];
    t = a[2]*b[3] + a[3]*b[2];
    c[7] = C2*t;

    // [3,3]: 0,6,8
    ta = C0*a[0] + (-C1)*a[6] + C2*a[8];
    tb = C0*b[0] + (-C1)*b[6] + C2*b[8];
    c[3] += ta*b[3] + tb*a[3];
    t = a[3]*b[3];
    c[0] += C0*t;
    c[6] += (-C1)*t;
    c[8] += C2*t;

    // [4,4]: 0,6
    ta = C0*a[0] + (-C4)*a[6];
    tb = C0*b[0] + (-C4)*b[6];
    c[4] += ta*b[4] + tb*a[4];
    t = a[4]*b[4];
    c[0] += C0*t;
    c[6] += (-C4)*t;

    // [4,5]: 7
    ta = C5*a[7];
    tb = C5*b[7];
    c[4] += ta*b[5] + tb*a[5];
    c[5] += ta*b[4] + tb*a[4];
    t = a[4]*b[5] + a[5]*b[4];
    c[7] += C5*t;

    // [5,5]: 0,6,8
    ta = C0*a[0] + C6*a[6] + (-C5)*a[8];
    tb = C0*b[0] + C6*b[6] + (-C5)*b[8];
    c[5] += ta*b[5] + tb*a[5];
    t = a[5]*b[5];
    c[0] += C0*t;
    c[6] += C6*t;
    c[8] += (-C5)*t;

    // [6,6]: 0,6
    ta = C0*a[0];
    tb = C0*b[0];
    c[6] += ta*b[6] + tb*a[6];
    t = a[6]*b[6];
    c[0] += C0*t;
    c[6] += C4*t;

    // [7,7]: 0,6,8
    ta = C0*a[0] + C6*a[6] + C5*a[8];
    tb = C0*b[0] + C6*b[6] + C5*b[8];
    c[7] += ta*b[7] + tb*a[7];
    t = a[7]*b[7];
    c[0] += C0*t;
    c[6] += C6*t;
    c[8] += C5*t;

    // [8,8]: 0,6
    ta = C0*a[0] + (-C4)*a[6];
    tb = C0*b[0] + (-C4)*b[6];
    c[8] += ta*b[8] + tb*a[8];
    t = a[8]*b[8];
    c[0] += C0*t;
    c[6] += (-C4)*t;

    // multiply count = 120
    // addition count = 74
}

About

Generates code for products of spherical harmonics

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages