WIP/RFC make the partial scalars VecElements #563
The LLVM for master:

define void @"julia_rhs!_4840"({}* nonnull align 16 dereferenceable(40) %0, {}* nonnull align 16 dereferenceable(40) %1, double %2) {
top:
%3 = bitcast {}* %1 to { double, [1 x [8 x double]] }**
%4 = load { double, [1 x [8 x double]] }*, { double, [1 x [8 x double]] }** %3, align 8
%.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 0, i32 0
%.unpack = load double, double* %.elt, align 8
%.unpack1661.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 0, i32 1, i64 0, i64 0
%5 = bitcast double* %.unpack1661.unpack.elt to <8 x double>*
%6 = load <8 x double>, <8 x double>* %5, align 8
%7 = fmul double %.unpack, 3.500000e-01
%8 = fmul <8 x double> %6, <double 3.500000e-01, double 3.500000e-01, double 3.500000e-01, double 3.500000e-01, double 3.500000e-01, double 3.500000e-01, double 3.500000e-01, double 3.500000e-01>
%.elt1678 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 1, i32 0
%.unpack1679 = load double, double* %.elt1678, align 8
%.unpack1681.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 1, i32 1, i64 0, i64 0
%9 = bitcast double* %.unpack1681.unpack.elt to <8 x double>*
%10 = load <8 x double>, <8 x double>* %9, align 8
%.elt1698 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 3, i32 0
%.unpack1699 = load double, double* %.elt1698, align 8
%.unpack1701.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 3, i32 1, i64 0, i64 0
%11 = bitcast double* %.unpack1701.unpack.elt to <8 x double>*
%12 = load <8 x double>, <8 x double>* %11, align 8
%13 = fmul double %.unpack1679, 2.660000e+01
%14 = fmul <8 x double> %10, <double 2.660000e+01, double 2.660000e+01, double 2.660000e+01, double 2.660000e+01, double 2.660000e+01, double 2.660000e+01, double 2.660000e+01, double 2.660000e+01>
%15 = fmul double %13, %.unpack1699
%16 = insertelement <8 x double> undef, double %13, i32 0
%res.i1659 = shufflevector <8 x double> %16, <8 x double> undef, <8 x i32> zeroinitializer
%17 = fmul <8 x double> %res.i1659, %12
%18 = insertelement <8 x double> undef, double %.unpack1699, i32 0
%res.i1658 = shufflevector <8 x double> %18, <8 x double> undef, <8 x i32> zeroinitializer
%19 = call <8 x double> @llvm.fmuladd.v8f64(<8 x double> %res.i1658, <8 x double> %14, <8 x double> %17)
%.elt1718 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 4, i32 0
%.unpack1719 = load double, double* %.elt1718, align 8
%.unpack1721.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 4, i32 1, i64 0, i64 0
%20 = bitcast double* %.unpack1721.unpack.elt to <8 x double>*
%21 = load <8 x double>, <8 x double>* %20, align 8
%22 = fmul double %.unpack1719, 1.230000e+04
%23 = fmul <8 x double> %21, <double 1.230000e+04, double 1.230000e+04, double 1.230000e+04, double 1.230000e+04, double 1.230000e+04, double 1.230000e+04, double 1.230000e+04, double 1.230000e+04>
%24 = fmul double %22, %.unpack1679
%25 = insertelement <8 x double> undef, double %22, i32 0
%res.i1657 = shufflevector <8 x double> %25, <8 x double> undef, <8 x i32> zeroinitializer
%26 = fmul <8 x double> %res.i1657, %10
%27 = insertelement <8 x double> undef, double %.unpack1679, i32 0
%res.i1656 = shufflevector <8 x double> %27, <8 x double> undef, <8 x i32> zeroinitializer
%28 = call <8 x double> @llvm.fmuladd.v8f64(<8 x double> %res.i1656, <8 x double> %23, <8 x double> %26)
%.elt1758 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 6, i32 0
%.unpack1759 = load double, double* %.elt1758, align 8
%.unpack1761.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 6, i32 1, i64 0, i64 0
%29 = bitcast double* %.unpack1761.unpack.elt to <8 x double>*
%30 = load <8 x double>, <8 x double>* %29, align 8
%31 = fmul double %.unpack1759, 0x3F4C2E33EFF19503
%32 = fmul <8 x double> %30, <double 0x3F4C2E33EFF19503, double 0x3F4C2E33EFF19503, double 0x3F4C2E33EFF19503, double 0x3F4C2E33EFF19503, double 0x3F4C2E33EFF19503, double 0x3F4C2E33EFF19503, double 0x3F4C2E33EFF19503, double 0x3F4C2E33EFF19503>
%33 = fmul double %.unpack1759, 0x3F4ADEA897635E74
%34 = fmul <8 x double> %30, <double 0x3F4ADEA897635E74, double 0x3F4ADEA897635E74, double 0x3F4ADEA897635E74, double 0x3F4ADEA897635E74, double 0x3F4ADEA897635E74, double 0x3F4ADEA897635E74, double 0x3F4ADEA897635E74, double 0x3F4ADEA897635E74>
%.elt1818 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 5, i32 0
%.unpack1819 = load double, double* %.elt1818, align 8
%.unpack1821.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 5, i32 1, i64 0, i64 0
%35 = bitcast double* %.unpack1821.unpack.elt to <8 x double>*
%36 = load <8 x double>, <8 x double>* %35, align 8
%37 = fmul double %.unpack1759, 1.500000e+04
%38 = fmul <8 x double> %30, <double 1.500000e+04, double 1.500000e+04, double 1.500000e+04, double 1.500000e+04, double 1.500000e+04, double 1.500000e+04, double 1.500000e+04, double 1.500000e+04>
%39 = fmul double %37, %.unpack1819
%40 = insertelement <8 x double> undef, double %37, i32 0
%res.i1655 = shufflevector <8 x double> %40, <8 x double> undef, <8 x i32> zeroinitializer
%41 = fmul <8 x double> %res.i1655, %36
%42 = insertelement <8 x double> undef, double %.unpack1819, i32 0
%res.i1654 = shufflevector <8 x double> %42, <8 x double> undef, <8 x i32> zeroinitializer
%43 = call <8 x double> @llvm.fmuladd.v8f64(<8 x double> %res.i1654, <8 x double> %38, <8 x double> %41)
%.elt1838 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 8, i32 0
%.unpack1839 = load double, double* %.elt1838, align 8
%.unpack1841.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 8, i32 1, i64 0, i64 0
%44 = bitcast double* %.unpack1841.unpack.elt to <8 x double>*
%45 = load <8 x double>, <8 x double>* %44, align 8
%46 = fmul double %.unpack1839, 1.300000e-04
%47 = fmul <8 x double> %45, <double 1.300000e-04, double 1.300000e-04, double 1.300000e-04, double 1.300000e-04, double 1.300000e-04, double 1.300000e-04, double 1.300000e-04, double 1.300000e-04>
%48 = fmul double %.unpack1839, 2.400000e+04
%49 = fmul <8 x double> %45, <double 2.400000e+04, double 2.400000e+04, double 2.400000e+04, double 2.400000e+04, double 2.400000e+04, double 2.400000e+04, double 2.400000e+04, double 2.400000e+04>
%50 = fmul double %48, %.unpack1819
%51 = insertelement <8 x double> undef, double %48, i32 0
%res.i1653 = shufflevector <8 x double> %51, <8 x double> undef, <8 x i32> zeroinitializer
%52 = fmul <8 x double> %res.i1653, %36
%53 = call <8 x double> @llvm.fmuladd.v8f64(<8 x double> %res.i1654, <8 x double> %49, <8 x double> %52)
%.elt1898 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 10, i32 0
%.unpack1899 = load double, double* %.elt1898, align 8
%.unpack1901.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 10, i32 1, i64 0, i64 0
%54 = bitcast double* %.unpack1901.unpack.elt to <8 x double>*
%55 = load <8 x double>, <8 x double>* %54, align 8
%56 = fmul double %.unpack1899, 1.650000e+04
%57 = fmul <8 x double> %55, <double 1.650000e+04, double 1.650000e+04, double 1.650000e+04, double 1.650000e+04, double 1.650000e+04, double 1.650000e+04, double 1.650000e+04, double 1.650000e+04>
%58 = fmul double %56, %.unpack1679
%59 = insertelement <8 x double> undef, double %56, i32 0
%res.i1651 = shufflevector <8 x double> %59, <8 x double> undef, <8 x i32> zeroinitializer
%60 = fmul <8 x double> %res.i1651, %10
%61 = call <8 x double> @llvm.fmuladd.v8f64(<8 x double> %res.i1656, <8 x double> %57, <8 x double> %60)
%62 = fmul double %.unpack1899, 9.000000e+03
%63 = fmul <8 x double> %55, <double 9.000000e+03, double 9.000000e+03, double 9.000000e+03, double 9.000000e+03, double 9.000000e+03, double 9.000000e+03, double 9.000000e+03, double 9.000000e+03>
%64 = fmul double %62, %.unpack
%65 = insertelement <8 x double> undef, double %62, i32 0
%res.i1649 = shufflevector <8 x double> %65, <8 x double> undef, <8 x i32> zeroinitializer
%66 = fmul <8 x double> %res.i1649, %6
%67 = insertelement <8 x double> undef, double %.unpack, i32 0
%res.i1648 = shufflevector <8 x double> %67, <8 x double> undef, <8 x i32> zeroinitializer
%68 = call <8 x double> @llvm.fmuladd.v8f64(<8 x double> %res.i1648, <8 x double> %63, <8 x double> %66)
%.elt1978 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 12, i32 0
%.unpack1979 = load double, double* %.elt1978, align 8
%.unpack1981.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 12, i32 1, i64 0, i64 0
%69 = bitcast double* %.unpack1981.unpack.elt to <8 x double>*
%70 = load <8 x double>, <8 x double>* %69, align 8
%71 = fmul double %.unpack1979, 2.200000e-02
%72 = fmul <8 x double> %70, <double 2.200000e-02, double 2.200000e-02, double 2.200000e-02, double 2.200000e-02, double 2.200000e-02, double 2.200000e-02, double 2.200000e-02, double 2.200000e-02>
%.elt1998 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 9, i32 0
%.unpack1999 = load double, double* %.elt1998, align 8
%.unpack2001.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 9, i32 1, i64 0, i64 0
%73 = bitcast double* %.unpack2001.unpack.elt to <8 x double>*
%74 = load <8 x double>, <8 x double>* %73, align 8
%75 = fmul double %.unpack1999, 1.200000e+04
%76 = fmul <8 x double> %74, <double 1.200000e+04, double 1.200000e+04, double 1.200000e+04, double 1.200000e+04, double 1.200000e+04, double 1.200000e+04, double 1.200000e+04, double 1.200000e+04>
%77 = fmul double %75, %.unpack1679
%78 = insertelement <8 x double> undef, double %75, i32 0
%res.i1647 = shufflevector <8 x double> %78, <8 x double> undef, <8 x i32> zeroinitializer
%79 = fmul <8 x double> %res.i1647, %10
%80 = call <8 x double> @llvm.fmuladd.v8f64(<8 x double> %res.i1656, <8 x double> %76, <8 x double> %79)
%.elt2038 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 13, i32 0
%.unpack2039 = load double, double* %.elt2038, align 8
%.unpack2041.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 13, i32 1, i64 0, i64 0
%81 = bitcast double* %.unpack2041.unpack.elt to <8 x double>*
%82 = load <8 x double>, <8 x double>* %81, align 8
%83 = fmul double %.unpack2039, 1.880000e+00
%84 = fmul <8 x double> %82, <double 1.880000e+00, double 1.880000e+00, double 1.880000e+00, double 1.880000e+00, double 1.880000e+00, double 1.880000e+00, double 1.880000e+00, double 1.880000e+00>
%85 = fmul double %.unpack, 1.630000e+04
%86 = fmul <8 x double> %6, <double 1.630000e+04, double 1.630000e+04, double 1.630000e+04, double 1.630000e+04, double 1.630000e+04, double 1.630000e+04, double 1.630000e+04, double 1.630000e+04>
%87 = fmul double %85, %.unpack1819
%88 = insertelement <8 x double> undef, double %85, i32 0
%res.i1645 = shufflevector <8 x double> %88, <8 x double> undef, <8 x i32> zeroinitializer
%89 = fmul <8 x double> %res.i1645, %36
%90 = call <8 x double> @llvm.fmuladd.v8f64(<8 x double> %res.i1654, <8 x double> %86, <8 x double> %89)
%.elt2098 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 2, i32 0
%.unpack2099 = load double, double* %.elt2098, align 8
%.unpack2101.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 2, i32 1, i64 0, i64 0
%91 = bitcast double* %.unpack2101.unpack.elt to <8 x double>*
%92 = load <8 x double>, <8 x double>* %91, align 8
%93 = fmul double %.unpack2099, 4.800000e+06
%94 = fmul <8 x double> %92, <double 4.800000e+06, double 4.800000e+06, double 4.800000e+06, double 4.800000e+06, double 4.800000e+06, double 4.800000e+06, double 4.800000e+06, double 4.800000e+06>
%95 = fmul double %.unpack1699, 3.500000e-04
%96 = fmul <8 x double> %12, <double 3.500000e-04, double 3.500000e-04, double 3.500000e-04, double 3.500000e-04, double 3.500000e-04, double 3.500000e-04, double 3.500000e-04, double 3.500000e-04>
%97 = fmul double %.unpack1699, 1.750000e-02
%98 = fmul <8 x double> %12, <double 1.750000e-02, double 1.750000e-02, double 1.750000e-02, double 1.750000e-02, double 1.750000e-02, double 1.750000e-02, double 1.750000e-02, double 1.750000e-02>
%.elt2158 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 15, i32 0
%.unpack2159 = load double, double* %.elt2158, align 8
%.unpack2161.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 15, i32 1, i64 0, i64 0
%99 = bitcast double* %.unpack2161.unpack.elt to <8 x double>*
%100 = load <8 x double>, <8 x double>* %99, align 8
%101 = fmul double %.unpack2159, 1.000000e+08
%102 = fmul <8 x double> %100, <double 1.000000e+08, double 1.000000e+08, double 1.000000e+08, double 1.000000e+08, double 1.000000e+08, double 1.000000e+08, double 1.000000e+08, double 1.000000e+08>
%103 = fmul double %.unpack2159, 4.440000e+11
%104 = fmul <8 x double> %100, <double 4.440000e+11, double 4.440000e+11, double 4.440000e+11, double 4.440000e+11, double 4.440000e+11, double 4.440000e+11, double 4.440000e+11, double 4.440000e+11>
%.elt2198 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 16, i32 0
%.unpack2199 = load double, double* %.elt2198, align 8
%.unpack2201.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 16, i32 1, i64 0, i64 0
%105 = bitcast double* %.unpack2201.unpack.elt to <8 x double>*
%106 = load <8 x double>, <8 x double>* %105, align 8
%107 = fmul double %.unpack2199, 1.240000e+03
%108 = fmul <8 x double> %106, <double 1.240000e+03, double 1.240000e+03, double 1.240000e+03, double 1.240000e+03, double 1.240000e+03, double 1.240000e+03, double 1.240000e+03, double 1.240000e+03>
%109 = fmul double %107, %.unpack1819
%110 = insertelement <8 x double> undef, double %107, i32 0
%res.i1643 = shufflevector <8 x double> %110, <8 x double> undef, <8 x i32> zeroinitializer
%111 = fmul <8 x double> %res.i1643, %36
%112 = call <8 x double> @llvm.fmuladd.v8f64(<8 x double> %res.i1654, <8 x double> %108, <8 x double> %111)
%.elt2238 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 18, i32 0
%.unpack2239 = load double, double* %.elt2238, align 8
%.unpack2241.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 18, i32 1, i64 0, i64 0
%113 = bitcast double* %.unpack2241.unpack.elt to <8 x double>*
%114 = load <8 x double>, <8 x double>* %113, align 8
%115 = fmul double %.unpack2239, 2.100000e+00
%116 = fmul <8 x double> %114, <double 2.100000e+00, double 2.100000e+00, double 2.100000e+00, double 2.100000e+00, double 2.100000e+00, double 2.100000e+00, double 2.100000e+00, double 2.100000e+00>
%117 = fmul double %.unpack2239, 5.780000e+00
%118 = fmul <8 x double> %114, <double 5.780000e+00, double 5.780000e+00, double 5.780000e+00, double 5.780000e+00, double 5.780000e+00, double 5.780000e+00, double 5.780000e+00, double 5.780000e+00>
%119 = fmul double %.unpack, 4.740000e-02
%120 = fmul <8 x double> %6, <double 4.740000e-02, double 4.740000e-02, double 4.740000e-02, double 4.740000e-02, double 4.740000e-02, double 4.740000e-02, double 4.740000e-02, double 4.740000e-02>
%121 = fmul double %119, %.unpack1699
%122 = insertelement <8 x double> undef, double %119, i32 0
%res.i1641 = shufflevector <8 x double> %122, <8 x double> undef, <8 x i32> zeroinitializer
%123 = fmul <8 x double> %res.i1641, %12
%124 = call <8 x double> @llvm.fmuladd.v8f64(<8 x double> %res.i1658, <8 x double> %120, <8 x double> %123)
%125 = fmul double %.unpack2239, 1.780000e+03
%126 = fmul <8 x double> %114, <double 1.780000e+03, double 1.780000e+03, double 1.780000e+03, double 1.780000e+03, double 1.780000e+03, double 1.780000e+03, double 1.780000e+03, double 1.780000e+03>
%127 = fmul double %125, %.unpack
%128 = insertelement <8 x double> undef, double %125, i32 0
%res.i1639 = shufflevector <8 x double> %128, <8 x double> undef, <8 x i32> zeroinitializer
%129 = fmul <8 x double> %res.i1639, %6
%130 = call <8 x double> @llvm.fmuladd.v8f64(<8 x double> %res.i1648, <8 x double> %126, <8 x double> %129)
%.elt2358 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 19, i32 0
%.unpack2359 = load double, double* %.elt2358, align 8
%.unpack2361.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 19, i32 1, i64 0, i64 0
%131 = bitcast double* %.unpack2361.unpack.elt to <8 x double>*
%132 = load <8 x double>, <8 x double>* %131, align 8
%133 = fmul double %.unpack2359, 3.120000e+00
%134 = fmul <8 x double> %132, <double 3.120000e+00, double 3.120000e+00, double 3.120000e+00, double 3.120000e+00, double 3.120000e+00, double 3.120000e+00, double 3.120000e+00, double 3.120000e+00>
%135 = fneg double %7
%136 = fneg <8 x double> %8
%137 = fsub double %135, %64
%138 = fsub <8 x double> %136, %68
%139 = fsub double %137, %87
%140 = fsub <8 x double> %138, %90
%141 = fsub double %139, %121
%142 = fsub <8 x double> %140, %124
%143 = fsub double %141, %127
%144 = fsub <8 x double> %142, %130
%145 = fadd double %15, %143
%146 = fadd <8 x double> %19, %144
%147 = fadd double %24, %145
%148 = fadd <8 x double> %28, %146
%149 = fadd double %58, %147
%150 = fadd <8 x double> %61, %148
%151 = fadd double %71, %149
%152 = fadd <8 x double> %72, %150
%153 = fadd double %77, %151
%154 = fadd <8 x double> %80, %152
%155 = fadd double %117, %153
%156 = fadd <8 x double> %118, %154
%157 = fadd double %155, %133
%158 = fadd <8 x double> %156, %134
%159 = bitcast {}* %0 to { double, [1 x [8 x double]] }**
%160 = load { double, [1 x [8 x double]] }*, { double, [1 x [8 x double]] }** %159, align 8
%.repack = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %160, i64 0, i32 0
store double %157, double* %.repack, align 8
%161 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %160, i64 0, i32 1, i64 0
%162 = bitcast [8 x double]* %161 to <8 x double>*
store <8 x double> %158, <8 x double>* %162, align 8
%163 = fneg double %15
%164 = fneg <8 x double> %19
%165 = fsub double %163, %24
%166 = fsub <8 x double> %164, %28
%167 = fsub double %165, %58
%168 = fsub <8 x double> %166, %61
%169 = fsub double %167, %77
%170 = fsub <8 x double> %168, %80
%171 = fadd double %7, %169
%172 = fadd <8 x double> %8, %170
%173 = fadd double %171, %115
%174 = fadd <8 x double> %172, %116
%.repack2380 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %160, i64 1, i32 0
store double %173, double* %.repack2380, align 8
%175 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %160, i64 1, i32 1, i64 0
%176 = bitcast [8 x double]* %175 to <8 x double>*
store <8 x double> %174, <8 x double>* %176, align 8
%177 = fsub double %7, %93
%178 = fsub <8 x double> %8, %94
%179 = fadd double %177, %97
%180 = fadd <8 x double> %178, %98
%181 = fadd double %179, %103
%182 = fadd <8 x double> %180, %104
%183 = fadd double %181, %117
%184 = fadd <8 x double> %182, %118
%.repack2383 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %160, i64 2, i32 0
store double %183, double* %.repack2383, align 8
%185 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %160, i64 2, i32 1, i64 0
%186 = bitcast [8 x double]* %185 to <8 x double>*
store <8 x double> %184, <8 x double>* %186, align 8
%187 = fsub double %163, %95
%188 = fsub <8 x double> %164, %96
%189 = fsub double %187, %97
%190 = fsub <8 x double> %188, %98
%191 = fsub double %189, %121
%192 = fsub <8 x double> %190, %124
%193 = fadd double %93, %191
%194 = fadd <8 x double> %94, %192
%.repack2386 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %160, i64 3, i32 0
store double %193, double* %.repack2386, align 8
%195 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %160, i64 3, i32 1, i64 0
%196 = bitcast [8 x double]* %195 to <8 x double>*
store <8 x double> %194, <8 x double>* %196, align 8
%197 = fsub double %31, %24
%198 = fsub <8 x double> %32, %28
%199 = fadd double %31, %197
%200 = fadd <8 x double> %32, %198
%201 = fadd double %199, %39
%202 = fadd <8 x double> %200, %43
%203 = fadd double %201, %46
%204 = fadd <8 x double> %202, %47
%205 = fadd double %203, %83
%206 = fadd <8 x double> %204, %84
%207 = fadd double %205, %109
%208 = fadd <8 x double> %206, %112
%.repack2389 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %160, i64 4, i32 0
store double %207, double* %.repack2389, align 8
%209 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %160, i64 4, i32 1, i64 0
%210 = bitcast [8 x double]* %209 to <8 x double>*
store <8 x double> %208, <8 x double>* %210, align 8
%211 = fneg double %39
%212 = fneg <8 x double> %43
%213 = fsub double %211, %50
%214 = fsub <8 x double> %212, %53
%215 = fsub double %213, %87
%216 = fsub <8 x double> %214, %90
%217 = fsub double %215, %109
%218 = fsub <8 x double> %216, %112
%219 = fadd double %24, %217
%220 = fadd <8 x double> %28, %218
%221 = fadd double %101, %219
%222 = fadd <8 x double> %102, %220
%223 = fadd double %101, %221
%224 = fadd <8 x double> %102, %222
%.repack2392 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %160, i64 5, i32 0
store double %223, double* %.repack2392, align 8
%225 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %160, i64 5, i32 1, i64 0
%226 = bitcast [8 x double]* %225 to <8 x double>*
store <8 x double> %224, <8 x double>* %226, align 8
%227 = fneg double %31
%228 = fneg <8 x double> %32
%229 = fsub double %227, %33
%230 = fsub <8 x double> %228, %34
%231 = fsub double %229, %39
%232 = fsub <8 x double> %230, %43
%233 = fadd double %231, %83
%234 = fadd <8 x double> %232, %84
%.repack2395 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %160, i64 6, i32 0
store double %233, double* %.repack2395, align 8
%235 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %160, i64 6, i32 1, i64 0
%236 = bitcast [8 x double]* %235 to <8 x double>*
store <8 x double> %234, <8 x double>* %236, align 8
%237 = fadd double %31, %33
%238 = fadd <8 x double> %32, %34
%239 = fadd double %237, %39
%240 = fadd <8 x double> %238, %43
%241 = fadd double %239, %46
%242 = fadd <8 x double> %240, %47
%243 = load { double, [1 x [8 x double]] }*, { double, [1 x [8 x double]] }** %159, align 8
%.repack2398 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 7, i32 0
store double %241, double* %.repack2398, align 8
%244 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 7, i32 1, i64 0
%245 = bitcast [8 x double]* %244 to <8 x double>*
store <8 x double> %242, <8 x double>* %245, align 8
%246 = fneg double %46
%247 = fneg <8 x double> %47
%248 = fsub double %246, %50
%249 = fsub <8 x double> %247, %53
%.repack2401 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 8, i32 0
store double %248, double* %.repack2401, align 8
%250 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 8, i32 1, i64 0
%251 = bitcast [8 x double]* %250 to <8 x double>*
store <8 x double> %249, <8 x double>* %251, align 8
%252 = fsub double %46, %77
%253 = fsub <8 x double> %47, %80
%254 = fadd double %58, %252
%255 = fadd <8 x double> %61, %253
%.repack2404 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 9, i32 0
store double %254, double* %.repack2404, align 8
%256 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 9, i32 1, i64 0
%257 = bitcast [8 x double]* %256 to <8 x double>*
store <8 x double> %255, <8 x double>* %257, align 8
%258 = fneg double %58
%259 = fneg <8 x double> %61
%260 = fsub double %258, %64
%261 = fsub <8 x double> %259, %68
%262 = fadd double %50, %260
%263 = fadd <8 x double> %53, %261
%264 = fadd double %262, %71
%265 = fadd <8 x double> %263, %72
%.repack2407 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 10, i32 0
store double %264, double* %.repack2407, align 8
%266 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 10, i32 1, i64 0
%267 = bitcast [8 x double]* %266 to <8 x double>*
store <8 x double> %265, <8 x double>* %267, align 8
%.repack2410 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 11, i32 0
store double %58, double* %.repack2410, align 8
%268 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 11, i32 1, i64 0
%269 = bitcast [8 x double]* %268 to <8 x double>*
store <8 x double> %61, <8 x double>* %269, align 8
%270 = fsub double %64, %71
%271 = fsub <8 x double> %68, %72
%.repack2413 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 12, i32 0
store double %270, double* %.repack2413, align 8
%272 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 12, i32 1, i64 0
%273 = bitcast [8 x double]* %272 to <8 x double>*
store <8 x double> %271, <8 x double>* %273, align 8
%274 = fsub double %77, %83
%275 = fsub <8 x double> %80, %84
%.repack2416 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 13, i32 0
store double %274, double* %.repack2416, align 8
%276 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 13, i32 1, i64 0
%277 = bitcast [8 x double]* %276 to <8 x double>*
store <8 x double> %275, <8 x double>* %277, align 8
%.repack2419 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 14, i32 0
store double %87, double* %.repack2419, align 8
%278 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 14, i32 1, i64 0
%279 = bitcast [8 x double]* %278 to <8 x double>*
store <8 x double> %90, <8 x double>* %279, align 8
%280 = fneg double %101
%281 = fneg <8 x double> %102
%282 = fsub double %280, %103
%283 = fsub <8 x double> %281, %104
%284 = fadd double %95, %282
%285 = fadd <8 x double> %96, %283
%.repack2422 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 15, i32 0
store double %284, double* %.repack2422, align 8
%286 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 15, i32 1, i64 0
%287 = bitcast [8 x double]* %286 to <8 x double>*
store <8 x double> %285, <8 x double>* %287, align 8
%288 = fneg double %109
%289 = fneg <8 x double> %112
%.repack2425 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 16, i32 0
store double %288, double* %.repack2425, align 8
%290 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 16, i32 1, i64 0
%291 = bitcast [8 x double]* %290 to <8 x double>*
store <8 x double> %289, <8 x double>* %291, align 8
%.repack2428 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 17, i32 0
store double %109, double* %.repack2428, align 8
%292 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 17, i32 1, i64 0
%293 = bitcast [8 x double]* %292 to <8 x double>*
store <8 x double> %112, <8 x double>* %293, align 8
%294 = fneg double %115
%295 = fneg <8 x double> %116
%296 = fsub double %294, %117
%297 = fsub <8 x double> %295, %118
%298 = fsub double %296, %127
%299 = fsub <8 x double> %297, %130
%300 = fadd double %121, %298
%301 = fadd <8 x double> %124, %299
%302 = fadd double %300, %133
%303 = fadd <8 x double> %301, %134
%.repack2431 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 18, i32 0
store double %302, double* %.repack2431, align 8
%304 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %243, i64 18, i32 1, i64 0
%305 = bitcast [8 x double]* %304 to <8 x double>*
store <8 x double> %303, <8 x double>* %305, align 8
%306 = fsub double %127, %133
%307 = fsub <8 x double> %130, %134
%308 = load { double, [1 x [8 x double]] }*, { double, [1 x [8 x double]] }** %159, align 8
%.repack2434 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %308, i64 19, i32 0
store double %306, double* %.repack2434, align 8
%309 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %308, i64 19, i32 1, i64 0
%310 = bitcast [8 x double]* %309 to <8 x double>*
store <8 x double> %307, <8 x double>* %310, align 8
ret void
}

This PR:

define void @"julia_rhs!_2397"({}* nonnull align 16 dereferenceable(40) %0, {}* nonnull align 16 dereferenceable(40) %1, double %2) {
top:
%3 = bitcast {}* %1 to { double, [1 x [8 x double]] }**
%4 = load { double, [1 x [8 x double]] }*, { double, [1 x [8 x double]] }** %3, align 8
%.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 0, i32 0
%.unpack = load double, double* %.elt, align 8
%.unpack1798.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 0, i32 1, i64 0, i64 0
%5 = bitcast double* %.unpack1798.unpack.elt to <8 x double>*
%6 = load <8 x double>, <8 x double>* %5, align 8
%7 = fmul double %.unpack, 3.500000e-01
%res.i = fmul nsz contract <8 x double> %6, <double 3.500000e-01, double 3.500000e-01, double 3.500000e-01, double 3.500000e-01, double 3.500000e-01, double 3.500000e-01, double 3.500000e-01, double 3.500000e-01>
%.elt1815 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 1, i32 0
%.unpack1816 = load double, double* %.elt1815, align 8
%.unpack1818.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 1, i32 1, i64 0, i64 0
%8 = bitcast double* %.unpack1818.unpack.elt to <8 x double>*
%9 = load <8 x double>, <8 x double>* %8, align 8
%.elt1835 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 3, i32 0
%.unpack1836 = load double, double* %.elt1835, align 8
%.unpack1838.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 3, i32 1, i64 0, i64 0
%10 = bitcast double* %.unpack1838.unpack.elt to <8 x double>*
%11 = load <8 x double>, <8 x double>* %10, align 8
%12 = fmul double %.unpack1816, 2.660000e+01
%res.i1796 = fmul nsz contract <8 x double> %9, <double 2.660000e+01, double 2.660000e+01, double 2.660000e+01, double 2.660000e+01, double 2.660000e+01, double 2.660000e+01, double 2.660000e+01, double 2.660000e+01>
%13 = fmul double %12, %.unpack1836
%el1.i1790 = insertelement <8 x double> undef, double %.unpack1836, i32 0
%afactor.i1791 = shufflevector <8 x double> %el1.i1790, <8 x double> undef, <8 x i32> zeroinitializer
%el2.i1792 = insertelement <8 x double> undef, double %12, i32 0
%bfactor.i1793 = shufflevector <8 x double> %el2.i1792, <8 x double> undef, <8 x i32> zeroinitializer
%tmp.i1794 = fmul nsz contract <8 x double> %bfactor.i1793, %11
%res.i1795 = call nsz contract <8 x double> @llvm.fmuladd.v8f64(<8 x double> %res.i1796, <8 x double> %afactor.i1791, <8 x double> %tmp.i1794)
%.elt1855 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 4, i32 0
%.unpack1856 = load double, double* %.elt1855, align 8
%.unpack1858.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 4, i32 1, i64 0, i64 0
%14 = bitcast double* %.unpack1858.unpack.elt to <8 x double>*
%15 = load <8 x double>, <8 x double>* %14, align 8
%16 = fmul double %.unpack1856, 1.230000e+04
%res.i1789 = fmul nsz contract <8 x double> %15, <double 1.230000e+04, double 1.230000e+04, double 1.230000e+04, double 1.230000e+04, double 1.230000e+04, double 1.230000e+04, double 1.230000e+04, double 1.230000e+04>
%17 = fmul double %16, %.unpack1816
%el1.i1783 = insertelement <8 x double> undef, double %.unpack1816, i32 0
%afactor.i1784 = shufflevector <8 x double> %el1.i1783, <8 x double> undef, <8 x i32> zeroinitializer
%el2.i1785 = insertelement <8 x double> undef, double %16, i32 0
%bfactor.i1786 = shufflevector <8 x double> %el2.i1785, <8 x double> undef, <8 x i32> zeroinitializer
%tmp.i1787 = fmul nsz contract <8 x double> %bfactor.i1786, %9
%res.i1788 = call nsz contract <8 x double> @llvm.fmuladd.v8f64(<8 x double> %res.i1789, <8 x double> %afactor.i1784, <8 x double> %tmp.i1787)
%.elt1895 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 6, i32 0
%.unpack1896 = load double, double* %.elt1895, align 8
%.unpack1898.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 6, i32 1, i64 0, i64 0
%18 = bitcast double* %.unpack1898.unpack.elt to <8 x double>*
%19 = load <8 x double>, <8 x double>* %18, align 8
%20 = fmul double %.unpack1896, 0x3F4C2E33EFF19503
%res.i1782 = fmul nsz contract <8 x double> %19, <double 0x3F4C2E33EFF19503, double 0x3F4C2E33EFF19503, double 0x3F4C2E33EFF19503, double 0x3F4C2E33EFF19503, double 0x3F4C2E33EFF19503, double 0x3F4C2E33EFF19503, double 0x3F4C2E33EFF19503, double 0x3F4C2E33EFF19503>
%21 = fmul double %.unpack1896, 0x3F4ADEA897635E74
%res.i1781 = fmul nsz contract <8 x double> %19, <double 0x3F4ADEA897635E74, double 0x3F4ADEA897635E74, double 0x3F4ADEA897635E74, double 0x3F4ADEA897635E74, double 0x3F4ADEA897635E74, double 0x3F4ADEA897635E74, double 0x3F4ADEA897635E74, double 0x3F4ADEA897635E74>
%.elt1955 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 5, i32 0
%.unpack1956 = load double, double* %.elt1955, align 8
%.unpack1958.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 5, i32 1, i64 0, i64 0
%22 = bitcast double* %.unpack1958.unpack.elt to <8 x double>*
%23 = load <8 x double>, <8 x double>* %22, align 8
%24 = fmul double %.unpack1896, 1.500000e+04
%res.i1780 = fmul nsz contract <8 x double> %19, <double 1.500000e+04, double 1.500000e+04, double 1.500000e+04, double 1.500000e+04, double 1.500000e+04, double 1.500000e+04, double 1.500000e+04, double 1.500000e+04>
%25 = fmul double %24, %.unpack1956
%el1.i1774 = insertelement <8 x double> undef, double %.unpack1956, i32 0
%afactor.i1775 = shufflevector <8 x double> %el1.i1774, <8 x double> undef, <8 x i32> zeroinitializer
%el2.i1776 = insertelement <8 x double> undef, double %24, i32 0
%bfactor.i1777 = shufflevector <8 x double> %el2.i1776, <8 x double> undef, <8 x i32> zeroinitializer
%tmp.i1778 = fmul nsz contract <8 x double> %bfactor.i1777, %23
%res.i1779 = call nsz contract <8 x double> @llvm.fmuladd.v8f64(<8 x double> %res.i1780, <8 x double> %afactor.i1775, <8 x double> %tmp.i1778)
%.elt1975 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 8, i32 0
%.unpack1976 = load double, double* %.elt1975, align 8
%.unpack1978.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 8, i32 1, i64 0, i64 0
%26 = bitcast double* %.unpack1978.unpack.elt to <8 x double>*
%27 = load <8 x double>, <8 x double>* %26, align 8
%28 = fmul double %.unpack1976, 1.300000e-04
%res.i1773 = fmul nsz contract <8 x double> %27, <double 1.300000e-04, double 1.300000e-04, double 1.300000e-04, double 1.300000e-04, double 1.300000e-04, double 1.300000e-04, double 1.300000e-04, double 1.300000e-04>
%29 = fmul double %.unpack1976, 2.400000e+04
%res.i1772 = fmul nsz contract <8 x double> %27, <double 2.400000e+04, double 2.400000e+04, double 2.400000e+04, double 2.400000e+04, double 2.400000e+04, double 2.400000e+04, double 2.400000e+04, double 2.400000e+04>
%30 = fmul double %29, %.unpack1956
%el2.i1768 = insertelement <8 x double> undef, double %29, i32 0
%bfactor.i1769 = shufflevector <8 x double> %el2.i1768, <8 x double> undef, <8 x i32> zeroinitializer
%tmp.i1770 = fmul nsz contract <8 x double> %bfactor.i1769, %23
%res.i1771 = call nsz contract <8 x double> @llvm.fmuladd.v8f64(<8 x double> %res.i1772, <8 x double> %afactor.i1775, <8 x double> %tmp.i1770)
%.elt2035 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 10, i32 0
%.unpack2036 = load double, double* %.elt2035, align 8
%.unpack2038.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 10, i32 1, i64 0, i64 0
%31 = bitcast double* %.unpack2038.unpack.elt to <8 x double>*
%32 = load <8 x double>, <8 x double>* %31, align 8
%33 = fmul double %.unpack2036, 1.650000e+04
%res.i1765 = fmul nsz contract <8 x double> %32, <double 1.650000e+04, double 1.650000e+04, double 1.650000e+04, double 1.650000e+04, double 1.650000e+04, double 1.650000e+04, double 1.650000e+04, double 1.650000e+04>
%34 = fmul double %33, %.unpack1816
%el2.i1761 = insertelement <8 x double> undef, double %33, i32 0
%bfactor.i1762 = shufflevector <8 x double> %el2.i1761, <8 x double> undef, <8 x i32> zeroinitializer
%tmp.i1763 = fmul nsz contract <8 x double> %bfactor.i1762, %9
%res.i1764 = call nsz contract <8 x double> @llvm.fmuladd.v8f64(<8 x double> %res.i1765, <8 x double> %afactor.i1784, <8 x double> %tmp.i1763)
%35 = fmul double %.unpack2036, 9.000000e+03
%res.i1758 = fmul nsz contract <8 x double> %32, <double 9.000000e+03, double 9.000000e+03, double 9.000000e+03, double 9.000000e+03, double 9.000000e+03, double 9.000000e+03, double 9.000000e+03, double 9.000000e+03>
%36 = fmul double %35, %.unpack
%el1.i1752 = insertelement <8 x double> undef, double %.unpack, i32 0
%afactor.i1753 = shufflevector <8 x double> %el1.i1752, <8 x double> undef, <8 x i32> zeroinitializer
%el2.i1754 = insertelement <8 x double> undef, double %35, i32 0
%bfactor.i1755 = shufflevector <8 x double> %el2.i1754, <8 x double> undef, <8 x i32> zeroinitializer
%tmp.i1756 = fmul nsz contract <8 x double> %bfactor.i1755, %6
%res.i1757 = call nsz contract <8 x double> @llvm.fmuladd.v8f64(<8 x double> %res.i1758, <8 x double> %afactor.i1753, <8 x double> %tmp.i1756)
%.elt2115 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 12, i32 0
%.unpack2116 = load double, double* %.elt2115, align 8
%.unpack2118.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 12, i32 1, i64 0, i64 0
%37 = bitcast double* %.unpack2118.unpack.elt to <8 x double>*
%38 = load <8 x double>, <8 x double>* %37, align 8
%39 = fmul double %.unpack2116, 2.200000e-02
%res.i1751 = fmul nsz contract <8 x double> %38, <double 2.200000e-02, double 2.200000e-02, double 2.200000e-02, double 2.200000e-02, double 2.200000e-02, double 2.200000e-02, double 2.200000e-02, double 2.200000e-02>
%.elt2135 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 9, i32 0
%.unpack2136 = load double, double* %.elt2135, align 8
%.unpack2138.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 9, i32 1, i64 0, i64 0
%40 = bitcast double* %.unpack2138.unpack.elt to <8 x double>*
%41 = load <8 x double>, <8 x double>* %40, align 8
%42 = fmul double %.unpack2136, 1.200000e+04
%res.i1750 = fmul nsz contract <8 x double> %41, <double 1.200000e+04, double 1.200000e+04, double 1.200000e+04, double 1.200000e+04, double 1.200000e+04, double 1.200000e+04, double 1.200000e+04, double 1.200000e+04>
%43 = fmul double %42, %.unpack1816
%el2.i1746 = insertelement <8 x double> undef, double %42, i32 0
%bfactor.i1747 = shufflevector <8 x double> %el2.i1746, <8 x double> undef, <8 x i32> zeroinitializer
%tmp.i1748 = fmul nsz contract <8 x double> %bfactor.i1747, %9
%res.i1749 = call nsz contract <8 x double> @llvm.fmuladd.v8f64(<8 x double> %res.i1750, <8 x double> %afactor.i1784, <8 x double> %tmp.i1748)
%.elt2175 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 13, i32 0
%.unpack2176 = load double, double* %.elt2175, align 8
%.unpack2178.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 13, i32 1, i64 0, i64 0
%44 = bitcast double* %.unpack2178.unpack.elt to <8 x double>*
%45 = load <8 x double>, <8 x double>* %44, align 8
%46 = fmul double %.unpack2176, 1.880000e+00
%res.i1743 = fmul nsz contract <8 x double> %45, <double 1.880000e+00, double 1.880000e+00, double 1.880000e+00, double 1.880000e+00, double 1.880000e+00, double 1.880000e+00, double 1.880000e+00, double 1.880000e+00>
%47 = fmul double %.unpack, 1.630000e+04
%res.i1742 = fmul nsz contract <8 x double> %6, <double 1.630000e+04, double 1.630000e+04, double 1.630000e+04, double 1.630000e+04, double 1.630000e+04, double 1.630000e+04, double 1.630000e+04, double 1.630000e+04>
%48 = fmul double %47, %.unpack1956
%el2.i1738 = insertelement <8 x double> undef, double %47, i32 0
%bfactor.i1739 = shufflevector <8 x double> %el2.i1738, <8 x double> undef, <8 x i32> zeroinitializer
%tmp.i1740 = fmul nsz contract <8 x double> %bfactor.i1739, %23
%res.i1741 = call nsz contract <8 x double> @llvm.fmuladd.v8f64(<8 x double> %res.i1742, <8 x double> %afactor.i1775, <8 x double> %tmp.i1740)
%.elt2235 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 2, i32 0
%.unpack2236 = load double, double* %.elt2235, align 8
%.unpack2238.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 2, i32 1, i64 0, i64 0
%49 = bitcast double* %.unpack2238.unpack.elt to <8 x double>*
%50 = load <8 x double>, <8 x double>* %49, align 8
%51 = fmul double %.unpack2236, 4.800000e+06
%res.i1735 = fmul nsz contract <8 x double> %50, <double 4.800000e+06, double 4.800000e+06, double 4.800000e+06, double 4.800000e+06, double 4.800000e+06, double 4.800000e+06, double 4.800000e+06, double 4.800000e+06>
%52 = fmul double %.unpack1836, 3.500000e-04
%res.i1734 = fmul nsz contract <8 x double> %11, <double 3.500000e-04, double 3.500000e-04, double 3.500000e-04, double 3.500000e-04, double 3.500000e-04, double 3.500000e-04, double 3.500000e-04, double 3.500000e-04>
%53 = fmul double %.unpack1836, 1.750000e-02
%res.i1733 = fmul nsz contract <8 x double> %11, <double 1.750000e-02, double 1.750000e-02, double 1.750000e-02, double 1.750000e-02, double 1.750000e-02, double 1.750000e-02, double 1.750000e-02, double 1.750000e-02>
%.elt2295 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 15, i32 0
%.unpack2296 = load double, double* %.elt2295, align 8
%.unpack2298.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 15, i32 1, i64 0, i64 0
%54 = bitcast double* %.unpack2298.unpack.elt to <8 x double>*
%55 = load <8 x double>, <8 x double>* %54, align 8
%56 = fmul double %.unpack2296, 1.000000e+08
%res.i1732 = fmul nsz contract <8 x double> %55, <double 1.000000e+08, double 1.000000e+08, double 1.000000e+08, double 1.000000e+08, double 1.000000e+08, double 1.000000e+08, double 1.000000e+08, double 1.000000e+08>
%57 = fmul double %.unpack2296, 4.440000e+11
%res.i1731 = fmul nsz contract <8 x double> %55, <double 4.440000e+11, double 4.440000e+11, double 4.440000e+11, double 4.440000e+11, double 4.440000e+11, double 4.440000e+11, double 4.440000e+11, double 4.440000e+11>
%.elt2335 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 16, i32 0
%.unpack2336 = load double, double* %.elt2335, align 8
%.unpack2338.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 16, i32 1, i64 0, i64 0
%58 = bitcast double* %.unpack2338.unpack.elt to <8 x double>*
%59 = load <8 x double>, <8 x double>* %58, align 8
%60 = fmul double %.unpack2336, 1.240000e+03
%res.i1730 = fmul nsz contract <8 x double> %59, <double 1.240000e+03, double 1.240000e+03, double 1.240000e+03, double 1.240000e+03, double 1.240000e+03, double 1.240000e+03, double 1.240000e+03, double 1.240000e+03>
%61 = fmul double %60, %.unpack1956
%el2.i1726 = insertelement <8 x double> undef, double %60, i32 0
%bfactor.i1727 = shufflevector <8 x double> %el2.i1726, <8 x double> undef, <8 x i32> zeroinitializer
%tmp.i1728 = fmul nsz contract <8 x double> %bfactor.i1727, %23
%res.i1729 = call nsz contract <8 x double> @llvm.fmuladd.v8f64(<8 x double> %res.i1730, <8 x double> %afactor.i1775, <8 x double> %tmp.i1728)
%.elt2375 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 18, i32 0
%.unpack2376 = load double, double* %.elt2375, align 8
%.unpack2378.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 18, i32 1, i64 0, i64 0
%62 = bitcast double* %.unpack2378.unpack.elt to <8 x double>*
%63 = load <8 x double>, <8 x double>* %62, align 8
%64 = fmul double %.unpack2376, 2.100000e+00
%res.i1723 = fmul nsz contract <8 x double> %63, <double 2.100000e+00, double 2.100000e+00, double 2.100000e+00, double 2.100000e+00, double 2.100000e+00, double 2.100000e+00, double 2.100000e+00, double 2.100000e+00>
%65 = fmul double %.unpack2376, 5.780000e+00
%res.i1722 = fmul nsz contract <8 x double> %63, <double 5.780000e+00, double 5.780000e+00, double 5.780000e+00, double 5.780000e+00, double 5.780000e+00, double 5.780000e+00, double 5.780000e+00, double 5.780000e+00>
%66 = fmul double %.unpack, 4.740000e-02
%res.i1721 = fmul nsz contract <8 x double> %6, <double 4.740000e-02, double 4.740000e-02, double 4.740000e-02, double 4.740000e-02, double 4.740000e-02, double 4.740000e-02, double 4.740000e-02, double 4.740000e-02>
%67 = fmul double %66, %.unpack1836
%el2.i1717 = insertelement <8 x double> undef, double %66, i32 0
%bfactor.i1718 = shufflevector <8 x double> %el2.i1717, <8 x double> undef, <8 x i32> zeroinitializer
%tmp.i1719 = fmul nsz contract <8 x double> %bfactor.i1718, %11
%res.i1720 = call nsz contract <8 x double> @llvm.fmuladd.v8f64(<8 x double> %res.i1721, <8 x double> %afactor.i1791, <8 x double> %tmp.i1719)
%68 = fmul double %.unpack2376, 1.780000e+03
%res.i1714 = fmul nsz contract <8 x double> %63, <double 1.780000e+03, double 1.780000e+03, double 1.780000e+03, double 1.780000e+03, double 1.780000e+03, double 1.780000e+03, double 1.780000e+03, double 1.780000e+03>
%69 = fmul double %68, %.unpack
%el2.i = insertelement <8 x double> undef, double %68, i32 0
%bfactor.i = shufflevector <8 x double> %el2.i, <8 x double> undef, <8 x i32> zeroinitializer
%tmp.i = fmul nsz contract <8 x double> %bfactor.i, %6
%res.i1713 = call nsz contract <8 x double> @llvm.fmuladd.v8f64(<8 x double> %res.i1714, <8 x double> %afactor.i1753, <8 x double> %tmp.i)
%.elt2495 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 19, i32 0
%.unpack2496 = load double, double* %.elt2495, align 8
%.unpack2498.unpack.elt = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %4, i64 19, i32 1, i64 0, i64 0
%70 = bitcast double* %.unpack2498.unpack.elt to <8 x double>*
%71 = load <8 x double>, <8 x double>* %70, align 8
%72 = fmul double %.unpack2496, 3.120000e+00
%res.i1712 = fmul nsz contract <8 x double> %71, <double 3.120000e+00, double 3.120000e+00, double 3.120000e+00, double 3.120000e+00, double 3.120000e+00, double 3.120000e+00, double 3.120000e+00, double 3.120000e+00>
%73 = fneg double %7
%74 = fsub double %73, %36
%75 = fadd nsz contract <8 x double> %res.i, %res.i1757
%76 = fsub double %74, %48
%77 = fadd nsz contract <8 x double> %75, %res.i1741
%78 = fsub double %76, %67
%79 = fadd nsz contract <8 x double> %77, %res.i1720
%80 = fsub double %78, %69
%81 = fadd nsz contract <8 x double> %79, %res.i1713
%82 = fadd double %13, %80
%res.i1706 = fsub nsz contract <8 x double> %res.i1795, %81
%83 = fadd double %17, %82
%res.i1705 = fadd nsz contract <8 x double> %res.i1788, %res.i1706
%84 = fadd double %34, %83
%res.i1704 = fadd nsz contract <8 x double> %res.i1764, %res.i1705
%85 = fadd double %39, %84
%res.i1703 = fadd nsz contract <8 x double> %res.i1751, %res.i1704
%86 = fadd double %43, %85
%res.i1702 = fadd nsz contract <8 x double> %res.i1749, %res.i1703
%87 = fadd double %65, %86
%res.i1701 = fadd nsz contract <8 x double> %res.i1722, %res.i1702
%88 = fadd double %87, %72
%res.i1700 = fadd nsz contract <8 x double> %res.i1701, %res.i1712
%89 = bitcast {}* %0 to { double, [1 x [8 x double]] }**
%90 = load { double, [1 x [8 x double]] }*, { double, [1 x [8 x double]] }** %89, align 8
%.repack = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %90, i64 0, i32 0
store double %88, double* %.repack, align 8
%91 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %90, i64 0, i32 1, i64 0
%92 = bitcast [8 x double]* %91 to <8 x double>*
store <8 x double> %res.i1700, <8 x double>* %92, align 8
%93 = fneg double %13
%94 = fsub double %93, %17
%95 = fadd nsz contract <8 x double> %res.i1795, %res.i1788
%96 = fsub double %94, %34
%97 = fadd nsz contract <8 x double> %95, %res.i1764
%98 = fsub double %96, %43
%99 = fadd nsz contract <8 x double> %97, %res.i1749
%100 = fadd double %7, %98
%res.i1695 = fsub nsz contract <8 x double> %res.i, %99
%101 = fadd double %100, %64
%res.i1694 = fadd nsz contract <8 x double> %res.i1695, %res.i1723
%.repack2517 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %90, i64 1, i32 0
store double %101, double* %.repack2517, align 8
%102 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %90, i64 1, i32 1, i64 0
%103 = bitcast [8 x double]* %102 to <8 x double>*
store <8 x double> %res.i1694, <8 x double>* %103, align 8
%104 = fsub double %7, %51
%res.i1692 = fsub nsz contract <8 x double> %res.i, %res.i1735
%105 = fadd double %104, %53
%res.i1691 = fadd nsz contract <8 x double> %res.i1692, %res.i1733
%106 = fadd double %105, %57
%res.i1690 = fadd nsz contract <8 x double> %res.i1691, %res.i1731
%107 = fadd double %106, %65
%res.i1689 = fadd nsz contract <8 x double> %res.i1690, %res.i1722
%.repack2520 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %90, i64 2, i32 0
store double %107, double* %.repack2520, align 8
%108 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %90, i64 2, i32 1, i64 0
%109 = bitcast [8 x double]* %108 to <8 x double>*
store <8 x double> %res.i1689, <8 x double>* %109, align 8
%110 = fsub double %93, %52
%111 = fadd nsz contract <8 x double> %res.i1795, %res.i1734
%112 = fsub double %110, %53
%113 = fadd nsz contract <8 x double> %111, %res.i1733
%114 = fsub double %112, %67
%115 = fadd nsz contract <8 x double> %113, %res.i1720
%116 = fadd double %51, %114
%res.i1684 = fsub nsz contract <8 x double> %res.i1735, %115
%.repack2523 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %90, i64 3, i32 0
store double %116, double* %.repack2523, align 8
%117 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %90, i64 3, i32 1, i64 0
%118 = bitcast [8 x double]* %117 to <8 x double>*
store <8 x double> %res.i1684, <8 x double>* %118, align 8
%119 = fsub double %20, %17
%res.i1682 = fsub nsz contract <8 x double> %res.i1782, %res.i1788
%120 = fadd double %20, %119
%res.i1681 = fadd nsz contract <8 x double> %res.i1782, %res.i1682
%121 = fadd double %120, %25
%res.i1680 = fadd nsz contract <8 x double> %res.i1681, %res.i1779
%122 = fadd double %121, %28
%res.i1679 = fadd nsz contract <8 x double> %res.i1680, %res.i1773
%123 = fadd double %122, %46
%res.i1678 = fadd nsz contract <8 x double> %res.i1679, %res.i1743
%124 = fadd double %123, %61
%res.i1677 = fadd nsz contract <8 x double> %res.i1678, %res.i1729
%.repack2526 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %90, i64 4, i32 0
store double %124, double* %.repack2526, align 8
%125 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %90, i64 4, i32 1, i64 0
%126 = bitcast [8 x double]* %125 to <8 x double>*
store <8 x double> %res.i1677, <8 x double>* %126, align 8
%127 = fneg double %25
%128 = fsub double %127, %30
%129 = fadd nsz contract <8 x double> %res.i1779, %res.i1771
%130 = fsub double %128, %48
%131 = fadd nsz contract <8 x double> %129, %res.i1741
%132 = fsub double %130, %61
%133 = fadd nsz contract <8 x double> %131, %res.i1729
%134 = fadd double %17, %132
%res.i1672 = fsub nsz contract <8 x double> %res.i1788, %133
%135 = fadd double %56, %134
%res.i1671 = fadd nsz contract <8 x double> %res.i1732, %res.i1672
%136 = fadd double %56, %135
%res.i1670 = fadd nsz contract <8 x double> %res.i1732, %res.i1671
%.repack2529 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %90, i64 5, i32 0
store double %136, double* %.repack2529, align 8
%137 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %90, i64 5, i32 1, i64 0
%138 = bitcast [8 x double]* %137 to <8 x double>*
store <8 x double> %res.i1670, <8 x double>* %138, align 8
%139 = fneg double %20
%140 = fsub double %139, %21
%141 = fadd nsz contract <8 x double> %res.i1782, %res.i1781
%142 = fsub double %140, %25
%143 = fadd nsz contract <8 x double> %141, %res.i1779
%144 = fadd double %142, %46
%res.i1666 = fsub nsz contract <8 x double> %res.i1743, %143
%.repack2532 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %90, i64 6, i32 0
store double %144, double* %.repack2532, align 8
%145 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %90, i64 6, i32 1, i64 0
%146 = bitcast [8 x double]* %145 to <8 x double>*
store <8 x double> %res.i1666, <8 x double>* %146, align 8
%147 = fadd double %20, %21
%148 = fadd double %147, %25
%149 = fadd double %148, %28
%res.i1663 = fadd nsz contract <8 x double> %143, %res.i1773
%.repack2535 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %90, i64 7, i32 0
store double %149, double* %.repack2535, align 8
%150 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %90, i64 7, i32 1, i64 0
%151 = bitcast [8 x double]* %150 to <8 x double>*
store <8 x double> %res.i1663, <8 x double>* %151, align 8
%152 = fneg double %28
%153 = fsub double %152, %30
%154 = fadd nsz contract <8 x double> %res.i1773, %res.i1771
%res.i1661 = fneg nsz contract <8 x double> %154
%155 = load { double, [1 x [8 x double]] }*, { double, [1 x [8 x double]] }** %89, align 8
%.repack2538 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 8, i32 0
store double %153, double* %.repack2538, align 8
%156 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 8, i32 1, i64 0
%157 = bitcast [8 x double]* %156 to <8 x double>*
store <8 x double> %res.i1661, <8 x double>* %157, align 8
%158 = fsub double %28, %43
%res.i1659 = fsub nsz contract <8 x double> %res.i1773, %res.i1749
%159 = fadd double %34, %158
%res.i1658 = fadd nsz contract <8 x double> %res.i1764, %res.i1659
%.repack2541 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 9, i32 0
store double %159, double* %.repack2541, align 8
%160 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 9, i32 1, i64 0
%161 = bitcast [8 x double]* %160 to <8 x double>*
store <8 x double> %res.i1658, <8 x double>* %161, align 8
%162 = fneg double %34
%163 = fsub double %162, %36
%164 = fadd nsz contract <8 x double> %res.i1764, %res.i1757
%165 = fadd double %30, %163
%res.i1655 = fsub nsz contract <8 x double> %res.i1771, %164
%166 = fadd double %165, %39
%res.i1654 = fadd nsz contract <8 x double> %res.i1655, %res.i1751
%.repack2544 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 10, i32 0
store double %166, double* %.repack2544, align 8
%167 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 10, i32 1, i64 0
%168 = bitcast [8 x double]* %167 to <8 x double>*
store <8 x double> %res.i1654, <8 x double>* %168, align 8
%.repack2547 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 11, i32 0
store double %34, double* %.repack2547, align 8
%169 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 11, i32 1, i64 0
%170 = bitcast [8 x double]* %169 to <8 x double>*
store <8 x double> %res.i1764, <8 x double>* %170, align 8
%171 = fsub double %36, %39
%res.i1652 = fsub nsz contract <8 x double> %res.i1757, %res.i1751
%.repack2550 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 12, i32 0
store double %171, double* %.repack2550, align 8
%172 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 12, i32 1, i64 0
%173 = bitcast [8 x double]* %172 to <8 x double>*
store <8 x double> %res.i1652, <8 x double>* %173, align 8
%174 = fsub double %43, %46
%res.i1650 = fsub nsz contract <8 x double> %res.i1749, %res.i1743
%.repack2553 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 13, i32 0
store double %174, double* %.repack2553, align 8
%175 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 13, i32 1, i64 0
%176 = bitcast [8 x double]* %175 to <8 x double>*
store <8 x double> %res.i1650, <8 x double>* %176, align 8
%.repack2556 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 14, i32 0
store double %48, double* %.repack2556, align 8
%177 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 14, i32 1, i64 0
%178 = bitcast [8 x double]* %177 to <8 x double>*
store <8 x double> %res.i1741, <8 x double>* %178, align 8
%179 = fneg double %56
%180 = fsub double %179, %57
%181 = fadd nsz contract <8 x double> %res.i1732, %res.i1731
%182 = fadd double %52, %180
%res.i1647 = fsub nsz contract <8 x double> %res.i1734, %181
%.repack2559 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 15, i32 0
store double %182, double* %.repack2559, align 8
%183 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 15, i32 1, i64 0
%184 = bitcast [8 x double]* %183 to <8 x double>*
store <8 x double> %res.i1647, <8 x double>* %184, align 8
%185 = fneg double %61
%res.i1646 = fneg nsz contract <8 x double> %res.i1729
%.repack2562 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 16, i32 0
store double %185, double* %.repack2562, align 8
%186 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 16, i32 1, i64 0
%187 = bitcast [8 x double]* %186 to <8 x double>*
store <8 x double> %res.i1646, <8 x double>* %187, align 8
%.repack2565 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 17, i32 0
store double %61, double* %.repack2565, align 8
%188 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 17, i32 1, i64 0
%189 = bitcast [8 x double]* %188 to <8 x double>*
store <8 x double> %res.i1729, <8 x double>* %189, align 8
%190 = fneg double %64
%191 = fsub double %190, %65
%192 = fadd nsz contract <8 x double> %res.i1723, %res.i1722
%193 = fsub double %191, %69
%194 = fadd nsz contract <8 x double> %192, %res.i1713
%195 = fadd double %67, %193
%res.i1642 = fsub nsz contract <8 x double> %res.i1720, %194
%196 = fadd double %195, %72
%res.i1641 = fadd nsz contract <8 x double> %res.i1642, %res.i1712
%.repack2568 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 18, i32 0
store double %196, double* %.repack2568, align 8
%197 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 18, i32 1, i64 0
%198 = bitcast [8 x double]* %197 to <8 x double>*
store <8 x double> %res.i1641, <8 x double>* %198, align 8
%199 = fsub double %69, %72
%res.i1639 = fsub nsz contract <8 x double> %res.i1713, %res.i1712
%.repack2571 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 19, i32 0
store double %199, double* %.repack2571, align 8
%200 = getelementptr inbounds { double, [1 x [8 x double]] }, { double, [1 x [8 x double]] }* %155, i64 19, i32 1, i64 0
%201 = bitcast [8 x double]* %200 to <8 x double>*
store <8 x double> %res.i1639, <8 x double>* %201, align 8
ret void
}
There are many more named variables (instead of just |
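For orientation, the recurring `insertelement` / `shufflevector` / `fmul` / `llvm.fmuladd.v8f64` group in the IR for this PR is the product rule applied to two sets of partials: each set is scaled by a broadcast scalar and the two results are summed. Below is a minimal Julia sketch of that arithmetic, assuming the partials are stored as a tuple of `VecElement`s. The names `Vec` and `mul_partials` are hypothetical, and this is not the PR's actual implementation, which may use a different mechanism to emit the vector instructions.

```julia
# Hypothetical alias: N partials of element type T stored as VecElements.
const Vec{N,T} = NTuple{N,VecElement{T}}

# Product rule on partials: a .* afactor .+ b .* bfactor, written with muladd
# so that it mirrors the fmul + fmuladd pair seen in the IR.
function mul_partials(a::Vec{N,T}, b::Vec{N,T}, afactor::T, bfactor::T) where {N,T}
    return ntuple(Val(N)) do i
        # VecElement wraps a single scalar in its `value` field.
        VecElement(muladd(a[i].value, afactor, b[i].value * bfactor))
    end
end

a = (VecElement(1.0), VecElement(2.0))
b = (VecElement(3.0), VecElement(4.0))
res = mul_partials(a, b, 2.0, 0.5)
```

Plain numbers support `+` and `*` directly, whereas a `VecElement` has to be unwrapped through `.value` first, which is why the sketch unwraps and rewraps every element.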
The difference is large when the chunk sizes are not a power of 2. |
On this PR there are a bunch of |
How do you measure this? Just so we do the same. |
You can test by using this in the script:
cfg = ForwardDiff.JacobianConfig(f!, du, u0, ForwardDiff.Chunk(5));
@time ForwardDiff.jacobian!(J, f!, du, u0, cfg);
@btime ForwardDiff.jacobian!($J, $f!, $du, $u0, $cfg);
I tested at a few different sizes, and at the very least, it did not seem to have a beneficial impact on runtime or compile time performance at the chunk sizes I tested (even though I naively thought the LLVM looked better at a glance). |
I meant how you measure the number of instructions. Are you using Cthulhu to step in or just directly calling the function with dual numbers? |
Sorry, I apparently switched ForwardDiff commits in between my comments from 5 and 1 hour ago.
julia> @time ForwardDiff.jacobian!(J, f!, du, u0, cfg);
0.861934 seconds (4.71 M allocations: 253.155 MiB, 8.06% gc time, 99.99% compilation time)
julia> @btime ForwardDiff.jacobian!($J, $f!, $du, $u0, $cfg);
2.163 μs (0 allocations: 0 bytes)
this PR:
julia> @time ForwardDiff.jacobian!(J, f!, du, u0, cfg);
0.728831 seconds (3.94 M allocations: 221.083 MiB, 9.85% gc time, 99.99% compilation time)
julia> @btime ForwardDiff.jacobian!($J, $f!, $du, $u0, $cfg);
1.164 μs (0 allocations: 0 bytes)
Lines of LLVM are 723 vs 396 for me.
%93 = fmul double %.unpack1561, 0x3F4C2E33EFF19503
%94 = extractelement <4 x double> %90, i32 0
%95 = insertelement <7 x double> undef, double %94, i32 0
%96 = extractelement <4 x double> %90, i32 1
%97 = insertelement <7 x double> %95, double %96, i32 1
%98 = extractelement <4 x double> %90, i32 2
%99 = insertelement <7 x double> %97, double %98, i32 2
%100 = extractelement <4 x double> %90, i32 3
%101 = insertelement <7 x double> %99, double %100, i32 3
%102 = extractelement <2 x double> %92, i32 0
%103 = insertelement <7 x double> %101, double %102, i32 4
%104 = extractelement <2 x double> %92, i32 1
%105 = insertelement <7 x double> %103, double %104, i32 5
%106 = insertelement <7 x double> %105, double %.unpack1563.unpack.unpack1576, i32 6
The actual assembly doesn't look nearly so bad, and uiCA predicts a much smaller difference than I observe:
Cthulhu. To count the number of lines, I copy/pasted into an editor. EDIT: |
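Counting lines by pasting into an editor can also be scripted. Below is a minimal sketch using the standard-library `InteractiveUtils.code_llvm`, assuming the function of interest can be called directly with concrete argument types. For code reached only through ForwardDiff internals, stepping in with Cthulhu is still needed. `llvm_line_count` and `axpy` are hypothetical names used only for illustration.

```julia
using InteractiveUtils  # standard library; provides code_llvm

# Hypothetical helper: number of non-empty lines in the optimized LLVM IR
# that Julia generates for `f` called with argument types `types`.
# The count includes the `define` line, the closing brace, and any comment lines.
function llvm_line_count(f, types)
    ir = sprint(io -> code_llvm(io, f, types; debuginfo=:none))
    return count(line -> !isempty(strip(line)), split(ir, '\n'))
end

# Toy stand-in for the function being inspected.
axpy(a, x, y) = muladd(a, x, y)

n = llvm_line_count(axpy, Tuple{Float64,Float64,Float64})
println("non-empty IR lines: ", n)
```

Running the same helper on two checkouts gives numbers that are comparable to each other, though not necessarily identical to a count made by hand, since header and comment lines are included.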
Only implemented enough so that the benchmark in #555 can be tested. Putting it up here in case people want to play with it.
Results of the benchmark in #555:
Branch:
PR:
The number of LLVM instructions after optimization didn't really seem to change. Compile-time seems to improve quite a bit though, likely due to more compact LLVM IR pre-optimization.
This is a bit annoying since VecElements behave differently from numbers in some ways: