diff --git a/README.rst b/README.rst
index 0601631fb..e74ab1cf3 100644
--- a/README.rst
+++ b/README.rst
@@ -87,9 +87,9 @@ Follow the links below to learn more!
 High performance simulators
 ------------------------------------------
 * `OpenFermion-FQE <https://github.com/quantumlib/OpenFermion-FQE>`__ is
-  a high performance emulator of fermionic quantum evolutions specified 
-  by a sequence of fermion operators, which can exploit fermionic 
-  symmetries such as spin and particle number. 
+  a high performance emulator of fermionic quantum evolutions specified
+  by a sequence of fermion operators, which can exploit fermionic
+  symmetries such as spin and particle number.
 
 Circuit compilation plugins
 ------------------------------------------
diff --git a/dev_tools/conf/.coveragerc b/dev_tools/conf/.coveragerc
index ea02f01f5..f431403ee 100644
--- a/dev_tools/conf/.coveragerc
+++ b/dev_tools/conf/.coveragerc
@@ -4,4 +4,4 @@
 # Failure to do so will result in false positives.
 omit =
     ./profiling/performance_benchmarks.py
-    ./dev_tools/*
\ No newline at end of file
+    ./dev_tools/*
diff --git a/src/openfermion/resource_estimates/README.md b/src/openfermion/resource_estimates/README.md
new file mode 100644
index 000000000..9889e6922
--- /dev/null
+++ b/src/openfermion/resource_estimates/README.md
@@ -0,0 +1,140 @@
+### Disclaimer: testing, dependencies, etc.
+
+Code is system tested on Debian GNU/Linux with Python 3.8.5. All the code comes with tests (use `pytest`), but not unit tested with GitHub worflows.
+
+Since the FT costing is closely tied to manipulation of molecular integrals (localization, active space selection, benchmarking against CCSD(T), ...) the code depends on [PySCF](https://pyscf.org/). Since we do not want to burden all OpenFermion users with this dependency, testing is disabled in the GitHub workflow. Moreover, the `resource_estimates` functionality requires the dependencies
+
+```
+pyscf
+h5py~=3.3.0
+jax
+jaxlib
+```
+
+For THC factorization, it also requires [BTAS](https://github.com/ValeevGroup/BTAS) and the [PyBTAS](https://github.com/ncrubin/pybtas) wrapper, which require their own installation + depends.
+
+### Overview
+
+Module `openfermion.resource_estimates` to facilitate fault-tolerant (FT) resource estimates for chemical Hamiltonians.
+
+The following factorizations are included:
+* The [single](https://arxiv.org/abs/1902.02134) [factorization](https://arxiv.org/abs/1808.02625) (SF) method 
+* The [double factorization](https://arxiv.org/pdf/2007.14460) (DF) method
+* The [tensor hypercontraction](https://arxiv.org/abs/2011.03494) (THC) method
+
+For the methods listed above, there are sub-routines which:
+* factorize the two-electron integrals, `factorize()`
+* compute the lambda values, `compute_lambda()`
+* estimate the number of logical qubits and Toffoli gates required to simulate with this factorization, `compute_cost()`
+
+There are also some costing routines for the [sparse factorization](https://arxiv.org/abs/1902.02134), but this is a work in progress.
+
+### Details
+
+The philosophy for this new module is to rely on PySCF to generate, store, and manipulate molecular information. The data (integrals, etc) is stored as a PySCF mean-field (`mf`) object. As an example, one could input an ionized water molecule like so:
+
+```python
+from pyscf import gto, scf
+
+# input is just like any other PySCF script
+mol = gto.M(
+    atom = '''O    0.000000      -0.075791844    0.000000
+              H    0.866811829    0.601435779    0.000000
+              H   -0.866811829    0.601435779    0.000000
+           ''',
+    basis = 'augccpvtz',
+    symmetry = False,
+    charge = 1,
+    spin = 1
+)
+mf = scf.ROHF(mol)
+mf.verbose = 4
+mf.kernel()  # run the SCF
+```
+
+Then, given the `mf` object, `resource_estimates.molecule` has routines to further manipulate the molecule, such as testing for stability (and reoptimizing), as well as localizing orbitals and performing automated active space selection with [AVAS](https://pubs.acs.org/doi/10.1021/acs.jctc.7b00128). Continuing our example:
+
+```python
+from openfermion.resource_estimates.molecule import stability, localize, avas_active_space
+
+# make sure wave function is stable before we proceed
+mf = stability(mf)
+
+# localize before automatically selecting active space with AVAS
+mf = localize(mf, loc_type='pm')  # default is loc_type ='pm' (Pipek-Mezey)
+
+# you can use larger basis for `minao` to select non-valence...here select O 3s and 3p as well 
+mol, mf = avas_active_space(mf, ao_list=['H 1s', 'O 2s', 'O 2p', 'O 3s', 'O 3p'], minao='ccpvtz') 
+```
+
+In each case, the input is the mean-field `mf` object, and the output is a modified `mf` object. The `mf` object is not updated in-place, so it is possible to create additional copies in memory.
+
+At this point, we have a stable wave function, localized the orbitals, and selected an active space. At any point, the molecular Hamiltonian (e.g. active space) can be written out to HDF5 using `molecule.save_pyscf_to_casfile()`, or, if it exists, read in using `molecule.load_casfile_to_pyscf()`.
+
+Once an active space is selected/generated, costing is relatively straightforward. There are helper functions for the SF, DF, and THC factorization schemes that will make a nice table given some parameters. For example:
+
+```python
+from openfermion.resource_estimates import sf
+
+# make pretty SF costing table
+sf.generate_costing_table(mf, name='water', rank_range=[20,25,30,35,40,45,50])
+```
+which outputs to a file called `single_factorization_water.txt`, and contains:
+
+```
+ Single low rank factorization data for 'water'.
+    [*] using CAS((5a, 4b), 11o)
+        [+]                      E(SCF):       -75.63393088
+        [+] Active space CCSD(T) E(cor):        -0.08532629
+        [+] Active space CCSD(T) E(tot):       -75.71925716
+============================================================================================================
+     L          ||ERI - SF||       lambda      CCSD(T) error (mEh)       logical qubits       Toffoli count    
+------------------------------------------------------------------------------------------------------------
+     20          1.7637e-01        212.7              -2.97                   298                4.3e+08       
+     25          5.7546e-02        215.0               1.53                   298                4.7e+08       
+     30          2.9622e-02        216.1               0.11                   298                5.1e+08       
+     35          1.3728e-02        216.5              -0.07                   301                5.5e+08       
+     40          2.1439e-03        216.7               0.00                   460                5.8e+08       
+     45          2.8662e-04        216.8               0.00                   460                6.0e+08       
+     50          1.1826e-04        216.8               0.00                   460                6.2e+08       
+============================================================================================================
+```
+
+Note that the automated costing relies on error in CCSD(T)  - or CCSD, if desired - as the metric, so this may become a bottleneck for large active spaces.
+
+The philosophy is that all costing methods are captured in the namespace related to the type of factorization (e.g., . So if one wanted to repeat the costing for DF or THC factorizations, one could 
+
+```python
+from openfermion.resource_estimates import df, thc
+
+# make pretty DF costing table
+df.generate_costing_table(mf, name='water', thresh_range=[1e-2,5e-3,1e-3,5e-4,1e-4,5e-5,1e-5]) 
+
+# make pretty THC costing table
+# if you want to save each THC result to a file, you can set 'save_thc' to True
+thc.generate_costing_table(mf, name='water', nthc_range=[20,25,30,35,40,45,50], save_thc=False) 
+```
+
+Which generate similar outputs, e.g. the above would generate tables in `double_factorization_water.txt` and `thc_factorization_water.txt`. 
+
+More fine-grained control is given by subroutines that compute the factorization, the lambda values, and the cost estimates. For example, considering the double factorization, we could have
+
+```python
+factorized_eris, df_factors, _, _ = df.factorize(mf._eri, cutoff_threshhold)
+df_lambda  = df.compute_lambda(mf, df_factors)
+_, number_toffolis, num_logical_qubits = df.compute_cost(num_spin_orbitals, df_lambda, *args)
+```
+which, unlike the pretty tables above, require the user to handle and input several molecular quantities and intermediates, but at the gain of more functionality and control. Switching between factorization schemes is generally as easy as swapping out the namespace, for example to perform different factorizations on the ERIs, 
+
+```python
+sf.factorize()
+df.factorize()
+thc.factorize()
+```
+
+are all valid, as are the methods `compute_lambda()` and `compute_cost()` for the factorizations.
+
+
+For THC factorization, it also requires [BTAS](https://github.com/ValeevGroup/BTAS) and the [PyBTAS](https://github.com/ncrubin/pybtas) wrapper, which require their own installation + depends.
+
+Again, since we do not wish to burden all OpenFermion users with these dependencies, testing with GitHub workflows is disabled, but if you install the dependencies, running `pytest` should pass.
diff --git a/src/openfermion/resource_estimates/__init__.py b/src/openfermion/resource_estimates/__init__.py
new file mode 100644
index 000000000..d91d98e2f
--- /dev/null
+++ b/src/openfermion/resource_estimates/__init__.py
@@ -0,0 +1,19 @@
+#coverage:ignore
+#   Licensed under the Apache License, Version 2.0 (the "License");
+#   you may not use this file except in compliance with the License.
+#   You may obtain a copy of the License at
+#
+#       http://www.apache.org/licenses/LICENSE-2.0
+#
+#   Unless required by applicable law or agreed to in writing, software
+#   distributed under the License is distributed on an "AS IS" BASIS,
+#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#   See the License for the specific language governing permissions and
+#   limitations under the License.
+#coverage: ignore
+import pytest
+
+try:
+    import pyscf
+except ImportError:
+    pytest.skip('Need pyscf for resource estimates', allow_module_level=True)
diff --git a/src/openfermion/resource_estimates/df/__init__.py b/src/openfermion/resource_estimates/df/__init__.py
new file mode 100644
index 000000000..e8fbffdfc
--- /dev/null
+++ b/src/openfermion/resource_estimates/df/__init__.py
@@ -0,0 +1,19 @@
+#coverage:ignore
+#   Copyright 2020 Google LLC
+
+#   Licensed under the Apache License, Version 2.0 (the "License");
+#   you may not use this file except in compliance with the License.
+#   You may obtain a copy of the License at
+#
+#       http://www.apache.org/licenses/LICENSE-2.0
+#
+#   Unless required by applicable law or agreed to in writing, software
+#   distributed under the License is distributed on an "AS IS" BASIS,
+#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#   See the License for the specific language governing permissions and
+#   limitations under the License.
+
+from .compute_lambda_df import compute_lambda
+from .compute_cost_df import compute_cost
+from .factorize_df import factorize
+from .generate_costing_table_df import generate_costing_table
diff --git a/src/openfermion/resource_estimates/df/compute_cost_df.py b/src/openfermion/resource_estimates/df/compute_cost_df.py
new file mode 100644
index 000000000..53c778a44
--- /dev/null
+++ b/src/openfermion/resource_estimates/df/compute_cost_df.py
@@ -0,0 +1,217 @@
+#coverage:ignore
+""" Determine costs for DF decomposition in QC """
+from typing import Tuple
+import numpy as np
+from numpy.lib.scimath import arccos, arcsin  # has analytc continuation to cplx
+from openfermion.resource_estimates.utils import QR, QI, power_two
+
+
+def compute_cost(n: int,
+                 lam: float,
+                 dE: float,
+                 L: int,
+                 Lxi: int,
+                 chi: int,
+                 beta: int,
+                 stps: int,
+                 verbose: bool = False) -> Tuple[int, int, int]:
+    """ Determine fault-tolerant costs using DF decomposition in quantum chem
+
+    Args:
+        n (int) - the number of spin-orbitals
+        lam (float) - the lambda-value for the Hamiltonian
+        dE (float) - allowable error in phase estimation
+        L (int) - the rank of the first decomposition
+        Lxi (int) - the total number of eigenvectors
+        chi (int) - equivalent to aleph_1 and aleph_2 in the document, the
+            number of bits for the representation of the coefficients
+        beta (int) - equivalent to beth in the document, the number of bits
+            for the rotations
+        stps (int) - an approximate number of steps to choose the precision of
+            single qubit rotations in preparation of the equal superpositn state
+        verbose (bool) - do additional printing of intermediates?
+
+    Returns:
+        step_cost (int) - Toffolis per step
+        total_cost (int) - Total number of Toffolis
+        ancilla_cost (int) - Total ancilla cost
+    """
+
+    # The number of bits used for the second register.
+    nxi = np.ceil(np.log2(n // 2))
+
+    # The number of bits for the contiguous register.
+    nLxi = np.ceil(np.log2(Lxi + n // 2))
+
+    # The number of bits used for the first register.
+    nL = np.ceil(np.log2(L + 1))
+
+    # The power of 2 that is a factor of L + 1
+    eta = power_two(L + 1)
+
+    oh = [0] * 20
+    for p in range(20):
+        # JJG note: arccos arg may be > 1
+        v = np.round(np.power(2,p+1) / (2 * np.pi) * arccos(np.power(2,nL) /\
+            np.sqrt((L + 1)/2**eta)/2))
+        oh[p] = np.real(stps * (1 / (np.sin(3 * arcsin(np.cos(v * 2 * np.pi / \
+            np.power(2,p+1)) * \
+            np.sqrt((L + 1)/2**eta) / np.power(2,nL)))**2) - 1) + 4 * (p + 1))
+
+    # Bits of precision for rotation
+    br = int(np.argmin(oh) + 1)
+
+    # The following costs are from the list starting on page 50.
+
+    # The cost for preparing an equal superposition for preparing the first
+    # register in step 1 (a). We double this cost to account for the inverse.
+    cost1a = 2 * (3 * nL + 2 * br - 3 * eta - 9)
+
+    # The output size for the QROM for the first state preparation in Eq. (C27)
+    bp1 = nL + chi
+
+    # The cost of the QROM for the first state preparation in step 1 (b) and
+    # its inverse.
+    cost1b = QR(L + 1, bp1)[1] + QI(L + 1)[1]
+
+    # The cost for the inequality test, controlled swap and their inverse in
+    # steps 1 (c) and (d)
+    cost1cd = 2 * (chi + nL)
+
+    # The total cost for preparing the first register in step 1.
+    cost1 = cost1a + cost1b + cost1cd
+
+    # The output size for the QROM for the data to prepare the equal
+    # superposition on the second register, as given in Eq. (C29).
+    bo = nxi + nLxi + br + 1
+
+    # This is step 2. This is the cost of outputting the data to prepare the
+    # equal superposition on the second register. We will assume it is not
+    # uncomputed, because we want to keep the offset for applying the QROM for
+    # outputting the rotations.
+    cost2 = QR(L + 1, bo)[1] + QI(L + 1)[1]
+
+    # The number of bits for rotating the ancilla for the second preparation.
+    # We are just entering this manually because it is a typical value.
+    br = 7
+
+    # The cost of preparing an equal superposition over the second register in
+    # a controlled way. We pay this cost 4 times.
+    cost3a = 4 * (7 * nxi + 2 * br - 6)
+
+    # The cost of the offset to apply the QROM for state preparation on the
+    # second register.
+    cost3b = 4 * (nLxi - 1)
+
+    bp2 = nxi + chi + 2
+
+    # The cost of the QROMs and inverse QROMs for the state preparation, where
+    # in the first one we need + n/2 to account for the one-electron terms.
+    cost3c = QR(Lxi + n // 2, bp2)[1] + QI(Lxi + n // 2)[1] + QR(
+        Lxi, bp2)[1] + QI(Lxi)[1]
+
+    # The inequality test and state preparations.
+    cost3d = 4 * (nxi + chi)
+
+    # The total costs for state preparations on register 2.
+    cost3 = cost3a + cost3b + cost3c + cost3d
+
+    # The cost of adding offsets in steps 4 (a) and (h).
+    cost4ah = 4 * (nLxi - 1)
+
+    # The costs of the QROMs and their inverses in steps 4 (b) and (g).
+    cost4bg = QR(Lxi + n // 2, n * beta // 2)[1] + QI(Lxi + n // 2)[1] + QR(
+        Lxi, n * beta // 2)[1] + QI(Lxi)[1]
+
+    # The cost of the controlled swaps based on the spin qubit in steps 4c and f
+    cost4cf = 2 * n
+
+    # The controlled rotations in steps 4 (d) and (f).
+    cost4df = 4 * n * (beta - 2)
+
+    # The controlled Z operations in the middle for step 4 (e).
+    cost4e = 3
+
+    # This is the cost of the controlled rotations for step 4.
+    cost4 = cost4ah + cost4bg + cost4cf + cost4df + cost4e
+
+    # This is the cost of the reflection on the second register from step 6.
+    cost6 = nxi + chi + 2
+
+    # The cost of the final reflection req'd to construct the step of the
+    # quantum walk from step 9.
+    cost9 = nL + nxi + chi + 1
+
+    # The extra two qubits for unary iteration and making the rflxn controlled.
+    cost10 = 2
+
+    # The Toffoli cost for a single step
+    cost = cost1 + cost2 + cost3 + cost4 + cost6 + cost9 + cost10
+
+    # The number of steps needed
+    iters = np.ceil(np.pi * lam / (2 * dE))
+
+    # Now the number of qubits from the list on page 54.
+
+    k1 = np.power(2, QR(Lxi + n // 2, n * beta // 2)[0])
+
+    # The control register for phase estimation and iteration on it.
+    ac1 = np.ceil(np.log2(iters + 1)) * 2 - 1
+
+    # The system qubits
+    ac2 = n
+
+    # The first register prepared, a rotated qubit and a flag qubit.
+    ac3 = nL + 2
+
+    # The output of the QROM, the equal superposition state and a flag qubit.
+    ac4 = nL + chi * 2 + 1
+
+    # The data used for preparing the equal superposition state on the 2nd reg
+    ac5 = bo
+
+    # The second register, a rotated qubit and a flag qubit.
+    ac6 = nxi + 2
+
+    # The second preparation QROM output.
+    ac8 = bp2
+
+    # The equal superposition state and the result of the inequality test.
+    ac9 = chi + 1
+
+    # The angles for rotations.
+    ac10 = k1 * n * beta // 2
+
+    # The phase gradient state.
+    ac11 = beta
+
+    # A control qubit for the spin.
+    ac12 = 1
+
+    # A T state.
+    ac13 = 1
+
+    if verbose:
+        print("[*] Top of routine")
+        print("  [+] nxi = ", nxi)
+        print("  [+] nLxi = ", nLxi)
+        print("  [+] nL = ", nL)
+        print("  [+] eta = ", eta)
+        print("  [+] cost3 = ", cost3)
+        print("  [+] cost4 = ", cost4)
+        print("  [+] cost = ", cost)
+        print("  [+] iters = ", iters)
+
+    ancilla_cost = ac1 + ac2 + ac3 + ac4 + ac5 + ac6 + ac8 + ac9 + ac10 + ac11\
+                 + ac12 + ac13
+
+    # Sanity checks before returning as int
+    assert cost.is_integer()
+    assert iters.is_integer()
+    assert ancilla_cost.is_integer()
+
+    step_cost = int(cost)
+    total_cost = int(cost * iters)
+    ancilla_cost = int(ancilla_cost)
+
+    return step_cost, total_cost, ancilla_cost
diff --git a/src/openfermion/resource_estimates/df/compute_cost_df_test.py b/src/openfermion/resource_estimates/df/compute_cost_df_test.py
new file mode 100644
index 000000000..762fa1728
--- /dev/null
+++ b/src/openfermion/resource_estimates/df/compute_cost_df_test.py
@@ -0,0 +1,46 @@
+#coverage:ignore
+"""Test cases for costing_df.py
+"""
+from openfermion.resource_estimates import df
+
+
+def test_reiher_df():
+    """ Reproduce Reiher et al orbital DF FT costs from paper """
+    DE = 0.001
+    CHI = 10
+
+    # Reiher et al orbitals
+    N = 108
+    LAM = 294.8
+    L = 360
+    LXI = 13031
+    BETA = 16
+
+    # Here we're using an initial calculation with a very rough estimate of the
+    # number of steps to give a more accurate number of steps.
+    # Then we input that into the function again.
+    output = df.compute_cost(N, LAM, DE, L, LXI, CHI, BETA, stps=20000)
+    stps1 = output[0]
+    output = df.compute_cost(N, LAM, DE, L, LXI, CHI, BETA, stps1)
+    assert output == (21753, 10073183463, 3725)
+
+
+def test_li_df():
+    """ Reproduce Li et al orbital DF FT costs from paper """
+    DE = 0.001
+    CHI = 10
+
+    # Li et al orbitals
+    N = 152
+    LAM = 1171.2
+    L = 394
+    LXI = 20115
+    BETA = 20
+
+    # Here we're using an initial calculation with a very rough estimate of the
+    # number of steps to give a more accurate number of steps.
+    # Then we input that into the function again.
+    output = df.compute_cost(N, LAM, DE, L, LXI, CHI, BETA, stps=20000)
+    stps2 = output[0]
+    output = df.compute_cost(N, LAM, DE, L, LXI, CHI, BETA, stps2)
+    assert output == (35008, 64404812736, 6404)
diff --git a/src/openfermion/resource_estimates/df/compute_lambda_df.py b/src/openfermion/resource_estimates/df/compute_lambda_df.py
new file mode 100644
index 000000000..342cc0711
--- /dev/null
+++ b/src/openfermion/resource_estimates/df/compute_lambda_df.py
@@ -0,0 +1,36 @@
+#coverage:ignore
+""" Compute lambda for double low rank factoriz. method of von Burg, et al """
+import numpy as np
+from openfermion.resource_estimates.molecule import pyscf_to_cas
+
+
+def compute_lambda(pyscf_mf, df_factors):
+    """ Compute lambda for Hamiltonian using DF method of von Burg, et al.
+
+    Args:
+        pyscf_mf - Pyscf mean field object
+        df_factors (ndarray) - (N x N x rank) array of DF factors from ERI
+
+    Returns:
+        lambda_tot (float) - lambda value for the single factorized Hamiltonian
+    """
+    # grab tensors from pyscf_mf object
+    h1, eri_full, _, _, _ = pyscf_to_cas(pyscf_mf)
+
+    # one body contributions
+    T = h1 - 0.5 * np.einsum("illj->ij", eri_full) + np.einsum(
+        "llij->ij", eri_full)
+    e, _ = np.linalg.eigh(T)
+    lambda_T = np.sum(np.abs(e))
+
+    # two body contributions
+    lambda_F = 0.0
+    for vector in range(df_factors.shape[2]):
+        Lij = df_factors[:, :, vector]
+        #e, v = np.linalg.eigh(Lij)
+        e = np.linalg.eigvalsh(Lij)  # just need eigenvals
+        lambda_F += 0.25 * np.sum(np.abs(e))**2
+
+    lambda_tot = lambda_T + lambda_F
+
+    return lambda_tot
diff --git a/src/openfermion/resource_estimates/df/compute_lambda_df_test.py b/src/openfermion/resource_estimates/df/compute_lambda_df_test.py
new file mode 100644
index 000000000..7203b821d
--- /dev/null
+++ b/src/openfermion/resource_estimates/df/compute_lambda_df_test.py
@@ -0,0 +1,21 @@
+#coverage:ignore
+"""Test cases for compute_lambda_df.py
+"""
+from os import path
+import numpy as np
+from openfermion.resource_estimates import df
+from openfermion.resource_estimates.molecule import load_casfile_to_pyscf
+
+
+def test_reiher_df_lambda():
+    """ Reproduce Reiher et al orbital DF lambda from paper """
+
+    THRESH = 0.00125
+    NAME = path.join(path.dirname(__file__), '../integrals/eri_reiher.h5')
+    _, mf = load_casfile_to_pyscf(NAME, num_alpha=27, num_beta=27)
+    eri_rr, LR, L, Lxi = df.factorize(mf._eri, thresh=THRESH)
+    total_lambda = df.compute_lambda(mf, LR)
+    assert eri_rr.shape[0] * 2 == 108
+    assert L == 360
+    assert Lxi == 13031
+    assert np.isclose(np.round(total_lambda, decimals=1), 294.8)
diff --git a/src/openfermion/resource_estimates/df/factorize_df.py b/src/openfermion/resource_estimates/df/factorize_df.py
new file mode 100644
index 000000000..13cb96876
--- /dev/null
+++ b/src/openfermion/resource_estimates/df/factorize_df.py
@@ -0,0 +1,59 @@
+#coverage:ignore
+""" Double factorization rank reduction of ERIs """
+import numpy as np
+from openfermion.resource_estimates.utils import eigendecomp
+
+
+def factorize(eri_full, thresh):
+    """ Do double factorization of the ERI tensor
+
+    Args:
+       eri_full (np.ndarray) - 4D (N x N x N x N) full ERI tensor
+       thresh (float) - threshold for double factorization
+
+    Returns:
+       eri_rr (np.ndarray) - 4D approximate ERI tensor reconstructed
+           from df_factors vectors
+       df_factors (np.ndarray) - 3D (N x N x M) tensor containing DF vectors
+       rank (int) - rank retained from initial eigendecomposition
+       num_eigenvectors (int) - number of eigenvectors
+    """
+
+    n_orb = eri_full.shape[0]
+    assert n_orb**4 == len(eri_full.flatten())
+
+    # First, do an eigendecomposition of ERIs
+    L = eigendecomp(eri_full.reshape(n_orb**2, n_orb**2), tol=0.0)
+    L = L.reshape(n_orb, n_orb, -1)  # back to (N x N x rank)
+
+    sf_rank = L.shape[2]
+
+    num_eigenvectors = 0  # rolling number of eigenvectors
+    df_factors = []  # collect the selected vectors
+    for rank in range(sf_rank):
+        Lij = L[:, :, rank]
+        e, v = np.linalg.eigh(Lij)
+        normSC = np.sum(np.abs(e))
+
+        truncation = normSC * np.abs(e)
+
+        idx = truncation > thresh
+        plus = np.sum(idx)
+        num_eigenvectors += plus
+
+        if plus == 0:
+            break
+
+        e_selected = np.diag(e[idx])
+        v_selected = v[:, idx]
+
+        Lij_selected = v_selected.dot(e_selected).dot(v_selected.T)
+        df_factors.append(Lij_selected)
+
+    # raw factors from DF algorithm
+    df_factors = np.asarray(df_factors).T
+
+    # double-factorized and re-constructed ERIs
+    eri_rr = np.einsum('ijP,klP', df_factors, df_factors, optimize=True)
+
+    return eri_rr, df_factors, rank, num_eigenvectors
diff --git a/src/openfermion/resource_estimates/df/generate_costing_table_df.py b/src/openfermion/resource_estimates/df/generate_costing_table_df.py
new file mode 100644
index 000000000..018dee4f9
--- /dev/null
+++ b/src/openfermion/resource_estimates/df/generate_costing_table_df.py
@@ -0,0 +1,141 @@
+#coverage:ignore
+""" Pretty-print a table comparing DF vector thresh vs accuracy and cost """
+import numpy as np
+from pyscf import scf
+from openfermion.resource_estimates import df
+from openfermion.resource_estimates.molecule import (factorized_ccsd_t,
+                                                     cas_to_pyscf, pyscf_to_cas)
+
+
+def generate_costing_table(pyscf_mf,
+                           name='molecule',
+                           thresh_range=None,
+                           dE=0.001,
+                           chi=10,
+                           beta=20,
+                           use_kernel=True,
+                           no_triples=False):
+    """ Print a table to file for testing how various DF thresholds impact cost,
+        accuracy, etc.
+
+    Args:
+        pyscf_mf - PySCF mean field object
+        name (str) - file will be saved to 'double_factorization_<name>.txt'
+        thresh_range (list of floats) - list of thresholds to try for DF alg
+        dE (float) - max allowable phase error (default: 0.001)
+        chi (int) - number of bits for representation of coefficients
+                    (default: 10)
+        beta (int) - number of bits for rotations (default: 20)
+        use_kernel (bool) - re-do SCF prior to estimating CCSD(T) error?
+            Will use canonical orbitals and full ERIs for the one-body
+            contributions, using DF reconstructed ERIs for two-body
+        no_triples (bool) - if True, skip the (T) correction, doing only CCSD
+
+    Returns:
+       None
+    """
+
+    if thresh_range is None:
+        thresh_range = [0.0001]
+
+    DE = dE  # max allowable phase error
+    CHI = chi  # number of bits for representation of coefficients
+    BETA = beta  # number of bits for rotations
+
+    if isinstance(pyscf_mf, scf.rohf.ROHF):
+        num_alpha, num_beta = pyscf_mf.nelec
+        assert num_alpha + num_beta == pyscf_mf.mol.nelectron
+    else:
+        assert pyscf_mf.mol.nelectron % 2 == 0
+        num_alpha = pyscf_mf.mol.nelectron // 2
+        num_beta = num_alpha
+
+    num_orb = len(pyscf_mf.mo_coeff)
+    num_spinorb = num_orb * 2
+
+    cas_info = "CAS((%sa, %sb), %so)" % (num_alpha, num_beta, num_orb)
+
+    try:
+        assert num_orb**4 == len(pyscf_mf._eri.flatten())
+    except AssertionError:
+        # ERIs are not in correct form in pyscf_mf._eri, so this is a quick prep
+        _, pyscf_mf = cas_to_pyscf(*pyscf_to_cas(pyscf_mf))
+
+    # Reference calculation (eri_rr= None is full rank / exact ERIs)
+    escf, ecor, etot = factorized_ccsd_t(pyscf_mf,
+                                         eri_rr=None,
+                                         use_kernel=use_kernel,
+                                         no_triples=no_triples)
+
+    #exact_ecor = ecor
+    exact_etot = etot
+
+    filename = 'double_factorization_' + name + '.txt'
+
+    with open(filename, 'w') as f:
+        print("\n Double low rank factorization data for '" + name + "'.",
+              file=f)
+        print("    [*] using " + cas_info, file=f)
+        print("        [+]                      E(SCF): %18.8f" % escf, file=f)
+        if no_triples:
+            print("        [+]    Active space CCSD E(cor): %18.8f" % ecor,
+                  file=f)
+            print("        [+]    Active space CCSD E(tot): %18.8f" % etot,
+                  file=f)
+        else:
+            print("        [+] Active space CCSD(T) E(cor): %18.8f" % ecor,
+                  file=f)
+            print("        [+] Active space CCSD(T) E(tot): %18.8f" % etot,
+                  file=f)
+        print("{}".format('=' * 139), file=f)
+        if no_triples:
+            print("{:^12} {:^18} {:^12} {:^12} {:^12} {:^24} {:^20} {:^20}".
+                  format('threshold', '||ERI - DF||', 'L', 'eigenvectors',
+                         'lambda', 'CCSD error (mEh)', 'logical qubits',
+                         'Toffoli count'),
+                  file=f)
+        else:
+            print("{:^12} {:^18} {:^12} {:^12} {:^12} {:^24} {:^20} {:^20}".
+                  format('threshold', '||ERI - DF||', 'L', 'eigenvectors',
+                         'lambda', 'CCSD(T) error (mEh)', 'logical qubits',
+                         'Toffoli count'),
+                  file=f)
+        print("{}".format('-' * 139), file=f)
+    for thresh in thresh_range:
+        # First, up: lambda and CCSD(T)
+        eri_rr, LR, L, Lxi = df.factorize(pyscf_mf._eri, thresh=thresh)
+        lam = df.compute_lambda(pyscf_mf, LR)
+        escf, ecor, etot = factorized_ccsd_t(pyscf_mf,
+                                             eri_rr,
+                                             use_kernel=use_kernel,
+                                             no_triples=no_triples)
+        error = (etot - exact_etot) * 1E3  # to mEh
+        l2_norm_error_eri = np.linalg.norm(
+            eri_rr - pyscf_mf._eri)  # ERI reconstruction error
+
+        # now do costing
+        stps1 = df.compute_cost(num_spinorb,
+                                lam,
+                                DE,
+                                L=L,
+                                Lxi=Lxi,
+                                chi=CHI,
+                                beta=BETA,
+                                stps=20000)[0]
+        _, df_total_cost, df_logical_qubits = df.compute_cost(num_spinorb,
+                                                              lam,
+                                                              DE,
+                                                              L=L,
+                                                              Lxi=Lxi,
+                                                              chi=CHI,
+                                                              beta=BETA,
+                                                              stps=stps1)
+
+        with open(filename, 'a') as f:
+            print(
+                "{:^12.6f} {:^18.4e} {:^12} {:^12} {:^12.1f} {:^24.2f} {:^20} \
+                 {:^20.1e}".format(thresh, l2_norm_error_eri, L, Lxi, lam,
+                                   error, df_logical_qubits, df_total_cost),
+                file=f)
+    with open(filename, 'a') as f:
+        print("{}".format('=' * 139), file=f)
diff --git a/src/openfermion/resource_estimates/integrals/M_250_beta_16_eta_10.h5 b/src/openfermion/resource_estimates/integrals/M_250_beta_16_eta_10.h5
new file mode 100644
index 000000000..10867c310
Binary files /dev/null and b/src/openfermion/resource_estimates/integrals/M_250_beta_16_eta_10.h5 differ
diff --git a/src/openfermion/resource_estimates/integrals/__init__.py b/src/openfermion/resource_estimates/integrals/__init__.py
new file mode 100644
index 000000000..e69de29bb
diff --git a/src/openfermion/resource_estimates/integrals/eri_reiher.h5 b/src/openfermion/resource_estimates/integrals/eri_reiher.h5
new file mode 100644
index 000000000..1af236091
Binary files /dev/null and b/src/openfermion/resource_estimates/integrals/eri_reiher.h5 differ
diff --git a/src/openfermion/resource_estimates/molecule/__init__.py b/src/openfermion/resource_estimates/molecule/__init__.py
new file mode 100644
index 000000000..1aae53fdd
--- /dev/null
+++ b/src/openfermion/resource_estimates/molecule/__init__.py
@@ -0,0 +1,26 @@
+#coverage:ignore
+#   Licensed under the Apache License, Version 2.0 (the "License");
+#   you may not use this file except in compliance with the License.
+#   You may obtain a copy of the License at
+#
+#       http://www.apache.org/licenses/LICENSE-2.0
+#
+#   Unless required by applicable law or agreed to in writing, software
+#   distributed under the License is distributed on an "AS IS" BASIS,
+#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#   See the License for the specific language governing permissions and
+#   limitations under the License.
+
+from .pyscf_utils import (
+    stability,
+    localize,
+    avas_active_space,
+    cas_to_pyscf,
+    pyscf_to_cas,
+    get_num_active_alpha_beta,
+    load_casfile_to_pyscf,
+    save_pyscf_to_casfile,
+    factorized_ccsd_t,
+    ccsd_t,
+    open_shell_t1_d1,
+)
diff --git a/src/openfermion/resource_estimates/molecule/pyscf_utils.py b/src/openfermion/resource_estimates/molecule/pyscf_utils.py
new file mode 100644
index 000000000..f70ff636a
--- /dev/null
+++ b/src/openfermion/resource_estimates/molecule/pyscf_utils.py
@@ -0,0 +1,640 @@
+#coverage:ignore
+""" Drivers for various PySCF electronic structure routines """
+from typing import Tuple, Optional
+import sys
+import h5py
+import numpy as np
+from pyscf import gto, scf, ao2mo, mcscf, lo, tools, cc
+from pyscf.mcscf import avas
+
+
+def stability(pyscf_mf):
+    """
+    Test wave function stability and re-optimize SCF.
+
+    Args:
+        pyscf_mf: PySCF mean field object (e.g. `scf.RHF()`)
+
+    Returns:
+        pyscf_mf: Updated PySCF mean field object
+    """
+    new_orbitals = pyscf_mf.stability()[0]
+    new_1rdm = pyscf_mf.make_rdm1(new_orbitals, pyscf_mf.mo_occ)
+    pyscf_mf = pyscf_mf.run(new_1rdm)
+
+    return pyscf_mf
+
+
+def localize(pyscf_mf, loc_type='pm', verbose=0):
+    """ Localize orbitals given a PySCF mean-field object
+
+    Args:
+        pyscf_mf:  PySCF mean field object
+        loc_type (str): localization type;
+            Pipek-Mezey ('pm') or Edmiston-Rudenberg ('er')
+        verbose (int): print level during localization
+
+    Returns:
+        pyscf_mf:  Updated PySCF mean field object with localized orbitals
+    """
+    # Note: After loading with `load_casfile_to_pyscf()` you can quiet message
+    # by resetting mf.mol, i.e., mf.mol = gto.M(...)
+    # but this assumes you have the *exact* molecular specification on hand.
+    # I've gotten acceptable results by restoring mf.mol this way (usually
+    # followed by calling mf.kernel()). But consistent localization is not a
+    # given (not unique) despite restoring data this way, hence the message.
+    if len(pyscf_mf.mol.atom) == 0:
+        sys.exit("`localize()` requires atom loc. and atomic basis to be" + \
+                 " defined.\n  " + \
+                 "It also can be sensitive to the initial guess and MO" + \
+                 " coefficients.\n  " + \
+                 "Best to try re-creating the PySCF molecule and doing the" + \
+                 " SCF, rather than\n  " + \
+                 "try to load the mean-field object with" + \
+                 " `load_casfile_to_pyscf()`. You can \n " + \
+                 "try to provide the missing information, but consistency" + \
+                 " cannot be guaranteed!")
+
+    # Split-localize (localize DOCC, SOCC, and virtual separately)
+    docc_idx = np.where(np.isclose(pyscf_mf.mo_occ, 2.))[0]
+    socc_idx = np.where(np.isclose(pyscf_mf.mo_occ, 1.))[0]
+    virt_idx = np.where(np.isclose(pyscf_mf.mo_occ, 0.))[0]
+
+    # Pipek-Mezey
+    if loc_type.lower() == 'pm':
+        print("Localizing doubly occupied ... ", end="")
+        loc_docc_mo = lo.PM(
+            pyscf_mf.mol,
+            pyscf_mf.mo_coeff[:, docc_idx]).kernel(verbose=verbose)
+        print("singly occupied ... ", end="")
+        loc_socc_mo = lo.PM(
+            pyscf_mf.mol,
+            pyscf_mf.mo_coeff[:, socc_idx]).kernel(verbose=verbose)
+        print("virtual ... ", end="")
+        loc_virt_mo = lo.PM(
+            pyscf_mf.mol,
+            pyscf_mf.mo_coeff[:, virt_idx]).kernel(verbose=verbose)
+        print("DONE")
+
+    # Edmiston-Rudenberg
+    elif loc_type.lower() == 'er':
+        print("Localizing doubly occupied ... ", end="")
+        loc_docc_mo = lo.ER(
+            pyscf_mf.mol,
+            pyscf_mf.mo_coeff[:, docc_idx]).kernel(verbose=verbose)
+        print("singly occupied ... ", end="")
+        loc_socc_mo = lo.ER(
+            pyscf_mf.mol,
+            pyscf_mf.mo_coeff[:, socc_idx]).kernel(verbose=verbose)
+        print("virtual ... ", end="")
+        loc_virt_mo = lo.ER(
+            pyscf_mf.mol,
+            pyscf_mf.mo_coeff[:, virt_idx]).kernel(verbose=verbose)
+        print("DONE")
+
+    # overwrite orbitals with localized orbitals
+    pyscf_mf.mo_coeff[:, docc_idx] = loc_docc_mo.copy()
+    pyscf_mf.mo_coeff[:, socc_idx] = loc_socc_mo.copy()
+    pyscf_mf.mo_coeff[:, virt_idx] = loc_virt_mo.copy()
+
+    return pyscf_mf
+
+
+def avas_active_space(pyscf_mf,
+                      ao_list=None,
+                      molden_fname='avas_localized_orbitals',
+                      **kwargs):
+    """ Return AVAS active space as PySCF molecule and mean-field object
+
+    Args:
+        pyscf_mf:  PySCF mean field object
+
+    Kwargs:
+        ao_list: list of strings of AOs (print mol.ao_labels() to see options)
+                 Example: ao_list = ['H 1s', 'O 2p', 'O 2s'] for water
+        verbose (bool): do additional print
+        molden_fname (str): MOLDEN filename to save AVAS active space orbitals.
+                            Default is to save
+                            to 'avas_localized_orbitals.molden'
+        **kwargs: other keyworded arguments to pass into avas.avas()
+
+    Returns:
+        pyscf_active_space_mol: Updated PySCF molecule object from
+                                AVAS-selected active space
+        pyscf_active_space_mf:  Updated PySCF mean field object from
+                                AVAS-selected active space
+    """
+
+    # Note: requires openshell_option = 3 for this to work, which keeps all
+    #     singly occupied in CAS
+    # we also require canonicalize = False so that we don't destroy local orbs
+    avas_output = avas.avas(pyscf_mf,
+                            ao_list,
+                            canonicalize=False,
+                            openshell_option=3,
+                            **kwargs)
+    active_norb, active_ne, reordered_orbitals = avas_output
+
+    active_alpha, _ = get_num_active_alpha_beta(pyscf_mf, active_ne)
+
+    if molden_fname is not None:
+        # save set of localized orbitals for active space
+        if isinstance(pyscf_mf, scf.rohf.ROHF):
+            frozen_alpha = pyscf_mf.nelec[0] - active_alpha
+            assert frozen_alpha >= 0
+        else:
+            frozen_alpha = pyscf_mf.mol.nelectron // 2 - active_alpha
+            assert frozen_alpha >= 0
+
+        active_space_idx = slice(frozen_alpha, frozen_alpha + active_norb)
+        active_mos = reordered_orbitals[:, active_space_idx]
+        tools.molden.from_mo(pyscf_mf.mol,
+                             molden_fname + '.molden',
+                             mo_coeff=active_mos)
+
+    # Choosing an active space changes the molecule ("freezing" electrons,
+    # for example), so we
+    # form the active space tensors first, then re-form the PySCF objects to
+    # ensure consistency
+    pyscf_active_space_mol, pyscf_active_space_mf = cas_to_pyscf(
+        *pyscf_to_cas(pyscf_mf,
+                      cas_orbitals=active_norb,
+                      cas_electrons=active_ne,
+                      avas_orbs=reordered_orbitals))
+
+    return pyscf_active_space_mol, pyscf_active_space_mf
+
+
+def cas_to_pyscf(h1, eri, ecore, num_alpha, num_beta):
+    """ Return a PySCF molecule and mean-field object from pre-computed CAS Ham
+
+    Args:
+        h1 (ndarray) - 2D matrix containing one-body terms (MO basis)
+        eri (ndarray) - 4D tensor containing two-body terms (MO basis)
+        ecore (float) - frozen core electronic energy + nuclear repulsion energy
+        num_alpha (int) - number of spin up electrons in CAS space
+        num_beta (int) - number of spin down electrons in CAS space
+
+    Returns:
+        pyscf_mol: PySCF molecule object
+        pyscf_mf:  PySCF mean field object
+    """
+
+    n_orb = len(h1)  # number orbitals
+    assert [n_orb] * 4 == [*eri.shape]  # check dims are consistent
+
+    pyscf_mol = gto.M()
+    pyscf_mol.nelectron = num_alpha + num_beta
+    n_orb = h1.shape[0]
+    alpha_diag = [1] * num_alpha + [0] * (n_orb - num_alpha)
+    beta_diag = [1] * num_beta + [0] * (n_orb - num_beta)
+
+    # Assumes Hamiltonian is either RHF or ROHF ... should be OK since UHF will
+    # have two h1s, etc.
+    if num_alpha == num_beta:
+        pyscf_mf = scf.RHF(pyscf_mol)
+        scf_energy = ecore + \
+                     2*np.einsum('ii',  h1[:num_alpha,:num_alpha]) + \
+                     2*np.einsum('iijj',
+                           eri[:num_alpha,:num_alpha,:num_alpha,:num_alpha]) - \
+                       np.einsum('ijji',
+                           eri[:num_alpha,:num_alpha,:num_alpha,:num_alpha])
+
+    else:
+        pyscf_mf = scf.ROHF(pyscf_mol)
+        pyscf_mf.nelec = (num_alpha, num_beta)
+        # grab singly and doubly occupied orbitals (assume high-spin open shell)
+        docc = slice(None, min(num_alpha, num_beta))
+        socc = slice(min(num_alpha, num_beta), max(num_alpha, num_beta))
+        scf_energy = ecore + \
+                     2.0*np.einsum('ii',h1[docc, docc]) + \
+                         np.einsum('ii',h1[socc, socc]) + \
+                     2.0*np.einsum('iijj',eri[docc, docc, docc, docc]) - \
+                         np.einsum('ijji',eri[docc, docc, docc, docc]) + \
+                         np.einsum('iijj',eri[socc, socc, docc, docc]) - \
+                     0.5*np.einsum('ijji',eri[socc, docc, docc, socc]) + \
+                         np.einsum('iijj',eri[docc, docc, socc, socc]) - \
+                     0.5*np.einsum('ijji',eri[docc, socc, socc, docc]) + \
+                     0.5*np.einsum('iijj',eri[socc, socc, socc, socc]) - \
+                     0.5*np.einsum('ijji',eri[socc, socc, socc, socc])
+
+    pyscf_mf.get_hcore = lambda *args: np.asarray(h1)
+    pyscf_mf.get_ovlp = lambda *args: np.eye(h1.shape[0])
+    pyscf_mf.energy_nuc = lambda *args: ecore
+    pyscf_mf._eri = eri  # ao2mo.restore('8', np.zeros((8, 8, 8, 8)), 8)
+    pyscf_mf.e_tot = scf_energy
+
+    pyscf_mf.init_guess = '1e'
+    pyscf_mf.mo_coeff = np.eye(n_orb)
+    pyscf_mf.mo_occ = np.array(alpha_diag) + np.array(beta_diag)
+    pyscf_mf.mo_energy, _ = np.linalg.eigh(pyscf_mf.get_fock())
+
+    return pyscf_mol, pyscf_mf
+
+
+def pyscf_to_cas(pyscf_mf,
+                 cas_orbitals: Optional[int] = None,
+                 cas_electrons: Optional[int] = None,
+                 avas_orbs=None):
+    """ Return CAS Hamiltonian tensors from a PySCF mean-field object
+
+    Args:
+        pyscf_mf: PySCF mean field object
+        cas_orbitals (int, optional):  number of orbitals in CAS space,
+                                       default all orbitals
+        cas_electrons (int, optional): number of electrons in CAS space,
+                                       default all electrons
+        avas_orbs (ndarray, optional): orbitals selected by AVAS in PySCF
+
+    Returns:
+        h1 (ndarray) - 2D matrix containing one-body terms (MO basis)
+        eri (ndarray) - 4D tensor containing two-body terms (MO basis)
+        ecore (float) - frozen core electronic energy + nuclear repulsion energy
+        num_alpha (int) - number of spin up electrons in CAS space
+        num_beta (int) - number of spin down electrons in CAS space
+    """
+
+    # Only RHF or ROHF possible with mcscf.CASCI
+    assert isinstance(pyscf_mf, scf.rhf.RHF)  # ROHF is child of RHF class
+
+    if cas_orbitals is None:
+        cas_orbitals = len(pyscf_mf.mo_coeff)
+    if cas_electrons is None:
+        cas_electrons = pyscf_mf.mol.nelectron
+
+    cas = mcscf.CASCI(pyscf_mf, ncas=cas_orbitals, nelecas=cas_electrons)
+    h1, ecore = cas.get_h1eff(mo_coeff=avas_orbs)
+    eri = cas.get_h2cas(mo_coeff=avas_orbs)
+    eri = ao2mo.restore('s1', eri, h1.shape[0])  # chemist convention (11|22)
+    ecore = float(ecore)
+
+    num_alpha, num_beta = get_num_active_alpha_beta(pyscf_mf, cas_electrons)
+
+    return h1, eri, ecore, num_alpha, num_beta
+
+
+def get_num_active_alpha_beta(pyscf_mf, cas_electrons):
+    """ Return number of alpha and beta electrons in the active space given
+        number of CAS electrons
+        This assumes that all the unpaired electrons are in the active space
+
+    Args:
+        pyscf_mf: PySCF mean field object
+        cas_orbitals (int):  number of electrons in CAS space,
+
+    Returns:
+        num_alpha (int): number of alpha (spin-up) electrons in active space
+        num_beta (int):  number of beta (spin-down) electrons in active space
+    """
+    # Sanity checks and active space info
+    total_electrons = pyscf_mf.mol.nelectron
+    frozen_electrons = total_electrons - cas_electrons
+    assert frozen_electrons % 2 == 0
+
+    # ROHF == RHF but RHF != ROHF, and we only do either RHF or ROHF
+    if isinstance(pyscf_mf, scf.rohf.ROHF):
+        frozen_alpha = frozen_electrons // 2
+        frozen_beta = frozen_electrons // 2
+        num_alpha = pyscf_mf.nelec[0] - frozen_alpha
+        num_beta = pyscf_mf.nelec[1] - frozen_beta
+        assert np.isclose(num_beta + num_alpha, cas_electrons)
+
+    else:
+        assert cas_electrons % 2 == 0
+        num_alpha = cas_electrons // 2
+        num_beta = cas_electrons // 2
+
+    return num_alpha, num_beta
+
+
+def load_casfile_to_pyscf(fname,
+                          num_alpha: Optional[int] = None,
+                          num_beta: Optional[int] = None):
+    """ Load CAS Hamiltonian from pre-computed HD5 file into a PySCF molecule
+        and mean-field object
+
+    Args:
+        fname (str): path to hd5 file to be created containing CAS one and two
+                     body terms
+        num_alpha (int, optional): number of spin up electrons in CAS space
+        num_beta (int, optional):  number of spin down electrons in CAS space
+
+    Returns:
+        pyscf_mol: PySCF molecule object
+        pyscf_mf:  PySCF mean field object
+    """
+
+    with h5py.File(fname, "r") as f:
+        eri = np.asarray(f['eri'][()])
+        # h1 one body elements are sometimes called different things. Try a few.
+        try:
+            h1 = np.asarray(f['h0'][()])
+        except KeyError:
+            try:
+                h1 = np.asarray(f['hcore'][()])
+            except KeyError:
+                try:
+                    h1 = np.asarray(f['h1'][()])
+                except KeyError:
+                    raise KeyError("Could not find 1-electron Hamiltonian")
+        # ecore sometimes exists, and sometimes as enuc (no frozen electrons)
+        try:
+            ecore = float(f['ecore'][()])
+        except KeyError:
+            try:
+                ecore = float(f['enuc'][()])
+            except KeyError:
+                ecore = 0.0
+        # read the number of spin up and spin down electrons if not input
+        if (num_alpha is None) or (num_beta is None):
+            try:
+                num_alpha = int(f['active_nalpha'][()])
+            except KeyError:
+                sys.exit("In `load_casfile_to_pyscf()`: \n" + \
+                         " No values found on file for num_alpha " + \
+                         "(key: 'active_nalpha' in h5). " + \
+                         " Try passing in a value for num_alpha, or" + \
+                         " re-check integral file.")
+            try:
+                num_beta = int(f['active_nbeta'][()])
+            except KeyError:
+                sys.exit("In `load_casfile_to_pyscf()`: \n" + \
+                         " No values found on file for num_beta " + \
+                         "(key: 'active_nbeta' in h5). " + \
+                         " Try passing in a value for num_beta, or" + \
+                         " re-check integral file.")
+
+    pyscf_mol, pyscf_mf = cas_to_pyscf(h1, eri, ecore, num_alpha, num_beta)
+
+    return pyscf_mol, pyscf_mf
+
+
+def save_pyscf_to_casfile(fname,
+                          pyscf_mf,
+                          cas_orbitals: Optional[int] = None,
+                          cas_electrons: Optional[int] = None,
+                          avas_orbs=None):
+    """ Save CAS Hamiltonian from a PySCF mean-field object to an HD5 file
+
+    Args:
+        fname (str): path to hd5 file to be created containing CAS terms
+        pyscf_mf: PySCF mean field object
+        cas_orbitals (int, optional):  number of orb in CAS space, default all
+        cas_electrons (int, optional): number of elec in CAS, default all elec
+        avas_orbs (ndarray, optional): orbitals selected by AVAS in PySCF
+    """
+    h1, eri, ecore, num_alpha, num_beta = \
+        pyscf_to_cas(pyscf_mf, cas_orbitals, cas_electrons, avas_orbs)
+
+    with h5py.File(fname, 'w') as fid:
+        fid.create_dataset('ecore', data=float(ecore), dtype=float)
+        fid.create_dataset(
+            'h0',
+            data=h1)  # note the name change to be consistent with THC paper
+        fid.create_dataset('eri', data=eri)
+        fid.create_dataset('active_nalpha', data=int(num_alpha), dtype=int)
+        fid.create_dataset('active_nbeta', data=int(num_beta), dtype=int)
+
+
+def factorized_ccsd_t(pyscf_mf, eri_rr = None, use_kernel = True,\
+    no_triples=False) -> Tuple[float, float, float]:
+    """ Compute CCSD(T) energy using rank-reduced ERIs
+
+    Args:
+        pyscf_mf - PySCF mean field object
+        eri_rr (ndarray) - rank-reduced ERIs, or use full ERIs from pyscf_mf
+        use_kernel (bool) - re-do SCF, using canonical orbitals for one-body?
+        no_triples (bool) - skip the perturbative triples correction? (CCSD)
+
+    Returns:
+        e_scf (float) - SCF energy
+        e_cor (float) - Correlation energy from CCSD(T)
+        e_tot (float) - Total energy; i.e. SCF + Corr energy from CCSD(T)
+    """
+    h1, eri_full, ecore, num_alpha, num_beta = pyscf_to_cas(pyscf_mf)
+
+    # If no rank-reduced ERIs, use the full (possibly local) ERIs from pyscf_mf
+    if eri_rr is None:
+        eri_rr = eri_full
+
+    e_scf, e_cor, e_tot = ccsd_t(h1, eri_rr, ecore, num_alpha, num_beta,\
+        eri_full, use_kernel, no_triples)
+
+    return e_scf, e_cor, e_tot
+
+
+def ccsd_t(h1, eri, ecore, num_alpha: int, num_beta: int, eri_full = None,\
+    use_kernel=True, no_triples=False) -> Tuple[float, float, float]:
+    """ Helper function to do CCSD(T) on set of one- and two-body Hamil elems
+
+    Args:
+        h1 (ndarray) - 2D matrix containing one-body terms (MO basis)
+        eri (ndarray) - 4D tensor containing two-body terms (MO basis)
+                        may be from integral factorization (e.g. SF/DF/THC)
+        ecore (float) - frozen core electronic energy + nuclear repulsion energy
+        num_alpha (int) - number of spin alpha electrons in Hamiltonian
+        num_beta (int) - number of spin beta electrons in Hamiltonian
+        eri_full (ndarray) - optional 4D tensor containing full two-body
+            terms (MO basis) for the SCF procedure only
+        use_kernel (bool) - re-run SCF prior to doing CCSD(T)?
+        no_triples (bool) - skip the perturbative triples correction? (CCSD)
+
+    Returns:
+        e_scf (float) - SCF energy
+        e_cor (float) - Correlation energy from CCSD(T)
+        e_tot (float) - Total energy; i.e. SCF + Corr energy from CCSD(T)
+    """
+
+    mol = gto.M()
+    mol.nelectron = num_alpha + num_beta
+    n_orb = h1.shape[0]
+    alpha_diag = [1] * num_alpha + [0] * (n_orb - num_alpha)
+    beta_diag = [1] * num_beta + [0] * (n_orb - num_beta)
+
+    # If eri_full not provided, use (possibly rank-reduced) ERIs for check
+    if eri_full is None:
+        eri_full = eri
+
+    # either RHF or ROHF ... should be OK since UHF will have two h1s, etc.
+    if num_alpha == num_beta:
+        mf = scf.RHF(mol)
+        scf_energy = ecore + \
+                     2*np.einsum('ii',h1[:num_alpha,:num_alpha]) + \
+                     2*np.einsum('iijj',eri_full[:num_alpha,\
+                                                 :num_alpha,\
+                                                 :num_alpha,\
+                                                 :num_alpha]) - \
+                       np.einsum('ijji',eri_full[:num_alpha,\
+                                                 :num_alpha,\
+                                                 :num_alpha,\
+                                                 :num_alpha])
+
+    else:
+        mf = scf.ROHF(mol)
+        mf.nelec = (num_alpha, num_beta)
+        # grab singly and doubly occupied orbitals (assume high-spin open shell)
+        docc = slice(None, min(num_alpha, num_beta))
+        socc = slice(min(num_alpha, num_beta), max(num_alpha, num_beta))
+        scf_energy = ecore + \
+                     2.0*np.einsum('ii',h1[docc, docc]) + \
+                         np.einsum('ii',h1[socc, socc]) + \
+                     2.0*np.einsum('iijj',eri_full[docc, docc, docc, docc]) - \
+                         np.einsum('ijji',eri_full[docc, docc, docc, docc]) + \
+                         np.einsum('iijj',eri_full[socc, socc, docc, docc]) - \
+                     0.5*np.einsum('ijji',eri_full[socc, docc, docc, socc]) + \
+                         np.einsum('iijj',eri_full[docc, docc, socc, socc]) - \
+                     0.5*np.einsum('ijji',eri_full[docc, socc, socc, docc]) + \
+                     0.5*np.einsum('iijj',eri_full[socc, socc, socc, socc]) - \
+                     0.5*np.einsum('ijji',eri_full[socc, socc, socc, socc])
+
+    mf.get_hcore = lambda *args: np.asarray(h1)
+    mf.get_ovlp = lambda *args: np.eye(h1.shape[0])
+    mf.energy_nuc = lambda *args: ecore
+    mf._eri = eri_full  # ao2mo.restore('8', np.zeros((8, 8, 8, 8)), 8)
+
+    mf.init_guess = '1e'
+    mf.mo_coeff = np.eye(n_orb)
+    mf.mo_occ = np.array(alpha_diag) + np.array(beta_diag)
+    w, _ = np.linalg.eigh(mf.get_fock())
+    mf.mo_energy = w
+
+    # Rotate the interaction tensors into the canonical basis.
+    # Reiher and Li tensors, for example, are read-in in the local MO basis,
+    # which is not optimal for the CCSD(T) calculation (canonical gives better
+    # energy estimate whereas QPE is invariant to choice of basis)
+    if use_kernel:
+        mf.conv_tol = 1e-7
+        mf.init_guess = '1e'
+        mf.verbose = 4
+        mf.diis_space = 24
+        mf.level_shift = 0.5
+        mf.conv_check = False
+        mf.max_cycle = 800
+        mf.kernel(mf.make_rdm1(mf.mo_coeff,
+                               mf.mo_occ))  # use MO info to generate guess
+        mf = stability(mf)
+        mf = stability(mf)
+        mf = stability(mf)
+
+        # Check if SCF has changed by doing restart, and print warning if so
+        try:
+            assert np.isclose(scf_energy, mf.e_tot, rtol=1e-14)
+        except AssertionError:
+            print(
+                "WARNING: E(SCF) from input integrals does not match E(SCF)" + \
+                " from mf.kernel()")
+            print("  Will use E(SCF) = {:12.6f} from mf.kernel going forward.".
+                  format(mf.e_tot))
+        print("E(SCF, ints) = {:12.6f} whereas E(SCF) = {:12.6f}".format(
+            scf_energy, mf.e_tot))
+
+        # New SCF energy and orbitals for CCSD(T)
+        scf_energy = mf.e_tot
+
+    # Now re-set the eri's to the (possibly rank-reduced) ERIs
+    mf._eri = eri
+    mf.mol.incore_anyway = True
+
+    mycc = cc.CCSD(mf)
+    mycc.max_cycle = 800
+    mycc.conv_tol = 1E-8
+    mycc.conv_tol_normt = 1E-4
+    mycc.diis_space = 24
+    mycc.verbose = 4
+    mycc.kernel()
+
+    if no_triples:
+        et = 0.0
+    else:
+        et = mycc.ccsd_t()
+
+    e_scf = scf_energy  # may be read-in value or 'fresh' SCF value
+    e_cor = mycc.e_corr + et
+    e_tot = e_scf + e_cor
+
+    print("E(SCF):       ", e_scf)
+    print("E(cor):       ", e_cor)
+    print("Total energy: ", e_tot)
+    return e_scf, e_cor, e_tot
+
+
+def open_shell_t1_d1(t1a, t1b, mo_occ, nalpha, nbeta):
+    """
+    T1-diagnostic for open-shell is defined w.r.t Sx eigenfunction of T1
+        where reference is ROHF.
+
+    given i double occ, c unoccupied, x is single occuplied The T1 amps
+      (high spin) in Sz basis are:
+    T1 = t_{ia}^{ca}(ca^ ia) + t_{ib}^{cb}(cb^ ib)
+       + t_{xa}^{ca}(ca^ xa) + t_{ib}^{xb}(xb^ ib)
+    T1 in the Sx basis are
+    T1 = f_{i}^{c}E_{ci} + v_{i}^{c}A_{ci}
+       + sqrt(2)f_{x}^{c}(ca^ xa) + sqrt(2)f_{i}^{x}(xb^ ib)
+
+    where E_{ci} = ca^ ia + cb^ ib and A_{ci} = ca^ ia - cb^ ib.
+
+    See:  The Journal of Chemical Physics 98, 9734 (1993);
+              doi: 10.1063/1.464352
+          Chemical Physics Letters 372 (2003) 362–367;
+              doi:10.1016/S0009-2614(03)00435-4
+
+    based on these and two papers from Lee the T1-openshell diagnostic is
+
+    sqrt(sum_{ia}(f_{ia})^2 + 2sum_{xa}(t_{xa}^{ca})^2
+        + 2 sum_{ix}(t_{ib}^{xb})^2) / 2 sqrt{N}
+
+    To get this relate eqs 3-7 from Chemical Physics Letters 372 (2003) 362–367
+    to Eqs. 45, 46, and 51 from Journal of Chemical Physics 98, 9734 (1993);
+    doi: 10.1063/1.464352.
+    """
+    # compute t1-diagnostic
+    docc_idx = np.where(np.isclose(mo_occ, 2.))[0]
+    socc_idx = np.where(np.isclose(mo_occ, 1.))[0]
+    virt_idx = np.where(np.isclose(mo_occ, 0.))[0]
+    t1a_docc = t1a[docc_idx, :]  # double occ-> virtual
+    t1b_docc = t1b[docc_idx, :][:, -len(virt_idx):]  # double occ-> virtual
+    if len(socc_idx) > 0:
+        t1_xa = t1a[socc_idx, :]  # single occ -> virtual
+        t1_ix = t1b[docc_idx, :][:, :len(socc_idx)]  # double occ -> single occ
+    else:
+        t1_xa = np.array(())
+        t1_ix = np.array(())
+
+    if nalpha - nbeta + len(virt_idx) != t1b.shape[1]:
+        raise ValueError(
+            "Inconsistent shapes na {}, nb {}, t1b.shape {},{}".format(
+                nalpha, nbeta, t1b.shape[0], t1b.shape[1]))
+
+    if t1a_docc.shape != (len(docc_idx), len(virt_idx)):
+        raise ValueError("T1a_ia does not have the right shape")
+    if t1b_docc.shape != (len(docc_idx), len(virt_idx)):
+        raise ValueError("T1b_ia does not have the right shape")
+    if len(socc_idx) > 0:
+        if t1_ix.shape != (len(docc_idx), len(socc_idx)):
+            raise ValueError("T1_ix does not have the right shape")
+        if t1_xa.shape != (len(socc_idx), len(virt_idx)):
+            raise ValueError("T1_xa does not have the right shape")
+
+    t1_diagnostic = np.sqrt(
+        np.sum((t1a_docc + t1b_docc)**2) + 2 * np.sum(t1_xa**2) +
+        2 * np.sum(t1_ix**2)) / (2 * np.sqrt(nalpha + nbeta))
+    # compute D1-diagnostic
+    f_ia = 0.5 * (t1a_docc + t1b_docc)
+    s_f_ia_2, _ = np.linalg.eigh(f_ia @ f_ia.T)
+    s_f_ia_2_norm = np.sqrt(np.max(s_f_ia_2, initial=0))
+
+    if len(socc_idx) > 0:
+        f_xa = np.sqrt(1 / 2) * t1_xa
+        f_ix = np.sqrt(1 / 2) * t1_ix
+        s_f_xa_2, _ = np.linalg.eigh(f_xa @ f_xa.T)
+        s_f_ix_2, _ = np.linalg.eigh(f_ix @ f_ix.T)
+    else:
+        s_f_xa_2 = np.array(())
+        s_f_ix_2 = np.array(())
+    s_f_xa_2_norm = np.sqrt(np.max(s_f_xa_2, initial=0))
+    s_f_ix_2_norm = np.sqrt(np.max(s_f_ix_2, initial=0))
+
+    d1_diagnostic = np.max(
+        np.array([s_f_ia_2_norm, s_f_xa_2_norm, s_f_ix_2_norm]))
+
+    return t1_diagnostic, d1_diagnostic
diff --git a/src/openfermion/resource_estimates/molecule/pyscf_utils_test.py b/src/openfermion/resource_estimates/molecule/pyscf_utils_test.py
new file mode 100644
index 000000000..27a50f6ad
--- /dev/null
+++ b/src/openfermion/resource_estimates/molecule/pyscf_utils_test.py
@@ -0,0 +1,264 @@
+#coverage:ignore
+"""Test cases for pyscf_utils.py
+"""
+from os import path
+import unittest
+import pytest
+import numpy as np
+
+try:
+    from pyscf import gto, scf, cc
+    HAS_PYSCF = True
+except ModuleNotFoundError:
+    # resource_estimates depend on pyscf, which may not be installed
+    HAS_PYSCF = False
+
+if HAS_PYSCF:
+    from openfermion.resource_estimates import sf, df
+    from openfermion.resource_estimates.utils import QR, QI, QR2, QI2, power_two
+    from openfermion.resource_estimates.molecule import (load_casfile_to_pyscf,
+                                                         pyscf_to_cas, ccsd_t,
+                                                         stability,
+                                                         factorized_ccsd_t,
+                                                         open_shell_t1_d1)
+
+
+@pytest.mark.skipif(not HAS_PYSCF, reason='Not detecting `pyscf`.')
+class OpenFermionPyscfUtilsTest(unittest.TestCase):
+
+    def test_full_ccsd_t(self):
+        """ Test resource_estimates full CCSD(T) from h1/eri/ecore tensors
+            matches regular PySCF CCSD(T)
+        """
+
+        for scf_type in ['rhf', 'rohf']:
+            mol = gto.Mole()
+            mol.atom = 'H 0 0 0; F 0 0 1.1'
+            mol.charge = 0
+            if scf_type == 'rhf':
+                mol.spin = 0
+            elif scf_type == 'rohf':
+                mol.spin = 2
+            mol.basis = 'ccpvtz'
+            mol.symmetry = False
+            mol.build()
+
+            if scf_type == 'rhf':
+                mf = scf.RHF(mol)
+            elif scf_type == 'rohf':
+                mf = scf.ROHF(mol)
+
+            mf.init_guess = 'mindo'
+            mf.conv_tol = 1e-10
+            mf.kernel()
+            mf = stability(mf)
+
+            # Do PySCF CCSD(T)
+            mycc = cc.CCSD(mf)
+            mycc.max_cycle = 500
+            mycc.conv_tol = 1E-9
+            mycc.conv_tol_normt = 1E-5
+            mycc.diis_space = 24
+            mycc.diis_start_cycle = 4
+            mycc.kernel()
+            et = mycc.ccsd_t()
+
+            pyscf_escf = mf.e_tot
+            pyscf_ecor = mycc.e_corr + et
+            pyscf_etot = pyscf_escf + pyscf_ecor
+            pyscf_results = np.array([pyscf_escf, pyscf_ecor, pyscf_etot])
+
+            n_elec = mol.nelectron
+            n_orb = mf.mo_coeff[0].shape[-1]
+
+            resource_estimates_results = ccsd_t(
+                *pyscf_to_cas(mf, n_orb, n_elec))
+            resource_estimates_results = np.asarray(resource_estimates_results)
+
+            # ignore relative tolerance, we just want absolute tolerance
+            assert np.allclose(pyscf_results,
+                               resource_estimates_results,
+                               rtol=1E-14)
+
+    def test_reduced_ccsd_t(self):
+        """ Test resource_estimates reduced (2e space) CCSD(T) from tensors
+            matches PySCF CAS(2e,No)
+        """
+
+        for scf_type in ['rhf', 'rohf']:
+            mol = gto.Mole()
+            mol.atom = 'H 0 0 0; F 0 0 1.1'
+            mol.charge = 0
+            if scf_type == 'rhf':
+                mol.spin = 0
+            elif scf_type == 'rohf':
+                mol.spin = 2
+            mol.basis = 'ccpvtz'
+            mol.symmetry = False
+            mol.build()
+
+            if scf_type == 'rhf':
+                mf = scf.RHF(mol)
+            elif scf_type == 'rohf':
+                mf = scf.ROHF(mol)
+
+            mf.init_guess = 'mindo'
+            mf.conv_tol = 1e-10
+            mf.kernel()
+            mf = stability(mf)
+
+            # PySCF CAS(No,2e) for 2 electrons CCSD (and so CCSD(T)) is exact
+            n_elec = 2  # electrons
+            n_orb = mf.mo_coeff[0].shape[-1] - mf.mol.nelectron - n_elec
+            mycas = mf.CASCI(n_orb, n_elec).run()
+
+            pyscf_etot = mycas.e_tot
+
+            # Don't do triples (it's zero anyway for 2e) b/c div by zero w/ ROHF
+            _, _, resource_estimates_etot = ccsd_t(
+                *pyscf_to_cas(mf, n_orb, n_elec), no_triples=True)
+
+            # ignore relative tolerance, we just want absolute tolerance
+            assert np.isclose(pyscf_etot, resource_estimates_etot, rtol=1E-14)
+
+    def test_reiher_sf_ccsd_t(self):
+        """ Reproduce Reiher et al FeMoco SF CCSD(T) errors from paper """
+
+        NAME = path.join(path.dirname(__file__), '../integrals/eri_reiher.h5')
+        _, mf = load_casfile_to_pyscf(NAME, num_alpha=27, num_beta=27)
+        _, ecorr, _ = factorized_ccsd_t(
+            mf, eri_rr=None)  # use full (local) ERIs for 2-body
+        exact_energy = ecorr
+        rank = 100
+        eri_rr, _ = sf.factorize(mf._eri, rank)
+        _, ecorr, _ = factorized_ccsd_t(mf, eri_rr)
+        appx_energy = ecorr
+
+        error = (appx_energy - exact_energy) * 1E3  # mEh
+
+        assert np.isclose(np.round(error, decimals=2), 1.55)
+
+    def test_reiher_df_ccsd_t(self):
+        """ Reproduce Reiher et al FeMoco DF CCSD(T) errors from paper """
+
+        NAME = path.join(path.dirname(__file__), '../integrals/eri_reiher.h5')
+        _, mf = load_casfile_to_pyscf(NAME, num_alpha=27, num_beta=27)
+        _, ecorr, _ = factorized_ccsd_t(
+            mf, eri_rr=None)  # use full (local) ERIs for 2-body
+        exact_energy = ecorr
+        appx_energy = []
+        THRESH = 0.00125
+        eri_rr, _, _, _ = df.factorize(mf._eri, thresh=THRESH)
+        _, ecorr, _ = factorized_ccsd_t(mf, eri_rr)
+        appx_energy = ecorr
+
+        error = (appx_energy - exact_energy) * 1E3  # mEh
+
+        assert np.isclose(np.round(error, decimals=2), 0.44)
+
+    def test_t1_d1_openshell(self):
+        """Test open shell t1-diagnostic by reducing back to closed shell"""
+        mol = gto.M()
+        mol.atom = 'N 0 0 0; N 0 0 1.4'
+        mol.basis = 'cc-pvtz'
+        mol.spin = 0
+        mol.build()
+
+        mf = scf.RHF(mol)
+        mf.kernel()
+
+        mycc = cc.CCSD(mf)
+        mycc.kernel()
+
+        true_t1d, true_d1d = mycc.get_t1_diagnostic(), mycc.get_d1_diagnostic()
+
+        uhf_mf = scf.convert_to_uhf(mf)
+        mycc_uhf = cc.CCSD(uhf_mf)
+        mycc_uhf.kernel()
+        t1a, t1b = mycc_uhf.t1
+        test_t1d, test_d1d = open_shell_t1_d1(
+            t1a, t1b, uhf_mf.mo_occ[0] + uhf_mf.mo_occ[1], uhf_mf.nelec[0],
+            uhf_mf.nelec[1])
+
+        assert np.isclose(test_t1d, true_t1d)
+        assert np.isclose(test_d1d, true_d1d)
+        assert np.sqrt(2) * test_t1d <= test_d1d
+
+    def test_t1_d1_oxygen(self):
+        """Test open shell t1-diagnostic on O2 molecule
+
+        Compare with output from Psi4
+
+        * Input:
+
+        molecule oxygen {
+          0 3
+          O 0.0 0.0 0.0
+          O 0.0 0.0 1.1
+          no_reorient
+          symmetry c1
+        }
+
+        set {
+          reference rohf
+          basis cc-pvtz
+        }
+
+        energy('CCSD')
+
+        * Output
+        @ROHF Final Energy:  -149.65170765644311
+
+                       Solving CC Amplitude Equations
+                        ------------------------------
+        Iter     Energy           RMS        T1Diag      D1Diag    New D1Diag
+        ----     ----------    ---------   ----------  ----------  ----------
+        ...
+        10        -0.464506    1.907e-07    0.004390    0.009077    0.009077
+        11        -0.464506    5.104e-08    0.004390    0.009077    0.009077
+
+        Iterations converged.
+        """
+
+        mol = gto.M()
+        mol.atom = 'O 0 0 0; O 0 0 1.1'
+        mol.basis = 'cc-pvtz'
+        mol.spin = 2
+        mol.build()
+
+        mf = scf.ROHF(mol)
+        mf.kernel()
+
+        uhf_mf = scf.convert_to_uhf(mf)
+        mycc_uhf = cc.CCSD(mf)
+        mycc_uhf.kernel()
+
+        t1a, t1b = mycc_uhf.t1
+        test_t1d, test_d1d = open_shell_t1_d1(
+            t1a, t1b, uhf_mf.mo_occ[0] + uhf_mf.mo_occ[1], uhf_mf.nelec[0],
+            uhf_mf.nelec[1])
+
+        assert np.isclose(mf.e_tot, -149.651708, atol=1e-6)
+        assert np.isclose(mycc_uhf.e_corr, -0.464507, atol=1e-6)
+        assert np.isclose(test_t1d, 0.004390, atol=1e-4)
+        assert np.isclose(test_d1d, 0.009077, atol=1e-4)
+
+    def test_t1_d1_bound(self):
+        """sqrt(2) * t1 <= d1"""
+        mol = gto.M()
+        mol.atom = 'O 0 0 0; O 0 0 1.4'
+        mol.basis = 'cc-pvtz'
+        mol.spin = 2
+        mol.build()
+        mf = scf.ROHF(mol)
+        mf.kernel()
+        mycc = cc.CCSD(mf)
+        mycc.kernel()
+        uhf_mf = scf.convert_to_uhf(mf)
+        mycc_uhf = cc.CCSD(uhf_mf)
+        mycc_uhf.kernel()
+        t1a, t1b = mycc_uhf.t1
+        test_t1d, test_d1d = open_shell_t1_d1(
+            t1a, t1b, uhf_mf.mo_occ[0] + uhf_mf.mo_occ[1], uhf_mf.nelec[0],
+            uhf_mf.nelec[1])
+        assert np.sqrt(2) * test_t1d <= test_d1d
diff --git a/src/openfermion/resource_estimates/sf/__init__.py b/src/openfermion/resource_estimates/sf/__init__.py
new file mode 100644
index 000000000..09185450d
--- /dev/null
+++ b/src/openfermion/resource_estimates/sf/__init__.py
@@ -0,0 +1,19 @@
+#coverage:ignore
+#   Copyright 2020 Google LLC
+
+#   Licensed under the Apache License, Version 2.0 (the "License");
+#   you may not use this file except in compliance with the License.
+#   You may obtain a copy of the License at
+#
+#       http://www.apache.org/licenses/LICENSE-2.0
+#
+#   Unless required by applicable law or agreed to in writing, software
+#   distributed under the License is distributed on an "AS IS" BASIS,
+#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#   See the License for the specific language governing permissions and
+#   limitations under the License.
+
+from .compute_lambda_sf import compute_lambda
+from .compute_cost_sf import compute_cost
+from .factorize_sf import factorize
+from .generate_costing_table_sf import generate_costing_table
diff --git a/src/openfermion/resource_estimates/sf/compute_cost_sf.py b/src/openfermion/resource_estimates/sf/compute_cost_sf.py
new file mode 100644
index 000000000..f1686a2d5
--- /dev/null
+++ b/src/openfermion/resource_estimates/sf/compute_cost_sf.py
@@ -0,0 +1,183 @@
+#coverage:ignore
+""" Determine costs for SF decomposition in QC """
+from typing import Tuple
+import numpy as np
+from numpy.lib.scimath import arccos, arcsin  # has analytic continutn to cplx
+from openfermion.resource_estimates.utils import QR, QI, QR2, power_two
+
+
+def compute_cost(n: int,
+                 lam: float,
+                 dE: float,
+                 L: int,
+                 chi: int,
+                 stps: int,
+                 verbose: bool = False) -> Tuple[int, int, int]:
+    """ Determine fault-tolerant costs using SF decomposition in quantum chem
+
+    Args:
+        n (int) - the number of spin-orbitals
+        lam (float) - the lambda-value for the Hamiltonian
+        dE (float) - allowable error in phase estimation
+        L (int) - the rank of the first decomposition
+        chi (int) - equivalent to aleph_1 and aleph_2 in the document, the
+            number of bits for the representation of the coefficients
+        stps (int) - an approximate number of steps to choose the precision of
+            single qubit rotations in preparation of the equal superposn state
+        verbose (bool) - do additional printing of intermediates?
+
+    Returns:
+        step_cost (int) - Toffolis per step
+        total_cost (int) - Total number of Toffolis
+        total_qubit_count (int) - Total qubit count
+    """
+
+    # Number of trailing zeros
+    eta = power_two(L + 1)
+
+    # Number of qubits for the first register
+    nL = np.ceil(np.log2(L + 1))
+
+    # Number of qubits for p and q registers
+    nN = np.ceil(np.log2(n // 2))
+
+    oh = [0] * 20
+    for p in range(20):
+        # JJG note: arccos arg may be > 1
+        v = np.round(np.power(2,p+1) / (2 * np.pi) * arccos(np.power(2,nL) /\
+            np.sqrt((L + 1)/2**eta)/2))
+        oh[p] = np.real(stps * (1 / (np.sin(3 * arcsin(np.cos(v * 2 * np.pi / \
+            np.power(2,p+1)) * \
+            np.sqrt((L + 1)/2**eta) / np.power(2,nL)))**2) - 1) + 4 * (p + 1))
+
+    # Bits of precision for rotation
+    br = int(np.argmin(oh) + 1)
+
+    # The following costs are given in the list of steps on pages 43 and 44.
+
+    # Cost of preparing the equal superposition state on the 1st register in 1a
+    cost1a = 2 * (3 * nL - 3 * eta + 2 * br - 9)
+
+    # We have added two qubits
+    bL = nL + chi + 2
+
+    # QROM costs for first register preparation in step 1(b)
+    cost1b = QR(L + 1, bL)[-1] + QI(L + 1)[-1]
+
+    # The inequality test of cost chi in step 1(c) and the controlled swap with
+    # cost nL + 1 in step 1(d) (and their inversions).
+    cost1cd = 2 * (chi + nL + 1)
+
+    oh = [0] * 20
+    nprime = int(n**2 // 8 + n // 4)
+    for p in range(20):
+        v = np.round(
+            np.power(2, p + 1) / (2 * np.pi) *
+            arccos(np.power(2, 2 * nN) / np.sqrt(nprime) / 2))
+        oh[p] = np.real(20000 * (1 / (np.sin(3 * arcsin(np.cos(v * 2 * np.pi / \
+            np.power(2, p+1)) * \
+            np.sqrt(nprime) / np.power(2,2*nN)))**2) - 1) + 4 * (p + 1))
+
+    # Bits of precision for rotation for preparing the next equal
+    # superposition states
+    br = int(np.argmin(oh) + 1)
+
+    # Cost of preparing equal superposition of p and q registers in step 2(a).
+    cost2a = 4 * (6 * nN + 2 * br - 7)
+
+    # Cost of computing contiguous register in step 2 (b).
+    cost2b = 4 * (nN**2 + nN - 1)
+
+    # Number of coefficients for first state preparation on p & q.
+    n1 = (L + 1) * nprime
+
+    # Number of coefficients for second state preparation on p & q.
+    n2 = L * nprime
+
+    # The output size for the QROM for the second state preparation.
+    bp = int(2 * nN + chi + 2)
+
+    # Cost of QROMs for state preparation on p and q in step 2 (c).
+    cost2c = QR2(L + 1, nprime, bp)[-1] + QI(n1)[-1] + QR2(L, nprime,
+                                                           bp)[-1] + QI(n2)[-1]
+
+    # The cost of the inequality test and controlled swap for the quantum alias
+    # sampling in steps 2 (d) and (e).
+    cost2de = 4 * (chi + 2 * nN)
+
+    # Swapping the p & q registers for symmetry in step 3.  This needs to be
+    # done and inverted twice, hence the factor of 4.
+    cost3 = 4 * nN
+
+    # The SELECT operation in step 4, which needs to be done twice.
+    cost4 = 4 * n - 8
+
+    # The reflection used on the 2nd register in the middle, with cost in step 6
+    cost6 = 2 * nN + chi + 3
+
+    # Step 7 involves performing steps 2 to 5 again, which are accounted for
+    # above, but there is one extra Toffoli for checking that l <> 0.
+    cost7 = 1
+
+    # The total reflection for the quantum walk in step 9.
+    cost9 = nL + 2 * nN + 2 * chi + 2
+
+    # The two Toffolis for the control for phase estimation and iteration,
+    # given in step 10.
+    cost10 = 2
+
+    # The number of steps.
+    iters = np.ceil(np.pi * lam / (dE * 2))
+
+    # The total Toffoli costs for a step.
+    cost = cost1a + cost1b + cost1cd + cost2a + cost2b + cost2c + cost2de + \
+           cost3 + cost4 + cost6 + cost7 + cost9 + cost10
+
+    # Control for phase estimation and its iteration
+    ac1 = 2 * np.ceil(np.log2(iters)) - 1
+
+    # System qubits
+    ac2 = n
+
+    # First ell register that rotated and the flag for success
+    ac3 = nL + 2
+
+    # State preparation on the first register
+    ac4 = nL + 2 * chi + 3
+
+    # For p and q registers, the rotated qubit and the success flag
+    ac5 = 2 * nN + 2
+
+    # The size of the contiguous register
+    ac6 = np.ceil(np.log2(nprime))
+
+    kp = QR2(L + 1, nprime, bp)[:2]
+
+    # The equal superposition state for the second state preparation
+    ac7 = chi
+
+    # The phase gradient state
+    ac8 = br
+
+    # The QROM on the p & q registers
+    ac9 = kp[0] * kp[1] * bp + np.ceil(np.log2(
+        (L + 1) / kp[0])) + np.ceil(np.log2(nprime / kp[1]))
+
+    if verbose:
+        print("[*] Top of routine")
+        print("  [+] eta = ", eta)
+        print("  [+] nL = ", nL)
+        print("  [+] nN = ", nN)
+
+    total_qubit_count = ac1 + ac2 + ac3 + ac4 + ac5 + ac6 + ac7 + ac8 + ac9
+
+    # Sanity checks before returning as int
+    assert cost.is_integer()
+    assert iters.is_integer()
+    assert total_qubit_count.is_integer()
+
+    step_cost = int(cost)
+    total_cost = int(cost * iters)
+    total_qubit_count = int(total_qubit_count)
+
+    return step_cost, total_cost, total_qubit_count
diff --git a/src/openfermion/resource_estimates/sf/compute_cost_sf_test.py b/src/openfermion/resource_estimates/sf/compute_cost_sf_test.py
new file mode 100644
index 000000000..ebd355106
--- /dev/null
+++ b/src/openfermion/resource_estimates/sf/compute_cost_sf_test.py
@@ -0,0 +1,41 @@
+#coverage:ignore
+"""Test cases for costing_sf.py
+"""
+from openfermion.resource_estimates import sf
+
+
+def test_reiher_sf():
+    """ Reproduce Reiher et al orbital SF FT costs from paper """
+    DE = 0.001
+    CHI = 10
+
+    # Reiher et al orbitals
+    N = 108
+    LAM = 4258.0
+    L = 200
+
+    # Here we're using an initial calculation with a very rough estimate of the
+    # number of steps to give a more accurate number of steps.
+    # Then we input that into the function again.
+    output = sf.compute_cost(N, LAM, DE, L, CHI, stps=20000)
+    stps1 = output[0]
+    output = sf.compute_cost(N, LAM, DE, L, CHI, stps1)
+    assert output == (14184, 94868988984, 3320)
+
+
+def test_li_sf():
+    """ Reproduce Li et al orbital SF FT costs from paper """
+    DE = 0.001
+    CHI = 10
+
+    # Li et al orbitals
+    N = 152
+    LAM = 3071.8
+    L = 275
+    # Here we're using an initial calculation with a very rough estimate of the
+    # number of steps to give a more accurate number of steps.
+    # Then we input that into the function again.
+    output = sf.compute_cost(N, LAM, DE, L, CHI, stps=20000)
+    stps2 = output[0]
+    output = sf.compute_cost(N, LAM, DE, L, CHI, stps2)
+    assert output == (24396, 117714920508, 3628)
diff --git a/src/openfermion/resource_estimates/sf/compute_lambda_sf.py b/src/openfermion/resource_estimates/sf/compute_lambda_sf.py
new file mode 100644
index 000000000..d0a46dc4c
--- /dev/null
+++ b/src/openfermion/resource_estimates/sf/compute_lambda_sf.py
@@ -0,0 +1,34 @@
+#coverage:ignore
+""" Compute lambda for single low rank factorization method of Berry, et al """
+
+import numpy as np
+from openfermion.resource_estimates.molecule import pyscf_to_cas
+
+
+def compute_lambda(pyscf_mf, sf_factors):
+    """ Compute lambda for Hamiltonian using SF method of Berry, et al.
+
+    Args:
+        pyscf_mf - PySCF mean field object
+        sf_factors (ndarray) - (N x N x rank) array of SF factors from rank
+            reduction of ERI
+
+    Returns:
+        lambda_tot (float) - lambda value for the single factorized Hamiltonian
+    """
+
+    # grab tensors from pyscf_mf object
+    h1, eri_full, _, _, _ = pyscf_to_cas(pyscf_mf)
+
+    # Effective one electron operator contribution
+    T = h1 - 0.5 * np.einsum("pqqs->ps", eri_full, optimize=True) +\
+        np.einsum("pqrr->pq", eri_full, optimize = True)
+
+    lambda_T = np.sum(np.abs(T))
+
+    # Two electron operator contributions
+    lambda_W = 0.25 * np.einsum(
+        "ijP,klP->", np.abs(sf_factors), np.abs(sf_factors), optimize=True)
+    lambda_tot = lambda_T + lambda_W
+
+    return lambda_tot
diff --git a/src/openfermion/resource_estimates/sf/compute_lambda_sf_test.py b/src/openfermion/resource_estimates/sf/compute_lambda_sf_test.py
new file mode 100644
index 000000000..6b9092345
--- /dev/null
+++ b/src/openfermion/resource_estimates/sf/compute_lambda_sf_test.py
@@ -0,0 +1,20 @@
+#coverage:ignore
+"""Test cases for compute_lambda_sf.py
+"""
+from os import path
+import numpy as np
+from openfermion.resource_estimates import sf
+from openfermion.resource_estimates.molecule import load_casfile_to_pyscf
+
+
+def test_reiher_sf_lambda():
+    """ Reproduce Reiher et al orbital SF lambda from paper """
+
+    RANK = 200
+
+    NAME = path.join(path.dirname(__file__), '../integrals/eri_reiher.h5')
+    _, reiher_mf = load_casfile_to_pyscf(NAME, num_alpha=27, num_beta=27)
+    eri_rr, sf_factors = sf.factorize(reiher_mf._eri, RANK)
+    lambda_tot = sf.compute_lambda(reiher_mf, sf_factors)
+    assert eri_rr.shape[0] * 2 == 108
+    assert np.isclose(lambda_tot, 4258.0)
diff --git a/src/openfermion/resource_estimates/sf/factorize_sf.py b/src/openfermion/resource_estimates/sf/factorize_sf.py
new file mode 100644
index 000000000..c4f7391b5
--- /dev/null
+++ b/src/openfermion/resource_estimates/sf/factorize_sf.py
@@ -0,0 +1,39 @@
+#coverage:ignore
+""" Single factorization of the ERI tensor """
+import numpy as np
+from openfermion.resource_estimates.utils import eigendecomp
+
+
+def factorize(eri_full, rank):
+    """ Do single factorization of the ERI tensor
+
+    Args:
+       eri_full (np.ndarray) - 4D (N x N x N x N) full ERI tensor
+       rank (int) - number of vectors to retain in ERI rank-reduction procedure
+
+    Returns:
+       eri_rr (np.ndarray) - 4D approximate ERI tensor reconstructed from LR vec
+       LR (np.ndarray) - 3D (N x N x rank) tensor containing SF vectors
+    """
+    n_orb = eri_full.shape[0]
+    assert n_orb**4 == len(eri_full.flatten())
+
+    L = eigendecomp(eri_full.reshape(n_orb**2, n_orb**2), tol=1e-16)
+
+    # Do rank-reduction of ERIs following ERI factorization
+    if rank is None:
+        LR = L[:, :]
+    else:
+        LR = L[:, :rank]
+    eri_rr = np.einsum('ik,kj->ij', LR, LR.T, optimize=True)
+    eri_rr = eri_rr.reshape(n_orb, n_orb, n_orb, n_orb)
+    LR = LR.reshape(n_orb, n_orb, -1)
+    if rank is not None:
+        try:
+            assert LR.shape[2] == rank
+        except AssertionError:
+            sys.exit(
+                "LR.shape:     %s\nrank: %s\nLR.shape and rank are inconsistent"
+                % (LR.shape, rank))
+
+    return eri_rr, LR
diff --git a/src/openfermion/resource_estimates/sf/generate_costing_table_sf.py b/src/openfermion/resource_estimates/sf/generate_costing_table_sf.py
new file mode 100644
index 000000000..7eadb94d1
--- /dev/null
+++ b/src/openfermion/resource_estimates/sf/generate_costing_table_sf.py
@@ -0,0 +1,126 @@
+#coverage:ignore
+""" Pretty-print a table comparing number of SF vectors versus acc and cost """
+import numpy as np
+from pyscf import scf
+from openfermion.resource_estimates import sf
+from openfermion.resource_estimates.molecule import (factorized_ccsd_t,
+                                                     cas_to_pyscf, pyscf_to_cas)
+
+
+def generate_costing_table(pyscf_mf,
+                           name='molecule',
+                           rank_range=range(50, 401, 25),
+                           chi=10,
+                           dE=0.001,
+                           use_kernel=True,
+                           no_triples=False):
+    """ Print a table to file for how various SF ranks impact cost, acc., etc.
+
+    Args:
+        pyscf_mf - PySCF mean field object
+        name (str) - file will be saved to 'single_factorization_<name>.txt'
+        rank_range (list of ints) - list number of vectors to retain in SF alg
+        dE (float) - max allowable phase error (default: 0.001)
+        chi (int) - number of bits for representation of coefficients
+           (default: 10)
+        use_kernel (bool) - re-do SCF prior to estimating CCSD(T) error?
+            Will use canonical orbitals and full ERIs for the one-body
+            contributions, using rank-reduced ERIs for two-body
+        no_triples (bool) - if True, skip the (T) correction, doing only CCSD
+
+    Returns:
+       None
+    """
+
+    DE = dE  # max allowable phase error
+    CHI = chi  # number of bits for representation of coefficients
+
+    if isinstance(pyscf_mf, scf.rohf.ROHF):
+        num_alpha, num_beta = pyscf_mf.nelec
+        assert num_alpha + num_beta == pyscf_mf.mol.nelectron
+    else:
+        assert pyscf_mf.mol.nelectron % 2 == 0
+        num_alpha = pyscf_mf.mol.nelectron // 2
+        num_beta = num_alpha
+
+    num_orb = len(pyscf_mf.mo_coeff)
+    num_spinorb = num_orb * 2
+
+    cas_info = "CAS((%sa, %sb), %so)" % (num_alpha, num_beta, num_orb)
+
+    try:
+        assert num_orb**4 == len(pyscf_mf._eri.flatten())
+    except AssertionError:
+        # ERIs are not in correct form in pyscf_mf._eri, so this is a quick prep
+        _, pyscf_mf = cas_to_pyscf(*pyscf_to_cas(pyscf_mf))
+
+    # Reference calculation (eri_rr = None is full rank / exact ERIs)
+    escf, ecor, etot = factorized_ccsd_t(pyscf_mf,
+                                         eri_rr=None,
+                                         use_kernel=use_kernel,
+                                         no_triples=no_triples)
+
+    exact_etot = etot
+
+    filename = 'single_factorization_' + name + '.txt'
+
+    with open(filename, 'w') as f:
+        print("\n Single low rank factorization data for '" + name + "'.",
+              file=f)
+        print("    [*] using " + cas_info, file=f)
+        print("        [+]                      E(SCF): %18.8f" % escf, file=f)
+        if no_triples:
+            print("        [+]    Active space CCSD E(cor): %18.8f" % ecor,
+                  file=f)
+            print("        [+]    Active space CCSD E(tot): %18.8f" % etot,
+                  file=f)
+        else:
+            print("        [+] Active space CCSD(T) E(cor): %18.8f" % ecor,
+                  file=f)
+            print("        [+] Active space CCSD(T) E(tot): %18.8f" % etot,
+                  file=f)
+        print("{}".format('=' * 108), file=f)
+        if no_triples:
+            print("{:^12} {:^18} {:^12} {:^24} {:^20} {:^20}".format('L',\
+            '||ERI - SF||','lambda','CCSD error (mEh)',\
+            'logical qubits', 'Toffoli count'),file=f)
+        else:
+            print("{:^12} {:^18} {:^12} {:^24} {:^20} {:^20}".format('L',\
+            '||ERI - SF||','lambda','CCSD(T) error (mEh)',\
+            'logical qubits', 'Toffoli count'),file=f)
+        print("{}".format('-' * 108), file=f)
+    for rank in rank_range:
+        # First, up: lambda and CCSD(T)
+        eri_rr, LR = sf.factorize(pyscf_mf._eri, rank)
+        lam = sf.compute_lambda(pyscf_mf, LR)
+        escf, ecor, etot = factorized_ccsd_t(pyscf_mf,
+                                             eri_rr,
+                                             use_kernel=use_kernel,
+                                             no_triples=no_triples)
+        error = (etot - exact_etot) * 1E3  # to mEh
+        l2_norm_error_eri = np.linalg.norm(
+            eri_rr - pyscf_mf._eri)  # eri reconstruction error
+
+        # now do costing
+        stps1 = sf.compute_cost(num_spinorb,
+                                lam,
+                                DE,
+                                L=rank,
+                                chi=CHI,
+                                stps=20000)[0]
+
+        _, sf_total_cost, sf_logical_qubits = sf.compute_cost(num_spinorb,
+                                                              lam,
+                                                              DE,
+                                                              L=rank,
+                                                              chi=CHI,
+                                                              stps=stps1)
+
+        with open(filename, 'a') as f:
+            print(
+                "{:^12} {:^18.4e} {:^12.1f} {:^24.2f} {:^20} {:^20.1e}".format(
+                    rank, l2_norm_error_eri, lam, error, sf_logical_qubits,
+                    sf_total_cost),
+                file=f)
+    with open(filename, 'a') as f:
+        print("{}".format('=' * 108), file=f)
diff --git a/src/openfermion/resource_estimates/sparse/__init__.py b/src/openfermion/resource_estimates/sparse/__init__.py
new file mode 100644
index 000000000..09839684a
--- /dev/null
+++ b/src/openfermion/resource_estimates/sparse/__init__.py
@@ -0,0 +1,14 @@
+#coverage:ignore
+#   Copyright 2020 Google LLC
+
+#   Licensed under the Apache License, Version 2.0 (the "License");
+#   you may not use this file except in compliance with the License.
+#   You may obtain a copy of the License at
+#
+#       http://www.apache.org/licenses/LICENSE-2.0
+#
+#   Unless required by applicable law or agreed to in writing, software
+#   distributed under the License is distributed on an "AS IS" BASIS,
+#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#   See the License for the specific language governing permissions and
+#   limitations under the License.
diff --git a/src/openfermion/resource_estimates/sparse/costing_sparse.py b/src/openfermion/resource_estimates/sparse/costing_sparse.py
new file mode 100644
index 000000000..767640c7d
--- /dev/null
+++ b/src/openfermion/resource_estimates/sparse/costing_sparse.py
@@ -0,0 +1,106 @@
+#coverage:ignore
+""" Determine costs for sparse decomposition in QC
+    Note this is WIP
+"""
+from typing import Tuple
+import numpy as np
+from numpy.lib.scimath import arccos, arcsin  # has analytc continuation to cplx
+from openfermion.resource_estimates.utils import QI, power_two
+
+
+def cost_sparse(n: int, lam: float, d: int, dE: float, chi: int,
+                stps: int) -> Tuple[int, int, int]:
+    """ Determine fault-tolerant costs using sparse decomposition in quantum
+        chemistry
+
+    Args:
+        n (int) - the number of spin-orbitals
+        lam (float) - the lambda-value for the Hamiltonian
+        dE (float) - allowable error in phase estimation
+        L (int) - the rank of the first decomposition
+        Lxi (int) - the total number of eigenvectors
+        chi (int) - equivalent to aleph_1 and aleph_2 in the document, the
+            number of bits for the representation of the coefficients
+        beta (int) - equivalent to beth in the document, the number of bits
+            for the rotations
+        stps (int) - an approximate number of steps to choose the precision
+            of single qubit rotations in preparation of the equal superposition
+            state
+
+    Returns:
+        step_cost (int) - Toffolis per step
+        total_cost (int) - Total number of Toffolis
+        ancilla_cost (int) - Total ancilla cost
+    """
+
+    # I think there is a bug in the mathematica notebook. It does not check if
+    # 2 is a factor first, which it should, cf. the similar function in
+    # costingdf.nb Below is correct using the power_two() function, to give
+    # power of 2 that is a factor of d.
+    eta = power_two(d)
+
+    nN = np.ceil(np.log2(n // 2))
+
+    m = chi + 8 * nN + 4  # Eq (A13)
+
+    oh = [0] * 20
+
+    nM = (np.ceil(np.log2(d)) - eta) / 2
+
+    for p in range(2, 22):
+        # JJG note: arccos arg may be > 1
+        v = np.round(np.power(2,p+1) / (2 * np.pi) * arccos(np.power(2,nM) /\
+            np.sqrt(d/2**eta)/2))
+        oh[p-2] = np.real(stps * (1 / (np.sin(3 * arcsin(np.cos(v * 2 * np.pi /\
+            np.power(2,p+1)) * \
+            np.sqrt(d/2**eta) / np.power(2,nM)))**2) - 1) + 4 * (p + 1))
+
+    # Bits of precision for rotation
+    br = int(np.argmin(oh) + 1) + 2
+
+    # Hand selecting the k expansion factor
+    k1 = 32
+
+    # Equation (A17)
+    cost = np.ceil(d/k1) + m * (k1 -1) + QI(d)[1] + 4 * n + 8 * nN + 2 * chi + \
+        7 * np.ceil(np.log2(d)) - 6 * eta + 4 * br - 19
+
+    # Number of iterations needed for the phase estimation.
+    iters = np.ceil(np.pi * lam / (dE * 2))
+
+    # The following are the number of qubits from the listing on page 39.
+
+    # Control registers for phase estimation and iteration on them.
+    ac1 = 2 * np.ceil(np.log2(iters)) - 1
+
+    # System qubits
+    ac2 = n
+
+    # The register used for the QROM
+    ac3 = np.ceil(np.log2(d))
+
+    # The qubit used for flagging the success of the equal superposition state
+    # preparation and the ancilla qubit for rotation
+    ac45 = 2
+
+    # The phase gradient state
+    ac6 = br
+
+    # The equal superposition state for coherent alias sampling
+    ac7 = chi
+
+    # The ancillas used for QROM
+    ac8 = np.ceil(np.log2(d / k1)) + m * k1
+
+    ancilla_cost = ac1 + ac2 + ac3 + ac45 + ac6 + ac7 + ac8
+
+    # Sanity checks before returning as int
+    assert cost.is_integer()
+    assert iters.is_integer()
+    assert ancilla_cost.is_integer()
+
+    step_cost = int(cost)
+    total_cost = int(cost * iters)
+    ancilla_cost = int(ancilla_cost)
+
+    return step_cost, total_cost, ancilla_cost
diff --git a/src/openfermion/resource_estimates/sparse/costing_sparse_test.py b/src/openfermion/resource_estimates/sparse/costing_sparse_test.py
new file mode 100644
index 000000000..33c2adffd
--- /dev/null
+++ b/src/openfermion/resource_estimates/sparse/costing_sparse_test.py
@@ -0,0 +1,42 @@
+#coverage:ignore
+"""Test cases for costing_sparse.py
+"""
+from openfermion.resource_estimates.sparse.costing_sparse import cost_sparse
+
+
+def test_reiher_sparse():
+    """ Reproduce Reiher et al orbital sparse FT costs from paper """
+    DE = 0.001
+    CHI = 10
+
+    # Reiher et al orbitals
+    N = 108
+    LAM = 2135.3
+    D = 705831
+
+    # Here we're using an initial calculation with a very rough estimate of the
+    # number of steps to give a more accurate number of steps. Then we input
+    # that into the function again.
+    output = cost_sparse(N, LAM, D, DE, CHI, stps=20000)
+    stps1 = output[0]
+    output = cost_sparse(N, LAM, D, DE, CHI, stps1)
+    assert output == (26347, 88371052334, 2190)
+
+
+def test_li_sparse():
+    """ Reproduce Li et al orbital sparse FT costs from paper """
+    DE = 0.001
+    CHI = 10
+
+    # Li et al orbitals
+    N = 152
+    LAM = 1547.3
+    D = 440501
+
+    # Here we're using an initial calculation with a very rough estimate of the
+    # number of steps to give a more accurate number of steps. Then we input
+    # that into the function again.
+    output = cost_sparse(N, LAM, D, DE, CHI, stps=20000)
+    stps2 = output[0]
+    output = cost_sparse(N, LAM, D, DE, CHI, stps2)
+    assert output == (18143, 44096452642, 2489)
diff --git a/src/openfermion/resource_estimates/surface_code_compilation/__init__.py b/src/openfermion/resource_estimates/surface_code_compilation/__init__.py
new file mode 100644
index 000000000..e69de29bb
diff --git a/src/openfermion/resource_estimates/surface_code_compilation/physical_costing.py b/src/openfermion/resource_estimates/surface_code_compilation/physical_costing.py
new file mode 100644
index 000000000..f74751209
--- /dev/null
+++ b/src/openfermion/resource_estimates/surface_code_compilation/physical_costing.py
@@ -0,0 +1,207 @@
+#coverage:ignore
+import dataclasses
+import datetime
+import math
+from typing import Tuple, Iterator
+
+
+@dataclasses.dataclass(frozen=True, unsafe_hash=True)
+class MagicStateFactory:
+    details: str
+    physical_qubit_footprint: int
+    rounds: int
+    failure_rate: int
+
+
+@dataclasses.dataclass(frozen=True, unsafe_hash=True)
+class CostEstimate:
+    physical_qubit_count: int
+    duration: datetime.timedelta
+    algorithm_failure_probability: float
+
+
+@dataclasses.dataclass(frozen=True, unsafe_hash=True)
+class AlgorithmParameters:
+    physical_error_rate: float
+    surface_code_cycle_time: datetime.timedelta
+    logical_data_qubit_distance: int
+    magic_state_factory: MagicStateFactory
+    toffoli_count: int
+    max_allocated_logical_qubits: int
+    factory_count: int
+    routing_overhead_proportion: float
+    proportion_of_bounding_box: float = 1
+
+    def estimate_cost(self) -> CostEstimate:
+        """Determine algorithm single-shot layout and costs for given params.
+
+        ASSUMES:
+            - There is enough routing area to get needed data qubits to the
+                factories as fast as the factories run.
+            - The bottleneck time cost is waiting for magic states.
+        """
+        logical_storage = int(
+            math.ceil(self.max_allocated_logical_qubits *
+                      (1 + self.routing_overhead_proportion)))
+        storage_area = logical_storage * _physical_qubits_per_logical_qubit(
+            self.logical_data_qubit_distance)
+        distillation_area = self.factory_count \
+                          * self.magic_state_factory.physical_qubit_footprint
+        rounds = int(self.toffoli_count / self.factory_count *
+                     self.magic_state_factory.rounds)
+        distillation_failure = self.magic_state_factory.failure_rate \
+                             * self.toffoli_count
+        data_failure = self.proportion_of_bounding_box \
+                     * _topological_error_per_unit_cell(
+            self.logical_data_qubit_distance,
+            gate_err=self.physical_error_rate) * logical_storage * rounds
+
+        return CostEstimate(physical_qubit_count=storage_area +
+                            distillation_area,
+                            duration=rounds * self.surface_code_cycle_time,
+                            algorithm_failure_probability=min(
+                                1., data_failure + distillation_failure))
+
+
+def _topological_error_per_unit_cell(code_distance: int,
+                                     gate_err: float) -> float:
+    return 0.1 * (100 * gate_err)**((code_distance + 1) / 2)
+
+
+def _total_topological_error(code_distance: int, gate_err: float,
+                             unit_cells: int) -> float:
+    return unit_cells * _topological_error_per_unit_cell(
+        code_distance, gate_err)
+
+
+def iter_known_factories(physical_error_rate: float
+                        ) -> Iterator[MagicStateFactory]:
+    if physical_error_rate == 0.001:
+        yield _two_level_t_state_factory_1p1000(
+            physical_error_rate=physical_error_rate)
+    yield from iter_auto_ccz_factories(physical_error_rate)
+
+
+def _two_level_t_state_factory_1p1000(physical_error_rate: float
+                                     ) -> MagicStateFactory:
+    assert physical_error_rate == 0.001
+    return MagicStateFactory(
+        details="https://arxiv.org/abs/1808.06709",
+        failure_rate=4 * 9 * 10**-17,
+        physical_qubit_footprint=(12 * 8) * (4) *
+        _physical_qubits_per_logical_qubit(31),
+        rounds=6 * 31,
+    )
+
+
+def _autoccz_factory_dimensions(l1_distance: int,
+                                l2_distance: int) -> Tuple[int, int, float]:
+    """Determine the width, height, depth of the magic state factory."""
+    t1_height = 4 * l1_distance / l2_distance
+    t1_width = 8 * l1_distance / l2_distance
+    t1_depth = 5.75 * l1_distance / l2_distance
+
+    ccz_depth = 5
+    ccz_height = 6
+    ccz_width = 3
+    storage_width = 2 * l1_distance / l2_distance
+
+    ccz_rate = 1 / ccz_depth
+    t1_rate = 1 / t1_depth
+    t1_factories = int(math.ceil((ccz_rate * 8) / t1_rate))
+    t1_factory_column_height = t1_height * math.ceil(t1_factories / 2)
+
+    width = int(math.ceil(t1_width * 2 + ccz_width + storage_width))
+    height = int(math.ceil(max(ccz_height, t1_factory_column_height)))
+    depth = max(ccz_depth, t1_depth)
+
+    return width, height, depth
+
+
+def iter_auto_ccz_factories(physical_error_rate: float
+                           ) -> Iterator[MagicStateFactory]:
+    for l1_distance in range(5, 25, 2):
+        for l2_distance in range(l1_distance + 2, 41, 2):
+            w, h, d = _autoccz_factory_dimensions(l1_distance=l1_distance,
+                                                  l2_distance=l2_distance)
+            f = _compute_autoccz_distillation_error(
+                l1_distance=l1_distance,
+                l2_distance=l2_distance,
+                physical_error_rate=physical_error_rate)
+
+            yield MagicStateFactory(
+                details=
+                f"AutoCCZ({physical_error_rate=},{l1_distance=},{l2_distance=}",
+                physical_qubit_footprint=w * h *
+                _physical_qubits_per_logical_qubit(l2_distance),
+                rounds=d * l2_distance,
+                failure_rate=f,
+            )
+
+
+def _compute_autoccz_distillation_error(l1_distance: int, l2_distance: int,
+                                        physical_error_rate: float) -> float:
+    # Level 0
+    L0_distance = l1_distance // 2
+    L0_distillation_error = physical_error_rate
+    L0_topological_error = _total_topological_error(
+        unit_cells=100,  # Estimated 100 for T injection.
+        code_distance=L0_distance,
+        gate_err=physical_error_rate)
+    L0_total_T_error = L0_distillation_error + L0_topological_error
+
+    # Level 1
+    L1_topological_error = _total_topological_error(
+        unit_cells=1100,  # Estimated 1000 for factory, 100 for T injection.
+        code_distance=l1_distance,
+        gate_err=physical_error_rate)
+    L1_distillation_error = 35 * L0_total_T_error**3
+    L1_total_T_error = L1_distillation_error + L1_topological_error
+
+    # Level 2
+    L2_topological_error = _total_topological_error(
+        unit_cells=1000,  # Estimated 1000 for factory.
+        code_distance=l2_distance,
+        gate_err=physical_error_rate)
+    L2_distillation_error = 28 * L1_total_T_error**2
+    L2_total_CCZ_or_2T_error = L2_topological_error + L2_distillation_error
+
+    return L2_total_CCZ_or_2T_error
+
+
+def _physical_qubits_per_logical_qubit(code_distance: int) -> int:
+    return (code_distance + 1)**2 * 2
+
+
+def cost_estimator(num_logical_qubits,
+                   num_toffoli,
+                   physical_error_rate=1.0E-3,
+                   portion_of_bounding_box=1.):
+    """
+    Produce best cost in terms of physical qubits and real run time based on
+    number of toffoli, number of logical qubits, and physical error rate.
+
+    """
+    best_cost = None
+    best_params = None
+    for factory in iter_known_factories(
+            physical_error_rate=physical_error_rate):
+        for logical_data_qubit_distance in range(7, 35, 2):
+            params = AlgorithmParameters(
+                physical_error_rate=physical_error_rate,
+                surface_code_cycle_time=datetime.timedelta(microseconds=1),
+                logical_data_qubit_distance=logical_data_qubit_distance,
+                magic_state_factory=factory,
+                toffoli_count=num_toffoli,
+                max_allocated_logical_qubits=num_logical_qubits,
+                factory_count=4,
+                routing_overhead_proportion=0.5,
+                proportion_of_bounding_box=portion_of_bounding_box)
+            cost = params.estimate_cost()
+            if cost.algorithm_failure_probability > 0.1:
+                continue
+            if best_cost is None or cost.physical_qubit_count * cost.duration \
+                < best_cost.physical_qubit_count * best_cost.duration:
+                best_cost = cost
+                best_params = params
+    return best_cost, best_params
diff --git a/src/openfermion/resource_estimates/thc/__init__.py b/src/openfermion/resource_estimates/thc/__init__.py
new file mode 100644
index 000000000..da686a035
--- /dev/null
+++ b/src/openfermion/resource_estimates/thc/__init__.py
@@ -0,0 +1,25 @@
+#coverage:ignore
+#   Copyright 2020 Google LLC
+
+#   Licensed under the Apache License, Version 2.0 (the "License");
+#   you may not use this file except in compliance with the License.
+#   You may obtain a copy of the License at
+#
+#       http://www.apache.org/licenses/LICENSE-2.0
+#
+#   Unless required by applicable law or agreed to in writing, software
+#   distributed under the License is distributed on an "AS IS" BASIS,
+#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#   See the License for the specific language governing permissions and
+#   limitations under the License.
+import pytest
+
+try:
+    import jax, pybtas
+    from .spacetime import qubit_vs_toffoli
+    from .compute_lambda_thc import compute_lambda
+    from .compute_cost_thc import compute_cost
+    from .factorize_thc import thc_via_cp3 as factorize
+    from .generate_costing_table_thc import generate_costing_table
+except ImportError:
+    pytest.skip('Need jax and pybtas for THC', allow_module_level=True)
diff --git a/src/openfermion/resource_estimates/thc/compute_cost_thc.py b/src/openfermion/resource_estimates/thc/compute_cost_thc.py
new file mode 100644
index 000000000..23df1a929
--- /dev/null
+++ b/src/openfermion/resource_estimates/thc/compute_cost_thc.py
@@ -0,0 +1,225 @@
+#coverage:ignore
+""" Determine costs for THC decomposition in QC """
+from typing import Tuple
+import numpy as np
+from numpy.lib.scimath import arccos, arcsin  # has analytc continuatn to cplx
+from openfermion.resource_estimates.utils import QR, QI
+
+
+def compute_cost(n: int,
+                 lam: float,
+                 dE: float,
+                 chi: int,
+                 beta: int,
+                 M: int,
+                 stps: int,
+                 verbose: bool = False) -> Tuple[int, int, int]:
+    """ Determine fault-tolerant costs using THC decomposition in quantum chem
+
+    Args:
+        n (int) - the number of spin-orbitals
+        lam (float) - the lambda-value for the Hamiltonian
+        dE (float) - allowable error in phase estimation
+        chi (int) - equivalent to aleph in the document, the number of bits for
+            the representation of the coefficients
+        beta (int) - equivalent to beth in the document, the number of bits for
+            the rotations
+        M (int) - the dimension for the THC decomposition
+        stps (int) - an approximate number of steps to choose the precision of
+            single qubit rotations in preparation of the equal superpositn state
+
+    Returns:
+        step_cost (int) - Toffolis per step
+        total_cost (int) - Total number of Toffolis
+        ancilla_cost (int) - Total ancilla cost
+    """
+    # The number of bits used for each register.
+    nM = np.ceil(np.log2(M + 1))
+
+    # This is the number of distinct items of data we need to output.
+    d = M * (M + 1) // 2 + n // 2
+
+    # The number of bits used for the contiguous register.
+    nc = np.ceil(np.log2(d))
+
+    # The output size is 2*log[M] for the alt values, chi for the keep value,
+    # and 2 for the two sign bits, as in Eq. (29).
+    m = 2 * nM + 2 + chi
+    oh = [0] * 20
+    for p in range(20):
+        # arccos arg may be > 1
+        v = np.round(
+            np.power(2, p + 1) / (2 * np.pi) *
+            arccos(np.power(2, nM) / np.sqrt(d) / 2))
+        oh[p] = stps * (1 / (np.sin(3 * arcsin(np.cos(v * 2 * np.pi / \
+            np.power(2,p+1)) * \
+            np.sqrt(d) / np.power(2,nM)))**2) - 1) + 4 * (p + 1)
+
+    # Set it to be the number of bits that minimises the cost, usually 7.
+    # Python is 0-index, so need to add the one back in vs mathematica nb
+    br = np.argmin(oh) + 1
+
+    # This is the costing for preparing the equal superposition over the
+    # input registers from below Eq. (27).
+    cp1 = 2 * (10 * nM + 2 * br - 9)
+
+    # This is the cost of computing the contiguous register and inverting it.
+    # This is with a sophisticated scheme adding together triplets of bits.
+    # This is the cost of step 2 in the list on pages 15 and 16, with a factor
+    # of 2 to account for its inverse.
+    cp2 = 2 * (nM**2 + nM - 1)
+
+    # This is the cost of the QROM for the state preparation in step 3 and its
+    # inverse. Note: arg_min is first value, min is second value
+    cp3 = QR(d, m)[1] + QI(d)[1]
+
+    # The cost for the inequality test in step 4 and its inverse.
+    cp4 = 2 * chi
+
+    # The cost 2*nM for the controlled swap in step 5 and its inverse.
+    cp5 = 4 * nM
+
+    # Then there is a cost of nM + 1 for swapping the mu and nu registers in
+    # step 6, where the + 1 is because we need to control on two registers.
+    # There is the same cost for the inverse.
+    cp6 = 2 * nM + 2
+
+    # This is the total cost in Eq. (32).
+    CPCP = cp1 + cp2 + cp3 + cp4 + cp5 + cp6
+
+    # This is the cost of swapping based on the spin register. It is steps 1
+    # and 7 from the list of steps on pages 16 and 17.
+    cs1 = 2 * n
+
+    # The QROM for the rotation angles the first time in step 2. and the second
+    # part is the cost for inverting them with advanced QROM (step 6).
+    cs2a = M + n / 2 - 2
+
+    # The QROM for the rotation angles the second time. Here the cost M - 2 is
+    # for generating the angles the second time (again step 2), and QI[M] is
+    # for inverting the QROM (step 6).
+    cs2b = M - 2
+
+    # The cost of the rotations in steps 3 and 5, which must be done 4 times.
+    cs3 = 4 * n * (beta - 2)
+
+    # Cost for making the Z doubly controlled in step 4, and a controlled swap
+    #  of the spin qubits. Note: is this really just "2"? Does this need a var?
+    cs4 = 2
+
+    # want the argument, so that is first element
+    k1 = np.power(2, QI(M + n // 2)[0])
+
+    # The cost for inverting the rotation angles the first time in step 6.
+    cs6a = np.ceil(M / k1) + np.ceil(n / 2 / k1) + k1
+
+    # The cost for inverting the rotation angles the second time in step 6.
+    # value is in second output argument of QI
+    cs6b = QI(M)[1]
+
+    # The total select cost in Eq. (42).
+    CS = cs1 + cs2a + cs2b + cs3 + cs4 + cs6a + cs6b
+
+    # The cost of the reflections used in Eq. (43).
+    costref = 2 * nM + chi + 4
+
+    # The total cost in Eq. (43).
+    cost = CPCP + CS + costref
+
+    # The cost of the control register and temporary registers given for part 1.
+    ac1 = lambda iters: 2 * np.ceil(np.log2(iters + 1)) - 1
+
+    # Qubits for the system register in part 2.
+    ac2 = n
+
+    # The cost for the \[Mu] and \[Nu] registers in part 3.
+    ac3 = 2 * nM
+
+    # This is for the equal superposition state to perform the inequal
+    # test with the keep register.
+    ac4 = chi
+
+    # Some minor qubit counts in parts 5 to 9.
+    ac59 = 7
+
+    # The phase gradient state in part 10.
+    ac10 = beta
+
+    # The contiguous register in part 11
+    ac11 = nc
+
+    # we want arg_min so first argument
+    kt = np.power(2, QR(d, m))[0]
+
+    # The qubits used for the QROM in part 12, of which only m are persistent.
+    ac12 = m * kt + np.ceil(np.log2(d / kt))
+
+    # This is the data needed for the rotations in part 13.
+    ac13 = beta * n / 2
+
+    # These are the ancillas needed for adding into the phase gradient state
+    # in part 14.
+    ac14 = beta - 2
+
+    # These are the temporary ancillas in between erasing the first QROM
+    # ancillas and inverting that QROM. The + m is for output of the first QROM.
+    acc = ac13 + ac14 + m
+
+    # Total number of iterations
+    iters = np.ceil(np.pi * lam / (dE * 2))
+    aca = ac1(iters) + ac2 + ac3 + ac4 + ac59 + ac10 + ac11
+
+    if verbose:
+        print("[*] Top of routine")
+        print("  [+] nM = ", nM)
+        print("  [+] d = ", d)
+        print("  [+] nc = ", nc)
+        print("  [+] m = ", m)
+        print("  [+] oh = ", oh)
+        print("  [+] br = ", br)
+        print("  [+] cp1 = ", cp1)
+        print("  [+] cp2 = ", cp2)
+        print("    [*] QR[d,m] = ", QR(d, m))
+        print("    [*] QI[d] = ", QI(d))
+        print("  [+] cp3 = ", cp3)
+        print("  [+] cp4 = ", cp4)
+        print("  [+] cp5 = ", cp5)
+        print("  [+] cp6 = ", cp6)
+        print("  [+] CPCP = ", CPCP)
+        print("  [+] cs1 = ", cs1)
+        print("  [+] cs2a = ", cs2a)
+        print("  [+] cs2b = ", cs2b)
+        print("  [+] cs3 = ", cs3)
+        print("  [+] k1 = ", k1)
+        print("  [+] cs6a = ", cs6a)
+        print("  [+] cs6b = ", cs6b)
+        print("  [+] CS = ", CS)
+        print("  [+] costref = ", costref)
+        print("  [+] cost = ", cost)
+        print("  [+] ac1 = ", ac1)
+        print("  [+] ac2 = ", ac2)
+        print("  [+] ac3 = ", ac3)
+        print("  [+] ac4 = ", ac4)
+        print("  [+] ac10 = ", ac10)
+        print("  [+] ac11 = ", ac11)
+        print("  [+] kt = ", kt)
+        print("  [+] ac12 = ", ac12)
+        print("  [+] ac13 = ", ac13)
+        print("  [+] ac14 = ", ac14)
+        print("  [+] acc = ", acc)
+        print("  [+] iters = ", iters)
+        print("  [+] aca = ", aca)
+
+    # Sanity checks before returning as int
+    # assert cost.is_integer()
+    # assert iters.is_integer()
+    # assert aca.is_integer()
+    # assert ac12.is_integer()
+    # assert acc.is_integer()
+
+    step_cost = int(cost)
+    total_cost = int(cost * iters)
+    ancilla_cost = int(np.max([aca + ac12, aca + acc]))
+
+    # step-cost, Toffoli count, logical qubits
+    return step_cost, total_cost, ancilla_cost
diff --git a/src/openfermion/resource_estimates/thc/compute_cost_thc_test.py b/src/openfermion/resource_estimates/thc/compute_cost_thc_test.py
new file mode 100644
index 000000000..111ab4d2e
--- /dev/null
+++ b/src/openfermion/resource_estimates/thc/compute_cost_thc_test.py
@@ -0,0 +1,48 @@
+#coverage:ignore
+"""Test cases for costing_thc.py
+"""
+import unittest
+import pytest
+
+from openfermion.resource_estimates import thc
+
+
+class THCCostTest(unittest.TestCase):
+
+    def test_reiher_thc(self):
+        """ Reproduce Reiher et al orbital THC FT costs from paper """
+        DE = 0.001
+        CHI = 10
+
+        # Reiher et al orbitals
+        N = 108
+        LAM = 306.3
+        BETA = 16
+        THC_DIM = 350
+
+        # Here we're using an initial calculation with a very rough estimate of
+        # the number of steps to give a more accurate number of steps. Then we
+        # input that into the function again.
+        output = thc.compute_cost(N, LAM, DE, CHI, BETA, THC_DIM, stps=20000)
+        stps1 = output[0]
+        output = thc.compute_cost(N, LAM, DE, CHI, BETA, THC_DIM, stps1)
+        assert output == (10912, 5250145120, 2142)
+
+    def test_li_thc(self):
+        """ Reproduce Li et al orbital THC FT costs from paper """
+        DE = 0.001
+        CHI = 10
+
+        # Li et al orbitals
+        N = 152
+        LAM = 1201.5
+        BETA = 20
+        THC_DIM = 450
+
+        # Here we're using an initial calculation with a very rough estimate of
+        # the number of steps to give a more accurate number of steps. Then we
+        # input that into the function again.
+        output = thc.compute_cost(N, LAM, DE, CHI, BETA, THC_DIM, stps=20000)
+        stps2 = output[0]
+        output = thc.compute_cost(N, LAM, DE, CHI, BETA, THC_DIM, stps2)
+        assert output == (16923, 31938980976, 2196)
diff --git a/src/openfermion/resource_estimates/thc/compute_lambda_thc.py b/src/openfermion/resource_estimates/thc/compute_lambda_thc.py
new file mode 100644
index 000000000..16fa0ef8d
--- /dev/null
+++ b/src/openfermion/resource_estimates/thc/compute_lambda_thc.py
@@ -0,0 +1,78 @@
+#coverage:ignore
+"""
+Compute lambdas for THC according to
+PRX QUANTUM 2, 030305 (2021) Section II. D.
+"""
+import numpy as np
+from openfermion.resource_estimates.molecule import pyscf_to_cas
+
+
+def compute_lambda(pyscf_mf,
+                   etaPp: np.ndarray,
+                   MPQ: np.ndarray,
+                   use_eri_thc_for_t=False):
+    """
+    Compute lambda thc
+
+    Args:
+        pyscf_mf - PySCF mean field object
+        etaPp - leaf tensor for THC that is dim(nthc x norb).  The nthc and norb
+                is inferred from this quantity.
+        MPQ - central tensor for THC factorization. dim(nthc x nthc)
+
+    Returns:
+    """
+
+    nthc = etaPp.shape[0]
+
+    # grab tensors from pyscf_mf object
+    h1, eri_full, _, _, _ = pyscf_to_cas(pyscf_mf)
+
+    # computing Least-squares THC residual
+    CprP = np.einsum("Pp,Pr->prP", etaPp,
+                     etaPp)  # this is einsum('mp,mq->pqm', etaPp, etaPp)
+    BprQ = np.tensordot(CprP, MPQ, axes=([2], [0]))
+    Iapprox = np.tensordot(CprP, np.transpose(BprQ), axes=([2], [0]))
+    deri = eri_full - Iapprox
+    res = 0.5 * np.sum((deri)**2)
+
+    # NOTE: remove in future once we resolve why it was used in the first place.
+    # NOTE: see T construction for details.
+    eri_thc = np.einsum("Pp,Pr,Qq,Qs,PQ->prqs",
+                        etaPp,
+                        etaPp,
+                        etaPp,
+                        etaPp,
+                        MPQ,
+                        optimize=True)
+
+    # projecting into the THC basis requires each THC factor mu to be nrmlzd.
+    # we roll the normalization constant into the central tensor zeta
+    SPQ = etaPp.dot(
+        etaPp.T)  # (nthc x norb)  x (norb x nthc) -> (nthc  x nthc) metric
+    cP = np.diag(np.diag(
+        SPQ))  # grab diagonal elements. equivalent to np.diag(np.diagonal(SPQ))
+    # no sqrts because we have two normalized THC vectors (index by mu and nu)
+    # on each side.
+    MPQ_normalized = cP.dot(MPQ).dot(cP)  # get normalized zeta in Eq. 11 & 12
+
+    lambda_z = np.sum(np.abs(MPQ_normalized)) * 0.5  # Eq. 13
+    # NCR: originally Joonho's code add np.einsum('llij->ij', eri_thc)
+    # NCR: I don't know how much this matters.
+    if use_eri_thc_for_t:
+        # use eri_thc for second coulomb contraction.  This was in the original
+        # code which is different than what the paper says.
+        T = h1 - 0.5 * np.einsum("illj->ij", eri_full) + np.einsum(
+            "llij->ij", eri_thc)  # Eq. 3 + Eq. 18
+    else:
+        T = h1 - 0.5 * np.einsum("illj->ij", eri_full) + np.einsum(
+            "llij->ij", eri_full)  # Eq. 3 + Eq. 18
+    #e, v = np.linalg.eigh(T)
+    e = np.linalg.eigvalsh(T)  # only need eigenvalues
+    lambda_T = np.sum(
+        np.abs(e))  # Eq. 19. NOTE: sum over spin orbitals removes 1/2 factor
+
+    lambda_tot = lambda_z + lambda_T  # Eq. 20
+
+    #return nthc, np.sqrt(res), res, lambda_T, lambda_z, lambda_tot
+    return lambda_tot, nthc, np.sqrt(res), res, lambda_T, lambda_z
diff --git a/src/openfermion/resource_estimates/thc/compute_lambda_thc_test.py b/src/openfermion/resource_estimates/thc/compute_lambda_thc_test.py
new file mode 100644
index 000000000..e43ee0307
--- /dev/null
+++ b/src/openfermion/resource_estimates/thc/compute_lambda_thc_test.py
@@ -0,0 +1,23 @@
+#coverage:ignore
+import os
+import h5py
+import numpy as np
+import openfermion.resource_estimates.integrals as int_folder
+from openfermion.resource_estimates import thc
+from openfermion.resource_estimates.molecule import load_casfile_to_pyscf
+
+
+def test_lambda():
+    integral_path = int_folder.__file__.replace('__init__.py', '')
+    thc_factor_file = os.path.join(integral_path, 'M_250_beta_16_eta_10.h5')
+    eri_file = os.path.join(integral_path, 'eri_reiher.h5')
+    with h5py.File(thc_factor_file, 'r') as fid:
+        MPQ = fid['MPQ'][...]
+        etaPp = fid['etaPp'][...]
+
+    _, mf = load_casfile_to_pyscf(eri_file, num_alpha=27, num_beta=27)
+
+    lambda_tot, nthc, _, _, _, _  = \
+        thc.compute_lambda(mf, etaPp=etaPp, MPQ=MPQ)
+    assert nthc == 250
+    assert np.isclose(np.round(lambda_tot), 294)
diff --git a/src/openfermion/resource_estimates/thc/factorize_thc.py b/src/openfermion/resource_estimates/thc/factorize_thc.py
new file mode 100644
index 000000000..f8589e959
--- /dev/null
+++ b/src/openfermion/resource_estimates/thc/factorize_thc.py
@@ -0,0 +1,150 @@
+#coverage:ignore
+""" THC rank reduction of ERIs """
+import sys
+import time
+import uuid
+import numpy as np
+import h5py
+from openfermion.resource_estimates.thc.utils import (lbfgsb_opt_thc_l2reg,
+                                                      adagrad_opt_thc)
+
+
+def thc_via_cp3(eri_full,
+                nthc,
+                thc_save_file=None,
+                first_factor_thresh=1.0E-14,
+                conv_eps=1.0E-4,
+                perform_bfgs_opt=True,
+                bfgs_maxiter=5000,
+                random_start_thc=True,
+                verify=False):
+    """
+    THC-CP3 performs an SVD decomposition of the eri matrix followed by a CP
+    decomposition via pybtas. The CP decomposition is assumes the tensor is
+    symmetric in in the first two indices corresponding to a reshaped
+    (and rescaled by the singular value) singular vector.
+
+    Args:
+        eri_full - (N x N x N x N) eri tensor in Mulliken (chemists) ordering
+        nthc (int) - number of THC factors to use
+        thc_save_file (str) - if not None, save output to filename (as HDF5)
+        first_factor_thresh - SVD threshold on initial factorization of ERI
+        conv_eps (float) - convergence threshold on CP3 ALS
+        perform_bfgs_opt - Perform extra gradient opt. on top of CP3 decomp
+        bfgs_maxiter - Maximum bfgs steps to take. Default 1500.
+        random_start_thc - Perform random start for CP3.
+                           If false perform HOSVD start.
+        verify - check eri properties. Default is False
+
+    returns:
+        eri_thc - (N x N x N x N) reconstructed ERIs from THC factorization
+        thc_leaf - THC leaf tensor
+        thc_central - THC central tensor
+        info (dict) - arguments set during the THC factorization
+    """
+    # fail fast if we don't have the tools to use this routine
+    try:
+        import pybtas
+    except ImportError:
+        raise ImportError(
+            "pybtas could not be imported. Is it installed and in PYTHONPATH?")
+
+    info = locals()
+    info.pop('eri_full', None)  # data too big for info dict
+    info.pop('pybtas', None)  # not needed for info dict
+
+    norb = eri_full.shape[0]
+    if verify:
+        assert np.allclose(eri_full,
+                           eri_full.transpose(1, 0, 2, 3))  # (ij|kl) == (ji|kl)
+        assert np.allclose(eri_full,
+                           eri_full.transpose(0, 1, 3, 2))  # (ij|kl) == (ij|lk)
+        assert np.allclose(eri_full,
+                           eri_full.transpose(1, 0, 3, 2))  # (ij|kl) == (ji|lk)
+        assert np.allclose(eri_full,
+                           eri_full.transpose(2, 3, 0, 1))  # (ij|kl) == (kl|ij)
+
+    eri_mat = eri_full.transpose(0, 1, 3, 2).reshape((norb**2, norb**2))
+    if verify:
+        assert np.allclose(eri_mat, eri_mat.T)
+
+    u, sigma, vh = np.linalg.svd(eri_mat)
+
+    if verify:
+        assert np.allclose(u @ np.diag(sigma) @ vh, eri_mat)
+
+    non_zero_sv = np.where(sigma >= first_factor_thresh)[0]
+    u_chol = u[:, non_zero_sv]
+    diag_sigma = np.diag(sigma[non_zero_sv])
+    u_chol = u_chol @ np.diag(np.sqrt(sigma[non_zero_sv]))
+
+    if verify:
+        test_eri_mat_mulliken = u[:,
+                                  non_zero_sv] @ diag_sigma @ vh[non_zero_sv, :]
+        assert np.allclose(test_eri_mat_mulliken, eri_mat)
+
+    start_time = time.time()  # timing results if requested by user
+    beta, gamma, scale = pybtas.cp3_from_cholesky(u_chol.copy(),
+                                                  nthc,
+                                                  random_start=random_start_thc,
+                                                  conv_eps=conv_eps)
+    cp3_calc_time = time.time() - start_time
+
+    if verify:
+        u_alpha = np.zeros((norb, norb, len(non_zero_sv)))
+        for ii in range(len(non_zero_sv)):
+            u_alpha[:, :, ii] = np.sqrt(sigma[ii]) * u[:, ii].reshape(
+                (norb, norb))
+            assert np.allclose(
+                u_alpha[:, :, ii],
+                u_alpha[:, :, ii].T)  # consequence of working with Mulliken rep
+
+        u_alpha_test = np.einsum("ar,br,xr,r->abx", beta, beta, gamma,
+                                 scale.ravel())
+        print("\tu_alpha l2-norm ", np.linalg.norm(u_alpha_test - u_alpha))
+
+    thc_leaf = beta.T
+    thc_gamma = np.einsum('xr,r->xr', gamma, scale.ravel())
+    thc_central = thc_gamma.T @ thc_gamma
+
+    if verify:
+        eri_thc = np.einsum("Pp,Pr,Qq,Qs,PQ->prqs",
+                            thc_leaf,
+                            thc_leaf,
+                            thc_leaf,
+                            thc_leaf,
+                            thc_central,
+                            optimize=True)
+        print("\tERI L2 CP3-THC ", np.linalg.norm(eri_thc - eri_full))
+        print("\tCP3 timing: ", cp3_calc_time)
+
+    if perform_bfgs_opt:
+        x = np.hstack((thc_leaf.ravel(), thc_central.ravel()))
+        #lbfgs_start_time = time.time()
+        x = lbfgsb_opt_thc_l2reg(eri_full,
+                                 nthc,
+                                 initial_guess=x,
+                                 maxiter=bfgs_maxiter)
+        #lbfgs_calc_time = time.time() - lbfgs_start_time
+        thc_leaf = x[:norb * nthc].reshape(nthc,
+                                           norb)  # leaf tensor  nthc x norb
+        thc_central = x[norb * nthc:norb * nthc + nthc * nthc].reshape(
+            nthc, nthc)  # central tensor
+
+    #total_calc_time = time.time() - start_time
+
+    eri_thc = np.einsum("Pp,Pr,Qq,Qs,PQ->prqs",
+                        thc_leaf,
+                        thc_leaf,
+                        thc_leaf,
+                        thc_leaf,
+                        thc_central,
+                        optimize=True)
+
+    if thc_save_file is not None:
+        with h5py.File(thc_save_file + '.h5', 'w') as fid:
+            fid.create_dataset('thc_leaf', data=thc_leaf)
+            fid.create_dataset('thc_central', data=thc_central)
+            fid.create_dataset('info', data=str(info))
+
+    return eri_thc, thc_leaf, thc_central, info
diff --git a/src/openfermion/resource_estimates/thc/generate_costing_table_thc.py b/src/openfermion/resource_estimates/thc/generate_costing_table_thc.py
new file mode 100644
index 000000000..1d78acf9e
--- /dev/null
+++ b/src/openfermion/resource_estimates/thc/generate_costing_table_thc.py
@@ -0,0 +1,151 @@
+#coverage:ignore
+""" Pretty-print a table comparing number of THC vectors vs accy and cost """
+import numpy as np
+from pyscf import scf
+from openfermion.resource_estimates import thc
+from openfermion.resource_estimates.molecule import (factorized_ccsd_t,
+                                                     cas_to_pyscf, pyscf_to_cas)
+
+
+def generate_costing_table(pyscf_mf,
+                           name='molecule',
+                           nthc_range=None,
+                           dE=0.001,
+                           chi=10,
+                           beta=20,
+                           save_thc=False,
+                           use_kernel=True,
+                           no_triples=False,
+                           **kwargs):
+    """ Print a table to file for testing how various THC thresholds impact
+        cost, accuracy, etc.
+
+    Args:
+        pyscf_mf - PySCF mean field object
+        name (str) - file will be saved to 'thc_factorization_<name>.txt'
+        nthc_range (list of ints) - list of number of THC vectors to retain
+        dE (float) - max allowable phase error (default: 0.001)
+        chi (int) - number of bits for repr of coefficients (default: 10)
+        beta (int) - number of bits for rotations (default: 20)
+        save_thc (bool) - if True, save the THC factors (leaf and central only)
+        use_kernel (bool) - re-do SCF prior to estimating CCSD(T) error?
+            Will use canonical orbitals and full ERIs for the one-body
+            contributions, using rank-reduced ERIs for two-body
+        no_triples (bool) - if True, skip the (T) correction, doing only CCSD
+        kwargs: additional keyword arguments to pass to thc.factorize()
+
+    Returns:
+       None
+    """
+
+    if nthc_range is None:
+        nthc_range = [250, 300, 350]
+
+    DE = dE  # max allowable phase error
+    CHI = chi  # number of bits for representation of coefficients
+    BETA = beta  # number of bits for the rotations
+
+    if isinstance(pyscf_mf, scf.rohf.ROHF):
+        num_alpha, num_beta = pyscf_mf.nelec
+        assert num_alpha + num_beta == pyscf_mf.mol.nelectron
+    else:
+        assert pyscf_mf.mol.nelectron % 2 == 0
+        num_alpha = pyscf_mf.mol.nelectron // 2
+        num_beta = num_alpha
+
+    num_orb = len(pyscf_mf.mo_coeff)
+    num_spinorb = num_orb * 2
+
+    cas_info = "CAS((%sa, %sb), %so)" % (num_alpha, num_beta, num_orb)
+
+    try:
+        assert num_orb**4 == len(pyscf_mf._eri.flatten())
+    except AssertionError:
+        # ERIs are not in correct form in pyscf_mf._eri, so this is a quick prep
+        _, pyscf_mf = cas_to_pyscf(*pyscf_to_cas(pyscf_mf))
+
+    # Reference calculation (eri_rr= None is full rank / exact ERIs)
+    escf, ecor, etot = factorized_ccsd_t(pyscf_mf,
+                                         eri_rr=None,
+                                         use_kernel=use_kernel,
+                                         no_triples=no_triples)
+
+    exact_etot = etot
+
+    filename = 'thc_factorization_' + name + '.txt'
+
+    with open(filename, 'w') as f:
+        print("\n THC factorization data for '" + name + "'.", file=f)
+        print("    [*] using " + cas_info, file=f)
+        print("        [+]                      E(SCF): %18.8f" % escf, file=f)
+        if no_triples:
+            print("        [+]    Active space CCSD E(cor): %18.8f" % ecor,
+                  file=f)
+            print("        [+]    Active space CCSD E(tot): %18.8f" % etot,
+                  file=f)
+        else:
+            print("        [+] Active space CCSD(T) E(cor): %18.8f" % ecor,
+                  file=f)
+            print("        [+] Active space CCSD(T) E(tot): %18.8f" % etot,
+                  file=f)
+        print("{}".format('=' * 111), file=f)
+        if no_triples:
+            print("{:^12} {:^18} {:^24} {:^12} {:^20} {:^20}".format(
+                'M', '||ERI - THC||', 'CCSD error (mEh)', 'lambda',
+                'Toffoli count', 'logical qubits'),
+                  file=f)
+        else:
+            print("{:^12} {:^18} {:^24} {:^12} {:^20} {:^20}".format(
+                'M', '||ERI - THC||', 'CCSD(T) error (mEh)', 'lambda',
+                'Toffoli count', 'logical qubits'),
+                  file=f)
+        print("{}".format('-' * 111), file=f)
+    for nthc in nthc_range:
+        # First, up: lambda and CCSD(T)
+        if save_thc:
+            fname = name + '_nTHC_' + str(nthc).zfill(
+                5)  # will save as HDF5 and add .h5 extension
+        else:
+            fname = None
+        eri_rr, thc_leaf, thc_central, info = thc.factorize(pyscf_mf._eri,
+                                                            nthc,
+                                                            thc_save_file=fname,
+                                                            **kwargs)
+        lam = thc.compute_lambda(pyscf_mf, thc_leaf, thc_central)[0]
+        escf, ecor, etot = factorized_ccsd_t(pyscf_mf,
+                                             eri_rr,
+                                             use_kernel=use_kernel,
+                                             no_triples=no_triples)
+        error = (etot - exact_etot) * 1E3  # to mEh
+        l2_norm_error_eri = np.linalg.norm(
+            eri_rr - pyscf_mf._eri)  # ERI reconstruction error
+
+        # now do costing
+        stps1 = thc.compute_cost(num_spinorb,
+                                 lam,
+                                 DE,
+                                 chi=CHI,
+                                 beta=BETA,
+                                 M=nthc,
+                                 stps=20000)[0]
+        _, thc_total_cost, thc_logical_qubits = thc.compute_cost(num_spinorb,
+                                                                 lam,
+                                                                 DE,
+                                                                 chi=CHI,
+                                                                 beta=BETA,
+                                                                 M=nthc,
+                                                                 stps=stps1)
+
+        with open(filename, 'a') as f:
+            print(
+                "{:^12} {:^18.4e} {:^24.2f} {:^12.1f} {:^20.1e} {:^20}".format(
+                    nthc, l2_norm_error_eri, error, lam, thc_total_cost,
+                    thc_logical_qubits),
+                file=f)
+    with open(filename, 'a') as f:
+        print("{}".format('=' * 111), file=f)
+
+    with open(filename, 'a') as f:
+        print("THC factorization settings at exit:", file=f)
+        for key, value in info.items():
+            print("\t", key, value, file=f)
diff --git a/src/openfermion/resource_estimates/thc/spacetime.py b/src/openfermion/resource_estimates/thc/spacetime.py
new file mode 100644
index 000000000..548b363b7
--- /dev/null
+++ b/src/openfermion/resource_estimates/thc/spacetime.py
@@ -0,0 +1,624 @@
+#coverage:ignore
+"""Compute qubit vs toffoli for THC LCU"""
+from math import pi
+import itertools
+import numpy as np
+import matplotlib.pyplot as plt
+from numpy.lib.scimath import arccos, arcsin  # has analytc continuation to cplx
+from openfermion.resource_estimates.utils import QR, QI
+
+
+def qubit_vs_toffoli(lam,
+                     dE,
+                     eps,
+                     n,
+                     chi,
+                     beta,
+                     M,
+                     algorithm='half',
+                     verbose=False):
+    """
+    Args:
+        lam (float) - the lambda-value for the Hamiltonian
+        dE (float) - allowable error in phase estimation. usually 0.001
+        eps (float) - allowable error for synthesis (dE/(10 * lam)) usually
+        n (int) - number of spin orbitals.
+        chi (int) - number of bits of precision for state prep
+        beta (int) - number of bits of precision for rotations
+        M (int) - THC rank or r_{Thc}
+        algorithm (str) - 'half', where half of the phasing angles are loaded
+                                  at a time
+                          'full', where angles loaded from QROM to perform
+                                  phasing operations are loaded at the same time
+                          Note: In arXiv:2011.03494,
+                              'half' corresponds to Fig 11, while
+                              'full' corresponds to Fig 12.
+        verbose (bool) - do additional printing of intermediates?
+
+    """
+    # only valid algorithms accepted
+    assert algorithm in ['half', 'full']
+
+    # The number of iterations for the phase estimation.
+    iters = np.ceil(pi * lam / (dE * 2))
+    # The number of bits used for each register.
+    nM = np.ceil(np.log2(M + 1))
+    #This is number of distinct items of data we need to output, see Eq. (28).
+    d = M * (M + 1) / 2 + n / 2
+    # The number of bits used for the contiguous register.
+    nc = np.ceil(np.log2(d))
+    # The output size is 2*Log[M] for the alt values, χ for the keep value,
+    # and 2 for the two sign bits.
+    m = 2 * nM + 2 + chi
+
+    # The next block of code finds the optimal number of bits to use for the
+    # rotation angle for the amplitude amplification taking into account the
+    # probability of failure and the cost of the rotations.
+    oh = np.zeros(20, dtype=float)
+    for p in range(1, 20 + 1):
+        cos_term = arccos(np.power(2, nM) / np.sqrt(d) / 2)
+        # print(cos_term)
+        v = np.round(np.power(2, p) / (2 * pi) * cos_term)
+        asin_term = arcsin(
+            np.cos(v * 2 * pi / np.power(2, p)) * np.sqrt(d) / np.power(2, nM))
+        sin_term = np.sin(3 * asin_term)**2
+        oh[p - 1] = (20_000 * (1 / sin_term - 1) + 4 * p).real
+    # br is the number of bits used in the rotation.
+    br = np.argmin(oh) + 1
+    # Next are the costs for the state preparation.
+    cp1 = 2 * (10 * nM + 2 * br - 9)
+    # There is cost 10*Log[M] for preparing the equal superposition over the
+    # input registers. This is the costing from above Eq. (29).
+    # This is the cost of computing the contiguous register and inverting it.
+    # This is with a sophisticated scheme adding together triplets of bits.
+    # This is the cost of step 2 in the list on page 14.
+    cp2 = 2 * (nM**2 + nM - 1)
+    # This is the cost of the QROM for the state preparation and its inverse.
+    cp3 = QR(d, m)[1] + QI(d)[1]
+    # The cost for the inequality test.
+    cp4 = 2 * chi
+    # The cost 2*nM for the controlled swaps.
+    cp5 = 4 * nM
+    # Then there is a cost of nM+1 for swapping the μ and ν registers, where
+    # the +3 is because we need to control on two registers, and
+    # control swap of the spin registers.
+    cp6 = 2 * nM + 3
+    #  The total cost in Eq. (33).
+    CPCP = cp1 + cp2 + cp3 + cp4 + cp5 + cp6
+
+    # Next are the costs for the select operation.
+    # This is the cost of swapping based on the spin register. These costs are
+    # from the list on page 15, and this is steps 1 and 7.
+    cs1 = 2 * n
+    k1 = 2**QI(M + n / 2)[0]
+    cs2a = M + n / 2 - 2 + np.ceil(M / k1) + np.ceil(n / 2 / k1) + k1
+
+    # The QROM for the rotation angles the first time.  Here M+n/2-2 is the cost
+    # for generating them, and the second part is the cost for inverting them
+    # with advanced QROM. The QROM for the rotation angles the second time.
+    # Here the cost M-2 is for generating the angles the second time, and QI[M]
+    # is for inverting the QROM. Steps 2 and 6.
+    cs2b = M - 2 + QI(M)[1]
+    # The cost of the rotations steps 3 and 5.
+    cs3 = 4 * n * (beta - 2)
+    # Cost for extra part in making the Z doubly controlled step 4.
+    cs4 = 1
+    # The total select cost in Eq. (43).
+    CS = cs1 + cs2a + cs2b + cs3 + cs4
+    # The cost given slightly above Eq. (44) is 2*nM+5. That is a typo and it
+    # should have the aleph (equivalent to χ here) like at the top of the
+    # column. Here we have +3 in this line, +1 in the next line and +1 for cs4,
+    #  to give the same total.
+    costref = 2 * nM + chi + 3
+    cost = CPCP + CS + costref + 1
+
+    # Next are qubit costs.
+    ac1 = 2 * np.ceil(np.log2(iters + 1)) - 1
+    ac2 = n
+    ac3 = 2 * nM
+    ac47 = 5
+    ac8 = beta
+    ac9 = nc
+
+    kt = 2**QR(d, m)[0]
+    ac10 = m * kt + np.ceil(np.log2(d / kt))
+    # This is for the equal superposition state to perform the inequality test
+    # with the keep register.
+    ac11 = chi
+    # The qubit to control the swap of the μ and ν registers.
+    ac12 = 1
+    aca = ac1 + ac2 + ac3 + ac47 + ac8 + ac9 + ac11 + ac12
+    # This is the data needed for the rotations.
+    ac13 = beta * n / 2
+    # These are the ancillas needed for adding into the phase gradient state.
+    ac14 = beta - 2
+    # These are the temporary ancillas in between erasing the first QROM
+    # ancillas and inverting that QROM. The +m is for output of first QROM.
+    acc = ac13 + ac14 + m
+
+    if verbose:
+        print("Total Toffoli cost ", cost * iters)
+        print("Ancilla for first QROM ", aca + ac10)
+        print("Actual ancilla ... ", np.max([aca + ac10, aca + acc]))
+        print("Spacetime volume ", np.max([aca + ac10, aca + acc]) * cost)
+
+    # First are the numbers of qubits that must be kept throughout the
+    # computation. See page 18.
+    if algorithm == 'half':
+        # The qubits used as the control registers for the phase estimation,
+        # that must be kept the whole way through. If we used independent
+        # controls each time that would increase the Toffoli cost by
+        # np.ceil(np.log2iters+1]]-3, while saving
+        # np.ceil(np.log2iters+1]]-1 qubits
+        ac1 = np.ceil(np.log2(iters + 1))
+    elif algorithm == 'full':
+        # The qubits used as the control registers for the phase estimation,
+        # that must be kept the whole way through. If we used independent
+        # controls each time that would increase the Toffoli cost by
+        # np.ceil(np.log2iters+1]]-3, while saving
+        # np.ceil(np.log2iters+1]]-1 qubits.
+        ac1 = 2 * np.ceil(np.log2(iters + 1)) - 1
+    # The system qubits that must always be included.
+    ac2 = n
+    # The μ and ν registers, that must be kept because they are control
+    # registers that aren't fully erased and must be reflected on.
+    ac3 = 2 * nM
+    # These are the qubits for the spin in the control state as well as the
+    # qubit that is rotated for the preparation of the equal superposition
+    # state, AND the qubit that is used to control.
+    # None of these are fully inversely prepared.
+    ac4512 = 4
+    # The qubits for the phase gradient state.
+    ac8 = beta
+    # This is for the equal superposition state to perform the inequality test
+    # with the keep register. It must be kept around and reflected upon.
+    ac11 = chi
+    # The total number of permanent qubits.
+    perm = ac1 + ac2 + ac3 + ac4512 + ac8 + ac11
+    # In preparing the equal superposition state there are 6 temporary qubits
+    # used in the rotation of the ancilla.  There are another three that are
+    # needed for the temporary results of inequality tests. By far the largest
+    # number, however, come from keeping the temporary ancillas from the
+    # inequality tests.  That should be 3*nM+nN-4.  There are an other two
+    # qubits in output at the end that will be kept until this step is undone.
+    # Note: not used?
+    #nN = np.ceil(np.log2(n / 2))
+
+    # This is the maximum number of qubits used while preparing the equal
+    # superposition state.
+    qu1 = perm + 4 * nM - 1
+    # To explain the number of temporary ancillas, we have nM+1 to perform the
+    # inequality test on mu and nu with out-of-place addition.  We have another
+    # nM-2 for the equality test.  Then we can do the inequality tests on
+    # mu and nu with constants (temporarily) overwriting these variables, and
+    # keeping nM-1 qubits on each.  Then there are another 2 temporary qubits
+    # used for the reflection.  That gives 4*nM-1 total.
+    # This is the number of Toffolis during this step.
+    tof1 = 10 * nM + 2 * br - 9
+
+    # This is increasing the running number of permanent ancillas by 2 for the
+    #  ν=M+1 flag qubit and the success flag qubit.
+    perm = perm + 2
+    # The number of temporary qubits used in this computation is the the same
+    # as the number of Toffolis plus one.
+    qu2 = perm + nM**2 + nM
+    # The Toffoli cost of computing the contiguous register.
+    tof2 = nM**2 + nM - 1
+    # The running number of qubits is increased by the number needed for the
+    # contiguous register.
+    perm = perm + nc
+
+    if algorithm == 'half':
+        # Here I'm setting the k-value for the QROM by hand instead of choosing
+        # the optimal one for Toffolis.
+        kt = 16
+    elif algorithm == 'full':
+        # Here I'm setting the k-value for the QROM by hand instead of choosing
+        #  the optimal one for Toffolis.
+        kt = 32
+    # This is the number of qubits needed during the QROM.
+    qu3 = perm + m * kt + np.ceil(np.log2(d / kt))
+    # The number of Toffolis for the QROM.
+    tof3 = np.ceil(d / kt) + m * (kt - 1)
+    # The number of ancillas used increases by the actual output size of  QROM.
+    perm = perm + m
+    # The number of ancilla qubits used for the subtraction for the ineql test.
+    qu4 = perm + chi
+    # We can use one of the qubits from the registers that are subtracted as
+    # the flag qubit so we don't need an extra flag qubit.
+    # The number of Toffolis needed for the inequality test. The number of
+    # permanent ancillas is unchanged.*)
+    tof4 = chi
+    # We don't need any extra ancillas for the controlled swaps.
+    qu5 = perm
+    # We are swapping pairs of registers of size nM
+    tof5 = 2 * nM
+    # One extra ancilla for the controlled swap of mu and nu because it is
+    # controlled on two qubits.
+    qu6 = perm
+    # One more Toffoli for the double controls.
+    tof6 = nM + 1
+    # Swapping based on the spin register.
+    qu7 = perm
+    tof7 = n / 2
+
+    if algorithm == 'half':
+        # We use these temporary ancillas for the first QROM for the rot angles.
+        qu8 = perm + nM + beta * n / 4
+    elif algorithm == 'full':
+        # We use these temporary ancillas for the first QROM for the rot angles.
+        qu8 = perm + nM + beta * n / 2
+    # The cost of outputting the rotation angles including those for the
+    # one-electron part.
+    tof8 = M + n / 2 - 2
+
+    if algorithm == 'half':
+        # We are now need the output rotation angles, though we don't need the
+        # temporary qubits from the unary iteration.
+        perm = perm + beta * n / 4
+    elif algorithm == 'full':
+        # We are now need the output rotation angles, though we don't need the
+        # temporary qubits from the unary iteration.
+        perm = perm + beta * n / 2
+    # We need a few temp registers for adding into the phase grad register.
+    qu9 = perm + (beta - 2)
+
+    if algorithm == 'half':
+        # The cost of the rotations.
+        tof9 = n * (beta - 2) / 2
+        # Make a list where we keep subtr the data qubits that can be erased.
+        # Table[-j*beta,{j,0,n/4-1}]+perm+(beta-2)
+        qu10 = np.array([-j * beta for j in range(int(n / 4))]) \
+               + perm + beta - 2
+        # The cost of the rotations.
+        # Table[2*(beta-2),{j,0,n/4-1}]
+        tof10 = np.array([2 * (beta - 2) for j in range(int(n / 4))])
+        # We've erased the data
+        perm = perm - beta * n / 4
+    elif algorithm == 'full':
+        # The cost of the rotations.
+        tof9 = n * (beta - 2)
+        # Make a list where we keep subtr the data qubits that can be erased.
+        # Table[-j*beta,{j,0,n/2-1}]+perm+(beta-2)
+        qu10 = np.array([-j * beta for j in range(int(n / 2))]) \
+               + perm + beta - 2
+        # The cost of the rotations.
+        # Table[2*(beta-2),{j,0,n/2-1}]
+        tof10 = np.array([2 * (beta - 2) for j in range(int(n / 2))])
+        # We've erased the data
+        perm = perm - beta * n / 2
+
+    # Find the k for the phase fixup for the erasure of the rotations.
+    k1 = 2**QI(M + n / 2)[0]
+
+    # Temp qubits used. Data qubits were already erased, so don't change perm.
+    qu11 = perm + k1 + np.ceil(np.log2(M / k1))
+    tof11 = np.ceil(M / k1) + np.ceil(n / 2 / k1) + k1
+
+    # Swapping based on the spin register.
+    qu12 = perm
+    tof12 = n / 2
+
+    # Swapping the spin registers.
+    qu12a = perm
+    tof12a = 1
+
+    # Swapping based on the spin register.
+    qu13 = perm
+    tof13 = n / 2
+
+    if algorithm == 'half':
+        # We use these temp ancillas for the second QROM for the rot angles.
+        qu14 = perm + nM - 1 + beta * n / 4
+        perm = perm + beta * n / 4
+    elif algorithm == 'full':
+        # We use these temp ancillas for the second QROM for the rot angles.
+        qu14 = perm + nM - 1 + beta * n / 2
+        perm = perm + beta * n / 2
+    tof14 = M - 2
+
+    # We need a few temporary registers for adding into the phase grad register
+    qu15 = perm + (beta - 2)
+
+    if algorithm == 'half':
+        # The cost of the rotations.
+        tof15 = n * (beta - 2) / 2
+    elif algorithm == 'full':
+        # The cost of the rotations.
+        tof15 = n * (beta - 2)
+
+    # Just one Toffoli to do the controlled Z1.
+    qu16 = perm
+    tof16 = 1
+
+    if algorithm == 'half':
+        # Make a list where we keep subtr the data qubits that can be erased.
+        # Table[-j*beta,{j,0,n/4-1}]+perm+(beta-2)
+        qu17 = np.array([-j * beta for j in range(int(n / 4))
+                        ]) + perm + beta - 2
+        # The cost of the rotations.
+        # Table[2*(beta-2),{j,0,n/4-1}]
+        tof17 = np.array([2 * (beta - 2) for j in range(int(n / 4))])
+        # We've erased the data.
+        perm = perm - beta * n / 4
+    elif algorithm == 'full':
+        # Make a list where we keep subtr the data qubits that can be erased.
+        # Table[-j*beta,{j,0,n/2-1}]+perm+(beta-2)
+        qu17 = np.array([-j * beta for j in range(int(n / 2))]) \
+               + perm + beta - 2
+        # The cost of the rotations.
+        # Table[2*(beta-2),{j,0,n/2-1}]
+        tof17 = np.array([2 * (beta - 2) for j in range(int(n / 2))])
+        # We've erased the data
+        perm = perm - beta * n / 2
+
+    # Find the k for the phase fixup for the erasure of the rotations.
+    k1 = 2**QI(M)[0]
+
+    # The temp qubits used. The data qubits were already erased,
+    # so don't change perm.
+    qu18 = perm + k1 + np.ceil(np.log2(M / k1))
+    tof18 = np.ceil(M / k1) + k1
+
+    # Swapping based on the spin register.
+    qu19 = perm
+    tof19 = n / 2
+
+    # One extra ancilla for the controlled swap of mu and nu because it is
+    # controlled on two qubits.
+    qu20 = perm + 1
+    # One extra Toffoli, because we are controlling on two qubits.
+    tof20 = nM + 1
+
+    # We don't need any extra ancillas for the controlled swaps.
+    qu21 = perm
+    # We are swapping pairs of registers of size nM
+    tof21 = 2 * nM
+
+    # The number of ancilla qubits used for the subtraction for the inequal test
+    qu22 = perm + chi
+    # We can use one of the qubits from the registers that are subtracted as
+    # the flag qubit so we don't need an extra flag qubit.
+    # The number of Toffolis needed for inverting the inequality test.
+    # The number of permanent ancillas is unchanged.
+    tof22 = chi
+    # We can erase the data for the QROM for inverting the state preparation,
+    # then do the phase fixup.
+    perm = perm - m
+
+    kt = 2**QI(d)[0]
+    # This is the number of qubits needed during the QROM.
+    qu23 = perm + kt + np.ceil(np.log2(d / kt))
+    # The number of Toffolis for the QROM.
+    tof23 = np.ceil(d / kt) + kt
+    # The number of temporary qubits used in this computation is the same as
+    # the number of Toffolis plus one. We are erasing the contiguous register
+    # as we go so can subtract nc.
+    qu24 = perm - nc + nM**2 + nM
+    # The Toffoli cost of computing the contiguous register.
+    tof24 = nM**2 + nM - 1
+    # The contiguous register has now been deleted.
+    perm = perm - nc
+    # This is the maximum number of qubits used while preparing the equal
+    # superposition state.
+    qu25 = perm + 4 * nM - 1
+    # This is the number of Toffolis during this step.
+    tof25 = 10 * nM + 2 * br - 9
+    # This is increasing the running number of permanent ancillas by 2 for
+    # the ν=M+1 flag qubit and the success flag qubit.
+    perm = perm - 2
+
+    if algorithm == 'half':
+        # We need some ancillas to perform a reflection on multiple qubits.
+        # We are including one more Toffoli to make it controlled.
+        qu26 = perm + costref + np.ceil(np.log2(iters + 1))
+        tof26 = costref + np.ceil(np.log2(iters + 1))
+    elif algorithm == 'full':
+        # We need some ancillas to perform a reflection on multiple qubits.
+        # We are including one more Toffoli to make it controlled.
+        qu26 = perm + costref
+        tof26 = costref
+
+    qu27 = perm  # Iterate the control register.
+    tof27 = 1
+
+    # Labels
+    sm = 'small element'
+    pq = 'preparation QROM'
+    rq = 'rotation QROM'
+    ri = r'R$^{\dag}$'
+    ro = 'R'
+    iq = 'inverse QROM'
+
+    color_dict = {
+        sm: '#435CE8',
+        pq: '#E83935',
+        rq: '#F59236',
+        ri: '#E3D246',
+        ro: '#36B83E',
+        iq: '#E83935'
+    }
+
+    if algorithm == 'half':
+        tgates = np.hstack((np.array([
+            tof1, tof2, tof3, tof4, tof5, tof6, tof7, tof8, tof9, tof8, tof9,
+            tof9, tof8
+        ]), tof10,
+                            np.array([
+                                tof11, tof12, tof12a, tof13, tof14, tof15,
+                                tof14, tof15, tof16, tof15, tof14
+                            ]), tof17,
+                            np.array([
+                                tof18, tof19, tof20, tof21, tof22, tof23, tof24,
+                                tof25, tof26, tof27
+                            ])))
+        qubits = np.hstack(
+            (np.array([
+                qu1, qu2, qu3, qu4, qu5, qu6, qu7, qu8, qu9, qu8, qu9, qu9, qu8
+            ]), qu10,
+             np.array([
+                 qu11, qu12, qu12a, qu13, qu14, qu15, qu14, qu15, qu16, qu15,
+                 qu14
+             ]), qu17,
+             np.array(
+                 [qu18, qu19, qu20, qu21, qu22, qu23, qu24, qu25, qu26, qu27])))
+        labels = [sm, sm, pq, sm, sm, sm, sm, rq, ri, rq, ri, ro, rq] + \
+                 [ro] * len(qu10) + \
+                 [rq, sm, sm, sm, rq, ri, rq, ri, sm, ro, rq] + \
+                 [ro] * len(qu17) + \
+                 [rq, sm, sm, sm, sm, iq, sm, sm, sm, sm]
+
+        colors = [color_dict[i] for i in labels]
+    elif algorithm == 'full':
+        tgates = np.hstack(
+            (np.array([tof1, tof2, tof3, tof4, tof5, tof6, tof7, tof8,
+                       tof9]), tof10,
+             np.array([tof11, tof12, tof12a, tof13, tof14, tof15,
+                       tof16]), tof17,
+             np.array([
+                 tof18, tof19, tof20, tof21, tof22, tof23, tof24, tof25, tof26,
+                 tof27
+             ])))
+        qubits = np.hstack(
+            (np.array([qu1, qu2, qu3, qu4, qu5, qu6, qu7, qu8, qu9]), qu10,
+             np.array([qu11, qu12, qu12a, qu13, qu14, qu15, qu16]), qu17,
+             np.array(
+                 [qu18, qu19, qu20, qu21, qu22, qu23, qu24, qu25, qu26, qu27])))
+        labels = [sm, sm, pq, sm, sm, sm, sm, rq, ri] + \
+                 [ro] * len(qu10) + \
+                 [rq, sm, sm, sm, rq, ri, sm] + \
+                 [ro] * len(qu17) + \
+                 [rq, sm, sm, sm, sm, iq, sm, sm, sm, sm]
+
+        colors = [color_dict[i] for i in labels]
+
+    # check lists are at least consistent
+    assert all(
+        len(element) == len(tgates) for element in [qubits, labels, colors])
+
+    return tgates, qubits, labels, colors
+
+
+def plot_qubit_vs_toffoli(tgates,
+                          qubits,
+                          labels,
+                          colors,
+                          tgate_label_thresh=100):
+    """ Helper function to plot qubit vs toffoli similar to Figs 11 and 12 from
+        'Even more efficient quantum...' paper (arXiv:2011.03494)
+
+    Args:
+        tgates (list or 1D vector) - list of toffoli values
+        qubits (list or 1D vector) - list of qubit values
+        labels (list) - list of labels corresponding to different steps of alg
+        colors (list) - list of colors corresponding to different steps of alg
+        tgate_label_thresh - don't label steps "thinner" than thresh of Toffolis
+    """
+    # To align the bars on the right edge pass a negative width and align='edge'
+    ax = plt.gca()
+    plt.bar(np.cumsum(tgates),
+            qubits,
+            width=-tgates,
+            align='edge',
+            color=colors)
+    plt.bar(0, qubits[-1], width=sum(tgates), align='edge', color='#D7C4F2')
+    plt.xlabel('Toffoli count')
+    plt.ylabel('Number of qubits')
+
+    # Now add the labels
+    # First, group labels and neighboring tgates
+    labels_grouped, tgates_grouped, qubits_grouped = group_steps(
+        labels, tgates, qubits)
+    for step, label in enumerate(labels_grouped):
+        if 'small' in label:
+            # skip the steps identified as 'small'
+            continue
+        elif tgates_grouped[step] < tgate_label_thresh:
+            # otherwise skip really narrow steps
+            continue
+        else:
+            x = np.cumsum(tgates_grouped)[step] - (tgates_grouped[step] * 0.5)
+            y = 0.5 * (qubits_grouped[step] - qubits[-1]) + qubits[-1]
+            ax.text(x,
+                    y,
+                    label,
+                    rotation='vertical',
+                    va='center',
+                    ha='center',
+                    fontsize='x-small')
+
+    # Finally add system and control qubit label
+    ax.text(0.5*np.sum(tgates), 0.5*qubits[-1], "System and control qubits", \
+            va='center', ha='center',fontsize='x-small')
+
+    plt.show()
+
+
+def table_qubit_vs_toffoli(tgates, qubits, labels, colors):
+    """ Helper function to generate qubit vs toffoli table .. text version of
+         Fig 11 and Fig 12 in arXiv:2011.03494
+
+    Args:
+        tgates (list or 1D vector) - list of toffoli values
+        qubits (list or 1D vector) - list of qubit values
+        labels (list) - list of labels corresponding to different steps of alg
+        colors (list) - list of colors corresponding to different steps of alg
+    """
+
+    print("=" * 60)
+    print("{:>8s}{:>11s}{:>9s}{:>20s}{:>12s}".format('STEP', 'TOFFOLI',
+                                                     'QUBIT*', 'LABEL',
+                                                     'COLOR'))
+    print("-" * 60)
+    for step in range(len(tgates)):
+        print('{:8d}{:11d}{:9d}{:>20s}{:>12s}'.format(step, int(tgates[step]),
+                                                      int(qubits[step]),
+                                                      labels[step],
+                                                      colors[step]))
+    print("=" * 60)
+    print("  *Includes {:d} system and control qubits".format(int(qubits[-1])))
+
+
+def group_steps(labels, tgates, qubits):
+    """ Group similar adjacent steps by label. In addition to the grouped
+        labels, also  returning the total Toffoli count and average qubits
+        allocated for that grouping.
+        Useful for collecting similar steps in the spacetime plots.
+
+        Example:
+          Input:
+            labels = ['R', 'R', 'QROM', 'QROM, 'I-QROM', 'QROM', 'QROM', 'R']
+            tgates = [  5,   8,     20,    10,       14,     30,     10,  20]
+            qubits = [ 10,  10,     40,    20,        4,     80,     60,  60]
+
+          Output:
+            grouped_labels = ['R', 'QROM', 'I-QROM', 'QROM', 'R']
+            grouped_tgates = [ 13,     30,       14,     40,  20]  (sum)
+            grouped_qubits = [ 10,     30,        4,     70,  60]  (mean)
+
+    """
+    assert len(labels) == len(tgates)
+    assert len(labels) == len(qubits)
+
+    # Key function -- group identical nearest neighbors in labels (x[0])
+    key_func = lambda x: x[0]
+
+    # create grouped labels and tgates first
+    grouped_labels = []
+    grouped_tgates = []
+    L = zip(labels, tgates)
+    for label, group in itertools.groupby(L, key_func):
+        grouped_labels.append(label)
+        grouped_tgates.append(np.sum([i[1] for i in group]))
+
+    # now do the grouped qubits
+    # somehow doing multiple list comprehension the group breaks the grouping?
+    # so we have to do this in a separate loop.
+    grouped_qubits = []
+    L = zip(labels, qubits)
+    for label, group in itertools.groupby(L, key_func):
+        grouped_qubits.append(np.mean([i[1] for i in group]))
+
+    # sanity check -- shouldn't be losing total value in toffoli
+    assert np.sum(tgates) == np.sum(grouped_tgates)
+    return grouped_labels, grouped_tgates, grouped_qubits
diff --git a/src/openfermion/resource_estimates/thc/spacetime_test.py b/src/openfermion/resource_estimates/thc/spacetime_test.py
new file mode 100644
index 000000000..7b5c9829a
--- /dev/null
+++ b/src/openfermion/resource_estimates/thc/spacetime_test.py
@@ -0,0 +1,85 @@
+#coverage:ignore
+"""Tests for computing qubit vs toffoli for THC LCU"""
+import numpy as np
+from openfermion.resource_estimates.thc import qubit_vs_toffoli
+
+
+def test_qubit_vs_toffoli_original_strategy():
+    lam = 307.68
+    dE = 0.001
+    eps = dE / (10 * lam)
+    n = 108
+    chi = 10
+    beta = 16
+    M = 350
+
+    # tof, qu array from costingTHCsteps.nb
+    ref_tof = np.asarray([95,89,2852,10,18,10,54,402,1512,28,28,28,28,28,28,\
+                          28,28,28,28,28,28,28,\
+                          28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,\
+                          28,28,28,28,28,28,28,\
+                          28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,\
+                          42,54,1,54,348,1512,\
+                          1,28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,\
+                          28,28,28,28,28,28,\
+                          28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,\
+                          28,28,28,28,28,28,28,\
+                          28,28,28,28,28,28,28,38,54,10,18,10,497,89,95,31,1])
+    ref_qub = np.asarray([228,285,1182,251,241,241,241,1114,1119,1119,1103,\
+                          1087,1071,1055,1039,\
+                          1023,1007,991,975,959,943,927,911,895,879,863,847,\
+                          831,815,799,783,767,\
+                          751,735,719,703,687,671,655,639,623,607,591,575,559,\
+                          543,527,511,495,479,\
+                          463,447,431,415,399,383,367,351,335,319,303,287,271,\
+                          262,241,241,241,1113,\
+                          1119,1105,1119,1103,1087,1071,1055,1039,1023,1007,\
+                          991,975,959,943,927,\
+                          911,895,879,863,847,831,815,799,783,767,751,735,719,\
+                          703,687,671,655,639,\
+                          623,607,591,575,559,543,527,511,495,479,463,447,431,\
+                          415,399,383,367,351,\
+                          335,319,303,287,271,262,241,242,241,251,475,285,230,\
+                          224,193])
+
+    tgates, qubits, _, _ = qubit_vs_toffoli(lam, dE, eps, n, chi, beta, M, \
+                                            algorithm='full', verbose=False)
+
+    assert np.allclose(tgates.astype(int), ref_tof)
+    assert np.allclose(qubits.astype(int), ref_qub)
+
+
+def test_qubit_vs_toffoli_improved_strategy():
+    lam = 307.68
+    dE = 0.001
+    eps = dE / (10 * lam)
+    n = 108
+    chi = 10
+    beta = 16
+    M = 350
+
+    # tof, qu array from costingTHCsteps2.nb
+    ref2_tof = np.asarray([95,89,4293,10,18,10,54,402,756,402,756,756,402,28,\
+                           28,28,28,28,28,28,28,\
+                           28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,28,\
+                           28,28,42,54,1,54,348,\
+                           756,348,756,1,756,348,28,28,28,28,28,28,28,28,28,\
+                           28,28,28,28,28,28,28,\
+                           28,28,28,28,28,28,28,28,28,28,28,38,54,10,18,10,497,\
+                           89,95,50,1])
+    ref2_qub = np.asarray([210,267,685,233,223,223,223,664,669,664,669,669,664,\
+                           669,653,637,621,605,\
+                           589,573,557,541,525,509,493,477,461,445,429,413,397,\
+                           381,365,349,333,317,\
+                           301,285,269,253,244,223,223,223,663,669,663,669,655,\
+                           669,663,669,653,637,\
+                           621,605,589,573,557,541,525,509,493,477,461,445,429,\
+                           413,397,381,365,349,\
+                           333,317,301,285,269,253,244,223,224,223,233,457,267,\
+                           212,225,175])
+
+    tgates, qubits, _, _ = qubit_vs_toffoli(lam, dE, eps, n, chi, beta, M, \
+                                            algorithm='half', verbose=False)
+
+    assert np.allclose(tgates.astype(int), ref2_tof)
+    assert np.allclose(qubits.astype(int), ref2_qub)
diff --git a/src/openfermion/resource_estimates/thc/utils/__init__.py b/src/openfermion/resource_estimates/thc/utils/__init__.py
new file mode 100644
index 000000000..26b33a8e0
--- /dev/null
+++ b/src/openfermion/resource_estimates/thc/utils/__init__.py
@@ -0,0 +1,23 @@
+#coverage:ignore
+#   Copyright 2020 Google LLC
+
+#   Licensed under the Apache License, Version 2.0 (the "License");
+#   you may not use this file except in compliance with the License.
+#   You may obtain a copy of the License at
+#
+#       http://www.apache.org/licenses/LICENSE-2.0
+#
+#   Unless required by applicable law or agreed to in writing, software
+#   distributed under the License is distributed on an "AS IS" BASIS,
+#   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#   See the License for the specific language governing permissions and
+#   limitations under the License.
+
+from .thc_factorization import (lbfgsb_opt_thc, lbfgsb_opt_thc_l2reg,
+                                adagrad_opt_thc, lbfgsb_opt_cholesky)
+from .thc_objectives import (thc_objective_jax, thc_objective_grad_jax,
+                             thc_objective, thc_objective_regularized,
+                             thc_objective_grad, thc_objective_and_grad,
+                             cp_ls_cholesky_factor_objective)
+from .adagrad import (adagrad, constant, exponential_decay, inverse_time_decay,
+                      polynomial_decay, piecewise_constant, make_schedule)
diff --git a/src/openfermion/resource_estimates/thc/utils/adagrad.py b/src/openfermion/resource_estimates/thc/utils/adagrad.py
new file mode 100644
index 000000000..9206f6709
--- /dev/null
+++ b/src/openfermion/resource_estimates/thc/utils/adagrad.py
@@ -0,0 +1,118 @@
+#coverage:ignore
+# Copyright 2018 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+from typing import Any, Callable, Union
+import numpy as np
+
+Step = int
+Schedule = Callable[[Step], float]
+
+
+def adagrad(step_size, momentum=0.9):
+    """Construct optimizer triple for Adagrad.
+  Adaptive Subgradient Methods for Online Learning and Stochastic Optimization:
+  http://www.jmlr.org/papers/volume12/duchi11a/duchi11a.pdf
+  Args:
+    step_size: positive scalar, or a callable representing a step size schedule
+      that maps the iteration index to positive scalar.
+    momentum: optional, a positive scalar value for momentum
+  Returns:
+    An (init_fun, update_fun, get_params) triple.
+  """
+    step_size = make_schedule(step_size)
+
+    def init(x0):
+        g_sq = np.zeros_like(x0)
+        m = np.zeros_like(x0)
+        return x0, g_sq, m
+
+    def update(i, g, state):
+        x, g_sq, m = state
+        g_sq += np.square(g)
+        g_sq_inv_sqrt = np.where(g_sq > 0, 1. / np.sqrt(g_sq), 0.0)
+        m = (1. - momentum) * (g * g_sq_inv_sqrt) + momentum * m
+        x = x - step_size(i) * m
+        return x, g_sq, m
+
+    def get_params(state):
+        x, _, _ = state
+        return x
+
+    return init, update, get_params
+
+
+### learning rate schedules
+
+
+def constant(step_size) -> Schedule:
+
+    def schedule(i):
+        return step_size
+
+    return schedule
+
+
+def exponential_decay(step_size, decay_steps, decay_rate):
+
+    def schedule(i):
+        return step_size * decay_rate**(i / decay_steps)
+
+    return schedule
+
+
+def inverse_time_decay(step_size, decay_steps, decay_rate, staircase=False):
+    if staircase:
+
+        def schedule(i):
+            return step_size / (1 + decay_rate * np.floor(i / decay_steps))
+    else:
+
+        def schedule(i):
+            return step_size / (1 + decay_rate * i / decay_steps)
+
+    return schedule
+
+
+def polynomial_decay(step_size, decay_steps, final_step_size, power=1.0):
+
+    def schedule(step_num):
+        step_num = np.minimum(step_num, decay_steps)
+        step_mult = (1 - step_num / decay_steps)**power
+        return step_mult * (step_size - final_step_size) + final_step_size
+
+    return schedule
+
+
+def piecewise_constant(boundaries: Any, values: Any):
+    boundaries = np.array(boundaries)
+    values = np.array(values)
+    if not boundaries.ndim == values.ndim == 1:
+        raise ValueError("boundaries and values must be sequences")
+    if not boundaries.shape[0] == values.shape[0] - 1:
+        raise ValueError(
+            "boundaries length must be one shorter than values length")
+
+    def schedule(i):
+        return values[np.sum(i > boundaries)]
+
+    return schedule
+
+
+def make_schedule(scalar_or_schedule: Union[float, Schedule]) -> Schedule:
+    if callable(scalar_or_schedule):
+        return scalar_or_schedule
+    elif np.ndim(scalar_or_schedule) == 0:
+        return constant(scalar_or_schedule)
+    else:
+        raise TypeError(type(scalar_or_schedule))
diff --git a/src/openfermion/resource_estimates/thc/utils/thc_factorization.py b/src/openfermion/resource_estimates/thc/utils/thc_factorization.py
new file mode 100644
index 000000000..817cb5bb6
--- /dev/null
+++ b/src/openfermion/resource_estimates/thc/utils/thc_factorization.py
@@ -0,0 +1,301 @@
+#coverage:ignore
+import os
+from uuid import uuid4
+import h5py
+import numpy
+import numpy.random
+import numpy.linalg
+from scipy.optimize import minimize
+import jax
+import jax.numpy as jnp
+from jax.config import config
+from jax import jit, grad
+from .adagrad import adagrad
+from .thc_objectives import (thc_objective, thc_objective_grad,
+                             thc_objective_and_grad,
+                             cp_ls_cholesky_factor_objective,
+                             thc_objective_regularized)
+
+# set mkl thread count for numpy einsum/tensordot calls
+# leave one CPU un used  so we can still access this computer
+os.environ["MKL_NUM_THREADS"] = "{}".format(os.cpu_count() - 1)
+config.update("jax_enable_x64", True)
+
+
+class CallBackStore:
+
+    def __init__(self, chkpoint_file, freqency=500):
+        """Generic callback function  for storing intermediates from BFGS and
+           Adagrad optimizations
+        """
+        self.chkpoint_file = chkpoint_file
+        self.freq = freqency
+        self.iter = 0
+
+    def __call__(self, xk):
+        if self.iter % self.freq == 0:
+            f = h5py.File(self.chkpoint_file, "w")
+            f["xk"] = xk
+            f.close()
+
+
+def lbfgsb_opt_thc(eri,
+                   nthc,
+                   chkfile_name=None,
+                   initial_guess=None,
+                   random_seed=None,
+                   maxiter=150_000,
+                   disp=False):
+    """
+    Least-squares fit of two-electron integral tensors with  L-BFGS-B
+    """
+    # initialize chkfile name if one isn't set
+    if chkfile_name is None:
+        chkfile_name = str(uuid4()) + '.h5'
+
+    # callback func stores checkpoints
+    callback_func = CallBackStore(chkfile_name)
+
+    # set initial guess
+    norb = eri.shape[0]
+    if initial_guess is None:
+        if random_seed is None:
+            x = numpy.random.randn(norb * nthc + nthc * nthc)
+        else:
+            numpy.random.seed(random_seed)
+            x = numpy.random.randn(norb * nthc + nthc * nthc)
+    else:
+        x = initial_guess  # add more checks here for safety
+
+    # L-BFGS-B optimization
+    res = minimize(thc_objective_and_grad,
+                   x,
+                   args=(norb, nthc, eri),
+                   jac=True,
+                   method='L-BFGS-B',
+                   options={
+                       'disp': disp,
+                       'maxiter': maxiter
+                   },
+                   callback=callback_func)
+    # print(res)
+    params = res.x
+    x = numpy.array(params)
+    f = h5py.File(chkfile_name, "w")
+    f["etaPp"] = x[:norb * nthc].reshape(nthc, norb)
+    f["ZPQ"] = x[norb * nthc:].reshape(nthc, nthc)
+    f.close()
+    return params
+
+
+def lbfgsb_opt_thc_l2reg(eri,
+                         nthc,
+                         chkfile_name=None,
+                         initial_guess=None,
+                         random_seed=None,
+                         maxiter=150_000,
+                         disp_freq=98,
+                         penalty_param=None,
+                         disp=False):
+    """
+    Least-squares fit of two-electron integral tensors with  L-BFGS-B with
+    l2-regularization of lambda
+
+    disp is ignored.
+    disp_freq sets the freqnecy of printing
+    """
+    if disp_freq > 98 or disp_freq < 1:
+        raise ValueError(
+            "disp_freq {} is not valid. must be between [1, 98]".format(
+                disp_freq))
+
+    if chkfile_name is None:
+        #chkfile_name = str(uuid4()) + '.h5'
+        callback_func = None
+    else:
+        # callback func stores checkpoints
+        callback_func = CallBackStore(chkfile_name)
+
+    # set initial guess
+    norb = eri.shape[0]
+    if initial_guess is None:
+        if random_seed is None:
+            x = numpy.random.randn(norb * nthc + nthc * nthc)
+        else:
+            numpy.random.seed(random_seed)
+            x = numpy.random.randn(norb * nthc + nthc * nthc)
+    else:
+        x = initial_guess  # add more checks here for safety
+
+    # compute inital lambda to set penalty param
+    etaPp = x[:norb * nthc].reshape(nthc, norb)  # leaf tensor  nthc x norb
+    MPQ = x[norb * nthc:norb * nthc + nthc * nthc].reshape(
+        nthc, nthc)  # central tensor
+    SPQ = etaPp.dot(
+        etaPp.T)  # (nthc x norb)  x (norb x nthc) -> (nthc  x nthc) metric
+    cP = jnp.diag(jnp.diag(
+        SPQ))  # grab diagonal elements. equivalent to np.diag(np.diagonal(SPQ))
+    # no sqrts because we have two normalized THC vectors (index by mu and nu)
+    # on each side.
+    MPQ_normalized = cP.dot(MPQ).dot(cP)  # get normalized zeta in Eq. 11 & 12
+    lambda_z = jnp.sum(jnp.abs(MPQ_normalized)) * 0.5
+    # lambda_z = jnp.sum(MPQ_normalized**2) * 0.5
+    CprP = jnp.einsum("Pp,Pr->prP", etaPp,
+                      etaPp)  # this is einsum('mp,mq->pqm', etaPp, etaPp)
+    Iapprox = jnp.einsum('pqU,UV,rsV->pqrs',
+                         CprP,
+                         MPQ,
+                         CprP,
+                         optimize=[(0, 1), (0, 1)])
+    deri = eri - Iapprox
+    # set penalty
+    if penalty_param is None:
+        sum_square_loss = 0.5 * numpy.sum((deri)**2)
+        penalty_param = sum_square_loss / lambda_z
+        print("lambda_z {}".format(lambda_z))
+        print("penalty_param {}".format(penalty_param))
+
+    # L-BFGS-B optimization
+    thc_grad = jax.grad(thc_objective_regularized, argnums=[0])
+    print("Initial Grad")
+    print(thc_grad(jnp.array(x), norb, nthc, jnp.array(eri), penalty_param))
+    print()
+    res = minimize(thc_objective_regularized,
+                   jnp.array(x),
+                   args=(norb, nthc, jnp.array(eri), penalty_param),
+                   method='L-BFGS-B',
+                   jac=thc_grad,
+                   options={
+                       'disp': None,
+                       'iprint': disp_freq,
+                       'maxiter': maxiter
+                   },
+                   callback=callback_func)
+
+    # print(res)
+    params = numpy.array(res.x)
+    x = numpy.array(params)
+    if chkfile_name is not None:
+        f = h5py.File(chkfile_name, "w")
+        f["etaPp"] = x[:norb * nthc].reshape(nthc, norb)
+        f["ZPQ"] = x[norb * nthc:].reshape(nthc, nthc)
+        f.close()
+    return params
+
+
+def adagrad_opt_thc(eri,
+                    nthc,
+                    chkfile_name=None,
+                    initial_guess=None,
+                    random_seed=None,
+                    stepsize=0.01,
+                    momentum=0.9,
+                    maxiter=50_000,
+                    gtol=1.0E-5):
+    """
+    THC opt usually starts with BFGS and then is completed with Adagrad or other
+    first order solver.  This  function implements an Adagrad optimization.
+
+    Optimization runs for 50 K iterations.  This is the ONLY stopping cirteria
+    used in the FT-THC paper by Lee et al.
+    """
+    # initialize chkfile name if one isn't set
+    if chkfile_name is None:
+        chkfile_name = str(uuid4()) + '.h5'
+
+    # callback func stores checkpoints
+    callback_func = CallBackStore(chkfile_name)
+
+    # set initial guess
+    norb = eri.shape[0]
+    if initial_guess is None:
+        if random_seed is None:
+            x = numpy.random.randn(norb * nthc + nthc * nthc)
+        else:
+            numpy.random.seed(random_seed)
+            x = numpy.random.randn(norb * nthc + nthc * nthc)
+    else:
+        x = initial_guess  # add more checks here for safety
+    opt_init, opt_update, get_params = adagrad(step_size=stepsize,
+                                               momentum=momentum)
+    opt_state = opt_init(x)
+
+    def update(i, opt_state):
+        params = get_params(opt_state)
+        gradient = thc_objective_grad(params, norb, nthc, eri)
+        grad_norm_l1 = numpy.linalg.norm(gradient, ord=1)
+        return opt_update(i, gradient, opt_state), grad_norm_l1
+
+    for t in range(maxiter):
+        opt_state, grad_l1 = update(t, opt_state)
+        params = get_params(opt_state)
+        if t % callback_func.freq == 0:
+            # callback_func(params)
+            fval = thc_objective(params, norb, nthc, eri)
+            outline = "Objective val {: 5.15f}".format(fval)
+            outline += "\tGrad L1-norm {: 5.15f}".format(grad_l1)
+            print(outline)
+        if grad_l1 <= gtol:
+            # break out of loop
+            # which sends to save
+            break
+    else:
+        print("Maximum number of iterations reached")
+    # save results before returning
+    x = numpy.array(params)
+    f = h5py.File(chkfile_name, "w")
+    f["etaPp"] = x[:norb * nthc].reshape(nthc, norb)
+    f["ZPQ"] = x[norb * nthc:].reshape(nthc, nthc)
+    f.close()
+    return params
+
+
+def lbfgsb_opt_cholesky(cholesky_factor,
+                        nthc,
+                        chkfile_name=None,
+                        initial_guess=None,
+                        random_seed=None):
+    """
+    Least-squares fit of cholesky tensors with  L-BFGS-B
+
+    cholesky_factor is reshaped into (norb, norb, num_cholesky)
+    """
+    # initialize chkfile name if one isn't set
+    if chkfile_name is None:
+        chkfile_name = str(uuid4()) + '.h5'
+
+    # callback func stores checkpoints
+    callback_func = CallBackStore(chkfile_name)
+
+    # set initial guess
+    norb = cholesky_factor.shape[0]
+    ncholfactor = cholesky_factor.shape[-1]
+
+    if initial_guess is None:
+        if random_seed is None:
+            x = numpy.random.randn(norb * nthc + nthc * ncholfactor)
+        else:
+            numpy.random.seed(random_seed)
+            x = numpy.random.randn(norb * nthc + nthc * ncholfactor)
+    else:
+        x = initial_guess  # add more checks here for safety
+
+    # L-BFGS-B optimization
+    res = minimize(cp_ls_cholesky_factor_objective,
+                   x,
+                   args=(norb, nthc, cholesky_factor, True),
+                   jac=True,
+                   method='L-BFGS-B',
+                   options={
+                       'disp': True,
+                       'ftol': 1.0E-4
+                   },
+                   callback=callback_func)
+    print(res)
+    params = res.x
+    x = numpy.array(params)
+    f = h5py.File(chkfile_name, "w")
+    f["etaPp"] = x[:norb * nthc].reshape(nthc, norb)
+    f["gamma"] = x[norb * nthc:].reshape(nthc, cholesky_factor.shape[-1])
+    f.close()
+    return params
diff --git a/src/openfermion/resource_estimates/thc/utils/thc_objectives.py b/src/openfermion/resource_estimates/thc/utils/thc_objectives.py
new file mode 100644
index 000000000..ecb0af321
--- /dev/null
+++ b/src/openfermion/resource_estimates/thc/utils/thc_objectives.py
@@ -0,0 +1,335 @@
+#coverage:ignore
+import os
+from uuid import uuid4
+import scipy.optimize
+import jax.numpy as jnp
+from jax.config import config
+from jax import jit, grad
+import h5py
+import numpy
+import numpy.random
+import numpy.linalg
+from scipy.optimize import minimize
+from .adagrad import adagrad
+
+# set mkl thread count for numpy einsum/tensordot calls
+# leave one CPU un used  so we can still access this computer
+os.environ["MKL_NUM_THREADS"] = "{}".format(os.cpu_count() - 1)
+config.update("jax_enable_x64", True)
+
+
+def thc_objective_jax(xcur, norb, nthc, eri):
+    """
+    Loss function for THC factorization using jax numpy
+
+    0.5 sum_{pqrs}(eri(pqrs) - G(pqrs))^{2}
+
+    G(pqrs) = sum_{uv}X_{u,p}X_{u,q}Z_{uv}X_{v,r}X_{v,s}
+
+    :param xcur: Current parameters for eta and Z
+    :param norb: number of orbitals
+    :param nthc: thc-basis dimension
+    :param eri: two-electron repulsion integrals in chemist notation
+    :return:
+    """
+    etaPp = xcur[:norb * nthc].reshape(nthc, norb)  # leaf tensor  nthc x norb
+    MPQ = xcur[norb * nthc:norb * nthc + nthc * nthc].reshape(
+        nthc, nthc)  # central tensor
+
+    CprP = jnp.einsum("Pp,Pr->prP", etaPp,
+                      etaPp)  # this is einsum('mp,mq->pqm', etaPp, etaPp)
+    Iapprox = jnp.einsum('pqU,UV,rsV->pqrs',
+                         CprP,
+                         MPQ,
+                         CprP,
+                         optimize=[(0, 1), (0, 1)])
+    deri = eri - Iapprox
+    res = 0.5 * jnp.sum((deri)**2)
+    return res
+
+
+def thc_objective_grad_jax(xcur, norb, nthc, eri):
+    """
+    Gradient for THC least-squares objective jax compatible
+
+    :param xcur: Current parameters for eta and Z
+    :param norb: number of orbitals
+    :param nthc: thc-basis dimension
+    :param eri: two-electron repulsion integrals in chemist notation
+    :param verbose: optional (False) for print iteration residual and inf norm
+    """
+    etaPp = xcur[:norb * nthc].reshape(nthc, norb)  # leaf tensor  nthc x norb
+    MPQ = xcur[norb * nthc:norb * nthc + nthc * nthc].reshape(
+        nthc, nthc)  # central tensor
+
+    # m indexes the nthc and p,q,r,s are orbital indices
+    CprP = jnp.einsum("Pp,Pr->prP", etaPp,
+                      etaPp)  # this is einsum('mp,mq->pqm', etaPp, etaPp)
+    Iapprox = jnp.einsum('pqU,UV,rsV->pqrs',
+                         CprP,
+                         MPQ,
+                         CprP,
+                         optimize=[(0, 1), (0, 1)])
+    deri = eri - Iapprox
+
+    # O(norb^5)
+    dL_dZab = -jnp.einsum(
+        'pqrs,pqA,rsB->AB', deri, CprP, CprP, optimize=[(0, 1), (0, 1)])
+
+    # O(norb^5)
+    dL_dX_GT = -2 * jnp.einsum('Tqrs,Gq,Gv,rsv->GT',
+                               deri,
+                               etaPp,
+                               MPQ,
+                               CprP,
+                               optimize=[(0, 3), (1, 2), (0, 1)])
+
+    dL_dX_GT -= 2 * jnp.einsum('pqTs,pqu,uG,Gs->GT',
+                               deri,
+                               CprP,
+                               MPQ,
+                               etaPp,
+                               optimize=[(0, 1), (0, 2), (0, 1)])
+
+    return jnp.hstack((dL_dX_GT.ravel(), dL_dZab.ravel()))
+
+
+def thc_objective(xcur, norb, nthc, eri, verbose=False):
+    """
+    Loss function for THC factorization
+
+    0.5 sum_{pqrs}(eri(pqrs) - G(pqrs))^{2}
+
+    G(pqrs) = sum_{uv}X_{u,p}X_{u,q}Z_{uv}X_{v,r}X_{v,s}
+
+    :param xcur: Current parameters for eta and Z
+    :param norb: number of orbitals
+    :param nthc: thc-basis dimension
+    :param eri: two-electron repulsion integrals in chemist notation
+    :param verbose: optional (False) for print iteration residual and inf norm
+    :return:
+    """
+    etaPp = xcur[:norb * nthc].reshape(nthc, norb)  # leaf tensor  nthc x norb
+    MPQ = xcur[norb * nthc:norb * nthc + nthc * nthc].reshape(
+        nthc, nthc)  # central tensor
+
+    CprP = numpy.einsum("Pp,Pr->prP", etaPp,
+                        etaPp)  # this is einsum('mp,mq->pqm', etaPp, etaPp)
+    Iapprox = numpy.einsum('pqU,UV,rsV->pqrs',
+                           CprP,
+                           MPQ,
+                           CprP,
+                           optimize=['einsum_path', (0, 1), (0, 1)])
+    deri = eri - Iapprox
+    res = 0.5 * numpy.sum((deri)**2)
+
+    if verbose:
+        print("res, max, lambda = {}, {}".format(res,
+                                                 numpy.max(numpy.abs(deri))))
+
+    return res
+
+
+def thc_objective_regularized(xcur,
+                              norb,
+                              nthc,
+                              eri,
+                              penalty_param,
+                              verbose=False):
+    """
+    Loss function for THC factorization
+
+    0.5 sum_{pqrs}(eri(pqrs) - G(pqrs))^{2}
+
+    G(pqrs) = sum_{uv}X_{u,p}X_{u,q}Z_{uv}X_{v,r}X_{v,s}
+
+    :param xcur: Current parameters for eta and Z
+    :param norb: number of orbitals
+    :param nthc: thc-basis dimension
+    :param eri: two-electron repulsion integrals in chemist notation
+    :param verbose: optional (False) for print iteration residual and inf norm
+    :return:
+    """
+    etaPp = xcur[:norb * nthc].reshape(nthc, norb)  # leaf tensor  nthc x norb
+    MPQ = xcur[norb * nthc:norb * nthc + nthc * nthc].reshape(
+        nthc, nthc)  # central tensor
+
+    CprP = jnp.einsum("Pp,Pr->prP", etaPp,
+                      etaPp)  # this is einsum('mp,mq->pqm', etaPp, etaPp)
+    Iapprox = jnp.einsum('pqU,UV,rsV->pqrs',
+                         CprP,
+                         MPQ,
+                         CprP,
+                         optimize=[(0, 1), (0, 1)])
+    deri = eri - Iapprox
+
+    SPQ = etaPp.dot(
+        etaPp.T)  # (nthc x norb)  x (norb x nthc) -> (nthc  x nthc) metric
+    cP = jnp.diag(jnp.diag(
+        SPQ))  # grab diagonal elements. equivalent to np.diag(np.diagonal(SPQ))
+    # no sqrts because we have two normalized THC vectors (index by mu and nu)
+    # on each side.
+    MPQ_normalized = cP.dot(MPQ).dot(cP)  # get normalized zeta in Eq. 11 & 12
+
+    lambda_z = jnp.sum(jnp.abs(MPQ_normalized)) * 0.5
+
+    res = 0.5 * jnp.sum((deri)**2) + penalty_param * (lambda_z**2)
+
+    if verbose:
+        print("res, max, lambda**2 = {}, {}".format(res, lambda_z**2))
+
+    return res
+
+
+def thc_objective_grad(xcur, norb, nthc, eri, verbose=False):
+    """
+    Gradient for THC least-squares objective
+
+    :param xcur: Current parameters for eta and Z
+    :param norb: number of orbitals
+    :param nthc: thc-basis dimension
+    :param eri: two-electron repulsion integrals in chemist notation
+    :param verbose: optional (False) for print iteration residual and inf norm
+    """
+    etaPp = numpy.array(xcur[:norb * nthc]).reshape(
+        nthc, norb)  # leaf tensor  nthc x norb
+    MPQ = numpy.array(xcur[norb * nthc:norb * nthc + nthc * nthc]).reshape(
+        nthc, nthc)  # central tensor
+
+    # m indexes the nthc and p,q,r,s are orbital indices
+    CprP = numpy.einsum("Pp,Pr->prP", etaPp,
+                        etaPp)  # this is einsum('mp,mq->pqm', etaPp, etaPp)
+    Iapprox = numpy.einsum('pqU,UV,rsV->pqrs',
+                           CprP,
+                           MPQ,
+                           CprP,
+                           optimize=['einsum_path', (0, 1), (0, 1)])
+    deri = eri - Iapprox
+    res = 0.5 * numpy.sum((deri)**2)
+
+    if verbose:
+        print("res, max, lambda = {}, {}".format(res,
+                                                 numpy.max(numpy.abs(deri))))
+
+    # O(norb^5)
+    dL_dZab = -numpy.einsum('pqrs,pqA,rsB->AB',
+                            deri,
+                            CprP,
+                            CprP,
+                            optimize=['einsum_path', (0, 1), (0, 1)])
+    # O(norb^5)
+    dL_dX_GT = -2 * numpy.einsum(
+        'Tqrs,Gq,Gv,rsv->GT',
+        deri,
+        etaPp,
+        MPQ,
+        CprP,
+        optimize=['einsum_path', (0, 3), (1, 2), (0, 1)])
+
+    dL_dX_GT -= 2 * numpy.einsum(
+        'pqTs,pqu,uG,Gs->GT',
+        deri,
+        CprP,
+        MPQ,
+        etaPp,
+        optimize=['einsum_path', (0, 1), (0, 2), (0, 1)])
+
+    return numpy.hstack((dL_dX_GT.ravel(), dL_dZab.ravel()))
+
+
+def thc_objective_and_grad(xcur, norb, nthc, eri, verbose=False):
+    """
+    Loss function for THC factorization
+
+    0.5 sum_{pqrs}(eri(pqrs) - G(pqrs))^{2}
+
+    G(pqrs) = sum_{uv}X_{u,p}X_{u,q}Z_{uv}X_{v,r}X_{v,s}
+
+    :param xcur: Current parameters for eta and Z
+    :param norb: number of orbitals
+    :param nthc: thc-basis dimension
+    :param eri: two-electron repulsion integrals in chemist notation
+    :param verbose: optional (False) for print iteration residual and inf norm
+    :return:
+    """
+    etaPp = xcur[:norb * nthc].reshape(nthc, norb)  # leaf tensor  nthc x norb
+    MPQ = xcur[norb * nthc:norb * nthc + nthc * nthc].reshape(
+        nthc, nthc)  # central tensor
+    CprP = numpy.einsum("Pp,Pr->prP", etaPp,
+                        etaPp)  # this is einsum('mp,mq->pqm', etaPp, etaPp)
+
+    Iapprox = numpy.einsum('pqU,UV,rsV->pqrs',
+                           CprP,
+                           MPQ,
+                           CprP,
+                           optimize=['einsum_path', (0, 1), (0, 1)])
+    deri = eri - Iapprox
+    res = 0.5 * numpy.sum((deri)**2)
+    # O(norb^5)
+    dL_dZab = -numpy.einsum('pqrs,pqA,rsB->AB',
+                            deri,
+                            CprP,
+                            CprP,
+                            optimize=['einsum_path', (0, 1), (0, 1)])
+    # O(norb^4 * nthc)
+    dL_dX_GT = -2 * numpy.einsum(
+        'Tqrs,Gq,Gv,rsv->GT',
+        deri,
+        etaPp,
+        MPQ,
+        CprP,
+        optimize=['einsum_path', (0, 3), (1, 2), (0, 1)])
+
+    dL_dX_GT -= 2 * numpy.einsum(
+        'pqTs,pqu,uG,Gs->GT',
+        deri,
+        CprP,
+        MPQ,
+        etaPp,
+        optimize=['einsum_path', (0, 1), (0, 2), (0, 1)])
+
+    return res, numpy.hstack((dL_dX_GT.ravel(), dL_dZab.ravel()))
+
+
+def cp_ls_cholesky_factor_objective(beta_gamma,
+                                    norb,
+                                    nthc,
+                                    cholesky_factor,
+                                    calcgrad=False):
+    """cholesky_factor is reshaped into (norb, norb, num_cholesky)
+
+    Cholesky factor B_{ab,x}
+
+    Lst sq fit obj ||B_{ab,x} - sum_{r}beta_{a,x}beta_{b,x}gamma_{ab,x}||
+
+    This function provides the objective function value and gradient with
+      respect to beta and gamma
+    """
+    # compute objective
+    num_cholfactors = cholesky_factor.shape[-1]
+    beta_bR = beta_gamma[:norb * nthc].reshape((norb, nthc))
+    gamma_yR = beta_gamma[norb * nthc:norb * nthc +
+                          nthc * num_cholfactors].reshape(
+                              (num_cholfactors, nthc))
+    beta_abR = numpy.einsum('aR,bR->abR', beta_bR, beta_bR)
+    chol_approx = numpy.einsum('abR,XR->abX', beta_abR, gamma_yR)
+    delta = cholesky_factor - chol_approx
+    fval = 0.5 * numpy.sum((delta)**2)
+
+    if calcgrad:
+        # compute grad
+        # \partial O / \partial beta_{c,s}
+        grad_beta = -2 * numpy.einsum('Cbx,bS,xS->CS',
+                                      delta,
+                                      beta_bR,
+                                      gamma_yR,
+                                      optimize=['einsum_path', (0, 2), (0, 1)])
+        grad_gamma = -numpy.einsum('abY,aS,bS->YS',
+                                   delta,
+                                   beta_bR,
+                                   beta_bR,
+                                   optimize=['einsum_path', (1, 2), (0, 1)])
+        grad = numpy.hstack((grad_beta.ravel(), grad_gamma.ravel()))
+        return fval, grad
+    else:
+        return fval
diff --git a/src/openfermion/resource_estimates/utils.py b/src/openfermion/resource_estimates/utils.py
new file mode 100644
index 000000000..3deec27fb
--- /dev/null
+++ b/src/openfermion/resource_estimates/utils.py
@@ -0,0 +1,190 @@
+#coverage:ignore
+""" Utilities for FT costing calculations """
+from typing import Tuple, Optional
+import sys
+import os
+import h5py
+import numpy as np
+
+
+def QR(L: int, M1: int) -> Tuple[int, int]:
+    """ This gives the optimal k and minimum cost for a QROM over L values of
+        size M.
+
+    Args:
+        L (int) -
+        M1 (int) -
+
+    Returns:
+       k_opt (int) - k that yields minimal (optimal) cost of QROM
+       val_opt (int) - minimal (optimal) cost of QROM
+    """
+    k = 0.5 * np.log2(L / M1)
+    try:
+        assert k >= 0
+    except AssertionError:
+        sys.exit("In function QR: \
+        \n  L is smaller than M: increase RANK or lower THRESH \
+        (or alternatively decrease CHI)")
+    value = lambda k: L / np.power(2, k) + M1 * (np.power(2, k) - 1)
+    k_int = [np.floor(k), np.ceil(k)]  # restrict optimal k to integers
+    k_opt = k_int[np.argmin(value(k_int))]  # obtain optimal k
+    val_opt = np.ceil(value(k_opt))  # obtain ceiling of optimal value given k
+    assert k_opt.is_integer()
+    assert val_opt.is_integer()
+    return int(k_opt), int(val_opt)
+
+
+def QR2(L1: int, L2: int, M1: int) -> Tuple[int, int, int]:
+    """ This gives the optimal k values and minimum cost for a QROM using
+        two L values of size M,
+        e.g. the optimal k values for the QROM on two registers.
+    Args:
+        L1 (int) -
+        L2 (int) -
+        M1 (int) -
+
+    Returns:
+       k1_opt (int) - k1 that yields minimal (optimal) cost of QROM with two reg
+       k2_opt (int) - k2 that yields minimal (optimal) cost of QROM with two reg
+       val_opt (int) - minimal (optimal) cost of QROM
+    """
+
+    k1_opt, k2_opt = 0, 0
+    val_opt = 1e50
+    # Doing this as a stupid loop for now, worth refactoring but cost is quick
+    # Biggest concern is if k1 / k2 range is not large enough!
+    for k1 in range(1, 17):
+        for k2 in range(1, 17):
+            value = np.ceil(L1 / np.power(2, k1)) * np.ceil(L2 / \
+                np.power(2, k2)) +\
+                M1 * (np.power(2, k1 + k2) - 1)
+            if value < val_opt:
+                val_opt = value
+                k1_opt = k1
+                k2_opt = k2
+
+    assert val_opt.is_integer()
+    return int(np.power(2, k1_opt)), int(np.power(2, k2_opt)), int(val_opt)
+
+
+def QI(L: int) -> Tuple[int, int]:
+    """ This gives the opt k and minimum cost for an inverse QROM over L vals
+
+    Args:
+        L (int) -
+
+    Returns:
+       k_opt (int) - k that yiles minimal (optimal) cost of inverse QROM
+       val_opt (int) - minimal (optimal) cost of inverse QROM
+    """
+    k = 0.5 * np.log2(L)
+    assert k >= 0
+    value = lambda k: L / np.power(2, k) + np.power(2, k)
+    k_int = [np.floor(k), np.ceil(k)]  # restrict optimal k to integers
+    k_opt = k_int[np.argmin(value(k_int))]  # obtain optimal k
+    val_opt = np.ceil(value(k_opt))  # obtain ceiling of optimal value given k
+    assert k_opt.is_integer()
+    assert val_opt.is_integer()
+    return int(k_opt), int(val_opt)
+
+
+# Is this ever used? It's defined in costingsf.nb, but I don't it's ever called.
+def QI2(L1: int, L2: int) -> Tuple[int, int, int]:
+    """ This gives the optimal k values and minimum cost for inverse QROM
+        using two L values,
+        e.g. the optimal k values for the inverse QROM on two registers.
+
+    Args:
+        L1 (int) -
+        L2 (int) -
+
+    Returns:
+       k1_opt (int) - k1 with minimal (optimal) cost of inverse QROM with 2 regs
+       k2_opt (int) - k2 with minimal (optimal) cost of inverse QROM with 2 regs
+       val_opt (int) - minimal (optimal) cost of inverse QROM with two registers
+    """
+
+    k1_opt, k2_opt = 0, 0
+    val_opt = 1e50
+    # Doing this as a stupid loop for now, worth refactoring but cost is quick
+    # Biggest concern is if k1 / k2 range is not large enough!
+    for k1 in range(1, 17):
+        for k2 in range(1, 17):
+            value = np.ceil(L1 / np.power(2, k1)) * np.ceil(L2 / \
+                np.power(2, k2)) +\
+                np.power(2, k1 + k2)
+            if value < val_opt:
+                val_opt = value
+                k1_opt = k1
+                k2_opt = k2
+
+    assert val_opt.is_integer()
+    return int(np.power(2, k1_opt)), int(np.power(2, k2_opt)), int(val_opt)
+
+
+def power_two(m: int) -> int:
+    """ Return the power of two that is a factor of m """
+    assert m >= 0
+    if m % 2 == 0:
+        count = 0
+        while (m > 0) and (m % 2 == 0):
+            m = m // 2
+            count += 1
+        return count
+    return 0
+
+
+class RunSilent(object):
+    """ Context manager to prevent function writing to stdout/stderr
+        e.g. for noisy_function(), wrap it like so
+
+        with RunSilent():
+            noisy_function()
+
+        ... and your terminal will no longer be littered with prints
+    """
+
+    def __init__(self, stdout=None, stderr=None):
+        self.devnull = open(os.devnull, 'w')
+        self._stdout = stdout or self.devnull or sys.stdout
+        self._stderr = stderr or self.devnull or sys.stderr
+
+    def __enter__(self):
+        self.old_stdout, self.old_stderr = sys.stdout, sys.stderr
+        self.old_stdout.flush()
+        self.old_stderr.flush()
+        sys.stdout, sys.stderr = self._stdout, self._stderr
+
+    def __exit__(self, exc_type, exc_value, traceback):
+        self._stdout.flush()
+        self._stderr.flush()
+        sys.stdout = self.old_stdout
+        sys.stderr = self.old_stderr
+        self.devnull.close()
+
+
+def eigendecomp(M, tol=1.15E-16):
+    """ Decompose matrix M into L.L^T where rank(L) < rank(M) to some threshold
+
+    Args:
+       M (np.ndarray) - (N x N) positive semi-definite matrix to be decomposed
+       tol (float) - eigenpairs with eigenvalue above tol will be kept
+
+    Returns:
+       L (np.ndarray) - (K x N) array such that K <= N and L.L^T = M
+    """
+    eigenvalues, eigenvectors = np.linalg.eigh(M)
+
+    # Put in descending order
+    eigenvalues = eigenvalues[::-1]
+    eigenvectors = eigenvectors[:, ::-1]
+
+    # Truncate
+    idx = np.where(eigenvalues > tol)[0]
+    eigenvalues, eigenvectors = eigenvalues[idx], eigenvectors[:, idx]
+
+    # eliminate eigenvalues from eigendecomposition
+    L = np.einsum("ij,j->ij", eigenvectors, np.sqrt(eigenvalues))
+
+    return L
diff --git a/src/openfermion/resource_estimates/utils_test.py b/src/openfermion/resource_estimates/utils_test.py
new file mode 100644
index 000000000..d4bb0dc3e
--- /dev/null
+++ b/src/openfermion/resource_estimates/utils_test.py
@@ -0,0 +1,58 @@
+#coverage:ignore
+"""Test cases for util.py
+"""
+from openfermion.resource_estimates.utils import QR, QI, QR2, QI2, power_two
+
+
+def test_QR():
+    """ Tests function QR which gives the minimum cost for a QROM over L values
+        of size M.
+    """
+    # Tests checked against Mathematica noteboook `costingTHC.nb`
+    # Arguments are otherwise random
+    assert QR(12341234, 5670) == (6, 550042)
+    assert QR(12201990, 520199) == (2, 4611095)
+
+
+def test_QI():
+    """ Tests function QI which gives the minimum cost for inverse QROM over
+        L values.
+    """
+    # Tests checked against Mathematica noteboook `costingTHC.nb`
+    # Arguments are otherwise random
+    assert QI(987654) == (10, 1989)
+    assert QI(8052021) == (11, 5980)
+
+
+def test_QR2():
+    """ Tests function QR2 which gives the minimum cost for a QROM with two
+        registers.
+    """
+    # Tests checked against Mathematica noteboook `costingsf.nb`
+    # Arguments are otherwise random
+    assert QR2(12, 34, 81) == (2, 2, 345)
+    assert QR2(712, 340111, 72345) == (4, 16, 8341481)
+
+
+def test_QI2():
+    """ Tests function QI which gives the minimum cost for inverse QROM with
+        two registers.
+    """
+    # Tests checked against Mathematica noteboook `costingsf.nb`
+    # Arguments are otherwise random
+    assert QI2(1234, 5678) == (32, 64, 5519)
+    assert QI2(7120, 1340111) == (4, 32768, 204052)
+
+
+def test_power_two():
+    """ Test for power_two(m) which returns power of 2 that is a factor of m """
+    try:
+        power_two(-1234)
+    except AssertionError:
+        pass
+    assert power_two(0) == 0
+    assert power_two(2) == 1
+    assert power_two(3) == 0
+    assert power_two(104) == 3  # 2**3 * 13
+    assert power_two(128) == 7  # 2**7
+    assert power_two(393120) == 5  # 2**5 * 3**3 * 5 * 7 * 13