
Create temporary ledger chunks while recovery is in progress #3563

Merged (44 commits) on Feb 22, 2022

jumaffre (Contributor) commented on Feb 16, 2022

Resolves #1652

This PR introduces a temporary .recovery suffix for ledger chunks created while a recovery is in progress. Nodes ignore these .recovery files on start-up, which means that a recovery can now be performed automatically after a previous attempt has failed.

This is done by:

  • Marking new ledger chunks as .recovery from the point the historical ledger is truncated (i.e. start of recovery) until the last recovery share is submitted.
  • These .recovery chunks can still be committed while the recovery is in progress, in which case they are named ledger_x-y.committed.recovery.
  • When the recovery is complete, the .recovery suffix is removed from all recovery ledger chunks.
  • Nodes ignore or delete .recovery ledger files on startup. In other words, they ignore previously failed recovery attempts and re-initiate a recovery as if the failed recovery never happened.
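The naming rules in the list above can be sketched with a few string helpers. This is a minimal illustration in Python, not the C++ code in src/host/ledger.h; the helper names and the example file names are hypothetical, and only the .committed and .recovery suffixes come from this description.

```python
# Hypothetical helpers illustrating the suffix rules described in this PR.
# They only manipulate file names; they are not CCF's implementation.

RECOVERY_SUFFIX = ".recovery"
COMMITTED_SUFFIX = ".committed"


def is_recovery_chunk(name: str) -> bool:
    """True if the chunk was created while a recovery was in progress."""
    return name.endswith(RECOVERY_SUFFIX)


def mark_recovery(name: str) -> str:
    """Append the .recovery suffix (no-op if already present)."""
    return name if is_recovery_chunk(name) else name + RECOVERY_SUFFIX


def strip_recovery(name: str) -> str:
    """Remove the .recovery suffix once the recovery is complete."""
    return name[: -len(RECOVERY_SUFFIX)] if is_recovery_chunk(name) else name


def is_committed(name: str) -> bool:
    """A chunk is committed whether or not it is also a recovery chunk."""
    return strip_recovery(name).endswith(COMMITTED_SUFFIX)


def commit(name: str) -> str:
    """Mark a chunk as committed, keeping .recovery as the last suffix."""
    if is_committed(name):
        return name
    committed = strip_recovery(name) + COMMITTED_SUFFIX
    return mark_recovery(committed) if is_recovery_chunk(name) else committed
```

Keeping .recovery as the outermost suffix means a start-up scan only has to check the end of each file name to decide whether a chunk belongs to an unfinished recovery.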

How is this tested?

  • Unit test in ledger.cpp
  • A new end-to-end test that checks that a service shut down before the last recovery share is submitted can be recovered by a new service
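The scenario covered by the end-to-end test can be approximated on a plain directory. The sketch below is not the actual test: it uses empty files in a temporary directory, hypothetical function names and chunk names, and it assumes that stale .recovery files are deleted on start-up (the description allows either ignoring or deleting them).

```python
import tempfile
from pathlib import Path
from typing import List

RECOVERY_SUFFIX = ".recovery"


def discard_stale_recovery_chunks(ledger_dir: Path) -> List[str]:
    """On start-up, delete chunks left behind by a failed recovery attempt."""
    removed = []
    for path in sorted(ledger_dir.iterdir()):
        if path.name.endswith(RECOVERY_SUFFIX):
            path.unlink()
            removed.append(path.name)
    return removed


def complete_recovery(ledger_dir: Path) -> List[str]:
    """Once the last recovery share is submitted, drop the .recovery suffix."""
    renamed = []
    for path in sorted(ledger_dir.iterdir()):
        if path.name.endswith(RECOVERY_SUFFIX):
            target = path.with_name(path.name[: -len(RECOVERY_SUFFIX)])
            path.rename(target)
            renamed.append(target.name)
    return renamed


def run_scenario() -> List[str]:
    """Failed first recovery attempt, followed by a successful second one."""
    with tempfile.TemporaryDirectory() as tmp:
        ledger_dir = Path(tmp)
        # Historical ledger that survives every recovery attempt.
        (ledger_dir / "ledger_1-10.committed").touch()

        # First attempt: chunks are written with the .recovery suffix, then
        # the service is shut down before the last share is submitted.
        (ledger_dir / "ledger_11-15.committed.recovery").touch()
        (ledger_dir / "ledger_16.recovery").touch()

        # A new node starts and discards the leftovers of the failed attempt.
        discard_stale_recovery_chunks(ledger_dir)

        # Second attempt runs to completion.
        (ledger_dir / "ledger_11-14.committed.recovery").touch()
        complete_recovery(ledger_dir)

        return sorted(p.name for p in ledger_dir.iterdir())
```

After `run_scenario()` returns, only the historical chunk and the chunk from the second attempt remain, and neither carries the .recovery suffix.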

TODO:

  • Further ledger unit tests
  • Add test to suite
  • Documentation
  • Inline comments in PR
  • Fix ledger_open call without argument
  • Service open as well as recovery

ghost commented on Feb 17, 2022

recovery_temporary_ledger_chunks@42264 aka 20220222.6 vs main ewma over 20 builds from 41879 to 42256


main

| build_id | build_number | tpcc_sgx_cft^ | tpcc_sgx_cft_mem | ls_sgx_cft^ | ls_sgx_cft_mem | ls_jwt_sgx_cft^ | ls_jwt_sgx_cft_mem | ls_js_sgx_cft^ | ls_js_sgx_cft_mem | ls_v8_sgx_cft^ | ls_v8_sgx_cft_mem | ls_full_js_sgx_cft^ | ls_full_js_sgx_cft_mem | ls_full_v8_sgx_cft^ | ls_full_v8_sgx_cft_mem | ls_js_jwt_sgx_cft^ | ls_js_jwt_sgx_cft_mem | hist_sgx_cft^ | RB put (/s)^ | CHAMP put (/s)^ | RB get (/s)^ | CHAMP get (/s)^ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 41879 | 20220215.35 | 5867.64 | 8.97885e+07 | 19734.9 | 1.6126e+07 | 5682.42 | 1.56017e+07 | 2527.3 | 1.0621e+07 | 1629.15 | 1.63713e+08 | 2051.63 | 1.21939e+07 | 1463.9 | 9.89635e+07 | 1961.04 | 9.04813e+06 | 19070.9 | 907997 | 1.35315e+06 | 9.36867e+06 | 3.65714e+07 |
| 41919 | 20220215.49 | 5659.44 | 9.05749e+07 | 18947.4 | 1.63882e+07 | 5526.55 | 1.56017e+07 | 2472.27 | 9.83456e+06 | 1572.28 | 1.63451e+08 | 2138.1 | 9.04813e+06 | 1423.79 | 9.87014e+07 | 1914.34 | 9.31027e+06 | 18735.6 | 877457 | 1.34356e+06 | 9.0419e+06 | 3.50679e+07 |
| 41940 | 20220215.57 | 5976.65 | 9.26721e+07 | 20000.8 | 1.63882e+07 | 5684.32 | 1.56017e+07 | 2524.82 | 1.03588e+07 | 1661.56 | 1.64237e+08 | 2109.61 | 1.03588e+07 | 1475.98 | 9.87014e+07 | 1878.96 | 1.11453e+07 | 21970.4 | 904869 | 1.3768e+06 | 9.27112e+06 | 3.58669e+07 |
| 41943 | 20220216.1 | 6053.86 | 9.10992e+07 | 19977.7 | 1.63882e+07 | 5632.89 | 1.6126e+07 | 2524.46 | 1.0621e+07 | 1663.47 | 1.63975e+08 | 2166.73 | 1.00967e+07 | 1482.81 | 9.89635e+07 | 1969.38 | 9.31027e+06 | 17362.9 | 879418 | 1.40148e+06 | 9.40308e+06 | 3.58669e+07 |
| 41971 | 20220217.1 | 5961.12 | 9.00506e+07 | 19956.7 | 1.66503e+07 | 5662.83 | 1.56017e+07 | 2545.28 | 1.03588e+07 | 1655.01 | 1.62402e+08 | 2185.9 | 9.83456e+06 | 1487.66 | 9.87014e+07 | 1984.7 | 9.31027e+06 | 19455.6 | 908722 | 1.38303e+06 | 9.21688e+06 | 3.67019e+07 |
| 42010 | 20220217.15 | 6000.5 | 9.03128e+07 | 19673.9 | 1.69124e+07 | 5661.87 | 1.58639e+07 | 2400.44 | 1.03588e+07 | 1637.96 | 1.63451e+08 | 2230.07 | 9.57242e+06 | 1478.06 | 9.87014e+07 | 1966.5 | 9.31027e+06 | 17686.2 | 903869 | 1.38602e+06 | 9.30905e+06 | 3.63766e+07 |
| 42017 | 20220217.18 | 5901.08 | 9.13613e+07 | 19836 | 1.74367e+07 | 5518.43 | 1.56017e+07 | 2533.56 | 1.00967e+07 | 1638.9 | 1.62665e+08 | 2149.57 | 9.83456e+06 | 1450.01 | 9.87014e+07 | 1937.65 | 1.11453e+07 | 19972 | 901482 | 1.39044e+06 | 9.33876e+06 | 3.58669e+07 |
| 42040 | 20220217.26 | 5837.37 | 8.97885e+07 | 19762.9 | 1.66503e+07 | 5636.59 | 1.56017e+07 | 2538.87 | 1.0621e+07 | 1644.94 | 1.63189e+08 | 2163.74 | 1.00967e+07 | 1462.36 | 9.76528e+07 | 1964.24 | 9.83456e+06 | 20054.6 | 903595 | 1.37394e+06 | 9.28798e+06 | 3.57417e+07 |
| 42063 | 20220218.1 | 5836.31 | 8.97885e+07 | 19979.8 | 1.58639e+07 | 5654.85 | 1.58639e+07 | 2534.49 | 1.03588e+07 | 1652.36 | 1.645e+08 | 2161.46 | 9.83456e+06 | 1461.92 | 9.89635e+07 | 2018.57 | 8.78598e+06 | 21834.2 | 903032 | 1.37219e+06 | 9.22934e+06 | 3.58663e+07 |
| 42069 | 20220218.3 | 6118.04 | 9.16235e+07 | 18538.9 | 1.6126e+07 | 5573.33 | 1.58639e+07 | 2525.65 | 1.03588e+07 | 1656.49 | 1.64237e+08 | 2162.93 | 1.00967e+07 | 1425.53 | 9.87014e+07 | 1964.72 | 9.31027e+06 | 20305.9 | 906268 | 1.41857e+06 | 9.4074e+06 | 3.54939e+07 |
| 42075 | 20220218.5 | 5645.07 | 9.05749e+07 | 19205.9 | 1.66503e+07 | 5589.19 | 1.56017e+07 | 2531.81 | 1.0621e+07 | 1635.21 | 1.61878e+08 | 2160.44 | 1.03588e+07 | 1467.45 | 9.87014e+07 | 1968.46 | 9.31027e+06 | 18034.8 | 922557 | 1.38107e+06 | 9.4074e+06 | 3.64413e+07 |
| 42089 | 20220218.11 | 5970.53 | 9.10992e+07 | 19748.5 | 1.63882e+07 | 5613.78 | 1.58639e+07 | 2526.78 | 1.08831e+07 | 1620.27 | 1.63975e+08 | 2114.89 | 9.83456e+06 | 1422.61 | 9.76528e+07 | 1907.5 | 9.31027e+06 | 18943.2 | 910096 | 1.38537e+06 | 9.25855e+06 | 3.57417e+07 |
| 42094 | 20220218.13 | 5968.92 | 9.00506e+07 | 20004.6 | 1.6126e+07 | 5685.72 | 1.56017e+07 | 2542.61 | 1.0621e+07 | 1659.39 | 1.61354e+08 | 2167.73 | 1.00967e+07 | 1477.41 | 9.92257e+07 | 1972.93 | 9.04813e+06 | 19940.9 | 910262 | 1.37829e+06 | 9.20445e+06 | 3.58036e+07 |
| 42121 | 20220218.24 | 5867.23 | 9.00506e+07 | 19773.3 | 1.66503e+07 | 5581.76 | 1.56017e+07 | 2542.89 | 1.03588e+07 | 1629.23 | 1.645e+08 | 2164.03 | 1.00967e+07 | 1479.84 | 9.84392e+07 | 1968.76 | 9.04813e+06 | 17943.9 | 893929 | 1.39395e+06 | 9.39445e+06 | 3.55556e+07 |
| 42144 | 20220218.32 | 5924.88 | 8.97885e+07 | 19749.4 | 1.69124e+07 | 5693.79 | 1.58639e+07 | 2523.42 | 1.0621e+07 | 1619.45 | 1.63975e+08 | 2165.48 | 1.14074e+07 | 1460.44 | 9.89635e+07 | 1938.67 | 9.57242e+06 | 19984.8 | 907952 | 1.38377e+06 | 9.21265e+06 | 3.58663e+07 |
| 42175 | 20220221.2 | 6075.56 | 9.13613e+07 | 19781 | 1.63882e+07 | 5494.48 | 1.56017e+07 | 2529.54 | 1.0621e+07 | 1624.12 | 1.61616e+08 | 2119.01 | 9.83456e+06 | 1475.48 | 9.84392e+07 | 1971.31 | 9.31027e+06 | 18140.9 | 902755 | 1.37902e+06 | 9.4334e+06 | 3.57417e+07 |
| 42191 | 20220221.7 | 6051.75 | 9.03128e+07 | 20063.8 | 1.69124e+07 | 5638.91 | 1.56017e+07 | 2540.21 | 1.08831e+07 | 1661.68 | 1.61354e+08 | 2182.64 | 9.83456e+06 | 1491.37 | 9.84392e+07 | 1943.5 | 9.57242e+06 | 17974.1 | 894982 | 1.38284e+06 | 9.27108e+06 | 3.56174e+07 |
| 42211 | 20220221.14 | 5930.94 | 9.13613e+07 | 19594.8 | 1.63882e+07 | 5652.28 | 1.56017e+07 | 2425.32 | 1.03588e+07 | 1612.61 | 1.63975e+08 | 2158.03 | 1.03588e+07 | 1473.15 | 9.84392e+07 | 2015 | 8.78598e+06 | 19769.4 | 879795 | 1.38154e+06 | 9.26269e+06 | 3.54939e+07 |
| 42229 | 20220221.20 | 5790.78 | 9.10992e+07 | 19795.7 | 1.6126e+07 | 5669.65 | 1.56017e+07 | 2402.08 | 1.0621e+07 | 1574.69 | 1.63713e+08 | 2159.76 | 1.32424e+07 | 1464.57 | 9.87014e+07 | 1926.65 | 9.57242e+06 | 17878.3 | 880893 | 1.34976e+06 | 9.19615e+06 | 3.5249e+07 |
| 42256 | 20220222.2 | 6017.97 | 9.18856e+07 | 19773.7 | 1.69124e+07 | 5594.56 | 1.56017e+07 | 2533.2 | 1.0621e+07 | 1650.07 | 1.64237e+08 | 2167.91 | 1.00967e+07 | 1474.53 | 9.87014e+07 | 1971.99 | 9.57242e+06 | 20985.9 | 898634 | 1.37634e+06 | 9.33876e+06 | 3.63766e+07 |

recovery_temporary_ledger_chunks

| build_id | build_number | tpcc_sgx_cft^ | tpcc_sgx_cft_mem | ls_sgx_cft^ | ls_sgx_cft_mem | ls_jwt_sgx_cft^ | ls_jwt_sgx_cft_mem | ls_js_sgx_cft^ | ls_js_sgx_cft_mem | ls_v8_sgx_cft^ | ls_v8_sgx_cft_mem | ls_full_js_sgx_cft^ | ls_full_js_sgx_cft_mem | ls_full_v8_sgx_cft^ | ls_full_v8_sgx_cft_mem | ls_js_jwt_sgx_cft^ | ls_js_jwt_sgx_cft_mem | hist_sgx_cft^ | RB put (/s)^ | CHAMP put (/s)^ | RB get (/s)^ | CHAMP get (/s)^ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 42125 | 20220218.25 | 5898.51 | 8.92642e+07 | 20092.8 | 1.66503e+07 | 5577.06 | 1.58639e+07 | 2540.55 | 1.03588e+07 | 1624.06 | 1.63713e+08 | 2164.71 | 1.00967e+07 | 1469.85 | 9.87014e+07 | 1967.68 | 9.57242e+06 | 18129.2 | 901207 | 1.40784e+06 | 9.20445e+06 | 3.64413e+07 |
| 42157 | 20220218.36 | 5833.9 | 9.03128e+07 | 19565.5 | 1.6126e+07 | 5645.26 | 1.53396e+07 | 2532.75 | 1.03588e+07 | 1610.56 | 1.63451e+08 | 2155.16 | 1.03588e+07 | 1448.32 | 9.84392e+07 | 1965.4 | 9.31027e+06 | 21885.4 | 888388 | 1.38341e+06 | 9.28377e+06 | 3.58669e+07 |
| 42204 | 20220221.11 | 6020.49 | 9.03128e+07 | 19493.2 | 1.63882e+07 | 5699.14 | 1.56017e+07 | 2532.97 | 1.08831e+07 | 1657.83 | 1.61354e+08 | 2159.27 | 1.00967e+07 | 1497.35 | 9.79149e+07 | 1963.78 | 9.31027e+06 | 18063.9 | 909241 | 1.37476e+06 | 9.1961e+06 | 3.54933e+07 |
| 42245 | 20220221.26 | 5929.28 | 9.03128e+07 | 19925.5 | 1.63882e+07 | 5671.86 | 1.56017e+07 | 2529.98 | 1.0621e+07 | 1626.42 | 1.64237e+08 | 2232.92 | 9.57242e+06 | 1480.46 | 9.87014e+07 | 2013.82 | 8.78598e+06 | 20006.7 | 918381 | 1.37717e+06 | 9.39441e+06 | 3.65062e+07 |
| 42264 | 20220222.6 | 5878.64 | 8.97885e+07 | 19858.3 | 1.63882e+07 | 5632.47 | 1.58639e+07 | 2522 | 1.03588e+07 | 1570.65 | 1.64237e+08 | 2159.45 | 1.00967e+07 | 1471.06 | 9.81771e+07 | 1924.78 | 1.14074e+07 | 19442.8 | 926148 | 1.36843e+06 | 9.47271e+06 | 3.64413e+07 |

@jumaffre jumaffre marked this pull request as ready for review February 17, 2022 15:24
@jumaffre jumaffre requested a review from a team February 17, 2022 15:24
@jumaffre jumaffre merged commit 296069b into microsoft:main Feb 22, 2022
Linked issue that this pull request may close: Failed public recovery entries will block further recoveries