Sichuan Cookbook (四川菜谱) is a renowned classic, first published in July 1972 by Chengdu Catering Company (成都市饮食公司). It contains 312 traditional recipes, many of which are no longer served in modern restaurants. The aim of this project is to digitize the Sichuan Cookbook accurately, preserving its rich culinary history.
- Photograph each page of the book using a digital camera.
- Enhance the photos, correcting exposure and perspective distortion, and adjust the typographical area.
- Convert the enhanced photos into binary images.
- Perform Optical Character Recognition (OCR) on the binary images.
- Generate a PDF of the original book from the binary images and the OCR results.
- Abstract each recipe into a data structure and document it in LaTeX.
- Replicate the original book's appearance using LaTeX typesetting.
- Compile the LaTeX documents to create a final PDF version.
- Optionally, annotate the book with additional research findings.
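The step of abstracting each recipe into a data structure can be illustrated with a minimal sketch. The `Recipe` class, its field names, and the `to_latex` helper are assumptions made for illustration only; the project's actual LaTeX macros and recipe layout may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Recipe:
    """Hypothetical container for one recipe; field names are illustrative."""
    title: str
    ingredients: list[str] = field(default_factory=list)
    steps: list[str] = field(default_factory=list)


def to_latex(recipe: Recipe) -> str:
    """Render a recipe as a LaTeX fragment using only standard LaTeX commands."""
    lines = [f"\\section*{{{recipe.title}}}"]
    if recipe.ingredients:
        lines.append("\\begin{itemize}")
        lines += [f"  \\item {item}" for item in recipe.ingredients]
        lines.append("\\end{itemize}")
    if recipe.steps:
        lines.append("\\begin{enumerate}")
        lines += [f"  \\item {step}" for step in recipe.steps]
        lines.append("\\end{enumerate}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder content, not taken from the book.
    demo = Recipe("Example dish", ["ingredient A", "ingredient B"], ["First step.", "Second step."])
    print(to_latex(demo))
```

Keeping the recipe data separate from its rendering means the same record could feed both a faithful replica of the original layout and an annotated edition.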
According to the Copyright Law of China, concerning a work of a legal person or other organization, the term of protection for the copyrights to that work shall be 50 years and shall end on December 31 of the 50th year after the work's first publication. Consequently, the first edition of the Sichuan Cookbook (四川菜谱) entered the public domain after December 31, 2022. This project was made publicly available on January 1, 2023.
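The term calculation above can be checked with a few lines of Python. This is only a worked example of the arithmetic described in the paragraph; the function name `protection_end` is made up for illustration and the code is not legal advice.

```python
from datetime import date, timedelta


def protection_end(first_publication_year: int, term_years: int = 50) -> date:
    """Last day of protection: December 31 of the term_years-th year after first publication."""
    return date(first_publication_year + term_years, 12, 31)


end = protection_end(1972)                   # first edition published in 1972
public_domain_from = end + timedelta(days=1)

print(end.isoformat())                 # 2022-12-31
print(public_domain_from.isoformat())  # 2023-01-01
```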
For a modern eBook format, download the Sichuan Cookbook 1972 Remake (3.46 MB) in A5 paper size (210mm x 148mm), a remade edition of the original paperback.
You can download a draft of the Sichuan Cookbook 1972 (75.6 MB), which is a scanned copy in 185mm x 130mm paper size with an unproofed OCR text layer, for reference.
Debian and Ubuntu are the preferred operating systems, though other Linux distributions are compatible. Apple macOS is also suitable if the necessary command-line tools are installed.
- Each book page was photographed with a digital camera and processed with Adobe Lightroom Classic for RAW decoding, perspective distortion correction, and other minor adjustments.
- To avoid bloating the git repository, all JPEG photos are hosted on user-images.githubusercontent.com. Download all JPEG photos (625 MiB) using the following command:
make -C jpeg
- Alternatively, download all JPEG files in a single tarball from here (625 MB).
- The original book's dimensions (185mm x 130mm) are maintained, corresponding to an aspect ratio of approximately √2:1.
- For 600dpi resolution, the digital image size should be 4370px x 3091px.
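The pixel dimensions can be related to the physical page size with a short calculation. The sketch below is an illustration, not part of the project's tooling; `mm_to_px` is a hypothetical helper. It shows that 185 mm at 600 dpi gives the 4370 px long side, while the 3091 px short side is close to 4370/√2 rather than to the value 130 mm would give directly. That the short side was derived from the √2:1 ratio is an inference from the numbers, not something the project states.

```python
import math

MM_PER_INCH = 25.4


def mm_to_px(mm: float, dpi: int = 600) -> int:
    """Convert a physical length in millimetres to pixels at the given resolution."""
    return round(mm / MM_PER_INCH * dpi)


long_side = mm_to_px(185)                   # 4370
short_side_from_mm = mm_to_px(130)          # 3071
short_side_from_ratio = long_side / math.sqrt(2)  # slightly above 3090

print(long_side, short_side_from_mm, short_side_from_ratio)
```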
- All JPEG photos are processed with ImageMagick, and OCR is performed with the Tesseract Open Source OCR Engine.
sudo apt-get install imagemagick tesseract-ocr tesseract-ocr-chi-sim
- Compile the processed images into a PDF using the following commands:
make -C jpeg
make scan
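For readers curious about what the binarization and OCR steps involve, here is a rough Python sketch that builds the corresponding ImageMagick and Tesseract command lines for a single page. It is not the project's Makefile: the function names, the 50% global threshold, and the file naming are assumptions, and the real pipeline may use different options.

```python
import subprocess
from pathlib import Path


def binarize_cmd(src: Path, dst: Path, threshold: str = "50%") -> list[str]:
    """ImageMagick command: convert to grayscale, then apply a global threshold.

    Uses the ImageMagick 6 `convert` entry point; on ImageMagick 7 the
    equivalent entry point is `magick`. The threshold value is an assumption.
    """
    return ["convert", str(src), "-colorspace", "Gray", "-threshold", threshold, str(dst)]


def ocr_cmd(image: Path, out_base: Path, lang: str = "chi_sim") -> list[str]:
    """Tesseract command: the trailing `pdf` config writes <out_base>.pdf with a text layer."""
    return ["tesseract", str(image), str(out_base), "-l", lang, "pdf"]


def process_page(src: Path, workdir: Path) -> Path:
    """Binarize one JPEG page and OCR it; requires both tools to be installed."""
    binary = workdir / (src.stem + ".png")
    subprocess.run(binarize_cmd(src, binary), check=True)
    subprocess.run(ocr_cmd(binary, workdir / src.stem), check=True)
    return workdir / (src.stem + ".pdf")


if __name__ == "__main__":
    # Print the commands for a hypothetical page without running them.
    print(" ".join(binarize_cmd(Path("page-001.jpg"), Path("page-001.png"))))
    print(" ".join(ocr_cmd(Path("page-001.png"), Path("page-001"))))
```

The `chi_sim` language code corresponds to the `tesseract-ocr-chi-sim` package installed above.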
- The book is recreated using XeLaTeX with support for multiple fonts:
sudo apt-get install -y fonts-cns11643-kai fonts-hanazono fonts-noto texlive-full
- Compile the LaTeX remake into a PDF with the following command:
make -C latex
- We invite contributions for proofreading each recipe under the latex directory.
- Volunteers can claim unassigned tasks listed in the GitHub issues.
- Use the A4-size scanned copy for printing (80.6 MB) as a reference when proofreading.
- Report any typographical errors by commenting on the issue page where you claimed the task. Pull requests for corrections are highly encouraged.