The script is overly complex for what it does and should also use arguments as input/output parameters #2

Open
systemofapwne opened this issue Jan 6, 2023 · 0 comments


First, thank you for your script: it pointed me to the steps needed to properly convert color PDFs to monochrome PDFs.

However, the script looks overly complex, as if written by someone who tinkered around "until it works" without knowing the full potential of bash scripting. I do not blame you for that: we all started out like that.
This is why I want to contribute a simplified and optimized version that does everything more elegantly and also faster, using parallel processing.

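IN=my_colorized.pdf
OUT=./out.pdf

# Create a fresh temp directory so stale files from earlier runs cannot leak in
TMP=$(mktemp -d)

# Extract the images embedded in the PDF as single files (im-000.ppm, im-001.ppm, ...)
pdfimages "${IN}" "${TMP}/im"

# Convert with a limited number of parallel jobs using GNU parallel
# (each im-NNN.ppm becomes im-NNN.ppm.pdf)
THREADS=4
find "${TMP}" -name "*.ppm" | parallel -I% -j "${THREADS}" --max-args 1 convert -monochrome % %.pdf

# Combine (the glob expands in sorted order, which keeps the page order)
pdfunite "${TMP}"/*.pdf "${OUT}"

# Cleanup
rm -r "${TMP}"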

The only additional dependency here is GNU parallel, which distributes the black-and-white conversion tasks across multiple parallel jobs to speed things up.

I also suggest that you follow the Unix philosophy and take the input and output files as command-line arguments to your program.
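
To illustrate that last point, here is a minimal sketch of how the argument handling could look. The function names `usage` and `parse_args`, the `-j` option and the default of 4 jobs are my own assumptions for illustration; none of them come from the original script.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: usage, parse_args, the -j option and its default
# are illustrative assumptions, not part of the original script.

usage() {
    echo "Usage: ${0##*/} [-j THREADS] INPUT.pdf OUTPUT.pdf" >&2
}

# Parse "[-j THREADS] INPUT OUTPUT" into the globals THREADS, IN and OUT.
# Returns non-zero on misuse instead of exiting, so the caller decides
# how to react.
parse_args() {
    local opt OPTIND=1
    THREADS=4    # assumed default, same value as in the script above
    while getopts ":j:" opt; do
        case "$opt" in
            j) THREADS=$OPTARG ;;
            *) usage; return 1 ;;   # unknown option or missing value
        esac
    done
    shift $((OPTIND - 1))

    # Exactly two positional arguments are expected: input and output
    if [ "$#" -ne 2 ]; then
        usage
        return 1
    fi
    IN=$1
    OUT=$2
}
```

The script would then start with `parse_args "$@" || exit 1` in place of the hard-coded `IN=`, `OUT=` and `THREADS=` lines, and could be called as, say, `./pdf2bw.sh -j 8 my_colorized.pdf out.pdf` (the script name is made up).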
