John Hawthorn

Extracting images from a PDF

I was reading a PDF that consisted entirely of scanned JPEGs. My PDF viewer was slow and cumbersome with it, and I would rather have been reading it in meh. Fortunately, there’s a simple tool for getting the images out.

pdfimages -j src.pdf dest

Handy. This extracts the images in src.pdf to files named dest-000.jpg, dest-001.jpg, and so on. The -j flag tells pdfimages to write images that are stored in the PDF as JPEGs directly as JPEG files, so they are not converted and lose no quality. Images stored in other formats are still written as PPM or PBM files.
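
If you have several PDFs like this, the same command can be wrapped in a small loop. The following is only a sketch: the function name extract_pdf_images and the "page" output prefix are my own choices for illustration, not anything pdfimages defines. It assumes each file name ends in .pdf and that pdfimages is on your PATH.

```shell
# Hypothetical helper: extract the images of each PDF into a directory
# named after the PDF (foo.pdf -> foo/page-000.jpg, foo/page-001.jpg, ...).
extract_pdf_images() {
    for pdf in "$@"; do
        # Strip the leading directories and the .pdf suffix.
        name=$(basename "$pdf" .pdf)
        # Create the output directory in the current working directory,
        # then run the same pdfimages command as above with a per-PDF prefix.
        mkdir -p "$name" && pdfimages -j "$pdf" "$name/page"
    done
}
```

Calling extract_pdf_images *.pdf in a directory of scans should leave one directory of images per PDF, which can then be opened in an image viewer.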