Quick PDF parsing

PDF Parser Command Line

xpdf

xpdf file.pdf
xpdf file.pdf :18

OCR Commands

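PDF pages to images
rem Page indices in [1-6] are zero-based, so this selects pages 2 to 7.
rem In a .bat file, double the percent signs: 75%% and pdfPage-%%d.png
magick -density 150 file.pdf[1-6] -quality 90 -resize 75% pdfPage-%d.png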

Image to Text
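rem Single page (tesseract appends .txt to the output base, so leave the extension off):
rem tesseract pdfPage-2.png pdfText/pdfPage-2
rem tesseract does not expand [3-6] or %d itself, so this form does not work; loop instead:
rem tesseract pdfPage-[3-6].png pdfText/pdfPage-%d.md
rem Pages 3 to 6 at the cmd prompt (the pdfText folder must already exist):
FOR /L %y IN (3, 1, 6) do tesseract "pdfPage-%y.png" pdfText/pdfPage-%y
rem The same loop in a .bat file, with the percent signs doubled:
rem FOR /L %%y IN (3, 1, 6) do tesseract "pdfPage-%%y.png" pdfText/pdfPage-%%y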

rem Malayalam page: -l selects the OCR language (mal needs its traineddata file installed)
tesseract pdfPage-6.png pdfText/pdfPage-6 -l mal
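
Python sketch of the same steps

The cmd loop above is tied to Windows. A rough, portable equivalent might look like the sketch below. It assumes `magick` and `tesseract` are on the PATH. The function names (`magick_command`, `tesseract_commands`, `run_ocr`) are made up for illustration and are not part of either tool. The commands are built as lists, so they can be checked before anything runs.

```python
import subprocess
from pathlib import Path


def magick_command(pdf, first, last, prefix="pdfPage"):
    """Build the ImageMagick call from the notes.

    first/last are ImageMagick page indices (zero-based).
    """
    return [
        "magick", "-density", "150", f"{pdf}[{first}-{last}]",
        "-quality", "90", "-resize", "75%", f"{prefix}-%d.png",
    ]


def tesseract_commands(first, last, out_dir="pdfText", prefix="pdfPage", lang=None):
    """Build one tesseract call per image number in first..last (inclusive)."""
    commands = []
    for n in range(first, last + 1):
        # tesseract adds .txt itself, so the output base has no extension
        cmd = ["tesseract", f"{prefix}-{n}.png", f"{out_dir}/{prefix}-{n}"]
        if lang:
            cmd += ["-l", lang]  # e.g. "mal" for Malayalam
        commands.append(cmd)
    return commands


def run_ocr(first, last, out_dir="pdfText", lang=None):
    """Create the output folder if needed, then OCR each page image in turn."""
    Path(out_dir).mkdir(exist_ok=True)
    for cmd in tesseract_commands(first, last, out_dir, lang=lang):
        subprocess.run(cmd, check=True)  # raises if tesseract exits with an error
```

Calling `run_ocr(3, 6)` would mirror the FOR /L loop, and `run_ocr(6, 6, lang="mal")` the Malayalam command. Passing the list from `magick_command("file.pdf", 1, 6)` to `subprocess.run` would mirror the page-to-image step.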