I have about 20-30 folders of old documents from university (mostly typewritten with illustrations and handwritten notes).
I recently bought a duplex scanner (Epson DS-410) and am still pondering the choice of file format, DPI setting, and OCR. I would like to archive the documents in the best possible quality and then dispose of the folders.
Question 1: I am currently leaning towards PDF/A and wondering whether it is the best choice, or whether .tiff or .png would have an advantage. I noticed that, unlike with regular PDF, I can't choose a compression level with PDF/A. Is PDF/A lossless?
Question 2: Can multiple .tiff or .png documents also be easily converted into a PDF, or is this not a good idea?
Question 3: For the PDF or PDF/A file format, my scanner has the option to create a searchable document (OCR) directly. Is this recommended, or would it be better to add OCR afterwards, either with a specialized tool or by importing the files into paperless-ngx (which I don't have yet but will probably use in the near future)?
I'm not sure whether OCR quality differs between tools, i.e. whether there is such a thing as "bad" or "good" OCR.
My goal is to create the scans in the best possible file format. If .tiff or .png is preferable as an archive format, there must be an easy way to convert the files to PDF; otherwise those formats are not an option for me.
I would appreciate your advice.