PDF Conversion Options
In Batch Mode, the PDF Conversion Options are on the Batch Settings tab. In Watch Mode, these options belong to the watch folder properties. Double-click a folder in the Watch Selection List to see the options.
The PDF Conversion Options govern how the output PDF file is created. There are three major groups of options: Compression, OCR and Linearization.
Compression is one of the most important concepts in document image processing. A high-resolution color 8.5" x 11" page can easily consume 100 MB of hard disk space if it is not compressed. Black & white images are always compressed with CCITT Group 4 Fax, the best available lossless compression algorithm for such images, which is also used by fax machines. For grayscale and color images, however, you choose the compression yourself.
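
The 100 MB figure can be checked with simple arithmetic. The sketch below is only an illustration: it assumes a 600 dpi scan at 24 bits per pixel (neither value is stated in this help topic), ignores any row padding or file headers, and uses a made-up helper name, `uncompressed_size_bytes`.

```python
def uncompressed_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Return the raw (uncompressed) size of a scanned page in bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# Assumed scan settings: US Letter page at 600 dpi.
letter_color = uncompressed_size_bytes(8.5, 11, 600, 24)  # 24-bit color
letter_gray = uncompressed_size_bytes(8.5, 11, 600, 8)    # 8-bit grayscale
letter_bw = uncompressed_size_bytes(8.5, 11, 600, 1)      # 1-bit black & white

print(f"24-bit color:  {letter_color / 1_000_000:.1f} MB")  # 101.0 MB
print(f"8-bit gray:    {letter_gray / 1_000_000:.1f} MB")   # 33.7 MB
print(f"1-bit B&W:     {letter_bw / 1_000_000:.1f} MB")     # 4.2 MB
```

Under these assumptions a single color page is about 101 MB before compression, which is why the choice of compression for grayscale and color images matters so much.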
If file sizes are important to you, also take a look at the Processing and Downscale options. Most likely, smoothing and downscaling will reduce your file size drastically while keeping the quality acceptable.
Optical Character Recognition (OCR) is part of the professional version of ELAN Converter™ and lets you make your PDF files searchable. If you choose to OCR your input images, the output PDF files will contain a hidden layer of text, with each word overlapping the corresponding word in the raster image. After OCR, your output is still raster and looks exactly the way the image was scanned, but users can select words, copy text to the clipboard and, most importantly, search within the document. You can also publish the output PDF files to a full-text searchable environment, such as ELAN Web Search™ or ELAN CD-ROM Retrieval™.
OCR takes time. It makes the conversion significantly slower and the output file slightly bigger. So if you do not need hidden text, do not OCR.
The PDF Format Type option has two choices: Image-only or Image+Text. Choose Image-only for raster-only PDF files, and select Image+Text if you want to add a hidden text layer and make the output PDF searchable.
Notes: Downscaling and hidden text work well together. The OCR is always done before the downscale, so it is performed on the high-resolution source image even if the output is a very low-resolution image.
Linearization is often called optimization. ELAN Converter™ has the option to optimize your PDF files to make them Web-ready. Note, however, that optimized PDF files are not really optimal in terms of size or conversion speed. It takes a little more time to create an optimized (linearized) PDF file, and the output files will be slightly bigger (a few hundred bytes per page). Also, linearized PDFs do not download any faster from an FTP site than non-optimized files.
The real advantage of linearization is online browsing, when your users do not plan to download your PDF files but browse them on the Web. If a PDF file is not linearized, visitors have to download the entire file before the first page appears on screen, which could take minutes or even an hour. If a PDF file is linearized, Web servers can work together with Adobe Acrobat Reader™ to serve the file page by page: the first page appears in seconds, and so does any page the visitor navigates to. Only the visited pages are downloaded, so instead of the entire document you download only a fraction of it, on demand.
If you are interested in Web hosting, you will want to linearize your PDF files. It costs only a little extra time and file size, so it is worth it. Check Linearized (Web-ready) PDF to do so; uncheck it if you never plan to host your PDF files on the Web. Note that if you create a non-linearized file, it will not be easy to linearize it later, so think it over carefully. We recommend that you linearize your files if you are in doubt.
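
If you are unsure whether an existing PDF file was created with the Linearized (Web-ready) option, a quick check is possible outside ELAN Converter™. The sketch below is not part of the product; it relies on the PDF format rule that a linearized file keeps its linearization parameter dictionary (marked by the `/Linearized` key) within the first 1024 bytes. The function name `looks_linearized` is made up for this example, and the result is only a heuristic: a file that was modified after linearization may still carry the marker.

```python
import sys


def looks_linearized(path):
    """Heuristic: True if the file starts like a linearized PDF."""
    with open(path, "rb") as f:
        head = f.read(1024)  # the linearization dictionary must fit in here
    return head.startswith(b"%PDF-") and b"/Linearized" in head


if __name__ == "__main__":
    # Usage: python check_linearized.py file1.pdf file2.pdf ...
    for pdf_path in sys.argv[1:]:
        status = "linearized" if looks_linearized(pdf_path) else "not linearized"
        print(f"{pdf_path}: {status}")
```

A file reported as "not linearized" will be served to Web visitors only after a full download, so it is a candidate for reconversion with the option checked.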
© 2002-2009 ELAN GMK



