
PDF Conversion Options


In Batch Mode, the PDF Conversion Options are on the Batch Settings tab. In Watch Mode, they are part of the watch folder properties: double-click a folder in the Watch Selection List to see them.

 

[Screenshot: PDF Conversion Options (pdf_options-02)]

The PDF Conversion Options control how the output PDF files are created. They fall into three major groups: Compression, OCR, and Linearization.

Compression is one of the most important concepts in document image processing. An uncompressed high-resolution color 8.5" x 11" page could easily consume 100 MB of hard disk space. Black-and-white images are always compressed with CCITT Group 4 Fax, the best available lossless compression algorithm for such images, which is also used by fax machines. For grayscale and color images, however, you decide which compression to use:
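To see where a figure like 100 MB comes from, the short calculation below works out the raw size of one page. It is only a sketch: the 600 dpi resolution and the bit depths are assumptions chosen for illustration, and `uncompressed_bytes` is a hypothetical helper, not part of ELAN Converter™.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Raw size in bytes of one uncompressed raster page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Assumed scan settings: 8.5" x 11" page at 600 dpi.
for label, bits in (("24-bit color", 24), ("8-bit grayscale", 8), ("1-bit black & white", 1)):
    size = uncompressed_bytes(8.5, 11, 600, bits)
    print(f"{label}: {size / 1e6:.1f} MB")
```

At these assumed settings the page is 5100 x 6600 pixels, so the 24-bit color version needs 100,980,000 bytes before any compression is applied.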

Deflate: This is a fast lossless compression format, intended for the professional market, where flawless image quality is more important than file size. Lossless means that every single pixel in the output PDF file will be exactly the same as in the input image. You can type in a compactness factor between 1 and 100. The higher the number, the smaller the output and the slower the conversion will be. Numbers above 60 do not significantly improve the compression but slow down the conversion considerably. No matter what number you type in, the resulting PDF files will be very large, often 10 times larger than with JPEG compression. If the input images are JPEG, Deflate compression is not recommended: the input has already lost detail to JPEG compression, so even lossless output cannot make it perfect, and your files will be unreasonably large.
JPEG: This is the default and recommended compression. It is lossy, which means the output image will not be identical to the input, but the two will look reasonably similar. You can specify a quality factor between 1 and 100, where 100 means premium quality, 83 is good quality, and 50 is low quality. The higher the number, the better the quality and the larger the PDF files will be. If your input files are JPEG, always choose JPEG compression. If you pick 90 or higher, the loss will not be noticeable: the quality will come very close to that of Deflate compression, while the file size will still be considerably smaller. As you decrease the quality towards 70, the file size drops sharply without damaging the quality too much. Going below 60 reduces the output size only a little more but degrades the quality drastically, so do not go below 60 if quality is important to you. In other words, there is a threshold beyond which the file size barely improves but the quality suffers noticeably. Unfortunately, this threshold is subjective, differs from image to image, and differs between printing and screen use, so no single number will satisfy every user. That is why it is so important that you are aware of the JPEG Quality option.
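The lossless behavior of Deflate described above can be tried out with Python's standard `zlib` module, which implements the same Deflate algorithm that PDF uses. This is a minimal sketch, not the converter's own code: the `to_zlib_level` mapping from a 1 to 100 compactness factor onto zlib's levels 1 to 9 is a hypothetical assumption, and the pixel data is synthetic.

```python
import zlib


def to_zlib_level(compactness):
    """Map a 1-100 compactness factor onto zlib levels 1-9 (assumed mapping)."""
    if not 1 <= compactness <= 100:
        raise ValueError("compactness must be between 1 and 100")
    return 1 + (compactness - 1) * 8 // 99


# Synthetic 8-bit grayscale image: a 256 x 256 diagonal gradient.
pixels = bytes((x + y) % 256 for y in range(256) for x in range(256))

for compactness in (1, 60, 100):
    level = to_zlib_level(compactness)
    packed = zlib.compress(pixels, level)
    # Lossless: decompressing returns exactly the original bytes.
    assert zlib.decompress(packed) == pixels
    print(f"compactness {compactness:3d} -> zlib level {level}: {len(packed)} bytes")
```

Whatever level is chosen, the round trip returns the original pixels unchanged; only the packed size and the time spent compressing differ. A comparable JPEG test would need an imaging library and is left out here.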

If file sizes are important to you, also take a look at the Processing and Downscale options. Smoothing and downscaling the images will most likely reduce your file size drastically while keeping the quality acceptable.

Optical Character Recognition (OCR) is part of the professional version of ELAN Converter™ and lets you make your PDF files searchable. If you choose to OCR your input images, the output PDF files will contain a hidden layer of text, with each word positioned over the same word in the raster image. After OCR, your output is still raster and looks exactly the way the image was scanned, but users can select words, copy text to the clipboard, and, most importantly, search within the document. You can also publish the output PDF files to a full-text searchable environment, such as ELAN Web Search™ or ELAN CD-ROM Retrieval™.

OCR takes time: it makes the conversion significantly slower and the output files slightly larger. So if you do not need hidden text, do not OCR. The PDF Format Type option has two choices: Image-only or Image+Text. Choose Image-only for raster-only PDF files, and select Image+Text if you want to add a hidden text layer and make the output PDF searchable.

Note: Downscaling and hidden text work well together. We take extra care to do the OCR before the downscale, so the OCR is performed on the high-resolution source image even if the output is a very low-resolution image.
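One way to picture why the two can coexist: PDF page coordinates are measured in points (1/72 inch), not in image pixels, so a word box found on the high-resolution scan keeps its position on the page when the image is later downscaled. The sketch below illustrates that idea only. `WordBox`, the sample word, and the dpi values are hypothetical, and nothing here is claimed to be how ELAN Converter™ is implemented.

```python
from dataclasses import dataclass

POINTS_PER_INCH = 72  # default PDF user-space unit is 1/72 inch


@dataclass
class WordBox:
    """One recognized word and its box in source-image pixels (hypothetical)."""
    text: str
    x_px: int
    y_px: int
    w_px: int
    h_px: int


def box_in_points(box, source_dpi):
    """Convert an OCR box from source pixels to page coordinates in points."""
    return tuple(v * POINTS_PER_INCH / source_dpi
                 for v in (box.x_px, box.y_px, box.w_px, box.h_px))


def page_width_in_points(width_px, dpi):
    """Physical page width implied by a pixel width at a given resolution."""
    return width_px * POINTS_PER_INCH / dpi


# OCR runs on the 300 dpi scan of an 8.5 inch wide page (2550 pixels).
word = WordBox("Invoice", x_px=300, y_px=600, w_px=450, h_px=75)
print(box_in_points(word, source_dpi=300))

# The image is then downscaled to 100 dpi (850 pixels). The page width in
# points is unchanged, so the word box above still lines up with the image.
print(page_width_in_points(2550, 300), page_width_in_points(850, 100))
```

Because the box is expressed in page units, the recognition quality depends only on the source resolution, while the output image can be as coarse as the Downscale option makes it.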

Linearization is often called optimization. ELAN Converter™ has the option to optimize your PDF files to make them Web-ready. Note, however, that optimized PDF files are not optimal in terms of size or conversion speed. It takes a little more time to create an optimized (linearized) PDF file, and the output files will be slightly bigger (a few hundred bytes per page). Also, linearized PDFs will not download any faster from an FTP site than non-optimized files.

The real advantage of linearization is online browsing, when your users do not plan to download your PDF files but browse them on the Web. If a PDF file is not linearized, visitors have to download the entire file before the first page appears on screen, which could take minutes or even an hour. If a PDF file is linearized, Web servers are able to work together with Adobe Acrobat Reader™ to serve the file page by page, so the first page appears in seconds, and so does any page the visitor navigates to. Only the visited pages are downloaded: instead of the entire document, you download only a fraction of it, on demand.
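If you need to check whether an existing file is linearized, a quick heuristic is to look for the linearization parameter dictionary, which the PDF specification places within the first 1024 bytes of a linearized file. The sketch below is an assumption-laden check, not a validator: `looks_linearized` is a hypothetical helper, the sample byte strings are synthetic, and a file changed by a later incremental update may still carry the marker without being validly linearized.

```python
def looks_linearized(pdf_bytes):
    """Heuristic: a PDF header plus a /Linearized key in the first 1024 bytes."""
    head = pdf_bytes[:1024]
    return head.startswith(b"%PDF-") and b"/Linearized" in head


# Synthetic file beginnings, for illustration only.
linearized_head = b"%PDF-1.4\n1 0 obj\n<< /Linearized 1 /L 1234 >>\nendobj\n"
plain_head = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"

print(looks_linearized(linearized_head))
print(looks_linearized(plain_head))
```

For a real file, read only the beginning, for example `open(path, "rb").read(1024)`, and pass the result to the function.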

If you are interested in Web hosting, you should linearize your PDF files. It costs only a little extra time and file size, so it is worth it. Check Linearized (Web-ready) PDF to do so, or uncheck it if you never plan to host your PDF files on the Web. Note that if you create non-linearized files, it will not be easy to linearize them later, so think it over carefully. If you are in doubt, we recommend that you linearize your files.


© 2002-2009 ELAN GMK