General Document Scanning Process

Standard workflow steps and features of a data capture process whose main destination is re-printing (on-demand printing).

Document Preparation

This step prepares the source documents for the document feeder of the high-speed scanner. Several operations may be required. The most common are:

  • Remove the spine of the book or manual
  • Remove the documents from the binder
  • Replace tabs with bar code separator sheets
  • Remove staples and paper clips

High-Speed Scanning

The standard scanning resolution for re-printing purposes is 600 DPI, which gives excellent reproduction quality and can be downscaled later in the process if needed. The standard scanning mode is grayscale or color, which is needed to perform page segmentation on documents containing pictures and images. The exception is text-only documents, which are scanned in black and white at 600 DPI and saved in TIFF G4 (Group 4) format.
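
To give a feel for what 600 DPI means in practice, the following sketch estimates the uncompressed size of one page side in each scanning mode. The US Letter page size and the helper name raw_size_bytes are assumptions chosen for illustration. Real files are compressed and may pad image rows, so actual sizes will differ.

```python
def raw_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size of one scanned page side, in bytes (no row padding)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Assumed page size: US Letter, 8.5 x 11 inches, scanned at 600 DPI.
for mode, bpp in [("bitonal", 1), ("grayscale", 8), ("color", 24)]:
    size = raw_size_bytes(8.5, 11, 600, bpp)
    print(f"{mode:9s} {size / 1e6:6.1f} MB uncompressed")
```

Halving the resolution to 300 DPI halves each pixel dimension, so the raw size drops to one quarter.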

  • Scanning with a file naming convention that tracks page numbers for books or manuals (to catch duplicate or missing pages), whenever applicable
  • QC for page completeness
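
A page completeness check over such file names could be sketched as follows. The naming pattern (job name, then _p and a four-digit page number, then .tif) and the function name check_page_sequence are hypothetical. The actual convention used in production is not specified here.

```python
import re
from collections import Counter

# Hypothetical naming convention: <job>_p<4-digit page number>.tif
PAGE_RE = re.compile(r"_p(\d{4})\.tif$")


def check_page_sequence(filenames, expected_pages):
    """Return (missing, duplicated, unparsed) for a batch of scan file names."""
    counts = Counter()
    unparsed = []
    for name in filenames:
        match = PAGE_RE.search(name)
        if match:
            counts[int(match.group(1))] += 1
        else:
            unparsed.append(name)
    # Pages 1..expected_pages that never appear in the batch.
    missing = [p for p in range(1, expected_pages + 1) if counts[p] == 0]
    # Page numbers that appear more than once.
    duplicated = sorted(p for p, n in counts.items() if n > 1)
    return missing, duplicated, unparsed


files = ["manual_p0001.tif", "manual_p0002.tif", "manual_p0002.tif",
         "manual_p0004.tif", "cover.tif"]
missing, duplicated, unparsed = check_page_sequence(files, expected_pages=4)
print("missing:", missing, "duplicated:", duplicated, "unparsed:", unparsed)
```

In this example batch, page 3 is reported as missing, page 2 as duplicated, and cover.tif as not matching the convention.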

Image Processing

Scanning and preparing a document requires special attention when its main use is re-printing. The challenge is that the generated PDF file may be printed on a variety of different printers, and scanned documents that mix grayscale or color images with text often re-print poorly. We address this problem with page segmentation, which handles text areas and image areas differently and creates a multi-layer PDF file with the appropriate image processing routines applied to each area of the page. The following steps are performed:

  • Deskew using high-quality interpolated rotation routines (server-based processing)
  • QC of deskewed images
  • Page alignment (front to back registration)
  • Page alignment QC
  • Page segmentation (separation into a B&W background image and grayscale or color subimages)
  • Image processing of B&W background images
    • Thresholding using dynamic or manual settings
    • Speckle removal
    • Punch hole removal
    • Border cleanup
  • Image processing of color/grayscale subimages
    • Brightness adjustment
    • De-screening
    • Moire-pattern removal
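
Two of the B&W background steps above, thresholding and speckle removal, can be illustrated with a minimal pure-Python sketch. It assumes Otsu's method as one possible "dynamic" threshold and 8-connected component size as the speckle criterion. Both are common textbook choices, not necessarily the routines used in this workflow, and the function names are illustrative.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0-255) that maximizes between-class variance."""
    hist = [0] * 256
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of gray values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(gray, t):
    """Mark pixels at or below t as ink (1), everything else as paper (0)."""
    return [[1 if v <= t else 0 for v in row] for row in gray]


def remove_speckles(binary, min_size):
    """Clear 8-connected ink components smaller than min_size pixels."""
    h, w = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if out[y][x] != 1 or seen[y][x]:
                continue
            # Flood fill to collect one connected component.
            stack, component = [(y, x)], []
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                component.append((cy, cx))
                for ny in range(cy - 1, cy + 2):
                    for nx in range(cx - 1, cx + 2):
                        if (0 <= ny < h and 0 <= nx < w
                                and out[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
            if len(component) < min_size:
                for cy, cx in component:
                    out[cy][cx] = 0
    return out


# Tiny synthetic page: a 2x2 block of ink plus one isolated dark pixel.
gray = [
    [250, 250, 250, 250, 250],
    [250,  20,  20, 250, 250],
    [250,  20,  20, 250, 250],
    [250, 250, 250, 250,  30],
]
t = otsu_threshold([v for row in gray for v in row])
cleaned = remove_speckles(binarize(gray, t), min_size=2)
for row in cleaned:
    print(row)
```

On this synthetic page the 2x2 block survives and the isolated pixel in the corner is cleared. A manual setting corresponds to passing a fixed value of t to binarize instead of the computed one.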

PDF Generation

The last step is to generate the final PDF file. The output PDF file has the following features:

  • Optimized for web delivery (byte-serving enabled)
  • OCR of B&W background as an option (searchable PDF files)