What is PDE?

Elan GMK PDF Drawing Extractor (PDE) is a software application used to systematically extract images/illustrations from PDF files. The output is used by document assembly applications to re-create certain content, usually technical manuals and books. The XML-based workflow makes automatic document assembly possible.

Why PDE?

The main users of PDE are companies that re-publish existing documents available only as PDF files, whether scanned or regular. There are plenty of tools to convert text (OCR, re-typing), but handling images is a more challenging task. Typically, customers of PDE face one or more of the following challenges:

Missed images
Bad size and quality of images
Unorganized, hard-to-track file naming
Missing captions and captions disassociated from figures

The main consumers of the output from PDE are XML-based publishing systems such as Siemens Teamcenter. Because each output image is accompanied by an XML file that contains the metadata, document hierarchy and caption information, it is easy to integrate into the newly created XML-based publication/manual.
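As an illustration of how a downstream system might consume such a companion file, here is a minimal Python sketch. The element and attribute names (image, metadata, field, hierarchy, level, caption) and the sample values are assumptions made for this example only; the actual PDE schema is not described here and may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical companion file for one extracted image.
# The layout below is an assumption for illustration, not PDE's real schema.
SAMPLE_SIDECAR = """\
<image file="fig_0001.tif">
  <metadata>
    <field name="manual">Pump Service Manual</field>
    <field name="revision">B</field>
  </metadata>
  <hierarchy>
    <level>Chapter 3</level>
    <level>Section 3.2</level>
  </hierarchy>
  <caption>Figure 3-4. Impeller assembly</caption>
</image>
"""


def read_sidecar(xml_text):
    """Return the image file name, metadata, hierarchy and caption as a dict."""
    root = ET.fromstring(xml_text)
    return {
        "file": root.get("file"),
        # One entry per <field>, keyed by its name attribute.
        "metadata": {
            field.get("name"): (field.text or "")
            for field in root.iterfind("metadata/field")
        },
        # Hierarchy levels in document order, outermost first.
        "hierarchy": [level.text or "" for level in root.iterfind("hierarchy/level")],
        "caption": root.findtext("caption", default=""),
    }


record = read_sidecar(SAMPLE_SIDECAR)
print(record["file"], "|", " > ".join(record["hierarchy"]), "|", record["caption"])
```

A publishing system could use a record like this to place the image at the right point in the document hierarchy and attach its caption without manual lookup.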

Features

PDE has several unique features that make it ideal for those who need to re-create technical manuals or books.

Multi-User Network Environment

Suitable for operations at scale.

Automatic Job Creation

Every time the user opens a file, a job is created for that particular extraction task.

PDF Input; Multiple Format Output

PDF input; output as PDF or as image files (BMP, PNG, JPG, TIFF).

Shared Folders and Job Locking

Enables multiple users to open existing jobs in shared folders and lock them for the duration of editing.
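How PDE implements job locking is not described here. As a general illustration of the idea, the following Python sketch uses a lock file created exclusively inside the job folder. The file name job.lock and both function names are hypothetical, and exclusive creation may not be fully reliable on every network file system.

```python
import os
import tempfile

LOCK_NAME = "job.lock"  # hypothetical lock file name, not taken from PDE


def acquire_job_lock(job_dir, user):
    """Try to lock a job folder; return True on success, False if already locked."""
    lock_path = os.path.join(job_dir, LOCK_NAME)
    try:
        # O_EXCL makes creation fail if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(user)  # record who holds the lock
    return True


def release_job_lock(job_dir):
    """Remove the lock file so that another user can open the job for editing."""
    os.remove(os.path.join(job_dir, LOCK_NAME))


# Usage: a temporary folder stands in for a shared job folder.
with tempfile.TemporaryDirectory() as job_dir:
    first = acquire_job_lock(job_dir, "alice")   # lock is free
    second = acquire_job_lock(job_dir, "bob")    # alice still holds it
    release_job_lock(job_dir)
    third = acquire_job_lock(job_dir, "bob")     # free again after release
    release_job_lock(job_dir)
```

Writing the user name into the lock file lets other users see who is currently editing the job.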

Easy Mark Up

Images, illustrations or drawings that are present in the PDF file can be detected automatically and/or marked up using the mark-up tool.

Caption Extraction

The image captions can also be extracted using user-friendly interface elements.

Batch Operations

Operations can be performed on all of the extracted images at once. The imaging operations allow for clean-up, resizing, rotation, margin adjustment and image editing.

Flexible Metadata Handling

Metadata sets can be defined, saved and loaded for a particular job, or set up as defaults. Users can enter metadata (index) values associated with the job or with a particular image, illustration or drawing. Default and dynamic values can be set up, and a screen scraping (OCR) tool is also provided.

Batch Conversion

All marked-up image areas will be extracted from the PDF file. The extracted images can be saved as PDF files or image files in configurable subdirectories.