|
OCR |
DEVONthink contains an optical character recognition (OCR) module that allows you to import scanned documents and make them searchable. These documents are "read" by the embedded OCR engine and stored as PDF files that contain an additional (invisible) text layer with the recognized, computer-readable text. Use these options to fine-tune the OCR process. You can import scanned documents or scan them directly from within DEVONthink.
Convert Incoming Scans
Check Convert to searchable PDF to apply OCR to files that have been sent to DEVONthink from known scanning software, e.g., ScanSnap Home. Choose the desired format of the resulting file: searchable PDF, RTF document, Word document, or WebArchive.
Original Document
Check Move to Trash if you want DEVONthink to move the original documents to the trash after they have been successfully imported using OCR. If files are converted by OCR within the database, the original document is deleted from the database. If files are converted at import, the original document is moved to the Finder's trash. This option keeps your incoming group, folder, or database from growing cluttered after OCR is done.
Searchable PDF
Check Enter metadata after text recognition to show the metadata entry dialog whenever you scan a document or import a file using OCR. The dialog appears once OCR processing has been completed. Use it to enter the preferred document name, the author of the document, and any keywords describing the document. You can also adjust the timestamp of the PDF to the actual date of the paper document. You may want to switch this option off when you are scanning or importing multiple files in a batch.
Check Compress PDF to apply compression to the resulting PDF, creating a smaller file. Compression only applies when adding metadata after OCR or when preserving annotations from an original PDF after OCR.
Resolution
Set the desired resolution for the image layer in the PDF. Only values between 150 and 300 dpi are allowed.
Auto correct
Check Deskew to allow DEVONthink to attempt to straighten the resulting PDF. Check Page Orientation to allow DEVONthink to detect and correct the page orientation.
Dictionary and Languages
Custom Dictionary: Check Use Dictionary to use a custom dictionary of acceptable words. For example, if some documents contain an unusual spelling of someone's name, you can enter the name as an acceptable choice for the OCR engine to choose from. Click the Configure button to add custom entries for OCR detection. Note that you can only have one dictionary, specified for the language chosen in the Language dropdown.
Languages: The Languages section of the OCR preferences lets you identify the languages of the documents you scan. DEVONthink's OCR engine uses this information to improve the accuracy of the text recognition. DEVONthink comes with more than 150 different language dictionaries. Select the languages you intend to scan or import with OCR. Set a primary language and add one or more secondary languages using the pop-up menu. Select the languages you want to use from the list on the right (Available) and move them to the left side (Selected) using the right-to-left arrow button. To deactivate a selected secondary language, select it from the list on the left and move it to the right using the arrow button. You can select a maximum of four secondary languages.
|