|
The AI view is where you set up your default chat engine and decide what permissions it has when searching or working with your Mac. For creating AI-based images, see the Image Generation view. And if you need to detect text in images or convert speech to text in media files, see the Transcription view.
Chat
Choose your AI model and any settings specific to it. Also set where the model can get information from, whether it can effect changes to your database, and what kind of summaries you'd like it to return.
Chat Setup: Specify what large language model (LLM) you want to use and set up any required parameters for it. Note several of the controls here are dynamic and the options will change depending on what LLM you've chosen.
-
Provider: Choose from the list of supported chat engines, e.g., ChatGPT, Anthropic's Claude, or even one you are running locally.
-
API Key: Enter the personalized key you were provided by your AI service provider.
-
URL: Enter the URL of a locally running LLM server. This option will only appear as needed.
-
Model: Choose from the list of models for a specific LLM, e.g., Gemini Flash 8B. Each model may show icons indicating its capabilities: reasoning, vision, tooling, deep search, and coding support. A cost icon may also appear, with its boldness indicating higher or lower costs.
-
Context Window: This is the number of tokens the LLM can process and "remember" at a time in a conversation. A larger context window means more data is passed or remembered. However, if you're running a local LLM, larger context windows use more RAM. This setting also displays how many tokens have been used in sending requests and receiving responses.
-
Usage: Choose an option to balance cost and quality of results, from fewer tokens with less precision to more tokens but a higher chance of useful results.
-
Role: Define an optional default "persona" or instructions for the AI, e.g., "You are an undergraduate professor presenting to your class. Use Markdown formatting with sections and subsections but no lists. Include links to your sources." This is also used in automation, like AI-assisted scripting.
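To make the Context Window trade-off more concrete, here is a minimal Python sketch of checking whether a prompt is likely to fit. The four-characters-per-token ratio is only a rough rule of thumb for English text, not how any particular model tokenizes, and the function names, the reply reserve, and the window sizes are hypothetical values chosen for illustration.

```python
import math

# Assumption: roughly 4 characters per token for English text. Real
# tokenizers vary by model and language, so treat this as an estimate only.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate for the given text."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_context(prompt: str, context_window: int, reserved_for_reply: int = 1024) -> bool:
    """Check whether the prompt leaves room for a reply within the window."""
    return estimate_tokens(prompt) + reserved_for_reply <= context_window


if __name__ == "__main__":
    document = "word " * 10_000  # 50,000 characters of sample text
    print(estimate_tokens(document))       # rough token estimate
    print(fits_context(document, 8_192))   # hypothetical 8K window
    print(fits_context(document, 32_768))  # hypothetical 32K window
```

The actual token counts shown in the Context Window setting come from the provider, so an estimate like this is only useful for a quick sanity check before sending a long document.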
Assistant: Certain AI models have access to "tooling" and may be able to accept DEVONthink-related commands. You need to decide what behaviors you will allow it to use on your Mac and with your databases.
-
Allow property & content changes: Decide whether the chat assistant can make changes to your database, e.g., add tags to a document or create a new one for you.
-
Allow download of web pages: Allow AI to download the content of web pages to examine as part of its querying process. This may be considered an insecure option.
-
Allow screenshots of window: Allow the AI assistant to capture and examine a screen capture of DEVONthink's window for use in queries. Requires a compatible AI model, e.g., Claude Sonnet.
-
Allow image generation: Allow the AI assistant to create images, e.g., when asking "What does the Eiffel Tower look like?" This utilizes the text-to-image engine chosen in the AI > Images settings.
-
Load remote images automatically: Control whether linked images in web content are downloaded for examination. This may be considered an insecure option.
Max. Recent Chats: Set the number of remembered chats you can revisit in the Recent Chats popup in the Chat assistant.
MCP
The MCP settings provide controls for setting up DEVONthink's MCP server, which lets other applications connect to and work with your databases from outside DEVONthink.
-
Privacy: Enable Redact sensitive content to remove or obscure recognizable private information like credit card numbers, email addresses, phone numbers, etc. before the content is sent to the AI model.
-
MCP Server: If you are integrating with Claude Code or other HTTP-based MCP clients, enable Launch at login. Note that Claude Desktop and Cowork launch their own instance of the server as they need it.
-
Accept connections: Choose From the Mac only to restrict access only to local use, From the local network to connect to devices on your network, or Any device to allow even remote connections. The last two options require a valid TLS certificate and bearer token.
-
TLS Certificate: Choose an existing valid TLS certificate you have installed. Note that self-signed certificates aren't supported.
-
Bearer Token: Connecting other devices requires a bearer token for authentication. Press the Create button to generate one specifically for DEVONthink's MCP server.
-
Port: The port the server is using, even when running locally. A default port is assigned but can be changed to a different unused port, as needed.
-
URL: Select and copy this URL to use in other applications or on other devices accessing the MCP server.
-
Connectors: If you are chatting or using Cowork in Claude Desktop, click Install for Claude Desktop. If you have installed the Claude Code command-line interface (CLI), enable Install for Claude Code. For OpenAI's Codex access, enable Install Codex Desktop & CLI. To use the Hermes Agent, including via Ollama, click Install for Hermes Agent. Each option will only be enabled if the respective application is installed.
-
Log: To access information about what's happening with the MCP server, click the Log button.
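As an illustration of how the URL and Bearer Token settings fit together, the following Python sketch builds (but does not send) an authenticated request of the kind an HTTP-based MCP client might make. The URL, port, path, and token are hypothetical placeholders. It also assumes the server accepts JSON-RPC 2.0 messages over HTTP POST, as MCP's HTTP transport generally does; this page does not confirm the details for DEVONthink's server. A real client would normally use an MCP client library and begin with an `initialize` handshake.

```python
import json
import urllib.request

# Hypothetical values: copy the real URL from the URL field and the real
# token from the Bearer Token field in the MCP settings.
MCP_URL = "https://mac.example.com:8443/mcp"
BEARER_TOKEN = "replace-with-your-token"


def build_mcp_request(url: str, token: str, method: str, request_id: int = 1) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC 2.0 POST request with bearer auth."""
    payload = {"jsonrpc": "2.0", "id": request_id, "method": method}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
            # The bearer token authenticates connections from other devices.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_mcp_request(MCP_URL, BEARER_TOKEN, "tools/list")
    print(req.get_method(), req.full_url)
    print(req.get_header("Authorization"))
```

Keeping the token out of scripts you share, for example by reading it from an environment variable, is advisable since anyone holding it can authenticate to the server.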
Search
The Search settings control where you want external AI to search. For example, when researching medical issues, you may want to let it search PubMed but not Wikipedia. The options are: in your Databases, on the arXiv, PubMed or Wikipedia websites, or on the Web, in general.
Regarding web and Wikipedia searching, you can also choose a specific engine for deep research: DEVONthink, Perplexity, Exa AI, or ollama.com for web searches. The commercial services provide quick results but utilize indexed content. DEVONthink's results are returned more slowly but are up to date. Also, be aware the commercial services require an account and an API key.
Summarization
Model: Choose a specific AI provider and model for summarization or Default to use the Chat model.
Language: Choose a specific language to use for the summary or leave it as Automatic to use the system's primary language.
Style: Determine what summary format you'd like in response to asking chat to summarize a document. The choices are:
-
Text: Gives you a brief synopsis in a few paragraphs.
-
Bullet Points: Returns a list of the main points.
-
Key Points: Provide a distilled response of the main topics.
-
Table: Create a table of columns and rows, often used for correlating pages or links to text.
-
Custom: Provide a summary based on a template you define in this settings pane.
Custom Prompt: Create your own prompt defining what kind of response you'd like, including how you'd like the summary to be structured. Use the special placeholder %@ to refer to the information being summarized.
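To show how a `%@` placeholder works in practice, here is a small Python sketch that fills a custom prompt with the text to be summarized. It only illustrates the idea of placeholder substitution; how DEVONthink performs the substitution internally is not described here, and the function name and sample template are hypothetical.

```python
PLACEHOLDER = "%@"


def fill_prompt(template: str, content: str) -> str:
    """Replace every occurrence of the placeholder with the content."""
    if PLACEHOLDER not in template:
        raise ValueError("template has no %@ placeholder")
    return template.replace(PLACEHOLDER, content)


if __name__ == "__main__":
    # Hypothetical custom prompt; the placeholder marks where the
    # document's text would be inserted.
    template = (
        "Summarize the following in three sentences, "
        "then list any open questions:\n\n%@"
    )
    print(fill_prompt(template, "Quarterly report text goes here."))
```

Placing the placeholder at the end of the prompt, after the instructions, keeps the instructions readable regardless of how long the summarized text is.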
Image Generation
Choose and set up a text-to-image AI model. These controls are dynamic and their options change depending on the model you choose.
-
Aspect Ratio: Choose a predefined aspect ratio for the images. Supported ratios are: 1:1, 16:9 (9:16), and 4:3 (3:4), depending on the chosen model.
-
Style: Choose a predefined style, if applicable.
-
Quality: Decide whether to generate Standard or HD images, if available.
-
API Key: Enter the API key you received from the image generation provider, e.g., Replicate.com for the Flux generator.
Transcription
AI speech-to-text processes incoming media files according to these settings. For example, an .mp3 file could be transcribed into a separate annotation file for future use.
For each type of text recognition, images or audio/video, choose where to store the recognized text via the Destination dropdown. The available options are:
-
Searchable Text: This is similar to Apple's Live Text feature in that no text layer is added to the document. Instead, the recognized text is stored in the database's index and associated with the file.
-
Comment: Add the transcribed text as a Finder comment on the file.
Images: Decide what live OCR engine you want to process images added to your database:
-
Fast Apple Vision text recognition: Quickly detect text in images using Apple's Vision framework. Often sufficient for many use cases.
-
Accurate Apple Vision text recognition: Detect text in images with an emphasis on accuracy over speed.
-
Text recognition via chat: Uses your chosen Chat model to detect text in images, provided the model supports image analysis. When enabled, you can choose a vision-capable AI model, if desired.
Audio & Video: Choose the transcription engine you want to process media files added to your database:
|
Note:
On macOS 13.5 Ventura and macOS 14 Sonoma, Apple's audio transcription engines are Local Apple Speech transcription and Remote Apple Speech transcription. The local engine may be less accurate and requires Siri or Dictation to be enabled on your Mac. However, you aren't required to share the information with Apple. Also, OpenAI's transcription is done via GPT-4o.
|
Add timestamps to transcription: Examines the speech and inserts timestamps at certain points.
Transcription Language: Choose the language of the media file to be transcribed. Only used with OpenAI's Whisper.
API Key: Enter the API key you received from your AI transcription provider, e.g., OpenAI.
|