# Search

DEVONthink supports a wide variety of searchable attributes. These include common attributes, like names or tags, but also include document or DEVONthink-specific items, like word counts or custom metadata. 
Similar to searching with Spotlight or other applications, the use of search prefixes is supported. These take the form of a prefix and a prefix operator, e.g., `name:`, followed by the search term. 

## Search prefixes

Below is a list of the available search field prefixes:

- `text`: Text contents in a file.
- `metadata`: The metadata for a file. Includes name, URL, comment, authors, recipients, title, description, subject, headline, keywords, organization, copyright, album, composer, creator and producer but does not include the text contents.
- `name`: The name of an item. For documents, this is distinct from the filename and does not include the file extension.
- `url`: The associated URL.
- `comment`: Finder comments.
- `docAuthors`: The name of the sender of an email or the author of a document.
- `docAuthorEmailAddresses`: The email address of the sender of an email.
- `docRecipients`: The name of a recipient of an email.
- `docRecipientEmailAddresses`: The email address of a recipient of an email.
- `docTitle`: The title of a file. The title may be distinct from its name, e.g., a song title for an MP3 file.
- `docComment`: Comments stored in the document itself, e.g., in an RTF or PDF file.
- `docHeadline`: A headline applied to some files.
- `docSubject`: The subject line from an email.
- `docDescription`: The description found on some files, e.g. images.
- `docKeywords`: Format-specific keywords for a file, e.g., those of a PDF or RTF document.
- `docOrganization`: The company or organization specified in the metadata.
- `docCopyright`: Copyright information in the metadata of a file.
- `docAlbum`: The album information from media metadata, e.g., MP3 files.
- `docComposer`: The composer information from media metadata, e.g., MP3 files.
- `docCreator`: The process or application used to create a file.
- `docProducer`: The producer of a file, usually applied to media files.
- `aliases`: Aliases applied to an item.
- `tags`: Tags applied to an item.
- `label`: The color label of an item, from 0 (no label) through 7 or by name, e.g., Important.
- `rating`: The star rating of an item, from 0 (unrated) through 5.
- `width`: The width of a document in points, i.e., the width in inches multiplied by 72.
- `height`: The height of a document in points, i.e., the height in inches multiplied by 72.
- `length`: The number of pages in a file or length of a media file in seconds.
- `duration`: The duration of a media file in seconds.
- `size`: The size of an item in bytes, KB, MB, or GB, e.g., `size >= 50 MB`.
- `wordcount`: The number of words in the contents of a file. Also useful to figure out whether a document (e.g. a PDF) has searchable text (`wordcount>0`) or not (`wordcount==0`).
- `charactercount`: The number of characters in the contents of a file.
- `hits`: The number of times a file has been viewed or opened.
- `filename`: The name of the file in the file system, including the file extension.
- `extension`: The extension of a file, e.g., `txt`. This also supports an `any` option to filter filenames having or lacking an extension.
- `kind`: Supports `document`, `group`, `smartgroup`, `tag`, `ordinarytag`, `grouptag`, `text`, `rtf`, `formattednote`, `markdown`, `html`, `webarchive`, `xmlfile`, `propertylist`, `image`, `pdf`, `pdf+text` (searchable, OCRed PDF documents), `quicktime`, `video`, `audio`, `bookmark`, `feed`, `news`, `script`, `sheet`, `email`, and `other`. **NOTE:** Remember to **always** include `kind:document` if the user is only interested in documents, files or contents but does not explicitly mention a certain type.

### Item prefixes

These special prefixes are for state-based queries, such as whether items are replicants or contain aliases. They all follow the form `item:<specified state>`, e.g., `item:locked`, and can be negated as `item:!<specified state>` to match items that are not in that state. The available states are as follows:

- `replicated`: Matches items that are replicants.
- `duplicated`: Matches items that are duplicates.
- `indexed`: Matches items that are indexed, not imported.
- `pending`: Matches items whose contents aren't downloaded and available.
- `tagged`: Matches items with tags applied.

The second option is to specify whether items do or do not contain a certain property, e.g., items containing aliases. These queries use the same `item:<specified state>` syntax, including the negated form. Here are the searchable properties:

- `aliases`: Matches items with aliases.
- `annotation`: Matches items with an associated annotation file.
- `comment`: Matches items with a Finder comment.
- `data`: Matches documents with a filesize > 0 or groups/feeds with child items. An empty group or document can be found via `item:!data`, meaning the item does not contain data.
- `metadata`: Matches items with metadata.
- `reminder`: Matches items with a reminder set. (See Annotations & Reminders)
- `script`: Matches items with a script applied in the Info inspector.
- `thumbnail`: Matches items with a thumbnail applied.
- `url`: Matches items with a URL set in the Info inspector.
- `marked`: Matches items with a certain state, e.g., locked. The search prefix is `item:<specified state>` and its negated form, `item:!<specified state>`. Here are the marking options available:
	- *Flag*: The flag of an item. Supports `flagged` or `unflagged`.
	- *Unread*: The unread state of an item. Supports `read` or `unread`.
	- *Locking*: The locking state of an item. Supports `locked` or `unlocked`.
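
The `item:` syntax above can be sketched as a small helper that validates a state name against the lists in this section and adds the `!` negation when requested. The function name `item_term` and the grouping into three sets are illustrative assumptions, not part of DEVONthink.

```python
# States, properties, and markings copied from the lists in this section.
STATES = {"replicated", "duplicated", "indexed", "pending", "tagged"}
PROPERTIES = {
    "aliases", "annotation", "comment", "data", "metadata",
    "reminder", "script", "thumbnail", "url",
}
MARKINGS = {"flagged", "unflagged", "read", "unread", "locked", "unlocked"}


def item_term(state, negated=False):
    """Build an item:<state> term (hypothetical helper)."""
    if state not in STATES | PROPERTIES | MARKINGS:
        raise ValueError(f"unknown item state: {state}")
    bang = "!" if negated else ""
    return f"item:{bang}{state}"


print(item_term("locked"))              # item:locked
print(item_term("data", negated=True))  # item:!data
```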

### Date Prefixes

Dates are a very commonly used property in searches, for example, if you're looking for a document you created two days ago. Here are the date-based properties you can search for. See the Date Operators in the next section for the syntax you can use with these.

- `added`: The date the item was added to the database. The long form `additionDate` is also supported.
- `created`: The date the item was created. The long form `creationDate` is also supported.
- `modified`: The date the item was last modified. The long form `modificationDate` is also supported.
- `opened`: The date the item was last opened. The long form `openingDate` is also supported.
- `due`: The due date set in a Reminder for an item. The long form `dueDate` is also supported.

### Date Operators

These are operators used with date-based queries, like the creation date of files.

- `<` is equal to the term **before**.
- `<=` is equal to the term **before or on**.
- `>` is equal to the term **later**.
- `>=` is equal to the term **later or on**.
- `:#` is equal to the condition **within last number** of days. The negated form, `:!#`, is also supported.
- `:` is equal to the condition **is** for date-based queries. Supported date options are `Today`, `Yesterday`, `This Week`, `Last Week`, `This Month`, `Last Month`, `This Quarter`, `Last Quarter`, `This Year`, and `Last Year`. The negated form, `:!`, is also supported.

**NOTE:** Weekdays, e.g., `This Sunday` or `Last Tuesday`, are **NOT** supported. Therefore queries like `creationDate:Last Sunday` or `opened>=This Monday` do **not** work. Calculate the date of the weekday instead.
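
Calculating the weekday's date can be done before the query is assembled. The following is a minimal sketch using Python's standard `datetime` module; the helper names `most_recent_weekday` and `date_query` are hypothetical, and the ISO `YYYY-MM-DD` output follows the date format used in the examples in this document.

```python
from datetime import date, timedelta


def most_recent_weekday(weekday, today=None, include_today=False):
    """Return the latest date falling on `weekday` (0=Monday ... 6=Sunday).

    By default a match on `today` itself is skipped, so "last Sunday"
    asked on a Sunday returns the Sunday one week earlier.
    """
    today = today or date.today()
    delta = (today.weekday() - weekday) % 7
    if delta == 0 and not include_today:
        delta = 7
    return today - timedelta(days=delta)


def date_query(prefix, operator, day):
    """Join a date prefix, an operator, and an ISO-formatted date."""
    return f"{prefix}{operator}{day.isoformat()}"


# "Created since last Sunday", evaluated on Wednesday, 2024-02-14.
last_sunday = most_recent_weekday(6, today=date(2024, 2, 14))
print(date_query("creationDate", ">=", last_sunday))  # creationDate>=2024-02-11
```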

#### Examples

Date Created is on or after January 31, 2019 => `creationDate>=2019-01-31`
Date opened is not within last 5 days => `openingDate:!#5days`
Date Due is not Today => `dueDate:!Today`

Date searching allows for some flexibility in formats. Time is not a required parameter, but can be specified. These searches are all equivalent:

```
additionDate>10 march, 2019 
additionDate>March 10, 19 
additionDate>2019-03-10 08:30:00 -0500
```

### Miscellaneous Prefixes

- `md_attachments`: The number of attachments in an email or the number of resources added to an RTFD file.
- `md_annotationcount`: The number of annotations set in a PDF file.
- `md_encrypted`: The encrypted state of a PDF. This is a Boolean value denoted numerically, e.g., `md_encrypted==1` when a file is encrypted.
- `md_incomingItemLinkCount`: The number of item links to a document from other documents in DEVONthink.
- `md_outgoingItemLinkCount`: The number of item links to other documents present in a document.
- `md_language`: An abbreviation of the detected language in the contents of a file. E.g., "en" for English, "de" for German, "fr" for French, etc.
- `md_country`: An abbreviation of the country in the geolocation data for a file. E.g., "US" for United States, "DE" for Germany, "FR" for France, etc.
- `md_zipcode`: The postal code detected in the geolocation data for a file.
- `md_area`: The state, province, or region detected in the geolocation data for a file.
- `md_locality`: The city detected in the geolocation data for a file.

### Custom Metadata Prefixes

Custom metadata attributes defined by the user are also available as search prefixes. The search prefix you will type is a concatenated, lowercase form of the attribute's name without spaces, prefixed with `md`. For example, an attribute named "Total Cost" would have a search prefix of `mdtotalcost`.
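
The naming rule above can be sketched in a few lines of Python. The function name is hypothetical, and how DEVONthink treats punctuation or non-ASCII characters in attribute names is not covered here, so this handles only spaces and letter case as described.

```python
def custom_metadata_prefix(attribute_name):
    """Derive the search prefix for a custom metadata attribute.

    Removes all whitespace, lowercases the result, and prepends "md".
    """
    return "md" + "".join(attribute_name.split()).lower()


print(custom_metadata_prefix("Total Cost"))  # mdtotalcost
```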

### Sub-criteria

Criteria in curly braces `{ ... }` create sub-criteria for the search and extend the search options to create even more complex queries. In addition, there are two special prefixes for sub-criteria:

- `any`: This specifies to return results matching **any** of the criteria. 
- `all`: This specifies to return results matching **all** of the criteria.

These prefixes can only be used when specifying more than one search prefix, e.g., `tags` and `filename`. The default is `all` if not specified otherwise.

### Criteria without prefix

Strings without a search prefix match all string-based fields and should be at the beginning of the criteria or sub-criteria, e.g., `test width>128 width<1024`. Search operators are supported in this case too, e.g., `Steve NEAR Jobs kind:document`.

### Examples

```
"Devonian Dinosaurs"
Steve NEAR Jobs
test tags:done
additionDate>=2019-03-10 
tags:sync; methods
any: name:test OR imprint {any: tags:blue; red}
```

**NOTE:** Semicolon is used to separate multiple tags.
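
As a rough illustration of how the tag and sub-criteria syntax composes, the sketch below assembles the braces, the `any`/`all` prefix, and semicolon-separated tags as plain strings. Both helper names are assumptions made for this example.

```python
def tags_term(tags):
    """Join several tags with semicolons, e.g., tags:blue; red."""
    return "tags:" + "; ".join(tags)


def sub_criteria(criteria, mode="all"):
    """Wrap criteria in curly braces with an any/all prefix."""
    if mode not in ("any", "all"):
        raise ValueError("mode must be 'any' or 'all'")
    return "{" + mode + ": " + " ".join(criteria) + "}"


inner = sub_criteria([tags_term(["blue", "red"])], mode="any")
print("any: name:test OR imprint " + inner)
# any: name:test OR imprint {any: tags:blue; red}
```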

## Prefix Operators

Many prefixes end with a colon, e.g., `tags:`, but some use other operators, depending on the options available for the criterion. For example, words can "begin with" some characters, but a size is greater or less than a value. When you select a criterion, you will see which options apply. Use the keys below to map each option to its operator.

### Matches, Is, Is Not

- `:` is equal to the condition **matches**. With string-based queries, it allows wildcards to be used. It is also used for state-based queries, like Kind. The negated form, `:!`, is also supported.
- `==` is equal to the condition **is**. This must be an exact match of the search term. The negated form, `!=`, is also supported.

These operators can be used in string-based and number-based queries.

#### Examples

Kind is group => `kind:group`
Item is indexed => `item:indexed`
Item is not replicated => `item:!replicated`
Extension is xml => `extension==xml` 
Language is not English => `md_language!=en`
Text matches word beginning with house => `text:house*`
Name matches word beginning with scan, followed by a number => `name:scan[0-9]*`

### String Matching

These are operators that are used with string-based queries, like names or text content. These queries also support:

- `:<` is equal to the condition **begins with**.
- `:>` is equal to the condition **ends with**.
- `:~` is equal to the condition **contains**.

**NOTE:** Wildcards are **NOT** supported by these operators, e.g., `name:<img[0-9][0-9][0-9]` is invalid.

#### Examples

Subject begins with party => `docSubject:<party`
Locality ends with field => `md_locality:>field`
Name contains tech => `name:~tech`

### Number Matching

In addition to the `==` and `!=` operators, numbers can also use these operators:

- `<` is equal to the condition **is less than**.
- `<=` is equal to the condition **is less than or equal to**.
- `>` is equal to the condition **is greater than**.
- `>=` is equal to the condition **is greater than or equal to**.

#### Examples

Word Count is less than 1000 => `wordcount<1000`
Size is greater than 10MB => `size>10 MB`
Hits is greater than or equal to 1 => `hits>=1`
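
The condition-to-operator keys from the sections above can be collected in one lookup table. This is only an illustrative sketch: the dictionary, the `term` helper, and the label "does not match" for `:!` are assumptions made for this example.

```python
# Condition names and operators as listed in the sections above.
OPERATORS = {
    "matches": ":",
    "does not match": ":!",  # label chosen here for the negated form of ":"
    "is": "==",
    "is not": "!=",
    "begins with": ":<",
    "ends with": ":>",
    "contains": ":~",
    "is less than": "<",
    "is less than or equal to": "<=",
    "is greater than": ">",
    "is greater than or equal to": ">=",
}


def term(prefix, condition, value):
    """Combine a search prefix, a named condition, and a value."""
    return f"{prefix}{OPERATORS[condition]}{value}"


print(term("docSubject", "begins with", "party"))  # docSubject:<party
print(term("wordcount", "is less than", 1000))     # wordcount<1000
print(term("md_language", "is not", "en"))         # md_language!=en
```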

### Range Matching

For all numerical attributes, you can use a range matching syntax, `attribute:lowerLimit-upperLimit`. For example, `wordcount:500-1000` matches files with between 500 and 1000 words. This is identical to the longer form syntax of `wordcount>=500 wordcount<=1000`. Range matches can be used with `width`, `height`, `duration`, `length`, `hits`, `wordcount`, `charactercount`, `size`, and `rating` prefixes.
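
The equivalence between the range form and the long form can be shown with a short expansion function. This is a sketch under simple assumptions: the function name is hypothetical, and only plain integer limits are handled (no units such as `MB`).

```python
# Prefixes listed above as supporting range matching.
RANGE_PREFIXES = {
    "width", "height", "duration", "length", "hits",
    "wordcount", "charactercount", "size", "rating",
}


def expand_range(range_term):
    """Rewrite attribute:lower-upper as attribute>=lower attribute<=upper."""
    prefix, colon, bounds = range_term.partition(":")
    if not colon or prefix not in RANGE_PREFIXES:
        raise ValueError(f"not a range-capable prefix: {range_term}")
    lower, dash, upper = bounds.partition("-")
    if not (dash and lower.isdigit() and upper.isdigit()):
        raise ValueError(f"expected lower-upper integers: {range_term}")
    return f"{prefix}>={lower} {prefix}<={upper}"


print(expand_range("wordcount:500-1000"))
# wordcount>=500 wordcount<=1000
```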

## Search operators

### Precedence of terms

Search terms and associated operators will be interpreted from left to right, except as modified by including portions of the query within parentheses.

### Wildcards

To make searching more flexible, you can replace parts of words with wildcards. For example, you can match both "dog" and "dogs" without typing each form. The available wildcards are:

- `?`: Matches exactly one character.
- `*`: Matches none, one, or multiple characters.
- `[a-z]`: Matches one character of the range a through z.
- `[abc...]` or `[a|b|c|...]`: Matches one character out of the given list of characters.
- `[^...]`: Matches one character that is not contained in the given list or range.

#### Examples

1. Searching a document containing this text `DEVONtechnologies makes great software`:

`text:~tech` matches because "tech" is contained in one of the words. 
`text:tech` does not match as there is no word tech. 
`text:tech*` does not match as there is no word beginning with tech. 
`text:*tech` does not match as there is no word ending with tech. 
`text:*tech*` matches as tech is found in a word with text before and after it. However, unless you have a specific purpose, using the contains `~` operator is more succinct. 

2. Searching a document with the text `He made a cake. She is making cookies. They live in Madeira`: 

`text: ma[dk]*` matches "made", "making", and "Madeira". 
`text: ma[dk]?` matches only "made". 

3. Given a document named `2024-2-14_Big Light Electric`: 

`name:[0-9]` would only match "2". 
`name:[0-9][0-9]` would only match "14". 
`name:[0-9]*` would match all the numbers, regardless of length. 
`name:[0-9][0-9]*` would match two or more consecutive digits, e.g., "14" and "2024". 
`name:19[0-9][0-9]` would match names containing a year in the 1900s, e.g., "1914", but not this document. 
`name:202[0-9]` would match this document or others from the 2020s, e.g., "2021-3-9_Big Light Electric".

**NOTE:** Regular expressions are **NOT** supported.
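
To experiment with the wildcard definitions outside DEVONthink, a pattern can be translated to a Python regular expression and tested against whole words. This is only a sketch of the listed definitions, not DEVONthink's actual matcher: whole-word, case-insensitive matching is an assumption inferred from the first two examples, the function names are hypothetical, and the regular expression exists only inside Python, never in a query.

```python
import re


def wildcard_to_regex(pattern):
    """Translate ?, *, [..], and [^..] wildcards into a compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        end = pattern.find("]", i + 1) if ch == "[" else -1
        if ch == "?":
            parts.append(".")          # exactly one character
        elif ch == "*":
            parts.append(".*")         # none, one, or multiple characters
        elif end > i + 1:
            body = pattern[i + 1:end]
            negate = body.startswith("^")
            # [a|b|c] and [abc] are equivalent, so drop the pipes.
            chars = body.lstrip("^").replace("|", "").replace("\\", "\\\\")
            parts.append("[" + ("^" if negate else "") + chars + "]")
            i = end
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.IGNORECASE)


def matching_words(pattern, text):
    """Return the words in `text` that the wildcard pattern matches fully."""
    regex = wildcard_to_regex(pattern)
    return [w for w in re.findall(r"\w+", text) if regex.fullmatch(w)]


text = "He made a cake. She is making cookies. They live in Madeira"
print(matching_words("ma[dk]*", text))  # ['made', 'making', 'Madeira']
print(matching_words("ma[dk]?", text))  # ['made']
```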

### Boolean Operators

The operators (often called Boolean operators) are words or symbols that establish logical rules for the terms in the search query. If no operator is given, DEVONthink infers AND. The available Boolean operators are:

- `term1 AND term2`: Contains term1 AND term2
- `term1 OR term2`: Contains term1 OR term2
- `term1 XOR term2`: Contains term1 or term2, but not both
- `term1 NOT term2`: Contains term1 but not term2
- `term1 AND NOT term2`: Contains term1 but not term2
- `NOT term`: Does not contain term
- `"term1"`: Contains the string of words term1, in exactly this form

Besides the classic Boolean operators, DEVONthink uses a number of operators that usually are found in high-end databases. Use these operators as a replacement for `AND` and `"quotes"` to fine-tune your query.

- `term1 OPT term2`: term1 is required, term2 is optional. If term2 is also found, the found document ranks higher in the search results.
- `term1 NEAR term2`: term1 occurs 10 words or fewer before or after term2
- `term1 NEAR/n term2`: term1 occurs n or fewer words before or after term2
- `term1 BEFORE term2`: term1 occurs before term2
- `term1 BEFORE/n term2`: term1 occurs n or fewer words before term2
- `term1 AFTER term2`: term1 occurs after term2
- `term1 AFTER/n term2`: term1 occurs n or fewer words after term2
- `~term1`: Contains term1, also as part of a word

Operators are evaluated in the following priority: parentheses > phrase/hyphens > (`NOT`) `BEFORE`/`AFTER`/`NEAR` > `NOT` > `AND`/`OR`/`XOR`. Terms with the same priority but without parentheses are evaluated from left to right.
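
The proximity operators can be illustrated with a small word-position check in Python. This is a sketch only: the helper names are hypothetical, and measuring distance as the difference between word indices is an assumption, so DEVONthink's own counting may differ at the boundaries.

```python
import re


def word_positions(text, term):
    """Return the indices of words equal to `term`, ignoring case."""
    words = re.findall(r"\w+", text.lower())
    return [i for i, word in enumerate(words) if word == term.lower()]


def near(text, term1, term2, n=10):
    """True if term1 occurs within n words before or after term2."""
    first = word_positions(text, term1)
    second = word_positions(text, term2)
    return any(abs(a - b) <= n for a in first for b in second)


def before(text, term1, term2, n=None):
    """True if term1 occurs before term2, optionally within n words."""
    first = word_positions(text, term1)
    second = word_positions(text, term2)
    return any(
        a < b and (n is None or b - a <= n) for a in first for b in second
    )


text = "Steve and the late Mr. Jobs introduced the iMac"
print(near(text, "Steve", "Jobs"))       # True (5 words apart)
print(near(text, "Steve", "Jobs", n=3))  # False
print(before(text, "Jobs", "Steve"))     # False
```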

#### Examples

1. `Devonian Dinosaurs`

This query looks for all documents that contain the words "devonian" and "dinosaurs".

2. `(Steve NEAR Jobs) AND iMac AND NOT MacBook OPT Pro`

This query looks for documents that contain the words "Steve" and "Jobs" no farther than ten words away from each other, as well as the word "iMac" (no specific position relative to Steve and Jobs), but not the word "MacBook". The word "Pro" does not need to occur, but if it does, the document is ranked higher in the list of search results.

3. `Paracetamol NEAR (~effect OR impact) AND ((side OR second*) NEAR/2 ~effect)`

This query looks for documents containing the word "Paracetamol" near (within 10 words of) either a word containing "effect" (and so also "effects") or the word "impact". In addition, the document needs to contain the word "side" or any word starting with "second" located within a two-word range of any word containing "effect".

### White Space Handling

Words linked by non-whitespace separators (e.g., page-index or page_id) are treated like phrases put into "quotes". Words separated by hyphens are handled like `word1word2 OR "word1 word2"`. Characters separated by dots are considered to be abbreviations and therefore handled like words separated by hyphens, e.g., the term `t.a.t.u` is equal to `"t a t u" OR tatu`.
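
The hyphen and dot handling described above can be written out as an explicit expansion. The function below is a hypothetical illustration of the equivalence, not something a query requires; it covers only hyphens and dots, and the order of the two alternatives is arbitrary.

```python
import re


def expand_joined_term(term):
    """Expand word1-word2 or a.b.c into: joined OR "spaced phrase"."""
    parts = [part for part in re.split(r"[-.]", term) if part]
    if len(parts) < 2:
        return term  # nothing to expand
    joined = "".join(parts)
    phrase = " ".join(parts)
    return f'{joined} OR "{phrase}"'


print(expand_joined_term("page-index"))  # pageindex OR "page index"
print(expand_joined_term("t.a.t.u"))     # tatu OR "t a t u"
```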

