Getting Started : Building Your Database

Building Your Database

The first step in starting your own custom database is to create a new database with File > New Database. Give your database an easily recognizable name, and choose where you want to save it. We recommend putting databases in a folder in your home directory, like ~/Databases. With your new database created, you are ready to add information.

Encrypted Databases: If you have databases containing sensitive or private information, you can create an encrypted database. This is a specialized AES-256 encrypted disk image that will not appear in the Finder or on your desktop when it's open. In the Navigate sidebar, you will see a key icon to the right of the database's name, denoting that it is an encrypted database. Quitting DEVONthink or closing the database unmounts the disk image, so you are always required to enter the password to access it.

Choose File > New Encrypted Database and enter a password that will be used to unlock it. Enter a reasonable anticipated size for the database, in megabytes or gigabytes; this is how large you think the database will get. Since the encrypted database functions like a connected drive, you define how large it is and then "fill it up". You can choose to let Spotlight index the contents, but bear in mind that the Spotlight index is stored locally and isn't encrypted. This means someone could discover via a Spotlight search that a document exists in the database. However, they wouldn't be able to open the database and access the document without the proper password.
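If you are unsure what a "reasonable anticipated size" is, one rough approach is to total the size of the files you plan to add and leave room to grow. The Python sketch below does that arithmetic. It is not part of DEVONthink; the function names and the default headroom factor of 2 are assumptions chosen for illustration, so adjust them to your own expectations.

```python
import math
import os


def suggested_size_gb(total_bytes, headroom=2.0):
    """Turn a byte count into a whole-gigabyte suggestion.

    `headroom` is an assumed growth factor (2.0 = twice today's data),
    not a value recommended by DEVONthink.
    """
    gigabytes = total_bytes * headroom / (1000 ** 3)
    # Never suggest less than 1 GB, and always round up.
    return max(1, math.ceil(gigabytes))


def folder_size_bytes(folder):
    """Sum the sizes of all regular files under `folder`, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


if __name__ == "__main__":
    # Hypothetical folder holding the documents you intend to add.
    source = os.path.expanduser("~/Documents/Private")
    print(suggested_size_gb(folder_size_bytes(source)), "GB")
```

The result is only a starting point for the size field in the New Encrypted Database dialog; round up further if you expect to add much more material over time.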

Note: You cannot create or store a database in a cloud-synced folder, e.g., iCloud Drive or Dropbox. This is not data-safe, so it is explicitly disallowed. The recommended location is a folder in your home directory, like ~/Databases. If you try to open a database in a cloud-synced location, you will be prompted to let DEVONthink move the database, or to reveal it so you can relocate it manually.
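To check a location yourself before choosing it, the following Python sketch tests whether a path sits inside a cloud-synced folder. It does not show how DEVONthink performs its own check. The folder list is an assumption based on common macOS locations for iCloud Drive, Dropbox, and other sync services, and the function name is hypothetical; extend the list for the services you use.

```python
from pathlib import Path

# Assumed cloud-synced locations on a typical Mac; not an exhaustive list
# and not taken from DEVONthink itself.
CLOUD_ROOTS = [
    "~/Library/Mobile Documents",  # iCloud Drive
    "~/Library/CloudStorage",      # sync services using the system file provider
    "~/Dropbox",                   # classic Dropbox location
]


def is_in_cloud_folder(path, cloud_roots=CLOUD_ROOTS):
    """Return True if `path` is one of the cloud roots or lies inside one."""
    candidate = Path(path).expanduser().resolve()
    for root in cloud_roots:
        root_path = Path(root).expanduser().resolve()
        if candidate == root_path or root_path in candidate.parents:
            return True
    return False


if __name__ == "__main__":
    for location in ("~/Databases/Research", "~/Dropbox/Research"):
        print(location, "->", "cloud-synced" if is_in_cloud_folder(location) else "ok")
```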

Adding your files

After you've created your database, you'll add your files to it. Often it's a simple matter of dragging and dropping files into your database, but we've covered several additional methods in the In and Out chapter. Also, please take a moment to review the Import and Index section to familiarize yourself with these two options.

While you may be tempted to dump every file on your hard drive into DEVONthink and sort it out later, you're best off being more selective in what you add (especially in the beginning). A large, "dump it all in" database can contain a lot of files that will do you no practical good (for example, DEVONthink can't read your Microsoft Office user profile files), and weeding these files out after the fact can be both time-consuming and frustrating. Also remember that DEVONthink has to index the metadata and contents of any compatible files. This effort is wasted on files you'd never want to use or search for.

Here's a practical example: Your iTunes database runs over 22 gigabytes. There's nothing to be gained by simply copying that into a DEVONthink database. Copying the iTunes database into DEVONthink would result in a large, inefficient, and slow DEVONthink database, and would cripple DEVONthink's ability to manage and use content in a well-designed manner (much less perform simple search and organizational tasks).
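One way to be selective is to preview what a folder contains before dragging it in. The Python sketch below splits a list of paths into files worth adding and files to leave out, based on their extensions. The extension list and the function name are assumptions made for illustration, not a list of formats DEVONthink supports; tailor the list to the kinds of documents you actually want to search.

```python
from pathlib import Path

# Hypothetical set of extensions you consider worth adding.
WANTED_SUFFIXES = {".pdf", ".txt", ".rtf", ".md", ".html", ".docx"}


def pick_files(paths, wanted=WANTED_SUFFIXES):
    """Split `paths` into (keep, skip) lists by extension, ignoring case."""
    keep, skip = [], []
    for path in map(Path, paths):
        if path.suffix.lower() in wanted:
            keep.append(path)
        else:
            skip.append(path)
    return keep, skip


if __name__ == "__main__":
    # Hypothetical source folder; only regular files are considered.
    folder = Path("~/Documents").expanduser()
    candidates = [p for p in folder.rglob("*") if p.is_file()]
    keep, skip = pick_files(candidates)
    print(f"{len(keep)} files to add, {len(skip)} files to leave out")
```

Reviewing the "leave out" list first is a cheap way to catch support files and media libraries before they end up in a database.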

Because growing databases use RAM, processor time, and other resources, smaller, more focused databases are often more effective than a single, monolithic database. Separate databases generally perform better, sync faster, and in the rare case of a catastrophe, can help avoid data loss since you're not keeping "all your eggs in one basket". Another benefit of this approach is the ability to conserve machine resources. With a single, large database, all the information is always using resources, even files unrelated to what you're working on at the moment. With separate databases, you can open and close specific databases as the need dictates.

One way to effectively create separate databases is to use a topical database approach. Create multiple databases, with each holding only related information: a bird watching database full of birding articles and newsletters; a quantum physics research database with research briefs and email. This method can improve the effectiveness of DEVONthink's artificial intelligence (AI) features within each database. The AI works best within a database that contains contextual relationships among many documents; clogging your new database with everything from A (apple pie recipes) to Z (Zengobi user documents) will only hamper the AI's ability to work effectively.

Having topical databases can help down the road as well. You may be collaborating on a database, syncing between machines in a group. Imagine having just one database: You decide to share your painstakingly researched academic articles with colleagues, only to find that you've mistakenly also shared personal financial records and chats. It's not hard to imagine how that could be both dangerous and embarrassing. Having multiple, topical databases will allow you to keep your data separate and private.

Organizing

Database organization depends on the parties involved. For collaborative work, you'll want to organize the database in a manner that's understandable to everyone using it. This is especially important as our sync technology is a mirroring sync, meaning changes to one copy of the database get synced to the other copies. For personal work, just set up your database in a manner that makes sense to you. There is no right or wrong way to organize it. This is something you've likely already been doing in the Finder, making folders and filing things in them. Apply the same personal choices to DEVONthink. You can also use smart groups to create virtual groups.

Remember that creating databases isn't an inviolable commitment. Create and destroy them as you see fit. Start with one way of organization and decide later to re-organize your databases. With DEVONthink you can keep multiple databases open simultaneously, easily moving documents from one database to the other at any time. As you work with your databases, new ideas may spark new approaches which can easily be tried and adopted or discarded.

Case study: Bill's Database Farm

Bill DeVille, formerly DEVONtechnologies' Evangelist, worked in a number of scientific areas. Bill's main database covered environmental science and technology topics, with related interests in science and technology exchanges with developing nations. The database even contained some projects dealing with graduate education in environmental sciences and engineering. There's a broad topical relationship among these subjects and the database covers disciplines ranging from chemistry, toxicology, statistics, risk assessment, and engineering to economics, legal, regulatory, and policy issues. These disciplines fit together and combinations of these topics are necessary in many real-world cases.

As you can imagine from the above description, Bill's main database was quite large, containing about 20,000 documents and over 20,000,000 total words. Because of the relationships knitting together all these scientific, technical, legal, and policy issues, the artificial intelligence features of DEVONthink worked very well for Bill in researching the database and contextualizing the information.

In addition to his main database, Bill had seven additional databases (so, eight total). For example, he had one database for Apple Newton literature he had accumulated over the years. It was almost as big as his main database, but the topical coverage had no practical relationship to the main database, so Bill kept the Apple Newton literature in its own domain. Had he kept this unrelated information in his main research database, the result would have been a larger, slower database, with poorer performance by the artificial intelligence.

Occasionally, Bill added topical materials to his main database that were not related to its main purpose. However, when those "unrelated" topics grew large enough in volume, he spun them off into a new database in order to preserve AI accuracy and relevance.

If you'd like to follow Bill's method, start by creating a database with some collections of files that interest you, but don't be afraid to create other databases that contain "different" material as your interests, and your main database, grow. And if you need to search across databases, simply open all of them at the same time. DEVONthink searches all of them almost simultaneously.