My main forensic tool is Autopsy/Sleuthkit. Because it lacks some features found in (commercial) Windows products, I decided to contribute new features to Autopsy and Sleuthkit. This is done in cooperation with Brian Carrier.
One of the major missing features was indexed searching, which greatly speeds up word searches during investigations. Searchtools was introduced to fill that gap. This article describes its features. You can also read more on the inner workings of Searchtools.
This third version uses an even more optimized file format and data representation than the previous versions.
Improvements over the previous versions are:
- Generalized the internal structure to support multiple index types.
- Added an extra index type in addition to the existing raw indexes: raw fragments indexes. These indexes contain all the strings that exist within files on the image but are stored in two non-adjacent disk fragments.
- Much improved/optimized file format, resulting in more index data stored in less disk space.
- Improved memory model and handling of the index tree, so more index data fits in memory during indexing.
- Images are now read through the fstools library (from Sleuthkit), so filesystem parsing does not have to be reimplemented.
- Better organized index files/directories.
- Higher stability of the tools.
- Added extra tools for validating files/printing data from the indexes.
- Better integration within Autopsy.
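To make the raw fragments idea above concrete, here is a minimal sketch. It is not the Searchtools implementation: it assumes a file is available as an ordered list of (block number, data) pairs and that indexable characters are alphanumeric. The function name `fragment_strings` and the input layout are hypothetical.

```python
import re

def fragment_strings(blocks):
    """Return strings that cross the boundary between two consecutive
    blocks of a file when those blocks are NOT adjacent on disk.

    blocks: list of (block_number, data) tuples in file order (assumed layout).
    """
    found = []
    for (num_a, data_a), (num_b, data_b) in zip(blocks, blocks[1:]):
        if num_b == num_a + 1:
            # Adjacent on disk: a plain scan of the raw image already
            # sees this string in one piece.
            continue
        tail = re.search(rb"[A-Za-z0-9]+\Z", data_a)  # word chars ending block A
        head = re.match(rb"[A-Za-z0-9]+", data_b)     # word chars starting block B
        if tail and head:
            found.append((tail.group() + head.group()).decode("ascii"))
    return found

# A file whose first two blocks sit far apart on disk.
blocks = [(10, b"xx pass"), (57, b"word yy"), (58, b"zz")]
print(fragment_strings(blocks))
```

A raw scan of the image would only see "pass" and "word" separately, because blocks 10 and 57 are not neighbours on disk; joining them in file order recovers the full string.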
The current version supports the following features:
Two different ways of indexing strings:
- Raw indexes. These indexes contain all the indexable strings directly contained within the image.
- Raw fragments indexes. These indexes contain all the indexable strings that exist within files on the image but are stored in two non-adjacent blocks on the filesystem of the image.
- Tools for Indexed searching in Sleuthkit.
- Creation of indexes integrated into Autopsy interface.
- Indexed Search field (at the bottom of the ‘Keyword search’ page).
- Case insensitive searching.
- Possibility to search for whole words only or parts of words.
- No strings file necessary; only the image file is needed for indexing. A normal combined index is about the same size as a strings file for the same image, depending on the settings used for indexing.
- Can be used to index image files of any size (indexing results in multiple small indexes).
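The difference between whole-word and partial-word searching can be illustrated with a small sketch. It assumes, purely for illustration, that an index is a dictionary mapping lower-cased words to their byte offsets; the real index file format is different, and `build_index` and `search` are hypothetical names.

```python
import re

def build_index(data):
    """Map each lower-cased alphanumeric word to the offsets where it starts."""
    index = {}
    for match in re.finditer(rb"[A-Za-z0-9]+", data):
        word = match.group().lower().decode("ascii")
        index.setdefault(word, []).append(match.start())
    return index

def search(index, term, whole_word=True):
    """Case-insensitive lookup: exact word, or any word containing the term."""
    term = term.lower()
    if whole_word:
        return sorted(index.get(term, []))
    hits = []
    for word, offsets in index.items():
        if term in word:
            hits.extend(offsets)
    return sorted(hits)

index = build_index(b"Password reset; my PASS and passport")
print(search(index, "pass"))                    # whole words only
print(search(index, "PASS", whole_word=False))  # parts of words as well
```

Because the words are folded to lower case when the index is built, the search term only has to be lower-cased once at query time.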
Support for different default index-character sets. This release lets you index using:
- Alphabet [a-z,A-Z]
- Alphanumeric [a-z,A-Z,0-9]
- E-mail and Alphanumeric [a-z,A-Z,0-9,.,_,-,@]
The smaller the set, the smaller the index file. The indexing process is also very flexible: you can specify the maximum memory usage, the minimum and maximum index word length, and more.
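The effect of the character set and the word-length limits can be sketched as follows. This is an illustration only: the set names, the default limits, and the choice to skip words outside the limits are assumptions, not the actual Searchtools options or behaviour.

```python
import re

# Hypothetical names for the three default character sets listed above.
CHARSETS = {
    "alphabet": rb"[a-zA-Z]",
    "alphanumeric": rb"[a-zA-Z0-9]",
    "email": rb"[a-zA-Z0-9._@-]",
}

def index_words(data, charset="alphanumeric", min_len=3, max_len=32):
    """Collect the distinct lower-cased words made of characters in `charset`.

    Words shorter than min_len or longer than max_len are skipped
    (an assumed policy for this sketch).
    """
    pattern = re.compile(CHARSETS[charset] + b"+")
    words = set()
    for match in pattern.finditer(data):
        word = match.group().lower()
        if min_len <= len(word) <= max_len:
            words.add(word.decode("ascii"))
    return words

data = b"Mail john_doe@example.com or call 5551234 now"
for name in CHARSETS:
    print(name, sorted(index_words(data, name)))
```

With the alphabet set the address is broken into separate pieces and the phone number is lost entirely; with the e-mail set the address is indexed as a single word.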
Support for folding (mapping characters with diacritics to their plain equivalents, which allows for more powerful searches).
Default support for folding of the default ISO-8859-1 character set.
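Folding can be approximated in a few lines. The sketch below is not how Searchtools implements it (that presumably uses its own mapping for ISO-8859-1); it relies on Unicode decomposition instead, and the function name `fold` is hypothetical. Characters without a decomposition, such as ß or ø, pass through unchanged in this version.

```python
import unicodedata

def fold(raw):
    """Decode ISO-8859-1 bytes and strip diacritics from the characters."""
    text = raw.decode("iso-8859-1")
    # NFD splits an accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(fold("café".encode("iso-8859-1")))
print(fold("Ünïcode".encode("iso-8859-1")))
```

When both the indexed strings and the search term are folded this way, a search for "cafe" also finds "café".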
The following are on my todo list:
- More documentation on the format used in the index file and the process involved.
- Support for handling/interpreting file formats in indexes, thus enabling indexing of strings encoded in these files.
Currently only the patch for Sleuthkit 2.04 is available. The Autopsy patch will follow in a short while. Together they add the third version of Searchtools (indexed searching) to Autopsy.
While the beta stage is over, I would greatly appreciate it if people would test the indexed searching on other machines and images and send their problems, feedback and feature requests to me.
All feedback is appreciated! My goal is to add useful features (like indexed searching) to Autopsy and Sleuthkit. This requires feedback! ;-)
You can download the current version (3.2) of Sleuthkit/Autopsy Searchtools patch here: sleuthkit-2.04-searchtools-3.2.patch