Index Settings

After changing any of the settings below, a reindex of all existing volumes is required. Refer to Volumes for more information.

Setting	Default Value	Description
Convert X400 addresses using supplied domains	Yes	Attempt to convert X400 addresses to SMTP addresses without performing an Active Directory lookup of the address. With this option enabled, MailArchiva attempts to convert X400 addresses using the local domains defined in Configuration->Domains. Because no directory lookup is performed, the conversion may not be accurate.
Default index language	English	Default indexing language. If MailArchiva detects a different language, it will index the document using that language's analyzer.
Default indexing charset	UTF-8	When indexing an email, if the character set is not specified, MailArchiva will default to indexing using the selected character set.
Default zip file name charset	UTF-8	Default character set to use for file names in Zip files.
Max chars to index per field	100000	Maximum number of characters in the body of a message, or in the content of an attachment, that will be indexed.
Stemming Method	Light Stemming	Whether to apply stemming algorithms, such as Porter stemming, to terms during indexing and search. With stemming enabled, all index and search terms are mapped to their root form. For example, at indexing time the word "running" is reduced to "run", so a search for the word "run" matches both "run" and "running". The decision whether to stem involves a trade-off between usability and accuracy of search. The "light stemming" setting typically offers a fair balance between the two. If a high degree of accuracy is required, set the stemming method to "precise". After changing this setting, a re-index is necessary.
Index stop words	Yes	When this option is unchecked, stop words (e.g. the, and, there) are not recorded in the index for fields expected to contain a large amount of data (e.g. body, attachments), which saves disk space. As a result, exact phrase searching on these fields may not produce the expected results. For instance, the query body:"There is no risk" may inadvertently return all documents containing the word "risk". This is because the words "there", "is" and "no" are deemed stop words and are consequently not indexed, so the phrase "There is no risk" is reduced to "risk". For legal e-discovery, exact phrase matching on fields such as body may be a strong requirement. In this case, it is necessary to check the index stop words setting. After changing this setting, a re-index is necessary.
Maximum queue size	10000	Specifies the size of the receive queue. Reducing the size of the receive queue (specified in Configuration->General->Archive) and the index queue (specified in Configuration->General->Index) limits the total amount of disk space that the queues may consume. When importing or archiving, data is temporarily cached on disk until either the archive or index queue is full. At that point, an import operation will block, while archiving via SMTP may cause MailArchiva's inbuilt SMTP server to return a retry-later error code such as 421 or 451. Because source data is typically read faster than it can be encrypted, compressed and written, especially during an import, it is crucial to set a queue size that takes into account the disk space available on the queue partition, so that an import operation cannot consume all available disk space.
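
The effect of the stemming and stop word settings on search results can be illustrated with a small sketch. This is not MailArchiva's analyzer: the stop word list, the single suffix rule in light_stem and all function names are hypothetical, chosen only to reproduce the "running" and "There is no risk" examples from the table.

```python
# Hypothetical stop word list; the list MailArchiva actually uses is not given here.
STOP_WORDS = {"the", "and", "there", "is", "no"}


def light_stem(term):
    """Toy stemming rule (not Porter): strip "ing" and undouble the last letter."""
    if term.endswith("ing") and len(term) > 5:
        term = term[:-3]
        if len(term) >= 2 and term[-1] == term[-2]:
            term = term[:-1]
    return term


def analyze(text, stem, index_stop_words):
    """Turn text into the list of terms that would be indexed or searched."""
    terms = [t.strip('.,"').lower() for t in text.split()]
    terms = [t for t in terms if t]
    if not index_stop_words:
        terms = [t for t in terms if t not in STOP_WORDS]
    if stem:
        terms = [light_stem(t) for t in terms]
    return terms


def phrase_matches(phrase, document, stem=False, index_stop_words=True):
    """Exact phrase match over analyzed terms, using the same settings for both."""
    q = analyze(phrase, stem, index_stop_words)
    d = analyze(document, stem, index_stop_words)
    n = len(q)
    return any(d[i:i + n] == q for i in range(len(d) - n + 1))


if __name__ == "__main__":
    doc = "The risk is high."
    # Stop words dropped: the phrase collapses to "risk" and matches the document.
    print(phrase_matches("There is no risk", doc, index_stop_words=False))
    # Stop words indexed: the phrase is compared word for word and does not match.
    print(phrase_matches("There is no risk", doc, index_stop_words=True))
    # Stemming: a search for "run" finds a document containing "running".
    print(phrase_matches("run", "She was running late", stem=True))
```

In this sketch the same analysis is applied to the document and the query. That is also why a re-index is needed after changing either setting: terms already in the index were produced under the old settings.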
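
When choosing a queue size, a rough worst-case estimate of the disk space the queues may consume can help. The sketch below rests on assumptions that are not stated in this document: that the queue size counts queued messages, that an average on-disk size per queued message is known, and that the archive and index queues share one partition. The function names and the reserve percentage are illustrative only.

```python
def worst_case_queue_bytes(queue_size, avg_item_bytes, queues=2):
    """Upper bound on disk use if every queue fills completely (assumed model)."""
    return queue_size * avg_item_bytes * queues


def largest_safe_queue_size(free_bytes, avg_item_bytes, queues=2, reserve_percent=20):
    """Largest queue size that leaves reserve_percent of the free space unused."""
    usable = free_bytes * (100 - reserve_percent) // 100
    return usable // (avg_item_bytes * queues)


if __name__ == "__main__":
    avg = 100 * 1024       # assumed average of 100 KiB per queued message
    free = 10 * 1024 ** 3  # assumed 10 GiB free on the queue partition
    print(worst_case_queue_bytes(10000, avg))
    print(largest_safe_queue_size(free, avg))
```

Integer arithmetic is used throughout so that the estimate is exact for the given inputs. The actual space consumed depends on the real message sizes in the mail flow.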