Index Settings

After changing any of the settings below, a reindex of all existing volumes is required. Refer to Volumes for more information.

Setting

Default Value

Description

Convert X400 addresses using supplied domains

Yes

Attempt to convert X400 addresses to SMTP addresses without performing an Active Directory lookup of the address. With this option enabled, MailArchiva attempts to convert X400 addresses using the local domains defined in Configuration->Domains. Because no directory lookup is performed, the conversion may not be accurate.

Default index language

English

Default indexing language. If MailArchiva detects a different language, it will index the document using that language's analyzer.

Default indexing charset

UTF-8

When indexing an email, if the character set is not specified, MailArchiva will default to indexing using the selected character set.
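The fallback described above can be pictured with a minimal Python sketch. It is illustrative only and does not show MailArchiva's own code: the function name `decode_body` and its parameters are hypothetical, and the choice to replace undecodable bytes rather than fail is an assumption.

```python
from typing import Optional


def decode_body(raw: bytes, declared_charset: Optional[str],
                default_charset: str = "utf-8") -> str:
    """Decode message bytes, falling back to a default character set.

    Hypothetical helper: mirrors the idea of a default indexing charset.
    """
    # Use the charset declared by the message, or the default if none is given.
    charset = declared_charset or default_charset
    try:
        return raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown charset name or undecodable bytes: fall back to the default
        # and substitute a replacement character for anything undecodable.
        return raw.decode(default_charset, errors="replace")
```

For example, `decode_body("café".encode("utf-8"), None)` decodes with the default because the message declares no character set.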

Default zip file name charset

UTF-8

Default character set used for file names in Zip files.

Max chars to index per field

100000

Maximum number of characters of a message body or of attachment content that will be indexed.

Stemming Method

Light Stemming

Whether to apply stemming algorithms such as Porter stemming to terms during indexing and search. With stemming enabled, all search and index terms are mapped to their root form. For example, at indexing time the word "running" is reduced to "run", so a search for "run" matches both "run" and "running". Choosing whether to stem is a trade-off between usability and search accuracy. The "light stemming" setting typically offers a fair balance between the two. If a high degree of accuracy is required, set the stemming method to "precise". After changing this setting, a re-index is necessary.
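The effect of stemming on matching can be sketched in a few lines of Python. This is a toy suffix stripper written for illustration; it is neither the Porter algorithm nor MailArchiva's stemmer, and the names `light_stem`, `build_index` and `search` are hypothetical.

```python
def light_stem(word: str) -> str:
    """Toy suffix stripper (illustrative only, not Porter stemming)."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        # Only strip when at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            # Undo consonant doubling, e.g. "runn" -> "run".
            if len(stem) >= 2 and stem[-1] == stem[-2] and stem[-1] not in "aeiouls":
                stem = stem[:-1]
            return stem
    return word


def build_index(docs: dict, stem: bool) -> dict:
    """Map each (optionally stemmed) term to the ids of documents containing it."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.lower().split():
            term = light_stem(token) if stem else token
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index: dict, query: str, stem: bool) -> set:
    """Look up a single-word query, stemming it the same way as the index."""
    term = light_stem(query) if stem else query.lower()
    return index.get(term, set())


docs = {1: "run the report", 2: "the job is running"}
stemmed = build_index(docs, stem=True)
exact = build_index(docs, stem=False)
print(search(stemmed, "run", stem=True))   # documents 1 and 2
print(search(exact, "run", stem=False))    # document 1 only
```

With stemming, the query finds the document containing "running"; without it, only the literal term matches, which is the usability versus accuracy trade-off the setting controls.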

Index stop words

Yes

To save disk space, stop words (such as "the", "and" and "there") are not recorded in the index for fields expected to contain a large amount of data (e.g. body, attachments). As a result, exact phrase searches on these fields may not produce the expected results. For instance, the query body:"There is no risk" may inadvertently return all documents containing the word "risk". This is because "there", "is" and "no" are deemed stop words and are consequently not indexed, so the phrase "There is no risk" is reduced to "risk". For legal e-discovery, exact phrase matching on fields such as body may be a strong requirement. In that case, it is necessary to apply a check mark to the Index stop words setting. After changing this setting, a re-index is necessary.
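The phrase-matching side effect described above can be reproduced with a small Python sketch. The stop word list and the function names are hypothetical, and the matching is deliberately simplified (real search engines also track term positions), so treat it as an illustration rather than MailArchiva's behaviour.

```python
# Illustrative stop word list; the product's actual list is not shown here.
STOP_WORDS = {"the", "and", "there", "is", "no"}


def index_terms(text: str, index_stop_words: bool) -> list:
    """Tokenize text, dropping stop words unless they are being indexed."""
    tokens = [t.strip('.,"').lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if index_stop_words:
        return tokens
    return [t for t in tokens if t not in STOP_WORDS]


def phrase_matches(phrase: str, document: str, index_stop_words: bool) -> bool:
    """True if the phrase's terms appear as a contiguous run in the document."""
    q = index_terms(phrase, index_stop_words)
    d = index_terms(document, index_stop_words)
    if not q:
        return False
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


doc = "The risk is high."
# Stop words dropped: the phrase collapses to "risk" and matches anyway.
print(phrase_matches("There is no risk", doc, index_stop_words=False))  # True
# Stop words indexed: the full phrase must be present, so no match.
print(phrase_matches("There is no risk", doc, index_stop_words=True))   # False
```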

Maximum queue size

10000

Specifies the size of the receive queue. Reducing the size of the receive queue (specified in Configuration->General->Archive) and the index queue (specified in Configuration->General->Index) limits the total amount of disk space that the queues may consume. When importing or archiving, data is temporarily cached on disk until either the archive or index queue is full. At that point, an import operation blocks, while archiving via SMTP may result in MailArchiva's built-in SMTP server returning a retry-later error code such as 421 or 451. Because source data is typically read faster than it can be encrypted, compressed and written, especially during an import, it is crucial to set a queue size that takes into account the disk space available on the queue partition, so that an import operation cannot consume all available disk space.
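As a rough aid for choosing a queue size, the disk space reasoning above can be turned into a back-of-the-envelope calculation. The Python sketch below rests on assumptions not stated in this page: that the queue size counts messages, that both queues use the same size, and that an average message size and a safety headroom are known. All names are hypothetical.

```python
def worst_case_queue_bytes(max_queue_size: int, avg_message_bytes: int,
                           queues: int = 2) -> int:
    """Upper bound on disk use if every queue fills completely.

    Assumes the queue size is a message count and both queues share it.
    """
    return max_queue_size * avg_message_bytes * queues


def max_safe_queue_size(free_bytes: int, avg_message_bytes: int,
                        queues: int = 2, headroom_percent: int = 20) -> int:
    """Largest queue size that keeps the given share of free space unused."""
    usable = free_bytes * (100 - headroom_percent) // 100
    return usable // (avg_message_bytes * queues)


# Default size of 10000 with an assumed 100 KiB average message:
print(worst_case_queue_bytes(10000, 100 * 1024))        # 2048000000 bytes
# 10 GiB free on the queue partition, keeping 20% in reserve:
print(max_safe_queue_size(10 * 1024**3, 100 * 1024))    # 41943
```

Measure the average message size on real data before relying on such an estimate, since large attachments can skew it considerably.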


© 2005 - 2024 ProProfs
