Requirements

The design of the new search system started from a core set of goals:

  1. Users should be able to search not only their emails, but eventually everything stored in the system, including attachments, contacts, calendar events, and task lists.
  2. Search should be fast and accurate.
  3. Any new emails should be searchable within seconds of arriving in the mailbox.
  4. The first version of search will be released only to customers who use the webmail client; therefore, emails downloaded through POP accounts do not need to be indexed.
  5. The search system should be easy to scale as our customer base grows.
  6. Failover and backup are absolutely required.

We aggregate all of this data into a collection of several thousand constantly evolving spam tests that are run against every email that enters the email hosting system. The results of these tests are combined to identify more than 98% of spam with virtually zero false positives.
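
The text does not say how the individual test results are combined. One common approach, shown here only as a minimal sketch, is to give each test a weight, sum the weights of the tests that fire, and compare the total against a threshold. Everything in the block is an assumption made for illustration: the `SpamTest` type, the three example tests, their weights, and the threshold of 5.0 are hypothetical and do not come from the system described above.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class SpamTest:
    """One hypothetical test: a name, a predicate over the raw message, and a weight."""
    name: str
    matches: Callable[[str], bool]
    weight: float  # positive weights suggest spam, negative weights suggest legitimate mail


def spam_score(message: str, tests: Iterable[SpamTest]) -> float:
    """Sum the weights of every test that fires on the message."""
    return sum(t.weight for t in tests if t.matches(message))


def is_spam(message: str, tests: Iterable[SpamTest], threshold: float = 5.0) -> bool:
    """Flag the message when the combined score reaches the (assumed) threshold."""
    return spam_score(message, tests) >= threshold


# Illustrative tests only; the real rules and weights are not given in the text.
TESTS = [
    # Fires when the first line of the message is entirely upper case.
    SpamTest("all_caps_first_line", lambda m: m.split("\n", 1)[0].isupper(), 2.5),
    # Fires when the word "lottery" appears anywhere in the message.
    SpamTest("mentions_lottery", lambda m: "lottery" in m.lower(), 3.0),
    # Lowers the score for a sender assumed to be trusted.
    SpamTest("known_sender", lambda m: "from: alice@example.com" in m.lower(), -4.0),
]

if __name__ == "__main__":
    print(is_spam("YOU WON\nClaim your lottery prize", TESTS))
    print(is_spam("Lunch?\nSee you at noon", TESTS))
```

Because no single test decides the outcome, a weighted sum of this kind lets tests be added, removed, or re-weighted independently, which suits a rule set that changes constantly. Negative weights are one way to keep false positives low, since evidence of legitimacy can offset a test that fires by accident.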