In this dissertation, we describe the Word Stemming/Hashing Algorithm as an approach to help regain a spam-free inbox. Since it is impossible to foresee future spam creation techniques, it is important to react quickly to their development. The Word Stemming/Hashing Algorithm has been designed to detect modified suspicious terms in harmful messages. Content-based spam filtering works well only if the suspicious terms are lexically correct, that is, valid terms with correct spelling, and spammers deliberately rearrange or misspell suspicious terms to foil the filter. When terms are altered in this way, most content-based spam filters are unable to detect them. We show that a word stemming or word hashing technique that can extract the base or stem of a misspelled or modified term can significantly improve the efficiency of any content-based spam filter in mail classification. Finally, we present a simple rule-based Word Stemming Algorithm that can fish out and handle modified suspicious terms for effective mail classification.
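To illustrate the general idea of recovering the base form of a modified term, the following is a minimal sketch in Python. It is not the dissertation's algorithm: the substitution table, the list of suspicious terms, and the function names `normalize` and `match_suspicious` are all assumptions made for this example. It handles only character substitution, inserted separators, and repeated letters, not rearranged letters.

```python
import re

# Hypothetical look-alike substitutions; a real rule set would be larger.
SUBSTITUTIONS = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a",
    "5": "s", "@": "a", "$": "s", "!": "i",
})

# Hypothetical list of suspicious terms, chosen for illustration only.
SUSPICIOUS_TERMS = ["viagra", "free", "money", "lottery"]


def normalize(term: str) -> str:
    """Reduce a possibly obfuscated term to a canonical comparison key."""
    term = term.lower().translate(SUBSTITUTIONS)
    term = re.sub(r"[^a-z]", "", term)      # drop separators such as '.' or '-'
    term = re.sub(r"(.)\1+", r"\1", term)   # collapse runs of a repeated letter
    return term


# Index the suspicious terms by their normalized key so that both sides
# of the comparison pass through the same rules.
_INDEX = {normalize(t): t for t in SUSPICIOUS_TERMS}


def match_suspicious(term: str):
    """Return the suspicious term that `term` reduces to, or None."""
    return _INDEX.get(normalize(term))


if __name__ == "__main__":
    for word in ["V1@gra", "f.r.3.3", "m0ney", "meeting"]:
        print(word, "->", match_suspicious(word))
```

Because both the incoming term and the reference list are normalized by the same rules, a content-based filter could run this step before its usual term lookup. Collapsing repeated letters is a deliberately coarse rule and may cause false matches between unrelated words, so it would need tuning against real mail.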