How do you implement a "Did you mean"?
Possible Duplicate: How does the Google “Did you mean?” Algorithm work?
Suppose you already have a search system on your website. How can you implement the "Did you mean:
<spell_checked_word>" feature like Google does for some search queries?
Actually, what Google does is very much non-trivial and at first counter-intuitive. They don't do anything like checking against a dictionary; instead they use statistics to identify "similar" queries that returned more results than your query. The exact algorithm is of course not known.
There are several sub-problems to solve here. As a fundamental basis for the statistics side of natural language processing there is one must-have book: Foundations of Statistical Natural Language Processing.
Concretely, to solve the problem of word/query similarity I have had good results using edit distance, a mathematical measure of string similarity that works surprisingly well. I used Levenshtein, but the other variants may be worth looking into.
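To make the idea concrete, here is a minimal Python sketch of "suggest a similar query that returned more results". It is not Google's algorithm, which is not public. The `QUERY_LOG` contents, the function name `did_you_mean` and the `cutoff` value are all made up for illustration, and similarity is measured with the standard library's `difflib` rather than anything Google is known to use.

```python
from difflib import get_close_matches

# Hypothetical query log: query text -> number of results it returned.
QUERY_LOG = {
    "britney spears": 12000,
    "brittany spaniel": 800,
    "britny spears": 15,
}

def did_you_mean(query, result_count, log=QUERY_LOG, cutoff=0.8):
    """Return a similar logged query that had more results, or None."""
    # Only queries that did better than the current one are worth suggesting.
    candidates = [q for q, n in log.items() if q != query and n > result_count]
    close = get_close_matches(query, candidates, n=5, cutoff=cutoff)
    if not close:
        return None
    # Among the close matches, prefer the one with the most results.
    return max(close, key=lambda q: log[q])

print(did_you_mean("britny spears", 15))
```

A real system would build the log from actual search traffic and would probably weight similarity and popularity together instead of filtering on one and ranking on the other.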
Soundex - in my experience - is crap.
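For reference, a rough sketch of the Levenshtein edit distance mentioned above, plus a helper that picks the closest word from a vocabulary. The helper name `closest_word` and the `max_distance` threshold of 2 are my own assumptions, not part of any library, and a linear scan like this is only reasonable for small word lists.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions or
    substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the row we store as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]

def closest_word(word, vocabulary, max_distance=2):
    """Return the vocabulary entry nearest to word, or None if too far away."""
    best = min(vocabulary, key=lambda w: levenshtein(word, w), default=None)
    if best is not None and levenshtein(word, best) <= max_distance:
        return best
    return None

print(closest_word("serach", ["search", "select", "server"]))
```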
Efficiently storing and searching a large dictionary of misspelled words with sub-second retrieval is again non-trivial. Your best bet is to use an existing full-text indexing and retrieval engine (i.e. not your database's built-in one), of which Lucene is currently one of the best, and it happens to be ported to many platforms.
If you have industry-specific terms, you will likely need a synonym replacement tool. For example, I worked in the jewelry industry and there were abbreviations in our descriptions such as kt - karat, rd - round, cwt - carat weight.
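A small sketch of what such a replacement step could look like, using only the three abbreviations from the example above. The function name `expand_abbreviations` is hypothetical, and a real mapping would come from your own domain data.

```python
import re

# Mapping taken from the jewelry example; extend it for your own domain.
ABBREVIATIONS = {"kt": "karat", "rd": "round", "cwt": "carat weight"}

def expand_abbreviations(query, mapping=ABBREVIATIONS):
    """Replace whole alphabetic tokens that are known abbreviations."""
    def replace(match):
        token = match.group(0)
        # Unknown tokens are returned unchanged.
        return mapping.get(token.lower(), token)
    return re.sub(r"[A-Za-z]+", replace, query)

print(expand_abbreviations("14 kt rd diamond"))
```

Because only whole tokens are looked up, a word such as "card" is left alone even though it contains "rd".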
I think this depends on how big your website is. On our local intranet, which is used by about 500 members of staff, I simply look at the search phrases that returned zero results and enter each phrase, together with the new suggested phrase, into a SQL table.
I then consult that table if no search results have been returned. However, this only works if the site is relatively small, and I only do it for the most common search phrases.
You might also want to look at my answer to a similar question:
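The table lookup described above could be sketched like this with Python's built-in `sqlite3`. The table name `search_suggestions`, its columns and the sample row are invented for the example; the actual intranet schema is not shown here.

```python
import sqlite3

def create_suggestion_table(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS search_suggestions ("
        "keyword TEXT PRIMARY KEY, suggestion TEXT NOT NULL)"
    )

def add_suggestion(conn, keyword, suggestion):
    # Entered by hand after reviewing phrases that returned zero results.
    conn.execute(
        "INSERT OR REPLACE INTO search_suggestions VALUES (?, ?)",
        (keyword.strip().lower(), suggestion),
    )

def suggest(conn, keyword, result_count):
    """Only consult the table when the search came back empty."""
    if result_count > 0:
        return None
    row = conn.execute(
        "SELECT suggestion FROM search_suggestions WHERE keyword = ?",
        (keyword.strip().lower(),),
    ).fetchone()
    return row[0] if row else None

conn = sqlite3.connect(":memory:")
create_suggestion_table(conn)
add_suggestion(conn, "holliday form", "holiday form")
print(suggest(conn, "Holliday Form", 0))
```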
Soundex is good for phonetic matches, but it works best with people's names (it was originally developed for census data).
Also look into full-text indexing. The syntax is different from Google's query logic, but it is very fast and can handle similar language elements.
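For anyone who has not seen it, here is a sketch of the classic American Soundex coding (first letter plus three digits). It is written from the commonly described rules, so treat it as an illustration and not a reference implementation; some databases and libraries use slightly different variants.

```python
# Consonant groups that share a Soundex digit.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit

def soundex(name):
    """Return the 4-character Soundex code of name ('' if it has no letters)."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != previous:
            encoded += code
        previous = code  # a vowel resets this, so a repeated code counts again
    return (encoded + "000")[:4]

print(soundex("Robert"), soundex("Rupert"))
```

Two names that sound alike, such as Robert and Rupert, end up with the same code, which is why it suits name matching better than general spelling correction.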