Tag archives: Search engine

When we talk about a search engine

Stop using Bing, it’s a limited version of Google

It suddenly becomes clear why Microsoft can claim that Bing keeps getting better.

Bing steals results from Google

When Bing has no search result for a query, it queries Google and adds Google’s top result to its own index. This is shown by the experiment that Google conducted:

We created about 100 “synthetic queries”—queries that you would never expect a user to type, such as [hiybbprqag]. As a one-time experiment, for each synthetic query we inserted as Google’s top result a unique (real) webpage which had nothing to do with the query. […] We were surprised that within a couple weeks of starting this experiment, our inserted results started appearing in Bing.

Bottom line

some Bing results increasingly look like an incomplete, stale version of Google results—a cheap imitation.

Search engine with phonemization

At work, we are trying to write a search engine with approximate phonetic matching. For that, we are looking for an approximate phonemization algorithm.

Do you have any references? So far I have found:

  • qdicorime, a program that helps find rhymes (its developers are great poets)
  • a speech synthesis tool, such as FreeTTS or Franfest
  • Metaphone by Lawrence Philips, and Double Metaphone, an improvement on it published in June 2000 in C/C++ Users Journal
  • a PHP implementation of Soundex 2. It reduces each word to only 4 characters, so it leads to a vaguer search
  • a PHP implementation of Phonex, which is an evolution of Soundex

A Java implementation would be ideal, since we want to extend Apache Lucene.
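To make the idea of a 4-character phonetic key concrete, here is a minimal sketch of the classic (American English) Soundex algorithm. It is neither Soundex 2 nor Phonex, which are tuned for French and use different rules, and the names `CODES` and `soundex` are chosen only for this illustration. The sketch also assumes plain ASCII input: accented letters are simply dropped.

```python
# Classic Soundex: consonants are grouped into six digit classes.
CODES = {}
for letters, digit in (
    ("BFPV", "1"),
    ("CGJKQSXZ", "2"),
    ("DT", "3"),
    ("L", "4"),
    ("MN", "5"),
    ("R", "6"),
):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the 4-character classic Soundex key of a word."""
    # Assumption: only the letters A-Z are kept; anything else is ignored.
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    key = letters[0]                      # the first letter is kept as-is
    previous = CODES.get(letters[0], "")  # its code still counts for duplicates
    for ch in letters[1:]:
        if ch in "HW":
            continue                      # H and W do not separate equal codes
        code = CODES.get(ch, "")          # vowels and Y have no code
        if code and code != previous:
            key += code
        previous = code                   # a vowel resets the duplicate check
    return (key + "000")[:4]              # pad or truncate to 4 characters


print(soundex("Robert"), soundex("Rupert"))
```

Both names map to the same key, which is exactly why such a short key gives a vague search: many different words collapse onto the same 4 characters.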

Microsoft Office Sharepoint Services: Search Engine blocking bug

This is hard to believe, yet real (bug #932619). When you configure MOSS to index http://server/homePage, it fails with

http://server/homepage The crawler could not communicate with the server. Check that the server is available and that the firewall access is configured correctly.

Indeed, SharePoint has changed the case of the URL, converting every character to lower case.

We have just applied SP1, which is supposed to fix this ridiculous bug. However, applying SP1 is not enough: the system administrator must also set the registry value CaseSensitiveUrls to 1.
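To illustrate why lowercasing a whole URL is unsafe, here is a small sketch of a normalization that lowercases only the scheme and the host, which are case-insensitive, and leaves the path untouched. It is not SharePoint's code; `normalize_url` is a hypothetical helper written for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Lowercase only the case-insensitive parts of a URL (scheme and host)."""
    parts = urlsplit(url)
    # Keep any "user:password@" prefix as-is: credentials are case-sensitive.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    # The path, query and fragment may be case-sensitive on the server.
    return urlunsplit(
        (parts.scheme.lower(), netloc, parts.path, parts.query, parts.fragment)
    )


print(normalize_url("HTTP://Server/homePage"))  # the path keeps its capital P
print("http://server/homePage".lower())         # what a blanket lower() does
```

On a server that treats paths as case-sensitive, the second form points to a different resource than the one the administrator configured, which matches the crawler error above.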