BYTE.com

Search Engine Quirks and Search Engine Jerks

By Lynne Greer Jolitz

May 9, 2005




Everyone talks about hot search engine companies and the next big thing in search (currently locality, with video emerging). But how many search engines are trolling the web, gleaning bits and pieces of the Internet corpus colosseum, and how do they differ in the process by which they search?

According to Nielsen/NetRatings MegaView Search Service, Google has 47 percent of all online searches, Yahoo! has 21 percent, and MSN has 13 percent, leaving a surprising 19 percent for everybody else. Excluding Britney, Paris, and Christina searches (which currently must consume 90 percent of all bandwidth that's not spent processing spam), the search volume is huge and presents an opportunity for all kinds of specialized, custom, and unique search engines to jump in and "get in the game."

While reams have been written about search theory and methods inside a search engine itself, can we gain any meaningful technical information on how a particular search engine performs in practice simply by observing the way it interacts with a site as it conducts a search? In other words, do search engine quirks matter?

Even a small datacenter with well-trained staff can carefully monitor search engine practices over a period of time (preferably several years), correlate its findings into categories, and find interesting results. The quirks of search engines are almost human in their obsessions, preferences, and desires.
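
As a rough illustration of that kind of monitoring, the short Python sketch below tallies web server log entries by the crawler that made them. It is only a minimal sketch under stated assumptions, not the method used for the observations in this article: it assumes the server writes the common "combined" access log format, it assumes a crawler can be recognized by a token in its User-Agent string (Googlebot, Slurp, msnbot), and the function names and sample log lines are made up for the example.

```python
import re
from collections import Counter

# Assumed input: the "combined" access log format, i.e.
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Hypothetical category table: User-Agent token -> label.
# A real study would need a longer, regularly updated list.
CRAWLER_TOKENS = {
    "googlebot": "Google",
    "slurp": "Yahoo!",
    "msnbot": "MSN",
}


def classify_agent(agent):
    """Return a crawler label for a User-Agent string, or "other"."""
    lowered = agent.lower()
    for token, label in CRAWLER_TOKENS.items():
        if token in lowered:
            return label
    return "other"


def tally_crawlers(lines):
    """Count requests per category, skipping lines that do not parse."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # not in the assumed log format
        counts[classify_agent(match.group("agent"))] += 1
    return counts


# Made-up sample entries, for illustration only.
SAMPLE_LOG = [
    '192.0.2.10 - - [09/May/2005:10:00:00 -0700] "GET /robots.txt HTTP/1.0" '
    '200 68 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.11 - - [09/May/2005:10:00:05 -0700] "GET /index.html HTTP/1.0" '
    '200 5120 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; '
    'http://help.yahoo.com/help/us/ysearch/slurp)"',
    '192.0.2.12 - - [09/May/2005:10:00:09 -0700] "GET /index.html HTTP/1.0" '
    '200 5120 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"',
    '192.0.2.13 - - [09/May/2005:10:00:12 -0700] "GET /index.html HTTP/1.1" '
    '200 5120 "http://example.com/" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
    'this line is not a log entry',
]

if __name__ == "__main__":
    for label, hits in sorted(tally_crawlers(SAMPLE_LOG).items()):
        print(label, hits)
```

A User-Agent string is only what a client claims to be, so a tally like this is a starting point for sorting traffic into categories rather than proof of who is actually visiting.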

The proviso, however, is that there must be enough significant traffic based on specific keywords or brand names that have extensive Internet history, but not so much that everyone randomly chooses the keyword (sorry, Paris). On the other hand, low-traffic sites, non-technical sites (such as marketing firms), and recent sites lacking historical merit are unlikely to provide sufficient information. It's a peculiar balancing act in that you want good information, but not too much.

Along with legit search engines with quirky characteristics, there is a plethora of things that may look like search engine activity in the logs but that have quite different aims, ranging from annoying but innocuous to severely threatening.

