D.A. Clements
Search Engines
What is a Search Engine?
• A client/server application
• A document retrieval system
• Uses regularly updated indexes to operate quickly and efficiently
• Designed to help find information stored:
  • On a computer system, such as on the World Wide Web
  • Inside a corporate or proprietary network
  • In a personal computer
• Different selection and relevance criteria can apply in different environments, or for different uses
• Allows one to ask for content meeting specific criteria
  • Typically those containing a given word or phrase
• Retrieves a list of items that match those criteria
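The idea of using an index to retrieve the items that contain a given word can be illustrated with a minimal sketch. It assumes a tiny in-memory collection; the document ids, sample texts, and the function names `build_index` and `search` are hypothetical and chosen only for illustration. It matches documents containing every query word, not an exact phrase, and it is not how any particular search engine is implemented.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    # Intersect the posting sets; a word missing from the index matches nothing.
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


# Hypothetical sample documents, invented for this example.
DOCUMENTS = {
    "doc1": "search engines use indexes",
    "doc2": "web directories are maintained by human editors",
    "doc3": "search engines operate algorithmically",
}

INDEX = build_index(DOCUMENTS)
print(search(INDEX, "search engines"))  # ['doc1', 'doc3']
```

Building the index once and answering each query from it is what lets lookups stay fast: a query touches only the entries for its own words instead of rescanning every document.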
What else?
• Some also mine data available in:
  • Newsgroups
  • Large databases
  • Open directories like DMOZ.org
  • What about the text of books?
• Web directories – maintained by human editors
• Search engines – operate algorithmically
• Many website “search engines” are actually front ends to search engines of others
History
• Archie – First search tool for the Internet
  • "Archive" without the "v", not the character from the 'Archie' comic book
  • Created (1990) by Alan Emtage, a student at McGill University, Montreal
  • Downloaded the directory listings of all files located on public anonymous FTP sites
  • Creating a searchable database of filenames, not file contents
• Gopher – indexed plain text documents
  • Created (1991) by Mark McCahill at the University of Minnesota
  • Named after the school's mascot
  • Most of the Gopher sites became websites after the creation of the WWW
• Veronica – searched the files stored in Gopher index systems
  • Very Easy Rodent-Oriented Net-wide Index to Computerized Archives
  • Provided a keyword search of most Gopher menu titles
• Jughead – searched the files stored in Gopher index systems
  • Jonzy's Universal Gopher Hierarchy Excavation And Display
  • Tool for obtaining menu information from various Gopher servers
• Wandex – first Web search engine
  • Used index collected by the World Wide Web Wanderer, a web crawler developed by Matthew Gray at MIT in 1993
Google
• 2001 – rose to prominence
• Currently the most popular search engine
• Success based on the concept of link popularity and PageRank
  • PageRank – The number of websites and webpages that link to a page
  • Possible to order its results by how many websites link to each found page
• PageRank is based on citation analysis developed (1950s) by Eugene Garfield at the University of Pennsylvania
• Minimalist user interface was very popular with users
• Utilizes more than 150 criteria to determine relevancy
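Ordering found pages by how many websites link to each one can be sketched as follows. This is only the simplified link-count view described above, not Google's implementation: PageRank as published also weights each link by the rank of the page it comes from, which this sketch leaves out. The site names, page names, and the function `rank_by_inbound_links` are hypothetical.

```python
def rank_by_inbound_links(links, found_pages):
    """Order found_pages by the number of distinct sites linking to each.

    links is an iterable of (source_site, target_page) pairs.
    Ties are broken alphabetically so the ordering is deterministic.
    """
    linking_sites = {}
    for source_site, target_page in links:
        # A set counts each linking site once, however many links it has.
        linking_sites.setdefault(target_page, set()).add(source_site)
    counts = {page: len(linking_sites.get(page, set())) for page in found_pages}
    return sorted(found_pages, key=lambda page: (-counts[page], page))


# Hypothetical link data, invented for this example.
LINKS = [
    ("site-a.example", "page-x"),
    ("site-b.example", "page-x"),
    ("site-a.example", "page-y"),
    ("site-a.example", "page-x"),  # repeated link from the same site
    ("site-c.example", "page-x"),
]

RANKED = rank_by_inbound_links(LINKS, ["page-y", "page-z", "page-x"])
print(RANKED)  # ['page-x', 'page-y', 'page-z']
```

Counting distinct linking sites instead of raw links is a design choice made here so that one site repeating a link does not inflate a page's score.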
Others
• Yahoo! Search
  • Founders David Filo and Jerry Yang, Ph.D. candidates at Stanford University
  • Started in a campus trailer (February 1994) to keep track of their personal interests on the Internet
  • 2002, Yahoo! acquired Inktomi
  • 2003, Yahoo! acquired Overture, which owned AlltheWeb and AltaVista
  • 2004, launched its own search engine
• Microsoft’s Windows Live Search
  • Most recent major search engine
  • Powered by its own web crawler (called msnbot)
  • 2006, Microsoft migrated to the new search platform
• Ask.com
  • February 2006, rebranded from Ask Jeeves
  • Maps (with walking directions and dynamic address generation)
  • "Smart Answers" were added
  • Algorithmic engine using relevance ranking originally developed for Teoma
  • Features generally unavailable elsewhere to help narrow, expand, and select related names
    • Page previews
    • "Zoom"
Other Search Indexes

Advanced Search Link

About Google Link

Google Help
3rd Party Resources
Google Pocket Guide (Paperback)
by Tara Calishain, Rael Dornfest, D J Adams
Paperback: 140 pages
Publisher: O'Reilly Media; 1 edition (June 30, 2003)
Language: English
ISBN: 0596005504
Different from Search Engines…
Search Engines vs. Directories
Search Engines
• Automated – no human intervention
• Paid advertisers top results lists
Search Directories
• Indexed by humans
• Examples:
  • Yahoo
Computers are quick but they don’t think… ~D.A.