Search Engines

Search Engines - Some things you should know

I first started writing this site on 5th June 1999. Later the same month I submitted it to Alta Vista, Excite, Lycos, UK Plus and Yahoo. Six months later the only search engine that carried the site was UK Plus. If you've submitted your site to a search engine and it doesn't appear immediately, don't worry; it will.

Keywords Meta Tag

The keywords meta tag is now largely redundant. Most search engine crawlers don't use it anymore, but a few do, so you may want to add it to your pages. It goes in the head section of the page and the form is ...

<meta name="Keywords" content="word, word, word, word">

For this page, I've used

<meta name="Keywords" content="search, engine, placing, improving, submitting, about">
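If you keep a list of keywords for each page, the tag can be built by a short script rather than typed by hand. This is only a sketch in Python; the function name keywords_meta_tag is made up for the example, and it simply follows the form of the tag shown above.

```python
from html import escape


def keywords_meta_tag(keywords):
    """Return a keywords meta tag for a list of words (hypothetical helper)."""
    # Drop empty entries and stray whitespace, then join with ", ".
    cleaned = [word.strip() for word in keywords if word.strip()]
    # Escape quotes and ampersands so the attribute value stays valid HTML.
    content = escape(", ".join(cleaned), quote=True)
    return f'<meta name="Keywords" content="{content}">'


print(keywords_meta_tag(
    ["search", "engine", "placing", "improving", "submitting", "about"]
))
```

Run as it stands, it prints the same tag as the one used for this page.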

Submitting

There are sites that promise to get you "top ranking" in multiple search engines - for a fee. You can spend a lot of money on these submissions and still not get anywhere near top placing. The reason is simple: these sites can submit your site, but they have no control over how the search engines work. There are plenty of free submission engines around, usually offering to submit your site to 20 or so search engines. Another method is to submit to the search engines yourself. The only drawback is that it takes a bit longer than using an automatic submission. Most search engines re-index sites periodically. They also spider sites that haven't been submitted if they find links to them on other sites.

Sitemaps

All sites should have a human readable sitemap - an easily navigated page that lists the pages on a site so people can find their way around if they get lost using the normal site navigation. There are also sitemaps that help search engine bots and spiders visit every page of a site. Some web technologies, such as Flash and pages made on the fly by server side scripting, make some sites difficult for search engines to index. So in November 2006 Google, Yahoo and MSN announced a common protocol for search engine sitemaps, and Ask joined them in April 2007.

There are two types of search engine sitemap: one is plain text and the other uses XML. The Sitemaps website was created by Google, Yahoo and Microsoft to explain and define search engine sitemaps. It is the definitive reference for sitemaps and should be consulted if you intend to use them.

If you have a copy of the website on your computer then creating the text version of a sitemap is pretty easy and doesn't need any software beyond what is already installed. The DIR command, run from a command prompt, gives a directory listing. Limit dir to htm pages, use the /o, /s and /b switches to sort the listing, include subdirectories and give a bare listing (just the full path of each file), then write the output to a text file, and the sitemap text file is practically done for you. Here's the command I use...

dir "c:\users\brisray\documents\My Web Sites\Free Hosts\*.htm" /o /s /b > "c:\users\brisray\documents\My Web Sites\sitemap.txt"

Rather than typing the whole line every time, copy the text to a plain text file and save it as webdir.bat. You will need to edit the input and output paths. You can then create a shortcut to this file and place it in your Start menu, as I have done...

Shortcut to webdir.bat

Running the batch file gives a text file that looks a little like...

c:\users\brisray\documents\My Web Sites\Free Hosts\biog.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\credits.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\grad.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\guests.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\index.html
c:\users\brisray\documents\My Web Sites\Free Hosts\injury.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\links.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\oldindex.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\search.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\sitemap.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\briscan\briscan1.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\bristol\bagorge1.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\bristol\bagorge2.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\common\menu.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\comp\audioa.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\comp\audiob.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\comp\audioc.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\dad\chevron1.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\dad\chevron2.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\dad\cook1.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\graphics\gind.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\graphics\psvig1.htm

The file is still not suitable though. Use Notepad's replace facility to change \ to /. This makes the file...

c:/users/brisray/documents/My Web Sites/Free Hosts/biog.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/credits.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/grad.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/guests.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/index.html
c:/users/brisray/documents/My Web Sites/Free Hosts/injury.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/links.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/oldindex.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/search.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/sitemap.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/briscan/briscan1.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/bristol/bagorge1.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/bristol/bagorge2.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/common/menu.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/comp/audioa.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/comp/audiob.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/comp/audioc.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/dad/chevron1.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/dad/chevron2.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/dad/cook1.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/graphics/gind.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/graphics/psvig1.htm

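Now use Notepad's replace facility to replace the local folder path with the site's domain. Here that means replacing c:/users/brisray/documents/My Web Sites/Free Hosts/ with http://brisray.com/ to give...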

http://brisray.com/biog.htm
http://brisray.com/credits.htm
http://brisray.com/grad.htm
http://brisray.com/guests.htm
http://brisray.com/index.html
http://brisray.com/injury.htm
http://brisray.com/links.htm
http://brisray.com/oldindex.htm
http://brisray.com/search.htm
http://brisray.com/sitemap.htm
http://brisray.com/briscan/briscan1.htm
http://brisray.com/bristol/bagorge1.htm
http://brisray.com/bristol/bagorge2.htm
http://brisray.com/common/menu.htm
http://brisray.com/comp/audioa.htm
http://brisray.com/comp/audiob.htm
http://brisray.com/comp/audioc.htm
http://brisray.com/dad/chevron1.htm
http://brisray.com/dad/chevron2.htm
http://brisray.com/dad/cook1.htm
http://brisray.com/graphics/gind.htm
http://brisray.com/graphics/psvig1.htm

The sitemap.txt file is now suitable for uploading. It can be easily edited for small changes, or the steps above can be rerun for larger edits.
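The dir and Notepad steps above can also be sketched as a short Python script. This is only an illustration, not the method used for this site: the function names are made up for the example, and it assumes the local folder mirrors the layout of the live site. Unlike the Notepad method, it percent-encodes spaces in file names so that each line is a valid URL.

```python
from pathlib import Path
from urllib.parse import quote

# File extensions treated as web pages (an assumption for this sketch).
HTML_SUFFIXES = {".htm", ".html"}


def local_path_to_url(path, local_root, site_root):
    """Turn one local file path into the matching URL on the site."""
    relative = Path(path).relative_to(local_root)
    # as_posix() gives forward slashes; quote() escapes spaces and the like.
    return site_root.rstrip("/") + "/" + quote(relative.as_posix())


def build_sitemap_lines(local_root, site_root):
    """Return a sorted list of URLs for every page under local_root."""
    root = Path(local_root)
    pages = [
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in HTML_SUFFIXES
    ]
    return sorted(local_path_to_url(p, root, site_root) for p in pages)


# Example use (edit both paths to suit your own machine and domain):
# lines = build_sitemap_lines(
#     r"c:\users\brisray\documents\My Web Sites\Free Hosts",
#     "http://brisray.com",
# )
# Path("sitemap.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")
```

The list is sorted by URL, so the order differs slightly from the dir listing, which shows the files in each folder before moving on to its subfolders.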

There are online utilities that can create the XML file if one is needed, but I don't understand how these can be any better than a search engine spider at finding all the pages of a site. I prefer programs like OFFLINE SITE MAP GENERATOR, which generate the XML file from the files on the computer.

Once created the sitemaps can easily be edited for small changes or recreated for larger ones. If you are using FrontPage you need to edit out the references to the _vti directories that FrontPage produces.
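For anyone curious what the XML version looks like, here is a minimal sketch in Python that turns a list of URLs into a sitemap. It assumes only the required loc tag is wanted and leaves out the optional ones; the function name is made up for the example, and the namespace is the one given on the Sitemaps website. Following the FrontPage note above, any URL containing a _vti directory is skipped.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def urls_to_sitemap_xml(urls):
    """Build a bare-bones XML sitemap from a list of page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        # Skip FrontPage's _vti directories, which are not real pages.
        if "/_vti" in url:
            continue
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(urls_to_sitemap_xml([
    "http://brisray.com/biog.htm",
    "http://brisray.com/comp/audioa.htm",
]))
```

ElementTree escapes characters such as & in the URLs, which the protocol requires.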

The location of the sitemap file can be designated in the robots.txt with an entry in the file...

Sitemap: http://brisray.com/sitemap.xml
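To check that a robots.txt file really does advertise the sitemap, the entry can be read back with Python's standard urllib.robotparser module. This is a small sketch rather than a full checker; the function name is made up, the robots.txt text is an invented example, and site_maps() needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_from_robots(robots_text):
    """Return the sitemap URLs listed in the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # site_maps() returns None when there are no Sitemap entries.
    return parser.site_maps() or []


# An invented robots.txt that allows everything and names one sitemap.
EXAMPLE_ROBOTS = """User-agent: *
Disallow:
Sitemap: http://brisray.com/sitemap.xml
"""

print(sitemaps_from_robots(EXAMPLE_ROBOTS))
```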

The location can also be sent directly to Google using Google Site Tools and to Yahoo using the Yahoo Site Explorer.

The best practice seems to be to use the XML version of the sitemap and place its location in robots.txt. Be sure to read the Sitemaps website for full information on all aspects of sitemaps.

How they work

Most work by sending a spider, a small piece of software, to look at your page. The spider reads the text looking for key phrases and visits multiple pages of your site by following the links (crawling). The results are then added to the search engine database.
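The crawling part can be sketched in a few lines of Python. This is a toy, not how any real search engine spider is written: the three pages live in a dictionary instead of on the web, example.test is a made-up address, and the names LinkCollector, crawl and PAGES are invented for the example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Visit pages breadth-first; fetch(url) returns HTML text or None."""
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # Page missing, nothing to index.
        visited.append(url)
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            target = urljoin(url, href)  # Resolve relative links.
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return visited


# A pretend three-page site held in memory.
PAGES = {
    "http://example.test/index.html":
        '<a href="biog.htm">Biography</a> <a href="links.htm">Links</a>',
    "http://example.test/biog.htm": '<a href="index.html">Home</a>',
    "http://example.test/links.htm": '<a href="biog.htm">Biography</a>',
}

print(crawl("http://example.test/index.html", PAGES.get))
```

Starting from index.html, the sketch finds biog.htm and links.htm by following links and visits each page once. A page that nothing links to would never be found, which is exactly the gap a sitemap fills.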

Optimization & Ranking

How search engines rank your site depends on a number of factors. Search engines look at how often key phrases appear and how they relate to each other. Once a site is in a search engine index, many engines count how many times its link is clicked on: the more clicks, the higher your ranking. Some check how many links your site has from other sites. If a link to your site appears on other sites it may raise your ranking.

Search engines use sophisticated techniques to make their listing accurate. These include reducing the ranking for sites that appear to be just a list of keywords, or have links on a lot of unrelated sites.
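To make the idea concrete, here is a deliberately crude scoring sketch in Python. No real search engine works this simply and the weighting is invented purely for illustration; it just adds up how often the search words appear on a page and gives a bonus for each link from another site.

```python
import re


def toy_score(page_text, query_words, inbound_links, link_weight=2):
    """Score a page from keyword counts plus a bonus per inbound link.

    link_weight is an arbitrary number chosen for this example.
    """
    words = re.findall(r"[a-z0-9']+", page_text.lower())
    hits = sum(words.count(word.lower()) for word in query_words)
    return hits + link_weight * inbound_links


page = "Bristol Bridge riot: the riot began on the bridge."
print(toy_score(page, ["bristol", "riot"], inbound_links=3))
```

A page that was nothing but the word "riot" repeated would score well here, which is why real engines add the penalties described above.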

Paying

There is a difference between paid inclusion and paid ranking. Paid inclusion is where the site is guaranteed to be indexed. This may be because a site's content changes frequently and so needs to be kept current in the search engine indexes. Paid ranking is where people pay the search engines to improve their ranking for certain keywords. Thankfully, most search engines now have a separate listing of these "sponsored" sites.

Searching

Here are a few hints to help you find exactly what you want. What usually happens is that the search engines return pages that contain every word in your search phrase. Be as specific as you can with these search words or phrases. If you enter Bristol Bridge riot you'll get pages that mention how flowers near Bristol Bridge are a riot of colour, how Clifton Suspension Bridge was built in Bristol and so on. To ensure that all the words appear on the page you are looking for, put a + in front of each of them. For example, use +Bristol +Bridge +riot

To stop getting pages that contain words you don't want, put a - in front of them. In our example you'll get lots of bridge clubs included in the results. To stop them we can use +Bristol +Bridge +riot -club

To look for a specific phrase, enclose it in quotes, such as "Bristol Bridge riot". This will certainly cut down on the number of pages you get returned, but it will miss things like "the riots at Bristol Bridge" and still include bridge clubs. In this case we can combine everything and put +"Bristol Bridge" +riot -club

In this example we've gone from 4,600 sites in Google to 31, all highly relevant to what we wanted to find.
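The +, - and quote rules above can be sketched as a small page filter in Python. This shows the idea only and is not any search engine's actual query parser: the names are made up, words with no sign are treated the same as + words, and matching is a plain case-insensitive substring test, so riot also matches riots.

```python
import re

# An optional + or -, then either a "quoted phrase" or a single word.
TERM_PATTERN = re.compile(r'([+-]?)(?:"([^"]*)"|(\S+))')


def matches(page_text, query):
    """Return True if the page satisfies every term in the query."""
    text = page_text.lower()
    for sign, phrase, word in TERM_PATTERN.findall(query):
        term = (phrase or word).lower()
        if not term:
            continue
        found = term in text
        if sign == "-" and found:
            return False  # An excluded term is present.
        if sign != "-" and not found:
            return False  # A required term is missing.
    return True


query = '+"Bristol Bridge" +riot -club'
print(matches("The riots at Bristol Bridge", query))
print(matches("A riot of colour at the Bristol Bridge club", query))
print(matches("Clifton Suspension Bridge was built in Bristol", query))
```

Only the first page passes. The second is thrown out by -club and the third does not contain the phrase "Bristol Bridge".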

Pet Hates

There are several things I don't like about the way some search engines work. Some sites still manage to get multiple listings covering several search result pages; often these pages have little or no content relevant to the search words or phrases asked for. Another pet hate is the results I get for commercial sites. If I wanted to buy something I'd put "buy" in the search phrase. If I specifically wanted to buy a book about something then I'd put "buy book" then the search keywords. I know Amazon sells books, I don't need to be reminded every bloody single time I search for something. Blogs also get on my nerves. If I'm looking for information I don't want to see someone's rant about a subject, especially if it's just two badly put together sentences before the author goes on about the price of cheese or whatever came into their mind at the time. I'm starting to lose track of the number of times that I've looked for something and found a link to a forum where someone has asked a similar question to my search phrase and didn't get an answer. Apart from knowing someone's having the same problem, it's about as much use to me as a chocolate teapot.

Experience

My site has been on the web since June 1999. I visited each of the search engines below in April 2001 and entered the following search words: 1) Bristol England 2) Bristol England History 3) Bedminster 4) Optical Illusions. The figures give my site's position for each. 0 = Not Found or not in the top 50. I repeated the test several times over the following years.

Search Engine | April 2001 | August 2002 | September 2003 | September 2004 | October 2005
--- | --- | --- | --- | --- | ---
About | 0, 22, 0, 0 | 0, 0, 0, 28 | 0, 0, 23, 0 | 0, 0, 0, 0 | 0, 0, 0, 0
All the Web | 0, 17, 0, 0 | 0, 0, 0, 14 | 0, 0, 0, 0 | 0, 43, 37, 36 | 0, 0, 0, 0
Alta Vista | 0, 25, 0, 7 | 0, 9, 0, 0 | 0, 0, 9, 0 | 0, 43, 38, 37 | 0, 0, 0, 0
AOL | 0, 0, 0, 10 | 0, 0, 0, 51 | 0, 0, 0, 0 | 0, 0, 27, 0 | 0, 0, 0, 0
Ask Jeeves | | 0, 0, 0, 0 | 0, 0, 0, 39 | 0, 0, 31, 0 | 0, 0, 0, 0
Dogpile | | | 0, 0, 32, 44 | 0, 0, 0, 0 | 0, 0, 0, 0
Excite | 0, 0, 0, 38 | 0, 0, 0, 30 | 0, 0, 48, 0 | 0, 0, 0, 0 | 0, 0, 0, 0
Google | 0, 44, 0, 18 | 0, 34, 35, 45 | 0, 0, 0, 0 | 0, 0, 21, 0 | 0, 0, 0, 0
HotBot | 0, 27, 0, 0 | 0, 25, 0, 28 | 0, 0, 9, 0 | 0, 0, 49, 0 | Now Ask Jeeves and Google
LookSmart | 0, 0, 0, 0 | 0, 0, 0, 0 | 0, 0, 16, 12 | 0, 0, 0, 51 | 0, 0, 34, 37
Lycos | 0, 17, 0, 0 | 0, 0, 0, 13 | 0, 49, 0, 0 | 0, 0, 49, 0 | 0, 0, 0, 0
Mamma | 0, 0, 0, 0 | 0, 0, 0, 0 | 0, 0, 44, 0 | 0, 0, 0, 0 | 0, 0, 0, 0
MetaCrawler | 0, 0, 0, 0 | 0, 0, 0, 0 | 0, 0, 22, 43 | 0, 0, 0, 0 | 0, 0, 0, 0
Mirago | | | | 0, 0, 0, 0 | 0, 0, 0, 0
MSN | 0, 52, 0, 0 | 0, 27, 0, 0 | 0, 0, 24, 0 | 0, 0, 42, 0 | 0, 47, 0, 0
Netscape | | | | 0, 41, 19, 0 | 0, 0, 0, 0
Open Directory Project | 0, 0, 0, 25 | 0, 0, 0, 28 | 0, 0, 0, 0 | 0, 0, 0, 0 | 0, 0, 0, 13
Overture | | 0, 15, 0, 0 | 0, 0, 0, 0 | 0, 43, 47, 0 | Now Yahoo Search Marketing
Sprinks | 0, 24, 0, 0 | 0, 0, 0, 0 | 0, 0, 0, 0 | Now Google | Gone?
Teoma | | | 0, 0, 0, 29 | 0, 0, 31, 0 | 0, 0, 0, 0
UK Plus | 0, 0, 0, 5 | 0, 21, 24, 0 | 0, 0, 2, 0 | 0, 32, 13, 3 | 0, 42, 16, 0
Web Crawler | 0, 0, 0, 37 | 0, 0, 0, 30 | 0, 0, 19, 35 | 0, 0, 0, 0 | 0, 0, 0, 0
Yahoo | 0, 33, 45, 17 | 0, 0, 0, 0 | 0, 0, 0, 0 | 0, 0, 0, 0 | 0, 0, 0, 0
Yumo | | 0, 0, 0, 0 | 0, 0, 0, 0 | Gone? | Now Altaseek using results from Alta Vista, Yahoo and Lycos

Pretty depressing. From the results it's a wonder anyone can find the site at all! But this isn't the whole story. The search terms I originally used were far too general. I tried again using the following search terms: 1) Bristol riots 2) Penrose stairs 3) QBasic menus 4) HMS Gambia collision

Search Engine | September 2003 | September 2004 | October 2005
--- | --- | --- | ---
About | 4, 0, 6, 1 | 0, 0, 0, 0 | 0, 0, 0, 0
All the Web | 4, 0, 0, 2 | 3, 0, 1, 1 | 2, 0, 2, 1
Alta Vista | 6, 0, 0, 0 | 3, 0, 1, 1 | 2, 0, 2, 1
AOL | 3, 8, 1, 1 | 2, 47, 1, 1 | 38, 0, 0, 50
Ask Jeeves | 2, 3, 1, 1 | 12, 35, 1, 1 | 3, 0, 1, 9
Dogpile | 2, 14, 1, 4 | 7, 28, 4, 3 | 5, 0, 11, 7
Excite | 2, 16, 1, 7 | 7, 0, 5, 3 | 5, 0, 11, 7
Google | 3, 6, 1, 1 | 2, 43, 2, 1 | 0, 0, 0, 1
HotBot | 4, 0, 2, 1 | 3, 0, 1, 1 | Now using Ask Jeeves and Google
LookSmart | 0, 0, 14, 1 | 0, 0, 10, 1 | 18, 0, 1, 1
Lycos | 4, 0, 0, 2 | 3, 0, 1, 1 | 3, 49, 1, 10
Mamma | 4, 0, 1, 4 | 9, 22, 4, 1 | 13, 0, 7, 4
MetaCrawler | 3, 30, 1, 7 | 10, 0, 6, 3 | 5, 0, 10, 10
Mirago | | 0, 0, 0, 0 | 40, 0, 0, 0
MSN | 3, 0, 2, 1 | 2, 0, 1, 1 | 14, 21, 1, 1
Netscape | | 2, 44, 1, 1 | 29, 0, 0, 46
Open Directory Project | 0, 0, 0, 0 | 0, 0, 0, 0 | 0, 0, 0, 0
Overture | 0, 0, 0, 0 | 2, 0, 3, 1 | Now Yahoo Search Marketing
Sprinks | 0, 0, 0, 0 | Now Google | Gone?
Teoma | 2, 3, 1, 1 | 12, 35, 1, 1 | 3, 0, 1, 10
UK Plus | 3, 0, 0, 1 | 2, 0, 1, 1 | 7, 0, 1, 1
Web Crawler | 2, 12, 1, 7 | 8, 0, 5, 3 | 5, 0, 10, 10
Yahoo | 3, 4, 1, 1 | 3, 0, 1, 1 | 2, 0, 2, 1
Yumo | 9, 20, 8, 1 | Gone? | Now Altaseek using results from Alta Vista, Yahoo and Lycos

Lastly, I thought I'd try something truly unique to me, so I searched for brisray. This is a great way to search for links to your site: just use your unique user name. I also use this name for message boards, forums and so on, so these results are returned as well. The figures give the number of results returned.

Search Engine | September 2003 | September 2004 | October 2005 | October 2006
--- | --- | --- | --- | ---
About | 57 | 0 | 0 | 0
All the Web | 2,093 | 2,630 | 25,500 | 12,600
Alta Vista | 311 | 2,720 | 23,900 | 12,900
AOL | 73 | 154 | 181 | 91
Ask Jeeves | 190 | 5,630 | 3,780 | 134 (Now Ask.com)
Dogpile | 44 | 59 | 67 | 48
Excite | 40 | 58 | 20 | 64
Google | 3,320 | 8,790 | 14,700 | 764
HotBot | 71 | 121 | Now Ask Jeeves and Google | Now MSN, Ask and Google
LookSmart | 81 | 300 | 0 | 37
Lycos | 2,108 | 121 | 3,790 | 3,576
Mamma | 33 | 44 | 39 | 34
MetaCrawler | 37 | 58 | 68 | 73
Mirago | | 78 | 23 | 26
MSN | 62 | 110 | 5,746 | 184
Netscape | | 148 | 157 | 731
Open Directory Project | 1 | 2 | 5 | 5
Overture | 40 | 84 | 205 (Now Yahoo Search Marketing) | 147
Sprinks | 0 | Now Google | Gone? |
Teoma | 2,190 | 5,630 | 3,790 | Now Ask.com
UK Plus | 42 | 85 | 203 | 10
Web Crawler | 40 | 49 | 62 | 70
Yahoo | 1,770 | 8,880 | 27,000 | 8,130
Yumo | 17 | Gone? | Now Altaseek using results from Alta Vista, Yahoo and Lycos |

Links

List of Search Engines
Search Engine Guide - How to get the best from search engines
Search Engine Watch - Everything you need to know about search engines
Search Engines Worldwide - A list of search engines
Sitemaps - a site created by Google, Yahoo and Microsoft to explain search engine sitemaps
Virtual Search Engines - Specialised search engines
W3 Search Engines - A list of search engines
XML SiteMap Creator - A program to produce an XML sitemap

This page created 22nd April 2001, last modified 6th May 2008