Search Engines - Some things you should know
I first started writing this site on 5th June 1999. Later the same month I submitted it to Alta Vista, Excite, Lycos, UK Plus and Yahoo. Six months later the only search engine that carried the site was UK Plus. If you've submitted your site to a search engine and it doesn't immediately appear, don't worry, it will.
Keywords Meta Tag
The keywords meta tag is now largely redundant. Most search engine crawlers don't use it anymore, but a few do, so you may want to add it to your pages. It goes in the head section of the page and the form is ...
<meta name="Keywords" content="word, word, word, word">
For this page, I've used
<meta name="Keywords" content="search, engine, placing, improving, submitting, about">
Submitting
There are sites that promise to get you "top ranking" in multiple search engines - for a fee. You can spend a lot of money on these submissions and still not get anywhere near top placing. The reason is simple: these sites can submit your site, but they have no control over how the search engines work. There are plenty of free submission engines around, usually offering to submit your site to 20 or so search engines. Another method is to submit to the search engines yourself. The only drawback is that it takes a bit longer than using an automatic submission. Most search engines re-index sites periodically. They also spider sites that haven't been submitted if they have found links to them on other sites.
Sitemaps
All sites should have a human readable sitemap - an easily navigated page that lists the pages on a site so people can find their way around if they get lost using the normal site navigation. There are also sitemaps that help search engine bots and spiders visit every page of a site. Some web technologies, such as Flash and pages made on the fly by server-side scripting, make some sites difficult for search engines to index. So in November 2006 Google, Yahoo and MSN announced a common protocol for search engine sitemaps, and Ask joined them in April 2007.
There are two types of search engine sitemap: one is plain text and the other uses XML. The Sitemaps website was created by Google, Yahoo and Microsoft to explain and define search engine sitemaps. It is the definitive reference on sitemaps and should be consulted if you intend to use them.
If you have a copy of the website on your computer then creating the text version of a sitemap is pretty easy and doesn't need any software beyond what is already installed. The DIR command, run from a command prompt, gives a directory listing. Limit dir to htm pages, use the /o, /s and /b switches to sort the listing, include subdirectories and give a bare listing of full paths, then write the output to a text file, and the sitemap text file is practically done for you. Here's the command I use...
dir "c:\users\brisray\documents\My Web Sites\Free Hosts\*.htm" /o /s /b > "c:\users\brisray\documents\My Web Sites\sitemap.txt"
Rather than typing the whole line every time, copy the text to a plain text file and save it as webdir.bat. You will need to edit the input and output paths and then you can create a shortcut to this file and place it in your Start menu, as I have done...
Running the batch file gives a text file that looks a little like...
c:\users\brisray\documents\My Web Sites\Free Hosts\biog.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\credits.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\grad.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\guests.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\index.html
c:\users\brisray\documents\My Web Sites\Free Hosts\injury.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\links.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\oldindex.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\search.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\sitemap.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\briscan\briscan1.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\bristol\bagorge1.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\bristol\bagorge2.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\common\menu.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\comp\audioa.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\comp\audiob.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\comp\audioc.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\dad\chevron1.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\dad\chevron2.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\dad\cook1.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\graphics\gind.htm
c:\users\brisray\documents\My Web Sites\Free Hosts\graphics\psvig1.htm
The file is still not suitable though. Use Notepad's replace facility to change \ to /. This makes the file...
c:/users/brisray/documents/My Web Sites/Free Hosts/biog.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/credits.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/grad.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/guests.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/index.html
c:/users/brisray/documents/My Web Sites/Free Hosts/injury.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/links.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/oldindex.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/search.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/sitemap.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/briscan/briscan1.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/bristol/bagorge1.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/bristol/bagorge2.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/common/menu.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/comp/audioa.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/comp/audiob.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/comp/audioc.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/dad/chevron1.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/dad/chevron2.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/dad/cook1.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/graphics/gind.htm
c:/users/brisray/documents/My Web Sites/Free Hosts/graphics/psvig1.htm
Now use Notepad's replace facility again to replace the local folder path (here c:/users/brisray/documents/My Web Sites/Free Hosts) with the site's domain...
http://brisray.com/biog.htm
http://brisray.com/credits.htm
http://brisray.com/grad.htm
http://brisray.com/guests.htm
http://brisray.com/index.html
http://brisray.com/injury.htm
http://brisray.com/links.htm
http://brisray.com/oldindex.htm
http://brisray.com/search.htm
http://brisray.com/sitemap.htm
http://brisray.com/briscan/briscan1.htm
http://brisray.com/bristol/bagorge1.htm
http://brisray.com/bristol/bagorge2.htm
http://brisray.com/common/menu.htm
http://brisray.com/comp/audioa.htm
http://brisray.com/comp/audiob.htm
http://brisray.com/comp/audioc.htm
http://brisray.com/dad/chevron1.htm
http://brisray.com/dad/chevron2.htm
http://brisray.com/dad/cook1.htm
http://brisray.com/graphics/gind.htm
http://brisray.com/graphics/psvig1.htm
The sitemap.txt file is now suitable for uploading. It can be easily edited for small changes or the steps above rerun for larger edits.
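The same steps can also be scripted rather than done by hand. What follows is only a sketch in Python, not the method described above: the function name build_text_sitemap and its parameters are made up for illustration, and it assumes the local folder mirrors the layout of the live site.

```python
from pathlib import Path
from urllib.parse import quote


def build_text_sitemap(site_root, base_url, patterns=("*.htm", "*.html")):
    """Return a sorted list of URLs, one for each local page under site_root.

    site_root and base_url are placeholders: pass your own local folder
    and your own domain.
    """
    root = Path(site_root)
    urls = set()
    for pattern in patterns:
        # rglob also searches subdirectories, like the /s switch of dir
        for page in root.rglob(pattern):
            # Path relative to the site root, written with forward slashes
            relative = page.relative_to(root).as_posix()
            # quote() percent-encodes spaces and other unsafe characters
            urls.add(base_url.rstrip("/") + "/" + quote(relative))
    return sorted(urls)
```

For example, build_text_sitemap(r"c:\users\brisray\documents\My Web Sites\Free Hosts", "http://brisray.com") would return the list of URLs, which can then be written to sitemap.txt one per line. Unlike the Notepad steps, the sketch percent-encodes any spaces in file names.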
There are online utilities that can create the XML file if one is needed, but I don't understand how these can be any better than a search engine spider at finding all the pages of a site. I prefer programs like OFFLINE SITE MAP GENERATOR, which generate the XML file from the files on the computer.
Once created, the sitemaps can easily be edited for small changes or recreated for larger ones. If you are using FrontPage you need to edit out the references to the _vti directories that FrontPage produces.
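For anyone who would rather script the XML version, here is a minimal sketch in Python. It is not how any of the programs mentioned above work. The function name build_xml_sitemap and the skip_fragment parameter are invented for this example. It writes only the required urlset, url and loc elements of the Sitemaps protocol, and it leaves out any URL containing /_vti, on the assumption that those are FrontPage's own directories.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_xml_sitemap(urls, skip_fragment="/_vti"):
    """Return the sitemap XML, as a string, for the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        # Leave out FrontPage housekeeping directories such as _vti_cnf
        if skip_fragment in url:
            continue
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

The returned string can be saved as sitemap.xml. The protocol also allows optional elements such as lastmod, which this sketch does not write.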
The location of the sitemap file can be given in robots.txt with an entry like this...
Sitemap: http://brisray.com/sitemap.xml
The location can also be sent directly to Google using Google Site Tools and to Yahoo using the Yahoo Site Explorer.
The best practice seems to be to use the XML version of the sitemap and place its location in robots.txt. Be sure to read the Sitemaps website for full information on all aspects of sitemaps.
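To check that a robots.txt file really does advertise the sitemap, something like the following Python sketch could be used. The helper name sitemaps_from_robots is made up. It relies on the standard library's RobotFileParser and its site_maps() method, which needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_from_robots(robots_text):
    """Return the sitemap URLs listed in the text of a robots.txt file."""
    parser = RobotFileParser()
    # parse() takes the file's lines; nothing is fetched over the network
    parser.parse(robots_text.splitlines())
    # site_maps() returns None when there are no Sitemap entries
    return parser.site_maps() or []
```

Given the text of a robots.txt file containing the Sitemap entry, the function returns a list holding that one URL. A file with no Sitemap line gives an empty list.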
How they work
Most search engines work by sending a spider, a small piece of software, to look at your page. The spider reads the text looking for key phrases and visits multiple pages of your site by following the links (crawling). The results are then added to the search engine's database.
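The crawling idea can be sketched in a few lines of Python. This is a toy illustration only, not how any real search engine is written: the names LinkCollector and crawl are invented here, and fetch stands for whatever function returns the HTML of a URL (or None if the page can't be read).

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag fed to it."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Follow links from start_url; return {url: html} for each page visited."""
    seen = set()
    queue = deque([start_url])
    index = {}
    while queue:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        html = fetch(url)
        if html is None:
            continue  # page missing or unreadable
        index[url] = html
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            # Resolve relative links against the page they were found on
            queue.append(urljoin(url, href))
    return index
```

A page that nothing links to is never reached by a crawl like this, which is the gap that search engine sitemaps are meant to fill.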
Optimization & Ranking
How search engines rank your site depends on a number of factors. Search engines look for how often key phrases appear and in what relation to each other. Once a site is in a search engine index, many engines count how many times its link is clicked on: the more clicks, the higher your ranking. Some check how many links your site has from other sites. If a link to your site appears on other sites it may raise your ranking.
Search engines use sophisticated techniques to make their listings accurate. These include reducing the ranking for sites that appear to be just a list of keywords, or have links on a lot of unrelated sites.
Paying
There is a difference between paid inclusion and paid ranking. Paid inclusion is where the site is guaranteed to be indexed. A site owner may pay for this because the site's content changes frequently and so needs to be kept current in the search engine indexes. Paid ranking is where people pay the search engines to improve their ranking for certain keywords. Thankfully, most search engines now have a separate listing of these "sponsored" sites.
Searching
Here are a few hints for finding exactly what you want. What usually happens is that the search engines return pages that contain every word in your search phrase. Be as specific as you can with these search words or phrases. If you enter Bristol Bridge riot you'll get pages that'll mention how flowers near Bristol Bridge are a riot of colour, how Clifton Suspension bridge was built in Bristol and so on. To ensure that all the words appear on the page you are looking for, put a + in front of them. For example, use +Bristol +Bridge +riot
To leave out pages containing words you don't want, put a - in front of those words. In our example you'll get lots of bridge clubs included in the results. To exclude them we can use +Bristol +Bridge +riot -club
To look for a specific phrase, enclose it in quotes, such as "Bristol Bridge riot". This will certainly cut down on the number of pages you get returned, but it will miss things like "the riots at Bristol Bridge" and still include bridge clubs. In this case we can combine everything and put +"Bristol Bridge" +riot -club
In this example we've gone from 4,600 sites in Google to 31, all highly relevant to what we wanted to find.
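The +, - and quote operators can be pictured as a simple filter over page text. The Python sketch below is an illustration only and makes several assumptions that real search engines don't: the names parse_query and matches are invented, terms are matched as plain substrings ignoring case (so riot also matches riots), and a term with no sign is treated as required.

```python
import re

# An optional + or - sign, followed by a "quoted phrase" or a single word
TERM = re.compile(r'([+-]?)(?:"([^"]+)"|(\S+))')


def parse_query(query):
    """Split a query into (required, excluded) lists of lower-case terms."""
    required, excluded = [], []
    for sign, phrase, word in TERM.findall(query):
        term = (phrase or word).lower()
        if sign == "-":
            excluded.append(term)
        else:
            required.append(term)
    return required, excluded


def matches(text, query):
    """True if text has every required term and none of the excluded ones."""
    required, excluded = parse_query(query)
    text = text.lower()
    return (all(term in text for term in required)
            and not any(term in text for term in excluded))
```

With this sketch, a page reading "the riots at Bristol Bridge" passes +"Bristol Bridge" +riot -club but fails "Bristol Bridge riot", which is the behaviour described above.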
Pet Hates
There are several things I don't like about the way some search engines work. Some sites still manage to get multiple listings covering several search result pages, and often these pages have little or no content relevant to the search words or phrases asked for. Another pet hate is the results for commercial sites that I get. If I wanted to buy something I'd put "buy" in the search phrase. If I specifically wanted to buy a book about something then I'd put "buy book" then the search keywords. I know Amazon sells books, I don't need to be reminded every bloody single time I search for something. Blogs also get on my nerves. If I'm looking for information I don't want to see someone's rant about a subject, especially if it's just two badly put together sentences before the author goes on about the price of cheese or whatever came into their mind at the time. I'm starting to lose track of the number of times that I've looked for something and found a link to a forum where someone has asked a similar question to my search phrase and didn't get an answer. Apart from knowing someone's having the same problem, it's about as much use to me as a chocolate teapot.
Experience
My site has been on the web since June 1999. I visited each of the search engines below in April 2001 and entered the following search words 1) Bristol England 2) Bristol England History 3) Bedminster 4) Optical Illusions. The figures give my site's position for each. 0 = Not Found or not in the top 50. I repeated the test over several years.
Search Engine | April 2001 | August 2002 | September 2003 | September 2004 | October 2005 |
About | 0, 22, 0, 0 | 0, 0, 0, 28 | 0, 0, 23, 0 | 0, 0, 0, 0 | 0, 0, 0, 0 |
All the Web | 0, 17, 0, 0 | 0, 0, 0, 14 | 0, 0, 0, 0 | 0, 43, 37, 36 | 0, 0, 0, 0 |
Alta Vista | 0, 25, 0, 7 | 0, 9, 0, 0 | 0, 0, 9, 0 | 0, 43, 38, 37 | 0, 0, 0, 0 |
AOL | 0, 0, 0, 10 | 0, 0, 0, 51 | 0, 0, 0, 0 | 0, 0, 27, 0 | 0, 0, 0, 0 |
Ask Jeeves | 0, 0, 0, 0 | 0, 0, 0, 39 | 0, 0, 31, 0 | 0, 0, 0, 0 | |
Dogpile | 0, 0, 32, 44 | 0, 0, 0, 0 | 0, 0, 0, 0 | ||
Excite | 0, 0, 0, 38 | 0, 0, 0, 30 | 0, 0, 48, 0 | 0, 0, 0, 0 | 0, 0, 0, 0 |
Google | 0, 44, 0, 18 | 0, 34, 35, 45 | 0, 0, 0, 0 | 0, 0, 21, 0 | 0, 0, 0, 0 |
HotBot | 0, 27, 0, 0 | 0, 25, 0, 28 | 0, 0, 9, 0 | 0, 0, 49, 0 | Now Ask Jeeves and Google |
LookSmart | 0, 0, 0, 0 | 0, 0, 0, 0 | 0, 0, 16, 12 | 0, 0, 0, 51 | 0, 0, 34, 37 |
Lycos | 0, 17, 0, 0 | 0, 0, 0, 13 | 0, 49, 0, 0 | 0, 0, 49, 0 | 0, 0, 0, 0 |
Mamma | 0, 0, 0, 0 | 0, 0, 0, 0 | 0, 0, 44, 0 | 0, 0, 0, 0 | 0, 0, 0, 0 |
MetaCrawler | 0, 0, 0, 0 | 0, 0, 0, 0 | 0, 0, 22, 43 | 0, 0, 0, 0 | 0, 0, 0, 0 |
Mirago | 0, 0, 0, 0 | 0, 0, 0, 0 | |||
MSN | 0, 52, 0, 0 | 0, 27, 0, 0 | 0, 0, 24, 0 | 0, 0, 42, 0 | 0, 47, 0, 0 |
Netscape | 0, 41, 19, 0 | 0, 0, 0, 0 | |||
Open Directory Project | 0, 0, 0, 25 | 0, 0, 0, 28 | 0, 0, 0, 0 | 0, 0, 0, 0 | 0, 0, 0, 13 |
Overture | 0, 15, 0, 0 | 0, 0, 0, 0 | 0, 43, 47, 0 | Now Yahoo Search Marketing | |
Sprinks | 0, 24, 0, 0 | 0, 0, 0, 0 | 0, 0, 0, 0 | Now Google | Gone? |
Teoma | 0, 0, 0, 29 | 0, 0, 31, 0 | 0, 0, 0, 0 | ||
UK Plus | 0, 0, 0, 5 | 0, 21, 24, 0 | 0, 0, 2, 0 | 0, 32, 13, 3 | 0, 42, 16, 0 |
Web Crawler | 0, 0, 0, 37 | 0, 0, 0, 30 | 0, 0, 19, 35 | 0, 0, 0, 0 | 0, 0, 0, 0 |
Yahoo | 0, 33, 45, 17 | 0, 0, 0, 0 | 0, 0, 0, 0 | 0, 0, 0, 0 | 0, 0, 0, 0 |
Yumo | 0, 0, 0, 0 | 0, 0, 0, 0 | Gone? | Now Altaseek using results from Alta Vista, Yahoo and Lycos |
Pretty depressing. From the results it's a wonder anyone can find the site at all! But this isn't the whole story. The search terms I originally used were far too general. I tried again using the following search terms 1) Bristol riots 2) Penrose stairs 3) QBasic menus 4) HMS Gambia collision
Search Engine | September 2003 | September 2004 | October 2005 |
About | 4, 0, 6, 1 | 0, 0, 0, 0 | 0, 0, 0, 0 |
All the Web | 4, 0, 0, 2 | 3, 0, 1, 1 | 2, 0, 2, 1 |
Alta Vista | 6, 0, 0, 0 | 3, 0, 1, 1 | 2, 0, 2, 1 |
AOL | 3, 8, 1, 1 | 2, 47, 1, 1 | 38, 0, 0, 50 |
Ask Jeeves | 2, 3, 1, 1 | 12, 35, 1, 1 | 3, 0, 1, 9 |
Dogpile | 2, 14, 1, 4 | 7, 28, 4, 3 | 5, 0, 11, 7 |
Excite | 2, 16, 1, 7 | 7, 0, 5, 3 | 5, 0, 11, 7 |
Google | 3, 6, 1, 1 | 2, 43, 2, 1 | 0, 0, 0, 1 |
HotBot | 4, 0, 2, 1 | 3, 0, 1, 1 | Now using Ask Jeeves and Google |
LookSmart | 0, 0, 14, 1 | 0, 0, 10, 1 | 18, 0, 1, 1 |
Lycos | 4, 0, 0, 2 | 3, 0, 1, 1 | 3, 49, 1, 10 |
Mamma | 4, 0, 1, 4 | 9, 22, 4, 1 | 13, 0, 7, 4 |
MetaCrawler | 3, 30, 1, 7 | 10, 0, 6, 3 | 5, 0, 10, 10 |
Mirago | 0, 0, 0, 0 | 40, 0, 0, 0 | |
MSN | 3, 0, 2, 1 | 2, 0, 1, 1 | 14, 21, 1, 1 |
Netscape | 2, 44, 1, 1 | 29, 0, 0, 46 | |
Open Directory Project | 0, 0, 0, 0 | 0, 0, 0, 0 | 0, 0, 0, 0 |
Overture | 0, 0, 0, 0 | 2, 0, 3, 1 | Now Yahoo Search Marketing |
Sprinks | 0, 0, 0, 0 | Now Google | Gone? |
Teoma | 2, 3, 1, 1 | 12, 35, 1, 1 | 3, 0, 1, 10 |
UK Plus | 3, 0, 0, 1 | 2, 0, 1, 1 | 7, 0, 1, 1 |
Web Crawler | 2, 12, 1, 7 | 8, 0, 5, 3 | 5, 0, 10, 10 |
Yahoo | 3, 4, 1, 1 | 3, 0, 1, 1 | 2, 0, 2, 1 |
Yumo | 9, 20, 8, 1 | Gone? | Now Altaseek using results from Alta Vista, Yahoo and Lycos |
Lastly, I thought I'd try something truly unique to me, so I searched for brisray. This is a great way to search for links to your site: just use your unique user name. I also use this name for message boards, forums and so on, so these results are returned as well. The figures give the number of results returned.
Search Engine | September 2003 | September 2004 | October 2005 | October 2006 |
About | 57 | 0 | 0 | 0 |
All the Web | 2,093 | 2,630 | 25,500 | 12,600 |
Alta Vista | 311 | 2,720 | 23,900 | 12,900 |
AOL | 73 | 154 | 181 | 91 |
Ask Jeeves | 190 | 5,630 | 3,780 | Now Ask.com 134 |
Dogpile | 44 | 59 | 67 | 48 |
Excite | 40 | 58 | 20 | 64 |
Google | 3,320 | 8,790 | 14,700 | 764 |
HotBot | 71 | 121 | Now Ask Jeeves and Google | Now MSN, Ask and Google |
LookSmart | 81 | 300 | 0 | 37 |
Lycos | 2,108 | 121 | 3,790 | 3,576 |
Mamma | 33 | 44 | 39 | 34 |
MetaCrawler | 37 | 58 | 68 | 73 |
Mirago | 78 | 23 | 26 | |
MSN | 62 | 110 | 5,746 | 184 |
Netscape | 148 | 157 | 731 | |
Open Directory Project | 1 | 2 | 5 | 5 |
Overture | 40 | 84 | 205 Now Yahoo Search Marketing | 147 |
Sprinks | 0 | Now Google | Gone? | |
Teoma | 2,190 | 5,630 | 3,790 | Now Ask.com |
UK Plus | 42 | 85 | 203 | 10 |
Web Crawler | 40 | 49 | 62 | 70 |
Yahoo | 1,770 | 8,880 | 27,000 | 8,130 |
Yumo | 17 | Gone? | Now Altaseek using results from Alta Vista, Yahoo and Lycos |
Links
List of Search Engines
Search Engine Guide - How to get the best from search engines
Search Engine Watch - Everything you need to know about search engines
Search Engines Worldwide - A list of search engines
Sitemaps - a site created by Google, Yahoo and Microsoft to explain search engine sitemaps
Virtual Search Engines - Specialised search engines
W3 Search Engines - A list of search engines
XML SiteMap Creator - A program to produce an XML sitemap
This page created 22nd April 2001, last modified 6th May 2008