Introduction
Google is the world's most popular search engine, with over 90% of the world's searches made on the platform. The market shares of other well-known search engines such as Bing, Yahoo, Baidu, Yandex, DuckDuckGo, and Ask are tiny in comparison, but they still account for millions of searches every day.
There are plenty of sites that list the best search engines for searching anonymously and avoiding things like the UKUSA Agreement (5 Eyes, 9 Eyes, and 14 Eyes) and other intelligence-gathering organizations. The most recommended search engines on these lists are usually DuckDuckGo, Startpage, Searx, Qwant, Swisscows, MetaGer, Mojeek, and Brave Search.
Most major search engines let you refine the results using operators, such as those for Google and Bing. Google also offers an Advanced Search page. Sometimes, though, even these tricks are not enough. This page looks at search engines that are designed to find smaller, personal sites, and includes some unusual or specialized search engines.
This is not a comprehensive list of all the search engines, but it does include those that I found useful or interesting.
Small, Personal, Non-commercial Web
The two most popular hosting sites for the small web revival appear to be Neocities and Nekoweb. Both have their own searches available.
Wiby
These search engines were the main reason for writing this page. The best I have found so far is Wiby.
Marginalia
Marginalia is an independent DIY search engine that focuses on non-commercial content and attempts to show you sites you perhaps weren't aware of. The results are determined by up-ranking websites that are text-heavy and down-ranking ones that are highly visual, loaded with modern web cruft, and SEO-optimized. This makes it a great tool for finding websites that the other search engines would not display. It also makes it one of my favourite search engines.
Mojeek
Mojeek is an independent search engine in that it uses its own database of websites. It doesn't necessarily find sites on the small web, but its results are certainly different from those obtained using Google or Bing.
Newgle
Newgle is a search engine that does not include results from .COM, .NET, and .ORG domains. This sounds like a good idea, but I found Newgle disappointing. Rather than returning results from the small web, it returned the sort of sites Google would, minus those from .COM, .NET, and .ORG domains.
Searchmysite
Searchmysite is a small search engine that indexes user-submitted sites from the small web.
VHSearch
VHSearch is a Google search filtered to return only results from Neocities.
The Old Web
In the late 1990s and early 2000s, there were many free website hosts around, including 50Megs, Angelfire, AOL Hometown, Bravepages, FortuneCity, Freeserve, Freeservers, Lycos Tripod, Neocities, Xoom, Yahoo and many more. In Canada, Sympatico (Internet Archive), owned by Bell, started hosting sites in November 1995. That lasted until February 1, 2023, when the sites were shut down. Bell had not seemed interested in providing the service for quite a while. Even in 2022, they were running Apache 1.3.36, which had been released in 2006.
Many involved in the small, personal, non-commercial web revival like looking through old sites out of interest, or for inspiration and material. Unfortunately, many of the sites were never indexed by the Internet Archive and, once the servers were shut down, were deleted forever. Some, though, still have copies available in some form, either on the original servers, mirrors of them, or on the Internet Archive.
Some of these sites are still very incomplete, so have a little patience and be prepared for lots of 404 (page not found), 410 (gone, permanently unavailable), 505 (HTTP version not supported) and other error messages.
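If an unfamiliar error number turns up while browsing these archives, Python's standard library can show the official reason phrase for it. This is only a small sketch: the describe helper is a hypothetical name chosen for illustration and is not part of any site listed here.

```python
from http import HTTPStatus


def describe(code: int) -> str:
    """Return the code with its standard reason phrase, or flag it as non-standard."""
    try:
        return f"{code} {HTTPStatus(code).phrase}"
    except ValueError:
        # HTTPStatus raises ValueError for numbers it does not define
        return f"{code} (not a standard status code)"


if __name__ == "__main__":
    # A few codes that are likely to turn up on half-restored archives
    for code in (404, 405, 410, 505):
        print(describe(code))
```

Note that 405 and 410 are easy to mix up: 405 means the request method is not allowed, while 410 means the page is gone for good.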
Angelfire Classic Member Directory
Angelfire Classic Member Directory is provided by Lycos and is a directory of the old Angelfire sites.
Geocities.ws
Geocities.ws has attempted to archive or mirror whatever is left of the old Geocities sites. The archive can be searched using the form on their homepage and if an old site belonging to you is still there, they will enable the editing of it. The site also has an archive of FortuneCity sites.
Internet Archive
Since 1996, the Internet Archive has been collecting and saving webpages and other material. It now has 835 billion web pages that can be accessed through the Wayback Machine.
Old'aVista
Old'aVista was written by Eric Mackrodt in the style of the old AltaVista search engine which ran from 1995 to 2013. The search engine uses the Internet Archive to bring results from many of the old hosting services. Eric explains why he made the utility at Why Make This Site? and in his YouTube video.
Oocities
Oocities is another archive of old Geocities sites.
Restorativland
Restorativland is an ambitious project aiming to "excavate shut down, abandoned web ruins and restore them to surfable, visually accessible, searchable, remixable condition." The sites they are attempting to restore are AOL Hometown, Fortune City, Geocities, Geocities Japan, and Myspace Music.
The Old Net
The Old Net understands about the old web and offers a search portal to the Internet Archive as well as lists of available sites from many of the old hosting providers.
YToo!
YToo! is a project portal page made to resemble the old Yahoo! page. One feature it has is a search function that searches the Old'aVista, Oocities, Wiby, and Marginalia sites. It can also search using FrogFind! (GitHub), which is a lightweight search engine but has a few problems, including invalid SSL certificates.
Google and Bing
Both Google and Bing can use the site: keyword to only return results from certain sites. Both also use the OR operator to allow multiple sites to be searched. The usual form of writing this is:
site:yoursite.com search-term(s)
To search multiple sites, what works for both Google and Bing is to use multiple site: keywords, separated by OR, with all of them inside parentheses, such as:
(site:site1.com OR site:site2.com OR site:site3.com) search-term(s)
For example, to search all the old web sites I know still work for the word "pokemon", I used:
(site:50megs.com OR site:tripod.com OR site:angelfire.com OR site:geocities.ws OR site:freeservers.com OR site:oocities.org OR site:oocities.com OR site:restorativland.org) pokemon
There are limits to how long a query to the search engines can be. I could not find any definite information but Google appears to be limited to 32 words or 2,048 characters.
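Typing long multi-site queries by hand is error-prone, so here is a minimal Python sketch that assembles one from a list of sites and checks it against the limits mentioned above. Several things here are assumptions rather than facts: the function names are hypothetical, the 32-word and 2,048-character figures are the unconfirmed ones quoted in the text, and counting words by splitting on whitespace may not match how Google actually counts them.

```python
from urllib.parse import quote_plus

# Unconfirmed limits quoted in the text above; treat them as rough guides only
MAX_WORDS = 32
MAX_CHARS = 2048


def build_site_query(sites, terms):
    """Build '(site:a OR site:b) terms', or 'site:a terms' for a single site."""
    if not sites:
        raise ValueError("at least one site is required")
    site_part = " OR ".join(f"site:{site}" for site in sites)
    if len(sites) > 1:
        site_part = f"({site_part})"
    return f"{site_part} {terms}".strip()


def within_limits(query):
    """Check the query against the assumed limits (whitespace-separated words)."""
    return len(query.split()) <= MAX_WORDS and len(query) <= MAX_CHARS


def google_url(query):
    """Return a Google search URL with the query URL-encoded."""
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    old_web = [
        "50megs.com", "tripod.com", "angelfire.com", "geocities.ws",
        "freeservers.com", "oocities.org", "oocities.com", "restorativland.org",
    ]
    query = build_site_query(old_web, "pokemon")
    print(query)
    print("Within assumed limits:", within_limits(query))
    print(google_url(query))
```

Because every site: keyword and every OR counts as a word under this assumption, eight sites plus one search term already use 16 of the 32 words, so longer site lists may need to be split across several searches.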
Search Engine Lists
Search Engine Links
Search Engine Links is an interesting site as it attempts to list search engines and directories by country. It is not entirely successful, as many of the links, whatever they were originally like, now go to news or magazine type sites, but there are some interesting finds on the site. I could not find any listing for small web, non-commercial site search engines, but I did come across some interesting things in the Subjects section, such as the Calvin & Hobbes search engine. In the Webcams sub-category, I watched a wedding live from the Elvis Wedding Chapel, Las Vegas, Nevada, courtesy of Earth Cam.
Search engines with their own indexes
Search engines with their own indexes by Rohan "Seirdy" Kumar is a great article about little-known and unusual search engines.
Wikipedia
Of course, the Wikipedia page List of search engines lists lots of them.
Audio and Image Search
FindSounds
FindSounds searches the web for audio files. It is very good at doing what it does, but makes no assertions at all about the licensing or use of the files it finds.
Openverse
Openverse searches for media in the public domain or with Creative Commons licensing. Results can even be filtered by the type of CC license.
Picryl
Picryl describes itself as "The World's Largest Public Domain Media Search Engine" and it really is good.
TinEye
TinEye is a brilliant reverse image search engine. Simply give it an image to search for and it will search through its collection of 66.8 billion images in seconds to match it. The results can be sorted in a variety of ways.
Forums and Message Boards
Boardreader
Boardreader searches forums and message boards for its results.
Program Code Search
Searchcode
Searchcode is a search engine for finding code, in a variety of languages, on websites.
The Internet of Things
These search engines are scary: they find absolutely anything connected to the internet, even things that really should not be. One of the oldest, Shodan, was dubbed "The scariest search engine on the Internet" in 2013. This is no longer true because, as the article "Still the Scariest Search Engine on the Internet?" points out, there are now many more of this type of search engine. The security blog OSINT ME lists around a dozen of them on "20+ links for IoT and webcam search engines".
If you create an account with Shodan, OSINT ME lists around 100 search queries that should return interesting results.
Shodan offers an API, and security researcher Samy Younsi lists three utilities that make use of it on "3 scary tools that use Shodan search engine".
Webcam Search Engines
The security site OSINT ME provides a list of around a dozen webcam search engines.