
Discovering
The Invisible Web

Presented September 29, 2000
for South Central Regional Library Council
by Kay Benjamin


Gateways ~ Databases ~ Bots ~ Webrings ~ Ask a Question ~ Finding Specialized Search Engines ~ Current Awareness Services ~ Bibliography

 
What is the Invisible Web?

There are many sources of information on the Internet that are not accessible or "visible" to standard web search engines, either because the files are not publicly available or because the pages are not stored as individual files but are instead created on the fly, or dynamically, as the information is requested. This invisible portion of the web is estimated to be many times larger than the portion indexed by general search engines such as Google and InfoSeek. The types of material that aren't indexed include non-HTML files (e.g., PDF files), databases, sites requiring registration, archives (e.g., newspapers and magazines), dynamically created Web pages, and interactive tools (e.g., calculators). Examples of the types of information in this non-indexed portion include:

statistics ~ medical information ~ periodical articles ~ product catalogs ~ addresses & phone numbers ~ legal information ~ image files ~ sound files ~ video files ~ software files ~ dictionaries ~ patents ~ directories ~ genealogies ~ jobs

Gateways to Invisible Web Search Tools
There are many specialized search tools that exist to locate the information on the invisible web. Some of these topic-specific search engines have been gathered into directories by a number of companies, agencies, and individuals. Below are a few of the most noteworthy of these directories. These are designed to guide you to the appropriate search engine, not to the specific information that you may need.
  • AllSearchEngines.com - includes links to both general search engines, and to topic search engines
  • Beaucoup! - over 2500 databases and search engines listed
  • BigHub - consumer-oriented listing of over 1500 specialty databases
  • Complete Planet - indexes the "deep web"
  • DirectSearch - extensive list, added to nearly daily, includes largely databases geared for research and academics
  • FirstGov - a comprehensive directory and search engine to U.S. local, state, & federal government sites
  • Fossick.com - search engines across academic and popular topics
  • Intelliseek - offers "vertical search channels" and allows searching across specialized search engines
  • Internet Oracle - good list of popular engines, such as shopping, travel, entertainment
  • Internets - a collection of databases by category or by concept search
  • InvisibleWeb.com - the best known, and one of the largest, directories of catalogs, databases, search engines
  • Lycos Searchable Databases - another well-known directory
  • Search Engine Guide - nearly 3500 search engines that can be browsed or searched
  • Searchpower.com - allows you to either search the web or find a search engine
  • WebData - lets you search web data (the invisible web), the web, or by topic
Databases

Much of the invisible web consists of databases, or collections of targeted files or links, that have their own search engines. Some sites offer free information, while others require registration, a fee, and a password. This is a sampling of some databases that are particularly useful for librarians.

periodical indexes ~ business ~ digital libraries ~ entertainment ~ images ~ government ~ jobs ~ legal ~ online catalogs ~ reference ~ software ~ statistics

Periodical Indexes
  • ERIC - an index to education journals and documents: free searching, but no full text
  • FindArticles.com - free index to full text of selected journals and magazines online
  • MagPortal - an index to free magazine articles on the web
  • PsychCrawler - APA product that indexes the contents of several psychology web sites, including journals
  • PubMed - index to MEDLINE medical journals: free searching, with selected full text links
  • ResearchIndex - "Earth's largest free full text index of scientific literature"
  • TotallyBusiness.com: Magazine Directory - links to popular magazines from an extensive subject listing
  • UnCoverWeb - indexes over 12,000 magazines and journals across fields: free searching
  • U.S. Internet Government Periodicals - search full-text government periodicals by subject, SuDoc number, or title
Business and Financial Databases
  • Companies Online - search for information on over 100,000 public and private companies
  • Company Profiles - search by company name, ticker symbol, country, or industry to get financial information
  • Hoover's Online - popular site with extensive company information
  • SEC EDGAR Filings - real-time 10K filings, insider trading, and financial reports
Digital Libraries
Digital libraries consist of text documents, maps, images, and sounds that have been digitized for electronic access. Usually these collections include their own internal search engine.
  • Alex - a collection of public domain documents from American and English literature as well as Western philosophy
  • American Memory - digitized versions of holdings at the Library of Congress, including photographs, manuscripts, rare books, maps, recorded sound, and moving pictures
  • Bartleby - includes the searchable text of standard reference books, verse, fiction and non-fiction works
  • netLibrary - a collection of the full text of books: the "public" collection is free
  • On-Line Books Page - "A directory of books that can be freely read right on the Internet." Includes over 10,000 books
Entertainment
  • All Music Guide - search artists, albums, songs, or styles for biography, discography
  • Internet Movie Database - a large database with reviews, descriptions, awards, and more
  • Listen.com - find music and videos to download in real audio, mp3, and other formats
Government Databases
Images
  • Artchive -
  • Clip-Art.com -
  • Web Gallery of Art - a searchable database of European painting and sculpture of the Gothic, Renaissance and Baroque periods (1200-1700), currently containing over 6,500 reproductions.
Jobs
Legal Databases
  • FindLaw: LawCrawler - legal databases, including a law library, Federal and State cases and codes, and legal news
  • LawGuru.com - legal databases, post a question, and other good legal resources
Online Catalogs
Reference
  • CIA World Factbook - standard U.S. government resource to countries of the world
  • Merck Manual - provides "useful clinical information to practicing physicians, medical students, interns, residents, nurses, pharmacists, and other health care professionals in a concise, complete, and accurate manner."
  • Occupational Outlook Handbook - U.S. government publication on jobs with descriptions, earnings, etc.
  • Research-It! - a collection of search engines including dictionaries, converters, translators, quotations, maps, phones
  • xrefer - "Encyclopedias, dictionaries, thesauri & books of quotations from the world's leading publishers"
  • YourDictionary.com - language and subject dictionaries, thesaurus, and other word tools
Software
There are many, many databases of software free for downloading: games, utilities, spreadsheets, word processors, screen savers, graphics tools, backgrounds, and more. Some of it is totally free, and some of it is shareware, software that you may install and try before you decide to pay.
Statistics
Bots
Bots (short for knowledge ro"bots," also known as agents, spiders, or crawlers) are software that searches a specific database. In the case of general search engines, the software searches across fixed pages on the web at large, crawling from link to link and creating a database as it goes. Specialized search engines search only the already existing database or databases to which they have been directed. "Bot" has become a popular term due to the proliferation of shopping bots, which search selected product catalogs and bring back a list of products and prices that match the search, providing a unique opportunity to comparison shop.
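The difference between a general crawler and a specialized search engine can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the miniature "web" in PAGES, the CATALOG_DB list, and the function names are all hypothetical, and real search engines are far more elaborate. The crawler follows links and indexes the pages it reaches, while the catalog records, which no page links to, can only be found through the database's own search.

```python
from collections import deque

# Hypothetical miniature "web": page name -> (page text, links on that page).
PAGES = {
    "home": ("library home page", ["catalog", "about"]),
    "about": ("about the library council", ["home"]),
    "catalog": ("search form for the book catalog", []),
}

# Hypothetical database behind the catalog's search form.
# No page links to these records, so a crawler never reaches them.
CATALOG_DB = ["Moby Dick", "Walden", "Leaves of Grass"]


def crawl(start):
    """Follow links breadth-first and build an index of word -> set of pages."""
    index = {}
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        text, links = PAGES[page]
        for word in text.split():
            index.setdefault(word, set()).add(page)
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def search_catalog(term):
    """A specialized engine searches only its own existing database."""
    return [title for title in CATALOG_DB if term.lower() in title.lower()]


index = crawl("home")
print(sorted(index.get("library", set())))  # pages the crawler indexed for "library"
print(index.get("walden"))                  # None: the record is invisible to the crawler
print(search_catalog("walden"))             # the specialized search finds it
```

The crawler indexes the catalog's search form page but not the records behind it, which is the sense in which database content is "invisible."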
Shopping Bots
  • BestBookBuys - find the best price from 29 online bookstores
  • DealTime - comparison shop for all products
  • SmartBots.Com - an extensive list of shopping bots, includes a search feature of its own
  • Yahoo! Shopping - search for a product of any kind, then sort results by increasing or decreasing price
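As a rough sketch of what a shopping bot does, the Python below searches several product catalogs for a term and sorts the matches by price. The store names, titles, and prices in CATALOGS are invented for illustration, and compare_prices is a hypothetical helper, not the method used by any of the sites listed above.

```python
# Hypothetical catalogs: store name -> {product title: price in dollars}.
CATALOGS = {
    "Store A": {"Walden (paperback)": 7.95, "Moby Dick (paperback)": 9.50},
    "Store B": {"Walden (paperback)": 6.99},
    "Store C": {"Walden (hardcover)": 18.00, "Moby Dick (paperback)": 8.75},
}


def compare_prices(term, descending=False):
    """Return (price, store, title) matches across all catalogs, sorted by price."""
    matches = [
        (price, store, title)
        for store, products in CATALOGS.items()
        for title, price in products.items()
        if term.lower() in title.lower()
    ]
    return sorted(matches, reverse=descending)


# List every match for "walden", cheapest first.
for price, store, title in compare_prices("walden"):
    print(f"{price:6.2f}  {store}  {title}")
```

Passing descending=True reverses the order, mirroring the "increasing or decreasing price" sort described above.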
Other Bots
  • BotSpot - a directory of bots which unfortunately has a lot of dead links, but still a good annotated resource
  • Deja.com - searchable index to Usenet messages
  • FindSounds.com - search the web for sound effects in AIFF, WAVE, and AU formats
  • RootsWeb - searches over 40 major genealogical databases
  • Search Adobe PDF Files Online - searches the summaries of over 1 million PDF files on the Internet
  • SpeechBot - an experimental index of popular US radio shows indexing 7124 hours of content, uses RealPlayer
  • Travelocity - compare travel fares and vacation packages
WebRings
A webring is a group of related websites that link to one another. Webrings, though not specifically part of the invisible web, represent tools to locate web sites that might not otherwise easily be found.
  • Yahoo! WebRing - join a ring, create a ring, or search for a ring on any topic
Ask a Question
Many sites exist that allow the user to post a question to either a database designed to handle natural language, or to a service that routes the question to a human "expert."
Finding Specialized Search Engines
Want some tricks to find specialized databases and search engines on your own? Try these:
  1. Using any standard search engine, type in key words describing the topic and combine with the word database or databases or "search engine" or "invisible web" or bot or bots
    • EXAMPLE: environment and "search engine"
    • EXAMPLE: "endangered species" and database
    • EXAMPLE: food and bots
  2. Librarian's Index to the Internet - this directory includes a category for "databases." Try a search for a keyword(s) and database
    • EXAMPLE: history and database
  3. ScoutReport - in the search box enter the word database to get an annotated list
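The first trick above can be sketched as a small Python helper that pairs a topic with each of the suggested terms. The function names are hypothetical, and the "and" syntax simply copies the examples above; individual search engines may expect a different operator.

```python
# The search terms suggested in the first tip; multi-word terms are quoted as phrases.
FINDER_TERMS = ["database", "databases", "search engine", "invisible web", "bot", "bots"]


def quote(term):
    """Wrap multi-word terms in double quotes so they are searched as a phrase."""
    return f'"{term}"' if " " in term else term


def build_queries(topic):
    """Return one query per finder term, in the form: topic and term."""
    return [f"{quote(topic)} and {quote(term)}" for term in FINDER_TERMS]


# Print the six candidate queries for one topic.
for query in build_queries("endangered species"):
    print(query)
```

Each printed line can be pasted into a standard search engine; trying several variants is useful because sites describe themselves inconsistently.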
Current Awareness Services
There are a number of ways to keep up with new invisible web search tools, along with other useful news about the changing nature of the Web and how it can be searched.
Bibliography