The Invisible Web: Uncovering Information Sources Search Engines Can't See

(01 July 2001)


fmccown's tags for this article

deep-web

Notes for this article

fmccown has 0 private notes and 1 public note for this article.

p. 61 lists types of invisible web content:
1) Disconnected pages
2) Pages consisting primarily of images, audio, or video
3) Pages consisting primarily of PDF, PS, Flash, exe, or compressed files (tar, zip)
4) Content in relational databases
5) Real-time content
6) Dynamically generated content

Of course, this book is somewhat outdated. Google has crawled PDF and PS for some time and now crawls Flash.

fmccown (public note) - 2008-07-14 22:51:53

Abstract

Enormous expanses of the Internet are unreachable with standard Web search engines. This book provides the key to finding these hidden resources by identifying how to uncover and use invisible Web resources. Mapping the invisible Web, when and how to use it, assessing the validity of the information, and the future of Web searching are topics covered in detail. Only 16 percent of Net-based information can be located using a general search engine. The other 84 percent is what is referred to as the invisible Web, made up of information stored in databases. Unlike pages on the visible Web, information in databases is generally inaccessible to the software spiders and crawlers that compile search engine indexes. As Web technology improves, more and more information is being stored in databases that feed into dynamically generated Web pages. The tips provided in this resource will ensure that those databases are exposed and Net-based research will be conducted in the most thorough and effective manner.


