Information Overwhelm

More Pages Than Anyone Can Count?

No one really knows the number of pages available on the Web. New, unique, publicly accessible pages (aka the visible Web) are created every second. Given the enormous amount of information available on both the visible Web and the deep/invisible Web, it is clear that an informed researcher must make limited and careful choices.

Back in the early 2000s, when Google was still new on the scene, estimates of the size of the Internet placed the number of unique Web pages at 2 billion (statistic from Cyveillance, a business intelligence gathering firm, July 2000). Because no central clearing house for Web pages exists, the number of Web pages can only be estimated. At that time, Cyveillance found that 7.3 million unique new pages were going online each day. Looking ahead, they anticipated 4 billion publicly available pages on the Internet by early 2001. In 2003 they revised that number to 6 billion.
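
As a rough check on that projection, the short Python sketch below asks how long it would take to go from 2 billion to 4 billion pages if the reported rate of 7.3 million new pages per day stayed constant. The start date of July 1, 2000 is an assumption (the source gives only "July 2000"), and a constant rate is a simplification, not Cyveillance's actual method.

```python
from datetime import date, timedelta
import math

# Figures quoted in the text (Cyveillance, July 2000).
pages_mid_2000 = 2_000_000_000
new_pages_per_day = 7_300_000
target_pages = 4_000_000_000

# Assumption: the text only says "July 2000", so the first of the month is used.
start = date(2000, 7, 1)

# Whole days needed to add the missing pages at a constant daily rate.
days_needed = math.ceil((target_pages - pages_mid_2000) / new_pages_per_day)
reach_date = start + timedelta(days=days_needed)

print(f"Days needed at a constant rate: {days_needed}")
print(f"Date the 4 billion mark is reached: {reach_date}")
```

Under these assumptions the doubling takes about nine months and lands in the spring of 2001, which is in rough agreement with the "early 2001" forecast.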

A more conservative estimate was offered by the OCLC (Online Computer Library Center, Office of Research) in a report called Trends in the Evolution of the Public Web. According to the 2002 results of the Web Characterization Project's survey, the public Web contained 3,080,000 Web sites, or 35 percent of the Web as a whole. Public sites accounted for approximately 1.4 billion Web pages. The average size of a public Web site was 441 pages.
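
The OCLC figures can be cross-checked with simple arithmetic. The Python sketch below multiplies the number of public sites by the average site size to recover the page total, and uses the 35 percent share to infer the size of the Web as a whole. The variable names are illustrative, and the inferred total is only as precise as the rounded percentage quoted above.

```python
# Figures quoted in the text for the 2002 OCLC survey.
public_sites = 3_080_000
avg_pages_per_site = 441
public_share_of_web = 0.35  # public sites as a fraction of all Web sites

# Pages on the public Web = number of public sites x average pages per site.
estimated_public_pages = public_sites * avg_pages_per_site

# Total number of Web sites implied by the 35 percent share.
implied_total_sites = public_sites / public_share_of_web

print(f"Estimated public pages: {estimated_public_pages:,}")
print(f"Implied total Web sites: {implied_total_sites:,.0f}")
```

The product comes to about 1.36 billion pages, which rounds to the 1.4 billion reported, and the 35 percent share implies roughly 8.8 million Web sites of all kinds.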

The OCLC defined its terms this way: "A public Web site offers to all Web users free, unrestricted access to a significant portion of its content" (OCLC, 2003). Many pages on the Internet remain out of reach unless you are willing to pay. The page you are reading right now is an example of the Deep Web: it requires a subscription.

A decade later, in 2014, the number of Web sites was estimated at close to 1 billion (see graphic above). While attempts to measure the size of the Web may still be found, knowing "the number" isn't really meaningful, since the numbers are so large as to be beyond comprehension. Consequently, measurements now include things like how much and what type of information can be transferred at any given time, the storage capacity of the Internet, and so on.

For an information researcher, the problem was overwhelming a long time ago and has only become more so.

Information Overwhelm Strategy

Ian Jukes coined the term "overwhelm" to describe the researcher's dilemma. On the one hand, a researcher wants to be thorough. On the other hand, there is so much to work with that there simply isn't enough time to read it all. The problem, therefore, is how to locate the right amount and balance of information to do justice to the research. To this end, there are several approaches one can use.

  1. Take the first result. No serious researcher would do this, but students often fall into this trap if not corrected. Search engines make research appear easy. Enter a query and a page of results is provided. Since the first page of results appears to be the most relevant, it seems to take only one or two of them to complete the assignment. Unfortunately, this may backfire with humorous and not-so-humorous results (like the student who was assigned a paper on Jane Austen and instead based her research on an article about Calamity Jane Austin). Since Wikipedia is ranked high in the results, students frequently treat it as the go-to source of information for the one-and-done strategy. Educators seldom value Wikipedia as a trusted source. Following the references provided by Wikipedia is a better strategy. Choosing the first result fails completely when the search requires multiple revised queries to locate relevant information.
  2. Triangulate sources. A safer strategy involves finding three different sources to compare information. Wikipedia references could be a starting point for locating multiple sources. This is part of investigative searching, to verify that the information can be trusted. In the process it is possible to find different facts, and different reasonable perspectives on them. Comparing the work of different authors and publishers helps provide balance. For typical school research projects, this is an acceptable approach. For research toward scholarship, it represents only the beginning. After all, three sources only scratch the surface. How many sources are required depends on the rigor of the scholarship desired. No one will be able to read all the information available on a topic. The best approach is to know when you've found enough. For beginning researchers, that means asking for help and sticking with trusted sources.
  3. Revise queries. This is a more rigorous approach. One query is rarely sufficient to find the information needed for a research purpose, because a single query has inherently limited capacity. Robust searches keep the number of keywords between two and five. The more keywords, the fewer the relevant results. More results allow the researcher to look for relevant information and, at the same time, discover and substitute other relevant keywords. Revised queries using other keywords help locate other sources of information. For more info and practice using substitution strategies, see Keyword Challenges.
  4. Use different search engines. Knowing where expert information is found is a key to keeping overwhelm to a minimum. For this reason, Google is not always the best choice. Different search engines return different results. A vast amount of research is found only with specialized search engines. Many library databases focus on specific fields and are accessible only by subscription. Ask a librarian when you need help.
  5. Avoid unnecessary and suspicious sources. This goes with choosing the right search engines: don't waste time looking in the wrong places. Unless you are conducting social research, avoid social media (Twitter, Facebook and blogs written by anonymous or unknown authors) and information that requires you to evaluate the author or publisher. Evaluation adds time-consuming effort to the research process. Sometimes it is necessary, but using peer-reviewed sources (like subscription journals and books by known authors) eliminates the need to determine who wrote it and whether that person can be trusted. Don't waste time with suspicious sources if credible ones are readily available. Again, ask a librarian in order to save time.
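
The query-revision approach in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `revised_queries`, the sample keywords, and the substitute terms are all hypothetical, and a real researcher would discover substitutes by reading the results rather than from a prepared table.

```python
from itertools import product


def revised_queries(keywords, substitutes, min_terms=2, max_terms=5):
    """Yield query strings built by swapping keywords for alternative terms.

    keywords:    the starting keywords, kept between min_terms and max_terms
    substitutes: hypothetical mapping from a keyword to terms that could replace it
    """
    if not min_terms <= len(keywords) <= max_terms:
        raise ValueError(f"use between {min_terms} and {max_terms} keywords")
    # For each position, the original keyword comes first, then its substitutes.
    options = [[kw] + substitutes.get(kw, []) for kw in keywords]
    # Every combination of choices is one candidate query.
    for combo in product(*options):
        yield " ".join(combo)


# Illustrative keywords and substitutes (assumptions, not from the article).
keywords = ["Austen", "novels", "criticism"]
substitutes = {"novels": ["fiction"], "criticism": ["analysis", "reviews"]}

queries = list(revised_queries(keywords, substitutes))
for query in queries:
    print(query)
```

The first query printed is the original one, and each later line swaps in at least one substitute, so the searcher has a ready list of revised queries to try without exceeding five keywords.
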
Avoid overwhelm. Have a plan.

Authored by Dennis O'Connor (2004) and Carl Heine (2016)