Searching Other Formats


How Can You Search For Documents In Non-HTML Formats?


Search engines were first built to index HTML pages for keyword searching. HTML pages are often rich in text and follow common conventions that make them easy to index. Before 2000, search engine crawlers ignored non-HTML files, which were relatively rare and often contained little text. As search technology matured and market pressures created a more competitive environment, the large commercial search engines began to index non-HTML file formats.

Since then, search engines have added PDF, Microsoft Office, image, video, and audio files, among others, to their indexes. Major search engines make it easy to do simple searches for non-HTML files. For a while, new, specialized search tools were developed that focused on multimedia formats. Today, however, any major search engine handles all of these formats.

Which search engines should I use to find non-HTML files?

Use any major search engine.

What are file extensions?

Different computer applications use different file types. The name of the file is followed by a period (sometimes referred to as a "dot") and the file extension. Most file extensions are two to four letters long. Computer applications use these extensions to identify usable files. You can usually change a file's name, but it is best not to change the file extension.

Figure: image.jpg, with the file name, the period (dot), and the file extension indicated
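The parts of a file name described above can also be separated in code. The following is a minimal Python sketch using the standard pathlib module; the function name split_name is made up for this illustration.

```python
from pathlib import PurePath
from typing import Tuple


def split_name(filename: str) -> Tuple[str, str]:
    """Return (name, extension) for a file name such as 'image.jpg'."""
    path = PurePath(filename)
    # stem is everything before the last dot; suffix is the dot plus the extension
    return path.stem, path.suffix


print(split_name("image.jpg"))         # ('image', '.jpg')
print(split_name("report.final.pdf"))  # ('report.final', '.pdf')
```

Only the part after the last dot counts as the extension, which is why a name can safely contain other dots.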

What are some common non-HTML file types?

Image, video, audio and text files are available in a number of different file types. Some multimedia file types work with both Windows and Macintosh computers. Others are platform specific. The following list is by no means exhaustive.

File Extension | Type of File | Type of Computer
.aiff | Sound file | Mac
.au | Sound file | Mac
.avi | Video | Windows
.bmp | Image file (Bitmap) | Windows
.gif | Image file (Graphics Interchange Format) | Mac & Windows
.png | Image file (Portable Network Graphics, a raster format) | Mac & Windows
.jpg, .jpeg, .jpe | Image file, a common graphics format on the Internet | Mac & Windows
.mid | Sound file for use with MIDI sound mixing systems | Mac & Windows
.mpeg, .mpg | Video format (Moving Picture Experts Group) | Mac & Windows
.mp3 | Music file, requires an MP3 player | Mac & Windows
.ppt | PowerPoint presentation | Mac & Windows
.ram, .ra | RealAudio file | Mac & Windows
.swf | Macromedia (later Adobe) Flash | Mac & Windows
.pdf | Adobe Acrobat file | Mac & Windows
.doc | Microsoft Word document | Mac & Windows
.txt | Plain text document | Mac & Windows
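A list like the one above translates naturally into a lookup table. The sketch below copies only a handful of entries for illustration; FILE_TYPES and describe are hypothetical names, and a real tool would need a far longer list.

```python
from pathlib import PurePath

# A few entries taken from the list above (illustrative, not exhaustive).
FILE_TYPES = {
    ".gif": "Image file",
    ".jpg": "Image file",
    ".jpeg": "Image file",
    ".mp3": "Music file",
    ".mpg": "Video",
    ".pdf": "Adobe Acrobat file",
    ".doc": "Microsoft Word document",
    ".txt": "Plain text document",
}


def describe(filename: str) -> str:
    """Look up a file's type by its extension, ignoring upper/lower case."""
    extension = PurePath(filename).suffix.lower()
    return FILE_TYPES.get(extension, "Unknown file type")


print(describe("Holiday.JPG"))  # Image file
print(describe("notes.xyz"))    # Unknown file type
```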

How do I search for a specific type of file?

The filetype: operator limits the search to the type of file you specify. For example, the query filetype:pdf returns only PDF files.
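To show how the operator combines with ordinary keywords, here is a minimal Python sketch that builds a query string and a search URL. The function names are invented for this example, and the Google base URL with its q parameter is an assumption; other search engines may use a different address or operator syntax.

```python
from urllib.parse import urlencode


def filetype_query(keywords: str, extension: str) -> str:
    """Combine keywords with a filetype: restriction, e.g. 'solar energy filetype:pdf'."""
    ext = extension.lstrip(".").lower()  # accept '.PDF', 'pdf', and so on
    return f"{keywords} filetype:{ext}".strip()


def search_url(query: str, base: str = "https://www.google.com/search") -> str:
    """Build a search URL; the base address and 'q' parameter are assumptions."""
    return f"{base}?{urlencode({'q': query})}"


query = filetype_query("solar energy", ".PDF")
print(query)              # solar energy filetype:pdf
print(search_url(query))  # https://www.google.com/search?q=solar+energy+filetype%3Apdf
```

Typing the same query text directly into the search box has the same effect; the URL form is only useful when a search is generated by a program or saved as a link.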

How can I save a file to my local computer?

In general, to save a file from the Internet:

  • Point at the desired link, image, or file name, and right-click
  • Choose Save Target As or Save Image As (the exact wording varies by browser, for example Save Link As)
  • Direct the file to your desktop or to a designated folder

Depending on how you configure your Internet browser, simply clicking on a file may start the file-saving procedure. Be sure to note the file name and where you are saving it in your system to avoid the frustration of losing your new file after you've taken the trouble to download it.
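The same save procedure can be scripted. The following Python sketch uses only the standard library; local_name_for and save_file are hypothetical helper names, and the example.com address is a placeholder rather than a real file. It returns the full path of the saved file so that you always know where the download went.

```python
import shutil
from pathlib import Path, PurePosixPath
from urllib.parse import unquote, urlparse
from urllib.request import urlopen


def local_name_for(url: str, default: str = "download") -> str:
    """Derive a local file name from the last path segment of a URL."""
    name = PurePosixPath(unquote(urlparse(url).path)).name
    return name or default  # fall back when the URL has no file name


def save_file(url: str, folder) -> Path:
    """Download url into folder and return the full path of the saved file."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / local_name_for(url)
    # Stream the response to disk instead of loading it all into memory
    with urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return target


print(local_name_for("https://example.com/docs/report.pdf"))  # report.pdf
# Placeholder address; uncomment and substitute a real URL to download:
# saved = save_file("https://example.com/docs/report.pdf", Path.home() / "Downloads")
# print(f"Saved to {saved}")
```

This sketch overwrites an existing file with the same name and does not scan the download for viruses.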

What about downloading a virus along with a file?

You can 'catch a virus' from downloaded files. Search engines can locate web pages and files for you, but they do not check for viruses. Any file you download from the Internet could carry a virus. Keep your virus scanning software up to date. Although many viruses spread through email attachments or Word documents, it pays to scan all files before opening them.

Alternatively, Google's HTML view feature, where it is available, allows you to read many file formats without downloading them, which reduces your exposure to viruses.

What about copyright?


All files and web pages on the Internet are subject to copyright laws. Image, video, and audio files are often specifically copyrighted and cannot be used without explicit permission from the author. Search engines leave it up to the user to determine copyright details and take the necessary steps to ensure fair use. (For more, see the Copyright MicroModule.)

Content authored by Dennis O'Connor 2003 | modified 2017 by Carl Heine