UNH Search FAQ:
HTML META tags for search engines.
author and contact: cwis.admin@unh.edu
updated 04-FEB-1999


HTML META tags let a page author supply supplemental information in the HEAD section of a document. Web browsers do not display this information, but other applications, such as Web servers and search engines, may use it. For more information on META tags, with links to other resources, see Search Engine Watch. Several free online META tag analysis tools are available. Meta Medic offers a META tag syntax checker (along with much advertising). Keywordcount.com reports keyword use relative to the size of your document and in comparison with someone else's document, which you identify by URL.

Infoseek not only indexes META keywords and description information, it also gives a relevancy boost to information in these fields. Most other search engines make similar use of META keywords; for a list of how the most popular search engines use META tags in both ranking and display, see the search engine features chart at Search Engine Watch. While most keyword use is well intentioned and effective, the ranking boost also invites keyword abuse, and with it a technological arms race between abusers and search engine countermeasures. As the Infoseek sysadmins at UNH, we say, "Please don't abuse." If moral suasion doesn't work in flagrant cases, we reserve the right to exclude such pages from the index. Infoseek also includes an automatic keyword spamming detector.

As with all HTML tag (element) and attribute names, case is not significant for META tags. Multiple META tags are allowed. Each META tag is distinguished by the values of its NAME= and CONTENT= attributes (there are other attributes, outside the discussion here). META tags are placed in the HEAD section of an HTML document and may span multiple lines. In the examples below we begin each attribute on a new line, an arbitrary convention that we find easier for humans working on the code; you can place the complete tag on one line if you wish. We recommend that you limit the CONTENT to 1000 characters. There is no closing /META tag.
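A tag written on a single line, with upper-case tag and attribute names, is equivalent to the multi-line, lower-case form used in the examples below. A minimal sketch, with an illustrative CONTENT value:

```html
<!-- One-line form; equivalent to placing each attribute on its own line. -->
<!-- The CONTENT value is only an illustration. -->
<META NAME="keywords" CONTENT="meta tags,infoseek,authoring">
```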

Keywords.
You can define a comma-delimited list of keywords.
<meta name="keywords"
      content="meta tags,infoseek,authoring">
Keywords are also useful for defining your own special collection of files within the UNH Intranet collection of Web pages, provided you choose a keyword that is unique to your pages. This is discussed under customized search forms. See also the information on Infoseek's relevance scoring.

Description.
This allows you to associate a phrase, a sentence, or several sentences of description with your Web page. Engines such as Infoseek give this text special attention.
<meta name="description"
      content="How to become an Infoseek power user.">
See also the information on Infoseek's relevance scoring.

Indexing instructions.
You can supply page-by-page instructions to search engine spiders as to whether they should index your page (index or noindex) and whether they should follow links on it (follow or nofollow). This gives you, as the page author, control over the indexing of your pages for those search engines that support this convention (Infoseek does). See the related discussion of robots.txt files.
<meta name="robots"
      content="noindex,nofollow">
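The two instructions can also be mixed. As a sketch, assuming the spider honors this convention, the following asks it to index the page itself but not to follow the links on it:

```html
<!-- Index this page, but do not follow its links. -->
<meta name="robots"
      content="index,nofollow">
```

The reverse combination, noindex,follow, would ask the spider to skip the page itself while still following its links.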

Refreshing data.
This is not directly a search engine issue, but while adding META tags you should be aware of this feature and consider using it selectively, with the emphasis on selective. You can ensure that a reader gets a fresh copy of a Web page every time a browser displays it by having the page instruct the browser not to cache it. This is done with the Pragma instruction.
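A minimal sketch of such a tag follows. Note that it uses the HTTP-EQUIV attribute in place of NAME. Browser and proxy support for it varies, so treat it as a request to the browser and not a guarantee:

```html
<!-- Ask the browser not to cache this page (support varies by browser). -->
<meta http-equiv="Pragma"
      content="no-cache">
```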

Extended use.
You can create your own, self-defined META tags, but consider the benefits of fitting your needs to one of the standards-based schemes that exist.

Several proposed schemes provide additional structured information ("metadata elements") about a Web page, so that a search engine need not rely on the document content itself to be self-descriptive. Infoseek supports the Dublin Core system, which is described in RFC 2413. For example, you could identify the authorship of your pages with:
<meta name="DC.creator"
      content="Clytemnestra">
You could then find all of your pages with this search:
DC.creator:Clytemnestra
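Other Dublin Core elements defined in RFC 2413, such as Title and Date, can be written in the same pattern. The sketch below uses this FAQ's own title and update date as values. Whether a given search engine indexes each of these fields is an assumption you should verify:

```html
<!-- Dublin Core Title and Date elements, written like DC.creator above. -->
<!-- Values are illustrative; field-level search support is an assumption. -->
<meta name="DC.title"
      content="HTML META tags for search engines">
<meta name="DC.date"
      content="1999-02-04">
```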

More on Metadata and Resource Discovery.
Metadata is defined as data about data and is a very active area for proposals and discussion. The idea is to go beyond just keywords to develop fields of information that describe content. See the metadata page at W3C for links to current activity. See also the Digital Libraries Initiative Projects funded by NSF, DARPA, and NASA.
