UNH Search FAQ:
The discovery process for new pages.
author and contact: cwis.admin@unh.edu
updated 08-DEC-1998
General promotion issues.
When you prepare a new Web server, site, or home page, you need
to think about the discovery process by which
search engines will index your pages.
What follows
is specific to our campus search
engines, though the general issues apply to
the well-known Internet-wide search engines
and directories as well.
The default discovery process is to wait for
the search engine spider to eventually
crawl (walk) to your pages by following hyperlinks on other pages.
That could take weeks, and it will never
happen if no page that the spider visits
links to yours.
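
The effect of link-based discovery can be illustrated with a small
sketch. The link graph, the page names, and the crawl function below
are hypothetical and do not describe how any particular search engine
is implemented; they only show that a spider following links never
reaches a page that no visited page links to.

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "/index.html": ["/dept/a.html", "/dept/b.html"],
    "/dept/a.html": ["/dept/b.html"],
    "/dept/b.html": ["/index.html"],
    "/new/home.html": ["/index.html"],  # links out, but nothing links to it
}


def crawl(start, links):
    """Return the set of pages reachable from start by following links."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


found = crawl("/index.html", LINKS)
print(sorted(found))
print("/new/home.html" in found)
```

In this sketch /new/home.html stays out of the result until some page
the spider already visits adds a link to it, which is why requesting a
visit or asking for cross-links matters.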
To promote or advertise your new collection of pages
in Infoseek's index of the UNH Intranet,
you probably want to
request a visit as described below.
Webinator is a special case for the
Pubpages collection of pages.
For the
Internet-wide search engines and directories,
either consult the individual sites for
information or use one of the services that
help to promote your pages,
such as the
Submit It! site, the
Web Site Garage's
Register-It and Hitometer services, or the
WebPosition Analyzer,
which looks at ways to rank near the top of
search engine results.
Privacy?
The flip side of discovery, of course, is to
remain undiscovered, i.e., restricted or confidential
to some degree. Search engines operate on the
reasonable premise that Web pages are intended
as publicly accessible documents unless
restrictions are applied by the page owner.
See the Search FAQ items on what is
inherently not indexed,
how to apply
limits as a page author,
and how to apply
limits as a Web server manager.
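
The FAQ items mentioned above are not reproduced here. As a rough
illustration only, and assuming the limits are expressed with the
widely used Robots Exclusion convention (a robots.txt file at the
server root), the sketch below checks which URLs a well-behaved spider
would be allowed to fetch. The host name, the paths, and the rules are
hypothetical; whether the campus search engines honor these exact rules
is an assumption, not something this page states.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content a server manager might publish.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
# parse() accepts the file's lines directly, so no network access is needed.
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical URLs on a hypothetical server.
public_page = "http://www.example.edu/index.html"
private_page = "http://www.example.edu/private/notes.html"

print(parser.can_fetch("*", public_page))
print(parser.can_fetch("*", private_page))
```

A rule like this only asks spiders to stay away. It is not access
control, so confidential material still needs real restrictions on the
server.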
Infoseek discovery process.
Infoseek's spider continually crawls (walks) the
UNH Intranet to index pages, as described
in the Search FAQ on
what is included in the collection.
Infoseek automatically discovers new links
and obsolete links in the crawling process,
supplemented by requests to
add or delete specific pages.
When Infoseek finds a link to any location on
a newly discovered server (new URL in the
unh.edu domain), it automatically tries
to index the server contents by going
to the root URL for that server.
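
The step from "a link somewhere on a new server" to "the root URL for
that server" can be sketched as follows. This is a minimal
illustration, not Infoseek's actual logic: the function name, the set
of known hosts, and the host newlab.unh.edu are hypothetical.

```python
from urllib.parse import urlsplit

CAMPUS_DOMAIN = "unh.edu"  # domain named in the text


def root_url_if_new_server(link, known_hosts):
    """Return the root URL to visit if link points at an unseen
    server in the campus domain, otherwise None."""
    parts = urlsplit(link)
    host = (parts.hostname or "").lower()
    in_domain = host == CAMPUS_DOMAIN or host.endswith("." + CAMPUS_DOMAIN)
    if not in_domain or host in known_hosts:
        return None
    known_hosts.add(host)  # remember the server so it is queued only once
    return f"{parts.scheme}://{parts.netloc}/"


known = {"pubpages.unh.edu"}
first = root_url_if_new_server("http://newlab.unh.edu/projects/page.html", known)
second = root_url_if_new_server("http://newlab.unh.edu/other.html", known)
print(first)
print(second)
```

The first call yields the server's root URL, and the second yields
nothing because the server is already known. Note that a crawl from the
root only helps if the root page links to the rest of the site.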
You can also use Infoseek to discover (search for) related
Web pages. You may want to make the owners of those pages
aware of your Web
presence and ask them to cross-link
to your pages
(well-designed pages should include an
address for the owner).
Webinator discovery process.
Webinator
indexes the Pubpages server
(pubpages.unh.edu) based on the index of home pages that
is maintained on that server
(that list also explains how to be de-listed).
If you are not on that list, then you have probably
set up your Web page
subdirectory incorrectly and should
review the requirements.