Google indexing again (was Re: new mailing list: fogf)

On Fri, Feb 02, 2001, Gerald Oskoboiny wrote:
> For some reason, some of their servers keep losing pages from
> their indexes: my home page disappeared completely for a while,
> and now the W3C html validator isn't there [1], and that used to
> be the #1 result for a search for "HTML". [2] Hey, now the #1
> result for "HTML" is my old validator at the U of Alberta.

I sent them an email about that two weeks ago and just got a reply
this afternoon. Unfortunately, it was the standard reply (I think that
I already complained about a similar problem a while back and got the
same one). I am copying it here; I don't think that disclosing an
automated reply is a violation of netiquette:

| Every time we update our database of web pages, our index invariably
| shifts: We find new sites, we lose some sites, and sites ranking may
| change.

Huh! I wonder how updating could make them lose sites.

| Here's some more information about how Google ranks pages: Google
| finds most of its pages when our robots crawl the web and jump from
| page to page via hyperlinks. The best way to ensure listing on
| Google is for a page to be linked from lots of other pages.

I think that the W3C HTML validator is linked from a few other pages.
:-)
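
To make the "jump from page to page via hyperlinks" idea concrete, here
is a minimal sketch of a link-following crawl that also counts inbound
links. The link graph, the host names and the crawl() function are all
made up for illustration; nothing here is Google's actual code, and real
ranking is surely more involved than a raw link count.

```python
from collections import Counter, deque

# Hypothetical toy web: page -> pages it links to (made-up data).
LINKS = {
    "a.example": ["validator.example", "b.example"],
    "b.example": ["validator.example"],
    "validator.example": ["a.example"],
    "orphan.example": ["a.example"],  # no page links to this one
}

def crawl(seed, links):
    """Breadth-first walk from `seed`, following hyperlinks.

    Returns the set of pages discovered and a count of how many
    links from crawled pages point at each page.
    """
    seen = {seed}
    queue = deque([seed])
    inbound = Counter()
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            inbound[target] += 1
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen, inbound

found, inbound = crawl("a.example", LINKS)
```

In this toy graph the page that nothing links to is never discovered,
while the heavily linked one collects the highest inbound count.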

[ useless crap ]
| If your page does not appear at all, there is another possible
| explanation.  Sometimes websites are not reachable when we tried to
| crawl them. We try to crawl a site multiple times, but if the site
| is not reachable, that can cause it to be left-out of the current
| index.  If that was a transient problem, the site will likely show
| up in the next index.

Hmmm... I don't believe that, because the other page that
disappeared, the one I complained about originally, was on my site,
which is never down. ;-)
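
For what it's worth, the quoted explanation would account for lost
sites if each update builds a fresh index from scratch. A minimal
sketch under that assumption (rebuild_index(), the reachability
callbacks and the host names are hypothetical, not anything Google has
described):

```python
def rebuild_index(urls, is_reachable, max_attempts=3):
    """Build a fresh index from scratch.

    A URL is kept only if at least one of `max_attempts` crawl
    attempts succeeds; nothing is carried over from older indexes.
    """
    index = set()
    for url in urls:
        for attempt in range(max_attempts):
            if is_reachable(url, attempt):
                index.add(url)
                break
    return index

urls = ["stable.example", "flaky.example"]

# First crawl: every site answers.
old_index = rebuild_index(urls, lambda url, attempt: True)

# Second crawl: assume 'flaky.example' is down for every attempt.
new_index = rebuild_index(urls, lambda url, attempt: url != "flaky.example")

lost = old_index - new_index
```

Under this assumption a page drops out only when every attempt fails,
so it would not explain a page vanishing from a site that was up the
whole time.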

I think that they have a problem in their indexing process. I would
actually like to know more about how Google works. I would like to
know in detail about indexing, storage of information, search
algorithms, clustering, etc.

--
Hugo Haas <[email protected]> - http://larve.net/people/hugo/
Mais alors, tout se recoupe !

HURL: fogo mailing list archives, maintained by Gerald Oskoboiny