Re: BusinessWeek: Will Google's Purity Pay Off?

On Sat, Dec 09, 2000, Gerald Oskoboiny wrote:
[..]
> > Even though Google is seeing queries grow at a rate of 20% a
> > month, Brin and Page admit that the company makes less cash per
> > search query than AltaVista or Northern Light. Supporting a
> > research staff of 100, including 30 PhDs, might have something to
> > do with this. But Google has spent less money developing an
> > ad-sales staff than other search engines.
>
> 30 PhDs? What are they all working on? Seems fairly straightforward
> from here :)

Not on HTTP/1.1 support for their robot!

Reading the logs of my Web site, I am surprised by how many crawlers
only talk HTTP/1.0:
- Googlebot/2.1 (+http://www.googlebot.com/bot.html)
- Slurp/si ([email protected]; http://www.inktomi.com/slurp.html)
- Lycos_Spider_(T-Rex)
- FAST-WebCrawler/2.2-pre27 ([email protected];
 http://www.fast.no/faq/faqfastwebsearch/faqfastwebcrawler.html)
- Scooter2_fr0-1.0
- TV34_Mercator_n2s3_A-1.0
- JennyBot/0.1
- Mercator-2.0
- htdig/3.1.5 ([email protected])
- Spinne/2.0
- tv33_Merc_ep-1.0
- roach.smo.av.com-1.0
- Unlost Web Crawler 2.0.1.4
- TV35_Mercator_6-1.0
- Scooter2_Mercator_3-1.0
- etc

The only robots which speak HTTP/1.1 that I found are:
- Gulliver/1.3
- appie/1.1
- htdig/3.1.5 ([email protected])
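
For anyone who wants to run the same check on their own logs, here is a
minimal sketch. It assumes the server writes Apache "combined" format log
lines; the sample lines, the regular expression and the function name
agents_by_http_version are illustrative and not taken from the logs
discussed above.

```python
import re
from collections import defaultdict

# Picks the protocol version and the user agent out of an Apache
# "combined" log line (assumed format; adjust for other servers).
LINE_RE = re.compile(
    r'"(?:[A-Z]+) \S+ (HTTP/\d\.\d)" \d{3} \S+ "[^"]*" "([^"]*)"'
)


def agents_by_http_version(lines):
    """Map each HTTP version to the set of user agents that used it."""
    seen = defaultdict(set)
    for line in lines:
        match = LINE_RE.search(line)
        if match:  # lines in another format are silently skipped
            version, agent = match.groups()
            seen[version].add(agent)
    return dict(seen)


# Made-up sample lines: addresses, paths, sizes and timestamps are invented.
SAMPLE = [
    '192.0.2.1 - - [17/Dec/2000:10:00:00 +0000] "GET / HTTP/1.0" 200 512 '
    '"-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '192.0.2.2 - - [17/Dec/2000:10:01:00 +0000] "GET /robots.txt HTTP/1.1" '
    '200 64 "-" "Gulliver/1.3"',
    '192.0.2.1 - - [17/Dec/2000:10:02:00 +0000] "GET /a.html HTTP/1.0" 200 '
    '900 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
]

if __name__ == "__main__":
    for version, agents in sorted(agents_by_http_version(SAMPLE).items()):
        print(version, sorted(agents))
```

Using a set per version means a robot that visits often is listed once,
which is what you want for this kind of inventory.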

Some of the robots could use the persistent connection feature to crawl
the Web faster. I am not sure it would help all of them, because some
of them get a page every hour, so it wouldn't really be worthwhile. :-)
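
To illustrate what a persistent connection buys a crawler, here is a
minimal sketch using Python's standard library. It starts a throwaway
local HTTP/1.1 server and fetches three pages over one TCP connection.
The handler, the paths and the helper names (KeepAliveHandler, crawl,
demo) are made up for the example; this is not how any of the robots
above are implemented.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    # Answering as HTTP/1.1 lets http.server keep the connection open.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = ("page " + self.path).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Content-Length tells the client where the response ends,
        # which is what makes reusing the connection possible.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def crawl(host, port, paths):
    """Fetch all paths over one connection; return (bodies, local ports)."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    bodies, ports = [], []
    try:
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            # The body must be read fully before the next request.
            bodies.append(response.read().decode("ascii"))
            ports.append(conn.sock.getsockname()[1])
    finally:
        conn.close()
    return bodies, ports


def demo():
    server = HTTPServer(("127.0.0.1", 0), KeepAliveHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return crawl("127.0.0.1", server.server_port, ["/a", "/b", "/c"])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    bodies, ports = demo()
    print(bodies)
    print("distinct local ports used:", len(set(ports)))
```

The local port stays the same across the three requests, so only one
TCP handshake was paid for. An HTTP/1.0 robot without keep-alive would
open a new connection for each page.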

--
Hugo Haas <[email protected]> - http://larve.net/people/hugo/
What kind of side dishes will we be enjoying this evening with our
frozen waffles?

Re: BusinessWeek: Will Google's Purity Pay Off?

On Sun, 17 Dec 2000, Hugo Haas wrote:

> > 30 PhDs? What are they all working on? Seems fairly straightforward
> > from here :)
>
> Not on HTTP/1.1 support for their robot!
>
>
> Some of the robots could use the persistent connection feature to crawl
> the Web faster. I am not sure it would help all of them, because some
> of them get a page every hour, so it wouldn't really be worthwhile. :-)

Well, most of the major ones get a page at least every minute, and a real
HTTP/1.1 server will keep the connection alive ;)

Btw, I found a reference on the problem of running MT suid programs under
Linux [1]. To sum things up, Linux threads are separate processes with
their own uid/gid. Changing the uid in one thread should change the uid of
the process, i.e. all the threads should change their uid. This is the
case on Solaris, but not on Linux, so you can't run an MT server on Linux
on a port < 1024 without being and staying root. And it is not yet fixed
in 2.4.0-test11. At least Linus knows about it, but if it takes ages to
fix, I'll try to move from Linux to Solaris 8 x86.

[1] http://lists.insecure.org/linux-kernel/2000/Aug/3326.html
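
For context, the usual pattern for a server on a port < 1024 is to bind
while still root and then give up root. Here is a minimal single-process
sketch of that pattern in Python. The function name bind_then_drop and
the uid/gid values in the usage note are hypothetical. It drops
privileges before any thread exists, so it only shows the ordering; it
does not reproduce or work around the per-thread uid behaviour described
above.

```python
import os
import socket


def bind_then_drop(port, uid=None, gid=None, host="127.0.0.1"):
    """Bind a listening socket, then optionally give up root for good."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))  # ports below 1024 normally require root here
    sock.listen(5)
    if uid is not None:
        if gid is not None:
            os.setgroups([])  # drop supplementary groups while still root
            os.setgid(gid)    # group first: it needs privileges we are about to lose
        os.setuid(uid)        # from here on the process cannot become root again
    return sock


if __name__ == "__main__":
    # Unprivileged demo: port 0 asks the kernel for any free port and no
    # uid is given, so nothing is dropped.
    listener = bind_then_drop(0)
    print("listening on port", listener.getsockname()[1])
    listener.close()
```

A real server run as root would call something like
bind_then_drop(80, uid=65534, gid=65534), where 65534 is only an assumed
"nobody" id, and would create its worker threads afterwards so that
every thread starts with the unprivileged uid.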

--
~~Yves
"Baroula que barouleras, au tiéu toujou t'entourneras."

HURL: fogo mailing list archives, maintained by Gerald Oskoboiny