Study finds Web bigger than we think

http://news.cnet.com/news/0-1005-200-2356979.html

> Study finds Web bigger than we think
> By The Associated Press
> Special to CNET News.com
> July 26, 2000, 11:00 p.m. PT
>
> The Internet has become so large so fast that sophisticated
> search engines are just scratching the surface of the Web's vast
> information reservoir, according to a new study released today.
>
> The 41-page research paper, prepared by a South Dakota company
> that has developed new software to plumb the Internet's depths,
> estimates the Web is 500 times larger than the maps provided by
> popular search engines like Yahoo, AltaVista and Google.com.
>
> These hidden information coves, well-known to the Net-savvy, have
> become a tremendous source of frustration for researchers who
> can't find the information they need with a few simple
> keystrokes.
>
> "These days it seems like search engines are a little like the
> weather: Everyone likes to complain about them," said Danny
> Sullivan, editor of SearchEngineWatch.com, which analyzes search
> engines.
>
> For years, the uncharted territory of the Web has been dubbed the
> "invisible Web."
>
> BrightPlanet, the Sioux Falls start-up behind today's report,
> describes the terrain as the "deep Web" to distinguish it from
> the surface information captured by Internet search engines.
>
> "It's not an invisible Web anymore. That's what's so cool about
> what we are doing," said Thane Paulsen, BrightPlanet's general
> manager.
>
> Many researchers suspected that these underutilized outposts of
> cyberspace represented a substantial chunk of the Internet, but
> no one seems to have explored the Web's back roads as extensively
> as BrightPlanet.
>
> Deploying new software developed in the past six months,
> BrightPlanet estimates there are now about 550 billion documents
> stored on the Web.
>
> Combined, Internet search engines index about 1 billion pages.
> One of the first Web search engines, Lycos, had an index of
> 54,000 pages in mid-1994.
>
> While search engines obviously have come a long way since 1994,
> they still miss most of the Web because an increasing amount of
> information is stored in giant, evolving databases set up by
> government agencies, universities and corporations.
>
> Search engines rely on technology that generally identifies
> "static" pages, rather than the "dynamic" information stored in
> databases.
>
> This means that general-purpose search engines will guide Web
> surfers to the home sites that house huge databases, but finding
> out what's in them requires additional queries.
>
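
The static/dynamic distinction is easy to see in code. Below is a
minimal sketch in modern Python (not BrightPlanet's software, and
anachronistic for a 2000 article) of the link-following approach
crawlers use: it reaches only pages connected by plain href links,
and does nothing with the query forms that front databases, so
everything behind those forms stays "deep".

    import urllib.request
    from html.parser import HTMLParser

    class LinkParser(HTMLParser):
        """Collects href targets from <a> tags; ignores <form> tags."""
        def __init__(self):
            super().__init__()
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)
            # A <form> marks a database query interface. A crawler
            # that only follows links has nothing to submit here, so
            # the records behind the form are never fetched.

    def crawl(url, seen, depth=2):
        """Follow static links recursively up to a small depth."""
        if depth == 0 or url in seen:
            return
        seen.add(url)
        try:
            page = urllib.request.urlopen(url, timeout=10)
            html = page.read().decode("utf-8", "replace")
        except OSError:
            return
        parser = LinkParser()
        parser.feed(html)
        for link in parser.links:
            if link.startswith("http"):
                crawl(link, seen, depth - 1)

    seen = set()
    crawl("http://example.com/", seen)
    print(len(seen), "static pages reached by link-following alone")
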
> BrightPlanet believes it has developed a solution with software
> called "LexiBot."
>
> With a single search request, the technology not only searches
> the pages indexed by traditional search engines but delves into
> the databases on the Internet and fishes out the information in
> them.
>
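
The article doesn't describe how LexiBot works internally, so the
following is only an assumption-laden sketch of the general
federated-search idea it gestures at: fan one query out to several
database query endpoints (the URLs and the JSON result format here
are hypothetical) and merge whatever comes back.

    import concurrent.futures
    import json
    import urllib.parse
    import urllib.request

    # Hypothetical database front ends that accept a ?q= parameter
    # and return a JSON list of {"title": ..., "url": ...} records.
    ENDPOINTS = [
        "http://db1.example.org/search",
        "http://db2.example.org/search",
    ]

    def query_endpoint(endpoint, query):
        """Send the query to one database; unreachable ones yield []."""
        url = endpoint + "?" + urllib.parse.urlencode({"q": query})
        try:
            with urllib.request.urlopen(url, timeout=30) as resp:
                return json.load(resp)
        except OSError:
            return []

    def deep_search(query):
        """Fan the query out to every endpoint in parallel, then
        merge and deduplicate by URL; a real system would also rank."""
        results = []
        with concurrent.futures.ThreadPoolExecutor() as pool:
            for records in pool.map(
                    lambda e: query_endpoint(e, query), ENDPOINTS):
                results.extend(records)
        return list({r["url"]: r for r in results}.values())

    print(deep_search("invisible web"))

Hitting live databases at query time, rather than a precomputed
index, would also explain the 10-to-90-minute search times quoted
below.
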
> The LexiBot isn't for everyone, BrightPlanet executives concede.
> For one thing, the software costs money--$89.95 after a 30-day
> free trial. For another, a LexiBot search isn't fast. Typical
> searches will take 10 to 25 minutes to complete, but could
> require up to 90 minutes for the most complex requests.
>
> "This isn't for grandma when she is looking for chocolate-chip
> (cookie) recipes on the Internet," Paulsen said.
>
> The privately held company expects LexiBot to be particularly
> popular in academic and scientific circles. It also plans to sell
> its technology and services to businesses.
>
> About 95 percent of the information stored in the deep Web is
> free, according to BrightPlanet.
>
> Several Internet veterans who reviewed BrightPlanet's research
> today were intrigued, but they warned that the company's software
> could overwhelm users.
>
> "The World Wide Web is getting to be so humongous that you need
> specialized engines. A centralized approach like this isn't going
> to be successful," predicted Carl Malamud, co-founder of
> Petaluma, Calif.-based Invisible Worlds.
>
> Like BrightPlanet, Invisible Worlds is trying to extract more
> data hidden from search engines but is customizing the
> information.
>
> Malamud calls this process "giving context to the content."
>
> Sullivan agreed that BrightPlanet's greatest challenge will be
> showing businesses and individuals how to effectively deploy the
> company's breakthrough.
>
> "No one else has come up with something like this yet, so when
> they fetch people all this information on the deep Web, they are
> going to have to show people where to dive in. Otherwise, people
> will just drown."

--
Gerald Oskoboiny <[email protected]>
http://impressive.net/people/gerald/
