Searching the Web: general and scientific information access

@article{Lawrence1999SearchingTW,
  title={Searching the Web: general and scientific information access},
  author={Steve Lawrence and C. Lee Giles},
  journal={First IEEE/POPOV Workshop on Internet Technologies and Services. Proceedings (Cat. No.99EX391)},
  year={1999},
  pages={18-31},
  url={https://api.semanticscholar.org/CorpusID:10947844}
}
  • S. Lawrence, C. Lee Giles
  • Published in 1999
  • Computer Science
  • First IEEE/POPOV Workshop on Internet Technologies and Services. Proceedings (Cat. No.99EX391)
The World Wide Web has revolutionized the way that people access information, and has opened up new possibilities in areas such as digital libraries, general and scientific information dissemination …

Information retrieval on the web

Overall trends cited by the sources are consistent and point to exponential growth in the past and in the coming decade, and the development of new techniques aimed at resolving some of the problems associated with Web-based information retrieval is discussed.

New Methods and Tools for the World Wide Web Search

An analysis of key aspects of recently developed Web search methods and tools is presented: visual representation of subject trees, interactive user interfaces, linguistic approaches, image search, ranking and grouping of search results, database search, and scientific information retrieval.

CONTENT AND LINK STRUCTURE ANALYSIS FOR SEARCHING THE WEB

An ideal search algorithm should find all of the relevant pages, rank them by relevance to the user query, and present a rank-ordered result to the users.
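
A minimal sketch of the rank-ordering step this entry describes: score each page against the query and return results sorted by relevance. TF-IDF scoring and the tiny corpus are assumptions for illustration, not the specific method analyzed in the paper.

```python
import math
from collections import Counter

def tf_idf_rank(query, pages):
    """pages: dict of page_id -> text; returns [(page_id, score)] sorted by relevance."""
    docs = {pid: Counter(text.lower().split()) for pid, text in pages.items()}
    n = len(docs)
    scores = {pid: 0.0 for pid in docs}
    for term in query.lower().split():
        df = sum(1 for counts in docs.values() if term in counts)  # document frequency
        if df == 0:
            continue
        idf = math.log(n / df)  # rarer terms weigh more
        for pid, counts in docs.items():
            scores[pid] += counts[term] * idf
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

pages = {
    "p1": "web search engines index pages on the web",
    "p2": "link structure analysis of web pages",
    "p3": "cooking recipes and ingredients",
}
print(tf_idf_rank("web link analysis", pages))
```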

Information Retrieval on the Web

Researchers in Web IR have reexamined the findings from traditional IR research to discover which conventional text retrieval approaches may be applicable in Web settings, while exploring new approaches that can accommodate Web-specific characteristics.

A Comparison on Intelligent Web Information Retrieval Systems

This research seeks to identify techniques that can improve the effectiveness of information retrieval, which calls for new advanced tools that cover the various phases of the information stream more thoroughly and can cope with the severe limitations of existing tools for information retrieval on the Web.

Hierarchical structural approach to improving the browsability of Web search engine results

An agent system is designed that organizes Web search results using a hierarchically structured approach, coupled with a metasearch approach to Web searching and an ontological approach that provides a mechanism for categorizing search results into a semantic hierarchical organization.
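
A hedged sketch of grouping search results under category paths so they can be browsed as a hierarchy rather than a flat list; the category labels and result titles below are hypothetical, not the ontology from the paper.

```python
def organize(results):
    """results: list of (category_path, title) pairs; returns a nested dict."""
    tree = {}
    for path, title in results:
        node = tree
        for level in path:
            node = node.setdefault(level, {})   # descend, creating levels as needed
        node.setdefault("_items", []).append(title)
    return tree

results = [
    (("Computer Science", "Information Retrieval"), "Searching the Web"),
    (("Computer Science", "Digital Libraries"), "CiteSeer: an automatic citation indexing system"),
]
print(organize(results))
```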

Classification-based Retrieval Methods to Enhance Information Discovery on the Web

Log analyses are shown to be reasonable and informative, and can be used to detect changing trends and patterns in the query stream, thus providing valuable data to a search service, as well as providing techniques and metrics for performing temporal analysis on query logs.
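
A hedged sketch of the kind of temporal analysis this entry mentions: counting how often each query appears per day so shifts in the query stream become visible. The log format (date, query) is an assumption.

```python
from collections import Counter, defaultdict

def queries_per_day(log):
    """log: iterable of (date_string, query) pairs; returns date -> Counter of queries."""
    per_day = defaultdict(Counter)
    for day, query in log:
        per_day[day][query.lower()] += 1
    return per_day

log = [("1999-03-01", "mp3"), ("1999-03-01", "web search"), ("1999-03-02", "mp3")]
print(queries_per_day(log))
```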

Custom interfaces for advanced queries in search engines

It is demonstrated that the gap between the provision of advanced search facilities and their use can be bridged, for specific information needs, by the construction of a simple interface in the form of a website that automatically formulates the necessary requests.
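
A hedged sketch of automatically formulating an advanced request from a simple form: structured fields are turned into a single query string and a search URL. The engine URL and operator syntax are hypothetical, not those of any engine named in the paper.

```python
from urllib.parse import urlencode

def build_request(base_url, phrase=None, site=None, exclude=None):
    """Assemble an advanced query string from a few structured fields."""
    parts = []
    if phrase:
        parts.append(f'"{phrase}"')      # exact-phrase operator (assumed syntax)
    if site:
        parts.append(f"site:{site}")     # site restriction (assumed syntax)
    if exclude:
        parts.append(f"-{exclude}")      # exclusion operator (assumed syntax)
    return base_url + "?" + urlencode({"q": " ".join(parts)})

print(build_request("https://search.example/search", phrase="citation indexing", site="example.edu"))
```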

Text Retrieval Systems for the Web

The focus of this paper is to survey the modern approaches to the accomplishment of different Web search engine tasks, as well as modifications of the retrieval problem associated with the heterogeneity of both users' needs and the architectures of the search systems used.

World Wide Web Search Technologies

This chapter provides an overview of existing Web search technologies and classifies them into six categories: (i) hyperlink exploration, (ii) information retrieval, (iii) metasearches, (iv) SQL approaches, (v) content-based multimedia searches, and (vi) others.
...

Multi-Engine Search and Comparison Using the MetaCrawler

The MetaCrawler is presented, a fielded Web service that represents the next level up in the information "food chain" and is sufficiently lightweight to reside on a user's machine, which facilitates customization, privacy, sophisticated filtering of references, and more.
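
A hedged sketch of the metasearch idea behind this entry: send one query to several engines, then merge and de-duplicate their ranked result lists. The reciprocal-rank fusion rule and engine names are assumptions for illustration, not MetaCrawler's documented algorithm.

```python
def merge_results(result_lists):
    """result_lists: dict engine_name -> ordered list of URLs; returns fused ranking."""
    scores = {}
    for engine, urls in result_lists.items():
        for rank, url in enumerate(urls, start=1):
            scores[url] = scores.get(url, 0.0) + 1.0 / rank  # higher rank, larger share
    return sorted(scores, key=scores.get, reverse=True)

results = {
    "engine_a": ["http://a.example/1", "http://b.example/2", "http://c.example/3"],
    "engine_b": ["http://b.example/2", "http://d.example/4"],
}
print(merge_results(results))
```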

Searching the world wide Web

The coverage and recency of the major World Wide Web search engines was analyzed, yielding some surprising results, including a lower bound on the size of the indexable Web of 320 million pages.
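
Size bounds of this kind are commonly derived from overlap between engine indexes, in capture-recapture style: if two independently built indexes overlap heavily, the Web is small; if they barely overlap, it is large. The sketch below shows that reasoning with invented counts, not the figures from the paper.

```python
def estimate_web_size(n_a, n_b, n_ab):
    """Overlap-based size estimate: n_a and n_b pages indexed by two engines,
    n_ab pages indexed by both. Assumes the indexes are independent samples."""
    if n_ab == 0:
        raise ValueError("no overlap observed; estimate is unbounded")
    return n_a * n_b / n_ab

# Hypothetical counts for illustration only: 100M and 60M indexed, 20M shared.
print(estimate_web_size(100e6, 60e6, 20e6))  # -> 300 million pages
```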

Multi-Service Search and Comparison Using the MetaCrawler

The MetaCrawler provides a single, central interface for Web document searching that facilitates customization, privacy, sophisticated filtering of references, and more and serves as a tool for comparison of diverse search services.

CiteSeer: an automatic citation indexing system

CiteSeer has many advantages over traditional citation indexes, including the ability to create more up-to-date databases which are not limited to a preselected set of journals or restricted by journal publication delays, completely autonomous operation with a corresponding reduction in cost, and powerful interactive browsing of the literature using the context of citations.
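
A hedged sketch of one step in autonomous citation indexing: grouping differently formatted citation strings that likely refer to the same work. The token-set heuristic here is a toy assumption, far simpler than CiteSeer's actual matching.

```python
import re

def group_citations(citations):
    """Group citation strings whose normalized token sets coincide (toy heuristic)."""
    groups = {}
    for c in citations:
        tokens = re.sub(r"[^a-z0-9 ]+", " ", c.lower()).split()
        key = " ".join(sorted(set(tokens)))   # order-insensitive signature
        groups.setdefault(key, []).append(c)
    return list(groups.values())

cites = [
    "Kleinberg, J. Authoritative sources in a hyperlinked environment.",
    'J. Kleinberg, "Authoritative sources in a hyperlinked environment"',
]
print(group_citations(cites))
```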

A Machine Learning Architecture for Optimizing Web Search Engines

A wide range of heuristics for adjusting document rankings based on the special HTML structure of Web documents are described, including a novel one inspired by reinforcement learning techniques for propagating rewards through a graph which can be used to improve a search engine's rankings.
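
A hedged sketch of propagating a reward through a hyperlink graph: pages that link to a rewarded page receive a discounted share of that reward, repeated for a few hops. This illustrates the general idea only, not the specific heuristic from the paper.

```python
def propagate_rewards(links, rewards, discount=0.5, hops=2):
    """links: dict page -> pages it links to; rewards: dict page -> initial reward."""
    scores = dict(rewards)
    frontier = dict(rewards)
    for _ in range(hops):
        next_frontier = {}
        for src, targets in links.items():
            gained = sum(discount * frontier.get(t, 0.0) for t in targets)
            if gained:
                scores[src] = scores.get(src, 0.0) + gained  # credit flows backward
                next_frontier[src] = gained
        frontier = next_frontier
    return scores

links = {"a": ["b"], "b": ["c"], "c": []}
print(propagate_rewards(links, {"c": 1.0}))  # reward on c trickles back to b, then a
```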

Context and Page Analysis for Improved Web Search

The paper discusses the features of the NECI metasearch engine and suggests ways to improve the efficiency of Web searches by downloading and analyzing each document and then displaying results that show the query terms in context.
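
A hedged sketch of showing query terms in context: for each match, display a small window of surrounding words, roughly the style of result presentation this entry describes. The window size and whitespace tokenization are assumptions.

```python
def terms_in_context(text, query_terms, window=4):
    """Return short snippets centered on occurrences of the query terms."""
    words = text.split()
    snippets = []
    for i, w in enumerate(words):
        if w.lower().strip(".,") in query_terms:
            lo, hi = max(0, i - window), i + window + 1
            snippets.append(" ".join(words[lo:hi]))
    return snippets

doc = "The NECI metasearch engine downloads each document and analyzes the pages before display."
print(terms_in_context(doc, {"metasearch", "document"}))
```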

Authoritative sources in a hyperlinked environment

This work proposes and tests an algorithmic formulation of the notion of authority, based on the relationship between a set of relevant authoritative pages and the set of “hub pages” that join them together in the link structure; the formulation has connections to the eigenvectors of certain matrices associated with the link graph.
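
A minimal sketch of the hub-and-authority iteration this entry describes: a page's authority score sums the hub scores of pages pointing to it, a page's hub score sums the authority scores of pages it points to, with normalization each round. The tiny link graph is hypothetical.

```python
def hits(links, iterations=20):
    """links: dict page -> list of pages it links to; returns (hubs, authorities)."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    hubs = {p: 1.0 for p in pages}
    auths = {p: 1.0 for p in pages}
    for _ in range(iterations):
        auths = {p: sum(hubs[s] for s, targets in links.items() if p in targets) for p in pages}
        norm = sum(v * v for v in auths.values()) ** 0.5 or 1.0
        auths = {p: v / norm for p, v in auths.items()}
        hubs = {p: sum(auths[t] for t in links.get(p, [])) for p in pages}
        norm = sum(v * v for v in hubs.values()) ** 0.5 or 1.0
        hubs = {p: v / norm for p, v in hubs.items()}
    return hubs, auths

links = {"a": ["c", "d"], "b": ["c", "d"], "c": ["d"], "d": []}
print(hits(links))  # d and c emerge as authorities; a and b as hubs
```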

The AltaVista Revolution: How to Find Anything on the Internet

This book explains how to use the AltaVista service, an Internet search tool that captures the full and complete text of over 2.5 million web pages per day, without exercising any filters that might eliminate important content.

A Universal Citation Database as a Catalyst for Reform in Scholarly Communication

A universal, Internet-based, bibliographic and citation database would link every scholarly work ever written - no matter how published - to every work that it cites and every work that cites it.
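
A hedged sketch of the bidirectional linking such a database implies: recording one citation makes the cited work discoverable from the citing work and vice versa. The identifiers and in-memory storage are toy assumptions, not a proposed implementation.

```python
from collections import defaultdict

class CitationIndex:
    def __init__(self):
        self.cites = defaultdict(set)     # work -> works it cites
        self.cited_by = defaultdict(set)  # work -> works that cite it

    def add_citation(self, citing, cited):
        self.cites[citing].add(cited)
        self.cited_by[cited].add(citing)

index = CitationIndex()
index.add_citation("Lawrence1999SearchingTW", "Kleinberg1999Authoritative")
print(index.cited_by["Kleinberg1999Authoritative"])
```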