1997


From: American Society for Technion - Israel Institute of Technology

New Search Engine, Language Simplifies Web Surfing

NEW YORK, N.Y. and HAIFA, ISRAEL, November 14, 1997 -- After years of struggling with often frustrating and time-consuming searches on the World Wide Web, users will soon be able to get much more accurate results using a new search engine developed in Israel.

Called W3QS, the search engine actually looks for structures or relationships within the content of documents, rather than just searching for key words. That means that instead of generating a multitude of irrelevant sites, as often happens in conventional search engines like Infoseek, Lycos and Excite!, a W3QS query can yield very specific results.

"The current search engines are limited because the structural information, namely the organization of the document into parts pointing to each other, is usually lost," explains Dr. Oded Shmueli from the Computer Science Faculty of the Technion-Israel Institute of Technology, who developed the new search engine with David Konopnicki, a graduate student.

Most search services today use "robots," or programs that scan the network periodically and form text-based indices, but "those searches are limited by the kind of textual analysis provided by the search service, and the depth of site exploration," Dr. Shmueli says. "In particular, robots do not fill forms themselves, as the number of possibilities is enormous, so they miss interesting avenues that humans might follow."

W3QS, by contrast, can look for complex content relationships, rather than just looking for words or groups of words. That can make for more accurate searches and less time wasted chasing down irrelevant leads. W3QS also searches a designated portion of the Web, in its most current configuration, rather than relying on text-based indices that may be out of date.

Unlike the current search services, W3QS is capable of filling forms that are encountered as it navigates the WWW. Consider the following example: You're looking for the actual texts of scientific papers written by Drs. Smith and Jones, who work in the Computer Science Department at West University. With conventional searches, you might type in the key words "Smith," "Jones," "West" and "University."

A basic Infoseek or Alta Vista search could bring up hundreds of sites that contain those words. But the results could very likely include a document that discusses the Smith building at Jones University, on West Street, or Michael Jones's analysis of Jay Smith's university architecture in the West in the 1800s. And even if you could find authors Smith and Jones at West University, you might have trouble linking to their home pages, and there's no guarantee that all their papers are listed in their home pages.

If you were to use W3QS to look for those same papers, you could ask the search engine to first go to the Computer Science Department at West University, and then follow hypertext links, all the while looking for documents that contain "Smith" and "Jones" in some specified proximity, together with a link to a file containing the text of a scientific paper. W3QS would then fetch that actual file and store it on the server that runs W3QS, rather than making the user go to the specified Web page and extract the needed document. All the fetched documents can then be e-mailed to you.
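The two conditions in that example, names appearing near each other and a link to a paper file, can be illustrated with a minimal Python sketch. This is not W3QL syntax and not the researchers' implementation; the function names, the word-gap measure of proximity and the file extensions are all assumptions made for illustration.

```python
import re

def within_proximity(text, first, second, max_gap):
    """Return True if `first` and `second` occur within `max_gap` words of each other.

    Proximity measured in words is an assumption; the article only says
    "some specified proximity".
    """
    words = [w.lower() for w in re.findall(r"[A-Za-z0-9]+", text)]
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= max_gap
               for i in first_positions
               for j in second_positions)

def links_to_paper(links, extensions=(".ps", ".pdf")):
    """Keep only links that look like paper files (extensions are hypothetical)."""
    return [link for link in links if link.lower().endswith(extensions)]
```

A page would count as a match only if `within_proximity` is true for its text and `links_to_paper` returns at least one link, which is what rules out a page about "the Smith building at Jones University."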

To limit what could be searches ad infinitum, you can tell W3QS to follow links through only four documents, or six, or ten. Most search engines available today do not search a site to any depth. And because W3QS uses its own language -- W3QL -- users can create very rich queries and be sure that W3QS will do the querying precisely as specified. Current search services mostly offer ad-hoc ways of stating the conditions of a search, which leaves the results somewhat up to chance.
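The idea of capping how many links a search may follow can be sketched as a depth-bounded traversal. The Python below is only an illustration over a made-up in-memory "web"; the page names, the `bounded_search` function and the matching rule are hypothetical, and nothing here reflects how W3QS or W3QL actually work internally.

```python
from collections import deque

# Hypothetical in-memory "web": page name -> (text, outgoing links).
TOY_WEB = {
    "cs-dept": ("Computer Science Department", ["people", "news"]),
    "people": ("Faculty list", ["smith-home"]),
    "news": ("Department news", []),
    "smith-home": ("Smith and Jones publications", ["deep-page"]),
    "deep-page": ("Smith and Jones archive", []),
}

def bounded_search(web, start, max_depth, matches):
    """Breadth-first traversal from `start`, following at most `max_depth` links."""
    seen = {start}
    queue = deque([(start, 0)])
    hits = []
    while queue:
        page, depth = queue.popleft()
        text, links = web[page]
        if matches(text):
            hits.append(page)
        if depth == max_depth:
            continue  # depth limit reached: do not follow this page's links
        for target in links:
            if target in web and target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return hits

def mentions_both(text):
    """Illustrative condition: the page text names both authors."""
    return "Smith" in text and "Jones" in text
```

With a limit of two links, `bounded_search(TOY_WEB, "cs-dept", 2, mentions_both)` reaches `smith-home` but stops before `deep-page`; raising the limit to three reaches both.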

W3QS includes a number of other conveniences, including:

  • Easy-to-use interface that lets even novice, non-programmer searchers do sophisticated Web searches;
  • Automatic re-evaluation of queries at pre-determined time intervals to be sure users continually have up-to-date results;
  • Convenient interfacing to user-written programs, as well as to existing search engines, which allows users to do more sophisticated searches;
  • Easy accessibility -- W3QS is accessible through many Web browsers, for example Netscape Navigator;
  • Extendibility -- users can make the language work with their own data analysis tools, e.g., by integrating a program that does image analysis. That further expands the searching ability of W3QS;
  • Coordination with existing search engines -- W3QS queries can incorporate searches through Yahoo, Lycos, and other services.

The researchers, whose work is supported by the Israel Ministry of Science and the Arts, are currently seeking partners for the further development and commercialization of this system. W3QS can be accessed at http://www.cs.technion.ac.il/~W3QS.

The Technion-Israel Institute of Technology is the country's premier scientific and technological center for applied research and education. It commands a worldwide reputation for its pioneering work in communications, electronics, water-resource management, materials engineering, aerospace and medicine, among others. The majority of Israel's engineers are Technion graduates, as are most of the founders and managers of its high-tech industries. The university's 11,000 students and 700 faculty study and work in the Technion's 19 faculties and 30 research centers and institutes in Haifa.

The American Technion Society (ATS) is the university's support organization in the United States. Based in New York City, it is the leading American organization supporting higher education in Israel. The ATS has raised $632 million since its inception in 1940, half of that during the last six years. Technion societies are located in 24 countries around the world.



This article comes from Science Blog. Copyright © 2004
http://www.scienceblog.com/community