benchmarking the advanced search interfaces of eight major www search engines.doc


Benchmarking the Advanced Search Interfaces of Eight Major WWW Search Engines

Dr. Randy D. Ralph & John W. Felts, Jr.

Keywords: information retrieval, search engines, World Wide Web, benchmarking, advanced search, search interfaces

Abstract: This research project was designed to benchmark the performance of the advanced search interfaces of eight of the major World Wide Web (WWW) search engines, excluding the meta engines. A review of the literature did not find any previous benchmarking studies of the advanced interfaces based on quantitative data. The research was performed by fifty-two graduate students of library and information studies (LIS) on three campuses of the University of North Carolina (UNC) as a class research project for course LIS 645, Computer-Related Technologies in Library Management. The class was offered by the Department of Library and Information Studies at UNC Greensboro through the North Carolina Research and Education Network (NC-REN). The LIS students selected Altavista, Excite, Go/Infoseek, Google, Hotbot, Lycos, Northernlight, and Yahoo for comparative study. Each researcher submitted a total of five questions in a range of subject areas to each of the eight selected search engines, totaling 2,080 individual searches in 260 search panels of eight search engine trials.

Data was collected in the following categories on the first 20 unique citations viewed in the search output lists from the engines:

1) an index of relative recall based on the actual or estimated recall reported by the search engine
2) the number of direct hits among the first 20 unique citations viewed
3) the number of false coordinations among the first 20 unique citations viewed
4) the number of citations to websites with duplicate content
5) the number of citations to websites resulting in failed views
6) the depth to the first solid hit among the citations in the search output list

The aim of the research was to identify the engines that might best meet the needs of a library patron. While, on the whole, the search engines performed equally well on a number of parameters tested, it was found that engines differed most significantly in:

1) the percent of relevancy in results from direct hits
2) the depth to the first solid hit
3) the number of duplicate citations delivered
4) the number of citations which resulted in failed views

A discussion and summary of the results, conclusions and recommendations for further research are included.

1. BACKGROUND

1.1 Overview

This project builds on previous work conducted by classes in the Department of Library and Information Studies of the School of Education at the University of North Carolina (UNC) at Greensboro under the direction of Dr. Randy D. Ralph. Six of the top eight global World Wide Web (WWW) search engines identified in the previous comparative testing in 1997 as part of an Indexing and Abstracting course (WWW Search Engine Test Methods, available at URL http:/) and in 1999 as part of a course in library automation (Computer-Related Technologies in Library Management, by Randy D. Ralph and John W. Felts at URL http:/library.uncg.edu/search/) were again selected for comparative benchmark testing, this time using the fall 2000 Computer-Related Technologies in Library Management classes (LIS 645), meeting at UNC's Asheville, Charlotte and Greensboro campuses. Each of the fifty-two students devised five (5) search queries in diverse subject areas and genres in order to gauge the overall performance of the eight selected search engines. In a departure from the earlier study, advanced search queries were presented to the search engines using their own advanced search interfaces, rather than the simple default interfaces. The search engines selected were Altavista, Excite, Go/Infoseek, Google, Hotbot, Lycos, Northernlight and Yahoo.

1.2 Rationale

There is still a need for the type of examination performed here. While more and more librarians (among the rest of us) are using search engines, few real statistical analyses, as opposed to popular informal comparisons, have been conducted. Many earlier studies are so old they are outdated, since search engines evolve so rapidly. New studies are underway, but this study builds on earlier research only three years old, expanding the earlier parameters. Moreover, as the Internet becomes more and more commercialized, the need for an unbiased and statistically valid comparison is greater now than ever before. This research can be periodically repeated, taking into account the evolution of the search engines as well as that of the Internet itself.

1.3 Background of Search Engines

Search engines came into existence only after 1994. A search engine is software that searches web sites and indexes found in the World Wide Web, and returns the matches, such as documents compatible with the search
