Abstract:
Information retrieval is the process of locating relevant documents on the basis of user input, such as keywords. The explosive growth of the World Wide Web creates new challenges for the designers of information retrieval systems. Retrieval on the Web can be done either with a link graph or with a search engine, and the performance of both methods is not very satisfactory. In the link graph approach, human-maintained lists cover popular topics effectively, but they are subjective, expensive to build and maintain, slow to improve, and unable to cover all esoteric topics. In the search engine approach, the biggest challenges are how to retrieve the documents most relevant to the user's query, and how to define a relevance rank by which to sort them. Different methods are used to define the relevance rank. These methods use the keywords of the document, the hyperlinks from and to the document, or both. Many factors are used in defining the relevance rank, such as the number of occurrences of the term in the document, the percentage of occurrences of the term relative to the number of terms in the document, the position of the term in the document, and the number of sites that have hyperlinks pointing to the document. In this context, new techniques have been developed and tested, such as inverted indices and vector space techniques, which should yield better performance.