How it Works: Information Retrieval
Information retrieval (IR) is the area of study concerned with searching for documents, for data within documents, or for metadata about documents. It may also include searching relational databases, structured storage, and the Internet. Related terms include document retrieval, data retrieval, and text retrieval. Each of these forms of retrieval, however, has its own technology, literature, and theory.
IR is an interdisciplinary field. It draws on library science, information science, computer science, mathematics, information architecture, linguistics, statistics, and cognitive psychology. Automated IR systems are used to reduce what is often called information overload. Universities and libraries use these systems to provide access to documents, journals, and books. Online search engines are the most visible IR applications.
The idea of using computers to search for specific data was popularized in 1945 by Vannevar Bush's article "As We May Think." The first IR systems were developed in the 1950s and 1960s, and the 1970s saw the development of large-scale retrieval programs. Digitization has also created a problem known as digital obsolescence: a digital resource can no longer be read because the physical media, the software needed to read it, or the hardware it runs on is no longer available.
An IR process begins when the user enters a query, a formal statement of the information needed. Several objects may match the query, with varying degrees of relevance. The query is matched against the objects in a database. Depending on the application, these data objects may be text, audio, images, videos, or mind maps. Typically the documents themselves are not stored in the system. Instead, they are represented by surrogates or metadata.
Most systems compute a numeric score for how relevant each object is to the query and rank the objects by that score. The top-ranked objects are shown to the user, who may then choose to refine the query.
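The flow from query to ranked results can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular system works: the surrogate records, the `score` function (a plain count of query-term occurrences in each surrogate), and the `search` helper are all invented for the example.

```python
from collections import Counter

# Hypothetical surrogates: the system holds metadata, not the documents themselves.
SURROGATES = {
    "doc1": {"title": "Introduction to library science", "keywords": "library catalog books"},
    "doc2": {"title": "Search engines and the web", "keywords": "search web index"},
    "doc3": {"title": "Ranking methods for web search", "keywords": "ranking search relevance"},
}

def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()

def score(query, surrogate):
    """Assumed relevance score: how often the query terms occur in the surrogate."""
    counts = Counter(tokenize(surrogate["title"] + " " + surrogate["keywords"]))
    return sum(counts[term] for term in set(tokenize(query)))

def search(query, surrogates=SURROGATES, top_k=2):
    """Return up to top_k (doc_id, score) pairs, highest score first."""
    scored = [(score(query, s), doc_id) for doc_id, s in surrogates.items()]
    matches = [(sc, doc_id) for sc, doc_id in scored if sc > 0]
    # Sort by descending score; break ties by document id for stable output.
    matches.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(doc_id, sc) for sc, doc_id in matches[:top_k]]

print(search("web search ranking"))  # [('doc3', 5), ('doc2', 4)]
print(search("library"))             # a refined query: [('doc1', 2)]
```

Real systems use more careful scoring than a raw term count, but the shape is the same: score every candidate, sort, and show the top of the list.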
There are many different ways to evaluate the performance and effectiveness of an IR system. These methods require a collection of documents and a query. Common measures include precision, recall, fall-out, F-measure, average precision, R-precision, mean average precision, and discounted cumulative gain.
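Several of these measures are simple ratios, and a short Python sketch may make them concrete. The functions below follow the standard textbook definitions. The ten-document collection, the relevance judgments, and the ranked result list are made up for illustration.

```python
def precision(retrieved, relevant):
    """Fraction of retrieved documents that are relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0

def recall(retrieved, relevant):
    """Fraction of relevant documents that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0

def fall_out(retrieved, relevant, collection):
    """Fraction of non-relevant documents that were retrieved."""
    non_relevant = set(collection) - set(relevant)
    return len(set(retrieved) & non_relevant) / len(non_relevant) if non_relevant else 0.0

def f_measure(retrieved, relevant):
    """Harmonic mean of precision and recall (the balanced F1 form)."""
    p, r = precision(retrieved, relevant), recall(retrieved, relevant)
    return 2 * p * r / (p + r) if p + r else 0.0

def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant document appears."""
    relevant = set(relevant)
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

# Hypothetical test data: 10 documents, 4 judged relevant, 5 returned in rank order.
collection = [f"d{i}" for i in range(1, 11)]
relevant = ["d1", "d3", "d5", "d7"]
ranked = ["d1", "d2", "d3", "d4", "d5"]

print(precision(ranked, relevant))             # 0.6
print(recall(ranked, relevant))                # 0.75
print(fall_out(ranked, relevant, collection))  # 2 of 6 non-relevant documents
print(f_measure(ranked, relevant))
print(average_precision(ranked, relevant))
```

Mean average precision is the mean of `average_precision` over a set of queries. Precision, recall, and fall-out ignore the order of the results, while average precision rewards placing relevant documents near the top.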
For IR to work efficiently, documents are usually transformed into a suitable representation. There are many types of retrieval model, typically categorized along two dimensions: the mathematical basis and the properties of the model. Examples along the first dimension, mathematical basis, include the standard Boolean, extended Boolean, fuzzy retrieval, vector space, generalized vector space, and topic-based vector space models. The second dimension, properties of the model, distinguishes models without term interdependencies, models with immanent term interdependencies, and models with transcendent term interdependencies.
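To show how two of these models differ in practice, here is a minimal Python sketch of the standard Boolean model and the vector space model over the same toy collection. The three documents and all function names are hypothetical. The vector space part uses raw term counts as weights, a simplification chosen for brevity (real systems commonly use weightings such as tf-idf).

```python
import math
from collections import Counter

# Hypothetical toy collection.
DOCS = {
    "d1": "information retrieval in digital libraries",
    "d2": "search engines rank web documents",
    "d3": "libraries rank documents for retrieval",
}

def build_index(docs):
    """Inverted index: term -> set of ids of documents containing it."""
    index = {}
    for doc_id, text in docs.items():
        for term in set(text.split()):
            index.setdefault(term, set()).add(doc_id)
    return index

# Standard Boolean model: a document either matches the query or it does not.
def boolean_and(index, *terms):
    postings = [index.get(t, set()) for t in terms]
    return set.intersection(*postings) if postings else set()

def boolean_or(index, *terms):
    return set().union(*(index.get(t, set()) for t in terms))

def boolean_not(index, term, docs):
    return set(docs) - index.get(term, set())

# Vector space model: query and documents are term vectors compared by cosine.
def cosine(a, b):
    va, vb = Counter(a.split()), Counter(b.split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

def rank_vector_space(query, docs):
    """Document ids with non-zero similarity, most similar first."""
    scores = {doc_id: cosine(query, text) for doc_id, text in docs.items()}
    return sorted((d for d in scores if scores[d] > 0), key=lambda d: (-scores[d], d))

index = build_index(DOCS)
print(sorted(boolean_and(index, "rank", "documents")))  # ['d2', 'd3']
print(sorted(boolean_not(index, "rank", DOCS)))         # ['d1']
print(rank_vector_space("retrieval documents", DOCS))   # ['d3', 'd1', 'd2']
```

The Boolean queries return unordered sets, while the vector space query returns a ranking: `d3` contains both query terms and scores highest, and `d1` and `d2` each contain one term and tie.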
In summary, information retrieval systems are designed to retrieve data, whether that means finding a specific document, text within a document, or metadata about a document. The first such programs were developed in the 1950s and 1960s, a few years after the idea of using computers to search for data was proposed. Automated IR systems are typically used in libraries, schools, and universities and, most visibly, on the Internet through applications such as search engines.
The leaders in document storage and tracking in Ontario offer services such as hard file record storage in London, Ontario, to ensure your information is safe.
http://www.commandrecords.com