Information retrieval

In 1955, Allen Kent joined Case Western Reserve University and eventually became associate director of the Center for Documentation and Communications Research. That same year, Kent and colleagues published a paper in American Documentation describing the precision and recall measures as well as detailing a proposed "framework" for evaluating an IR system which included statistical sampling methods for determining the number of relevant documents not retrieved.
In order to effectively retrieve relevant documents by IR strategies, the documents are typically transformed into a suitable representation. Each retrieval strategy incorporates a specific model for its document representation purposes. The picture on the right illustrates the relationship of some
there is ... a machine called the Univac ... whereby letters and figures are coded as a pattern of magnetic spots on a long steel tape. By this means the text of a document, preceded by its subject code symbol, can be recorded ... the machine ... automatically selects and types out those references
An information retrieval process begins when a user enters a query into the system. Queries are formal statements of information needs, for example search strings in web search engines. In information retrieval, a query does not uniquely identify a single object in the collection. Instead, several
1962: Cyril W. Cleverdon published early findings of the Cranfield studies, developing a model for IR system evaluation. See: Cyril W. Cleverdon, "Report on the Testing and Analysis of an Investigation into the Comparative Efficiency of Indexing Systems". Cranfield Collection of Aeronautics, Cranfield, England, 1962.
(TREC) as part of the TIPSTER text program. The aim was to support the information retrieval community by supplying the infrastructure needed to evaluate text retrieval methodologies on a very large text collection. This catalyzed research on methods that
The evaluation of an information retrieval system is the process of assessing how well a system meets the information needs of its users. In general, measurement considers a collection of documents to be searched and a search query. Traditional evaluation metrics, designed for
Most IR systems compute a numeric score on how well each object in the database matches the query, and rank the objects according to this value. The top ranking objects are then shown to the user. The process may then be iterated if the user wishes to refine the query.
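As a rough illustration of the score-and-rank step described above, the sketch below gives each document a toy score (the number of times the query terms occur in it) and returns the top-ranked documents. The `score` and `rank` functions and the three-document collection are hypothetical names invented for this example; real systems use far more refined scoring functions.

```python
from collections import Counter


def score(query: str, document: str) -> int:
    """Toy relevance score: total occurrences of the query terms in the document."""
    query_terms = set(query.lower().split())
    doc_counts = Counter(document.lower().split())
    return sum(doc_counts[term] for term in query_terms)


def rank(query: str, collection: dict, top_k: int = 3) -> list:
    """Score every document, then sort by descending score (ties broken by id)."""
    scored = [(doc_id, score(query, text)) for doc_id, text in collection.items()]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


# Hypothetical three-document collection used only for illustration.
collection = {
    "d1": "information retrieval ranks documents",
    "d2": "databases answer exact queries",
    "d3": "retrieval of information about retrieval systems",
}

results = rank("information retrieval", collection)
for doc_id, value in results:
    print(doc_id, value)
```

Note that the document with no matching terms is still returned, only with the lowest score; this reflects the point that IR results are ranked rather than filtered by exact match.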
Models with transcendent term interdependencies allow a representation of interdependencies between terms, but they do not specify how the interdependency between two terms is defined. They rely on an external source for the degree of interdependency between two terms (for example, a human or sophisticated algorithms).
ESSIR promotes research, innovation, and development of information access systems by educating junior and senior researchers, students, professionals, and developers on the latest developments in the field, both methodological and technological.
User queries are matched against the database information. However, as opposed to classical SQL queries of a database, in information retrieval the results returned may or may not match the query, so results are typically ranked.
1982: Nicholas J. Belkin, Robert N. Oddy, and Helen M. Brooks proposed the ASK (Anomalous State of Knowledge) viewpoint for information retrieval. This was an important concept, though their automated analysis tool proved ultimately disappointing.
Doszkocs, T.E. & Rapp, B.A. (1979). "Searching MEDLINE in English: a Prototype User Interface with Natural Language Query, Ranked Output, and relevance feedback," In: Proceedings of the ASIS Annual Meeting, 16:
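Models with immanent term interdependencies allow a representation of interdependencies between terms. However, the degree of the interdependency between two terms is defined by the model itself. It is usually directly or indirectly derived (e.g. by dimensional reduction) from the co-occurrence of those terms in the whole set of documents.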
in the 1920s and 1930s – that searched for documents stored on film. The first description of a computer searching for information was given by Holmstrom in 1948, detailing an early mention of the Univac
sponsored a symposium titled "Statistical Association Methods for Mechanized Documentation". Several highly significant papers, including G. Salton's first published reference (we believe) to the
Probabilistic models treat the process of document retrieval as a probabilistic inference. Similarities are computed as probabilities that a document is relevant for a given query. Probabilistic theorems like Bayes' theorem are often used in these models.
1920s–1930s: Emanuel Goldberg submits patents for his "Statistical Machine", a document search engine that used photoelectric cells and pattern recognition to search the metadata on rolls of microfilmed documents.
Proceedings of the 5th International Conference on Collaborative Computing: Networking, Applications and Worksharing (CollaborateCom'09). Washington, DC: IEEE.
methods. Feature functions are arbitrary functions of document and query, and as such can easily incorporate almost any other retrieval model as just another feature.
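To make the idea of feature functions concrete, here is a minimal sketch in which two hypothetical features, `term_overlap` and `length_penalty`, are combined into one relevance score by a weighted sum. The weights are hand-picked assumptions for illustration only; in a learning-to-rank setting they would be fitted from relevance judgments rather than fixed.

```python
def term_overlap(query: str, document: str) -> float:
    """Fraction of distinct query terms that also occur in the document."""
    query_terms = set(query.lower().split())
    doc_terms = set(document.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & doc_terms) / len(query_terms)


def length_penalty(query: str, document: str) -> float:
    """Mild preference for shorter documents; ignores the query entirely."""
    return 1.0 / (1 + len(document.split()))


# Each feature is an arbitrary function of (query, document).
FEATURES = [term_overlap, length_penalty]
# Hypothetical weights chosen by hand, not learned from data.
WEIGHTS = [1.0, 0.5]


def relevance(query: str, document: str) -> float:
    """Combine the feature values into a single score with a weighted sum."""
    return sum(w * f(query, document) for w, f in zip(WEIGHTS, FEATURES))


print(relevance("ranked retrieval", "ranked retrieval systems"))
print(relevance("ranked retrieval", "unrelated text here"))
```

Any other retrieval model's score could be appended to `FEATURES` as one more function, which is the sense in which such models can absorb other models as features.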
Social information seeking – field of research that involves studying situations, motivations, and methods for people seeking and sharing information in participatory online social sites
Weinberg report "Science, Government and Information" gave a full articulation of the idea of a "crisis of scientific information". The report was named after Dr.
such as the Cranfield collection (several thousand documents). Large-scale retrieval systems, such as the Lockheed Dialog system, came into use early in the 1970s.
Algebraic models usually represent documents and queries as vectors, matrices, or tuples. The similarity of the query vector and document vector is represented as a scalar value.
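One common way to obtain such a scalar is cosine similarity over term-count vectors. The sketch below is an illustration under simple assumptions (whitespace tokenization, raw term counts, no weighting such as tf–idf); the helper names `term_vector` and `cosine_similarity` are invented for this example.

```python
import math
from collections import Counter


def term_vector(text: str) -> Counter:
    """Represent a text as a sparse vector of raw term counts."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


query = term_vector("vector space model")
doc_a = term_vector("the vector space model represents text as a vector")
doc_b = term_vector("boolean retrieval uses set operations")

sim_a = cosine_similarity(query, doc_a)
sim_b = cosine_similarity(query, doc_b)
print(round(sim_a, 3), sim_b)
```

A document sharing no terms with the query scores zero, while a document sharing terms scores somewhere between zero and one.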
1950s: Growing concern in the US for a "science gap" with the USSR motivated, encouraged funding and provided a backdrop for mechanized literature searching systems (Allen Kent et al.) and the invention of the citation index by Eugene Garfield.
late 1990s: Web search engines implement many features formerly found only in experimental IR systems. Search engines become the most common and perhaps best instantiation of IR models.
or videos. Often the documents themselves are not kept or stored directly in the IR system, but are instead represented in the system by document surrogates or
Perry, James W.; Kent, Allen; Berry, Madeline M. (1955). "Machine literature searching X. Machine language; factors underlying its design and development".
1958: International Conference on Scientific Information, Washington DC, included consideration of IR systems as a solution to problems identified. See: Proceedings of the International Conference on Scientific Information, 1958 (National Academy of Sciences, Washington, DC, 1959)
1960: Melvin Earl Maron and John Lary Kuhns published "On relevance, probabilistic indexing, and information retrieval" in the Journal of the ACM 7(3):216–244, July 1960.
common models. In the picture, the models are categorized according to two dimensions: the mathematical basis and the properties of the model.
An IR system is a software system that provides access to books, journals and other documents; it also stores and manages those documents.
1979: Tamas Doszkocs implemented the CITE natural language user interface for MEDLINE at the National Library of Medicine. The CITE system supported free form query input, ranked output and relevance feedback.
notion of relevance: every document is known to be either relevant or non-relevant to a particular query. In practice, queries may be
The information need can be specified in the form of a search query. In the case of document retrieval, queries can be based on full-text
late 1940s: The US military confronted problems of indexing and retrieval of wartime scientific research documents captured from Germans.
Areas where information retrieval techniques are employed include (the entries are in alphabetical order within each category):
1673: 1340: 987: 358:
computer. Automated information retrieval systems were introduced in the 1950s: one even featured in the 1957 romantic comedy,
298: 167: 2397: 1730: 1653: 663: 658: 89: 79: 29: 1193:
John W. Sammon, Jr.'s RADC Tech report "Some Mathematics of Information Storage and Retrieval..." outlined the vector model.
937:(research engineer at IBM since 1941) began work on a mechanized punch card-based system for searching chemical compounds. 699: 1903: 1175:
completed evaluation studies of the MEDLARS system and published the first edition of his text on information retrieval.
1110: 556: 200: 1134:
Medical Literature Analysis and Retrieval System, the first major machine-readable database and batch-retrieval system.
1706: 724: 627:
of words or phrases. Similarities are usually derived from set-theoretic operations on those sets. Common models are:
526: 439: 187: 1549: 1642: 1261: 1818: 1443: 1303: 1103: 694: 378: 84: 677: 673: 563: 44: 2377: 2259: 2168:
N. Jardine, C.J. van Rijsbergen (December 1971). "The use of hierarchic clustering in information retrieval".
603: 1659: 573: 551: 2043: 1265: 1202: 976:: Philip Bagley conducted the earliest experiment in computerized document retrieval in a master thesis at 759:
treat different terms/words as independent. This fact is usually represented in vector space models by the
of searching for information in a document, searching for documents themselves, and also searching for the
2351: 1866: 1712: 1620: 806: 668: 635: 630: 152: 127: 74: 54: 2363: 1629: â€“ Process or activity of attempting to obtain information in both human and technological contexts 1574: 1539: 1274:: Three highly influential publications by Salton fully articulated his vector processing framework and 840: 411: 366:
at Cornell. By the 1970s several different retrieval techniques had been shown to perform well on small
The idea of using computers to search for relevant pieces of information was popularized in the article
162: 132: 1095: 1402: 1356:: First international ACM SIGIR conference, joint with British Computer Society IR group in Cambridge. 1212:" (IEEE Transactions on Computers) was the first proposal for visualization interface to an IR system. 2343: 2044:
The Theory of Digital Handling of Non-numerical Information and its Implications to Machine Economics
The Royal Society Scientific Information Conference, 21 June-2 July 1948: Report and Papers Submitted
1172: 810: 521: 270: 157: 49: 1871: 1685: 1632: 1626: 1544: 1467: 1275: 775: 709: 482: 416: 239: 69: 64: 22: 350:
in 1945. It would appear that Bush was inspired by patents for a 'statistical machine' – filed by
This ranking of results is a key difference of information retrieval searching compared to database searching.
2372: 1884: 1800: 1697: 1453: 1406: 1363: 1146: 1047: 872: 653: 583: 462: 243: 39: 2245: 1662: â€“ Set of techniques for creating images, diagrams, or animations to communicate a message 594: 2218: 2214: 2207: 2087: 2024: 1937: 1481: 1035: 742:) and seek the best way to combine these features into a single relevance score, typically by 688: 624: 492: 472: 406: 386: 274: 247: 2177: 2150: 2116: 2062: 2003: 1876: 1828:. Journal of the American Society for Information Sciences and Technology. 61(8), 1517-1534. 1790: 1782: 1724: 1257: 923: 889: 857:
invents an electro-mechanical data tabulator using punch cards as a machine readable medium.
854: 743: 477: 351: 251: 137: 1386:
publish: An Evaluation of Retrieval Effectiveness for a Full-Text Document-Retrieval System
2367: 2355: 2263: 2249: 1825: 1568: 1422: 1209: 1007: 957: 934: 917: 640: 531: 450: 429: 342: 1838:
Goodrum, Abby A. (2000). "Image Information Retrieval: An Overview of Current Research".
Modern Information Retrieval: The Concepts and Technology behind Search (second edition)
2278: 1819:
The Seventeen Theoretical Constructs of Information Searching and Information Retrieval
1595: 1418: 1235: 1070: 953: 876: 714: 578: 467: 457: 117: 2391: 2181: 1804: 1742: 1679: 1163:
was involved in studies at University of Chicago on Requirements for Future Catalogs.
1025: 967: 912: 864: 844: 779: 760: 487: 363: 347: 293:
An object is an entity that is represented by information in a content collection or
142: 112: 1888: 814: 542:
Methods/Techniques in which information retrieval techniques are employed include:
434: 367: 362:. In the 1960s, the first large information retrieval research group was formed by 2307: 1733: â€“ Area of research related to information retrieval centered on timeliness 1694: â€“ Measure of a document's applicability to a given subject or search query 1589: 1160: 382: 269:
Automated information retrieval systems are used to reduce what has been called
147: 2339: 2008: 1991: 1786: 1773:
Luk, R. W. P. (2022). "Why is information retrieval a scientific discipline?".
1736: 2154: 1909: 1748: 1383: 946: 122: 1973:
Bulletin of the IEEE Computer Society Technical Committee on Data Engineering
published text on information retrieval. Becker, Joseph; Hayes, Robert Mayo.
1801: Joseph Marie Jacquard invents the Jacquard loom, the first machine to use punched cards to control a sequence of operations.
First online systems—NLM's AIM-TWX, MEDLINE; Lockheed's Dialog; SDC's ORBIT.
818: 309: 235: 2325:
BCS IRSG: British Computer Society – Information Retrieval Specialist Group
2120: 998:
Proceedings of the International Conference on Scientific Information, 1958
Information retrieval performance evaluation tool @ Athena Research Centre
1880: 389:
has boosted the need for very large scale retrieval systems even further.
Web mining – Process of extracting and discovering patterns in large data sets
Data mining – Process of extracting and discovering patterns in large data sets
which have been coded in any desired way at a rate of 120 words a minute
1795: 1131: 255: 182: 1857:
Foote, Jonathan (1999). "An overview of audio information retrieval".
1985–1993: Key papers on and experimental systems for visualization interfaces.
in information retrieval", which articulated the "cluster hypothesis".
Rocchio classification – A classification model in machine learning based on centroids
the data objects may be, for example, text documents, images, audio,
172: 2294: 1571: â€“ Computer component that stores information for immediate use 2256: 1965: 1521: 1293: 2293:
Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze.
Information Retrieval: Implementing and Evaluating Search Engines
with emphasis on visualization and multi-reference point systems.
mid-1980s: Efforts to develop end-user versions of commercial IR systems.
published "Auto-encoding of documents for information retrieval".
by Addison Wesley, the first book that attempts to cover all IR.
1426: 1905:
Information Retrieval On Mind Maps - What Could It Be Good For?
Conference on Research and Development in Information Retrieval
objects may match the query, perhaps with different degrees of
Obtaining information resources relevant to an information need
Behind the Search Box: Google and the Global Internet Industry
Stefan Büttcher, Charles L. A. Clarke, and Gordon V. Cormack.
1527: 977: 254:
or other content-based indexing. Information retrieval is the
"An Historical Note on the Origins of Probabilistic Indexing"
Information seeking § Compared to information retrieval
Information storage and retrieval: tools, elements, theories
assumption of term vectors or in probabilistic models by an
Special Interest Group on Information Retrieval – Subgroup of the Association for Computing Machinery
International Conference on Theory of Information Retrieval
TREC report on information retrieval evaluation techniques
tf–idf – Estimate of the importance of a word in a document
ACM SIGIR: Information Retrieval Special Interest Group
Cross-language information retrieval – retrieval of information in different languages
1979: C. J. van Rijsbergen published Information Retrieval (Butterworths). Heavy emphasis on probabilistic models.
Information Retrieval Data Structures & Algorithms
A Theory of Term Importance in Automatic Text Analysis
Personal information management – Tools and systems for managing one's own data
In 1992, the US Department of Defense along with the
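Adversarial information retrieval – Information retrieval strategies in datasets
1983: Salton (and Michael J. McGill) published Introduction to Modern Information Retrieval (McGraw-Hill), with heavy emphasis on vector models.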
early 1960s: Gerard Salton began work on IR at Harvard, later moved to Cornell.
1000:(National Academy of Sciences, Washington, DC, 1959) 1902:Beel, Jöran; Gipp, Bela; Stiller, Jan-Olaf (2009). 2206: 1510:Conference on Information and Knowledge Management 2335:Forum for Information Retrieval Evaluation (FIRE) 1930:Frakes, William B.; Baeza-Yates, Ricardo (1992). 1656: â€“ Organization in Vienna, Austria 2006–2012 1623: â€“ Machine reading of unstructured documents 966:: The term "information retrieval" was coined by 1966:"Modern Information Retrieval: A Brief Overview" 1745: â€“ Content-based retrieval of XML documents 1286:(Society for Industrial and Applied Mathematics) 1188:Automatic Information Organization and Retrieval 821:and there may be different shades of relevance. 2306:(U of Illinois Press, 2023) ISBN 10:0252087127 1992:"The History of Information Retrieval Research" 1719:Special Interest Group on Information Retrieval 1602:European Summer School in Information Retrieval 1203:A nonlinear mapping for data structure analysis 786:Models with transcendent term interdependencies 327: 1409:, Matthew Chalmers, Anselm Spoerri and others. 375:National Institute of Standards and Technology 2051:"Automatic Retrieval of Recorded Information" 1959: 1957: 1727: â€“ Classifying a document by index terms 782:of those terms in the whole set of documents. 598:Categorization of IR-models (translated from 517:Information retrieval for chemical structures 208: 8: 2266:. MIT Press, Cambridge, Massachusetts, 2010. 2241:Ricardo Baeza-Yates, Berthier Ribeiro-Neto. 2047:(Zator Technical Bulletin No. 48), cited in 1990:Mark Sanderson & W. Bruce Croft (2012). 1504:European Conference on Information Retrieval 1374:Introduction to Modern Information Retrieval 1372:: Salton (and Michael J. 
McGill) published 1300:A Vector Space Model for Automatic Indexing 800:Evaluation measures (information retrieval) 771:Models with immanent term interdependencies 1598: â€“ Way to obtain data from a database 242:is the task of identifying and retrieving 215: 201: 18: 2275:Library & Information Science Network 2066: 2007: 1870: 1794: 751:Second dimension: properties of the model 664:(Enhanced) Topic-based Vector Space Model 1522:Conference on Web Search and Data Mining 593: 2025:"'Section III. Opening Plenary Session" 1765: 1516:International World Wide Web Conference 1130:National Library of Medicine developed 734:view documents as vectors of values of 103: 28: 21: 2082:Doyle, Lauren; Becker, Joseph (1975). 1577: â€“ Method of organizing knowledge 1339:: Tamas Doszkocs implemented the CITE 277:are the most visible IR applications. 2295:Introduction to Information Retrieval 2143:Information Processing and Management 1688: â€“ Search engine processing step 757:Models without term-interdependencies 385:to huge corpora. The introduction of 7: 2084:Information Retrieval and Processing 1611:Human–computer information retrieval 1581:Cross-language information retrieval 1100:Synonymy and Semantic Classification 794:Performance and correctness measures 2297:. Cambridge University Press, 2008. 615:First dimension: mathematical basis 2378:How eBay measures search relevance 1817:Jansen, B. J. and Rieh, S. (2010) 1715: â€“ Method for data management 1098:finished her thesis at Cambridge, 1056:Information Analysis and Retrieval 246:resources that are relevant to an 14: 2209:Information Storage and Retrieval 2170:Information Storage and Retrieval 1692:Relevance (information retrieval) 1638:Collaborative information seeking 1563:Adversarial information retrieval 1458:Information Storage and Retrieval 1329:: C. J. 
van Rijsbergen published 547:Adversarial information retrieval 2330:Text Retrieval Conference (TREC) 1669:Multimedia information retrieval 720:Divergence-from-randomness model 691:are often used in these models. 513:Geographic information retrieval 2277:. 24 April 2015. Archived from 1682: â€“ Type of search strategy 1674:Personal information management 1341:natural language user interface 988:Case Western Reserve University 168:Library and information science 2361:Information Retrieval Facility 2271:"Information Retrieval System" 1731:Temporal information retrieval 1654:Information Retrieval Facility 767:assumption for term variables. 732:Feature-based retrieval models 659:Generalized vector space model 623:models represent documents as 90:Science and technology studies 1: 2086:. Melville. pp. 410 pp. 700:Probabilistic relevance model 509:Genomic information retrieval 262:that describes data, and for 2205:Korfhage, Robert R. (1997). 2182:10.1016/0020-0271(71)90051-9 1472:Modern Information Retrieval 1470:and Berthier Ribeiro-Neto's 1111:National Bureau of Standards 809:or top-k retrieval, include 557:Multi-document summarization 501:Domain-specific applications 266:of texts, images or sounds. 104:Related fields and subfields 2403:Natural language processing 2252:. Addison-Wesley, UK, 2011. 1244:Computer Lib/Dream Machines 952:) and the invention of the 725:Latent Dirichlet allocation 527:Legal information retrieval 188:Quantum information science 2419: 2349:Information Retrieval Wiki 2049:Fairthorne, R. A. (1958). 2009:10.1109/jproc.2012.2189916 1787:10.1007/s10699-020-09685-x 1643:Social information seeking 1262:Cornelis J. van Rijsbergen 797: 2155:10.1016/j.ipm.2007.02.012 2134:Maron, Melvin E. (2008). 1104:computational linguistics 1084:. New York, Wiley (1963). 
695:Binary Independence Model 520:Information retrieval in 379:Text Retrieval Conference 1700: â€“ type of feedback 1550:Karen Spärck Jones Award 1186:Gerard Salton published 1102:, and continued work on 813:. All measures assume a 678:latent semantic analysis 674:Latent semantic indexing 564:Compound term processing 377:(NIST), cosponsored the 2302:Yeo, ShinJoung. (2023) 1996:Proceedings of the IEEE 1660:Knowledge visualization 1151:Libraries of the Future 574:Document classification 569:Cross-lingual retrieval 552:Automatic summarization 538:Other retrieval methods 2121:10.1002/asi.5090060411 2109:American Documentation 1964:Singhal, Amit (2001). 1936:. Prentice-Hall, Inc. 1775:Foundations of Science 1713:Search engine indexing 1707:Rocchio classification 1621:Information extraction 1264:published "The use of 1137:Project Intrex at MIT. 702:on which is based the 669:Extended Boolean model 636:Extended Boolean model 631:Standard Boolean model 607: 338: 153:Information technology 75:Knowledge organization 2398:Information retrieval 2340:Information Retrieval 2068:10.1093/comjnl/1.1.36 2023:JE Holmstrom (1948). 1881:10.1007/s005300050106 1575:Controlled vocabulary 1540:Tony Kent Strix award 1331:Information Retrieval 1266:hierarchic clustering 1238:promoting concept of 841:Joseph Marie Jacquard 776:dimensional reduction 597: 506:Expert search finding 412:Information filtering 335:J. E. Holmstrom, 1948 228:Information retrieval 163:Intellectual property 133:Computer data storage 2344:C. J. van Rijsbergen 2055:The Computer Journal 1284:A Theory of Indexing 1173:F. Wilfrid Lancaster 1106:as it applies to IR. 
986:: Allen Kent joined 875:used to process the 811:precision and recall 685:Probabilistic models 522:software engineering 401:General applications 271:information overload 158:Intellectual freedom 2041:Mooers, Calvin N.; 1686:Query understanding 1627:Information seeking 1545:Gerard Salton Award 1534:Awards in the field 1468:Ricardo Baeza-Yates 1276:term discrimination 710:Uncertain inference 417:Recommender systems 240:information science 23:Information science 2366:2008-05-22 at the 2354:2015-11-24 at the 2262:2020-10-05 at the 2248:2017-09-18 at the 2213:. Wiley. pp.  1859:Multimedia Systems 1824:2016-03-04 at the 1698:Relevance feedback 1482:Web search engines 1407:Robert R. Korfhage 1382:: David Blair and 1364:Nicholas J. Belkin 1208:2017-08-08 at the 1147:J. C. R. Licklider 1096:Karen Spärck Jones 1076:Joseph Becker and 1048:Cyril W. Cleverdon 706:relevance function 654:Vector space model 608: 602:, original source 584:Question answering 387:web search engines 275:Web search engines 244:information system 2342:(online book) by 2224:978-0-471-14338-3 2093:978-0-471-22151-7 1943:978-0-13-463837-9 1840:Informing Science 1491:Major conferences 1466:: Publication of 1452:: Publication of 1036:Melvin Earl Maron 807:Boolean retrieval 736:feature functions 473:Enterprise search 407:Digital libraries 304:Depending on the 225: 224: 2410: 2290: 2288: 2286: 2229: 2228: 2212: 2202: 2196: 2192: 2186: 2185: 2165: 2159: 2158: 2140: 2131: 2125: 2124: 2104: 2098: 2097: 2079: 2073: 2072: 2070: 2039: 2033: 2032: 2020: 2014: 2013: 2011: 1987: 1981: 1980: 1970: 1961: 1952: 1951: 1946:. Archived from 1927: 1921: 1920: 1918: 1917: 1899: 1893: 1892: 1874: 1854: 1848: 1847: 1835: 1829: 1815: 1809: 1808: 1798: 1770: 1754: 1725:Subject indexing 1703: 1665: 1648: 1616: 1607: 1586: 1403:Donald B. 
Crouch 1258:Nicholas Jardine 924:Atlantic Monthly 890:Emanuel Goldberg 855:Herman Hollerith 744:learning to rank 648:Algebraic models 478:Federated search 447:Speech retrieval 352:Emanuel Goldberg 336: 248:information need 217: 210: 203: 138:Cultural studies 19: 2418: 2417: 2413: 2412: 2411: 2409: 2408: 2407: 2388: 2387: 2368:Wayback Machine 2356:Wayback Machine 2316: 2284: 2282: 2269: 2264:Wayback Machine 2250:Wayback Machine 2238: 2236:Further reading 2233: 2232: 2225: 2204: 2203: 2199: 2193: 2189: 2167: 2166: 2162: 2138: 2133: 2132: 2128: 2106: 2105: 2101: 2094: 2081: 2080: 2076: 2048: 2040: 2036: 2022: 2021: 2017: 1989: 1988: 1984: 1968: 1963: 1962: 1955: 1944: 1929: 1928: 1924: 1915: 1913: 1901: 1900: 1896: 1856: 1855: 1851: 1837: 1836: 1832: 1826:Wayback Machine 1816: 1812: 1772: 1771: 1767: 1762: 1757: 1752: 1701: 1663: 1646: 1614: 1605: 1584: 1569:Computer memory 1558: 1536: 1493: 1423:Tim Berners-Lee 1210:Wayback Machine 1078:Robert M. Hayes 1054:Kent published 1008:Hans Peter Luhn 958:Eugene Garfield 935:Hans Peter Luhn 918:As We May Think 827: 802: 796: 753: 715:Language models 641:Fuzzy retrieval 617: 604:Dominik Kuropka 592: 540: 532:Vertical search 503: 451:Video retrieval 440:Music retrieval 430:Image retrieval 403: 395: 343:As We May Think 337: 334: 326: 283: 221: 192: 99: 30:General aspects 17: 12: 11: 5: 2416: 2414: 2406: 2405: 2400: 2390: 2389: 2386: 2385: 2380: 2375: 2370: 2358: 2346: 2337: 2332: 2327: 2322: 2315: 2314:External links 2312: 2311: 2310: 2299: 2298: 2291: 2281:on 11 May 2020 2267: 2253: 2237: 2234: 2231: 2230: 2223: 2197: 2187: 2176:(5): 217–240. 2160: 2149:(2): 971–972. 2126: 2115:(4): 242–254. 2099: 2092: 2074: 2034: 2015: 1982: 1953: 1950:on 2013-09-28. 1942: 1922: 1894: 1872:10.1.1.39.6339 1849: 1830: 1810: 1781:(2): 427–453. 
1764: 1763: 1761: 1758: 1756: 1755: 1746: 1740: 1734: 1728: 1722: 1716: 1710: 1704: 1695: 1689: 1683: 1677: 1671: 1666: 1657: 1651: 1650: 1649: 1640: 1635: 1624: 1618: 1608: 1599: 1596:Data retrieval 1593: 1587: 1578: 1572: 1566: 1559: 1557: 1554: 1553: 1552: 1547: 1542: 1535: 1532: 1531: 1530: 1524: 1518: 1512: 1506: 1500: 1492: 1489: 1488: 1487: 1486: 1485: 1475: 1461: 1447: 1432: 1431: 1430: 1419:World Wide Web 1412: 1411: 1410: 1399: 1387: 1377: 1367: 1357: 1346: 1345: 1344: 1334: 1324: 1311: 1310: 1309: 1308: 1307: 1297: 1287: 1269: 1251: 1250: 1249: 1248: 1247: 1236:Theodor Nelson 1233: 1217: 1216: 1215: 1214: 1213: 1195: 1194: 1191: 1183: 1182: 1166: 1165: 1164: 1154: 1140: 1139: 1138: 1135: 1120: 1119: 1118: 1107: 1087: 1086: 1085: 1074: 1071:Alvin Weinberg 1061: 1060: 1059: 1052: 1039: 1029: 1013: 1012: 1011: 1001: 991: 981: 971: 961: 954:citation index 940: 939: 938: 928: 895: 894: 893: 882: 881: 880: 877:1890 US Census 858: 848: 826: 823: 798:Main article: 795: 792: 791: 790: 783: 768: 752: 749: 748: 747: 729: 728: 727: 722: 717: 712: 707: 697: 689:Bayes' theorem 682: 681: 680: 671: 666: 661: 656: 645: 644: 643: 638: 633: 616: 613: 591: 588: 587: 586: 581: 579:Spam filtering 576: 571: 566: 561: 560: 559: 549: 539: 536: 535: 534: 529: 524: 518: 515: 510: 507: 502: 499: 498: 497: 496: 495: 490: 485: 480: 475: 470: 468:Desktop search 465: 458:Search engines 455: 454: 453: 448: 445: 442: 437: 432: 427: 421: 420: 419: 409: 402: 399: 394: 391: 332: 325: 322: 282: 279: 223: 222: 220: 219: 212: 205: 197: 194: 193: 191: 190: 185: 180: 175: 170: 165: 160: 155: 150: 145: 140: 135: 130: 128:Classification 125: 120: 118:Categorization 115: 109: 106: 105: 101: 100: 98: 97: 92: 87: 82: 77: 72: 67: 62: 57: 52: 47: 42: 36: 33: 32: 26: 25: 15: 13: 10: 9: 6: 4: 3: 2: 2415: 2404: 2401: 2399: 2396: 2395: 2393: 2384: 2381: 2379: 2376: 2374: 2371: 2369: 2365: 2362: 2359: 2357: 2353: 2350: 2347: 2345: 2341: 2338: 2336: 2333: 2331: 2328: 2326: 2323: 2321: 2318: 2317: 2313: 2309: 


Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.
