Cranfield experiments

The Cranfield experiments were a series of experimental studies in information retrieval conducted by Cyril W. Cleverdon at the College of Aeronautics, today known as Cranfield University, in the 1960s to evaluate the efficiency of indexing systems. The experiments were broken into two main phases, neither of which was computerized. The entire collection of abstracts, resulting indexes, and results were later distributed in electronic format and were widely used for decades.

In the first series of experiments, several existing indexing methods were compared to test their efficiency. The queries were generated by the authors of the papers in the collection and then translated into index lookups by experts in those systems. In this series, one method went from least efficient to most efficient after minor changes to the way the data was arranged on the index cards. The conclusion appeared to be that the underlying methodology mattered less than specific details of the implementation. This led to considerable debate about the methodology of the experiments.

These criticisms also led to the second series of experiments, now known as Cranfield 2. Cranfield 2 attempted to gain additional insight by reversing the methodology: Cranfield 1 tested the ability of experts to find a specific resource by following the index system, while Cranfield 2 instead studied the results of asking human-language questions and seeing whether the indexing system provided a relevant answer, regardless of whether it was the original target document. It too was the topic of considerable debate.

The Cranfield experiments were extremely influential in the information retrieval field, itself a subject of considerable interest in the post-World War II era when the quantity of scientific research was exploding. They were the topic of continual debate for years and led to several computer projects to test their results. Their influence was considerable over a forty-year period, before natural language indexes like those of modern web search engines became commonplace.

Background

The now-famous July 1945 article "As We May Think" by Vannevar Bush is often pointed to as the first complete description of the field that became information retrieval. The article describes a hypothetical machine known as the "memex" that would hold all of mankind's knowledge in an indexed form that would allow it to be retrieved by anyone.

In 1948, the Royal Society held the Scientific Information Conference, which first explored some of these concepts on a formal basis. This led to a small number of experiments in the field in the UK, the US, and the Netherlands. The only major effort to compare different systems was led by Gull, using the collection of works from the Armed Forces Technical Information Agency, which had started as a collection of aeronautics reports captured in Germany at the end of World War II. Judging of the results was carried out by experts in the two systems, and they never agreed on whether various retrieved documents were relevant to the search, with each group rejecting over 30% of the results as wrong. Further testing was cancelled, as there appeared to be no way to reach a consensus.

A second conference on the topic, the International Conference on Scientific Information, was held in Washington, DC in 1958, by which time computer development had reached the point where automatic index retrieval was possible. It was at this meeting that Cyril W. Cleverdon "got the bit between his teeth" and managed to arrange funding from the US National Science Foundation to start what would later be known as Cranfield 1.

Cranfield 1

The first series of experiments directly compared four indexing systems that represented significantly different conceptual underpinnings. The four systems were:

- the Universal Decimal Classification, a hierarchical system then being widely introduced in libraries;
- the Alphabetical Subject Catalogue, which alphabetized subject headings in classic library index card collections;
- the Faceted Classification Scheme, which allows combinations of subjects to produce new subjects; and
- Mortimer Taube's Uniterm system of co-ordinate indexing, in which a reference may be found on any number of separate index cards.
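Of these, Taube's co-ordinate indexing maps most directly onto modern practice: each term "card" lists every document about that term, and a multi-term query is answered by intersecting the cards, which is essentially an inverted index. A minimal sketch, using invented documents rather than items from the actual Cranfield collection:

```python
# Uniterm co-ordinate indexing restated in modern terms as an inverted
# index: one "card" per term, listing the ids of every document about
# that term. A multi-term query is answered by intersecting the cards.
# The documents below are invented examples.
docs = {
    1: "supersonic aerofoil boundary layer",
    2: "boundary layer control in wind tunnels",
    3: "supersonic wind tunnels",
}

# Build the cards: term -> set of document ids.
cards = {}
for doc_id, text in docs.items():
    for term in text.split():
        cards.setdefault(term, set()).add(doc_id)

def lookup(*terms):
    """Return the ids found on every requested card (co-ordination)."""
    sets = [cards.get(t, set()) for t in terms]
    return set.intersection(*sets) if sets else set()

print(lookup("supersonic", "boundary"))  # {1}: only doc 1 carries both terms
```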
In an early series of experiments, participants were asked to create indexes for a collection of aerospace-related documents. Each index was prepared by an expert in that methodology. The authors of the original documents were then asked to prepare a set of search terms that should return that document. The indexing experts were then asked to generate queries into their index based on the authors' search terms. The queries were then used to examine the index to see whether it returned the target document.

In these tests, all but the faceted system produced roughly equal numbers of "correct" results, while the faceted concept lagged. Studying these results, the faceted system was re-indexed using a different format on the cards and the tests were re-run. In this series of tests, the faceted system was now the clear winner. This suggested that the underlying theory behind a system was less important than the specifics of its implementation.

The outcome of these experiments, published in 1962, generated enormous debate, both among the supporters of the various systems and among researchers who complained about the experiments as a whole. Nevertheless, one conclusion appeared clearly supported: simple systems based on keywords appeared to work just as well as complex classificatory schemes. This is important, as the former are dramatically easier to implement.

Cranfield 2

In the first series of experiments, experts in the use of the various techniques were tasked with both the creation of the index and its use against the sample queries. Each system had its own concept of how a query should be structured, which would today be known as a query language. Much of the criticism of the first experiments focused on whether they were truly testing the systems or the users' ability to translate their questions into the query language.

This led to the second series of experiments, Cranfield 2, which considered the question of converting the query into that language. To do this, instead of treating the generation of the query as a black box, each step was broken down. The outcome of this approach was revolutionary at the time; it suggested that the search terms be left in their original format, what would today be known as a natural language query.

Another major change was how the results were judged. In the original tests, a success occurred only if the index returned the exact document that had been used to generate the search. However, this was not typical of an actual query: a user looking for information on aircraft landing gear might be happy with any of the collection's many papers on the topic, but Cranfield 1 would consider such a result a failure in spite of returning relevant materials. In the second series, the results were judged by third parties, who gave a qualitative answer on whether the query generated a relevant set of papers, as opposed to returning a specified original document.

Continued debate

The results of the two test series continued to be a subject of considerable debate for years. In particular, they led to a running debate between Cleverdon and Jason Farradane, one of the founders of the Institute of Information Scientists in 1958. The two would invariably appear at meetings where the other was presenting and then, during the question-and-answer period, explain why everything the other was doing was wrong. The debate has been characterized as "...fierce and unrelenting, sometimes well beyond the boundaries of civility." This chorus was joined by Don R. Swanson in the US, who published a critique of the Cranfield experiments a few years later.

In spite of these criticisms, Cranfield 2 set the bar by which many following experiments were judged. In particular, Cranfield 2's methodology of starting with natural language terms and judging the results by relevance, rather than exact matches, became almost universal in subsequent experiments in spite of many objections.

Influence

With the conclusion of Cranfield 2 in 1967, the entire corpus was published in machine-readable form. Today this is known as the Cranfield 1400, or some variation on that theme. The name refers to the number of documents in the collection, which consists of 1398 abstracts. The collection also includes 225 queries and the relevance judgments of all query:document pairs that resulted from the experimental runs. The main database of abstracts is about 1.6 MB.
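A collection of this shape (documents, queries, and query:document relevance judgments) is what makes Cranfield-style evaluation mechanical: given a system's retrieved documents for a query, measures such as precision and recall follow directly from the judgments. A minimal sketch, using invented judgments rather than the actual Cranfield 1400 data:

```python
# Cranfield-style evaluation: given relevance judgments ("qrels") for
# query:document pairs, score a system by what it retrieved rather than
# by whether it returned one specific target document. The judgments
# here are invented.

def precision_recall(retrieved, relevant):
    """Fraction of retrieved documents that are relevant, and
    fraction of relevant documents that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical judgments: query id -> documents judged relevant.
qrels = {"Q1": {"d3", "d7", "d9"}}

# A system answers Q1 with three documents, two of them relevant.
p, r = precision_recall(["d3", "d7", "d12"], qrels["Q1"])
print(p, r)  # precision 2/3, recall 2/3
```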
The experiments were carried out in an era when computers had a few kilobytes of main memory and network access to perhaps a few megabytes. For instance, the mid-range IBM System/360 Model 50 shipped with 64 to 512 kB of core memory (tending toward the lower end), and its typical hard drive stored just over 80 MB. As the capabilities of systems grew through the 1960s and 1970s, the Cranfield document collection became a major testbed corpus that was used repeatedly for many years.

Today the collection is too small for practical testing beyond pilot experiments. Its place has mostly been taken by the TREC collection, which contains 1.89 million documents across a wider array of subjects, and by the even more recent GOV2 collection of 25 million web pages.

See also

ASLIB
Information history

References

Buckland, Michael K. (May 1992). "Emanuel Goldberg, Electronic Document Retrieval, and Vannevar Bush's Memex". Journal of the American Society for Information Science. 43 (4): 284–294. doi:10.1002/(SICI)1097-4571(199205)43:4<284::AID-ASI3>3.0.CO;2-0.
Cleverdon, C. W. (1960). "The Aslib Cranfield Research Project on the Comparative Efficiency of Indexing Systems". ASLIB Proceedings. 12 (12). Emerald: 421–431. doi:10.1108/eb049778. ISSN 0001-253X.
Cleverdon, Cyril (1967). "The Cranfield Tests on Index Language Devices". ASLIB Proceedings. 19 (6). Emerald: 173–194. doi:10.1108/eb050097. ISSN 0001-253X.
Cleverdon, C. W.; Keen, E. M. (1966). Factors determining the performance of indexing systems. Vol. 1: Design, Vol. 2: Results. Cranfield, UK: Aslib Cranfield Research Project.
Gull, Cloyd (1 October 1956). "Seven years of work on the organization of materials in the special library". American Documentation. 7 (4): 320–329. doi:10.1002/asi.5090070408.
IBM System/360 Model 50 Functional Characteristics (PDF). IBM. 1967. A22-6898-1.
"IBM Archives: IBM 1302 disk storage unit". IBM. 2003-01-23. Retrieved 2011-07-20.
Lancaster, F. W. (1965). "A case study in the application of Cranfield system evaluation techniques". Journal of Chemical Documentation. 5 (2): 92–96.
Manning, Christopher; Raghavan, Prabhakar; Schütze, Hinrich (2008). "Standard test collections". Introduction to Information Retrieval. Cambridge University Press.
Richmond, Phyllis A. (1963). "Review of the Cranfield project". American Documentation. 14 (4): 307–311. doi:10.1002/asi.5090140408. ISSN 0096-946X.
Robertson, Stephen (2008). "On the history of evaluation in IR". Journal of Information Science. 34 (4): 439–456. doi:10.1177/0165551507086989. S2CID 8032578.
Saracevic, Tefko (2016). The Notion of Relevance in Information Science. Morgan & Claypool. ISBN 9781598297690.

External links

Cranfield 1400 corpus
Cranfield papers in ACM SIGIR Museum