The Lemur Project is a collaboration between the Center for Intelligent Information Retrieval at the University of Massachusetts Amherst and the Language Technologies Institute at Carnegie Mellon University. The Lemur Project develops search engines, browser toolbars, text analysis tools, and data resources that support research and development of information retrieval and text mining software. The project is best known for its Indri and Galago search engines, the ClueWeb09 and ClueWeb12 datasets, and the RankLib learning-to-rank library. The software and datasets are used widely in scientific and research applications, as well as in some commercial applications.

The Lemur Project's software development philosophy emphasizes state-of-the-art accuracy, flexibility, and efficiency. For example, the Indri search engine provides accurate search for large text collections 'out of the box', and data is stored in an accessible manner to support development of new retrieval strategies. Software from the Lemur Project is distributed under open-source licenses that provide flexibility to scientists and software developers.

Lemur is written in C, C++, and Java, and it is distributed with its source files and build instructions. The provided source code can be modified to develop new libraries. It is compatible with several operating systems, including Linux and Windows.

Features

Lemur supports the following features:

Indexing:
- English, Chinese, and Arabic text
- Word stemming
- Stop words
- Tokenization
- Passage and incremental indexing

Retrieval:
- Ad hoc retrieval (TF-IDF and InQuery)
- Passage and cross-lingual retrieval
- Language modeling
- Query model updating
- Two-stage smoothing
- Relevance feedback
- Structured query language
- Wildcard term matching

Distributed IR:
- Query-based sampling
- Database-based ranking (CORI)
- Results merging

Other:
- Document clustering
- Summarization
- Simple text processing

Components

The Lemur Project has the following components:
- Indri search engine in C++
- Galago search engine research framework in Java
- RankLib learning-to-rank library
- Sifaka data mining application
- ClueWeb09 and ClueWeb12 datasets
- Query Log Toolbar

Latest Version

Updates to the Lemur Project components are made twice a year, in June and December. The latest version of the Indri search engine is 5.17. The latest version of the Galago search engine is 3.18. The latest version of the RankLib learning-to-rank library is 2.14. The latest version of the Sifaka data mining application is 1.8.

Indri Search Engine

The Indri search engine is one of the components developed by the Lemur Project. It is open source. The query language used in Indri allows researchers to index data or structure documents using simple command-line instructions. Indri can be adapted to a variety of current applications. It can also be distributed across a cluster of nodes for high performance. The Indri search engine can handle large collections of data and supports data formats such as HTML and XML.

The Indri API supports programming and scripting languages such as C++, Java, C#, and PHP.

Features of Indri Search Engine

- Can make use of multiple document representations
- Explicit term weighting
- Robust query language
- Formally well-grounded
- Highly effective
- Can be efficiently implemented

See also

- List of information retrieval libraries

External links

- The Lemur Project website