The Overhead Imagery Research Data Set (OIRDS) is an annotated library of imagery and tools. OIRDS v1.0 is composed of passenger vehicle objects annotated in overhead imagery. Passenger vehicles in the OIRDS include cars, trucks, vans, etc. In addition to the object outlines, the OIRDS includes subjective and objective statistics that quantify the vehicle within the image's context. For example, subjective measures of image clutter, clarity, noise, and vehicle color are included along with more objective statistics such as ground sample distance
MICC-Flickr 101 corrects the main drawback of Caltech 101, i.e. its low inter-class variability, and provides social annotations through user tags. It builds on a standard and widely used data set composed of a manageable number of categories (101) and therefore can be used to compare object categorization performance in a constrained scenario (Caltech 101) and object categorization "in the wild" (MICC-Flickr 101) on the same 101 categories.
Images are very uniform in presentation, aligned from left to right, and usually not occluded. As a result, the images are not always representative of practical inputs that the algorithm might later expect to see. Under practical conditions, images are more cluttered, occluded and display greater
techniques is the fact that most groups use their own data sets. Each set may have different properties that make reported results from different methods harder to compare directly. For example, differences in image size, image quality, relative location of objects within the images and level of
Caltech 256 is another image data set, created in 2007. It is a successor to Caltech 101 and is intended to address some of its weaknesses. Overall, it is a more difficult data set than Caltech 101, but it suffers from comparable problems. It includes
Object Recognition with Features Inspired by Visual Cortex. T. Serre, L. Wolf and T. Poggio. Proceedings of 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), IEEE Computer Society Press, San Diego, June 2005
VOC 2008 is a European effort to collect images for benchmarking visual categorization methods. Compared to Caltech 101/256, a smaller number of categories (about 20) are collected. The number of images in each category, however, is larger.
A set of annotations is provided for each image. Each set of annotations contains two pieces of information: the general bounding box in which the object is located and a detailed human-specified outline enclosing the object.
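As a rough illustration of the two pieces of information in each annotation, the sketch below models an annotation as a bounding box plus a polygon outline and checks that the outline lies inside the box. The class name, field names and coordinate convention are hypothetical and are not meant to reflect the actual layout of the Caltech 101 annotation files.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in pixel coordinates


@dataclass
class Annotation:
    """Hypothetical container for one image's annotation."""
    box: Tuple[float, float, float, float]  # (left, top, right, bottom)
    outline: List[Point]                    # vertices of the hand-drawn outline


def outline_bounds(outline: List[Point]) -> Tuple[float, float, float, float]:
    """Return the tightest axis-aligned box around the outline vertices."""
    xs = [x for x, _ in outline]
    ys = [y for _, y in outline]
    return (min(xs), min(ys), max(xs), max(ys))


def outline_fits_box(ann: Annotation) -> bool:
    """Check that every outline vertex lies within the bounding box."""
    left, top, right, bottom = ann.box
    o_left, o_top, o_right, o_bottom = outline_bounds(ann.outline)
    return left <= o_left and top <= o_top and o_right <= right and o_bottom <= bottom


# Made-up example values, for illustration only.
example = Annotation(box=(10, 20, 110, 90),
                     outline=[(15, 25), (100, 30), (105, 85), (20, 80)])
```

A consistency check of this kind can be useful when converting annotations between formats, since the outline is the more detailed of the two descriptions.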
Due to its open nature, LabelMe has many more images covering a much wider scope than Caltech 101. However, since each person decides what images to upload, and how to label and annotate each image, the images are less consistent.
Almost all the images within each category are uniform in image size and in the relative position of interest objects. Caltech 101 users generally do not need to crop or scale images before they can be used.
Algorithms concerned with recognition usually function by storing features unique to the object. However, most images taken have varying degrees of background clutter, which means algorithms may build their models incorrectly.
However, a follow-up study demonstrated that tests based on uncontrolled natural images (like the Caltech 101 data set) can be seriously misleading, potentially guiding progress in the wrong direction.
Some weaknesses of the Caltech 101 data set may be conscious trade-offs, but others are limitations of the data set. Papers that rely solely on Caltech 101 are frequently rejected.
The Caltech 101 data set was used to train and test several computer vision recognition and classification algorithms. The first paper to use Caltech 101 was an incremental Bayesian
Combining Generative Models and Fisher Kernels for Object Class Recognition. A. D. Holub, M. Welling and P. Perona. International Conference on Computer Vision (ICCV), 2005
The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features. K. Grauman and T. Darrell. International Conference on Computer Vision (ICCV), 2005
variance in relative position and orientation of interest objects. The uniformity allows concepts to be derived using the average of a category, which is unrealistic.
Historically, most data sets used in computer vision research have been tailored to the specific needs of the project being worked on. A large problem in comparing
Multiclass Object Recognition with Sparse, Localized Features. Jim Mutch and David G. Lowe. pp. 11–18, CVPR 2006, IEEE Computer Society Press, New York, June 2006
The Caltech 101 data set consists of a total of 9,146 images, split between 101 different object categories, as well as an additional background/clutter category.
Each object category contains between 40 and 800 images. Common and popular categories such as faces tend to have a larger number of images than others.
algorithms function by training on example inputs. They require a large and varied set of training data to work effectively. For example, the real-time
Available for general use, Caltech 101 acts as a common standard by which to compare different algorithms without bias due to different data sets.
Using Dependent Regions for Object Categorization in a Generative Framework. G. Wang, Y. Zhang, and L. Fei-Fei. IEEE Comp. Vis. Patt. Recog. 2006
A Matlab script is provided with the annotations. It loads an image and its corresponding annotation file and displays them as a Matlab figure.
approach to one-shot learning, an attempt to classify an object using only a few examples, by building on prior knowledge of other classes.
Empirical Study of Multi-Scale Filter Banks for Object Categorization. M.J. Marín-Jiménez and N. Pérez de la Blanca. December 2005
classification and categorization. Caltech 101 contains a total of 9,146 images, split between 101 distinct object categories (
SVM-KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition. Hao Zhang, Alex Berg, Michael Maire, Jitendra Malik. CVPR, 2006
were mirrored to be left to right aligned and vertically oriented structures such as buildings were rotated to be off axis.
(CSAIL). LabelMe takes a different approach to the problem of creating a large image data set, with different trade-offs.
Because some categories contain as few as 31 images, the number of images used for training must be less than or equal to 30, which is not sufficient for all purposes.
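A minimal sketch of why the smallest category caps the training-set size: assuming the same number of training images is drawn from every category and at least one image per category is held out for testing, the per-category training count cannot exceed the smallest category size minus one. The function names, category names and file names below are hypothetical, and the category sizes are illustrative apart from the 31-image minimum reported for the smallest categories.

```python
import random
from typing import Dict, List, Tuple


def max_train_per_category(counts: Dict[str, int], min_test: int = 1) -> int:
    """Largest per-category training size that leaves min_test test images in every category."""
    return min(counts.values()) - min_test


def split_category(images: List[str], n_train: int, seed: int = 0) -> Tuple[List[str], List[str]]:
    """Randomly pick n_train images for training; the remainder is used for testing."""
    if n_train >= len(images):
        raise ValueError("n_train must leave at least one test image")
    shuffled = images[:]
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Illustrative category sizes (hypothetical except for the 31-image minimum).
counts = {"small_category": 31, "medium_category": 80, "large_category": 800}
n_train = max_train_per_category(counts)

# Split the smallest category: 30 training images, 1 test image.
train, test = split_category([f"img_{i:04d}.jpg" for i in range(31)], n_train)
```

With these assumptions the smallest category yields 30 training images and a single test image, which is why larger training sets are not possible without dropping or rebalancing categories.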
Shape Matching and Object Recognition using Low Distortion Correspondence. Alexander C. Berg, Tamara L. Berg, Jitendra Malik. CVPR 2005
The Caltech 101 images, along with the annotations, were used for another one-shot learning paper at Caltech.
Some images have been rotated and scaled from their original orientation, and suffer from some amount of artifacts or aliasing.
MICC-Flickr 101 is an image data set created at the Media Integration and Communication Center (MICC), University of Florence
Many categories are represented, which suits both single and multiple class recognition algorithms.
Users may add images to the data set by upload, and add labels or annotations to existing images.
Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories. Svetlana Lazebnik, Cordelia Schmid, and Jean Ponce. CVPR, 2006
Certain categories are not represented as well as others, containing as few as 31 images.
The Caltech 101 data set represents only a small fraction of possible object categories.
method used by Paul Viola and Michael J. Jones was trained on 4,916 hand-labeled faces.
Cropping, re-sizing and hand-marking points of interest is tedious and time-consuming.
Other computer vision papers that report using the Caltech 101 data set include:
The Caltech 101 data set aims at alleviating many of these common problems.
etc.) and a background category. Provided with the images are a set of annotations
Each image is about 300x200 pixels. Images of oriented objects such as
research and techniques and is most applicable to techniques involving
106,739 images, 41,724 annotated images, and 203,363 labeled objects.
Caltech 101 has several advantages over other similar data sets:
in 2012. It is based on Caltech 101 and is collected from Flickr.
MIT Computer Science and Artificial Intelligence Laboratory
occlusion and clutter present can lead to varying results.
Randomized Caltech 101 download page (Includes download)
http://www.vision.caltech.edu/Image_Datasets/Caltech256/
http://www.vision.caltech.edu/Image_Datasets/Caltech101/
http://www.micc.unifi.it/vim/datasets/micc-flickr-101/
$\mathrm{N}_{\mathrm{train}} \leq 30$
30,607 images, covering a larger number of categories
describing the outlines of each image, along with a Matlab script for viewing.
Minimum number of images per category raised to 80
Limited to passenger vehicles in overhead imagery
List of datasets for machine learning research
~900 images, containing ~1800 annotated images
Marco Andreetto, Marc'Aurelio Ranzato and Pietro Perona
MICC-Flickr101 Homepage (Includes download)
Aliasing and artifacts due to manipulation:
Caltech 256 Homepage (Includes download)
Caltech 101 Homepage (Includes download)
Caltech 101 is a data set of digital images created in September 2003 and compiled by Fei-Fei Li
LabelMe is an open, dynamic data set created at MIT Computer Science and Artificial Intelligence Laboratory
(GSD), time of day, and day of year.
http://www2.it.lut.fi/project/visiq/
More variation in image presentation
Some categories contain few images:
Detailed object outlines are marked.
The images are cropped and re-sized.
~60 statistical measures per object
Overhead Imagery Research Data Set
California Institute of Technology
Images are not left-right aligned
Wide variation in object context
Low level of clutter/occlusion:
It is intended to facilitate computer vision
Uniform size and presentation:
Limited number of categories:
http://labelme.csail.mit.edu/
and Jean Ponce. CVPR, 2006
The data set is too clean:
~30 annotations per object
Most computer vision and machine learning
Analysis and comparison
University of Florence
ground sample distance
Weaknesses include:
Detailed annotations
LabelMe Homepage
Svetlana Lazebnik
image recognition
Dataset of images
This means that
machine learning
Other data sets
Cordelia Schmid
computer vision
computer vision
External links
MNIST database
Jitendra Malik
Jitendra Malik
face detection
digital images
for viewing.
Pietro Perona
incorrectly.
CVPR, 2006
consistent.
Caltech 256
Annotations
motorcycles
annotations
Caltech 101
Weaknesses
Advantages
Fei-Fei Li
artifacts
airplanes
See also
aliasing
Bayesian
Data set
data set
LabelMe
larger.
LabelMe
Purpose
watches
Flickr
Images
script
Matlab
pianos
used.
faces
2005
2005
CVPR
Uses
ants