LabelMe - Knowledge (XXG)

instead they assigned words to senses manually. At first, this may seem like a daunting task since new labels are added to the LabelMe project continuously. To the right is a graph comparing the growth of polygons to the growth of words (descriptions). As you can see, the growth of words is small compared with the continuous growth of polygons, and therefore is easy enough to keep up to date manually by the LabelMe team.

is closed, a bubble pops up on the screen that lets the user enter a label for the object. The user can choose whatever label they think best describes the object. A user who disagrees with an existing label can click on the object's outline polygon and either delete the polygon completely or edit the text label to give it a new name.

to draw a polygon containing an object in the image. For example, in the adjacent image, if a person were standing in front of the building, the user could click on a point on the border of the person and continue clicking along the outside edge until returning to the starting point. Once the polygon

is a database of words organized in a structured way. It allows a word to be assigned to a category or, in WordNet terms, a sense. Sense assignment is not easy to do automatically. When the authors of LabelMe tried automatic sense assignment, they found that it was prone to a high rate of error, so

The LabelMe dataset has some problems. Some are inherent in the data, such as the objects in the images not being uniformly distributed with respect to size and image location. This is due to the images being primarily taken by humans who tend to focus the camera on interesting objects in a scene.

support. When the tool is loaded, it chooses a random image from the LabelMe dataset and displays it on the screen. If the image already has object labels associated with it, they will be overlaid on top of the image in polygon format. Each distinct object label is displayed in a different color.

The motivation behind creating LabelMe comes from the history of publicly available data for computer vision researchers. Most available data was tailored to a specific research group's problems, which forced new researchers to collect additional data to solve their own problems. LabelMe was

The LabelMe project provides a set of tools for using the LabelMe dataset from Matlab. Since research is often done in Matlab, this allows the integration of the dataset with existing tools in computer vision. The entire dataset can be downloaded and used offline, or the toolbox allows dynamic

The creators of LabelMe decided to leave these decisions up to the annotator. The reason for this is that they believe people will tend to annotate the images according to what they think is the natural labeling of the images. This also provides some variability in the data, which can help

As soon as changes are made to the image by the user, they are saved and openly available for anyone to download from the LabelMe dataset. In this way, the data is always changing due to contributions by the community of users who use the tool. Once the user is finished with an image, the

as above since the person is not part of the building. Instead, they are two separate objects that happen to overlap. To automatically determine which object is the foreground and which is the background, the authors of LabelMe propose several options:

Since the text labels for objects provided in LabelMe come from user input, there is a lot of variation in the labels used (as described above). Because of this, analysis of objects can be difficult. For example, a picture of a dog might be labeled as

of a class of objects instead of single instances of an object. For example, a traditional dataset may have contained images of dogs, each of the same size and orientation. In contrast, LabelMe contains images of dogs at multiple angles, sizes, and

If an object is completely contained within another object, then the inner object must be in the foreground. Otherwise, it would not be visible in the image. The only exception is with transparent or translucent objects, but these occur

However, cropping and rescaling the images randomly can simulate a uniform distribution. Other problems are caused by the amount of freedom given to the users of the annotation tool. Some problems that arise are:
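
As a rough illustration of the cropping-and-rescaling idea, the sketch below picks a random crop window and maps an annotation polygon into the coordinates of the rescaled crop. The function names, the `min_frac` parameter, and the output size are assumptions made for this example; they are not part of the LabelMe tools.

```python
import random


def random_crop_box(width, height, min_frac=0.3, rng=random):
    """Return (left, top, right, bottom) for a random crop of a width x height image.

    min_frac is a hypothetical lower bound on the crop size, expressed as a
    fraction of the full image side.
    """
    crop_w = max(1, int(width * rng.uniform(min_frac, 1.0)))
    crop_h = max(1, int(height * rng.uniform(min_frac, 1.0)))
    left = rng.randint(0, width - crop_w)
    top = rng.randint(0, height - crop_h)
    return left, top, left + crop_w, top + crop_h


def polygon_in_crop(polygon, box, out_size):
    """Map polygon vertices into a crop that is rescaled to out_size = (w, h)."""
    left, top, right, bottom = box
    scale_x = out_size[0] / (right - left)
    scale_y = out_size[1] / (bottom - top)
    return [((x - left) * scale_x, (y - top) * scale_y) for x, y in polygon]


if __name__ == "__main__":
    box = random_crop_box(640, 480)
    print(box, polygon_in_crop([(100, 120), (200, 120), (150, 300)], box, (256, 256)))
```

Vertices that fall outside the crop end up with coordinates outside the output rectangle, so a real pipeline would also clip or discard such polygons.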

in the intersecting areas is compared to the color histograms of the two objects. The object with the closer color histogram is taken to be the foreground. This method is less accurate than counting the polygon
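
The histogram comparison can be sketched as follows, using histogram intersection in the spirit of Swain and Ballard's color indexing. This is a minimal sketch under stated assumptions: pixels are given as lists of 8-bit RGB tuples, the histogram uses 4 bins per channel, and all function names are hypothetical.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Coarse RGB histogram over 8-bit pixels, normalised to sum to 1."""
    step = 256 // bins_per_channel
    hist = [0.0] * bins_per_channel ** 3
    for r, g, b in pixels:
        # Clamp each bin index so that odd bin counts cannot overflow.
        ri = min(r // step, bins_per_channel - 1)
        gi = min(g // step, bins_per_channel - 1)
        bi = min(b // step, bins_per_channel - 1)
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def histogram_intersection(h1, h2):
    """Sum of bin-wise minima; equals 1.0 for identical normalised histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def likely_foreground(overlap_pixels, pixels_a, pixels_b):
    """Return "a" or "b": the object whose colors best match the overlap region."""
    overlap_hist = color_histogram(overlap_pixels)
    score_a = histogram_intersection(overlap_hist, color_histogram(pixels_a))
    score_b = histogram_intersection(overlap_hist, color_histogram(pixels_b))
    return "a" if score_a >= score_b else "b"
```

Ties are resolved in favour of the first object here, which is an arbitrary choice for the example.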

The LabelMe annotation tool provides a means for users to contribute to the project. The tool can be accessed anonymously or by logging into a free account. To access the tool, users must have a compatible

The user has to describe the shape of the object themselves by outlining a polygon. Should the fingers of a hand on a person be outlined with detail? How much precision must be used when outlining objects?

Complex annotation: Instead of labeling an entire image (which also limits each image to containing a single object), LabelMe allows annotation of multiple objects within an image by specifying a

Having a large dataset of objects where overlap is allowed provides enough data to try to categorize objects as being a part of another object. For example, most of the labels assigned

Another instance of object overlap is when one object is actually on top of the other. For example, an image might contain a person standing in front of a building. The person is not a

The object with more polygon points inside the intersecting area is most likely the foreground. The authors tested this hypothesis and found it to be highly accurate.
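
One way to read this heuristic: for two overlapping polygons, count how many vertices of each polygon lie inside the other, and treat the polygon with the higher count as the foreground. The sketch below follows that reading. It is an assumption about how the counting might be done, not the authors' implementation, and it uses a standard ray-casting point-in-polygon test.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point lies strictly inside polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def likely_foreground_by_points(poly_a, poly_b):
    """Return "a", "b", or None (tie) based on vertex counts in the overlap."""
    a_in_b = sum(point_in_polygon(p, poly_b) for p in poly_a)
    b_in_a = sum(point_in_polygon(p, poly_a) for p in poly_b)
    if a_in_b == b_in_a:
        return None
    return "a" if a_in_b > b_in_a else "b"


if __name__ == "__main__":
    person = [(4, 3), (5, 2), (6, 3), (6, 12), (4, 12)]
    building = [(0, 0), (10, 0), (10, 10), (0, 10)]
    print(likely_foreground_by_points(person, building))
```

The intuition is that an annotator tracing the foreground object places vertices along its visible outline, including the part that crosses the background object, whereas the background outline is usually hidden in the overlap.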

created to solve several common shortcomings of available data. The following is a list of qualities that distinguish LabelMe from previous work.

This algorithm allows the automatic classification of parts of an object when the part objects are frequently contained within the outer object.
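
The object-part score used in this section, N_{O,P} / (N_P + α) with overlap threshold β, can be sketched in a few lines. The defaults β = 0.5 and α = 5 are the values the article attributes to the LabelMe authors. The function names, and the simplification of one overlap score per image that contains the part, are assumptions for this example.

```python
def overlap_score(intersection_area, part_area):
    """S_{O,P}: intersection area divided by the area of the part polygon."""
    if part_area <= 0:
        return 0.0
    return intersection_area / part_area


def object_part_score(overlap_scores, beta=0.5, alpha=5.0):
    """Score label P as a part of label O.

    overlap_scores holds one S_{O,P} value per image in I_P (use 0.0 when the
    image contains the part but no overlapping object).
    """
    n_p = len(overlap_scores)                           # N_P
    n_op = sum(1 for s in overlap_scores if s > beta)   # N_{O,P}
    return n_op / (n_p + alpha)


if __name__ == "__main__":
    # Three of five "wheel" images overlap a "car" strongly: 3 / (5 + 5).
    print(object_part_score([0.9, 0.8, 0.1, 0.0, 0.95]))
```

The α term keeps the score low when only a handful of images contain the part, so a label seen twice and overlapping twice does not outrank a label with hundreds of consistent examples.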

person be labeled? Should an occluded part of an object be included when outlining the object? Should the sky be labeled?

Once WordNet assignment is done, searches in the LabelMe database are much more effective. For example, a search for

research. As of October 31, 2010, LabelMe has 187,240 images, 62,197 annotated images, and 658,992 labeled objects.

The dataset is dynamic, free to use, and open to public contribution. The most applicable use of LabelMe is in

$\frac{N_{O,P}}{N_P + \alpha}$

$\frac{A(O \cap P)}{A(P)}$

One of the objects could be labeled as something that cannot be in the foreground. Examples are

$I_{O,P} \subseteq I_P$

However, since the assignment was done manually, a picture of a computer mouse labeled as

be defined as the ratio of the intersection area to the area of the part polygon. (e.g.

to return these objects as results. WordNet makes the LabelMe database much more useful.
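
The effect of sense assignment on search can be illustrated with a toy example. The tables below are hand-written and hypothetical: they stand in for the manual label-to-sense assignments and for WordNet's "is a kind of" relation, and they do not use real WordNet identifiers or any WordNet API.

```python
# Hypothetical manual assignments: raw label text -> sense.
LABEL_TO_SENSE = {
    "dog": "dog",
    "pooch": "dog",
    "hound": "dog",
    "dog walking": "dog",
    "cat": "cat",
    "snake": "snake",
    "mouse": "computer_mouse",  # assigned by hand, so not the animal
}

# Hypothetical "is a kind of" links between senses.
PARENT_SENSE = {
    "dog": "animal",
    "cat": "animal",
    "snake": "animal",
    "computer_mouse": "device",
}


def sense_chain(sense):
    """Return the sense followed by all of its ancestors."""
    chain = []
    while sense is not None:
        chain.append(sense)
        sense = PARENT_SENSE.get(sense)
    return chain


def search(query_sense, labels):
    """Return the labels whose sense is query_sense or one of its descendants."""
    results = []
    for label in labels:
        sense = LABEL_TO_SENSE.get(label)
        if sense is not None and query_sense in sense_chain(sense):
            results.append(label)
    return results


if __name__ == "__main__":
    labels = ["dog", "pooch", "mouse", "cat", "dog walking", "tree"]
    print(search("animal", labels))
    print(search("dog", labels))
```

In this toy setup a search for "animal" skips the label "mouse" because it was assigned to the computer-device sense, and a search for "dog" still finds "dog walking".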

Designed for recognizing objects embedded in arbitrary scenes instead of images that are

Contains a large number of object classes and allows the creation of new classes easily.

images and allows public additions to the annotations. This creates a free environment.

link can be clicked and another random image will be selected to display to the user.

The user chooses what text to enter as the label for the object. Should the label be

Russell, Bryan C.; Torralba, Antonio; Murphy, Kevin P.; Freeman, William T. (2008).

The user can choose which objects in the scene to outline. Should an

$S_{O,P} > \beta$

at the abstract level should incorporate all of these text labels.

Diverse images: LabelMe contains images from many different scenes.

Swain, Michael J.; Ballard, Dana H. (1991). "Color indexing".

If the image is not completely labeled, the user can use the

Also, if objects are labeled with more complex terms like

are probably part of objects assigned to other labels like

MIT Computer Science and Artificial Intelligence Laboratory

$I_{O,P}$

$N_{O,P}$

$S_{O,P}$

is a concentration parameter. The authors of LabelMe use

denote the set of images containing an object (e.g. car)

denote the set of images containing a part (e.g. wheel)

denote the images where object and part polygons have

: A Database and Web-Based Tool for Image Annotation"

is some threshold value. The authors of LabelMe use

LabelMe is a project created by the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) that provides a dataset of digital images with annotations. The dataset is dynamic, free to use, and open to public contribution. The most applicable use of LabelMe is in computer vision research.

Using WordNet

For example, a picture of a dog might be labeled as dog, canine, hound, pooch, or animal. Ideally, when using the data, the object class dog at the abstract level should incorporate all of these text labels. Once WordNet assignment is done, a search for animal might bring up pictures of dogs, cats and snakes. However, since the assignment was done manually, a picture of a computer mouse labeled as mouse would not show up in a search for animals. Also, if objects are labeled with more complex terms like dog walking, WordNet still allows the search of dog to return these objects as results.

Object-part hierarchy

For example, most of the labels assigned wheel are probably part of objects assigned to other labels like car or bicycle. These are called part labels. To determine if label P is a part label for label O:

Let $I_O$ denote the set of images containing an object (e.g. car).
Let $I_P$ denote the set of images containing a part (e.g. wheel).
Let the overlap score between object O and part P, $S_{O,P}$, be defined as the ratio of the intersection area to the area of the part polygon (e.g. $\frac{A(O \cap P)}{A(P)}$).
Let $I_{O,P} \subseteq I_P$ denote the images where object and part polygons have $S_{O,P} > \beta$, where $\beta$ is some threshold value. The authors of LabelMe use $\beta = 0.5$.
The object-part score for a candidate label is $\frac{N_{O,P}}{N_P + \alpha}$, where $N_{O,P}$ and $N_P$ are the number of images in $I_{O,P}$ and $I_P$, respectively, and $\alpha$ is a concentration parameter. The authors of LabelMe use $\alpha = 5$.

Object depth ordering

One of the objects could be labeled as something that cannot be in the foreground. Examples are sky, ground, or road.
Histogram intersection can be used. To do this, a color histogram in the intersecting areas is compared to the color histogram of the two objects.

Matlab Toolbox

The entire dataset can be downloaded and used offline, or the toolbox allows dynamic downloading of content on demand.

See also

List of datasets for machine learning research
MNIST database
Caltech 101
List of Manual Image Annotation Tools
VoTT

References

Russell et al. 2008, Section 2.5
Russell et al. 2008, Section 2.2
Russell et al. 2008, Section 3.1
Russell et al. 2008, Section 3.2
Russell et al. 2008, Section 3.3
Swain & Ballard 1991

Bibliography

Russell, Bryan C.; Torralba, Antonio; Murphy, Kevin P.; Freeman, William T. (2008). "LabelMe: A Database and Web-Based Tool for Image Annotation". International Journal of Computer Vision. 77 (1–3): 157–173. doi:10.1007/s11263-007-0090-8. S2CID 1900911.
Swain, Michael J.; Ballard, Dana H. (1991). "Color indexing". International Journal of Computer Vision. 7: 11–32. doi:10.1007/BF00130487. S2CID 8167136.

External links

http://labelme.csail.mit.edu/ – LabelMe: The open annotation tool

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.
