
Object categorization from image search

In computer vision, object categorization from image search is the problem of training a classifier to recognize categories of objects using only the images retrieved automatically with an Internet search engine. Ideally, automatic image collection would allow classifiers to be trained with nothing but the category names as input. This problem is closely related to that of content-based image retrieval (CBIR), where the goal is to return better image search results rather than training a classifier for image recognition.

An important question to address is whether OPTIMOL's incremental learning gives it an advantage over traditional batch learning methods, when everything else about the model is held constant. When the classifier learns incrementally, by selecting the next images based on what it learned from the previous ones, three important results are observed:

- Incremental learning allows OPTIMOL to collect a better dataset.
- Incremental learning allows OPTIMOL to learn faster (by discarding irrelevant images).
- Incremental learning does not negatively affect the ROC curve of the classifier; in fact, incremental learning yielded an improvement.
The dataset must be initialized, or seeded, with an original batch of images which serve as good exemplars of the object category to be learned. These can be gathered automatically, using the first page or so of images returned by the search engine (which tend to be better than the subsequent images). Alternatively, the initial images can be gathered by hand.
The authors of the Fergus et al. paper compared the performance of the three pLSA algorithms (pLSA, ABS-pLSA, and TSI-pLSA) on handpicked datasets and on images returned from Google searches. Performance was measured as the error rate when classifying images in a test set as either containing the object or containing only background. As expected, training directly on Google data gives higher error rates than training on prepared data. In about half of the object categories tested, ABS-pLSA and TSI-pLSA perform significantly better than regular pLSA, and in only 2 categories out of 7 does TSI-pLSA perform better than the other two models.
OPTIMOL (automatic Online Picture collection via Incremental MOdel Learning) approaches the problem of learning object categories from online image searches by addressing model learning and searching simultaneously. OPTIMOL is an iterative model that updates its model of the target object category while concurrently retrieving more relevant images.
Traditionally, classifiers are trained using sets of images that are labeled by hand. Collecting such a set of images is often a very time-consuming and laborious process. The use of Internet search engines to automate the process of acquiring large sets of labeled images has been described as a potential way of greatly facilitating computer vision research.
One problem with using Internet image search results as a training set for a classifier is the high percentage of unrelated images within the results. It has been estimated that, when a search engine such as Google Images is queried with the name of an object category (such as "airplane"), up to 85% of the returned images are unrelated to the category.
pLSA divides documents into topics as well. Just as knowing the topic(s) of an article allows you to make good guesses about the kinds of words that will appear in it, the distribution of words in an image is dependent on the underlying topics. The pLSA model gives the probability of seeing each word w in a document d in terms of the topics z:

P(w|d) = \sum_{z=1}^{Z} P(w|z) P(z|d)

An important assumption made in this model is that w and d are conditionally independent given z. Given a topic, the probability of a certain word appearing as part of that topic is independent of the rest of the image.
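This decomposition can be illustrated with a short sketch; the topic and word distributions below are made-up values chosen only to show the shapes involved, not figures from the paper.

```python
import numpy as np

# P(w|z): topic-to-word distributions, shape (num_topics, num_words)
p_w_given_z = np.array([[0.7, 0.2, 0.1],
                        [0.1, 0.3, 0.6]])
# P(z|d): topic mixture for one document (image), shape (num_topics,)
p_z_given_d = np.array([0.25, 0.75])

# P(w|d) = sum_z P(w|z) P(z|d)
p_w_given_d = p_z_given_d @ p_w_given_z
print(p_w_given_d)        # probability of each visual word in this document
print(p_w_given_d.sum())  # sums to 1.0
```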
Classification accuracy was compared with the accuracy of the classifiers produced by the pLSA methods discussed earlier. OPTIMOL achieved slightly higher accuracy, obtaining 74.8% accuracy on 7 object categories, as compared to 72.0%.
Another challenge posed by using Internet image search results as training sets for classifiers is the high amount of variability within object categories, compared with categories found in hand-labeled datasets such as Caltech 101. Images of objects can vary widely in a number of important factors, such as scale, pose, lighting, number of objects, and amount of occlusion.
OPTIMOL, it is found, can automatically collect large numbers of good images from the web. The size of the OPTIMOL-retrieved image sets surpasses that of large human-labeled image sets for the same categories, such as those found in Caltech 101.
The two categories (target object and background) are modeled as Hierarchical Dirichlet processes (HDPs). As in the pLSA approach, it is assumed that the images can be described with the bag of words model. HDP models the distributions of an unspecified number of topics across images in a category, and across categories. The distribution of topics among images in a single category is modeled as a Dirichlet process (a type of non-parametric probability distribution). To allow the sharing of topics across classes, each of these Dirichlet processes is modeled as a sample from another "parent" Dirichlet process. HDP was first described by Teh et al. in 2005.
In a 2005 paper by Fergus et al., pLSA (probabilistic latent semantic analysis) and extensions of this model were applied to the problem of object categorization from image search. pLSA was originally developed for document classification, but has since been applied to computer vision. It makes the assumption that images are documents that fit the bag of words model.

Just as text documents are made up of words, each of which may be repeated within the document and across documents, images can be modeled as combinations of visual words. Just as the entire set of text words are defined by a dictionary, the entire set of visual words is defined in a codeword dictionary.
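As a toy illustration of the bag of words view, each image can be reduced to a histogram of codeword counts; the codebook size and word indices below are invented for the example (the setup described later uses a 350-word codebook).

```python
from collections import Counter

CODEBOOK_SIZE = 8  # illustrative only

def bag_of_words(word_indices, codebook_size=CODEBOOK_SIZE):
    """Turn the visual-word indices detected in one image into a count histogram."""
    counts = Counter(word_indices)
    return [counts.get(w, 0) for w in range(codebook_size)]

# Two "images" described only by which codewords occurred in them.
image_a = [0, 0, 3, 5, 3, 7]
image_b = [1, 1, 1, 2, 6]
print(bag_of_words(image_a))  # [2, 0, 0, 2, 0, 1, 0, 1]
print(bag_of_words(image_b))  # [0, 3, 1, 0, 0, 0, 1, 0]
```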
OPTIMOL was presented as a general iterative framework that is independent of the specific model used for category learning. The algorithm is as follows:

- Download a large set of images from the Internet by searching for a keyword.
- Initialize the dataset with seed images.
- While more images are needed in the dataset:
  - Learn the model with the most recently added dataset images.
  - Classify the downloaded images using the updated model.
  - Add accepted images to the dataset.

Note that only the most recently added images are used in each round of learning. This allows the algorithm to run on an arbitrarily large number of input images.
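A minimal sketch of this loop is given below. The helper functions (search_engine, learn_model, classify_image) are stand-ins invented for the example; in the real system the model is the HDP described elsewhere in this article and acceptance is the likelihood comparison shown in the classification section.

```python
import random

def search_engine(keyword, n=200, relevant_fraction=0.5):
    """Stand-in for a keyword image search: returns (image_id, is_relevant) pairs."""
    return [(i, random.random() < relevant_fraction) for i in range(n)]

def learn_model(model, new_images):
    """Stand-in for incremental model learning: just counts images seen so far."""
    return {"n_seen": (model or {"n_seen": 0})["n_seen"] + len(new_images)}

def classify_image(model, image):
    """Stand-in classifier; the real system compares class likelihoods under the HDP."""
    _, is_relevant = image
    return is_relevant

def optimol_loop(keyword, needed=80, seed_size=10):
    downloaded = search_engine(keyword)
    dataset = downloaded[:seed_size]              # seed with the first page of results
    newly_added = list(dataset)
    model = None
    while len(dataset) < needed and newly_added:
        model = learn_model(model, newly_added)   # learn only from the most recent additions
        candidates = [im for im in downloaded if im not in dataset]
        newly_added = [im for im in candidates if classify_image(model, im)]
        dataset.extend(newly_added)               # accepted images grow the dataset
    return dataset

print(len(optimol_loop("accordion")))
```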
Absolute position pLSA (ABS-pLSA) attaches location information to each visual word by localizing it to one of X "bins" in the image. Here, x represents which of the bins the visual word falls into. The new equation is:

P(w, x|d) = \sum_{z=1}^{Z} P(w, x|z) P(z|d)

P(w, x|z) and P(z|d) can be solved for in a manner similar to the original pLSA problem, using the EM algorithm.

A problem with this model is that it is not translation or scale invariant. Since the positions of the visual words are absolute, changing the size of the object in the image or moving it would have a significant impact on the spatial distribution of the visual words into different bins.
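For example, the bin index x for a word at pixel position (px, py) can be obtained by overlaying a regular grid on the image; the grid size and image dimensions here are arbitrary choices for illustration, not values from the paper.

```python
def bin_index(px, py, img_w, img_h, grid=3):
    """Map an absolute word position to one of grid*grid location bins."""
    col = min(int(px / img_w * grid), grid - 1)
    row = min(int(py / img_h * grid), grid - 1)
    return row * grid + col

# A 640x480 image divided into a 3x3 grid of bins (X = 9).
print(bin_index(10, 10, 640, 480))    # 0 (top-left bin)
print(bin_index(600, 450, 640, 480))  # 8 (bottom-right bin)
```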
Translation and scale invariant pLSA (TSI-pLSA) extends pLSA by adding another latent variable, which describes the spatial location of the target object in an image. Now, the position of a visual word is given relative to this object location, rather than as an absolute position in the image. The new equation is:

P(w, x|d) = \sum_{z=1}^{Z} \sum_{c=1}^{C} P(w, x|c, z) P(c) P(z|d)

where c ranges over possible object locations, and P(c) can be assumed to be a uniform distribution.
Words in an image were selected using 4 different feature detectors:

- Kadir–Brady saliency detector
- Multi-scale Harris detector
- Difference of Gaussians
- Edge-based operator

Using these 4 detectors, approximately 700 features were detected per image. These features were then encoded as Scale-invariant feature transform (SIFT) descriptors, and vector quantized to match one of 350 words contained in a codebook. The codebook was precomputed from features extracted from a large number of images spanning numerous object categories.
One important question in the TSI-pLSA model is how to determine the values that the location variable c can take on. It is a 4-vector, whose components describe the object's centroid as well as x and y scales that define a bounding box around the object, so the space of possible values it can take on is enormous. To limit the number of possible object locations to a reasonable number, normal pLSA is first carried out on the set of images, and for each topic a Gaussian mixture model is fit over the visual-word positions, weighted by P(w|z). Up to K Gaussians are tried (allowing for multiple instances of an object in a single image), where K is a constant.
At each iteration, P(z|c) and P(x|z, c) can be obtained from the model learned after the previous round of Gibbs sampling, where z is a topic, c is a category, and x is a single visual word. The likelihood of an image being in a certain class, then, is:

P(I|c) = \prod_{i} \sum_{j} P(x_i|z_j, c) P(z_j|c)

This is computed for each new candidate image per iteration. The image is classified as belonging to the category with the highest likelihood.

In order to qualify for incorporation into the dataset, however, an image must satisfy a stronger condition:

\frac{P(I|c_f)}{P(I|c_b)} > \frac{\lambda_{Ac_b} - \lambda_{Rc_b}}{\lambda_{Rc_f} - \lambda_{Ac_f}} \cdot \frac{P(c_b)}{P(c_f)}

where c_f and c_b are the foreground (object) and background categories, respectively, and the ratio of constants describes the risk of accepting false positives and false negatives. They are adjusted automatically at every iteration, with the cost of a false positive set higher than that of a false negative. This ensures that a better dataset is collected.

Once an image is accepted by meeting the above criterion and incorporated into the dataset, however, it needs to meet another criterion before it is incorporated into the "cache set", the set of images to be used for training. This set is intended to be a diverse subset of the set of accepted images. If the model were trained on all accepted images, it might become more and more highly specialized, only accepting images very similar to previous ones.
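The two tests can be sketched as follows. The toy model, the Dirichlet-sampled distributions, and the single risk-ratio threshold are assumptions of the sketch (the threshold stands in for the ratio of λ costs above); the computation is done in log space for numerical stability.

```python
import numpy as np

def log_likelihood(image_words, log_p_x_given_z, log_p_z):
    """log P(I|c) = sum_i log sum_j P(x_i|z_j, c) P(z_j|c) for one category c."""
    # log_p_x_given_z: (num_topics, num_words), log_p_z: (num_topics,)
    per_word = np.logaddexp.reduce(log_p_z[:, None] + log_p_x_given_z, axis=0)
    return per_word[image_words].sum()

def accept(image_words, fg, bg, log_prior_fg, log_prior_bg, log_risk_ratio):
    """Accept an image into the dataset when the likelihood ratio clears the risk threshold."""
    ll_fg = log_likelihood(image_words, *fg)
    ll_bg = log_likelihood(image_words, *bg)
    # P(I|c_f)/P(I|c_b) > risk_ratio * P(c_b)/P(c_f), compared in log space
    return ll_fg - ll_bg > log_risk_ratio + log_prior_bg - log_prior_fg

rng = np.random.default_rng(2)

def toy_model():
    p_x = rng.dirichlet(np.ones(20), size=3)   # 3 topics over 20 visual words
    p_z = rng.dirichlet(np.ones(3))
    return np.log(p_x), np.log(p_z)

fg, bg = toy_model(), toy_model()
image = rng.integers(0, 20, size=50)           # toy image: 50 visual-word indices
print(accept(image, fg, bg, np.log(0.5), np.log(0.5), log_risk_ratio=np.log(2.0)))
```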
Object categorization in content-based image retrieval

Typically, image searches only make use of text associated with images. The problem of content-based image retrieval (CBIR) is that of improving search results by taking into account visual information contained in the images themselves. Several CBIR methods make use of classifiers trained on image search results in order to refine the search. In other words, object categorization from image search is one component of the system. OPTIMOL, for example, uses a classifier trained on images collected during previous iterations to select additional images for the returned dataset.

Examples of CBIR methods that model object categories from image search are:

- Fergus et al., 2004
- Berg and Forsyth, 2006
- Yanai and Barnard, 2005
Training this model involves finding P(w|z) and P(z|d) that maximize the likelihood of the observed words in each document. To do this, the expectation maximization algorithm is used, with the following objective function:

L = \prod_{d=1}^{D} \prod_{w=1}^{W} P(w|d)^{n(w|d)}
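A compact sketch of the EM training loop is shown below; the count matrix is toy data and the updates are the standard pLSA E- and M-steps, written here only to make the objective concrete.

```python
import numpy as np

def plsa_em(counts, num_topics, iters=50, seed=0):
    """Very small pLSA trainer: maximizes prod_d prod_w P(w|d)^n(w|d) by EM.

    counts: (D, W) matrix of visual-word counts per image (document).
    Returns P(w|z) with shape (Z, W) and P(z|d) with shape (D, Z).
    """
    rng = np.random.default_rng(seed)
    D, W = counts.shape
    p_w_given_z = rng.dirichlet(np.ones(W), size=num_topics)   # (Z, W)
    p_z_given_d = rng.dirichlet(np.ones(num_topics), size=D)   # (D, Z)
    for _ in range(iters):
        # E-step: responsibilities P(z|d,w) proportional to P(w|z) P(z|d)
        joint = p_z_given_d[:, :, None] * p_w_given_z[None, :, :]  # (D, Z, W)
        resp = joint / joint.sum(axis=1, keepdims=True)
        # M-step: re-estimate both distributions from weighted counts n(w|d) P(z|d,w)
        weighted = counts[:, None, :] * resp
        p_w_given_z = weighted.sum(axis=0)
        p_w_given_z /= p_w_given_z.sum(axis=1, keepdims=True)
        p_z_given_d = weighted.sum(axis=2)
        p_z_given_d /= p_z_given_d.sum(axis=1, keepdims=True)
    return p_w_given_z, p_z_given_d

counts = np.array([[4, 3, 0, 0],    # toy word counts for three "images"
                   [0, 1, 5, 4],
                   [3, 2, 1, 1]])
p_w_given_z, p_z_given_d = plsa_em(counts, num_topics=2)
print(np.round(p_z_given_d, 2))
```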
To learn the various parameters of the HDP in an incremental manner, Gibbs sampling is used over the latent variables. It is carried out after each new set of images is incorporated into the dataset. Gibbs sampling involves repeatedly sampling from a set of random variables in order to approximate their distributions. Sampling involves generating a value for the random variable in question, based on the state of the other random variables on which it is dependent. Given sufficient samples, a reasonable approximation of the value can be achieved.
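In OPTIMOL the variables being sampled are the latent topic assignments of the HDP; the toy example below samples a two-dimensional Gaussian instead, purely to show the mechanic of drawing each variable conditioned on the current value of the others.

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_samples=5000, seed=0):
    """Toy Gibbs sampler: draw from a 2-D Gaussian with correlation rho by alternately
    sampling each variable from its conditional given the current value of the other."""
    rng = np.random.default_rng(seed)
    x = y = 0.0
    samples = np.empty((n_samples, 2))
    cond_std = np.sqrt(1.0 - rho ** 2)
    for i in range(n_samples):
        x = rng.normal(rho * y, cond_std)   # x | y ~ N(rho*y, 1 - rho^2)
        y = rng.normal(rho * x, cond_std)   # y | x ~ N(rho*x, 1 - rho^2)
        samples[i] = x, y
    return samples

s = gibbs_bivariate_normal(rho=0.8)
print(np.corrcoef(s[1000:].T)[0, 1])   # close to 0.8 once the chain has mixed
```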
Performance of the OPTIMOL method is defined by three factors: its ability to collect images, its classification accuracy, and its comparison with batch learning.
1981: 1980: 1952: 1951: 1726: 1725: 1585: 1584: 1561: 1560: 1539: 1538: 1517: 1516: 1472: 1471: 1433: 1432: 1276: 1275: 1254: 1253: 1215: 1214: 1188: 1187: 1108: 1107: 1073: 1072: 1022: 1021: 866: 865: 842: 841: 796: 795: 751: 750: 641: 640: 617: 616: 494: 493: 445: 444: 406: 405: 381: 380: 359: 358: 337: 336: 232: 231: 208: 207: 186: 185: 165: 2254:
Berg, T.; Forsyth, D. (2006). "Animals on the web".
2048:
Incremental learning does not negatively affect the
1313:
while concurrently retrieving more relevant images.
330:An important assumption made in this model is that 2180:Teh, Yw; Jordan, MI; Beal, MJ; Blei,David (2006). 1995: 1966: 1935: 1699: 1568: 1546: 1524: 1502: 1457: 1283: 1261: 1239: 1195: 1124: 1089: 1058: 1005: 849: 812: 781: 735: 624: 590: 469: 430: 388: 366: 344: 320: 215: 193: 171: 1346:the model with most recently added dataset images 2189:Journal of the American Statistical Association 2256:Proc. Computer Vision and Pattern Recognition 1132:can be assumed to be a uniform distribution. 8: 2234:"A visual category filter for Google images" 2232:Fergus, R.; Perona,P.; Zisserman,A. (2004). 2158:Li, Li-Jia; Wang, Gang; Fei-Fei, Li (2007). 2241:Proc. 8th European Conf. on Computer Vision 1164:Edge based operator, described in the study 1208:is fit over the visual words, weighted by 2200: 1986: 1979: 1957: 1950: 1920: 1899: 1886: 1875: 1867: 1852: 1844: 1830: 1822: 1807: 1799: 1792: 1777: 1768: 1748: 1739: 1727: 1724: 1685: 1679: 1654: 1645: 1639: 1623: 1613: 1595: 1583: 1559: 1537: 1515: 1482: 1470: 1443: 1431: 1352:downloaded images using the updated model 1274: 1252: 1225: 1213: 1186: 1106: 1071: 1038: 1020: 991: 953: 932: 921: 911: 900: 882: 864: 840: 794: 767: 749: 721: 701: 680: 669: 651: 639: 615: 574: 564: 552: 537: 526: 516: 505: 492: 455: 443: 416: 404: 379: 357: 335: 306: 286: 271: 260: 242: 230: 206: 184: 164: 2131:"Probabilistic Latent Semantic Analysis" 2102: 2100: 2098: 2096: 2094: 1059:{\displaystyle \displaystyle P(w,x|c,z)} 2090: 1713:Addition to the dataset and "cache set" 58:object categorization from image search 2299:Probabilistic latent semantic analysis 2138:Uncertainty in Artificial Intelligence 1503:{\displaystyle \displaystyle P(x|z,c)} 782:{\displaystyle \displaystyle P(w,x|z)} 481:algorithm is used, with the following 2334:Object recognition and categorization 399:Training this model involves finding 7: 2004:that a better dataset is collected. 1458:{\displaystyle \displaystyle P(z|c)} 1240:{\displaystyle \displaystyle P(w|z)} 470:{\displaystyle \displaystyle P(z|d)} 431:{\displaystyle \displaystyle P(w|z)} 374:are conditionally independent given 2279:"Probabilistic web image gathering" 1996:{\displaystyle \displaystyle c_{b}} 1967:{\displaystyle \displaystyle c_{f}} 1340:more images needed in the dataset: 2182:"Hierarchical Dirichlet Processes" 1125:{\displaystyle \displaystyle P(c)} 1090:{\displaystyle \displaystyle P(d)} 813:{\displaystyle \displaystyle P(d)} 119:In a 2005 paper by Fergus et al., 14: 1170:Scale-invariant feature transform 127:, but has since been applied to 20: 1569:{\displaystyle \displaystyle x} 1547:{\displaystyle \displaystyle c} 1525:{\displaystyle \displaystyle z} 1284:{\displaystyle \displaystyle K} 1262:{\displaystyle \displaystyle K} 1196:{\displaystyle \displaystyle C} 850:{\displaystyle \displaystyle x} 625:{\displaystyle \displaystyle x} 389:{\displaystyle \displaystyle z} 367:{\displaystyle \displaystyle d} 345:{\displaystyle \displaystyle w} 216:{\displaystyle \displaystyle z} 194:{\displaystyle \displaystyle d} 2277:Yanai, K; Barnard, K. (2005). 
2037:Comparison with batch learning 1926: 1913: 1905: 1892: 1783: 1769: 1762: 1754: 1740: 1733: 1693: 1686: 1672: 1666: 1646: 1632: 1603: 1596: 1589: 1496: 1483: 1476: 1451: 1444: 1437: 1358:accepted images to the dataset 1233: 1226: 1219: 1118: 1112: 1083: 1077: 1052: 1039: 1026: 999: 992: 985: 979: 973: 967: 954: 941: 890: 883: 870: 806: 800: 775: 768: 755: 729: 722: 715: 709: 702: 689: 659: 652: 645: 582: 575: 568: 561: 553: 546: 463: 456: 449: 424: 417: 410: 314: 307: 300: 294: 287: 280: 250: 243: 236: 1: 2319:Content-based image retrieval 2065:content-based image retrieval 1150:Kadir–Brady saliency detector 70:content-based image retrieval 60:is the problem of training a 1334:the dataset with seed images 1300:containing only background. 154:pLSA divides documents into 2304:Latent Dirichlet allocation 1155:Multi-scale Harris detector 2355: 2211:10.1198/016214506000000302 2021:Ability to collect images 1177:Possible object locations 29:This article needs to be 2129:Hofmann, Thomas (1999). 2081:Yanai and Barnard, 2006 1385:probability distribution 1097:can be solved using the 479:expectation maximization 2078:Berg and Forsyth, 2006 2031:Classification accuracy 1160:Difference of Gaussians 125:document classification 98:Intra-class variability 1997: 1968: 1937: 1701: 1570: 1548: 1526: 1504: 1459: 1285: 1263: 1241: 1206:Gaussian mixture model 1197: 1126: 1091: 1060: 1015:Again, the parameters 1007: 937: 916: 851: 814: 783: 737: 685: 626: 592: 542: 521: 471: 432: 390: 368: 346: 322: 276: 217: 195: 173: 1998: 1969: 1938: 1702: 1571: 1549: 1527: 1505: 1460: 1286: 1264: 1242: 1198: 1127: 1092: 1061: 1008: 917: 896: 852: 815: 784: 738: 665: 627: 593: 522: 501: 472: 433: 391: 369: 347: 323: 256: 218: 196: 174: 2264:10.1109/CVPR.2006.57 2075:Fergus et al., 2004 1978: 1949: 1723: 1582: 1558: 1536: 1514: 1469: 1430: 1273: 1251: 1212: 1185: 1105: 1070: 1019: 863: 839: 793: 748: 638: 614: 491: 442: 403: 378: 356: 334: 229: 205: 183: 163: 1554:is a category, and 1426:At each iteration, 201:in terms of topics 179:given the category 149:codeword dictionary 2314:Bag of words model 1993: 1992: 1964: 1963: 1933: 1932: 1697: 1696: 1628: 1618: 1566: 1565: 1544: 1543: 1522: 1521: 1500: 1499: 1455: 1454: 1374:bag of words model 1281: 1280: 1259: 1258: 1237: 1236: 1193: 1192: 1122: 1121: 1087: 1086: 1056: 1055: 1003: 1002: 847: 846: 810: 809: 779: 778: 733: 732: 622: 621: 588: 587: 483:objective function 467: 466: 428: 427: 386: 385: 364: 363: 342: 341: 318: 317: 213: 212: 191: 190: 169: 133:bag of words model 1930: 1884: 1787: 1619: 1609: 1378:Dirichlet process 1317:General framework 172:{\displaystyle w} 56:, the problem of 50: 49: 2346: 2309:Machine learning 2287: 2286: 2274: 2268: 2267: 2251: 2245: 2244: 2238: 2229: 2223: 2222: 2204: 2186: 2177: 2171: 2170: 2164: 2155: 2149: 2148: 2146: 2140:. 
Archived from 2135: 2126: 2120: 2119: 2113: 2104: 2002: 2000: 1999: 1994: 1991: 1990: 1973: 1971: 1970: 1965: 1962: 1961: 1942: 1940: 1939: 1934: 1931: 1929: 1925: 1924: 1908: 1904: 1903: 1887: 1885: 1883: 1882: 1881: 1880: 1879: 1859: 1858: 1857: 1856: 1838: 1837: 1836: 1835: 1834: 1814: 1813: 1812: 1811: 1793: 1788: 1786: 1782: 1781: 1772: 1757: 1753: 1752: 1743: 1728: 1706: 1704: 1703: 1698: 1689: 1684: 1683: 1659: 1658: 1649: 1644: 1643: 1627: 1617: 1599: 1575: 1573: 1572: 1567: 1553: 1551: 1550: 1545: 1531: 1529: 1528: 1523: 1509: 1507: 1506: 1501: 1486: 1464: 1462: 1461: 1456: 1447: 1415:random variables 1290: 1288: 1287: 1282: 1268: 1266: 1265: 1260: 1246: 1244: 1243: 1238: 1229: 1202: 1200: 1199: 1194: 1131: 1129: 1128: 1123: 1096: 1094: 1093: 1088: 1065: 1063: 1062: 1057: 1042: 1012: 1010: 1009: 1004: 995: 957: 936: 931: 915: 910: 886: 856: 854: 853: 848: 819: 817: 816: 811: 788: 786: 785: 780: 771: 742: 740: 739: 734: 725: 705: 684: 679: 655: 631: 629: 628: 623: 597: 595: 594: 589: 586: 585: 578: 556: 541: 536: 520: 515: 476: 474: 473: 468: 459: 437: 435: 434: 429: 420: 395: 393: 392: 387: 373: 371: 370: 365: 351: 349: 348: 343: 327: 325: 324: 319: 310: 290: 275: 270: 246: 222: 220: 219: 214: 200: 198: 197: 192: 178: 176: 175: 170: 85:Unrelated images 45: 42: 36: 24: 23: 16: 2354: 2353: 2349: 2348: 2347: 2345: 2344: 2343: 2324: 2323: 2295: 2290: 2276: 2275: 2271: 2253: 2252: 2248: 2236: 2231: 2230: 2226: 2184: 2179: 2178: 2174: 2162: 2157: 2156: 2152: 2144: 2133: 2128: 2127: 2123: 2111: 2106: 2105: 2092: 2088: 2061: 2014: 1982: 1976: 1975: 1953: 1947: 1946: 1916: 1909: 1895: 1888: 1871: 1863: 1848: 1840: 1839: 1826: 1818: 1803: 1795: 1794: 1773: 1758: 1744: 1729: 1721: 1720: 1715: 1675: 1650: 1635: 1580: 1579: 1556: 1555: 1534: 1533: 1512: 1511: 1467: 1466: 1428: 1427: 1424: 1407: 1398: 1393: 1370: 1319: 1310: 1297: 1291:is a constant. 1271: 1270: 1249: 1248: 1210: 1209: 1183: 1182: 1179: 1143: 1141:Selecting words 1138: 1103: 1102: 1068: 1067: 1017: 1016: 861: 860: 837: 836: 833: 791: 790: 746: 745: 636: 635: 612: 611: 608: 603: 560: 489: 488: 440: 439: 401: 400: 376: 375: 354: 353: 332: 331: 227: 226: 203: 202: 181: 180: 161: 160: 141: 129:computer vision 117: 100: 87: 82: 54:computer vision 46: 40: 37: 34: 25: 21: 12: 11: 5: 2352: 2350: 2342: 2341: 2336: 2326: 2325: 2322: 2321: 2316: 2311: 2306: 2301: 2294: 2291: 2289: 2288: 2269: 2246: 2224: 2172: 2150: 2147:on 2007-07-10. 
2121: 2089: 2087: 2084: 2083: 2082: 2079: 2076: 2060: 2057: 2056: 2055: 2054: 2053: 2046: 2043: 2034: 2028: 2013: 2010: 1989: 1985: 1960: 1956: 1928: 1923: 1919: 1915: 1912: 1907: 1902: 1898: 1894: 1891: 1878: 1874: 1870: 1866: 1862: 1855: 1851: 1847: 1843: 1833: 1829: 1825: 1821: 1817: 1810: 1806: 1802: 1798: 1791: 1785: 1780: 1776: 1771: 1767: 1764: 1761: 1756: 1751: 1747: 1742: 1738: 1735: 1732: 1714: 1711: 1695: 1692: 1688: 1682: 1678: 1674: 1671: 1668: 1665: 1662: 1657: 1653: 1648: 1642: 1638: 1634: 1631: 1626: 1622: 1616: 1612: 1608: 1605: 1602: 1598: 1594: 1591: 1588: 1564: 1542: 1520: 1498: 1495: 1492: 1489: 1485: 1481: 1478: 1475: 1453: 1450: 1446: 1442: 1439: 1436: 1423: 1422:Classification 1420: 1411:Gibbs sampling 1406: 1405:Model learning 1403: 1397: 1396:Initialization 1394: 1392: 1391:Implementation 1389: 1382:non-parametric 1369: 1366: 1362: 1361: 1360: 1359: 1353: 1347: 1335: 1329: 1318: 1315: 1309: 1306: 1296: 1293: 1279: 1257: 1235: 1232: 1228: 1224: 1221: 1218: 1191: 1178: 1175: 1166: 1165: 1162: 1157: 1152: 1142: 1139: 1137: 1136:Implementation 1134: 1120: 1117: 1114: 1111: 1085: 1082: 1079: 1076: 1054: 1051: 1048: 1045: 1041: 1037: 1034: 1031: 1028: 1025: 1001: 998: 994: 990: 987: 984: 981: 978: 975: 972: 969: 966: 963: 960: 956: 952: 949: 946: 943: 940: 935: 930: 927: 924: 920: 914: 909: 906: 903: 899: 895: 892: 889: 885: 881: 878: 875: 872: 869: 845: 832: 829: 808: 805: 802: 799: 777: 774: 770: 766: 763: 760: 757: 754: 731: 728: 724: 720: 717: 714: 711: 708: 704: 700: 697: 694: 691: 688: 683: 678: 675: 672: 668: 664: 661: 658: 654: 650: 647: 644: 620: 607: 604: 602: 599: 584: 581: 577: 573: 570: 567: 563: 559: 555: 551: 548: 545: 540: 535: 532: 529: 525: 519: 514: 511: 508: 504: 500: 497: 465: 462: 458: 454: 451: 448: 426: 423: 419: 415: 412: 409: 384: 362: 340: 316: 313: 309: 305: 302: 299: 296: 293: 289: 285: 282: 279: 274: 269: 266: 263: 259: 255: 252: 249: 245: 241: 238: 235: 211: 189: 168: 140: 137: 116: 113: 99: 96: 86: 83: 81: 78: 48: 47: 41:September 2019 28: 26: 19: 13: 10: 9: 6: 4: 3: 2: 2351: 2340: 2337: 2335: 2332: 2331: 2329: 2320: 2317: 2315: 2312: 2310: 2307: 2305: 2302: 2300: 2297: 2296: 2292: 2284: 2280: 2273: 2270: 2265: 2261: 2257: 2250: 2247: 2242: 2235: 2228: 2225: 2220: 2216: 2212: 2208: 2203: 2202:10.1.1.5.9094 2198: 2195:(476): 1566. 
2194: 2190: 2183: 2176: 2173: 2168: 2161: 2154: 2151: 2143: 2139: 2132: 2125: 2122: 2117: 2110: 2103: 2101: 2099: 2097: 2095: 2091: 2085: 2080: 2077: 2074: 2073: 2072: 2069: 2066: 2058: 2051: 2047: 2044: 2041: 2040: 2038: 2035: 2032: 2029: 2026: 2022: 2019: 2018: 2017: 2011: 2009: 2005: 1987: 1983: 1958: 1954: 1943: 1921: 1917: 1910: 1900: 1896: 1889: 1876: 1872: 1868: 1864: 1860: 1853: 1849: 1845: 1841: 1831: 1827: 1823: 1819: 1815: 1808: 1804: 1800: 1796: 1789: 1778: 1774: 1765: 1759: 1749: 1745: 1736: 1730: 1718: 1712: 1710: 1707: 1690: 1680: 1676: 1669: 1663: 1660: 1655: 1651: 1640: 1636: 1629: 1624: 1620: 1614: 1610: 1606: 1600: 1592: 1586: 1577: 1562: 1540: 1518: 1493: 1490: 1487: 1479: 1473: 1448: 1440: 1434: 1421: 1419: 1416: 1412: 1404: 1402: 1395: 1390: 1388: 1386: 1383: 1379: 1375: 1367: 1365: 1357: 1354: 1351: 1348: 1345: 1342: 1341: 1339: 1336: 1333: 1330: 1327: 1324: 1323: 1322: 1316: 1314: 1307: 1305: 1301: 1294: 1292: 1277: 1255: 1230: 1222: 1216: 1207: 1189: 1176: 1174: 1171: 1163: 1161: 1158: 1156: 1153: 1151: 1148: 1147: 1146: 1140: 1135: 1133: 1115: 1109: 1100: 1080: 1074: 1049: 1046: 1043: 1035: 1032: 1029: 1023: 1013: 996: 988: 982: 976: 970: 964: 961: 958: 950: 947: 944: 938: 933: 928: 925: 922: 918: 912: 907: 904: 901: 897: 893: 887: 879: 876: 873: 867: 858: 843: 830: 828: 824: 823: 803: 797: 772: 764: 761: 758: 752: 743: 726: 718: 712: 706: 698: 695: 692: 686: 681: 676: 673: 670: 666: 662: 656: 648: 642: 633: 618: 605: 600: 598: 579: 571: 565: 557: 549: 543: 538: 533: 530: 527: 523: 517: 512: 509: 506: 502: 498: 495: 486: 484: 480: 460: 452: 446: 421: 413: 407: 397: 382: 360: 338: 328: 311: 303: 297: 291: 283: 277: 272: 267: 264: 261: 257: 253: 247: 239: 233: 224: 209: 187: 166: 157: 152: 150: 146: 138: 136: 134: 130: 126: 122: 115:pLSA approach 114: 112: 110: 106: 97: 95: 93: 84: 79: 77: 73: 71: 67: 66:search engine 63: 59: 55: 44: 32: 27: 18: 17: 2339:Image search 2282: 2272: 2255: 2249: 2240: 2227: 2192: 2188: 2175: 2166: 2153: 2142:the original 2137: 2124: 2115: 2070: 2062: 2036: 2030: 2020: 2015: 2006: 1944: 1719: 1716: 1708: 1578: 1532:is a topic, 1425: 1408: 1399: 1371: 1363: 1355: 1349: 1343: 1337: 1331: 1325: 1320: 1311: 1304:two models. 1302: 1298: 1180: 1167: 1144: 1099:EM algorithm 1014: 859: 834: 825: 822:EM algorithm 744: 634: 609: 487: 398: 329: 225: 155: 153: 148: 145:visual words 144: 142: 118: 101: 91: 88: 74: 57: 51: 38: 30: 2025:Caltech 101 2012:Performance 1380:(a type of 1295:Performance 601:Application 105:Caltech 101 2328:Categories 2086:References 1332:Initialize 80:Challenges 62:classifier 2197:CiteSeerX 2050:ROC curve 1865:λ 1861:− 1842:λ 1820:λ 1816:− 1797:λ 1621:∑ 1611:∏ 919:∑ 898:∑ 667:∑ 524:∏ 503:∏ 258:∑ 2293:See also 1350:Classify 1326:Download 1247:. Up to 831:TSI-pLSA 606:ABS-pLSA 92:airplane 2219:7934949 1308:OPTIMOL 31:updated 2217:  2199:  1945:Where 156:topics 109:Pascal 2237:(PDF) 2215:S2CID 2185:(PDF) 2163:(PDF) 2145:(PDF) 2134:(PDF) 2112:(PDF) 1368:Model 1344:Learn 1338:While 139:Model 1974:and 1790:> 1465:and 1066:and 789:and 438:and 352:and 121:pLSA 107:and 2260:doi 2207:doi 2193:101 1356:Add 52:In 2330:: 2281:. 2258:. 2239:. 2213:. 2205:. 2191:. 2187:. 2165:. 2136:. 2114:. 2093:^ 1101:. 485:: 223:: 151:. 135:. 2285:. 2266:. 2262:: 2243:. 2221:. 2209:: 2169:. 2118:. 2027:. 
1988:b 1984:c 1959:f 1955:c 1927:) 1922:f 1918:c 1914:( 1911:P 1906:) 1901:b 1897:c 1893:( 1890:P 1877:f 1873:c 1869:A 1854:f 1850:c 1846:R 1832:b 1828:c 1824:R 1809:b 1805:c 1801:A 1784:) 1779:b 1775:c 1770:| 1766:I 1763:( 1760:P 1755:) 1750:f 1746:c 1741:| 1737:I 1734:( 1731:P 1694:) 1691:c 1687:| 1681:j 1677:z 1673:( 1670:P 1667:) 1664:c 1661:, 1656:j 1652:z 1647:| 1641:i 1637:x 1633:( 1630:P 1625:j 1615:i 1607:= 1604:) 1601:c 1597:| 1593:I 1590:( 1587:P 1563:x 1541:c 1519:z 1497:) 1494:c 1491:, 1488:z 1484:| 1480:x 1477:( 1474:P 1452:) 1449:c 1445:| 1441:z 1438:( 1435:P 1278:K 1256:K 1234:) 1231:z 1227:| 1223:w 1220:( 1217:P 1190:C 1119:) 1116:c 1113:( 1110:P 1084:) 1081:d 1078:( 1075:P 1053:) 1050:z 1047:, 1044:c 1040:| 1036:x 1033:, 1030:w 1027:( 1024:P 1000:) 997:d 993:| 989:z 986:( 983:P 980:) 977:c 974:( 971:P 968:) 965:z 962:, 959:c 955:| 951:x 948:, 945:w 942:( 939:P 934:C 929:1 926:= 923:c 913:Z 908:1 905:= 902:z 894:= 891:) 888:d 884:| 880:x 877:, 874:w 871:( 868:P 844:x 807:) 804:d 801:( 798:P 776:) 773:z 769:| 765:x 762:, 759:w 756:( 753:P 730:) 727:d 723:| 719:z 716:( 713:P 710:) 707:z 703:| 699:x 696:, 693:w 690:( 687:P 682:Z 677:1 674:= 671:z 663:= 660:) 657:d 653:| 649:w 646:( 643:P 619:x 583:) 580:d 576:| 572:w 569:( 566:n 562:) 558:d 554:| 550:w 547:( 544:P 539:W 534:1 531:= 528:w 518:D 513:1 510:= 507:d 499:= 496:L 464:) 461:d 457:| 453:z 450:( 447:P 425:) 422:z 418:| 414:w 411:( 408:P 383:z 361:d 339:w 315:) 312:d 308:| 304:z 301:( 298:P 295:) 292:z 288:| 284:w 281:( 278:P 273:Z 268:1 265:= 262:z 254:= 251:) 248:d 244:| 240:w 237:( 234:P 210:z 188:d 167:w 43:) 39:( 33:.

References

Li, Li-Jia; Wang, Gang; Fei-Fei, Li (2007). "OPTIMOL: automatic Online Picture collection via Incremental MOdel Learning". Proc. IEEE Conference on Computer Vision and Pattern Recognition.
Fergus, R.; Fei-Fei, L.; Perona, P.; Zisserman, A. (2005). "Learning Object Categories from Google's Image Search". Proc. IEEE International Conference on Computer Vision.
Hofmann, Thomas (1999). "Probabilistic Latent Semantic Analysis". Uncertainty in Artificial Intelligence. Archived from the original (PDF) on 2007-07-10.
Teh, Y.W.; Jordan, M.I.; Beal, M.J.; Blei, David (2006). "Hierarchical Dirichlet Processes". Journal of the American Statistical Association. 101 (476): 1566. doi:10.1198/016214506000000302.
Fergus, R.; Perona, P.; Zisserman, A. (2004). "A visual category filter for Google images". Proc. 8th European Conf. on Computer Vision.
Berg, T.; Forsyth, D. (2006). "Animals on the web". Proc. Computer Vision and Pattern Recognition. doi:10.1109/CVPR.2006.57.
Yanai, K.; Barnard, K. (2005). "Probabilistic web image gathering". ACM SIGMM workshop on Multimedia information retrieval.