Scene text

Text captured as part of outdoor surroundings in a photograph

Scene text is text that appears in an image captured by a camera in an outdoor environment.

The detection and recognition of scene text from camera-captured images are computer vision tasks which became important after smartphones with good cameras became ubiquitous. The text in scene images varies in shape, font, colour and position. The recognition of scene text is sometimes further complicated by non-uniform illumination and focus.

To improve scene text recognition, the International Conference on Document Analysis and Recognition (ICDAR) conducts a robust reading competition once every two years. The competition was held in 2003 and 2005, and during every ICDAR conference. The International Association for Pattern Recognition (IAPR) has compiled a list of reading-systems datasets.

Figure: The image displays the coach category in text format. We can observe that the coach belongs to the Sleeper category.

Text detection

Text detection is the process of detecting the text present in the image and then surrounding it with a rectangular bounding box. Text detection can be carried out using image-based techniques or frequency-based techniques.

In image-based techniques, an image is segmented into multiple segments. Each segment is a connected component of pixels with similar characteristics. The statistical features of the connected components are used to group them and form the text. Machine learning approaches such as support vector machines and convolutional neural networks are used to classify the components into text and non-text.

In frequency-based techniques, the discrete Fourier transform (DFT) or the discrete wavelet transform (DWT) is used to extract the high-frequency coefficients. It is assumed that the text present in an image has high-frequency components, so selecting only the high-frequency coefficients separates the text from the non-text regions in an image.

Word recognition

See also: intelligent word recognition

In word recognition, the text is assumed to be already detected and located, and the rectangular bounding box containing the text is available. The word present in the bounding box needs to be recognized. The methods available to perform word recognition can be broadly classified into top-down and bottom-up approaches.

In the top-down approaches, a set of words from a dictionary is used to identify which word suits the given image. Images are not segmented in most of these methods. Hence, the top-down approach is sometimes referred to as segmentation-free recognition.

In the bottom-up approaches, the image is segmented into multiple components and the segmented image is passed through a recognition engine. Either an off-the-shelf optical character recognition (OCR) engine or a custom-trained one is used to recognise the text.
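The dictionary-driven idea behind the top-down approaches can be illustrated with a minimal sketch. It is not any published method: real systems score each dictionary word against image features, whereas this sketch assumes, purely for illustration, that a noisy character string for the word image is already available and uses Levenshtein edit distance as a stand-in score. The names `edit_distance` and `best_lexicon_word` and the sample lexicon are hypothetical.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # delete char_a
                               current[j - 1] + 1,       # insert char_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def best_lexicon_word(hypothesis: str, lexicon: Sequence[str]) -> str:
    """Return the lexicon word closest to the noisy hypothesis (stand-in score)."""
    return min(lexicon,
               key=lambda word: edit_distance(hypothesis.lower(), word.lower()))


# Hypothetical dictionary and a noisy reading of a word image (assumed input).
lexicon = ["sleeper", "coach", "station", "platform"]
print(best_lexicon_word("sl3eper", lexicon))
```

Restricting the answer to dictionary words is what lets this style of recognition tolerate a poor reading of individual characters, at the cost of never returning a word that is missing from the dictionary.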
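The image-based techniques for text detection described above rest on connected components and bounding boxes, which can be sketched as follows. This is only an illustration under stated assumptions: the image is assumed to be already binarised into a grid of 0 and 1, components are 4-connected, and every component is simply grouped into one enclosing box. A real detector would first classify components as text or non-text using their statistical features. The function names `connected_component_boxes` and `enclosing_box` are hypothetical.

```python
from collections import deque


def connected_component_boxes(binary):
    """Return a (top, left, bottom, right) box per 4-connected foreground region."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def enclosing_box(boxes):
    """Group component boxes into one rectangular bounding box."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))


# Tiny hypothetical binary image containing two separate blobs.
image = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
boxes = connected_component_boxes(image)
print(boxes)
print(enclosing_box(boxes))
```

The box coordinates are inclusive pixel indices, so the enclosing box can be used directly to crop the region that is handed to a word recogniser.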


Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.
