Automatic document processing applies to a whole range of documents, whether structured or not. For instance, in the world of business and finance, technologies may be used to process paper-based invoices, forms, purchase orders, contracts, and currency bills. Financial institutions use intelligent
From the 1980s onward, traditional computer vision algorithms were widely used to solve document processing problems; in the 2010s, they were gradually replaced by neural network technologies. However, traditional computer vision technologies are still used, sometimes in conjunction with
In medicine, document processing methods have been developed to facilitate patient follow-up and streamline administrative procedures, in particular by digitizing medical or laboratory analysis reports. The goal is also to standardize medical databases. Algorithms are also directly used to assist
vice-president, Paul Strassman, expressed a critical opinion, saying that computers add to rather than reduce the volume of paper in an office. It was said that the engineering and maintenance documents for an airplane weigh "more than the airplane itself".
As an example of manual document processing, as recently as 2007, document processing for "millions of visa and citizenship applications" relied on "approximately 1,000 contract workers" working to "manage mail room and
252:. Sometimes, specific 2D scanners must also be developed to adapt to the size of the documents or for reasons of scanning ergonomics. The document processing also depends on the digital encoding of the documents in a suitable
501:"Intelligent Document Processing" in Proceedings. Eighth International Conference on Document Analysis and Recognition, Seoul, South Korea, 2005 pp. 1100-1104. doi: 10.1109/ICDAR.2005.144
document processing to process high volumes of forms such as regulatory forms or loan documents. IDP uses AI to extract and classify data from documents, replacing manual data entry.
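As a rough illustration of what "extract and classify" can mean once a document has been transcribed to text, the following Python sketch classifies a document by keyword counts and pulls out a few fields with regular expressions. It is a rule-based stand-in, not an AI model and not the method of any particular product; the keyword lists, field names, patterns and sample text are all hypothetical.

```python
import re

# Hypothetical keyword lists; a learned classifier would replace these.
KEYWORDS = {
    "invoice": ("invoice", "amount due", "bill to"),
    "purchase_order": ("purchase order", "ship to", "po number"),
    "contract": ("agreement", "hereinafter", "party"),
}

# Hypothetical field patterns for already-transcribed (e.g. OCR) text.
FIELD_PATTERNS = {
    "invoice_number": r"Invoice\s*(?:No\.?|#)\s*:?\s*([A-Z0-9-]+)",
    "date": r"Date\s*:\s*(\d{4}-\d{2}-\d{2})",
    "total": r"Total\s*:\s*\$?\s*([\d,]+\.\d{2})",
}


def classify(text: str) -> str:
    """Return the document type whose keywords occur most often."""
    lowered = text.lower()
    scores = {
        label: sum(lowered.count(keyword) for keyword in keywords)
        for label, keywords in KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    # No keyword matched at all: refuse to guess.
    return best if scores[best] > 0 else "unknown"


def extract_fields(text: str) -> dict:
    """Return the fields whose pattern matches somewhere in the text."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            fields[name] = match.group(1)
    return fields


sample = """ACME Corp
Invoice No: INV-2041
Date: 2021-03-15
Bill to: Example Ltd
Total: $1,250.00
"""

doc_type = classify(sample)
fields = extract_fields(sample)
```

For the sample text, `doc_type` is `"invoice"` and `fields` holds the invoice number, date and total as strings. Text that matches no keyword is labelled `"unknown"` rather than forced into a class.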
from archives or heritage collections. Specific approaches were developed for various sources, including textual documents, such as newspaper archives, but also images, or maps.
(ICR) to extract data from several types of documents. Advances in automatic document processing, also called Intelligent Document Processing, improve the ability to process
technologies are also involved, whether in the form of classical or three-dimensional scanning. The digitization of 3D documents can, in particular, rely on derivatives of
technologies. It is applied in many industrial and scientific fields for the optimization of administrative processes, mail processing and the digitization of analog
572:, John E. Jones; William J. Jones & Frank M. Csultis, "Financial document processing system", published 2011-01-18, issued 2011-01-18
These technologies often form the core of document processing. However, other algorithms may intervene before or after these processes. Indeed, document
At the other end of the chain are various image completion, extrapolation or data cleanup algorithms. For textual documents, the interpretation can use
661:"Volumetric assessment of extrusion in medial meniscus posterior root tears through semi-automatic segmentation on 3-tesla magnetic resonance images"
or not. The term can also include the phase of digitizing the document using a scanner and the phase of interpreting the document, for example using
algorithms, which can sometimes also be used to detect the structure of the document. The resolution of the latter problem sometimes also uses
, such as letters and parcels, with the aim of sorting them or extracting data from them, sometimes on a massive scale. This work could be performed in-house or through
Neural networks for semantic segmentation of historical city maps: Cross-cultural performance and the impact of figurative diversity
A technology called automatic document processing or sometimes intelligent document processing (IDP) emerged as a specific form of
Document processing was initially, and still is to some extent, a kind of production-line work dealing with the treatment of
829:"New Techniques for the Digitization of Art Historical Photographic Archives - the Case of the Cini Foundation in Venice"
996:. 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR). Niagara Falls, NY, USA: IEEE.
233:(HTR), which allow the text to be transcribed automatically. Text segments as such are identified using instance or
38:, but also to make it digitally intelligible. This includes extracting the structure of the document or the
advanced, document processing transitioned to handling "document components ... as database entities."
and then the content, which can take the form of text or images. The process can involve traditional
Adamo, Francesco; Attivissimo, Filippo; Di Nisio, Attilio; Spadavecchia, Maurizio (February 2015).
908:"Segmentation methods for character recognition: from segmentation to document structure analysis"
algorithms, convolutional neural networks or manual labor. The problems addressed are related to
Ares Oliveira, Sofia; di Lenardo, Isabella; Tourenc, Bastien; Kaplan, Frédéric (11 July 2019).
Ehrmann, Maud; Romanello, Matteo; Clematide, Simon; Ströbel, Phillip; Barman, Raphaël (2020).
Integrative Document & Content Management: Strategies for Exploiting Enterprise Knowledge
Seguin, Benoit; Costiner, Lisandra; di Lenardo, Isabella; Kaplan, Frédéric (April 1, 2018).
98:. Document processing can indeed involve some kind of externalized manual labor, such as
131:" stated that "document processing begins with the scanner". In this context, a former
758:"Leucocyte classification for leukaemia detection using image processing techniques"
Many technologies support the development of document processing, in particular
While document processing involved data entry via keyboard well before the use of a
707:"MRI Segmentation of the Human Brain: Challenges, Methods, and Applications"
Ares Oliveira, Sofia; Seguin, Benoit; Kaplan, Frederic (5–8 August 2018).
Putzu, Lorenzo; Caocci, Giovanni; Di Ruberto, Cecilia (November 2014).
Al Young; Dayle Woolstein; Jay Johnson (February 1996). "Unknown Title".
807:"Language Resources for Historical Newspapers: the Impresso Collection"
498:, Stefano Ferilli, Teresa M. A. Basile, Nicola Di Mauro (2005-04-01).
dhSegment: A Generic Deep-Learning Approach for Document Segmentation
Despotović, Ivana; Bart, Goossens; Wilfried, Philips (1 March 2015).
615:"An automatic document processing system for medical data extraction"
256:. Furthermore, the processing of heterogeneous databases can rely on
Proceedings of the 12th Language Resources and Evaluation Conference
590:"Appian Adds Google Cloud Intelligence To Low-Code Automation Mix"
digital. Document processing does not simply aim to photograph or
Changwan, Kim; Seong-Il, Lee; Won Joon, Cho (September 2020).
835:. Society for Imaging Science and Technology. pp. 1–5.
Business Process Outsourcing: A Supply Chain of Expertises
Tang, Yuan Y.; Lee, Seong-Whan; Suen, Ching Y. (1996).
Orthopaedics & Traumatology: Surgery & Research
860:. Digital Humanities Conference. Utrecht, Netherlands.
433:"Paper, Once Written Off, Keeps a Place in the Office"
Department of Computer Science – University of Bari
190:physicians in medical diagnosis, e.g. by analyzing
906:Fujisawa, H.; Nakano, Y.; Kurino, K. (July 1992).
711:Computational Intelligence Techniques in Medicine
857:A deep learning approach to Cadastral Computing
201:Document processing is also widely used in the
833:Archiving 2018 Final Program and Proceedings
383:Outsourcing to India: The Offshore Advantage
177:with fewer exceptions and greater speeds.
1026:"Revolutionary Scanning Technology for Art"
1057:Automatic identification and data capture
946:"Automatic document processing: a survey"
386:. Springer Science & Business Media.
546:"Intelligent Document Processing (IDP)"
813:. Marseille, France. pp. 958–968.
326:Len Asprey; Michael Middleton (2003).
380:Mark Kobayashi-Hillary (2005-12-05).
409:"Immigration Contractor Trims Wages"
22:is a field of research and a set of
762:Artificial Intelligence in Medicine
431:Lawrence M. Fisher (July 7, 1990).
56:optical character recognition (OCR)
407:Julia Preston (December 2, 2007).
222:neural networks, in some sectors.
60:handwritten text recognition (HTR)
16:Digitalisation of analog documents
841:10.2352/issn.2168-3204.2018.1.0.2
639:10.1016/j.measurement.2014.10.032
468:"Intelligent Document processing"
209:, in order to extract historical
171:Intelligent Character Recognition
1067:Applications of computer vision
871:Petitpierre, Rémi (July 2020).
155:Intelligent Process Automation
127:regarding what it called the "
1012:10.1109/ICFHR-2018.2018.00011
970:10.1016/S0031-3203(96)00044-1
353:Vinod V. Sople (2009-05-25).
227:optical character recognition
140:Automatic document processing
774:10.1016/j.artmed.2014.09.002
231:handwritten text recognition
96:business process outsourcing
893:10.13140/RG.2.2.10973.64484
265:natural language processing
167:Natural Language Processing
72:natural language processing
677:10.1016/j.rcot.2020.06.003
82:and historical documents.
26:aimed at making an analog
570:US active US7873576B2
359:. PHI Learning Pvt. Ltd.
192:magnetic resonance images
332:. Idea Group Inc (IGI).
912:Proceedings of the IEEE
159:artificial intelligence
34:a document to obtain a
510:10.1109/ICDAR.2005.144
239:semantic segmentation
48:semantic segmentation
588:Bridgwater, Adrian.
267:(NLP) technologies.
258:image classification
121:, a 1990 article in
76:image classification
24:production processes
1062:Applied data mining
962:1996PatRe..29.1931T
950:Pattern Recognition
724:10.1155/2015/450341
631:2015Meas...61...88A
277:Document automation
62:and, more broadly,
20:Document processing
438:The New York Times
414:The New York Times
282:Document modelling
207:digital humanities
124:The New York Times
956:(12): 1931–1952.
496:Floriana Esposito
175:unstructured data
157:(IPA), combining
292:Document Imaging
235:object detection
163:Machine Learning
147:state of the art
129:paperless office
119:computer scanner
52:object detection
454:Object Magazine
297:Duplex scanning
287:Data Processing
100:mechanical Turk
44:computer vision
768:(3): 179–191.
671:(5): 963–968.
550:keymarkinc.com
366:978-8120338159
260:technologies.
250:photogrammetry
115:computer mouse
456:. p. 51.
393:9783540247944
339:9781591400554
64:transcription
36:digital image
1033:. Retrieved
973:. Retrieved
927:. Retrieved
688:. Retrieved
642:. Retrieved
597:. Retrieved
553:. Retrieved
479:. Retrieved
477:. 2005-04-07
246:digitization
241:algorithms.
217:Technologies
181:Applications
783:11584/94592
717:: 963–968.
619:Measurement
302:Text mining
254:file format
229:(OCR), and
196:microscopic
1051:Categories
1035:3 February
1003:1804.10371
975:3 February
929:3 February
884:2101.12478
690:31 January
644:31 January
599:2021-04-21
555:2024-07-12
481:2018-09-08
313:References
203:humanities
108:data entry
86:Background
66:, whether
685:225215597
625:: 88–99.
526:cite book
169:(NLP) or
92:documents
74:(NLP) or
68:automatic
792:25241903
743:25945121
518:17302169
307:Workflow
271:See also
211:big data
198:images.
161:such as
80:archives
28:document
958:Bibcode
877:(MSc).
734:4402572
627:Bibcode
144:As the
1030:Artmyn
790:
741:
731:
683:
594:Forbes
576:
516:
390:
363:
336:
165:(ML),
40:layout
998:arXiv
879:arXiv
681:S2CID
514:S2CID
471:(PDF)
194:, or
133:Xerox
117:or a
1037:2021
977:2021
931:2021
788:PMID
739:PMID
715:2015
692:2021
646:2021
532:link
388:ISBN
361:ISBN
334:ISBN
205:and
32:scan
1008:doi
966:doi
920:doi
889:doi
837:doi
778:hdl
770:doi
729:PMC
719:doi
673:doi
669:101
635:doi
506:doi
110:."
954:29
916:80
766:63
623:61