Motion estimation

In computer vision and image processing, motion estimation is the process of determining motion vectors that describe the transformation from one 2D image to another, usually from adjacent frames in a video sequence. It is an ill-posed problem, as the motion happens in three dimensions (3D) but the images are a projection of the 3D scene onto a 2D plane. The motion vectors may relate to the whole image (global motion estimation) or to specific parts, such as rectangular blocks, arbitrarily shaped patches or even individual pixels. The motion vectors may be represented by a translational model or by many other models that can approximate the motion of a real video camera, such as rotation and translation in all three dimensions and zoom.

Figure: Motion vectors that result from a movement into the z-plane of the image, combined with a lateral movement to the lower-right. This is a visualization of the motion estimation performed in order to compress an MPEG movie.

Related terms

More often than not, the terms motion estimation and optical flow are used interchangeably. Motion estimation is also related in concept to image registration and stereo correspondence. In fact, all of these terms refer to the process of finding corresponding points between two images or video frames. The points that correspond to each other in two views (images or frames) of a real scene or object are "usually" the same point in that scene or on that object. Before we do motion estimation, we must define our measurement of correspondence, i.e., the matching metric, which is a measurement of how similar two image points are. There is no right or wrong here; the choice of matching metric is usually related to what the final estimated motion is used for as well as the optimisation strategy in the estimation process.

Each motion vector is used to represent a macroblock in a picture based on the position of this macroblock (or a similar one) in another picture, called the reference picture.

The H.264/MPEG-4 AVC standard defines motion vector as:

motion vector: a two-dimensional vector used for inter prediction that provides an offset from the coordinates in the decoded picture to the coordinates in a reference picture.

Algorithms

The methods for finding motion vectors can be categorised into pixel-based methods ("direct") and feature-based methods ("indirect"). A famous debate resulted in two papers from the opposing factions being produced to try to establish a conclusion.

Direct methods

Block-matching algorithm
Phase correlation and frequency domain methods
Pixel recursive algorithms
Optical flow

Indirect methods

Indirect methods use features, such as corner detection, and match corresponding features between frames, usually with a statistical function applied over a local or global area. The purpose of the statistical function is to remove matches that do not correspond to the actual motion.

Statistical functions that have been successfully used include RANSAC.

Additional note on the categorization

It can be argued that almost all methods require some kind of definition of the matching criteria. The difference is only whether you summarise over a local image region first and then compare the summarisation (such as feature-based methods), or you compare each pixel first (such as squaring the difference) and then summarise over a local image region (block-based motion and filter-based motion). An emerging type of matching criteria summarises a local image region first for every pixel location (through some feature transform such as the Laplacian transform), compares each summarised pixel and summarises over a local image region again. Some matching criteria have the ability to exclude points that do not actually correspond to each other albeit producing a good matching score; others do not have this ability, but they are still matching criteria.

Affine motion estimation

Affine motion estimation is a technique used in computer vision and image processing to estimate the motion between two images or frames. It assumes that the motion can be modeled as an affine transformation (translation + rotation + zooming), which is a linear transformation followed by a translation.

Applications

Figure: Video frames with motion interpolation.

Video coding

Applying the motion vectors to an image to synthesize the transformation to the next image is called motion compensation. It is most easily applied to discrete cosine transform (DCT) based video coding standards, because the coding is performed in blocks.

As a way of exploiting temporal redundancy, motion estimation and compensation are key parts of video compression. Almost all video coding standards use block-based motion estimation and compensation, such as the MPEG series including the most recent HEVC.

3D reconstruction

In simultaneous localization and mapping, a 3D model of a scene is reconstructed using images from a moving camera.

See also

Moving object detection
Graphics processing unit
Vision processing unit
Scale-invariant feature transform

References

1. John X. Liu (2006). Computer Vision and Robotics. Nova Publishers. ISBN 978-1-59454-357-9.
2. Latest working draft of H.264/MPEG-4 AVC. Archived 2004-07-23 at the Wayback Machine. Retrieved on 2008-02-29.
3. "Latest working draft of H.264/MPEG-4 AVC on hhi.fraunhofer.de" (PDF).
4. Philip H.S. Torr and Andrew Zisserman: Feature Based Methods for Structure and Motion Estimation, ICCV Workshop on Vision Algorithms, pages 278-294, 1999.
5. Michal Irani and P. Anandan: About Direct Methods, ICCV Workshop on Vision Algorithms, pages 267-277, 1999.
6. Rui Xu, David Taubman & Aous Thabit Naman, "Motion Estimation Based on Mutual Information and Adaptive Multi-scale Thresholding", in IEEE Transactions on Image Processing, vol. 25, no. 3, pp. 1095-1108, March 2016.
7. Borko Furht; Joshua Greenberg; Raymond Westwater (6 December 2012). Motion Estimation Algorithms for Video Compression. Springer Science & Business Media. ISBN 978-1-4615-6241-2.
8. Swartz, Charles S. (2005). Understanding Digital Cinema: A Professional Handbook. Taylor & Francis. p. 143. ISBN 9780240806174.
9. Kerl, Christian, Jürgen Sturm, and Daniel Cremers. "Dense visual SLAM for RGB-D cameras." 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 2013.
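
Worked example: block matching. The article names the block-matching algorithm as a direct method and notes that video coding standards rely on block-based motion estimation. The following is a minimal sketch of that idea, assuming a full search over a small window and the sum of absolute differences (SAD) as the matching metric. The function names, the block size `n`, the search range `search` and the synthetic frames are illustrative assumptions, not taken from any standard or codec.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n-by-n block of `cur` at (bx, by)
    and the block of `ref` at the same position displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def block_match(cur, ref, bx, by, n=4, search=2):
    """Full search: try every displacement in [-search, search] on both axes
    and return ((dx, dy), cost) for the candidate with the lowest SAD."""
    h, w = len(ref), len(ref[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose block would fall outside the reference picture.
            if bx + dx < 0 or by + dy < 0 or bx + dx + n > w or by + dy + n > h:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost


# Synthetic 12x12 reference picture (assumption: every pixel value is distinct,
# so exactly one displacement gives a perfect match).
H, W = 12, 12
ref = [[16 * y + x for x in range(W)] for y in range(H)]

# Current picture: the same content moved 1 pixel right and 2 pixels down;
# the uncovered border is filled with 0.
cur = [[ref[y - 2][x - 1] if y >= 2 and x >= 1 else 0 for x in range(W)]
       for y in range(H)]

mv, cost = block_match(cur, ref, bx=4, by=4, n=4, search=2)
print(mv, cost)  # prints: (-1, -2) 0
```

The returned vector points from the block in the current picture to its match in the reference picture, in line with the offset described in the quoted H.264/MPEG-4 AVC definition. Copying the matched reference block into the block position would be the corresponding motion compensation step. Practical encoders typically replace the full search with faster search patterns.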

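Worked example: rejecting bad feature matches. The article says that indirect methods match features between frames and apply a statistical function, such as RANSAC, to remove matches that do not correspond to the actual motion. The sketch below illustrates this for the simplest motion model, a pure 2D translation. The function name, the tolerance `tol`, the iteration count `iters`, the fixed `seed` and the sample matches are all assumptions made for illustration; real pipelines usually fit richer models such as affine transformations.

```python
import random


def ransac_translation(matches, tol=1.0, iters=50, seed=0):
    """Estimate a 2D translation from point matches while rejecting outliers.

    matches: list of ((x0, y0), (x1, y1)) pairs, a feature position in the
    first frame and its putative match in the second frame.
    Returns ((tx, ty), inlier_matches).
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iters):
        # A translation has two unknowns, so a single match is a minimal sample.
        (x0, y0), (x1, y1) = rng.choice(matches)
        tx, ty = x1 - x0, y1 - y0
        # Consensus set: matches whose displacement agrees with the hypothesis.
        inliers = [m for m in matches
                   if abs(m[1][0] - m[0][0] - tx) <= tol
                   and abs(m[1][1] - m[0][1] - ty) <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on the consensus set by averaging the inlier displacements.
    n = len(best_inliers)
    tx = sum(b[0] - a[0] for a, b in best_inliers) / n
    ty = sum(b[1] - a[1] for a, b in best_inliers) / n
    return (tx, ty), best_inliers


# Eight matches that follow the true motion (+3, -2) and two wrong matches.
points = [(0, 0), (5, 1), (2, 7), (9, 4), (6, 6), (1, 3), (8, 8), (4, 2)]
matches = [((x, y), (x + 3, y - 2)) for x, y in points]
matches += [((3, 3), (20, 15)), ((7, 1), (0, 9))]

translation, inliers = ransac_translation(matches)
print(translation, len(inliers))
```

With these sample matches, the two wrong correspondences disagree with every hypothesis drawn from a correct match, so they are left out of the consensus set and do not affect the averaged translation.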

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.