
Hamilton–Jacobi–Bellman equation


The Hamilton–Jacobi–Bellman (HJB) equation is a nonlinear partial differential equation that provides necessary and sufficient conditions for optimality of a control with respect to a loss function. Its solution is the value function of the optimal control problem which, once known, can be used to obtain the optimal control by taking the maximizer (or minimizer) of the Hamiltonian involved in the HJB equation.

The equation is a result of the theory of dynamic programming, which was pioneered in the 1950s by Richard Bellman and coworkers. The connection to the Hamilton–Jacobi equation from classical physics was first drawn by Rudolf Kálmán. In discrete-time problems, the analogous difference equation is usually referred to as the Bellman equation.

While classical variational problems, such as the brachistochrone problem, can be solved using the Hamilton–Jacobi–Bellman equation, the method can be applied to a broader spectrum of problems. Further, it can be generalized to stochastic systems, in which case the HJB equation is a second-order elliptic partial differential equation. A major drawback, however, is that the HJB equation admits classical solutions only for a sufficiently smooth value function, which is not guaranteed in most situations. Instead, the notion of a viscosity solution is required, in which conventional derivatives are replaced by (set-valued) subderivatives.

Optimal Control Problems

Consider the following problem in deterministic optimal control over the time period $[0,T]$:

$$V(x(0),0) = \min_{u} \left\{ \int_{0}^{T} C[x(t),u(t)]\,dt + D[x(T)] \right\}$$

where $C[\cdot]$ is the scalar cost rate function and $D[\cdot]$ is a function that gives the bequest value at the final state, $x(t)$ is the system state vector, $x(0)$ is assumed given, and $u(t)$ for $0 \leq t \leq T$ is the control vector that we are trying to find. Thus, $V(x,t)$ is the value function.

The system must also be subject to

$$\dot{x}(t) = F[x(t),u(t)]$$

where $F[\cdot]$ gives the vector determining the physical evolution of the state vector over time.

The Partial Differential Equation

For this simple system, the Hamilton–Jacobi–Bellman partial differential equation is

$$\frac{\partial V(x,t)}{\partial t} + \min_{u} \left\{ \frac{\partial V(x,t)}{\partial x} \cdot F(x,u) + C(x,u) \right\} = 0$$

subject to the terminal condition

$$V(x,T) = D(x).$$

As before, the unknown scalar function $V(x,t)$ in the above partial differential equation is the Bellman value function, which represents the cost incurred from starting in state $x$ at time $t$ and controlling the system optimally from then until time $T$.

Deriving the Equation

Intuitively, the HJB equation can be derived as follows. If $V(x(t),t)$ is the optimal cost-to-go function (also called the 'value function'), then by Richard Bellman's principle of optimality, going from time $t$ to $t+dt$, we have

$$V(x(t),t) = \min_{u} \left\{ V(x(t+dt),t+dt) + \int_{t}^{t+dt} C(x(s),u(s))\,ds \right\}.$$

Note that the Taylor expansion of the first term on the right-hand side is

$$V(x(t+dt),t+dt) = V(x(t),t) + \frac{\partial V(x,t)}{\partial t}\,dt + \frac{\partial V(x,t)}{\partial x} \cdot \dot{x}(t)\,dt + \mathcal{o}(dt),$$

where $\mathcal{o}(dt)$ denotes the terms in the Taylor expansion of higher order than one in little-o notation. Then if we subtract $V(x(t),t)$ from both sides, divide by $dt$, and take the limit as $dt$ approaches zero, we obtain the HJB equation defined above.

Solving the Equation

The HJB equation is usually solved backwards in time, starting from $t=T$ and ending at $t=0$.

When solved over the whole of state space and $V(x)$ is continuously differentiable, the HJB equation is a necessary and sufficient condition for an optimum when the terminal state is unconstrained. If we can solve for $V$ then we can find from it a control $u$ that achieves the minimum cost.

In the general case, the HJB equation does not have a classical (smooth) solution. Several notions of generalized solutions have been developed to cover such situations, including viscosity solution (Pierre-Louis Lions and Michael Crandall), minimax solution (Andrei Izmailovich Subbotin), and others.

Approximate dynamic programming has been introduced by D. P. Bertsekas and J. N. Tsitsiklis with the use of artificial neural networks (multilayer perceptrons) for approximating the Bellman function in general. This is an effective mitigation strategy for reducing the impact of dimensionality: instead of storing the complete function mapping for the whole space domain, only the neural network parameters are stored. In particular, for continuous-time systems, an approximate dynamic programming approach that combines policy iterations with neural networks was introduced. In discrete time, an approach to solve the HJB equation combining value iterations and neural networks was introduced.

Alternatively, it has been shown that sum-of-squares optimization can yield an approximate polynomial solution to the Hamilton–Jacobi–Bellman equation arbitrarily well with respect to the $L^{1}$ norm.

Extension to Stochastic Problems

The idea of solving a control problem by applying Bellman's principle of optimality and then working out an optimizing strategy backwards in time can be generalized to stochastic control problems. Consider, similarly to the above,

$$\min_{u} \mathbb{E} \left\{ \int_{0}^{T} C(t,X_{t},u_{t})\,dt + D(X_{T}) \right\}$$

now with $(X_{t})_{t\in[0,T]}$ the stochastic process to optimize and $(u_{t})_{t\in[0,T]}$ the steering. By first using Bellman and then expanding $V(X_{t},t)$ with Itô's rule, one finds the stochastic HJB equation

$$\min_{u} \left\{ \mathcal{A}V(x,t) + C(t,x,u) \right\} = 0,$$

where $\mathcal{A}$ represents the stochastic differentiation operator, and subject to the terminal condition

$$V(x,T) = D(x).$$

Note that the randomness has disappeared. In this case a solution $V$ of the latter does not necessarily solve the primal problem; it is a candidate only, and a further verifying argument is required. This technique is widely used in Financial Mathematics to determine optimal investment strategies in the market (see for example Merton's portfolio problem).

Application to LQG-Control

As an example, we can look at a system with linear stochastic dynamics and quadratic cost. If the system dynamics is given by

$$dx_{t} = (ax_{t} + bu_{t})\,dt + \sigma\,dw_{t},$$

and the cost accumulates at rate $C(x_{t},u_{t}) = r(t)u_{t}^{2}/2 + q(t)x_{t}^{2}/2$, the HJB equation is given by

$$-\frac{\partial V(x,t)}{\partial t} = \frac{1}{2}q(t)x^{2} + \frac{\partial V(x,t)}{\partial x}ax - \frac{b^{2}}{2r(t)}\left(\frac{\partial V(x,t)}{\partial x}\right)^{2} + \frac{\sigma^{2}}{2}\frac{\partial^{2}V(x,t)}{\partial x^{2}},$$

with optimal action given by

$$u_{t} = -\frac{b}{r(t)}\frac{\partial V(x,t)}{\partial x}.$$

Assuming a quadratic form for the value function, we obtain the usual Riccati equation for the Hessian of the value function, as is usual for linear-quadratic-Gaussian control.

See also

Bellman equation, discrete-time counterpart of the Hamilton–Jacobi–Bellman equation.
Pontryagin's maximum principle, a necessary but not sufficient condition for an optimum, obtained by maximizing a Hamiltonian; it has the advantage over HJB of only needing to be satisfied over the single trajectory being considered.
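Example: solving backwards in time on a grid

The backward-in-time solution procedure and the principle of optimality can be illustrated with a small numerical sketch. It is not taken from the article. It assumes a toy problem chosen for illustration: scalar dynamics $\dot{x} = u$, cost rate $C = (x^{2} + u^{2})/2$, bequest value $D = 0$ and horizon $T = 1$. For this problem the HJB equation has the closed-form solution $V(x,0) = \tfrac{1}{2}\tanh(T)\,x^{2}$. The scheme is a simple semi-Lagrangian recursion, $V(x,t) \approx \min_{u}\{C\,dt + V(x + u\,dt,\,t+dt)\}$, with linear interpolation in $x$. The function name and all grid sizes are hypothetical choices.

```python
import math

def solve_hjb_backward(T=1.0, n_t=50, x_max=2.0, n_x=201, u_max=3.0, n_u=61):
    """Approximate V(x, 0) on a grid for x' = u, C = (x^2 + u^2)/2, D = 0.

    Illustrative assumptions: uniform grids in x and u, explicit time steps,
    and clamping of the state to [-x_max, x_max].
    """
    dt = T / n_t
    dx = 2.0 * x_max / (n_x - 1)
    xs = [-x_max + i * dx for i in range(n_x)]
    us = [-u_max + j * (2.0 * u_max / (n_u - 1)) for j in range(n_u)]

    def interp(values, x):
        # Linear interpolation on the uniform x grid, clamped at the boundary.
        s = (min(max(x, -x_max), x_max) + x_max) / dx
        i = min(int(s), n_x - 2)
        w = s - i
        return (1.0 - w) * values[i] + w * values[i + 1]

    V = [0.0] * n_x  # terminal condition V(x, T) = D(x) = 0
    for _ in range(n_t):  # march from t = T down to t = 0
        # Principle of optimality over one step of length dt.
        V = [
            min(0.5 * (x * x + u * u) * dt + interp(V, x + u * dt) for u in us)
            for x in xs
        ]
    return xs, V

xs, V0 = solve_hjb_backward()
i = xs.index(min(xs, key=lambda x: abs(x - 1.0)))  # grid point nearest x = 1
print("numerical V(1, 0):", V0[i])
print("closed form      :", 0.5 * math.tanh(1.0))
```

The two printed values should agree up to the discretization error of the scheme, which shrinks as the grids are refined. The cost of such grid methods grows quickly with the state dimension, which is the motivation for the approximate dynamic programming methods mentioned in the article.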
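Example: where the optimal action in the LQG example comes from

The following worked steps sketch how the linear-quadratic HJB equation and its optimal action follow from the stochastic HJB equation. Two assumptions are made here that the article does not state: the operator $\mathcal{A}$ is read as including the time derivative, and the terminal cost is taken to be quadratic, $D(x) = \tfrac{1}{2}p_{T}x^{2}$. The symbols $p(t)$, $c(t)$ and $p_{T}$ are introduced only for this illustration.

```latex
\begin{align*}
% Generator of dx = (a x + b u) dt + sigma dw applied to V (time derivative included):
\mathcal{A}V &= \frac{\partial V}{\partial t}
  + (ax + bu)\frac{\partial V}{\partial x}
  + \frac{\sigma^{2}}{2}\frac{\partial^{2} V}{\partial x^{2}} \\
% Only two terms depend on u; set their derivative to zero:
0 &= \frac{\partial}{\partial u}\left[ bu\frac{\partial V}{\partial x}
  + \frac{1}{2}r(t)u^{2} \right]
  \quad\Longrightarrow\quad
  u^{*} = -\frac{b}{r(t)}\frac{\partial V}{\partial x} \\
% Substituting u* back gives the quadratic gradient term of the HJB equation:
bu^{*}\frac{\partial V}{\partial x} + \frac{1}{2}r(t)(u^{*})^{2}
  &= -\frac{b^{2}}{2r(t)}\left(\frac{\partial V}{\partial x}\right)^{2} \\
% Quadratic ansatz V = p(t) x^2 / 2 + c(t), matching powers of x:
-\dot{p}(t) &= q(t) + 2a\,p(t) - \frac{b^{2}}{r(t)}p(t)^{2}, \qquad p(T) = p_{T} \\
-\dot{c}(t) &= \frac{\sigma^{2}}{2}p(t), \qquad c(T) = 0
\end{align*}
```

Under these assumptions the optimal action becomes the linear feedback $u_{t} = -\frac{b\,p(t)}{r(t)}x_{t}$. The noise level $\sigma$ only shifts the value function through $c(t)$ and does not change the feedback gain.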

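Example: integrating the scalar Riccati equation

For the linear-quadratic example, a quadratic ansatz $V(x,t) = \tfrac{1}{2}p(t)x^{2} + c(t)$ reduces the HJB equation to the ordinary differential equations $-\dot{p} = q + 2ap - b^{2}p^{2}/r$ and $-\dot{c} = \sigma^{2}p/2$. The sketch below integrates them backwards from $t = T$ with explicit Euler steps. It is an illustration only: it assumes constant $q$ and $r$, a terminal cost $\tfrac{1}{2}p_{T}x^{2}$, and the function name, parameter values and step count are hypothetical.

```python
import math

def riccati_backward(a, b, q, r, p_T, sigma, T, n_steps=20000):
    """Integrate the scalar Riccati ODE backwards from t = T to t = 0.

    Assumes constant q and r (an illustrative simplification) and
    V(x, T) = p_T * x**2 / 2. Returns (p(0), c(0)).
    """
    dt = T / n_steps
    p, c = p_T, 0.0
    for _ in range(n_steps):
        # -dp/dt and -dc/dt evaluated at the current (later) time.
        minus_dp = q + 2.0 * a * p - (b * b / r) * p * p
        minus_dc = 0.5 * sigma * sigma * p
        # Stepping from t to t - dt adds dt times the negated derivative.
        p, c = p + dt * minus_dp, c + dt * minus_dc
    return p, c

a, b, q, r = 0.5, 1.0, 1.0, 1.0
p0, c0 = riccati_backward(a, b, q, r, p_T=0.0, sigma=0.3, T=20.0)

# Stationary root of q + 2 a p - b^2 p^2 / r = 0, reached for long horizons.
p_inf = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
print("p(0)           :", p0)
print("stationary p   :", p_inf)
print("feedback gain k:", b * p0 / r, "(u = -k x)")
```

For a long horizon, $p(0)$ approaches the stationary root, so the feedback gain far from the terminal time is effectively constant. With $a = 0$, $b = q = r = 1$, $p_{T} = 0$ and $T = 1$, the same routine approximates $p(0) = \tanh(1)$.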

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.