Knowledge

Talk:Limited-memory BFGS

Source 📝

493:
article by Byrd, Nocedal and Schnabel which I've cited, I've enriched the wikipedia article to the best of my ability. We really need someone who can explain the relevant proofs -- e.g. Why we're able to use a limited history for the BFGS update without causing the approximate Hessian to stop being symmetric and positive definite. These proofs are given in the "Representations" paper, but I don't understand them well enough to reduplicate the argument without lifting the proofs directly. I imagine they're also in the 1980 paper, but I can't find that anywhere. The 1989 paper is totally useless from an implementation perspective -- it outlines the QN procedure, and skips over how to do the Hessian w/o representing the whole thing.
508:
definition of s and y to be consistent with i going to the current iteration minus one, which seems to have been the intention of the code that was there (there are two alternative formulations in the literature which differ by an index offset of one, and I've gone with the one you can see there, where i goes up to the current iteration minus one). I also changed scalars to Greek letters, because when we're not using bold for vector quantities, it is otherwise quite hard to distinguish scalar and vector quantities.
447:. It is an online encyclopedia, and while links to software packages for L-BFGS are certainly relevant and worth including, this entry is not here simply to advertise software packages; the Knowledge entry on L-BFGS should first and foremost be a discussion the algorithm. If no further objection/elaboration is raised, I'm going to re-add the algorithm information and reformat the article to conform with Knowledge standards. -- 151: 78: 60: 33: 416:
As it stands, this article is basically an a set of links to, and information about, library packages. It should have a more detailed description of the algorithm itself and fewer links to external content. Does someone with a more thorough background in optimization than myself want to take this on?
492:
I recently had to implement this, and was very annoyed that there wasn't a helpful Knowledge page, and none of my standard sources had enough information (e.g. Numerical Recipes). As such, having finished my implementation, and finding the key to doing so buried at the back of the "Representations"
431:
The intention when creating this page was to give software information. Our hope is that users can contribute links to versions of L-BFGS written in different languages. There are various explanations of the L-BFGS algorithm on the web and there is no need for another one here. Therefore I will try
1031:
The paper is: Richard H. Byrd, Jorge Nocedal and Robert B. Schnabel: “Representations of Quasi-Newton Matrices and Their Use in Limited Memory Methods,” Technical Report, CU-CS-612-92, University of Colorado at Boulder, 1992. This is probably the same as the one from citation 5, only two years
507:
Thanks to the previous contributors (Abeppu?) for adding in the L-BFGS algorithm, which is indeed very useful and nicely formatted. There seem to have been some minor errors about the loop index i, and while referring to some published descriptions of L-BFGS I fixed these. I also changed the
172: 1027:
makes this matrix diagonal, a fact that is described in the following text as only “commonly”. There should, IMO, be a short explanation of this apparent contradiction. I feel I don't know enough about this, so won't provide one.
964: 800: 477:
I completely agree with Soultaco; a page entitled L-BFGS should provide a concise description of the algorithm. Readers are more interested in how the method works than the author's software package that implements it.
196: 336: 851: 616: 395:
It would be nice if the LBFGS article and the BFGS article used the same symbol to represent the approximation to the inverse of the Hessian. LBFGS uses Hk and BFGS uses Bk^{-1}
1490: 118: 253: 191: 1203: 659: 1248: 1460: 1413: 1159: 1111: 1079: 1025: 527:
in the sentence "An early, open source implementation of L-BFGS in Fortran exists" I would expect some citation and/or link to this reference that seems important. No?
124: 1254:. Even though my code works for a benchmark optimization problem it would be good if someone could find a source or dig through the original papers on limited bfgs. 1495: 1433: 1325: 1321: 1307: 987: 1485: 663:
Nocedal's Numerical Optimization has a positive sign. (Alg 7.4, page 178) I tried implementing the version listed here and it does not converge on Rosenbrock.
1081:
a **scalar** matrix, so by abuse of notation one might say that the pseudo code is in fact correct? But then surely there should be a better approximation of
94: 298: 856: 272: 687: 137: 85: 65: 670: 361: 1035: 402: 244: 1250:
gave convergence to local minimum in 14 steps. I used a strictly positive bivariate quartic polynomial for my objective similar to the
1303:
When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.
225: 317: 805:
How to fix this? The paper cited below gives at the top of page 9 (in the paragraph just before equation 3.1) the formula
1293: 1368: 543: 282: 163: 40: 292: 206: 327: 93:
related articles on Knowledge. If you would like to participate, please visit the project page, where you can join
354: 1324:
to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the
674: 1039: 406: 462:
I have rewritten the page so that it conforms to Knowledge standards and informs the reader about the L-BFGS
1359: 1285: 808: 573: 1467: 1281: 1118: 483: 1343:
If you have discovered URLs which were erroneously considered dead by the bot, you can report them with
1331: 1259: 263: 46: 1284:. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit 1114: 554: 498: 666: 531: 398: 1209:
satisfying point. (I used Algorithm 3.5 & 3.6 of Nocedal mentioned above.) Flipping the sign to
479: 1277: 1255: 1251: 509: 1164: 1129:
Hi, I know wikipedia is not intended to be a place of original research, but the sign in front of
620: 471: 433: 1212: 550: 535: 513: 494: 452: 1438: 1328:
before doing mass systematic removals. This message is updated dynamically through the template
1344: 1463: 1386: 1132: 1084: 1052: 998: 182: 1206: 539: 234: 90: 1351: 1310:, "External links modified" talk page sections are no longer generated or monitored by 992:
If someone could verify this and then change the page accordingly that would be great.
308: 150: 1418: 1350:
If you found an error with any archives or the URLs themselves, you can fix them with
1294:
https://web.archive.org/web/20131101205929/http://acl.ldc.upenn.edu/W/W02/W02-2018.pdf
972: 173:
Requested articles/Applied arts and sciences/Computer science, computing, and Internet
1479: 448: 444: 418: 1415:
in the algorithm. It is just a scaled identity matrix, so why not simply multiply
1317: 1471: 1373: 1316:. No special action is required regarding these talk page notices, other than 1297: 1263: 1122: 1049:
I was wandering about the same thing. But following your interpretation makes
1043: 678: 558: 517: 502: 487: 456: 436: 421: 410: 959:{\displaystyle \gamma _{k}=y_{k-1}^{\rm {T}}s_{k-1}/y_{k-1}^{\rm {T}}y_{k-1}} 1161:
is definitely bugged. I coded up limited memory BFGS myself to check. Using
215: 795:{\displaystyle H_{k}^{0}=y_{k-1}^{\rm {T}}s_{k-1}/y_{k-1}^{\rm {T}}y_{k-1}} 77: 59: 17: 802:
can't be right: The left side is a matrix, the right side a scalar.
466:. Please feel free to write a separate page about limited memory 470:, but I propose that we do not try to do both in the same page. 291:
Find pictures for the biographies of computer scientists (see
26: 1288:
for additional information. I made the following changes:
1383:
I don't understand the point of introducing the matrix
1441: 1421: 1389: 1215: 1167: 1135: 1087: 1055: 1001: 989:
should be added on the right side of the assignment.
975: 859: 811: 690: 623: 576: 89:, a collaborative effort to improve the coverage of 1320:using the archive tool instructions below. Editors 39:This article has not yet been rated on Knowledge's 1454: 1427: 1407: 1242: 1197: 1153: 1105: 1073: 1019: 981: 958: 853:and, on the bottom of page 11, equation 3.12 says 845: 794: 653: 610: 197:Computer science articles needing expert attention 123:This article has not yet received a rating on the 1113:that uses a diagonal non-scalar matrix, right? 1306:This message was posted before February 2018. 568:The sign of initial z seems to be wrong here: 337:WikiProject Computer science/Unreferenced BLPs 8: 1491:Unknown-importance Computer science articles 1298:http://acl.ldc.upenn.edu/W/W02/W02-2018.pdf 254:Computer science articles without infoboxes 192:Computer science articles needing attention 1276:I have just modified one external link on 664: 158:Here are some tasks awaiting attention: 132: 54: 32: 30: 1446: 1440: 1420: 1399: 1394: 1388: 1231: 1226: 1214: 1186: 1181: 1166: 1145: 1140: 1134: 1097: 1092: 1086: 1065: 1060: 1054: 1032:earlier and fetched from another source. 1011: 1006: 1000: 974: 944: 933: 932: 921: 912: 900: 889: 888: 877: 864: 858: 834: 821: 816: 810: 780: 769: 768: 757: 748: 736: 725: 724: 713: 700: 695: 689: 642: 637: 622: 599: 586: 581: 575: 56: 846:{\displaystyle H_{k}^{0}=\gamma _{k}I} 611:{\displaystyle H_{k}^{0}=\gamma _{k}I} 445:Knowledge is not a collection of links 103:Knowledge:WikiProject Computer science 1496:WikiProject Computer science articles 106:Template:WikiProject Computer science 7: 1486:Unassessed Computer science articles 995:Also, note that this calculation of 83:This article is within the scope of 45:It is of interest to the following 969:So it seems a multiplication with 934: 890: 770: 726: 273:Timeline of computing 2020–present 25: 1280:. Please take a moment to review 299:Computing articles needing images 432:to restore the earlier version. 149: 76: 58: 31: 546:) 22:38, 20 February 2010 (UTC) 1: 1374:10:37, 23 December 2017 (UTC) 1198:{\displaystyle z=-H_{k}^{0}q} 1044:21:07, 17 November 2015 (UTC) 654:{\displaystyle z=-H_{k}^{0}q} 503:05:00, 19 February 2009 (UTC) 443:Thanks for contributing, but 353:Tag all relevant articles in 97:and see a list of open tasks. 1243:{\displaystyle z=H_{k}^{0}q} 679:00:16, 23 January 2019 (UTC) 559:18:07, 31 October 2010 (UTC) 457:04:58, 6 December 2007 (UTC) 437:06:53, 2 December 2007 (UTC) 422:15:55, 15 October 2007 (UTC) 362:WikiProject Computer science 138:WikiProject Computer science 86:WikiProject Computer science 1472:22:37, 25 August 2019 (UTC) 1455:{\displaystyle \gamma _{k}} 293:List of computer scientists 1512: 1337:(last update: 5 June 2024) 1273:Hello fellow Wikipedians, 1264:07:02, 22 March 2019 (UTC) 1123:11:00, 15 April 2017 (UTC) 411:03:52, 26 March 2014 (UTC) 125:project's importance scale 1408:{\displaystyle H_{k}^{0}} 1154:{\displaystyle H_{k}^{0}} 1106:{\displaystyle H_{k}^{0}} 1074:{\displaystyle H_{k}^{0}} 1020:{\displaystyle H_{k}^{0}} 518:00:37, 6 April 2011 (UTC) 488:04:34, 20 June 2008 (UTC) 427:software information back 355:Category:Computer science 131: 122: 109:Computer science articles 71: 53: 357:and sub-categories with 1269:External links modified 1205:gave failure to find a 1456: 1429: 1409: 1244: 1199: 1155: 1107: 1075: 1021: 983: 960: 847: 796: 655: 612: 318:Computer science stubs 1457: 1430: 1410: 1379:Algorithm description 1245: 1200: 1156: 1108: 1076: 1022: 984: 961: 848: 797: 656: 613: 1439: 1419: 1387: 1318:regular verification 1213: 1165: 1133: 1085: 1053: 999: 973: 857: 809: 688: 621: 574: 136:Things you can help 1404: 1308:After February 2018 1278:Limited-memory BFGS 1252:Rosenbrock function 1236: 1191: 1150: 1102: 1070: 1016: 939: 895: 826: 775: 731: 705: 647: 591: 1452: 1425: 1405: 1390: 1362:InternetArchiveBot 1313:InternetArchiveBot 1240: 1222: 1195: 1177: 1151: 1136: 1103: 1088: 1071: 1056: 1017: 1002: 979: 956: 917: 873: 843: 812: 792: 753: 709: 691: 651: 633: 608: 577: 41:content assessment 1428:{\displaystyle z} 1338: 982:{\displaystyle I} 681: 669:comment added by 564:Bug in algorithm? 548: 534:comment added by 401:comment added by 392: 391: 388: 387: 384: 383: 380: 379: 376: 375: 16:(Redirected from 1503: 1461: 1459: 1458: 1453: 1451: 1450: 1434: 1432: 1431: 1426: 1414: 1412: 1411: 1406: 1403: 1398: 1372: 1363: 1336: 1335: 1314: 1249: 1247: 1246: 1241: 1235: 1230: 1207:Wolfe conditions 1204: 1202: 1201: 1196: 1190: 1185: 1160: 1158: 1157: 1152: 1149: 1144: 1112: 1110: 1109: 1104: 1101: 1096: 1080: 1078: 1077: 1072: 1069: 1064: 1026: 1024: 1023: 1018: 1015: 1010: 988: 986: 985: 980: 965: 963: 962: 957: 955: 954: 938: 937: 931: 916: 911: 910: 894: 893: 887: 869: 868: 852: 850: 849: 844: 839: 838: 825: 820: 801: 799: 798: 793: 791: 790: 774: 773: 767: 752: 747: 746: 730: 729: 723: 704: 699: 660: 658: 657: 652: 646: 641: 617: 615: 614: 609: 604: 603: 590: 585: 547: 528: 523:Citation needed? 413: 366: 360: 235:Computer science 164:Article requests 153: 146: 145: 133: 111: 110: 107: 104: 101: 100:Computer science 91:Computer science 80: 73: 72: 66:Computer science 62: 55: 36: 35: 34: 27: 21: 1511: 1510: 1506: 1505: 1504: 1502: 1501: 1500: 1476: 1475: 1464:-- Andrew Myers 1442: 1437: 1436: 1417: 1416: 1385: 1384: 1381: 1366: 1361: 1329: 1322:have permission 1312: 1286:this simple FaQ 1271: 1211: 1210: 1163: 1162: 1131: 1130: 1083: 1082: 1051: 1050: 997: 996: 971: 970: 940: 896: 860: 855: 854: 830: 807: 806: 776: 732: 686: 685: 684:The assignment 671:136.152.250.167 661: 619: 618: 595: 572: 571: 566: 529: 525: 474:February, 2008 429: 396: 372: 369: 364: 358: 346:Project-related 341: 322: 303: 277: 258: 239: 220: 201: 177: 108: 105: 102: 99: 98: 23: 22: 15: 12: 11: 5: 1509: 1507: 1499: 1498: 1493: 1488: 1478: 1477: 1449: 1445: 1424: 1402: 1397: 1393: 1380: 1377: 1356: 1355: 1348: 1301: 1300: 1292:Added archive 1270: 1267: 1239: 1234: 1229: 1225: 1221: 1218: 1194: 1189: 1184: 1180: 1176: 1173: 1170: 1148: 1143: 1139: 1128: 1126: 1125: 1100: 1095: 1091: 1068: 1063: 1059: 1036:84.143.150.249 1014: 1009: 1005: 978: 953: 950: 947: 943: 936: 930: 927: 924: 920: 915: 909: 906: 903: 899: 892: 886: 883: 880: 876: 872: 867: 863: 842: 837: 833: 829: 824: 819: 815: 789: 786: 783: 779: 772: 766: 763: 760: 756: 751: 745: 742: 739: 735: 728: 722: 719: 716: 712: 708: 703: 698: 694: 650: 645: 640: 636: 632: 629: 626: 607: 602: 598: 594: 589: 584: 580: 570: 565: 562: 549:yes - done. -- 524: 521: 460: 459: 439:Jorge Nocedal 428: 425: 403:76.102.204.220 394: 390: 389: 386: 385: 382: 381: 378: 377: 374: 373: 371: 370: 368: 367: 350: 342: 340: 339: 333: 323: 321: 320: 314: 304: 302: 301: 296: 288: 278: 276: 275: 269: 259: 257: 256: 250: 240: 238: 237: 231: 221: 219: 218: 212: 202: 200: 199: 194: 188: 178: 176: 175: 169: 157: 155: 154: 142: 141: 129: 128: 121: 115: 114: 112: 95:the discussion 81: 69: 68: 63: 51: 50: 44: 37: 24: 14: 13: 10: 9: 6: 4: 3: 2: 1508: 1497: 1494: 1492: 1489: 1487: 1484: 1483: 1481: 1474: 1473: 1469: 1465: 1447: 1443: 1422: 1400: 1395: 1391: 1378: 1376: 1375: 1370: 1365: 1364: 1353: 1349: 1346: 1342: 1341: 1340: 1333: 1327: 1323: 1319: 1315: 1309: 1304: 1299: 1295: 1291: 1290: 1289: 1287: 1283: 1279: 1274: 1268: 1266: 1265: 1261: 1257: 1253: 1237: 1232: 1227: 1223: 1219: 1216: 1208: 1192: 1187: 1182: 1178: 1174: 1171: 1168: 1146: 1141: 1137: 1124: 1120: 1116: 1098: 1093: 1089: 1066: 1061: 1057: 1048: 1047: 1046: 1045: 1041: 1037: 1033: 1029: 1012: 1007: 1003: 993: 990: 976: 967: 951: 948: 945: 941: 928: 925: 922: 918: 913: 907: 904: 901: 897: 884: 881: 878: 874: 870: 865: 861: 840: 835: 831: 827: 822: 817: 813: 803: 787: 784: 781: 777: 764: 761: 758: 754: 749: 743: 740: 737: 733: 720: 717: 714: 710: 706: 701: 696: 692: 682: 680: 676: 672: 668: 648: 643: 638: 634: 630: 627: 624: 605: 600: 596: 592: 587: 582: 578: 569: 563: 561: 560: 556: 552: 545: 541: 537: 533: 522: 520: 519: 515: 511: 505: 504: 500: 496: 490: 489: 485: 481: 475: 473: 469: 465: 458: 454: 450: 446: 442: 441: 440: 438: 435: 426: 424: 423: 420: 414: 412: 408: 404: 400: 363: 356: 352: 351: 349: 347: 343: 338: 335: 334: 332: 330: 329: 324: 319: 316: 315: 313: 311: 310: 305: 300: 297: 294: 290: 289: 287: 285: 284: 279: 274: 271: 270: 268: 266: 265: 260: 255: 252: 251: 249: 247: 246: 241: 236: 233: 232: 230: 228: 227: 222: 217: 214: 213: 211: 209: 208: 203: 198: 195: 193: 190: 189: 187: 185: 184: 179: 174: 171: 170: 168: 166: 165: 160: 159: 156: 152: 148: 147: 144: 143: 139: 135: 134: 130: 126: 120: 117: 116: 113: 96: 92: 88: 87: 82: 79: 75: 74: 70: 67: 64: 61: 57: 52: 48: 42: 38: 29: 28: 19: 1382: 1360: 1357: 1332:source check 1311: 1305: 1302: 1275: 1272: 1127: 1034: 1030: 994: 991: 968: 804: 683: 665:— Preceding 662: 567: 526: 506: 491: 476: 467: 463: 461: 430: 415: 397:— Preceding 393: 345: 344: 328:Unreferenced 326: 325: 307: 306: 281: 280: 262: 261: 243: 242: 224: 223: 205: 204: 181: 180: 162: 161: 84: 47:WikiProjects 530:—Preceding 18:Talk:L-BFGS 1480:Categories 1369:Report bug 468:algorithms 1352:this tool 1345:this tool 480:Henkelman 216:Computing 1358:Cheers.— 1256:Akrodger 667:unsigned 544:contribs 532:unsigned 510:Danpovey 449:Soultaco 419:Soultaco 399:unsigned 264:Maintain 207:Copyedit 1282:my edit 1115:bungalo 472:Nocedal 434:Nocedal 245:Infobox 183:Cleanup 551:Dikay0 536:Orzelf 495:Abeppu 226:Expand 43:scale. 464:codes 309:Stubs 283:Photo 140:with: 1468:talk 1260:talk 1119:talk 1040:talk 675:talk 555:talk 540:talk 514:talk 499:talk 484:talk 453:talk 407:talk 1435:by 1326:RfC 1296:to 119:??? 1482:: 1470:) 1462:? 1444:γ 1339:. 1334:}} 1330:{{ 1262:) 1175:− 1121:) 1042:) 966:. 949:− 926:− 905:− 882:− 862:γ 832:γ 785:− 762:− 741:− 718:− 677:) 631:− 597:γ 557:) 542:• 516:) 501:) 486:) 455:) 417:-- 409:) 365:}} 359:{{ 1466:( 1448:k 1423:z 1401:0 1396:k 1392:H 1371:) 1367:( 1354:. 1347:. 1258:( 1238:q 1233:0 1228:k 1224:H 1220:= 1217:z 1193:q 1188:0 1183:k 1179:H 1172:= 1169:z 1147:0 1142:k 1138:H 1117:( 1099:0 1094:k 1090:H 1067:0 1062:k 1058:H 1038:( 1013:0 1008:k 1004:H 977:I 952:1 946:k 942:y 935:T 929:1 923:k 919:y 914:/ 908:1 902:k 898:s 891:T 885:1 879:k 875:y 871:= 866:k 841:I 836:k 828:= 823:0 818:k 814:H 788:1 782:k 778:y 771:T 765:1 759:k 755:y 750:/ 744:1 738:k 734:s 727:T 721:1 715:k 711:y 707:= 702:0 697:k 693:H 673:( 649:q 644:0 639:k 635:H 628:= 625:z 606:I 601:k 593:= 588:0 583:k 579:H 553:( 538:( 512:( 497:( 482:( 451:( 405:( 348:: 331:: 312:: 295:) 286:: 267:: 248:: 229:: 210:: 186:: 167:: 127:. 49:: 20:)

Index

Talk:L-BFGS
content assessment
WikiProjects
WikiProject icon
Computer science
WikiProject icon
WikiProject Computer science
Computer science
the discussion
???
project's importance scale
WikiProject Computer science

Article requests
Requested articles/Applied arts and sciences/Computer science, computing, and Internet
Cleanup
Computer science articles needing attention
Computer science articles needing expert attention
Copyedit
Computing
Expand
Computer science
Infobox
Computer science articles without infoboxes
Maintain
Timeline of computing 2020–present
Photo
List of computer scientists
Computing articles needing images
Stubs

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.