
Delta rule


In machine learning, the delta rule is a gradient descent learning rule for updating the weights of the inputs to artificial neurons in a single-layer neural network. It can be derived as the backpropagation algorithm for a single-layer neural network with mean-square error loss function.

For a neuron j with activation function g(x), the delta rule for neuron j's i-th weight w_{ji} is given by

    \Delta w_{ji} = \alpha (t_j - y_j) g'(h_j) x_i,

where

    \alpha is a small constant called the learning rate,
    g(x) is the neuron's activation function,
    g' is the derivative of g,
    t_j is the target output,
    h_j is the weighted sum of the neuron's inputs,
    y_j is the actual output, and
    x_i is the i-th input.

It holds that h_j = \sum_i x_i w_{ji} and y_j = g(h_j).
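As a concrete illustration, one delta-rule step for a single neuron can be written in a few lines of NumPy. This is a minimal sketch, assuming a sigmoid activation; the function name and the numeric values are illustrative and not part of the article:

    import numpy as np

    def delta_rule_update(w, x, t, alpha=0.1):
        """Apply one delta-rule update to the weights of a single sigmoid neuron (illustrative)."""
        h = np.dot(x, w)                          # weighted sum of inputs, h_j
        y = 1.0 / (1.0 + np.exp(-h))              # actual output, y_j = g(h_j)
        g_prime = y * (1.0 - y)                   # sigmoid derivative g'(h_j)
        return w + alpha * (t - y) * g_prime * x  # w_ji + alpha (t_j - y_j) g'(h_j) x_i

    # Nudge the weights toward a target output of 1 for one input pattern.
    w = np.array([0.2, -0.4, 0.1])
    x = np.array([1.0, 0.5, -1.5])
    w = delta_rule_update(w, x, t=1.0)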
The delta rule is commonly stated in simplified form for a neuron with a linear activation function as

    \Delta w_{ji} = \alpha (t_j - y_j) x_i.

While the delta rule is similar to the perceptron's update rule, the derivation is different. The perceptron uses the Heaviside step function as the activation function g(h), which means that g'(h) does not exist at zero and is equal to zero elsewhere, making a direct application of the delta rule impossible.
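The contrast can be made concrete in code: with a linear activation g(h) = h the derivative is identically 1, so the general rule reduces to the simplified form above, whereas with the Heaviside step function the derivative is 0 wherever it is defined, so the general rule would never change the weights. A small illustrative sketch (values are arbitrary, not from the article):

    import numpy as np

    x = np.array([1.0, 0.5, -1.5])
    w = np.array([0.2, -0.4, 0.1])
    t, alpha = 1.0, 0.1
    h = np.dot(x, w)

    # Linear activation: g(h) = h, g'(h) = 1, so the update is alpha * (t - y) * x.
    y_linear = h
    update_linear = alpha * (t - y_linear) * 1.0 * x

    # Heaviside activation: g'(h) = 0 wherever it exists, so the update is always zero.
    y_step = np.heaviside(h, 0.5)
    update_step = alpha * (t - y_step) * 0.0 * x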
Derivation of the delta rule

The delta rule is derived by attempting to minimize the error in the output of the neural network through gradient descent. The error for a neural network with j outputs can be measured as

    E = \sum_{j} \tfrac{1}{2} \left( t_j - y_j \right)^2 .
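For example, the error for a set of outputs and their targets follows directly from this definition. A minimal sketch (the array values are arbitrary):

    import numpy as np

    t = np.array([1.0, 0.0, 1.0])      # target outputs t_j
    y = np.array([0.8, 0.2, 0.6])      # actual outputs y_j
    E = np.sum(0.5 * (t - y) ** 2)     # E = sum_j 1/2 (t_j - y_j)^2
    print(E)                           # 0.12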
In this case, we wish to move through "weight space" of the neuron (the space of all possible values of all of the neuron's weights) in proportion to the gradient of the error function with respect to each weight. In order to do that, we calculate the partial derivative of the error with respect to each weight. For the i-th weight, this derivative can be written as

    \frac{\partial E}{\partial w_{ji}} .

Because we are only concerning ourselves with the j-th neuron, we can substitute the error formula above while omitting the summation:

    \frac{\partial E}{\partial w_{ji}} = \frac{\partial}{\partial w_{ji}} \left[ \tfrac{1}{2} \left( t_j - y_j \right)^2 \right]

Next we use the chain rule to split this into two derivatives:

    \frac{\partial E}{\partial w_{ji}} = \frac{\partial \left( \tfrac{1}{2} \left( t_j - y_j \right)^2 \right)}{\partial y_j} \, \frac{\partial y_j}{\partial w_{ji}}

To find the left derivative, we simply apply the power rule and the chain rule:

    \frac{\partial E}{\partial w_{ji}} = - \left( t_j - y_j \right) \frac{\partial y_j}{\partial w_{ji}}
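Written out step by step, the left derivative follows from the power rule together with the derivative of (t_j - y_j) with respect to y_j, which is -1:

    \frac{\partial}{\partial y_j} \left[ \tfrac{1}{2} \left( t_j - y_j \right)^2 \right]
        = \tfrac{1}{2} \cdot 2 \left( t_j - y_j \right) \cdot \frac{\partial \left( t_j - y_j \right)}{\partial y_j}
        = \left( t_j - y_j \right) \cdot (-1)
        = - \left( t_j - y_j \right).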
To find the right derivative, we again apply the chain rule, this time differentiating with respect to the total input to j, h_j:

    \frac{\partial E}{\partial w_{ji}} = - \left( t_j - y_j \right) \frac{\partial y_j}{\partial h_j} \, \frac{\partial h_j}{\partial w_{ji}}

Note that the output of the j-th neuron, y_j, is just the neuron's activation function g applied to the neuron's input h_j. We can therefore write the derivative of y_j with respect to h_j simply as g's first derivative:

    \frac{\partial E}{\partial w_{ji}} = - \left( t_j - y_j \right) g'(h_j) \, \frac{\partial h_j}{\partial w_{ji}}

Next we rewrite h_j in the last term as the sum over all k weights of each weight w_{jk} times its corresponding input x_k:

    \frac{\partial E}{\partial w_{ji}} = - \left( t_j - y_j \right) g'(h_j) \, \frac{\partial}{\partial w_{ji}} \left[ \sum_{k} x_k w_{jk} \right]

Because we are only concerned with the i-th weight, the only term of the summation that is relevant is x_i w_{ji}. Clearly,

    \frac{\partial \left( x_i w_{ji} \right)}{\partial w_{ji}} = x_i,

giving us our final equation for the gradient:

    \frac{\partial E}{\partial w_{ji}} = - \left( t_j - y_j \right) g'(h_j) x_i .

As noted above, gradient descent tells us that our change for each weight should be proportional to the gradient. Choosing a proportionality constant \alpha and eliminating the minus sign to enable us to move the weight in the negative direction of the gradient to minimize error, we arrive at our target equation:

    \Delta w_{ji} = \alpha \left( t_j - y_j \right) g'(h_j) x_i .
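The derived gradient can be sanity-checked numerically by comparing it against a central finite-difference approximation of \partial E / \partial w_{ji} for a single neuron. This is a sketch under the assumption of a sigmoid activation, with illustrative values:

    import numpy as np

    def sigmoid(h):
        return 1.0 / (1.0 + np.exp(-h))

    def error(w, x, t):
        """E = 1/2 (t - y)^2 for a single sigmoid neuron."""
        return 0.5 * (t - sigmoid(np.dot(x, w))) ** 2

    x = np.array([1.0, 0.5, -1.5])
    w = np.array([0.2, -0.4, 0.1])
    t = 1.0

    # Analytic gradient from the derivation: -(t_j - y_j) g'(h_j) x_i
    h = np.dot(x, w)
    y = sigmoid(h)
    analytic = -(t - y) * y * (1.0 - y) * x

    # Central finite-difference approximation of dE/dw_i
    eps = 1e-6
    numeric = np.array([
        (error(w + eps * np.eye(3)[i], x, t) - error(w - eps * np.eye(3)[i], x, t)) / (2 * eps)
        for i in range(3)
    ])

    print(np.allclose(analytic, numeric, atol=1e-6))  # True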
See also

Stochastic gradient descent
Backpropagation
Rescorla–Wagner model – the origin of the delta rule
