Plot of the softplus function and the ramp function.

In mathematics and machine learning, the softplus function is

{\displaystyle f(x)=\log(1+e^{x}).}
It is a smooth approximation (in fact, an analytic function) to the ramp function, which is known as the rectifier or ReLU (rectified linear unit) in machine learning. For large negative x it is

{\displaystyle \log(1+e^{x})=\log(1+\epsilon )\gtrapprox \log 1=0}

so just above 0, while for large positive x it is

{\displaystyle \log(1+e^{x})\gtrapprox \log(e^{x})=x}

so just above x.
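This limiting behavior is easy to check numerically. A minimal sketch (the function names `softplus` and `relu` are ours, not from any library):

```python
import math

def softplus(x):
    """Softplus: f(x) = ln(1 + e^x)."""
    return math.log(1 + math.exp(x))

def relu(x):
    """Ramp function / ReLU: the positive part max(0, x)."""
    return max(0.0, x)

# For large negative x, softplus is just above 0;
# for large positive x, it is just above x, i.e. close to ReLU.
for x in (-20.0, -1.0, 0.0, 1.0, 20.0):
    print(f"x={x:6.1f}  softplus={softplus(x):.9f}  relu={relu(x):.1f}")
```

Note that softplus lies strictly above the ramp function everywhere, approaching it at both ends.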
The names softplus and SmoothReLU are used in machine learning. The name "softplus" (2000), by analogy with the earlier softmax (1989), is presumably because it is a smooth (soft) approximation of the positive part of x, which is sometimes denoted with a superscript plus, {\displaystyle x^{+}:=\max(0,x)}.
Related functions

Rectifier and softplus activation functions. The second one is a smooth version of the first.

The derivative of softplus is the logistic function:

{\displaystyle f'(x)={\frac {e^{x}}{1+e^{x}}}={\frac {1}{1+e^{-x}}}}

The logistic sigmoid function is a smooth approximation of the derivative of the rectifier, the Heaviside step function.
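That the derivative of softplus is exactly the logistic function can be sanity-checked with a central finite difference. A sketch (helper names are ours):

```python
import math

def softplus(x):
    return math.log(1 + math.exp(x))

def logistic(x):
    """Logistic (sigmoid) function: 1 / (1 + e^{-x})."""
    return 1.0 / (1.0 + math.exp(-x))

def numerical_derivative(f, x, h=1e-6):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# The finite-difference slope of softplus matches the logistic function.
for x in (-3.0, 0.0, 2.5):
    print(x, numerical_derivative(softplus, x), logistic(x))
```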
LogSumExp

Main article: LogSumExp

The multivariable generalization of single-variable softplus is the LogSumExp with the first argument set to zero:

{\displaystyle \operatorname {LSE_{0}} ^{+}(x_{1},\dots ,x_{n}):=\operatorname {LSE} (0,x_{1},\dots ,x_{n})=\ln(1+e^{x_{1}}+\cdots +e^{x_{n}}).}

The LogSumExp function is

{\displaystyle \operatorname {LSE} (x_{1},\dots ,x_{n})=\ln(e^{x_{1}}+\cdots +e^{x_{n}}),}

and its gradient is the softmax; the softmax with the first argument set to zero is the multivariable generalization of the logistic function. Both LogSumExp and softmax are used in machine learning.
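A quick numerical sketch of this relationship (function names are ours): for a single argument, the shifted LogSumExp reduces to single-variable softplus.

```python
import math

def lse(*xs):
    """LogSumExp: ln(e^{x_1} + ... + e^{x_n})."""
    return math.log(sum(math.exp(x) for x in xs))

def lse0(*xs):
    """LogSumExp with the first argument set to zero:
    ln(1 + e^{x_1} + ... + e^{x_n})."""
    return lse(0.0, *xs)

def softplus(x):
    return math.log(1 + math.exp(x))

# With a single argument, LSE0+ coincides with softplus.
print(lse0(1.5), softplus(1.5))
```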
Convex conjugate

The convex conjugate (specifically, the Legendre transform) of the softplus function is the negative binary entropy (with base e). This is because (following the definition of the Legendre transform: the derivatives are inverse functions) the derivative of softplus is the logistic function, whose inverse function is the logit, which is the derivative of negative binary entropy.

Softplus can be interpreted as logistic loss (as a positive number), so by duality, minimizing logistic loss corresponds to maximizing entropy. This justifies the principle of maximum entropy as loss minimization.
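Written out, the claim is that for f(x) = ln(1 + e^x) the conjugate f*(y) = sup_x (xy − f(x)) evaluates, on 0 < y < 1, to the negative binary entropy (a standard derivation, sketched here as a check):

```latex
\begin{aligned}
f^{*}(y) &= \sup_{x}\bigl(xy - \ln(1+e^{x})\bigr) \\
         &= y\ln y + (1-y)\ln(1-y), \qquad 0 < y < 1,
\end{aligned}
```

with the supremum attained where y = f'(x) = 1/(1 + e^{-x}), i.e. at x = ln(y/(1−y)), the logit of y.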
Alternative forms

This function can be approximated as:

{\displaystyle \ln \left(1+e^{x}\right)\approx {\begin{cases}\ln 2,&x=0,\\{\frac {x}{1-e^{-x/\ln 2}}},&x\neq 0\end{cases}}}

By making the change of variables {\displaystyle x=y\ln(2)}, this is equivalent to

{\displaystyle \log _{2}(1+2^{y})\approx {\begin{cases}1,&y=0,\\{\frac {y}{1-e^{-y}}},&y\neq 0.\end{cases}}}

A sharpness parameter k may be included:

{\displaystyle f(x)={\frac {\ln(1+e^{kx})}{k}},\qquad \qquad f'(x)={\frac {e^{kx}}{1+e^{kx}}}={\frac {1}{1+e^{-kx}}}.}
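In floating-point code, evaluating ln(1 + e^{kx}) directly overflows once kx is large. A common rearrangement (a sketch of ours, not taken from the sources below) uses the identity ln(1 + e^z) = max(z, 0) + ln(1 + e^{-|z|}), which never exponentiates a positive number:

```python
import math

def softplus(x, k=1.0):
    """Numerically stable softplus with sharpness parameter k:
    f(x) = ln(1 + e^{kx}) / k, computed via the identity
    ln(1 + e^z) = max(z, 0) + ln(1 + e^{-|z|}) to avoid overflow."""
    z = k * x
    return (max(z, 0.0) + math.log1p(math.exp(-abs(z)))) / k

# Naive math.log(1 + math.exp(1000.0)) would overflow; this does not.
print(softplus(1000.0))
print(softplus(2.0, k=10.0))
```

Larger values of k sharpen the approximation toward the ramp function max(0, x), consistent with the formulas above.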
References

Dugas, Charles; Bengio, Yoshua; Bélisle, François; Nadeau, Claude; Garcia, René (2000). "Incorporating second-order functional knowledge for better option pricing" (PDF). Proceedings of the 13th International Conference on Neural Information Processing Systems (NIPS'00). MIT Press: 451–457. "Since the sigmoid h has a positive first derivative, its primitive, which we call softplus, is convex."

Xavier Glorot; Antoine Bordes; Yoshua Bengio (2011). Deep sparse rectifier neural networks (PDF). AISTATS.

"Smooth Rectifier Linear Unit (SmoothReLU) Forward Layer". Developer Guide for Intel Data Analytics Acceleration Library. 2017. Retrieved 2018-12-04.

Categories: Computational neuroscience | Artificial neural networks | Logistic regression | Functions and mappings | Exponentials | Entropy and information | Loss functions