Yudkowsky (2008) asserts that friendliness (a desire not to harm humans) should be designed in from the start, but that the designers should recognize both that their own designs may be flawed, and that the robot will learn and evolve over time. Thus the challenge is one of mechanism design—to define a mechanism for evolving AI systems under a system of checks and balances, and to give the systems utility functions that will remain friendly in the face of such changes.
, where "Peer review panels of computer and cognitive scientists would sift through projects and choose those that are designed both to advance AI and assure that such advances would be accompanied by appropriate safeguards." McGinnis feels that peer review is better "than regulation to address technical issues that are not possible to capture through bureaucratic mandates". McGinnis notes that his proposal stands in contrast to that of the
, say that it will be impossible to ever guarantee "friendly" behavior in AIs because problems of ethical complexity will not yield to software advances or increases in computing power. They write that the criteria upon which friendly AI theories are based work "only when one has not only great powers of prediction about the likelihood of myriad possible outcomes, but certainty and consensus on how one values the different outcomes."
, Boyles and Joaquin maintain that such AIs would not be that friendly, considering the following: the infinite number of antecedent counterfactual conditions that would have to be programmed into a machine; the difficulty of cashing out the set of moral values, that is, those that are more ideal than the ones human beings possess at present; and the apparent disconnect between counterfactual antecedents and ideal value consequents.
Alan Winfield compares human-level artificial intelligence with faster-than-light travel in terms of difficulty, and states that while we need to be "cautious and prepared" given the stakes involved, we "don't need to be obsessing" about the risks of superintelligence. Boyles and Joaquin, on the other hand, argue that Luke Muehlhauser and
strengthened when messages resonate with AI developers; Baum argues that, in contrast, "existing messages about beneficial AI are not always framed well". Baum advocates for "cooperative relationships, and positive framing of AI researchers" and cautions against characterizing AI researchers as "not want(ing) to pursue beneficial designs".
Some philosophers claim that any truly "rational" agent, whether artificial or human, will naturally be benevolent; in this view, deliberate safeguards designed to produce a friendly AI could be unnecessary or even harmful. Other critics question whether it is possible for an artificial intelligence
...the essence of AGIs is their reasoning facilities, and it is the very logic of their being that will compel them to behave in a moral fashion... The real nightmare scenario (is one where) humans find it advantageous to strongly couple themselves to AGIs, with no guarantees against self-deception.
The "preferences" Russell refers to "are all-encompassing; they cover everything you might care about, arbitrarily far into the future." Similarly, "behavior" includes any choice between options, and the uncertainty is such that some probability, which may be quite small, must be assigned to every
Yudkowsky advances the Coherent Extrapolated Volition (CEV) model. According to him, our coherent extrapolated volition is "our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where
argues that the development of safe, socially beneficial artificial intelligence or artificial general intelligence is a function of the social psychology of AI research communities, and so can be constrained by extrinsic measures and motivated by intrinsic measures. Intrinsic motivations can be
In particular, Sections 1-4 give background to the definition of Friendly AI in Section 5. Section 6 gives two classes of mistakes (technical and philosophical) which would both lead to the accidental creation of non-Friendly AIs. Sections 7-13 discuss further related issues.
's proposal to create friendly AIs appears bleak. This is because Muehlhauser and Bostrom seem to hold the idea that intelligent machines could be programmed to think counterfactually about the moral values that human beings would have had. In an article in
has called the "security mindset": Rather than thinking about how a system will work, imagine how it could fail. For instance, he suggests even an AI that only makes accurate predictions and communicates via a text interface might cause unintended harm.
Alexander Wissner-Gross says that AIs driven to maximize their future freedom of action (or causal path entropy) might be considered friendly if their planning horizon is longer than a certain threshold, and unfriendly if their planning horizon is shorter than that threshold.
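For readers who want the underlying quantity, the causal entropic force defined in the Wissner-Gross and Freer paper cited below ("Causal entropic forces", Physical Review Letters, 2013) can be written as

$$ F(X_0, \tau) = T_c \, \nabla_X S_c(X, \tau) \big|_{X_0} $$

where $S_c(X, \tau)$ is the entropy of the distribution of feasible paths through the system's configuration space over a finite time horizon $\tau$, and $T_c$ is a constant ("causal path temperature") that sets the strength of the force; the planning horizon discussed above is this $\tau$.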
lists three principles to guide the development of beneficial machines. He emphasizes that these principles are not meant to be explicitly coded into the machines; rather, they are intended for the human developers. The principles are as follows:
504:, and picks out agents that are safe and useful, not necessarily ones that are "friendly" in the colloquial sense. The concept is primarily invoked in the context of discussions of recursively self-improving artificial agents that rapidly
In 2014, Luke Muehlhauser and Nick Bostrom underlined the need for 'friendly AI'; nonetheless, the difficulties in designing a 'friendly' superintelligence, for instance via programming counterfactual moral thinking, are considerable.
Extrapolated volition is intended to be what humanity objectively would want, all things considered, but it can only be defined relative to the psychological and cognitive qualities of present-day, unextrapolated humanity.
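Elsewhere the article notes that this criterion might be expressed, "for mathematical purposes, in the form of a utility function or other decision-theoretic formalism". The generic form being alluded to is expected-utility maximization; as a neutral illustration (not specific to CEV):

$$ a^{*} = \arg\max_{a \in A} \; \sum_{s} P(s \mid a) \, U(s) $$

where $A$ is the set of actions available to the agent, $P(s \mid a)$ its probability of outcome $s$ given action $a$, and $U$ the utility function that a Friendly AI design would need to keep aligned with humanity's (extrapolated) volition.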
In those stories, the extreme intelligence and power of these humanoid creations clash with their status as slaves (which by nature are seen as sub-human), and cause disastrous conflict. By 1942 these themes prompted
Basically we should assume that a 'superintelligence' would be able to achieve whatever goals it has. Therefore, it is extremely important that the goals we endow it with, and its entire motivation system, is 'human
has said that superintelligent AI systems with goals that are not aligned with human ethics are intrinsically dangerous unless extreme measures are taken to ensure the safety of humanity. He put it this way:
The roots of concern about artificial intelligence are very old. Kevin LaGrandeur showed that the dangers specific to AI can be seen in ancient literature concerning artificial humanoid servants such as the
, and continuous self-improvement, because of the intrinsic nature of any goal-driven system, and that these drives will, "without special precautions", cause the AI to exhibit undesired behavior.
John McGinnis encourages governments to accelerate friendly AI research. Because the goalposts of friendly AI are not necessarily evident, he suggests a model similar to the
Boyles, Robert James M.; Joaquin, Jeremiah Joven (July 23, 2019). "Why friendly AIs won't be that friendly: a friendly reply to Muehlhauser and Bostrom".
behave, friendly artificial intelligence research is focused on how to practically bring about this behavior and ensure that it is adequately constrained.
"—principles hard-wired into all the robots in his fiction, intended to prevent them from turning on their creators, or allowing them to come to harm.
The inner workings of advanced AI systems may be complex and difficult to interpret, leading to concerns about transparency and accountability.
, suggested that "a public-private partnership has to be created to bring A.I.-makers together to share ideas about security—something like the
and then produce the AI which humanity would want, given sufficient time and insight, to arrive at a satisfactory answer. The appeal to an
Some critics believe that both human-level AI and superintelligence are unlikely, and that therefore friendly AI is unlikely. Writing in
Rather than a Friendly AI being designed directly by human programmers, it is to be designed by a "seed AI" programmed to first study human nature
He explains: "The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else."
Omohundro, S. (2008). "The Basic AI Drives". In Proceedings of the First Conference on Artificial General Intelligence (AGI-08).
, on the grounds that this hypothetical technology would have a large, rapid, and difficult-to-control impact on human society.
our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted".
Baum, Seth D. (September 28, 2016). "On the promotion of safe and socially beneficial artificial intelligence".
with human interests or contribute to fostering the improvement of the human species. It is a part of the
says that a sufficiently advanced AI system will, unless explicitly counteracted, exhibit a number of
, but in partnership with corporations." He urges AI researchers to convene a meeting similar to the
— A brief description of Friendly AI by the Machine Intelligence Research Institute.
, in which one provably safe AI generation helps build the next provably safe generation.
— On the motives for and impossibility of FAI; by Adam Keiper and Ari N. Schulman.
formalism), as providing the ultimate criterion of "Friendliness", is an answer to the
to be friendly. Adam Keiper and Ari N. Schulman, editors of the technology journal
Creating Friendly AI 1.0: The Analysis and Design of Benevolent Goal Architectures
Muehlhauser, Luke; Bostrom, Nick (December 17, 2013). "Why We Need Friendly AI".
The machine's only objective is to maximize the realization of human preferences.
In 2008 Eliezer Yudkowsky called for the creation of "friendly AI" to mitigate
While machine ethics is concerned with how an artificially intelligent agent
Its owner may cede control to what Eliezer Yudkowsky terms a "Friendly AI,"...
The ultimate source of information about human preferences is human behavior.
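Taken together, the three principles (the machine's only objective is the realization of human preferences; it is initially uncertain what those preferences are; human behavior is its ultimate source of information about them) describe an agent that treats human preferences as an unknown to be learned rather than a fixed goal. The sketch below is only an illustration of that idea under invented assumptions (a two-option world, two hypothetical candidate preference models, an arbitrary 0.8 confidence threshold); it is not Russell's formalism, and the names and numbers in it are made up for the example.

```python
# Toy illustration (hypothetical example, not Russell's actual formalism):
# an agent keeps a probability distribution over candidate human preference
# functions, treats observed human choices as evidence, and only acts on its
# own once the remaining uncertainty about "what the human wants" is small.

from typing import Callable, Dict, List

Option = str
Preference = Callable[[Option], float]  # maps an option to how much the human values it

# Candidate models of the human's preferences (assumed for illustration).
candidates: Dict[str, Preference] = {
    "likes_tea":    lambda o: {"tea": 1.0, "coffee": 0.0}[o],
    "likes_coffee": lambda o: {"tea": 0.0, "coffee": 1.0}[o],
}

# Principle 2: start uncertain about which candidate is right.
belief: Dict[str, float] = {name: 1.0 / len(candidates) for name in candidates}

def observe_human_choice(chosen: Option, rejected: Option) -> None:
    """Principle 3: human behavior is evidence about human preferences.
    A candidate that ranks the chosen option higher becomes more plausible."""
    for name, pref in candidates.items():
        likelihood = 0.9 if pref(chosen) > pref(rejected) else 0.1
        belief[name] *= likelihood
    total = sum(belief.values())
    for name in belief:
        belief[name] /= total

def pick(options: List[Option]) -> Option:
    """Principle 1: maximize expected human preference under current belief,
    or defer to the human while the belief is still too spread out."""
    if max(belief.values()) < 0.8:          # still unsure what the human wants
        return "ask the human"
    expected = {o: sum(belief[n] * candidates[n](o) for n in belief) for o in options}
    return max(expected, key=expected.get)

print(pick(["tea", "coffee"]))         # -> "ask the human" (uncertain)
observe_human_choice("tea", "coffee")  # the human picked tea over coffee
print(pick(["tea", "coffee"]))         # -> "tea"
```

The design point being illustrated is that, under the second principle, the agent defers to the human while its belief about human preferences is still spread out, instead of optimizing a possibly wrong objective.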
"Artificial Intelligence as a Positive and Negative Factor in Global Risk"
427:(AGI) that would have a positive (benign) effect on humanity or at least
Artificial Intelligence as a Positive and Negative Factor in Global Risk
"The rise of artificial intelligence and the crisis of moral passivity"
, which generally aims to avoid government involvement in friendly AI.
Our Mathematical Universe: My Quest for the Ultimate Nature of Reality
"Artificial intelligence will not turn into a Frankenstein's monster"
Human Compatible: Artificial Intelligence and the Problem of Control
The machine is initially uncertain about what those preferences are.
Journal of Experimental & Theoretical Artificial Intelligence
(perhaps expressed, for mathematical purposes, in the form of a
Bostrom, Nick (2014). "Chapter 7: The Superintelligent Will".
Kornai, András (May 15, 2014). "Bounding the impact of AGI".
Yudkowsky (2008) goes into more detail about how to design a
– a moral philosophy advocated by proponents of Friendly AI
Discusses Artificial Intelligence from the perspective of
The Battle for Compassion: Ethics in an Apathetic Universe
Tegmark, Max (2014). "Life, Our Universe and Everything".
, who is best known for popularizing the idea, to discuss
Omohundro, S. M. (February 2008). "The basic AI drives".
Ethics and Information Technology, 23: 207–214.
1565:"What Happens When Artificial Intelligence Turns On Us?"
470:
artificial agents that reliably implement human values.
2008 Workshop on Meta-Reasoning: Thinking About Thinking
existential risk from advanced artificial intelligence
Existential risk from artificial general intelligence
"The Problem with 'Friendly' Artificial Intelligence"
. Singularity Institute for Artificial Intelligence.
Existential risk from artificial general intelligence
Existential risk from artificial general intelligence
— A near book-length description from the MIRI
Center for Human-Compatible Artificial Intelligence
2085:
1989:
1942:
The Problem with ‘Friendly’ Artificial Intelligence
1909:Ethical Issues in Advanced Artificial Intelligence
1878:Human-Level AI Requires Compassionate Intelligence
1648:
1119:
2148:Leverhulme Centre for the Future of Intelligence
1049:Moral Machines: Teaching Robots Right from Wrong
693:
2143:Institute for Ethics and Emerging Technologies
1936:Commentary on MIRI's Guidelines on Friendly AI
1927:Critique of the MIRI Guidelines on Friendly AI
1798:Keiper, Adam; Schulman, Ari N. (Summer 2011).
1077:"The Persistent Peril of the Artificial Slave"
961:(First ed.). Knopf Doubleday Publishing.
2327:Superintelligence: Paths, Dangers, Strategies
2307:Open letter on artificial intelligence (2015)
1963:
1558:
1556:
1417:
1415:
1247:"How Skynet Might Emerge From Simple Physics"
1221:Superintelligence: Paths, Dangers, Strategies
1159:. In Nick Bostrom; Milan M. Ćirković (eds.).
393:
8:
1522:
1520:
840:Artificial intelligence systems integration
761:Technological singularity § Criticisms
1970:
1956:
1948:
1831:Artificial Intelligence: A Modern Approach
1371:
1369:
998:Artificial Intelligence: A Modern Approach
485:Artificial Intelligence: A Modern Approach
400:
386:
31:
1318:
1308:
1202:
666:has proposed a "scaffolding" approach to
627:objective through contingent human nature
2113:Centre for the Study of Existential Risk
2153:Machine Intelligence Research Institute
1829:Norvig, Peter; Russell, Stuart (2010).
1733:from the original on September 17, 2014
1541:from the original on September 30, 2015
1348:Machine Intelligence Research Institute
1047:Wallach, Wendell; Allen, Colin (2009).
947:
895:Machine Intelligence Research Institute
865:Hallucination (artificial intelligence)
830:Applications of artificial intelligence
749:Machine Intelligence Research Institute
594:Machine Intelligence Research Institute
458:, AI researcher and creator of the term
39:
1505:from the original on February 10, 2023
983:
981:
731:Asilomar Conference on Recombinant DNA
500:'Friendly' is used in this context as
2375:Philosophy of artificial intelligence
1810:from the original on January 15, 2012
1700:from the original on December 1, 2014
1639:
1637:
1635:
1633:
1563:Hendry, Erica R. (January 21, 2014).
1170:from the original on October 19, 2013
1099:from the original on January 13, 2023
708:logically possible human preference.
7:
1257:from the original on October 8, 2021
1342:Muehlhauser, Luke (July 31, 2013).
1224:. Oxford: Oxford University Press.
655:AI control problem § Alignment
546:In modern times as the prospect of
2295:Statement on AI risk of extinction
1760:(3). Informa UK Limited: 417–438.
1690:Northwestern University Law Review
1575:from the original on July 19, 2014
1354:from the original on July 19, 2014
1344:"AI Risk and the Security Mindset"
1245:Dvorsky, George (April 26, 2013).
727:International Atomic Energy Agency
592:Luke Muehlhauser, writing for the
25:
2032:Ethics of artificial intelligence
1894:Froding, B. and Peterson, M 2021
1719:Winfield, Alan (August 9, 2014).
1684:McGinnis, John O. (Summer 2010).
835:Artificial intelligence arms race
433:ethics of artificial intelligence
2351:
2350:
2042:Friendly artificial intelligence
1864:, Oxford University Press, 2008.
1529:"Coherent Extrapolated Volition"
1051:. Oxford University Press, Inc.
578:, such as resource acquisition,
413:Friendly artificial intelligence
1806:. No. 32. pp. 80–89.
1191:Artificial General Intelligence
425:artificial general intelligence
68:Artificial general intelligence
2103:Center for Applied Rationality
1465:Chan, Berman (March 4, 2020).
1310:10.1103/PhysRevLett.110.168702
1093:10.5621/sciefictstud.38.2.0232
Coherent extrapolated volition
1:
745:National Institutes of Health
2123:Future of Humanity Institute
1766:10.1080/0952813x.2014.895109
2340:Artificial Intelligence Act
2334:Do You Trust This Computer?
1022:Leighton, Jonathan (2011).
103:Natural language processing
2406:
1527:Eliezer Yudkowsky (2004).
1483:10.1007/s00146-020-00953-9
1436:10.1007/s00146-019-00903-0
758:
652:
550:looms nearer, philosopher
515:
435:and is closely related to
156:Hybrid intelligent systems
78:Recursive self-improvement
2348:
2093:Alignment Research Center
2077:Technological singularity
2027:Effective accelerationism
1862:Global Catastrophic Risks
1833:(3rd ed.). Pearson.
1655:. United States: Viking.
1608:10.1007/s00146-016-0677-0
1390:10.1017/s1477175613000316
1161:Global Catastrophic Risks
931:Technological singularity
870:Hybrid intelligent system
527:, or the proto-robots of
2128:Future of Life Institute
2047:Instrumental convergence
1285:"Causal entropic forces"
905:Regulation of algorithms
280:Artificial consciousness
1983:artificial intelligence
1289:Physical Review Letters
1081:Science Fiction Studies
641:problem of defining an
600:researchers adopt what
586:Alexander Wissner-Gross
506:explode in intelligence
480:artificial intelligence
462:The term was coined by
151:Evolutionary algorithms
41:Artificial intelligence
2052:Intelligence explosion
1938:— by Peter Voss.
1126:The Rest of the Robots
936:Three Laws of Robotics
875:Intelligence explosion
735:risks of biotechnology
705:
562:
541:Three Laws of Robotics
Risks of unfriendly AI
498:
488:, describes the idea:
459:
52:
27:AI to benefit humanity
2007:AI capability control
1118:Isaac Asimov (1964).
557:
502:technical terminology
490:
454:
51:
2098:Center for AI Safety
1915:What is Friendly AI?
1569:Smithsonian Magazine
1277:Wissner-Gross, A. D.
1163:. pp. 308–345.
825:Ambient intelligence
93:General game playing
2390:Affective computing
2313:Our Final Invention
1647:(October 8, 2019).
1301:2013PhRvL.110p8702W
885:Intelligent control
855:Emotion recognition
805:Affective computing
722:Our Final Invention
548:superintelligent AI
529:Gerbert of Aurillac
Etymology and usage
245:Machine translation
161:Systems integration
98:Knowledge reasoning
35:Part of a series on
18:Friendliness Theory
2380:Singularitarianism
1883:2022-01-09 at the
920:Singularitarianism
915:Sentiment analysis
733:, which discussed
643:objective morality
635:decision-theoretic
596:, recommends that
460:
423:) is hypothetical
53:
2362:
2361:
2279:Eliezer Yudkowsky
2254:Stuart J. Russell
2072:Superintelligence
1686:"Accelerating AI"
1662:978-0-525-55861-3
1150:Eliezer Yudkowsky
1058:978-0-19-537404-9
1033:978-0-87586-870-7
1008:978-0-13-604259-4
1001:. Prentice Hall.
880:Intelligent agent
688:Stuart J. Russell
580:self-preservation
472:Stuart J. Russell
464:Eliezer Yudkowsky
456:Eliezer Yudkowsky
410:
409:
146:Bayesian networks
73:Intelligent agent
16:(Redirected from
2397:
2354:
2353:
2301:Human Compatible
2274:Roman Yampolskiy
2022:Consequentialism
1979:Existential risk
1972:
1965:
1958:
1949:
1868:Existential risk
1845:
1844:
1826:
1820:
1819:
1817:
1815:
1804:The New Atlantis
1795:
1789:
1788:
1749:
1743:
1742:
1740:
1738:
1716:
1710:
1709:
1707:
1705:
1696:(3): 1253–1270.
1681:
1675:
1674:
1654:
1641:
1628:
1627:
1596:AI & Society
1591:
1585:
1584:
1582:
1580:
1560:
1551:
1550:
1548:
1546:
1540:
1533:
1524:
1515:
1514:
1512:
1510:
1471:AI & Society
1462:
1456:
1455:
1424:AI & Society
1419:
1410:
1409:
1373:
1364:
1363:
1361:
1359:
1339:
1333:
1332:
1322:
1312:
1273:
1267:
1266:
1264:
1262:
1242:
1236:
1235:
1215:
1209:
1208:
1206:
1186:
1180:
1179:
1177:
1175:
1169:
1158:
1146:
1140:
1139:
1123:
1115:
1109:
1108:
1106:
1104:
1073:Kevin LaGrandeur
1069:
1063:
1062:
1044:
1038:
1037:
1019:
1013:
1012:
985:
976:
975:
952:
845:Autonomous agent
788:The New Atlantis
778:AI & Society
686:, AI researcher
683:Human Compatible
649:Other approaches
631:utility function
468:superintelligent
402:
395:
388:
309:Existential risk
131:Machine learning
32:
21:
2405:
2404:
2400:
2399:
2398:
2396:
2395:
2394:
2365:
2364:
2363:
2358:
2344:
2283:
2239:Steve Omohundro
2219:Geoffrey Hinton
2209:Stephen Hawking
2194:Paul Christiano
2174:Scott Alexander
2162:
2133:Google DeepMind
2081:
2067:Suffering risks
1985:
1976:
1911:by Nick Bostrom
1905:
1885:Wayback Machine
1876:Mason, C. 2008
1865:
1853:
Further reading
1848:
1841:
1828:
1827:
1823:
1813:
1811:
1797:
1796:
1792:
1751:
1750:
1746:
1736:
1734:
1718:
1717:
1713:
1703:
1701:
1683:
1682:
1678:
1663:
1645:Russell, Stuart
1643:
1642:
1631:
1593:
1592:
1588:
1578:
1576:
1562:
1561:
1554:
1544:
1542:
1538:
1531:
1526:
1525:
1518:
1508:
1506:
1464:
1463:
1459:
1421:
1420:
1413:
1375:
1374:
1367:
1357:
1355:
1341:
1340:
1336:
1275:
1274:
1270:
1260:
1258:
1244:
1243:
1239:
1232:
1217:
1216:
1212:
1204:10.1.1.393.8356
1188:
1187:
1183:
1173:
1171:
1167:
1156:
1148:
1147:
1143:
1136:
1117:
1116:
1112:
1102:
1100:
1071:
1070:
1066:
1059:
1046:
1045:
1041:
1034:
1021:
1020:
1016:
1009:
989:Russell, Stuart
987:
986:
979:
969:
954:
953:
949:
945:
940:
926:Suffering risks
910:Roko's basilisk
800:
763:
757:
714:
664:Steve Omohundro
661:
651:
615:
572:Steve Omohundro
539:to create the "
520:
514:
449:
406:
377:
376:
367:
359:
358:
334:
324:
323:
295:Control problem
275:
265:
264:
176:
166:
165:
126:
118:
117:
88:Computer vision
63:
28:
23:
22:
15:
12:
11:
5:
External links
1856:Yudkowsky, E.
1121:"Introduction"
890:Machine ethics
850:Embodied agent
602:Bruce Schneier
598:machine ethics
576:basic "drives"
516:Main article:
513:
510:
448:
445:
437:machine ethics
Public policy
2332:
2325:
2318:
2311:
2299:
2259:Jaan Tallinn
2199:Eric Drexler
2189:Nick Bostrom
2041:
2002:AI alignment
1931:Bill Hibbard
1861:
1830:
1824:
1812:. Retrieved
1803:
1793:
1785:
1757:
1753:
1747:
1735:. Retrieved
1726:The Guardian
1724:
1714:
1702:. Retrieved
1693:
1689:
1679:
1650:
1599:
1595:
1589:
1577:. Retrieved
1568:
1543:. Retrieved
1507:. Retrieved
1474:
1470:
1460:
1427:
1423:
1381:
1377:
1356:. Retrieved
1347:
1337:
1320:1721.1/79750
1292:
1288:
1281:Freer, C. E.
1271:
1261:December 23,
1259:. Retrieved
1250:
1240:
1220:
1213:
1194:
1190:
1184:
1172:. Retrieved
1160:
1144:
1125:
1113:
1101:. Retrieved
1084:
1080:
1067:
1048:
1042:
1023:
1017:
997:
972:
957:
950:
810:AI alignment
793:
786:
783:
776:
773:Nick Bostrom
768:The Guardian
766:
764:
739:
720:
719:, author of
717:James Barrat
715:
706:
681:
680:In his book
679:
672:
662:
639:meta-ethical
623:human nature
620:
616:
607:
591:
584:
570:
563:
558:
552:Nick Bostrom
545:
537:Isaac Asimov
521:
499:
493:
491:
483:
476:Peter Norvig
461:
440:
420:
416:
412:
411:
289:
285:Chinese room
174:Applications
29:
2264:Max Tegmark
2249:Martin Rees
2057:Longtermism
2017:AI takeover
1929:— by
1896:Friendly AI
1887:Appears in
1814:January 16,
1509:January 21,
1197:: 483–492.
1174:October 19,
820:AI takeover
533:Roger Bacon
494:Friendly AI
478:'s leading
417:friendly AI
314:Turing test
290:Friendly AI
61:Major goals
2369:Categories
2229:Shane Legg
2204:Sam Harris
2179:Sam Altman
2118:EleutherAI
1671:1083694322
1087:(2): 232.
1026:. Algora.
References
759:See also:
653:See also:
560:friendly.'
482:textbook,
319:Regulation
273:Philosophy
228:Healthcare
223:Government
125:Approaches
2244:Huw Price
2234:Elon Musk
2138:Humanity+
2012:AI safety
1774:0952-813X
1616:0951-5666
1499:212407078
1491:1435-5655
1452:198190745
1444:0951-5666
1406:143657841
1398:1477-1756
1199:CiteSeerX
815:AI effect
Criticism
674:Seth Baum
668:AI safety
659:AI safety
633:or other
349:AI winter
250:Military
113:AI safety
2356:Category
2224:Bill Joy
1990:Concepts
1881:Archived
1808:Archived
1731:Archived
1704:July 16,
1698:Archived
1624:29012168
1579:July 15,
1573:Archived
1536:Archived
1503:Archived
1358:July 15,
1352:Archived
1329:23679649
1283:(2013).
1255:Archived
1165:Archived
1152:(2008).
1097:Archived
1075:(2011).
995:(2009).
See also
372:Glossary
366:Glossary
344:Progress
339:Timeline
299:Takeover
260:Projects
233:Industry
196:Finance
186:Deepfake
136:Symbolic
108:Robotics
83:Planning
1782:7067517
1297:Bibcode
1251:Gizmodo
354:AI boom
332:History
255:Physics
2167:People
2158:OpenAI
1837:
1780:
1772:
1669:
1659:
1622:
1614:
1497:
1489:
1450:
1442:
1404:
1396:
1327:
1228:
1201:
1132:
1103:May 6,
1055:
1030:
1005:
965:
900:OpenAI
657:, and
441:should
304:Ethics
2288:Other
1981:from
1860:. In
1778:S2CID
1620:S2CID
1539:(PDF)
1532:(PDF)
1495:S2CID
1448:S2CID
1402:S2CID
1378:Think
1168:(PDF)
1157:(PDF)
525:golem
429:align
216:Music
211:Audio
1889:AAAI
1835:ISBN
1816:2012
1770:ISSN
1739:2014
1706:2014
1667:OCLC
1657:ISBN
1612:ISSN
1581:2014
1547:2015
1511:2023
1487:ISSN
1440:ISSN
1394:ISSN
1360:2014
1325:PMID
1263:2021
1226:ISBN
1176:2013
1130:ISBN
1105:2013
1053:ISBN
1028:ISBN
1003:ISBN
963:ISBN
531:and
474:and
1997:AGI