Yudkowsky (2008) asserts that friendliness (a desire not to harm humans) should be designed in from the start, but that the designers should recognize both that their own designs may be flawed, and that the robot will learn and evolve over time. Thus the challenge is one of mechanism design—to define a mechanism for evolving AI systems under a system of checks and balances, and to give the systems utility functions that will remain friendly in the face of such changes.
, where "Peer review panels of computer and cognitive scientists would sift through projects and choose those that are designed both to advance AI and assure that such advances would be accompanied by appropriate safeguards." McGinnis feels that peer review is better "than regulation to address technical issues that are not possible to capture through bureaucratic mandates". McGinnis notes that his proposal stands in contrast to that of the
, say that it will be impossible to ever guarantee "friendly" behavior in AIs because problems of ethical complexity will not yield to software advances or increases in computing power. They write that the criteria upon which friendly AI theories are based work "only when one has not only great powers of prediction about the likelihood of myriad possible outcomes, but certainty and consensus on how one values the different outcomes."
, Boyles and Joaquin maintain that such AIs would not be that friendly, considering the following: the infinite number of antecedent counterfactual conditions that would have to be programmed into a machine; the difficulty of cashing out the set of moral values, that is, those that are more ideal than the ones human beings possess at present; and the apparent disconnect between counterfactual antecedents and ideal value consequents.
Alan Winfield compares human-level artificial intelligence with faster-than-light travel in terms of difficulty, and states that while we need to be "cautious and prepared" given the stakes involved, we "don't need to be obsessing" about the risks of superintelligence. Boyles and Joaquin, on the other hand, argue that Luke Muehlhauser and
strengthened when messages resonate with AI developers; Baum argues that, in contrast, "existing messages about beneficial AI are not always framed well". Baum advocates for "cooperative relationships, and positive framing of AI researchers" and cautions against characterizing AI researchers as "not want(ing) to pursue beneficial designs".
Some philosophers claim that any truly "rational" agent, whether artificial or human, will naturally be benevolent; in this view, deliberate safeguards designed to produce a friendly AI could be unnecessary or even harmful. Other critics question whether it is possible for an artificial intelligence
...the essence of AGIs is their reasoning facilities, and it is the very logic of their being that will compel them to behave in a moral fashion... The real nightmare scenario (is one where) humans find it advantageous to strongly couple themselves to AGIs, with no guarantees against self-deception.
The "preferences" Russell refers to "are all-encompassing; they cover everything you might care about, arbitrarily far into the future." Similarly, "behavior" includes any choice between options, and the uncertainty is such that some probability, which may be quite small, must be assigned to every
Yudkowsky advances the Coherent Extrapolated Volition (CEV) model. According to him, our coherent extrapolated volition is "our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where
argues that the development of safe, socially beneficial artificial intelligence or artificial general intelligence is a function of the social psychology of AI research communities, and so can be constrained by extrinsic measures and motivated by intrinsic measures. Intrinsic motivations can be
In particular, Sections 1-4 give background to the definition of Friendly AI in Section 5. Section 6 gives two classes of mistakes (technical and philosophical) which would both lead to the accidental creation of non-Friendly AIs. Sections 7-13 discuss further related issues.
's proposal to create friendly AIs appears bleak. This is because Muehlhauser and Bostrom seem to hold the idea that intelligent machines could be programmed to think counterfactually about the moral values that human beings would have had. In an article in
has called the "security mindset": Rather than thinking about how a system will work, imagine how it could fail. For instance, he suggests even an AI that only makes accurate predictions and communicates via a text interface might cause unintended harm.
Alexander Wissner-Gross says that AIs driven to maximize their future freedom of action (or causal path entropy) might be considered friendly if their planning horizon is longer than a certain threshold, and unfriendly if their planning horizon is shorter than that threshold.
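For readers who want the underlying quantity, the causal entropic force defined in the Wissner-Gross and Freer paper cited below ("Causal entropic forces", Physical Review Letters, 2013) can be written as

$$ F(X_0, \tau) = T_c \, \nabla_X S_c(X, \tau) \big|_{X_0} $$

where $S_c(X, \tau)$ is the entropy of the distribution of feasible paths through the system's configuration space over a finite time horizon $\tau$, and $T_c$ is a constant ("causal path temperature") that sets the strength of the force; the planning horizon discussed above is this $\tau$.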
lists three principles to guide the development of beneficial machines. He emphasizes that these principles are not meant to be explicitly coded into the machines; rather, they are intended for the human developers. The principles are as follows:
504:, and picks out agents that are safe and useful, not necessarily ones that are "friendly" in the colloquial sense. The concept is primarily invoked in the context of discussions of recursively self-improving artificial agents that rapidly
In 2014, Luke Muehlhauser and Nick Bostrom underlined the need for 'friendly AI'; nonetheless, the difficulties in designing a 'friendly' superintelligence, for instance via programming counterfactual moral thinking, are considerable.
Extrapolated volition is intended to be what humanity objectively would want, all things considered, but it can only be defined relative to the psychological and cognitive qualities of present-day, unextrapolated humanity.
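Elsewhere the article notes that this criterion might be expressed, "for mathematical purposes, in the form of a utility function or other decision-theoretic formalism". The generic form being alluded to is expected-utility maximization; as a neutral illustration (not specific to CEV):

$$ a^{*} = \arg\max_{a \in A} \; \sum_{s} P(s \mid a) \, U(s) $$

where $A$ is the set of actions available to the agent, $P(s \mid a)$ its probability of outcome $s$ given action $a$, and $U$ the utility function that a Friendly AI design would need to keep aligned with humanity's (extrapolated) volition.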
In those stories, the extreme intelligence and power of these humanoid creations clash with their status as slaves (which by nature are seen as sub-human), and cause disastrous conflict. By 1942 these themes prompted
Basically we should assume that a 'superintelligence' would be able to achieve whatever goals it has. Therefore, it is extremely important that the goals we endow it with, and its entire motivation system, is 'human
has said that superintelligent AI systems with goals that are not aligned with human ethics are intrinsically dangerous unless extreme measures are taken to ensure the safety of humanity. He put it this way:
The roots of concern about artificial intelligence are very old. Kevin LaGrandeur showed that the dangers specific to AI can be seen in ancient literature concerning artificial humanoid servants such as the
, and continuous self-improvement, because of the intrinsic nature of any goal-driven system, and that these drives will, "without special precautions", cause the AI to exhibit undesired behavior.
John McGinnis encourages governments to accelerate friendly AI research. Because the goalposts of friendly AI are not necessarily evident, he suggests a model similar to the
Boyles, Robert James M.; Joaquin, Jeremiah Joven (July 23, 2019). "Why friendly AIs won't be that friendly: a friendly reply to Muehlhauser and Bostrom".
behave, friendly artificial intelligence research is focused on how to practically bring about this behavior and ensure that it is adequately constrained.
"—principles hard-wired into all the robots in his fiction, intended to prevent them from turning on their creators, or allowing them to come to harm.
The inner workings of advanced AI systems may be complex and difficult to interpret, leading to concerns about transparency and accountability.
, suggested that "a public-private partnership has to be created to bring A.I.-makers together to share ideas about security—something like the
and then produce the AI which humanity would want, given sufficient time and insight, to arrive at a satisfactory answer. The appeal to an
Some critics believe that both human-level AI and superintelligence are unlikely, and that therefore friendly AI is unlikely. Writing in
Rather than a Friendly AI being designed directly by human programmers, it is to be designed by a "seed AI" programmed to first study human nature
He explains: "The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else."
Omohundro, S. (2008). "The Basic AI Drives". In Proceedings of the First Conference on Artificial General Intelligence (AGI-08).
, on the grounds that this hypothetical technology would have a large, rapid, and difficult-to-control impact on human society.
our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted".
Baum, Seth D. (September 28, 2016). "On the promotion of safe and socially beneficial artificial intelligence".
with human interests or contribute to fostering the improvement of the human species. It is a part of the
says that a sufficiently advanced AI system will, unless explicitly counteracted, exhibit a number of
, but in partnership with corporations." He urges AI researchers to convene a meeting similar to the
— A brief description of Friendly AI by the Machine Intelligence Research Institute.
, in which one provably safe AI generation helps build the next provably safe generation.
— On the motives for and impossibility of FAI; by Adam Keiper and Ari N. Schulman.
formalism), as providing the ultimate criterion of "Friendliness", is an answer to the
to be friendly. Adam Keiper and Ari N. Schulman, editors of the technology journal
Creating Friendly AI 1.0: The Analysis and Design of Benevolent Goal Architectures
Muehlhauser, Luke; Bostrom, Nick (December 17, 2013). "Why We Need Friendly AI".
The machine's only objective is to maximize the realization of human preferences.
In 2008 Eliezer Yudkowsky called for the creation of "friendly AI" to mitigate
While machine ethics is concerned with how an artificially intelligent agent
Its owner may cede control to what Eliezer Yudkowsky terms a "Friendly AI,"...
The ultimate source of information about human preferences is human behavior.
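Taken together, the three principles (the machine's only objective is the realization of human preferences; it is initially uncertain what those preferences are; human behavior is its ultimate source of information about them) describe an agent that treats human preferences as an unknown to be learned rather than a fixed goal. The sketch below is only an illustration of that idea under invented assumptions (a two-option world, two hypothetical candidate preference models, an arbitrary 0.8 confidence threshold); it is not Russell's formalism, and the names and numbers in it are made up for the example.

```python
# Toy illustration (hypothetical example, not Russell's actual formalism):
# an agent keeps a probability distribution over candidate human preference
# functions, treats observed human choices as evidence, and only acts on its
# own once the remaining uncertainty about "what the human wants" is small.

from typing import Callable, Dict, List

Option = str
Preference = Callable[[Option], float]  # maps an option to how much the human values it

# Candidate models of the human's preferences (assumed for illustration).
candidates: Dict[str, Preference] = {
    "likes_tea":    lambda o: {"tea": 1.0, "coffee": 0.0}[o],
    "likes_coffee": lambda o: {"tea": 0.0, "coffee": 1.0}[o],
}

# Principle 2: start uncertain about which candidate is right.
belief: Dict[str, float] = {name: 1.0 / len(candidates) for name in candidates}

def observe_human_choice(chosen: Option, rejected: Option) -> None:
    """Principle 3: human behavior is evidence about human preferences.
    A candidate that ranks the chosen option higher becomes more plausible."""
    for name, pref in candidates.items():
        likelihood = 0.9 if pref(chosen) > pref(rejected) else 0.1
        belief[name] *= likelihood
    total = sum(belief.values())
    for name in belief:
        belief[name] /= total

def pick(options: List[Option]) -> Option:
    """Principle 1: maximize expected human preference under current belief,
    or defer to the human while the belief is still too spread out."""
    if max(belief.values()) < 0.8:          # still unsure what the human wants
        return "ask the human"
    expected = {o: sum(belief[n] * candidates[n](o) for n in belief) for o in options}
    return max(expected, key=expected.get)

print(pick(["tea", "coffee"]))         # -> "ask the human" (uncertain)
observe_human_choice("tea", "coffee")  # the human picked tea over coffee
print(pick(["tea", "coffee"]))         # -> "tea"
```

The design point being illustrated is that, under the second principle, the agent defers to the human while its belief about human preferences is still spread out, instead of optimizing a possibly wrong objective.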
"Artificial Intelligence as a Positive and Negative Factor in Global Risk"
427:(AGI) that would have a positive (benign) effect on humanity or at least
Artificial Intelligence as a Positive and Negative Factor in Global Risk
"The rise of artificial intelligence and the crisis of moral passivity"
, which generally aims to avoid government involvement in friendly AI.
Our Mathematical Universe: My Quest for the Ultimate Nature of Reality
"Artificial intelligence will not turn into a Frankenstein's monster"
Human Compatible: Artificial Intelligence and the Problem of Control
The machine is initially uncertain about what those preferences are.
Journal of Experimental & Theoretical Artificial Intelligence
(perhaps expressed, for mathematical purposes, in the form of a
Bostrom, Nick (2014). "Chapter 7: The Superintelligent Will".
Kornai, András (May 15, 2014). "Bounding the impact of AGI".
Yudkowsky (2008) goes into more detail about how to design a
– a moral philosophy advocated by proponents of Friendly AI
Discusses Artificial Intelligence from the perspective of
The Battle for Compassion: Ethics in an Apathetic Universe
Tegmark, Max (2014). "Life, Our Universe and Everything".
, who is best known for popularizing the idea, to discuss
Omohundro, S. M. (February 2008). "The basic AI drives".
Ethics and Information Technology, 23: 207–214.
1565:"What Happens When Artificial Intelligence Turns On Us?"
470:
artificial agents that reliably implement human values.
2008 Workshop on Meta-Reasoning: Thinking About Thinking
existential risk from advanced artificial intelligence
Existential risk from artificial general intelligence
"The Problem with 'Friendly' Artificial Intelligence"
. Singularity Institute for Artificial Intelligence.
Existential risk from artificial general intelligence
Existential risk from artificial general intelligence
— A near book-length description from the MIRI
Center for Human-Compatible Artificial Intelligence
2085:
1989:
1942:
The Problem with ‘Friendly’ Artificial Intelligence
1909:Ethical Issues in Advanced Artificial Intelligence
1878:Human-Level AI Requires Compassionate Intelligence
1648:
1119:
2148:Leverhulme Centre for the Future of Intelligence
1049:Moral Machines: Teaching Robots Right from Wrong
693:
2143:Institute for Ethics and Emerging Technologies
1936:Commentary on MIRI's Guidelines on Friendly AI
1927:Critique of the MIRI Guidelines on Friendly AI
1798:Keiper, Adam; Schulman, Ari N. (Summer 2011).
1077:"The Persistent Peril of the Artificial Slave"
961:(First ed.). Knopf Doubleday Publishing.
2327:Superintelligence: Paths, Dangers, Strategies
2307:Open letter on artificial intelligence (2015)
1963:
1558:
1556:
1417:
1415:
1247:"How Skynet Might Emerge From Simple Physics"
1221:Superintelligence: Paths, Dangers, Strategies
1159:. In Nick Bostrom; Milan M. Ćirković (eds.).
393:
8:
1522:
1520:
840:Artificial intelligence systems integration
761:Technological singularity § Criticisms
1970:
1956:
1948:
1831:Artificial Intelligence: A Modern Approach
1371:
1369:
998:Artificial Intelligence: A Modern Approach
485:Artificial Intelligence: A Modern Approach
400:
386:
31:
1318:
1308:
1202:
666:has proposed a "scaffolding" approach to
627:objective through contingent human nature
2113:Centre for the Study of Existential Risk
2153:Machine Intelligence Research Institute
1829:Norvig, Peter; Russell, Stuart (2010).
1733:from the original on September 17, 2014
1541:from the original on September 30, 2015
1348:Machine Intelligence Research Institute
1047:Wallach, Wendell; Allen, Colin (2009).
947:
895:Machine Intelligence Research Institute
865:Hallucination (artificial intelligence)
830:Applications of artificial intelligence
749:Machine Intelligence Research Institute
594:Machine Intelligence Research Institute
458:, AI researcher and creator of the term
39:
1505:from the original on February 10, 2023
983:
981:
731:Asilomar Conference on Recombinant DNA
500:'Friendly' is used in this context as
2375:Philosophy of artificial intelligence
1810:from the original on January 15, 2012
1700:from the original on December 1, 2014
1639:
1637:
1635:
1633:
1563:Hendry, Erica R. (January 21, 2014).
1170:from the original on October 19, 2013
1099:from the original on January 13, 2023
708:logically possible human preference.
7:
1257:from the original on October 8, 2021
1342:Muehlhauser, Luke (July 31, 2013).
1224:. Oxford: Oxford University Press.
655:AI control problem § Alignment
546:In modern times as the prospect of
2295:Statement on AI risk of extinction
1760:(3). Informa UK Limited: 417–438.
1690:Northwestern University Law Review
1575:from the original on July 19, 2014
1354:from the original on July 19, 2014
1344:"AI Risk and the Security Mindset"
1245:Dvorsky, George (April 26, 2013).
727:International Atomic Energy Agency
592:Luke Muehlhauser, writing for the
25:
2032:Ethics of artificial intelligence
1894:Froding, B. and Peterson, M 2021
1719:Winfield, Alan (August 9, 2014).
1684:McGinnis, John O. (Summer 2010).
835:Artificial intelligence arms race
433:ethics of artificial intelligence
2351:
2350:
2042:Friendly artificial intelligence
1864:, Oxford University Press, 2008.
1529:"Coherent Extrapolated Volition"
1051:. Oxford University Press, Inc.
578:, such as resource acquisition,
413:Friendly artificial intelligence
1806:. No. 32. pp. 80–89.
1191:Artificial General Intelligence
425:artificial general intelligence
68:Artificial general intelligence
2103:Center for Applied Rationality
1465:Chan, Berman (March 4, 2020).
1310:10.1103/PhysRevLett.110.168702
1093:10.5621/sciefictstud.38.2.0232
Coherent extrapolated volition
1:
745:National Institutes of Health
2123:Future of Humanity Institute
1766:10.1080/0952813x.2014.895109
2340:Artificial Intelligence Act
2334:Do You Trust This Computer?
1022:Leighton, Jonathan (2011).
103:Natural language processing
2406:
1527:Eliezer Yudkowsky (2004).
1483:10.1007/s00146-020-00953-9
1436:10.1007/s00146-019-00903-0
758:
652:
550:looms nearer, philosopher
515:
435:and is closely related to
156:Hybrid intelligent systems
78:Recursive self-improvement
2348:
2093:Alignment Research Center
2077:Technological singularity
2027:Effective accelerationism
1862:Global Catastrophic Risks
1833:(3rd ed.). Pearson.
1655:. United States: Viking.
1608:10.1007/s00146-016-0677-0
1390:10.1017/s1477175613000316
1161:Global Catastrophic Risks
931:Technological singularity
870:Hybrid intelligent system
527:, or the proto-robots of
2128:Future of Life Institute
2047:Instrumental convergence
1285:"Causal entropic forces"
905:Regulation of algorithms
280:Artificial consciousness
1983:artificial intelligence
1289:Physical Review Letters
1081:Science Fiction Studies
641:problem of defining an
600:researchers adopt what
586:Alexander Wissner-Gross
506:explode in intelligence
480:artificial intelligence
462:The term was coined by
151:Evolutionary algorithms
41:Artificial intelligence
2052:Intelligence explosion
1938:— by Peter Voss.
1126:The Rest of the Robots
936:Three Laws of Robotics
875:Intelligence explosion
735:risks of biotechnology
705:
562:
541:Three Laws of Robotics
Risks of unfriendly AI
498:
488:, describes the idea:
459:
52:
27:AI to benefit humanity
2007:AI capability control
1118:Isaac Asimov (1964).
557:
502:technical terminology
490:
454:
51:
2098:Center for AI Safety
1915:What is Friendly AI?
1569:Smithsonian Magazine
1277:Wissner-Gross, A. D.
1163:. pp. 308–345.
825:Ambient intelligence
93:General game playing
2390:Affective computing
2313:Our Final Invention
1647:(October 8, 2019).
1301:2013PhRvL.110p8702W
885:Intelligent control
855:Emotion recognition
805:Affective computing
722:Our Final Invention
548:superintelligent AI
529:Gerbert of Aurillac
Etymology and usage
245:Machine translation
161:Systems integration
98:Knowledge reasoning
35:Part of a series on
18:Friendliness Theory
2380:Singularitarianism
1883:2022-01-09 at the
920:Singularitarianism
915:Sentiment analysis
733:, which discussed
643:objective morality
635:decision-theoretic
596:, recommends that
460:
423:) is hypothetical
53:
2362:
2361:
2279:Eliezer Yudkowsky
2254:Stuart J. Russell
2072:Superintelligence
1686:"Accelerating AI"
1662:978-0-525-55861-3
1150:Eliezer Yudkowsky
1058:978-0-19-537404-9
1033:978-0-87586-870-7
1008:978-0-13-604259-4
1001:. Prentice Hall.
880:Intelligent agent
688:Stuart J. Russell
580:self-preservation
472:Stuart J. Russell
464:Eliezer Yudkowsky
456:Eliezer Yudkowsky
410:
409:
146:Bayesian networks
73:Intelligent agent
16:(Redirected from
2397:
2354:
2353:
2301:Human Compatible
2274:Roman Yampolskiy
2022:Consequentialism
1979:Existential risk
1972:
1965:
1958:
1949:
1868:Existential risk
1845:
1844:
1826:
1820:
1819:
1817:
1815:
1804:The New Atlantis
1795:
1789:
1788:
1749:
1743:
1742:
1740:
1738:
1716:
1710:
1709:
1707:
1705:
1696:(3): 1253–1270.
1681:
1675:
1674:
1654:
1641:
1628:
1627:
1596:AI & Society
1591:
1585:
1584:
1582:
1580:
1560:
1551:
1550:
1548:
1546:
1540:
1533:
1524:
1515:
1514:
1512:
1510:
1471:AI & Society
1462:
1456:
1455:
1424:AI & Society
1419:
1410:
1409:
1373:
1364:
1363:
1361:
1359:
1339:
1333:
1332:
1322:
1312:
1273:
1267:
1266:
1264:
1262:
1242:
1236:
1235:
1215:
1209:
1208:
1206:
1186:
1180:
1179:
1177:
1175:
1169:
1158:
1146:
1140:
1139:
1123:
1115:
1109:
1108:
1106:
1104:
1073:Kevin LaGrandeur
1069:
1063:
1062:
1044:
1038:
1037:
1019:
1013:
1012:
985:
976:
975:
952:
845:Autonomous agent
788:The New Atlantis
778:AI & Society
686:, AI researcher
683:Human Compatible
649:Other approaches
631:utility function
468:superintelligent
402:
395:
388:
309:Existential risk
131:Machine learning
32:
21:
2405:
2404:
2400:
2399:
2398:
2396:
2395:
2394:
2365:
2364:
2363:
2358:
2344:
2283:
2239:Steve Omohundro
2219:Geoffrey Hinton
2209:Stephen Hawking
2194:Paul Christiano
2174:Scott Alexander
2162:
2133:Google DeepMind
2081:
2067:Suffering risks
1985:
1976:
1911:by Nick Bostrom
1905:
1885:Wayback Machine
1876:Mason, C. 2008
1865:
1853:
Further reading
1848:
1841:
1828:
1827:
1823:
1813:
1811:
1797:
1796:
1792:
1751:
1750:
1746:
1736:
1734:
1718:
1717:
1713:
1703:
1701:
1683:
1682:
1678:
1663:
1645:Russell, Stuart
1643:
1642:
1631:
1593:
1592:
1588:
1578:
1576:
1562:
1561:
1554:
1544:
1542:
1538:
1531:
1526:
1525:
1518:
1508:
1506:
1464:
1463:
1459:
1421:
1420:
1413:
1375:
1374:
1367:
1357:
1355:
1341:
1340:
1336:
1275:
1274:
1270:
1260:
1258:
1244:
1243:
1239:
1232:
1217:
1216:
1212:
1204:10.1.1.393.8356
1188:
1187:
1183:
1173:
1171:
1167:
1156:
1148:
1147:
1143:
1136:
1117:
1116:
1112:
1102:
1100:
1071:
1070:
1066:
1059:
1046:
1045:
1041:
1034:
1021:
1020:
1016:
1009:
989:Russell, Stuart
987:
986:
979:
969:
954:
953:
949:
945:
940:
926:Suffering risks
910:Roko's basilisk
800:
763:
757:
714:
664:Steve Omohundro
661:
651:
615:
572:Steve Omohundro
539:to create the "
520:
514:
449:
406:
377:
376:
367:
359:
358:
334:
324:
323:
295:Control problem
275:
265:
264:
176:
166:
165:
126:
118:
117:
88:Computer vision
63:
28:
23:
22:
15:
12:
11:
5:
External links
1856:Yudkowsky, E.
1121:"Introduction"
890:Machine ethics
850:Embodied agent
602:Bruce Schneier
598:machine ethics
576:basic "drives"
516:Main article:
513:
510:
448:
445:
437:machine ethics
Public policy
2332:
2325:
2318:
2311:
2299:
2259:Jaan Tallinn
2199:Eric Drexler
2189:Nick Bostrom
2041:
2002:AI alignment
1931:Bill Hibbard
1861:
1830:
1824:
1812:. Retrieved
1803:
1793:
1785:
1757:
1753:
1747:
1735:. Retrieved
1726:The Guardian
1724:
1714:
1702:. Retrieved
1693:
1689:
1679:
1650:
1599:
1595:
1589:
1577:. Retrieved
1568:
1543:. Retrieved
1507:. Retrieved
1474:
1470:
1460:
1427:
1423:
1381:
1377:
1356:. Retrieved
1347:
1337:
1320:1721.1/79750
1292:
1288:
1281:Freer, C. E.
1271:
1261:December 23,
1259:. Retrieved
1250:
1240:
1220:
1213:
1194:
1190:
1184:
1172:. Retrieved
1160:
1144:
1125:
1113:
1101:. Retrieved
1084:
1080:
1067:
1048:
1042:
1023:
1017:
997:
972:
957:
950:
810:AI alignment
793:
786:
783:
776:
773:Nick Bostrom
768:The Guardian
766:
764:
739:
720:
719:, author of
717:James Barrat
715:
706:
681:
680:In his book
679:
672:
662:
639:meta-ethical
623:human nature
620:
616:
607:
591:
584:
570:
563:
558:
552:Nick Bostrom
545:
537:Isaac Asimov
521:
499:
493:
491:
483:
476:Peter Norvig
461:
440:
420:
416:
412:
411:
289:
285:Chinese room
174:Applications
29:
2264:Max Tegmark
2249:Martin Rees
2057:Longtermism
2017:AI takeover
1929:— by
1896:Friendly AI
1887:Appears in
1814:January 16,
1509:January 21,
1197:: 483–492.
1174:October 19,
820:AI takeover
533:Roger Bacon
494:Friendly AI
478:'s leading
417:friendly AI
314:Turing test
290:Friendly AI
61:Major goals
2369:Categories
2229:Shane Legg
2204:Sam Harris
2179:Sam Altman
2118:EleutherAI
1671:1083694322
1087:(2): 232.
1026:. Algora.
References
759:See also:
653:See also:
560:friendly.'
482:textbook,
319:Regulation
273:Philosophy
228:Healthcare
223:Government
125:Approaches
2244:Huw Price
2234:Elon Musk
2138:Humanity+
2012:AI safety
1774:0952-813X
1616:0951-5666
1499:212407078
1491:1435-5655
1452:198190745
1444:0951-5666
1406:143657841
1398:1477-1756
1199:CiteSeerX
815:AI effect
Criticism
674:Seth Baum
668:AI safety
659:AI safety
633:or other
349:AI winter
250:Military
113:AI safety
2356:Category
2224:Bill Joy
1990:Concepts
1881:Archived
1808:Archived
1731:Archived
1704:July 16,
1698:Archived
1624:29012168
1579:July 15,
1573:Archived
1536:Archived
1503:Archived
1358:July 15,
1352:Archived
1329:23679649
1283:(2013).
1255:Archived
1165:Archived
1152:(2008).
1097:Archived
1075:(2011).
995:(2009).
See also
372:Glossary
366:Glossary
344:Progress
339:Timeline
299:Takeover
260:Projects
233:Industry
196:Finance
186:Deepfake
136:Symbolic
108:Robotics
83:Planning
1782:7067517
1297:Bibcode
1251:Gizmodo
354:AI boom
332:History
255:Physics
2167:People
2158:OpenAI
1837:
1780:
1772:
1669:
1659:
1622:
1614:
1497:
1489:
1450:
1442:
1404:
1396:
1327:
1228:
1201:
1132:
1103:May 6,
1055:
1030:
1005:
965:
900:OpenAI
657:, and
441:should
304:Ethics
2288:Other
1981:from
1860:. In
1778:S2CID
1620:S2CID
1539:(PDF)
1532:(PDF)
1495:S2CID
1448:S2CID
1402:S2CID
1378:Think
1168:(PDF)
1157:(PDF)
525:golem
429:align
216:Music
211:Audio
1889:AAAI
1835:ISBN
1816:2012
1770:ISSN
1739:2014
1706:2014
1667:OCLC
1657:ISBN
1612:ISSN
1581:2014
1547:2015
1511:2023
1487:ISSN
1440:ISSN
1394:ISSN
1360:2014
1325:PMID
1263:2021
1226:ISBN
1176:2013
1130:ISBN
1105:2013
1053:ISBN
1028:ISBN
1003:ISBN
963:ISBN
531:and
474:and
1997:AGI