The software was licensed from third-party developers Joseph Katz and Mark Barton (later SoftVoice, Inc.) and was featured during the 1984 introduction of the Macintosh computer. This January demo required 512 kilobytes of RAM, so it could not run in the 128 kilobytes of RAM the first Mac actually shipped with. The demo was therefore run on a prototype 512k Mac, although those in attendance were not told of this, and the synthesis demo created considerable excitement for the Macintosh. In the early 1990s Apple expanded its capabilities, offering system-wide text-to-speech support. With the introduction of faster PowerPC-based computers, they included higher-quality voice sampling. Apple also introduced
1642:. The program was available for non-Macintosh Apple computers (including the Apple II, and the Lisa), various Atari models and the Commodore 64. The Apple version preferred additional hardware that contained DACs, although it could instead use the computer's one-bit audio output (with the addition of much distortion) if the card was not present. The Atari made use of the embedded POKEY audio chip. Speech playback on the Atari normally disabled interrupt requests and shut down the ANTIC chip during vocal output. The audible output is extremely distorted speech when the screen is on. The Commodore 64 made use of the 64's embedded SID audio chip.
972:; however, many concatenative systems also have rules-based components. Many systems based on formant synthesis technology generate artificial, robotic-sounding speech that would never be mistaken for human speech. However, maximum naturalness is not always the goal of a speech synthesis system, and formant synthesis systems have advantages over concatenative systems. Formant-synthesized speech can be reliably intelligible, even at very high speeds, avoiding the acoustic glitches that commonly plague concatenative systems. High-speed synthesized speech is used by the visually impaired to quickly navigate computers using a
1886:, a text-to-speech utility for people who have visual impairment. Third-party programs such as JAWS for Windows, Window-Eyes, Non-visual Desktop Access, Supernova and System Access can perform various text-to-speech tasks such as reading text aloud from a specified website, email account, text document, the Windows clipboard, the user's keyboard typing, etc. Not all programs can use speech synthesis directly. Some programs can use plug-ins, extensions or add-ons to read text aloud. Third-party programs are available that can read text from the system clipboard.
1200:. The company states its software is built to adjust the intonation and pacing of delivery based on the context of language input used. It uses advanced algorithms to analyze the contextual aspects of text, aiming to detect emotions like anger, sadness, happiness, or alarm, which enables the system to understand the user's sentiment, resulting in a more realistic and human-like inflection. Other features include multilingual speech generation and long-form content creation with contextually-aware voices.
515:(LSP) method for high-compression speech coding, while at NTT. From 1975 to 1981, Itakura studied problems in speech analysis and synthesis based on the LSP method. In 1980, his team developed an LSP-based speech synthesizer chip. LSP is an important technology for speech synthesis and coding, and in the 1990s was adopted by almost all international speech coding standards as an essential component, contributing to the enhancement of digital speech communication over mobile channels and the internet.
(DSP) to the recorded speech. DSP often makes recorded speech sound less natural, although some systems use a small amount of signal processing at the point of concatenation to smooth the waveform. The output from the best unit-selection systems is often indistinguishable from real human voices, especially in contexts for which the TTS system has been tuned. However, maximum naturalness typically requires unit-selection speech databases to be very large, in some systems ranging into the
also be read as "one three two five", "thirteen twenty-five" or "thirteen hundred and twenty five". A TTS system can often infer how to expand a number based on surrounding words, numbers, and punctuation, and sometimes the system provides a way to specify the context if it is ambiguous. Roman numerals can also be read differently depending on context. For example, "Henry VIII" reads as "Henry the Eighth", while "Chapter VIII" reads as "Chapter Eight".
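The alternative readings of "1325" can be illustrated with a minimal Python sketch. The function names and the `context` labels below are illustrative assumptions, not part of any particular TTS system; a real front end would infer the context from the surrounding text rather than receive it as an argument.

```python
# Illustrative sketch only: function names and context labels are assumptions.
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def below_hundred(n):
    """Spell out an integer in the range 0-99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def cardinal(n):
    """Spell out an integer in the range 0-9999 as a cardinal number."""
    thousands, rest = divmod(n, 1000)
    hundreds, rest = divmod(rest, 100)
    parts = []
    if thousands:
        parts.append(f"{ONES[thousands]} thousand")
    if hundreds:
        parts.append(f"{ONES[hundreds]} hundred")
    if rest or not parts:
        parts.append(below_hundred(rest))
    return " ".join(parts)


def digit_by_digit(text):
    """Read each digit separately, as in a code or phone number."""
    return " ".join(ONES[int(ch)] for ch in text)


def year_style(text):
    """Read a four-digit string as two pairs (no special cases such as 1905)."""
    return f"{below_hundred(int(text[:2]))} {below_hundred(int(text[2:]))}"


def expand_number(text, context="cardinal"):
    """Expand a digit string according to a caller-supplied context label."""
    if context == "digits":
        return digit_by_digit(text)
    if context == "year":
        return year_style(text)
    return cardinal(int(text))
```

Calling `expand_number("1325")`, `expand_number("1325", "digits")` and `expand_number("1325", "year")` yields three of the readings mentioned in the text. The hard part in practice is choosing the context, which this sketch leaves to the caller.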
Diphone synthesis suffers from the sonic glitches of concatenative synthesis and the robotic-sounding nature of formant synthesis, and has few of the advantages of either approach other than small size. As such, its use in commercial applications is declining, although it continues to be used in research because there are a number of freely available software implementations. An early example of diphone synthesis is a teaching robot,
1308:" to aid in disambiguating homographs. This technique is quite successful for many cases such as whether "read" should be pronounced as "red" implying past tense, or as "reed" implying present tense. Typical error rates when using HMMs in this fashion are usually below five percent. These techniques also work well for most European languages, although access to required training
1589:). The synthesizer uses a variant of linear predictive coding and has a small in-built vocabulary. The original intent was to release small cartridges that plugged directly into the synthesizer unit, which would increase the device's built-in vocabulary. However, the success of software text-to-speech in the Terminal Emulator II cartridge canceled that plan.
approach works on any input, but the complexity of the rules grows substantially as the system takes into account irregular spellings or pronunciations. (Consider that the word "of" is very common in English, yet is the only word in which the letter "f" is pronounced [v].) As a result, nearly all speech synthesis systems use a combination of these approaches.
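A combined approach can be sketched in a few lines of Python: look the word up in a dictionary first, and fall back to letter-to-sound rules only when the lookup fails. The tiny lexicon, the one-letter-one-sound rule table and the ARPAbet-like phoneme symbols are all hypothetical and far cruder than what a real system would use.

```python
# Hypothetical exception dictionary (ARPAbet-like symbols, for illustration).
LEXICON = {
    "of": "AH V",
    "one": "W AH N",
}

# Deliberately naive rules: every letter maps to exactly one sound.
LETTER_RULES = {
    "a": "AE", "b": "B", "c": "K", "d": "D", "e": "EH", "f": "F",
    "g": "G", "h": "HH", "i": "IH", "j": "JH", "k": "K", "l": "L",
    "m": "M", "n": "N", "o": "AA", "p": "P", "q": "K", "r": "R",
    "s": "S", "t": "T", "u": "AH", "v": "V", "w": "W", "x": "K S",
    "y": "Y", "z": "Z",
}


def rules_only(word):
    """Rule-based pronunciation: works on any input, but often wrongly."""
    return " ".join(LETTER_RULES[ch] for ch in word.lower() if ch in LETTER_RULES)


def pronounce(word):
    """Dictionary lookup first; rule-based fallback for unknown words."""
    return LEXICON.get(word.lower()) or rules_only(word)
```

Here `rules_only("of")` gives `"AA F"`, which is wrong, while `pronounce("of")` returns the dictionary entry `"AH V"`. A word missing from the lexicon, such as "fab", still gets a pronunciation (`"F AE B"`) from the rules.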
1836:", which allowed command-line users to redirect text output to speech. Speech synthesis was occasionally used in third-party programs, particularly word processors and educational software. The synthesis software remained largely unchanged from the first AmigaOS release and Commodore eventually removed speech synthesis support from AmigaOS 2.1 onward.
1481:, reported that listeners to voice recordings could determine, at better than chance levels, whether or not the speaker was smiling. It was suggested that identification of the vocal features that signal emotional content may be used to help make synthesized speech sound more natural. One of the related issues is modification of the
loanwords, whose pronunciations are not obvious from their spellings. On the other hand, speech synthesis systems for languages like
English, which have extremely irregular spelling systems, are more likely to rely on dictionaries, and to use rule-based methods only for unusual words, or words that are not in their dictionaries.
In the early 1980s, TI was known as a pioneer in speech synthesis, and a highly popular plug-in speech synthesizer module was available for the TI-99/4 and 4A. Speech synthesizers were offered free with the purchase of a number of cartridges and were used by many TI-written video games (games offered
The consistent evaluation of speech synthesis systems may be difficult because of a lack of universally agreed objective evaluation criteria. Different organizations often use different speech data. The quality of speech synthesis systems also depends on the quality of the production technique (which
used to create convincing speech sentences that sound like specific people saying things they did not say. This technology was initially developed for various applications to improve human life. For example, it can be used to produce audiobooks, and also to help people who have lost their voices (due
Concatenative synthesis is based on the concatenation (stringing together) of segments of recorded speech. Generally, concatenative synthesis produces the most natural-sounding synthesized speech. However, differences between natural variations in speech and the nature of the automated techniques for
Content creators have used voice cloning tools to recreate their voices for podcasts, narration, and comedy shows. Publishers and authors have also used such software to narrate audiobooks and newsletters. Another area of application is AI video creation with talking heads. Webapps and video editors
Text-to-speech (TTS) refers to the ability of computers to read text aloud. A TTS engine converts written text to a phonemic representation, then converts the phonemic representation to waveforms that can be output as sound. TTS engines with different languages, dialects and specialized vocabularies
Deciding how to convert numbers is another problem that TTS systems have to address. It is a simple programming challenge to convert a number into words (at least in
English), like "1325" becoming "one thousand three hundred twenty-five". However, numbers occur in many different contexts; "1325" may
Domain-specific synthesis concatenates prerecorded words and phrases to create complete utterances. It is used in applications where the variety of texts the system will output is limited to a particular domain, like transit schedule announcements or weather reports. The technology is very simple to
of recorded data, representing dozens of hours of speech. Also, unit selection algorithms have been known to select segments from a place that results in less than ideal synthesis (e.g. minor words become unclear) even when a better choice exists in the database. Recently, researchers have proposed
In the 2010s, singing synthesis technology took advantage of advances in artificial intelligence (deep listening and machine learning) to better represent the nuances of the human voice. New high-fidelity sample libraries combined with digital audio workstations facilitate editing in
Each approach has advantages and drawbacks. The dictionary-based approach is quick and accurate, but completely fails if it is given a word which is not in its dictionary. As dictionary size grows, so too do the memory space requirements of the synthesis system. On the other hand, the rule-based
Because these systems are limited by the words and phrases in their databases, they are not general-purpose and can only synthesize the combinations of words and phrases with which they have been preprogrammed. The blending of words within naturally spoken language, however, can still cause problems
speech synthesizer chip on a removable cartridge. The Narrator had 2 kB of read-only memory (ROM), which was used to store a database of generic words that could be combined to make phrases in Intellivision games. Since the Orator chip could also accept speech data from external memory, any
is stored by the program. Determining the correct pronunciation of each word is a matter of looking up each word in the dictionary and replacing the spelling with the pronunciation specified in the dictionary. The other approach is rule-based, in which pronunciation rules are applied to words to
Similarly, abbreviations can be ambiguous. For example, the abbreviation "in" for "inches" must be differentiated from the word "in", and the address "12 St John St." uses the same abbreviation for both "Saint" and "Street". TTS systems with intelligent front ends can make educated guesses about
have a very regular writing system, and the prediction of the pronunciation of words based on their spellings is quite successful. Speech synthesis systems for such languages often use the rule-based method extensively, resorting to dictionaries only for those few words, like foreign names and
implement, and has been in commercial use for a long time, in devices like talking clocks and calculators. The level of naturalness of these systems can be very high because the variety of sentence types is limited, and they closely match the prosody and intonation of the original recordings.
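A talking clock shows how little machinery domain-specific synthesis needs. The following Python sketch selects the sequence of prerecorded clips for a given time; the clip names (`the_time_is`, `o_clock`, `oh`, `am`, `pm` and the number words) are hypothetical identifiers for recordings, and a real device would play or concatenate the corresponding audio rather than return a list of names.

```python
# Hypothetical clip identifiers; each would correspond to one recording.
SMALL = ["zero", "one", "two", "three", "four", "five", "six", "seven",
         "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = {2: "twenty", 3: "thirty", 4: "forty", 5: "fifty"}


def number_clips(n):
    """Clips needed to speak an integer in the range 0-59."""
    if n < 20:
        return [SMALL[n]]
    tens, ones = divmod(n, 10)
    return [TENS[tens]] + ([SMALL[ones]] if ones else [])


def clock_clips(hour24, minute):
    """Return the ordered clip names for a 24-hour time."""
    if not (0 <= hour24 < 24 and 0 <= minute < 60):
        # Outside the domain: the system has no recordings to fall back on.
        raise ValueError("time is outside the clock's domain")
    clips = ["the_time_is"] + number_clips(hour24 % 12 or 12)
    if minute == 0:
        clips.append("o_clock")
    elif minute < 10:
        clips += ["oh"] + number_clips(minute)
    else:
        clips += number_clips(minute)
    clips.append("am" if hour24 < 12 else "pm")
    return clips
```

For example, `clock_clips(15, 5)` returns `["the_time_is", "three", "oh", "five", "pm"]`. Anything outside the fixed domain raises an error, which mirrors the limitation that such systems can only say what has been recorded for them.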
fine detail, such as shifting of formants, adjustment of vibrato, and adjustments to vowels and consonants. Sample libraries for various languages and various accents are available. With today's advancements in vocal synthesis, artists sometimes use sample libraries in lieu of backing singers.
Speech synthesis techniques are also used in entertainment productions such as games and animations. In 2007, Animo
Limited announced the development of a software application package based on its speech synthesis software FineSpeech, explicitly geared towards customers in the entertainment
More recent synthesizers, developed by Jorge C. Lucero and colleagues, incorporate models of vocal fold biomechanics, glottal aerodynamics and acoustic wave propagation in the bronchi, trachea, nasal and oral cavities, and thus constitute full systems of physics-based speech simulation.
Naturalness describes how closely the output sounds like human speech, while intelligibility is the ease with which the output is understood. The ideal speech synthesizer is both natural and intelligible. Speech synthesis systems usually try to maximize both characteristics.
Text-to-speech is also used in second language acquisition. Voki, for instance, is an educational tool created by
Oddcast that allows users to create their own talking avatar, using different accents. They can be emailed, embedded on websites or shared on social media.
Speech synthesis has long been a vital assistive technology tool and its application in this area is significant and widespread. It allows environmental barriers to be removed for people with a wide range of disabilities. The longest application has been in the use of
Despite the
American English phoneme limitation, an unofficial version with multilingual speech synthesis was developed. This made use of an enhanced version of the translator library which could translate a number of languages, given a set of rules for each language.
Early electronic speech-synthesizers sounded robotic and were often barely intelligible. The quality of synthesized speech has steadily improved, but as of 2016 output from contemporary speech synthesis systems remains clearly distinguishable from actual human speech.
in the late 1940s and completed it in 1950. There were several different versions of this hardware device; only one currently survives. The machine converts pictures of the acoustic patterns of speech in the form of a spectrogram back into sound. Using this device,
that all require expansion into a phonetic representation. There are many spellings in
English which are pronounced differently based on context. For example, "My latest project is to learn how to better project my voice" contains two pronunciations of "project".
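The "project" example can be made concrete with a toy Python sketch that picks a reading from nearby cue words. This is a hand-written heuristic for illustration only, not a statistical tagger; the cue-word list, the two-word look-back window and the capitalised stress notation are all assumptions.

```python
import re

# Hypothetical readings; capitals mark the stressed syllable.
HOMOGRAPHS = {"project": {"noun": "PROJ-ect", "verb": "pro-JECT"}}

# Hand-picked words that suggest a verb follows (an assumption of this sketch).
VERB_CUES = {"to", "will", "can", "must", "should"}


def read_homographs(sentence, window=2):
    """Return (word, reading) pairs for each known homograph in the sentence."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    readings = []
    for i, token in enumerate(tokens):
        if token in HOMOGRAPHS:
            # Look at the few words just before the homograph for a verb cue.
            preceding = tokens[max(0, i - window):i]
            role = "verb" if VERB_CUES & set(preceding) else "noun"
            readings.append((token, HOMOGRAPHS[token][role]))
    return readings
```

Applied to "My latest project is to learn how to better project my voice", the sketch returns the noun reading for the first occurrence and the verb reading for the second. A heuristic this simple breaks easily on other sentences.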
The synthesis system was divided into a translator library, which converted unrestricted English text into a standard set of phonetic codes, and a narrator device, which implemented a formant model of speech generation. AmigaOS also featured a high-level "
The DNN-based speech synthesizers are approaching the naturalness of the human voice. Examples of disadvantages of the method are low robustness when the data are not sufficient, lack of controllability and low performance in auto-regressive models.
1156:(DNN) to produce artificial speech from text (text-to-speech) or spectrum (vocoder). The deep neural networks are trained using a large amount of recorded speech and, in the case of a text-to-speech system, the associated labels and/or input text.
1166:—hundreds of voices are trained concurrently rather than sequentially, decreasing the required training time and enabling the model to learn and generalize shared emotional context, even for voices with no exposure to such emotional context. The
was released, and was one of the first speech synthesis systems. It consisted of stand-alone computer hardware and specialized software that enabled it to read Italian. A second version, released in 1978, was also able to sing Italian in an
additional words or phrases needed could be stored inside the cartridge itself. The data consisted of strings of analog-filter coefficients to modify the behavior of the chip's synthetic vocal-tract model, rather than simple digitized samples.
), the user can choose from a wide range of voices. VoiceOver voices feature the taking of realistic-sounding breaths between sentences, as well as improved clarity at high read rates over PlainTalk. Mac OS X also includes
may involve analogue or digital recording) and on the facilities used to replay the speech. Evaluating speech synthesis systems has therefore often been compromised by differences between production techniques and replay facilities.
1067:. The system, first marketed in 1994, provides full articulatory-based text-to-speech conversion using a waveguide or transmission-line analog of the human oral and nasal tracts controlled by Carré's "distinctive region model".
provides the largest output range, but may lack clarity. For specific usage domains, the storage of entire words or sentences allows for high-quality output. Alternatively, a synthesizer can incorporate a model of the
to throat disease or other medical problems) to get them back. Commercially, it has opened the door to several opportunities. This technology can also create more personalized digital assistants and natural-sounding
In 1976, Computalker
Consultants released their CT-1 Speech Synthesizer. Designed by D. Lloyd Rice and Jim Cooper, it was an analog synthesizer built to work with microcomputers using the S-100 bus standard.
was visiting his friend and colleague John Pierce at the Bell Labs Murray Hill facility. Clarke was so impressed by the demonstration that he used it in the climactic scene of his screenplay for his novel
of the language: for example, Spanish has about 800 diphones, and German about 2500. In diphone synthesis, only one example of each diphone is contained in the speech database. At runtime, the target
"Hot AI startup ElevenLabs, founded by ex-Google and Palantir staff, is set to raise $18 million at a $100 million valuation. Check out the 14-slide pitch deck it used for its $2 million pre-seed"
On the other hand, on-line RSS-readers are available on almost any personal computer connected to the Internet. Users can download generated audio files to portable devices, e.g. with the help of
Leachim contained information regarding class curricula and certain biographical information about the students whom it was programmed to teach. It was tested in a fourth grade classroom in
813:, the desired target utterance is created by determining the best chain of candidate units from the database (unit selection). This process is typically achieved using a specially weighted
From 1971 to 1996, Votrax produced a number of commercial speech synthesizer components. A Votrax synthesizer was included in the first generation Kurzweil Reading Machine for the Blind.
which included a Formant synthesis capability. Sequences of up to 512 individual vowel and consonant formants could be stored and replayed, allowing short vocal phrases to be synthesized.
since the early 2000s has improved to the point that humans can no longer tell a real human imaged with a real camera from a simulation of a human imaged with a simulation of a camera.
reporter Joseph Cox published findings that he had recorded five minutes of himself talking and then used a tool developed by ElevenLabs to create voice deepfakes that defeated a bank's
200:. The front-end has two major tasks. First, it converts raw text containing symbols like numbers and abbreviations into the equivalent of written-out words. This process is often called
The quality of a speech synthesizer is judged by its similarity to the human voice and by its ability to be understood clearly. An intelligible text-to-speech program allows people with
Text-to-speech aids for people with disabilities and impaired communication have become widely available. Text-to-speech is also finding new applications; for example, speech synthesis combined with
text-to-speech system. It featured a complete system of voice emulation for American English, with both male and female voices and "stress" indicator markers, made possible through the
residual). Such pitch synchronous pitch modification techniques need a priori pitch marking of the synthesis speech database using techniques such as epoch extraction using dynamic
2104:) were capable of text-to-phoneme synthesis or reciting complete words and phrases (text-to-dictionary), using a very popular Speech Synthesizer peripheral. TI used a proprietary
Following the commercial failure of the hardware-based Intellivoice, gaming developers sparingly used software synthesis in later games. Earlier systems from Atari, such as the
1174:: each time that speech is generated from the same string of text, the intonation of the speech will be slightly different. The application also supports manually altering the
Unit selection synthesis uses large databases of recorded speech. During database creation, each recorded utterance is segmented into some or all of the following: individual
Work to personalize a synthetic voice to better match a person's personality or historical voice is becoming available. A noted application of speech synthesis was the
1999:. It can deliver TTS functionality to anyone (for reasons of accessibility, convenience, entertainment or information) with access to a web browser. The non-profit project
1418:). The simplest approach to text-to-phoneme conversion is the dictionary-based approach, where a large dictionary containing all the words of a language and their correct
Standard Additions includes a say verb that allows a script to use any of the installed voices and to control the pitch, speaking rate and modulation of the spoken text.
Prathosh, A. P.; Ramakrishnan, A. G.; Ananthapadmanabha, T. V. (December 2013). "Epoch extraction based on integrated linear prediction residual using plosion index".
976:. Formant synthesizers are usually smaller programs than concatenative systems because they do not have a database of speech samples. They can therefore be used in
3301:. (2003). CMU ARCTIC databases for speech synthesis. CMU-LTI-03-177. Language Technologies Institute, School of Computer Science, Carnegie Mellon University.
into its systems which provided a fluid command set. More recently, Apple has added sample-based voices. Starting as a curiosity, the speech system of Apple
in the late 1980s and merged with Apple Computer in 1997), the Trillium software was published under the GNU General Public License, with work continuing as
Chadha, Anupama; Kumar, Vaibhav; Kashyap, Sonu; Gupta, Mayank (2021), Singh, Pradeep Kumar; Wierzchoń, Sławomir T.; Tanwar, Sudeep; Ganzha, Maria (eds.),
Englert, Marina; Madazio, Glaucya; Gielow, Ingrid; Lucero, Jorge; Behlau, Mara (2016). "Perceptual error identification of human and synthesized voices".
representations of their input texts, as processes for doing so are unreliable, poorly understood, and computationally ineffective. As a result, various
that generates high-quality voices from an assortment of fictional characters from a variety of media sources was released. Initial characters included
of the sentence, depending upon whether it is an affirmative, interrogative or exclamatory sentence. One of the techniques for pitch modification uses
power are especially limited. Because formant-based systems have complete control of all aspects of the output speech, a wide variety of prosodies and
337:, Hungary, described in a 1791 paper. This machine added models of the tongue and lips, enabling it to produce consonants as well as vowels. In 1837,
1980:. On one hand, online RSS-narrators simplify information delivery by allowing users to listen to their favourite news sources and to convert them to
industries, able to generate narration and lines of dialogue according to user specifications. The application reached maturity in 2008, when NEC
1182:(a term coined by this project), a sentence or phrase that conveys the emotion of the take that serves as a guide for the model during inference.
1018:. Creating proper intonation for these projects was painstaking, and the results have yet to be matched by real-time text-to-speech interfaces.
ambiguous abbreviations, while others provide the same result in all cases, resulting in nonsensical (and sometimes comical) outputs, such as "
and includes models of vocal frequency jitter and tremor, airflow noise and laryngeal asymmetries. The synthesizer has been used to mimic the
Until recently, articulatory synthesis models have not been incorporated into commercial speech synthesis systems. A notable exception is the
and the articulation processes occurring there. The first articulatory synthesizer regularly used for laboratory experiments was developed at
2314:, for example, includes tags related to speech recognition, dialogue management and touchtone dialing, in addition to text-to-speech markup.
734:. Each technology has strengths and weaknesses, and the intended uses of a synthesis system will typically determine which approach is used.
information together make up the symbolic linguistic representation that is output by the front-end. The back-end—often referred to as the
Jia, Ye; Zhang, Yu; Weiss, Ron J. (2018-06-12), "Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis",
segmenting the waveforms sometimes result in audible glitches in the output. There are three main sub-types of concatenative synthesis.
4793:, PDA's and MP3-Players, Proceedings of the 18th International Conference on Database and Expert Systems Applications, Pages: 575–579
Examples of non-real-time but highly accurate intonation control in formant synthesis include the work done in the late 1970s for the
Valle, Rafael (2020). "Mellotron: Multispeaker expressive voice synthesis by conditioning on rhythm, pitch and global style tokens".
caused speech synthesizers to become cheaper and more accessible, more people would benefit from the use of text-to-speech programs.
at MIT, and the Bell Labs system; the latter was one of the first multilingual language-independent systems, making extensive use of
puts it to sleep. Despite the success of purely electronic speech synthesis, research into mechanical speech-synthesizers continues.
255:—then converts the symbolic linguistic representation into sound. In certain systems, this part includes the computation of the
Muralishankar, R.; Ramakrishnan, A. G.; Prathibha, P. (February 2004). "Modification of pitch using DCT in the source domain".
to achieve text-to-speech synthesis, that can be made to sound almost like anybody from a speech sample of only 5 seconds.
2876:"A History of Realtime Digital Speech on Packet Networks: Part II of Linear Predictive Coding and the Internet Protocol"
regions of speech. In general, prosody remains a challenge for speech synthesizers, and is an active research topic.
Since 2005, however, some researchers have started to evaluate speech synthesis systems using a common speech dataset.
Muralishankar, R; Ramakrishnan, A.G.; Prathibha, P (2004). "Modification of Pitch using DCT in the Source Domain".
allow users to create video content involving AI avatars, who are made to speak using text-to-speech technology.
presented the work 'Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis', which
Articulatory synthesis consists of computational techniques for synthesizing speech based on models of the human
921:, many final consonants become no longer silent if followed by a word that begins with a vowel, an effect called
set to a "forced alignment" mode with some manual correction afterward, using visual representations such as the
synthesis does not use human speech samples at runtime. Instead, the synthesized speech output is created using
356:, which automatically analyzed speech into its fundamental tones and resonances. From his work on the vocoder,
1044:, Tom Baer, and Paul Mermelstein. This synthesizer, known as ASY, was based on vocal tract models developed at
of the units in the speech database is then created based on the segmentation and acoustic parameters like the
652:" programming technique to produce a synthesized speech waveform. Another early example, the arcade version of
5057:"Effect of Text-to-Speech and Human Reader on Listening Comprehension for Students with Learning Disabilities"
announced a web service that allows users to create phrases from the voices of characters from the Japanese
cannot be reproduced by a simple word-concatenation system, which would require additional complexity to be
2225:. It was driven only by a voice track as source data for the animation after the training phase to acquire
1192:, AI-assisted text-to-speech software, Speech Synthesis, which can produce lifelike speech by synthesizing
and wider facial information from training material consisting of 2D videos with audio had been completed.
can be output, conveying not just questions and statements, but a variety of emotions and tones of voice.
4253:, Lecture Notes in Networks and Systems, vol. 203, Singapore: Springer Singapore, pp. 557–566,
3733:
3285:
2221:
2017, an audio-driven digital look-alike of the upper torso of Barack Obama was presented by researchers from
597:
portable calculator for the blind in 1976. Other devices had primarily educational purposes, such as the
6777:
6334:
6073:
5956:
5717:
4726:
3597:
3561:
3363:
The MBROLA Project: Towards a set of high quality speech synthesizers of use for non commercial purposes
2447:
Speech synthesis is a valuable computational aid for the analysis and assessment of speech disorders. A
2413:
2200:
1781:
1753:
1668:
1464:
1398:
Speech synthesis systems use two basic approaches to determine the pronunciation of a word based on its
1096:
1092:
1056:
953:
846:
802:
654:
558:
330:
248:
221:
4424:
341:
produced a "speaking machine" based on von Kempelen's design, and in 1846, Joseph Faber exhibited the "
274:, some people tried to build machines to emulate human speech. Some early legends of the existence of "
5386:
3929:
3406:
3078:
2855:
1055:-based system originally developed and marketed by Trillium Sound Research, a spin-off company of the
7130:
6806:
6782:
6635:
6329:
6038:
5996:
5853:
5727:
5056:
3774:
3311:
2752:
2594:
2348:
2019:
1992:
1957:
1672:
1579:
1435:
1037:
493:
420:
375:
4012:
2585:
Rubin, P.; Baer, T.; Mermelstein, P. (1981). "An articulatory synthesizer for perceptual research".
1757:
7110:
7040:
6997:
6953:
6725:
6715:
6710:
6598:
5835:
5825:
5749:
5631:
3593:
3010:; Nebbia, Luciano (1 November 1995). "Interactive voice technology at work: The CSELT experience".
2441:
2327:
for people with visual impairment, but text-to-speech systems are now commonly used by people with
2207:
2186:
system with similar aims at the 2018 NeurIPS conference, though the result is rather unconvincing.
2172:
2133:
1961:
1892:
is a server-based package for voice synthesis and recognition. It is designed for network use with
1585:
1423:
determine their pronunciations based on their spellings. This is similar to the "sounding out", or
1253:
1211:
required, and the output of a speech synthesizer may sometimes contain tone sandhi errors.
1153:
1080:
981:
829:
various automated methods to detect unnatural segments in unit-selection speech synthesis systems.
590:
565:
512:
197:
3258:
1638:
was the first commercial all-software voice synthesis program. It was later used as the basis for
605:
in 1978. Fidelity released a speaking version of its electronic chess computer in 1979. The first
7120:
6992:
6857:
6620:
6603:
6461:
6349:
6339:
6250:
5812:
5754:
5722:
5336:"Now hear this: Voice cloning AI startup ElevenLabs nabs $ 19M from a16z and other heavy hitters"
5184:
5139:
5094:
4902:
4868:
4549:
4480:
4438:
4329:"Val Kilmer Gets His Voice Back After Throat Cancer Battle Using AI Technology: Hear the Results"
4272:
4212:
4159:
4131:
4102:
4063:
4045:
3823:
3670:
3007:
2421:
2332:
2296:
2211:
2143:
2083:
1883:
1867:
1729:
1241:
1116:
1104:
1084:
945:
874:
786:
371:
338:
202:
174:
139:
5202:
4978:
3174:
2335:
as well as by pre-literate children. They are also frequently employed to aid those with severe
5276:
5203:"Evolution of Reading Machines for the Blind: Haskins Laboratories" Research as a Case History"
5118:
2022 Wave Electronics and its Application in Information and Telecommunication Systems (WECONF)
4251:
Proceedings of Second International Conference on Computing, Communications, and Cyber-Security
4185:"Anticipating and addressing the ethical implications of deepfakes in the context of elections"
3844:
3479:
2690:("Mechanism of the human speech with description of its speaking machine", J. B. Degen, Wien).
7125:
6837:
6645:
6556:
6314:
6134:
5848:
5797:
5712:
5477:
5419:
5174:
5129:
5086:
4794:
4675:
4361:
4303:
4262:
4204:
4149:
4090:
3962:
3790:
3715:
3627:
3414:
3262:
3229:
3154:
3042:
2900:
2768:
2629:
2565:
2530:
2429:
2336:
2168:
1855:
1498:
1490:
1424:
1175:
1045:
996:
758:
602:
481:
342:
271:
170:
154:
119:
5411:
4954:"An artificial-intelligence first: Voice-mimicking software reportedly used in a major theft"
3903:
3605:
2649:
Van Santen, J. (April 1994). "Assignment of segmental duration in text-to-speech synthesis".
589:
electronics featuring speech synthesis began emerging in the 1970s. One of the first was the
411:
The first computer-based speech-synthesis systems originated in the late 1950s. Noriko Umeda
6887:
6862:
6663:
6566:
6189:
6139:
5759:
5608:
5166:
5121:
5076:
5068:
4618:
4541:
4514:
4472:
4254:
4196:
4141:
4055:
3782:
3707:
3662:
3388:
3250:
3217:
3019:
2890:
2809:
2760:
2658:
2602:
2515:
2385:
2307:. Although each of these was proposed as a standard, none of them have been widely adopted.
2252:
1893:
1749:
1717:
1664:
1321:
922:
841:(sound-to-sound transitions) occurring in a language. The number of diphones depends on the
820:
Unit selection provides the greatest naturalness, because it applies only a small amount of
798:
477:
448:
379:
178:
4791:
The Pediaphon – Speech Interface to the free Wikipedia Encyclopedia for Mobile Phones
4184:
2358:
2193:
researchers know of three cases where digital sound-alike technology has been used for crime.
7114:
7075:
7070:
6938:
6668:
6541:
6516:
6498:
6033:
5820:
4574:
3798:
3524:
3114:
2862:
2407:
2401:
2362:
2344:
2284:
2247:
2035:
which uses diphone-based synthesis, as well as more modern and better-sounding techniques.
1849:
1207:
For tonal languages, such as Chinese or Taiwanese, there are different levels of
977:
918:
870:
663:
548:
500:
during the 1970s. LPC was later the basis for early speech synthesizer chips, such as the
404:
283:
4246:
3251:
2003:
was created in 2006 to provide a similar web-based TTS interface to Wikipedia.
909:
is usually only pronounced when the following word has a vowel as its first letter (e.g.
610:
5000:
4836:
3778:
2934:
Proceedings of the 5th International Conference on Spoken Language Processing (ICSLP'98)
2756:
2598:
6822:
6802:
6526:
6380:
5445:"Arrested Succession Parody On YouTube Features 'Narration' By AI-Generated Ron Howard"
5361:"Sztuczna inteligencja czyta głosem Jarosława Kuźniara. Rewolucja w radiu i podcastach"
4671:
3325:
Automatic Detection of Unnatural Word-Level Segments in Unit-Selection Speech Synthesis
2926:
2688:
Mechanismus der menschlichen Sprache nebst der Beschreibung seiner sprechenden Maschine
2622:
2620:
van Santen, Jan P. H.; Sproat, Richard W.; Olive, Joseph P.; Hirschberg, Julia (1997).
2196:
This increases the stress on the disinformation situation coupled with the facts that
2190:
2179:
2087:
2071:
1973:
1745:
1721:
1297:, like examining neighboring words and using statistics about frequency of occurrence.
1237:
1228:
1221:
985:
489:
424:
384:
166:
and other human voice characteristics to create a completely "synthetic" voice output.
4296:"AI gave Val Kilmer his voice back. But critics worry the technology could be misused"
3954:
2957:
2299:
in 2004. Older speech synthesis markup languages include Java Speech Markup Language (
2010:
through the W3C Audio Incubator Group with the involvement of The BBC and Google Inc.
7176:
7085:
6897:
6877:
6658:
6400:
6390:
6309:
6053:
5903:
5307:"『Portal』のGLaDOSや『UNDERTALE』のサンズがテキストを読み上げてくれる。文章に込められた感情まで再現することを目指すサービス「15.ai」が話題に"
5188:
5143:
5125:
5098:
4700:
4276:
4216:
4163:
4067:
4037:
3904:"Generative AI comes for cinema dubbing: Audio AI startup ElevenLabs raises pre-seed"
3875:"『Portal』のGLaDOSや『UNDERTALE』のサンズがテキストを読み上げてくれる。文章に込められた感情まで再現することを目指すサービス「15.ai」が話題に"
3651:
3537:
3298:
3281:
3225:
3023:
2558:
2448:
2324:
2183:
2075:
1965:
1601:
1532:
1482:
1419:
1193:
1189:
1167:
973:
814:
806:
668:
473:
4553:
4484:
4145:
4059:
3674:
3341:
2835:
2310:
Speech synthesis markup languages are distinguished from dialogue markup languages.
7065:
6395:
6344:
6184:
6013:
4353:
3601:
2266:
2119:
1897:
1879:
1777:
1605:
1278:
1041:
1007:
842:
636:
431:
computer to synthesize speech, an event among the most prominent in the history of
357:
279:
193:
5527:"AI-Generated Voice Firm Clamps Down After 4chan Makes Celebrity Voices for Abuse"
5170:
5163:
2021 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI)
5157:
Zhao, Yunxin; Song, Minguang; Yue, Yanghao; Kuruvilla-Dugdale, Mili (2021-07-27).
5072:
3666:
3324:
2719:
1812:
The second operating system to feature advanced speech synthesis capabilities was
865:. or more recent techniques such as pitch modification in the source domain using
4518:
4476:
4258:
4038:"Probing the phonetic and phonological knowledge of tones in Mandarin TTS models"
3711:
3392:
3096:
2985:
76:
7022:
6902:
6615:
6531:
6508:
6456:
6324:
6240:
6048:
5927:
5685:
5306:
4385:
3874:
3589:
2147:
1996:
1969:
1765:
1752:) there was only one standard voice shipping with Mac OS X. Starting with 10.6 (
1411:
1337:
1309:
1208:
1088:
1033:
1011:
794:
613:
444:
310:
287:
275:
268:
259:(pitch contour, phoneme durations), which is then imposed on the output speech.
163:
5158:
5113:
4921:
4123:
6625:
6411:
6265:
6230:
6129:
6119:
6058:
5648:
5640:
4545:
3541:
3494:
3118:
2271:
2256:
2189:
By 2019, digital sound-alikes had found their way into the hands of criminals, as
2060:
2056:
1875:
1871:
1249:
1185:
1060:
684:
606:
524:
440:
153:. Systems differ in the size of the stored speech units; a system that stores
95:
5481:
5423:
5090:
4365:
4307:
4208:
4200:
4128:
2020 IEEE International Conference on Multimedia & Expo Workshops (ICMEW)
4094:
3966:
3418:
2904:
1269:
The process of normalizing text is rarely straightforward. Texts are full of
6493:
6260:
6104:
6083:
6078:
5932:
5613:
4887:
Arık, Sercan Ö.; Chen, Jitong; Peng, Kainan; Ping, Wei; Zhou, Yanqi (2018),
4748:"How to configure and use Text-to-Speech in Windows XP and in Windows Vista"
4663:
4013:"ElevenLabs' Powerful New AI Tool Lets You Make a Full Audiobook in Minutes"
3988:"Voice-generating platform ElevenLabs raises $ 19M, launches detection tool"
3786:
2927:"The Distance Measure for Line Spectrum Pairs Applied to Speech Recognition"
2464:
2456:
2115:
2094:
2038:
1821:
1741:
1737:
1733:
1725:
1639:
1294:
1290:
1286:
1100:
1064:
957:
878:
618:
497:
432:
361:
349:
334:
5112:
Triandafilidi, Ioanis I.; Tatarnikova, T. M.; Poponin, A. S. (2022-05-30).
4999:
Suwajanakorn, Supasorn; Seitz, Steven; Kemelmacher-Shlizerman, Ira (2017),
4568:
TI will exit dedicated speech-synthesis chips, transfer products to Sensory
3719:
3312:
Language Generation and Speech Synthesis in Dialogues for Language Learning
2662:
2100:
Some models of Texas Instruments home computers produced in 1979 and 1981 (
1121:
Sinewave synthesis is a technique for synthesizing speech by replacing the
785:. Typically, the division into segments is done using a specially modified
5576:
4811:
3794:
2875:
2772:
2365:
was one of the most famous people to use a speech computer to communicate.
2108:
to embed complete spoken phrases into applications, primarily video games.
1988:
receiver, and listen to them while walking, jogging or commuting to work.
1801:
Example of speech synthesis with the included Say utility in Workbench 1.3
415:
developed the first general English text-to-speech system in 1968, at the
6968:
6948:
6933:
6912:
6882:
6827:
6792:
6673:
6291:
6235:
6154:
6149:
6068:
6028:
5922:
5602:
4747:
3661:. Lyon, France: International Speech Communication Association: 587–591.
2328:
2311:
2234:
2226:
2218:
2101:
1679:
1518:
1415:
1403:
1399:
1122:
1048:
in the 1960s and 1970s by Paul Mermelstein, Cecil Coker, and colleagues.
965:
825:
790:
770:
766:
645:
586:
459:
388:
242:
150:
130:) system converts normal language text into speech; other systems render
115:
5654:
5081:
569:
Fidelity Voice Chess Challenger (1979), the first talking chess computer
7105:
6963:
6943:
6817:
6561:
6476:
6159:
6018:
5951:
4926:
3581:
2895:
2478:
Music technology (electronic and digital) § Vocal synthesis after 2010s
2371:
2079:
2064:
1985:
1981:
1833:
1829:
1813:
1494:
1407:
941:
838:
762:
729:
723:
The two primary technologies generating synthetic speech waveforms are
627:
544:
436:
428:
353:
322:
298:
294:
184:
158:
2467:
speakers with controlled levels of roughness, breathiness and strain.
1805:
557:
539:
DECtalk demo recording using the Perfect Paul and Uppity Ursula voices
236:. The process of assigning phonetic transcriptions to words is called
6471:
6466:
6375:
6245:
5972:
5918:
5635:
2764:
2606:
2460:
2389:
2352:
2242:
2164:
2137:
2129:
2026:
1909:
1598:
1274:
862:
778:
229:
225:
107:
3503:
Proceedings ESCA-NATO Workshop and Applications of Speech Technology
2788:"Louis Gerstman, 61, a Specialist In Speech Disorders and Processes"
2743:
Klatt, D (1987). "Review of text-to-speech conversion for English".
1549:
Popular systems offering speech synthesis as a built-in capability.
1083:, also called Statistical Parametric Synthesis. In this system, the
837:
Diphone synthesis uses a minimal speech database containing all the
5864:
5277:"ゲームキャラ音声読み上げソフト「15.ai」公開中。『Undertale』や『Portal』のキャラに好きなセリフを言ってもらえる"
4907:
4873:
4136:
4050:
3845:"ゲームキャラ音声読み上げソフト「15.ai」公開中。『Undertale』や『Portal』のキャラに好きなセリフを言ってもらえる"
3828:
894:
unless the many variations are taken into account. For example, in
488:(NTT) in 1966. Further developments in LPC technology were made by
7161:
6797:
6385:
6286:
6214:
6114:
6088:
6063:
5982:
5607:(Master of Music Music thesis). Florida International University.
3361:
T. Dutoit, V. Pagel, N. Pierret, F. Bataille, O. van der Vrecken.
3194:"Ann Syrdal, Who Helped Give Computers a Female Voice, Dies at 74"
2678:, Helsinki University of Technology, Retrieved on November 4, 2006
2375:
2357:
2304:
2238:
2105:
1913:
1825:
1791:
1702:
1689:
1649:
1620:
1562:
1508:
1497:
index applied on the integrated linear prediction residual of the
1159:
1138:
961:
858:
809:), duration, position in the syllable, and neighboring phones. At
572:
564:
529:
398:
314:
192:
A text-to-speech system (or "engine") is composed of two parts: a
4979:"Face2Face: Real-time Face Capture and Reenactment of RGB Videos"
2347:
which incorporated text-to-phonetics software based on work from
2097:
incorporated the Texas Instruments TMS5220 speech synthesis chip.
849:
of a sentence is superimposed on these minimal units by means of
6683:
6144:
4833:"Smithsonian Speech Synthesis History Project (SSSHP) 1986–2002"
4639:
2961:
2300:
2287:
have been established for the rendition of text as speech in an
1231:(also known as voice cloning or deepfake audio) is a product of
1052:
1004:
774:
6415:
5868:
5658:
4588:"1400XL/1450XL Speech Handler External Reference Specification"
648:, for which the game's developer, Hiroshi Suzuki, developed a "
6958:
5644:
4931:
4399:
4232:"Deepfake Audio Boom Exploits One Billion-Dollar Startup's AI"
3737:
2832:"Where "HAL" First Spoke (Bell Labs Speech Synthesis website)"
2288:
2206:
2D video forgery techniques were presented in 2016 that allow
2111:
2067:
and Open Sesame), also had games utilizing software synthesis.
2007:
1977:
1331:
706:
The most important qualities of a speech synthesis system are
387:
and colleagues discovered acoustic cues for the perception of
305:
won the first prize in a competition announced by the Russian
3758:
Remez, R.; Rubin, P.; Pisoni, D.; Carrell, T. (22 May 1981).
3039:
Multilingual Text-to-Speech Synthesis: The Bell Labs Approach
2556:
Allen, Jonathan; Hunnicutt, M. Sharon; Klatt, Dennis (1987).
177:
to listen to written words on a home computer. Many computer
4981:. Proc. Computer Vision and Pattern Recognition (CVPR), IEEE
1804:
1293:
techniques are used to guess the proper way to disambiguate
58:
4423:. University of Portsmouth. January 9, 2008. Archived from
1712:
MacinTalk 2 demo featuring the Mr. Hughes and Marvin voices
1675:
to enable World English Spelling text-to-speech synthesis.
1478:
683:
Synthesized voices typically sounded male until 1990, when
5120:. St. Petersburg, Russian Federation: IEEE. pp. 1–5.
2140:
and others use speech synthesis for automobile navigation.
1816:, introduced in 1985. The voice synthesis was licensed by
1148:
Speech synthesis example using the HiFi-GAN neural vocoder
3955:"This Podcast Is Not Hosted by AI Voice Clones. We Swear"
181:
have included speech synthesizers since the early 1990s.
86:
A synthetic voice announcing an arriving train in Sweden.
5255:"Code Geass Speech Synthesizer Service Offered in Japan"
4723:"Accessibility Tutorials for Windows XP: Using Narrator"
3286:
Perfect synthesis for all of the people all of the time.
2703:
Mattingly, Ignatius G. (1974). Sebeok, Thomas A. (ed.).
2451:
synthesizer, developed by Jorge C. Lucero et al. at the
1748:(10.4). During 10.4 (Tiger) and first releases of 10.5 (
1682:
computers were sold with "stspeech.tos" on floppy disk.
4124:"Deepfake Detection: Current Challenges and Next Steps"
2705:"Speech synthesis for phonetic and phonological models"
1300:
Recently TTS systems have begun to use HMMs (discussed
1099:) of speech are modeled simultaneously by HMMs. Speech
360:
developed a keyboard-operated voice-synthesizer called
34:
5412:"Generative AI Podcasts Are Here. Prepare to Be Bored"
5114:"Speech Synthesis System for People with Disabilities"
4439:"Smile – And The World Can Hear You, Even If You Hide"
4083:
Weaponised deep fakes: National security and democracy
1820:
from SoftVoice, Inc., who also developed the original
1764:
application that converts text to audible speech. The
968:
of artificial speech. This method is sometimes called
110:. A computer system used for this purpose is called a
5159:"Personalizing TTS Voices for Progressive Dysarthria"
4458:"The vocal communication of different kinds of smile"
4183:
Diakopoulos, Nicholas; Johnson, Deborah (June 2020).
3484:. World Future Society. 1978. pp. 359, 360, 361.
3338:"Pitch-Synchronous Overlap and Add (PSOLA) Synthesis"
3257:. Cambridge, UK: Cambridge University Press. p.
3138:. Vol. 11, no. 2. pp. 134-175 (160-3).
1663:
Arguably, the first speech system integrated into an
1402:, a process which is often called text-to-phoneme or
5641:
Simulated singing with the singing robot Pavarobotti
4354:"AI-Generated Voice Deepfakes Aren't Scary Good—Yet"
3439:"1960 - Rudy the Robot - Michael Freeman (American)"
1964:
and gadgets that can read messages directly from an
7139:
7094:
7049:
7021:
6981:
6926:
6848:
6836:
6767:
6724:
6696:
6644:
6507:
6449:
6358:
6300:
6274:
6223:
6172:
6097:
6006:
5995:
5965:
5911:
5902:
5834:
5811:
5773:
5736:
5551:
5055:Brunow, David A.; Cullen, Theresa A. (2021-07-03).
4081:Smith, Hannah; Mansted, Katherine (April 1, 2020).
3760:"Speech perception without traditional speech cues"
2161:
Conference on Neural Information Processing Systems
1991:A growing field in Internet based TTS is web-based
1870:. SAPI 4.0 was available as an optional add-on for
1079:HMM-based synthesis is a synthesis method based on
345:". In 1923, Paget resurrected Wheatstone's design.
5211:Journal of Rehabilitation Research and Development
4697:"Translator Library (Multilingual-speech version)"
3650:Lucero, J. C.; Schoentgen, J.; Behlau, M. (2013).
3497:, J.L. Gauvain, B. Prouts, C. Bouhier, R. Boesch.
3323:William Yang Wang and Kallirroi Georgila. (2011).
3153:. Vol. 1. SMG Szczepaniak. pp. 544–615.
2856:Anthropomorphic Talking Robot Waseda-Talker Series
2621:
2557:
1572:TI-99/4A speech demo using the built-in vocabulary
1285:Most text-to-speech (TTS) systems do not generate
582:Speech output from Fidelity Voice Chess Challenger
220:to each word, and divides and marks the text into
5387:"AI Can Clone Your Favorite Podcast Host's Voice"
4894:Advances in Neural Information Processing Systems
4860:Advances in Neural Information Processing Systems
3930:"AI Can Clone Your Favorite Podcast Host's Voice"
2980:
2978:
2432:using 15.ai and external voice control software.
1671:computers. These used the Votrax SC01 chip and a
543:Dominant systems in the 1980s and 1990s were the
5552:"Usage of text-to-speech in AI video generation"
5002:Synthesizing Obama: Learning Lip Sync from Audio
3132:"The Replay Years: Reflections from Eddie Adlum"
1608:Voice Synthesis module in 1982. It included the
1125:(main bands of energy) with pure tone whistles.
1103:are generated from HMMs themselves based on the
403:Computer and speech synthesizer housing used by
364:(Voice Demonstrator), which he exhibited at the
5230:"Speech Synthesis Software for Anime Announced"
2424:allows for interaction with mobile devices via
2006:Other work is being done in the context of the
149:pieces of recorded speech that are stored in a
4770:"An introduction to Text-To-Speech in Android"
3652:"Physics-based synthesis of disordered voices"
3499:Generation and Synthesis of Broadcast Messages
3151:The Untold History of Japanese Game Developers
2718:. Mouton, The Hague: 2451–2487. Archived from
1936:are available through third-party publishers.
1513:A speech synthesis kit produced by Bell System
6427:
5880:
5670:
1923:Votrax Type 'N Talk speech synthesizer (1980)
8:
4922:"Fake voices 'help cyber-crooks steal cash'"
4534:IEEE Trans. Audio Speech Language Processing
3610:Escape from the Planet of the Robot Monsters
3179:Smithsonian Speech Synthesis History Project
2925:Zheng, F.; Song, Z.; Li, L.; Yu, W. (1998).
2745:Journal of the Acoustical Society of America
2587:Journal of the Acoustical Society of America
2428:interfaces. Some users have also created AI
1736:has evolved into a fully supported program,
1312:is frequently difficult in these languages.
3087:: "Talking electronic game", April 27, 1982
2676:History and Development of Speech Synthesis
2041:which uses articulatory synthesis from the
1995:, e.g. 'Browsealoud' from a UK company and
1866:components to support speech synthesis and
1744:was for the first time featured in 2005 in
1716:The first speech system integrated into an
1577:with speech during this promotion included
1324:" being rendered as "Ulysses South Grant".
321:notation: , , , and ). There followed the
18:
6845:
6641:
6434:
6420:
6412:
6003:
5908:
5887:
5873:
5865:
5677:
5663:
5655:
5470:"Can A.I. Be Funny? This Troupe Thinks So"
4619:"It Sure Is Great To Get Out Of That Bag!"
2988:. IEEE Global History Network. 20 May 2009
2146:produced a music synthesizer in 1999, the
2029:which supports a broad range of languages.
1948:added support for speech synthesis (TTS).
462:computer sings the same song as astronaut
5612:
5080:
4906:
4889:"Neural Voice Cloning with a Few Samples"
4872:
4135:
4049:
3827:
3546:Star Trek: Strategic Operations Simulator
3314:, masters thesis, Section 5.6 on page 54.
2894:
609:to feature speech synthesis was the 1980
1976:. Some specialized software can narrate
1301:
964:levels are varied over time to create a
511:In 1975, Fumitada Itakura developed the
247:conversion. Phonetic transcriptions and
183:
7188:Applications of artificial intelligence
4725:. Microsoft. 2011-01-29. Archived from
4195:(7) (published 2020-06-05): 2072–2098.
3734:"The HMM-based Speech Synthesis System"
2952:
2950:
2548:
2381:Code Geass: Lelouch of the Rebellion R2
694:Kurzweil predicted in 2005 that as the
20:
5165:. Athens, Greece: IEEE. pp. 1–4.
4100:
2560:From Text to Speech: The MITalk system
2345:Kurzweil Reading Machine for the Blind
2291:-compliant format. The most recent is
2102:Texas Instruments TI-99/4 and TI-99/4A
435:. Kelly's voice recorder synthesizer (
106:is the artificial production of human
93:
7208:History of human–computer interaction
5649:how the robot synthesized the singing
4087:Australian Strategic Policy Institute
3693:
3691:
3645:
3643:
3622:John Holmes and Wendy Holmes (2001).
2384:. 15.ai has been frequently used for
2351:and a black-box synthesizer built by
1473:by Amy Drahota and colleagues at the
547:system, based largely on the work of
476:, began development with the work of
307:Imperial Academy of Sciences and Arts
145:Synthesized speech can be created by
51:Artificial production of human speech
7:
6893:Simple Knowledge Organization System
4327:Etienne, Vanessa (August 19, 2021).
3736:. Hts.sp.nitech.ac.j. Archived from
3578:Indiana Jones and the Temple of Doom
2536:Text to speech in digital television
1667:was the circa 1983 unreleased Atari
1414:to describe distinctive sounds in a
1152:Deep learning speech synthesis uses
138:into speech. The reverse process is
5305:Yoshiyuki, Furushima (2021-01-18).
4421:"Smile -and the world can hear you"
3873:Yoshiyuki, Furushima (2021-01-18).
3006:Billi, Roberto; Canavesio, Franco;
2395:My Little Pony: Friendship Is Magic
2262:My Little Pony: Friendship Is Magic
2118:included VoiceType, a precursor to
1740:, for people with vision problems.
443:", with musical accompaniment from
132:symbolic linguistic representations
6205:Texas Instruments LPC Speech Chips
5604:Vocal Synthesis and Deep Listening
5385:Ashworth, Boone (April 12, 2023).
5257:. Animenewsnetwork.com. 2008-09-09
4695:Devitt, Francesco (30 June 1995).
4230:Murphy, Margi (20 February 2024).
3928:Ashworth, Boone (April 12, 2023).
3468:. New York Media, LLC. 1979-07-30.
3108:Gaming's most important evolutions
1559:Texas Instruments LPC Speech Chips
1539:Texas Instruments LPC Speech Chips
502:Texas Instruments LPC Speech Chips
391:segments (consonants and vowels).
327:acoustic-mechanical speech machine
49:
6908:Thesaurus (information retrieval)
4772:. Android-developers.blogspot.com
2786:Lambert, Bruce (March 21, 1992).
2501:Comparison of speech synthesizers
2279:Speech synthesis markup languages
1956:Currently, there are a number of
1170:model used by the application is
313:that could produce the five long
309:for models he built of the human
6282:Speech Synthesis Markup Language
5943:Festival Speech Synthesis System
5126:10.1109/WECONF55058.2022.9803600
4835:. Mindspring.com. Archived from
4768:Jean-Michel Trivi (2009-09-23).
3624:Speech Synthesis and Recognition
3407:"Education: Marvel of The Bronx"
3175:"A Short History of Computalker"
2943:from the original on 2022-10-09.
2914:from the original on 2022-10-09.
2476:This section is an excerpt from
2293:Speech Synthesis Markup Language
2033:Festival Speech Synthesis System
2022:systems are available, such as:
1427:, approach to learning reading.
1336:
1220:This section is an excerpt from
662:produced the first multi-player
188:Overview of a typical TTS system
74:
6044:Microsoft text-to-speech voices
5601:Bruno, Chelsea A (2014-03-25).
5317:from the original on 2021-01-18
5287:from the original on 2021-01-19
4668:Amiga Hardware Reference Manual
4146:10.1109/icmew46912.2020.9105991
4060:10.21437/speechprosody.2020-190
3885:from the original on 2021-01-18
3855:from the original on 2021-01-19
2986:"Fumitada Itakura Oral History"
1455:Prosodics and emotional content
319:International Phonetic Alphabet
303:Christian Gottlieb Kratzenstein
6489:Natural language understanding
5577:"AI Text to speech for videos"
5027:"Voice Cloning for the Masses"
4388:. World Wide Web Organization.
3130:Adlum, Eddie (November 1985).
2651:Computer Speech & Language
2564:. Cambridge University Press.
2341:voice output communication aid
1659:Atari ST speech synthesis demo
1135:Deep learning speech synthesis
1095:(voice source), and duration (
486:Nippon Telegraph and Telephone
1:
7013:Optical character recognition
5275:Kurosawa, Yuki (2021-01-19).
5171:10.1109/BHI50953.2021.9508522
5073:10.1080/07380569.2021.1953362
4107:: CS1 maint: date and year (
3843:Kurosawa, Yuki (2021-01-19).
3667:10.21437/Interspeech.2013-161
2712:Current Trends in Linguistics
2128:Navigation units produced by
1720:that shipped in quantity was
1545:Hardware and software systems
1524:General Instrument SP0256-AL2
1265:Text normalization challenges
1129:Deep learning-based synthesis
419:in Japan. In 1961, physicist
267:Long before the invention of
216:. The front-end then assigns
6706:Multi-document summarization
4952:Drew, Harwell (2019-09-04).
4519:10.1016/j.specom.2003.05.001
4477:10.1016/j.specom.2007.10.001
4259:10.1007/978-981-16-0733-2_39
3986:Wiggers, Kyle (2023-06-20).
3712:10.1016/j.jvoice.2015.07.017
3393:10.1016/j.specom.2003.05.001
3024:10.1016/0167-6393(95)00030-R
2883:Found. Trends Signal Process
2806:"Arthur C. Clarke Biography"
2624:Progress in Speech Synthesis
2339:usually through a dedicated
1461:Emotional speech recognition
950:physical modelling synthesis
658:, also dates from 1980. The
644:), released in 1980 for the
114:, and can be implemented in
7036:Latent Dirichlet allocation
7008:Natural language generation
6873:Machine-readable dictionary
6868:Linguistic Linked Open Data
6443:Natural language processing
5468:Fadulu, Lola (2023-07-06).
5033:. The Batch. Archived from
3037:Sproat, Richard W. (1997).
2834:. Bell Labs. Archived from
2511:Orca (assistive technology)
2455:, simulates the physics of
2426:natural language processing
2163:(NeurIPS) researchers from
1531:DT1050 Digitalker (Mozer –
1188:is primarily known for its
553:natural language processing
417:Electrotechnical Laboratory
7224:
6788:Explicit semantic analysis
6537:Deep linguistic processing
5643:or a description from the
5367:(in Polish). April 9, 2023
3149:Szczepaniak, John (2014).
2475:
1907:
1847:
1556:
1458:
1328:Text-to-phoneme challenges
1219:
1178:of a generated line using
1132:
1114:
1025:
741:
691:, created a female voice.
689:AT&T Bell Laboratories
634:with speech synthesis was
366:1939 New York World's Fair
7203:Computational linguistics
6631:Word-sense disambiguation
6484:Computational linguistics
6210:General Instrument SP0256
5693:
5025:Ng, Andrew (2020-04-01).
4674:Publishing Company, Inc.
4546:10.1109/TASL.2013.2273717
2958:"List of IEEE Milestones"
1604:game console offered the
1487:discrete cosine transform
1180:emotional contextualizers
1003:, and in the early 1980s
885:Domain-specific synthesis
867:discrete cosine transform
851:digital signal processing
822:digital signal processing
7157:Natural Language Toolkit
7081:Pronunciation assessment
6983:Automatic identification
6813:Latent semantic analysis
6769:Distributional semantics
6654:Compound-term processing
6552:Named-entity recognition
6024:Software Automatic Mouth
5061:Computers in the Schools
5007:University of Washington
4644:Amazon Web Services, Inc
4201:10.1177/1461444820925811
4036:Zhu, Jian (2020-05-25).
3253:Text-to-speech synthesis
2874:Gray, Robert M. (2010).
2526:Speech-generating device
2496:Chinese speech synthesis
2223:University of Washington
2043:Free Software Foundation
1858:desktop systems can use
1636:Software Automatic Mouth
1630:A demo of SAM on the C64
1475:University of Portsmouth
1406:-to-phoneme conversion (
898:dialects of English the
855:linear predictive coding
753:Unit selection synthesis
702:Synthesizer technologies
591:Telesensory Systems Inc.
470:Linear predictive coding
31:latest accepted revision
7061:Automated essay scoring
7031:Document classification
6698:Automatic summarization
6371:Concatenative synthesis
6256:Microsoft Speech Server
6125:NIAONiao Virtual Singer
5614:10.25148/etd.fi14040802
4750:. Microsoft. 2007-05-07
4247:"Deepfake: An Overview"
4189:New Media & Society
4044:. ISCA: ISCA: 930–934.
3787:10.1126/science.7233191
3574:The Empire Strikes Back
3288:IEEE TTS Workshop 2002.
3222:The Singularity is Near
3192:CadeMetz (2020-08-20).
2521:Silent speech interface
2295:(SSML), which became a
2237:web application called
1890:Microsoft Speech Server
1818:Commodore International
1634:Also released in 1982,
1469:A study in the journal
1233:artificial intelligence
1014:arcade games using the
948:and an acoustic model (
873:, that was invented by
744:Concatenative synthesis
738:Concatenation synthesis
725:concatenative synthesis
666:using voice synthesis,
218:phonetic transcriptions
136:phonetic transcriptions
6918:Universal Dependencies
6611:Terminology extraction
6594:Semantic decomposition
6589:Semantic role labeling
6579:Part-of-speech tagging
6547:Information extraction
6532:Coreference resolution
6522:Collocation extraction
6366:Articulatory synthesis
6320:Franklin Seaney Cooper
4977:Thies, Justus (2016).
4666:; et al. (1991).
3706:(5): 639.e17–639.e23.
3097:Voice Chess Challenger
2663:10.1006/csla.1994.1005
2506:List of screen readers
2453:University of Brasília
2366:
2178:Also researchers from
1931:Text-to-speech systems
1924:
1809:
1802:
1713:
1700:
1660:
1631:
1573:
1529:National Semiconductor
1514:
1489:in the source domain (
1149:
1028:Articulatory synthesis
1022:Articulatory synthesis
952:). Parameters such as
696:cost-performance ratio
660:Milton Bradley Company
632:personal computer game
583:
570:
562:
540:
439:) recreated the song "
408:
374:and his colleagues at
372:Dr. Franklin S. Cooper
189:
70:Automatic announcement
63:
6679:Sentence segmentation
6335:Wolfgang von Kempelen
6115:CeVIO Creative Studio
6074:CeVIO Creative Studio
5957:Automatik Text Reader
5803:Karplus–Strong string
3626:(2nd ed.). CRC.
3249:Taylor, Paul (2009).
3067:Gevaryahu, Jonathan,
2414:SpongeBob SquarePants
2361:
2214:in existing 2D video.
2201:Human image synthesis
2090:, and the Bebook Neo.
1968:and web pages from a
1922:
1808:
1800:
1782:Software as a Service
1711:
1698:
1658:
1629:
1571:
1512:
1465:Prosody (linguistics)
1443:Evaluation challenges
1147:
1093:fundamental frequency
1057:University of Calgary
1010:machines and in many
970:rules-based synthesis
954:fundamental frequency
803:fundamental frequency
599:Speak & Spell toy
581:
568:
561:
538:
455:2001: A Space Odyssey
402:
331:Wolfgang von Kempelen
187:
62:
7193:Assistive technology
7131:Voice user interface
6842:datasets and corpora
6783:Document-term matrix
6636:Word-sense induction
6330:Haskins Laboratories
6039:Microsoft Speech API
5854:Software synthesizer
5698:Frequency modulation
4507:Speech Communication
4465:Speech Communication
4456:Drahota, A. (2008).
4400:"Blizzard Challenge"
3381:Speech Communication
3344:on February 22, 2007
3012:Speech Communication
2812:on December 11, 1997
2349:Haskins Laboratories
2333:reading disabilities
2173:speaker verification
2155:Digital sound-alikes
2020:open-source software
1993:assistive technology
1784:in AWS (from 2017).
1673:finite state machine
1471:Speech Communication
1436:phonemic orthography
1410:is the term used by
1360:improve this section
1254:voice-authentication
1154:deep neural networks
1081:hidden Markov models
1040:in the mid-1970s by
1038:Haskins Laboratories
672:, in the same year.
494:Manfred R. Schroeder
421:John Larry Kelly, Jr
376:Haskins Laboratories
175:reading disabilities
7111:Interactive fiction
7041:Pachinko allocation
6998:Speech segmentation
6954:Google Ngram Viewer
6726:Machine translation
6716:Text simplification
6711:Sentence extraction
6599:Semantic similarity
5836:Digital synthesizer
4703:on 26 February 2012
4122:Lyu, Siwei (2020).
4042:Speech Prosody 2020
3779:1981Sci...212..947R
3606:Vindicators Part II
3517:Music and Computers
3514:Dartmouth College:
3008:Ciaramella, Alberto
2757:1987ASAJ...82..737K
2599:1981ASAJ...70..321R
2059:(Baseball) and the
1164:multi-speaker model
1075:HMM-based synthesis
879:the Bronx, New York
853:techniques such as
622:(known in Japan as
513:line spectral pairs
484:and Shuzo Saito of
21:Page version status
7121:Question answering
6993:Speech recognition
6858:Corpus linguistics
6838:Language resources
6621:Textual entailment
6604:Sentiment analysis
6340:Ignatius Mattingly
5813:Analog synthesizer
5775:Physical modelling
5533:. January 30, 2023
5501:Kanetkar, Riddhi.
5474:The New York Times
5234:Anime News Network
4789:Andreas Bischoff,
4573:2012-05-28 at the
4386:"Speech synthesis"
4352:Newman, Lily Hay.
4089:. pp. 11–13.
3910:. January 23, 2023
3566:Return of the Jedi
3523:2011-06-08 at the
3198:The New York Times
3113:2011-06-15 at the
2896:10.1561/2000000036
2861:2016-03-04 at the
2792:The New York Times
2430:virtual assistants
2422:speech recognition
2367:
2297:W3C recommendation
2212:facial expressions
2210:counterfeiting of
2169:transfers learning
2084:PocketBook eReader
1925:
1868:speech recognition
1810:
1803:
1762:command-line based
1730:speech recognition
1714:
1701:
1661:
1632:
1574:
1515:
1505:Dedicated hardware
1242:speech translation
1150:
1117:Sinewave synthesis
1111:Sinewave synthesis
1105:maximum likelihood
1085:frequency spectrum
946:additive synthesis
875:Michael J. Freeman
624:Speak & Rescue
584:
571:
563:
541:
447:. Coincidentally,
423:and his colleague
409:
395:Electronic devices
339:Charles Wheatstone
203:text normalization
190:
171:visual impairments
140:speech recognition
112:speech synthesizer
64:
27:
7198:Auditory displays
7170:
7169:
7126:Virtual assistant
7051:Computer-assisted
6977:
6976:
6734:Computer-assisted
6692:
6691:
6684:Word segmentation
6646:Text segmentation
6584:Semantic analysis
6572:Syntactic parsing
6557:Ontology learning
6409:
6408:
6315:Catherine Browman
6168:
6167:
5991:
5990:
5978:Lyricos / Flinger
5862:
5861:
5849:Scanned synthesis
5788:Digital waveguide
5703:Linear arithmetic
5180:978-1-6654-0358-0
5135:978-1-6654-7083-4
4681:978-0-201-56776-2
4577:." June 14, 2001.
4540:(12): 2471–2480.
4268:978-981-16-0732-5
4155:978-1-7281-1485-9
3773:(4497): 947–949.
3633:978-0-7484-0856-6
3556:Examples include
3536:Examples include
3505:, September 1993.
3465:New York Magazine
3443:cyberneticzoo.com
3367:ICSLP Proceedings
3327:, IEEE ASRU 2011.
3297:John Kominek and
3235:978-0-14-303788-0
3218:Kurzweil, Raymond
3048:978-0-7923-8027-6
2635:978-0-387-94701-3
2571:978-0-521-30641-6
2531:Speech processing
2471:Singing synthesis
2337:speech impairment
2233:In March 2020, a
1920:
1844:Microsoft Windows
1798:
1709:
1696:
1656:
1627:
1569:
1553:Texas Instruments
1491:linear prediction
1434:Languages with a
1425:synthetic phonics
1396:
1395:
1388:
1145:
1046:Bell Laboratories
1016:TMS5220 LPC Chips
1001:Speak & Spell
997:Texas Instruments
937:Formant synthesis
931:context-sensitive
833:Diphone synthesis
787:speech recognizer
603:Texas Instruments
579:
536:
506:Speak & Spell
482:Nagoya University
472:(LPC), a form of
286:(1198–1280), and
272:signal processing
179:operating systems
81:
39:17 September 2024
7215:
7183:Speech synthesis
7147:Formal semantics
7096:Natural language
7003:Speech synthesis
6985:and data capture
6888:Semantic network
6863:Lexical resource
6846:
6664:Lexical analysis
6642:
6567:Semantic parsing
6436:
6429:
6422:
6413:
6251:Windows Narrator
6190:Pattern playback
6140:Symphonic Choirs
6004:
5909:
5896:Speech synthesis
5889:
5882:
5875:
5866:
5783:Banded waveguide
5708:Phase distortion
5679:
5672:
5665:
5656:
5632:Speech synthesis
5619:
5618:
5616:
5598:
5592:
5591:
5589:
5587:
5573:
5567:
5566:
5564:
5562:
5548:
5542:
5541:
5539:
5538:
5523:
5517:
5516:
5514:
5513:
5507:Business Insider
5498:
5492:
5491:
5489:
5488:
5465:
5459:
5458:
5456:
5455:
5440:
5434:
5433:
5431:
5430:
5407:
5401:
5400:
5398:
5397:
5382:
5376:
5375:
5373:
5372:
5357:
5351:
5350:
5348:
5347:
5332:
5326:
5325:
5323:
5322:
5311:Denfaminicogamer
5302:
5296:
5295:
5293:
5292:
5272:
5266:
5265:
5263:
5262:
5251:
5245:
5244:
5242:
5241:
5226:
5220:
5219:
5207:
5199:
5193:
5192:
5154:
5148:
5147:
5109:
5103:
5102:
5084:
5052:
5046:
5045:
5043:
5042:
5022:
5016:
5015:
5014:
5013:
4996:
4990:
4989:
4987:
4986:
4974:
4968:
4967:
4965:
4964:
4949:
4943:
4942:
4940:
4939:
4918:
4912:
4911:
4910:
4884:
4878:
4877:
4876:
4854:
4848:
4847:
4845:
4844:
4829:
4823:
4822:
4820:
4819:
4808:
4802:
4787:
4781:
4780:
4778:
4777:
4765:
4759:
4758:
4756:
4755:
4744:
4738:
4737:
4735:
4734:
4729:on June 21, 2003
4719:
4713:
4712:
4710:
4708:
4699:. Archived from
4692:
4686:
4685:
4670:(3rd ed.).
4660:
4654:
4653:
4651:
4650:
4636:
4630:
4629:
4627:
4626:
4615:
4609:
4608:
4606:
4605:
4599:
4593:. Archived from
4592:
4584:
4578:
4564:
4558:
4557:
4529:
4523:
4522:
4502:
4496:
4495:
4493:
4487:. Archived from
4462:
4453:
4447:
4446:
4435:
4429:
4428:
4427:on May 17, 2008.
4417:
4411:
4410:
4408:
4407:
4396:
4390:
4389:
4382:
4376:
4375:
4373:
4372:
4349:
4343:
4342:
4340:
4339:
4324:
4318:
4317:
4315:
4314:
4292:
4286:
4285:
4284:
4283:
4242:
4236:
4235:
4227:
4221:
4220:
4180:
4174:
4173:
4171:
4170:
4139:
4130:. pp. 1–6.
4119:
4113:
4112:
4106:
4098:
4085:. Vol. 28.
4078:
4072:
4071:
4053:
4033:
4027:
4026:
4024:
4023:
4011:Bonk, Lawrence.
4008:
4002:
4001:
3999:
3998:
3983:
3977:
3976:
3974:
3973:
3950:
3944:
3943:
3941:
3940:
3925:
3919:
3918:
3916:
3915:
3900:
3894:
3893:
3891:
3890:
3879:Denfaminicogamer
3870:
3864:
3863:
3861:
3860:
3840:
3834:
3833:
3831:
3819:
3813:
3812:
3810:
3809:
3803:
3797:. Archived from
3764:
3755:
3749:
3748:
3746:
3745:
3730:
3724:
3723:
3700:Journal of Voice
3695:
3686:
3685:
3683:
3681:
3659:Interspeech 2013
3656:
3647:
3638:
3637:
3619:
3613:
3554:
3548:
3534:
3528:
3512:
3506:
3492:
3486:
3485:
3476:
3470:
3469:
3460:
3454:
3453:
3451:
3450:
3435:
3429:
3428:
3426:
3425:
3403:
3397:
3396:
3376:
3370:
3359:
3353:
3352:
3350:
3349:
3340:. Archived from
3334:
3328:
3321:
3315:
3308:
3302:
3295:
3289:
3279:
3273:
3272:
3256:
3246:
3240:
3239:
3214:
3208:
3207:
3205:
3204:
3189:
3183:
3182:
3171:
3165:
3164:
3146:
3140:
3139:
3127:
3121:
3105:
3099:
3094:
3088:
3086:
3085:
3081:
3076:Breslow, et al.
3074:
3068:
3065:
3059:
3053:
3052:
3034:
3028:
3027:
3003:
2997:
2996:
2994:
2993:
2982:
2973:
2972:
2970:
2968:
2954:
2945:
2944:
2942:
2931:
2922:
2916:
2915:
2913:
2898:
2880:
2871:
2865:
2853:
2847:
2846:
2844:
2843:
2828:
2822:
2821:
2819:
2817:
2808:. Archived from
2802:
2796:
2795:
2783:
2777:
2776:
2765:10.1121/1.395275
2740:
2734:
2733:
2731:
2730:
2724:
2709:
2700:
2694:
2693:
2685:
2679:
2673:
2667:
2666:
2646:
2640:
2639:
2627:
2617:
2611:
2610:
2607:10.1121/1.386780
2582:
2576:
2575:
2563:
2553:
2516:Paperless office
2440:like Elai.io or
2411:fandom, and the
2392:, including the
2386:content creation
2285:markup languages
2253:Twilight Sparkle
1921:
1894:web applications
1799:
1718:operating system
1710:
1699:MacinTalk 1 demo
1697:
1665:operating system
1657:
1628:
1570:
1391:
1384:
1380:
1377:
1371:
1340:
1332:
1322:Ulysses S. Grant
1172:nondeterministic
1146:
978:embedded systems
916:
908:
642:Shoplifting Girl
580:
537:
508:toys from 1978.
478:Fumitada Itakura
449:Arthur C. Clarke
380:Pattern playback
278:" involved Pope
104:Speech synthesis
83:
82:
61:
7223:
7222:
7218:
7217:
7216:
7214:
7213:
7212:
7173:
7172:
7171:
7166:
7135:
7115:Syntax guessing
7097:
7090:
7076:Predictive text
7071:Grammar checker
7052:
7045:
7017:
6984:
6973:
6939:Bank of English
6922:
6850:
6841:
6832:
6763:
6720:
6688:
6640:
6542:Distant reading
6517:Argument mining
6503:
6499:Text processing
6445:
6440:
6410:
6405:
6354:
6302:
6296:
6270:
6219:
6164:
6093:
6034:Microsoft Agent
5998:
5987:
5961:
5898:
5893:
5863:
5858:
5844:Analog modeling
5830:
5821:Graphical sound
5807:
5769:
5732:
5689:
5686:Sound synthesis
5683:
5628:
5623:
5622:
5600:
5599:
5595:
5585:
5583:
5575:
5574:
5570:
5560:
5558:
5550:
5549:
5545:
5536:
5534:
5525:
5524:
5520:
5511:
5509:
5500:
5499:
5495:
5486:
5484:
5467:
5466:
5462:
5453:
5451:
5442:
5441:
5437:
5428:
5426:
5409:
5408:
5404:
5395:
5393:
5384:
5383:
5379:
5370:
5368:
5359:
5358:
5354:
5345:
5343:
5334:
5333:
5329:
5320:
5318:
5304:
5303:
5299:
5290:
5288:
5274:
5273:
5269:
5260:
5258:
5253:
5252:
5248:
5239:
5237:
5228:
5227:
5223:
5205:
5201:
5200:
5196:
5181:
5156:
5155:
5151:
5136:
5111:
5110:
5106:
5054:
5053:
5049:
5040:
5038:
5031:deeplearning.ai
5024:
5023:
5019:
5011:
5009:
4998:
4997:
4993:
4984:
4982:
4976:
4975:
4971:
4962:
4960:
4958:Washington Post
4951:
4950:
4946:
4937:
4935:
4920:
4919:
4915:
4886:
4885:
4881:
4856:
4855:
4851:
4842:
4840:
4831:
4830:
4826:
4817:
4815:
4810:
4809:
4805:
4788:
4784:
4775:
4773:
4767:
4766:
4762:
4753:
4751:
4746:
4745:
4741:
4732:
4730:
4721:
4720:
4716:
4706:
4704:
4694:
4693:
4689:
4682:
4662:
4661:
4657:
4648:
4646:
4638:
4637:
4633:
4624:
4622:
4617:
4616:
4612:
4603:
4601:
4597:
4590:
4586:
4585:
4581:
4575:Wayback Machine
4565:
4561:
4531:
4530:
4526:
4504:
4503:
4499:
4491:
4460:
4455:
4454:
4450:
4445:. January 2008.
4437:
4436:
4432:
4419:
4418:
4414:
4405:
4403:
4398:
4397:
4393:
4384:
4383:
4379:
4370:
4368:
4351:
4350:
4346:
4337:
4335:
4326:
4325:
4321:
4312:
4310:
4300:Washington Post
4294:
4293:
4289:
4281:
4279:
4269:
4244:
4243:
4239:
4229:
4228:
4224:
4182:
4181:
4177:
4168:
4166:
4156:
4121:
4120:
4116:
4099:
4080:
4079:
4075:
4035:
4034:
4030:
4021:
4019:
4010:
4009:
4005:
3996:
3994:
3985:
3984:
3980:
3971:
3969:
3952:
3951:
3947:
3938:
3936:
3927:
3926:
3922:
3913:
3911:
3902:
3901:
3897:
3888:
3886:
3872:
3871:
3867:
3858:
3856:
3842:
3841:
3837:
3821:
3820:
3816:
3807:
3805:
3801:
3762:
3757:
3756:
3752:
3743:
3741:
3732:
3731:
3727:
3697:
3696:
3689:
3679:
3677:
3654:
3649:
3648:
3641:
3634:
3621:
3620:
3616:
3555:
3551:
3535:
3531:
3525:Wayback Machine
3513:
3509:
3493:
3489:
3478:
3477:
3473:
3462:
3461:
3457:
3448:
3446:
3437:
3436:
3432:
3423:
3421:
3405:
3404:
3400:
3378:
3377:
3373:
3360:
3356:
3347:
3345:
3336:
3335:
3331:
3322:
3318:
3309:
3305:
3296:
3292:
3280:
3276:
3269:
3248:
3247:
3243:
3236:
3216:
3215:
3211:
3202:
3200:
3191:
3190:
3186:
3173:
3172:
3168:
3161:
3148:
3147:
3143:
3129:
3128:
3124:
3115:Wayback Machine
3106:
3102:
3095:
3091:
3083:
3077:
3075:
3071:
3066:
3062:
3056:
3049:
3036:
3035:
3031:
3005:
3004:
3000:
2991:
2989:
2984:
2983:
2976:
2966:
2964:
2956:
2955:
2948:
2940:
2929:
2924:
2923:
2919:
2911:
2878:
2873:
2872:
2868:
2863:Wayback Machine
2854:
2850:
2841:
2839:
2830:
2829:
2825:
2815:
2813:
2804:
2803:
2799:
2785:
2784:
2780:
2742:
2741:
2737:
2728:
2726:
2722:
2707:
2702:
2701:
2697:
2691:
2686:
2682:
2674:
2670:
2648:
2647:
2643:
2636:
2619:
2618:
2614:
2584:
2583:
2579:
2572:
2555:
2554:
2550:
2545:
2540:
2491:
2486:
2485:
2481:
2473:
2402:Team Fortress 2
2363:Stephen Hawking
2320:
2281:
2157:
2052:
2016:
1954:
1944:Version 1.6 of
1942:
1933:
1914:
1912:
1906:
1852:
1850:Microsoft Agent
1846:
1792:
1790:
1774:
1703:
1690:
1688:
1650:
1648:
1621:
1619:
1610:SP0256 Narrator
1595:
1563:
1561:
1555:
1547:
1507:
1467:
1457:
1445:
1392:
1381:
1375:
1372:
1357:
1341:
1330:
1306:parts of speech
1304:) to generate "
1267:
1262:
1246:
1245:
1225:
1217:
1215:Audio deepfakes
1139:
1137:
1131:
1119:
1113:
1077:
1030:
1024:
939:
917:). Likewise in
913:is realized as
887:
835:
765:, half-phones,
755:
746:
740:
713:intelligibility
704:
664:electronic game
628:Sun Electronics
573:
530:
405:Stephen Hawking
397:
284:Albertus Magnus
265:
238:text-to-phoneme
101:
100:
92:
90:
89:
88:
87:
84:
75:
72:
65:
59:
52:
47:
46:
45:
44:
43:
42:
26:
12:
11:
5:
7221:
7219:
7211:
7210:
7205:
7200:
7195:
7190:
7185:
7175:
7174:
7168:
7167:
7165:
7164:
7159:
7154:
7149:
7143:
7141:
7137:
7136:
7134:
7133:
7128:
7123:
7118:
7108:
7102:
7100:
7098:user interface
7092:
7091:
7089:
7088:
7083:
7078:
7073:
7068:
7063:
7057:
7055:
7047:
7046:
7044:
7043:
7038:
7033:
7027:
7025:
7019:
7018:
7016:
7015:
7010:
7005:
7000:
6995:
6989:
6987:
6979:
6978:
6975:
6974:
6972:
6971:
6966:
6961:
6956:
6951:
6946:
6941:
6936:
6930:
6928:
6924:
6923:
6921:
6920:
6915:
6910:
6905:
6900:
6895:
6890:
6885:
6880:
6875:
6870:
6865:
6860:
6854:
6852:
6843:
6834:
6833:
6831:
6830:
6825:
6823:Word embedding
6820:
6815:
6810:
6803:Language model
6800:
6795:
6790:
6785:
6780:
6774:
6772:
6765:
6764:
6762:
6761:
6756:
6754:Transfer-based
6751:
6746:
6741:
6736:
6730:
6728:
6722:
6721:
6719:
6718:
6713:
6708:
6702:
6700:
6694:
6693:
6690:
6689:
6687:
6686:
6681:
6676:
6671:
6666:
6661:
6656:
6650:
6648:
6639:
6638:
6633:
6628:
6623:
6618:
6613:
6607:
6606:
6601:
6596:
6591:
6586:
6581:
6576:
6575:
6574:
6569:
6559:
6554:
6549:
6544:
6539:
6534:
6529:
6527:Concept mining
6524:
6519:
6513:
6511:
6505:
6504:
6502:
6501:
6496:
6491:
6486:
6481:
6480:
6479:
6474:
6464:
6459:
6453:
6451:
6447:
6446:
6441:
6439:
6438:
6431:
6424:
6416:
6407:
6406:
6404:
6403:
6398:
6393:
6388:
6383:
6381:Inverse filter
6378:
6373:
6368:
6362:
6360:
6356:
6355:
6353:
6352:
6347:
6342:
6337:
6332:
6327:
6322:
6317:
6312:
6306:
6304:
6298:
6297:
6295:
6294:
6289:
6284:
6278:
6276:
6272:
6271:
6269:
6268:
6263:
6258:
6253:
6248:
6243:
6238:
6233:
6227:
6225:
6221:
6220:
6218:
6217:
6212:
6207:
6202:
6197:
6192:
6187:
6182:
6176:
6174:
6170:
6169:
6166:
6165:
6163:
6162:
6157:
6152:
6147:
6142:
6137:
6132:
6127:
6122:
6117:
6112:
6107:
6101:
6099:
6095:
6094:
6092:
6091:
6086:
6081:
6076:
6071:
6066:
6061:
6056:
6051:
6046:
6041:
6036:
6031:
6026:
6021:
6016:
6010:
6008:
6001:
5993:
5992:
5989:
5988:
5986:
5985:
5980:
5975:
5969:
5967:
5963:
5962:
5960:
5959:
5954:
5949:
5940:
5935:
5930:
5925:
5915:
5913:
5906:
5900:
5899:
5894:
5892:
5891:
5884:
5877:
5869:
5860:
5859:
5857:
5856:
5851:
5846:
5840:
5838:
5832:
5831:
5829:
5828:
5823:
5817:
5815:
5809:
5808:
5806:
5805:
5800:
5795:
5793:Direct digital
5790:
5785:
5779:
5777:
5771:
5770:
5768:
5767:
5762:
5757:
5752:
5746:
5744:
5734:
5733:
5731:
5730:
5725:
5720:
5715:
5710:
5705:
5700:
5694:
5691:
5690:
5684:
5682:
5681:
5674:
5667:
5659:
5653:
5652:
5638:
5627:
5626:External links
5624:
5621:
5620:
5593:
5568:
5543:
5518:
5493:
5460:
5443:Suciu, Peter.
5435:
5410:Knibbs, Kate.
5402:
5377:
5352:
5327:
5297:
5267:
5246:
5221:
5194:
5179:
5149:
5134:
5104:
5067:(3): 214–231.
5047:
5017:
4991:
4969:
4944:
4913:
4879:
4849:
4824:
4803:
4782:
4760:
4739:
4714:
4687:
4680:
4672:Addison-Wesley
4655:
4640:"Amazon Polly"
4631:
4621:. folklore.org
4610:
4579:
4559:
4524:
4513:(2): 143–154.
4497:
4494:on 2013-07-03.
4471:(4): 278–287.
4448:
4430:
4412:
4391:
4377:
4344:
4319:
4287:
4267:
4237:
4222:
4175:
4154:
4114:
4073:
4028:
4003:
3978:
3945:
3920:
3895:
3865:
3835:
3814:
3750:
3725:
3687:
3639:
3632:
3614:
3549:
3529:
3507:
3487:
3471:
3455:
3430:
3413:. 1974-04-01.
3398:
3387:(2): 143–154.
3371:
3354:
3329:
3316:
3303:
3290:
3274:
3267:
3241:
3234:
3209:
3184:
3166:
3160:978-0992926007
3159:
3141:
3122:
3100:
3089:
3069:
3060:
3054:
3047:
3029:
3018:(3): 263–271.
2998:
2974:
2946:
2917:
2889:(4): 203–303.
2866:
2848:
2823:
2797:
2778:
2735:
2695:
2680:
2668:
2641:
2634:
2612:
2593:(2): 321–328.
2577:
2570:
2547:
2546:
2544:
2541:
2539:
2538:
2533:
2528:
2523:
2518:
2513:
2508:
2503:
2498:
2492:
2490:
2487:
2482:
2474:
2472:
2469:
2325:screen readers
2319:
2316:
2280:
2277:
2259:from the show
2231:
2230:
2215:
2208:near real-time
2204:
2180:Baidu Research
2156:
2153:
2152:
2151:
2141:
2123:
2109:
2098:
2091:
2088:enTourage eDGe
2074:, such as the
2072:e-book readers
2068:
2051:
2048:
2047:
2046:
2036:
2030:
2015:
2012:
1974:Google Toolbar
1953:
1950:
1941:
1938:
1932:
1929:
1908:Main article:
1905:
1902:
1845:
1842:
1789:
1786:
1773:
1770:
1746:Mac OS X Tiger
1722:Apple Computer
1687:
1684:
1647:
1644:
1618:
1615:
1594:
1591:
1557:Main article:
1554:
1551:
1546:
1543:
1542:
1541:
1536:
1526:
1521:
1506:
1503:
1456:
1453:
1444:
1441:
1420:pronunciations
1394:
1393:
1344:
1342:
1335:
1329:
1326:
1266:
1263:
1261:
1258:
1238:text-to-speech
1229:audio deepfake
1226:
1222:Audio deepfake
1218:
1216:
1213:
1133:Main article:
1130:
1127:
1115:Main article:
1112:
1109:
1076:
1073:
1026:Main article:
1023:
1020:
986:microprocessor
938:
935:
902:in words like
886:
883:
834:
831:
754:
751:
742:Main article:
739:
736:
703:
700:
637:Manbiki Shoujo
490:Bishnu S. Atal
425:Louis Gerstman
396:
393:
385:Alvin Liberman
352:developed the
348:In the 1930s,
282:(d. 1003 AD),
264:
261:
257:target prosody
222:prosodic units
208:pre-processing
124:text-to-speech
91:
85:
73:
68:
67:
66:
57:
56:
55:
50:
48:
28:
22:
19:
17:
16:
15:
14:
13:
10:
9:
6:
4:
3:
2:
7220:
7209:
7206:
7204:
7201:
7199:
7196:
7194:
7191:
7189:
7186:
7184:
7181:
7180:
7178:
7163:
7160:
7158:
7155:
7153:
7152:Hallucination
7150:
7148:
7145:
7144:
7142:
7138:
7132:
7129:
7127:
7124:
7122:
7119:
7116:
7112:
7109:
7107:
7104:
7103:
7101:
7099:
7093:
7087:
7086:Spell checker
7084:
7082:
7079:
7077:
7074:
7072:
7069:
7067:
7064:
7062:
7059:
7058:
7056:
7054:
7048:
7042:
7039:
7037:
7034:
7032:
7029:
7028:
7026:
7024:
7020:
7014:
7011:
7009:
7006:
7004:
7001:
6999:
6996:
6994:
6991:
6990:
6988:
6986:
6980:
6970:
6967:
6965:
6962:
6960:
6957:
6955:
6952:
6950:
6947:
6945:
6942:
6940:
6937:
6935:
6932:
6931:
6929:
6925:
6919:
6916:
6914:
6911:
6909:
6906:
6904:
6901:
6899:
6898:Speech corpus
6896:
6894:
6891:
6889:
6886:
6884:
6881:
6879:
6878:Parallel text
6876:
6874:
6871:
6869:
6866:
6864:
6861:
6859:
6856:
6855:
6853:
6847:
6844:
6839:
6835:
6829:
6826:
6824:
6821:
6819:
6816:
6814:
6811:
6808:
6804:
6801:
6799:
6796:
6794:
6791:
6789:
6786:
6784:
6781:
6779:
6776:
6775:
6773:
6770:
6766:
6760:
6757:
6755:
6752:
6750:
6747:
6745:
6742:
6740:
6739:Example-based
6737:
6735:
6732:
6731:
6729:
6727:
6723:
6717:
6714:
6712:
6709:
6707:
6704:
6703:
6701:
6699:
6695:
6685:
6682:
6680:
6677:
6675:
6672:
6670:
6669:Text chunking
6667:
6665:
6662:
6660:
6659:Lemmatisation
6657:
6655:
6652:
6651:
6649:
6647:
6643:
6637:
6634:
6632:
6629:
6627:
6624:
6622:
6619:
6617:
6614:
6612:
6609:
6608:
6605:
6602:
6600:
6597:
6595:
6592:
6590:
6587:
6585:
6582:
6580:
6577:
6573:
6570:
6568:
6565:
6564:
6563:
6560:
6558:
6555:
6553:
6550:
6548:
6545:
6543:
6540:
6538:
6535:
6533:
6530:
6528:
6525:
6523:
6520:
6518:
6515:
6514:
6512:
6510:
6509:Text analysis
6506:
6500:
6497:
6495:
6492:
6490:
6487:
6485:
6482:
6478:
6475:
6473:
6470:
6469:
6468:
6465:
6463:
6460:
6458:
6455:
6454:
6452:
6450:General terms
6448:
6444:
6437:
6432:
6430:
6425:
6423:
6418:
6417:
6414:
6402:
6401:Voice cloning
6399:
6397:
6394:
6392:
6391:Phase vocoder
6389:
6387:
6384:
6382:
6379:
6377:
6374:
6372:
6369:
6367:
6364:
6363:
6361:
6357:
6351:
6348:
6346:
6343:
6341:
6338:
6336:
6333:
6331:
6328:
6326:
6323:
6321:
6318:
6316:
6313:
6311:
6310:Alan W. Black
6308:
6307:
6305:
6299:
6293:
6290:
6288:
6285:
6283:
6280:
6279:
6277:
6273:
6267:
6264:
6262:
6259:
6257:
6254:
6252:
6249:
6247:
6244:
6242:
6239:
6237:
6234:
6232:
6229:
6228:
6226:
6222:
6216:
6213:
6211:
6208:
6206:
6203:
6201:
6198:
6196:
6193:
6191:
6188:
6186:
6183:
6181:
6178:
6177:
6175:
6171:
6161:
6158:
6156:
6153:
6151:
6148:
6146:
6143:
6141:
6138:
6136:
6133:
6131:
6128:
6126:
6123:
6121:
6118:
6116:
6113:
6111:
6108:
6106:
6103:
6102:
6100:
6096:
6090:
6087:
6085:
6082:
6080:
6077:
6075:
6072:
6070:
6067:
6065:
6062:
6060:
6057:
6055:
6054:Voice browser
6052:
6050:
6047:
6045:
6042:
6040:
6037:
6035:
6032:
6030:
6027:
6025:
6022:
6020:
6017:
6015:
6012:
6011:
6009:
6005:
6002:
6000:
5994:
5984:
5981:
5979:
5976:
5974:
5971:
5970:
5968:
5964:
5958:
5955:
5953:
5950:
5948:
5944:
5941:
5939:
5936:
5934:
5931:
5929:
5926:
5924:
5920:
5917:
5916:
5914:
5910:
5907:
5905:
5904:Free software
5901:
5897:
5890:
5885:
5883:
5878:
5876:
5871:
5870:
5867:
5855:
5852:
5850:
5847:
5845:
5842:
5841:
5839:
5837:
5833:
5827:
5824:
5822:
5819:
5818:
5816:
5814:
5810:
5804:
5801:
5799:
5796:
5794:
5791:
5789:
5786:
5784:
5781:
5780:
5778:
5776:
5772:
5766:
5765:Concatenative
5763:
5761:
5758:
5756:
5753:
5751:
5748:
5747:
5745:
5743:
5739:
5735:
5729:
5726:
5724:
5721:
5719:
5716:
5714:
5711:
5709:
5706:
5704:
5701:
5699:
5696:
5695:
5692:
5687:
5680:
5675:
5673:
5668:
5666:
5661:
5660:
5657:
5650:
5646:
5642:
5639:
5637:
5633:
5630:
5629:
5625:
5615:
5610:
5606:
5605:
5597:
5594:
5582:
5578:
5572:
5569:
5557:
5553:
5547:
5544:
5532:
5528:
5522:
5519:
5508:
5504:
5497:
5494:
5483:
5479:
5475:
5471:
5464:
5461:
5450:
5446:
5439:
5436:
5425:
5421:
5417:
5413:
5406:
5403:
5392:
5388:
5381:
5378:
5366:
5362:
5356:
5353:
5341:
5337:
5331:
5328:
5316:
5312:
5308:
5301:
5298:
5286:
5282:
5278:
5271:
5268:
5256:
5250:
5247:
5235:
5231:
5225:
5222:
5217:
5213:
5212:
5204:
5198:
5195:
5190:
5186:
5182:
5176:
5172:
5168:
5164:
5160:
5153:
5150:
5145:
5141:
5137:
5131:
5127:
5123:
5119:
5115:
5108:
5105:
5100:
5096:
5092:
5088:
5083:
5078:
5074:
5070:
5066:
5062:
5058:
5051:
5048:
5037:on 2020-08-07
5036:
5032:
5028:
5021:
5018:
5008:
5004:
5003:
4995:
4992:
4980:
4973:
4970:
4959:
4955:
4948:
4945:
4933:
4929:
4928:
4923:
4917:
4914:
4909:
4904:
4900:
4896:
4895:
4890:
4883:
4880:
4875:
4870:
4867:: 4485–4495,
4866:
4862:
4861:
4853:
4850:
4839:on 2013-10-03
4838:
4834:
4828:
4825:
4813:
4807:
4804:
4800:
4799:0-7695-2932-1
4796:
4792:
4786:
4783:
4771:
4764:
4761:
4749:
4743:
4740:
4728:
4724:
4718:
4715:
4702:
4698:
4691:
4688:
4683:
4677:
4673:
4669:
4665:
4659:
4656:
4645:
4641:
4635:
4632:
4620:
4614:
4611:
4600:on 2012-03-24
4596:
4589:
4583:
4580:
4576:
4572:
4569:
4563:
4560:
4555:
4551:
4547:
4543:
4539:
4535:
4528:
4525:
4520:
4516:
4512:
4508:
4501:
4498:
4490:
4486:
4482:
4478:
4474:
4470:
4466:
4459:
4452:
4449:
4444:
4443:Science Daily
4440:
4434:
4431:
4426:
4422:
4416:
4413:
4402:. Festvox.org
4401:
4395:
4392:
4387:
4381:
4378:
4367:
4363:
4359:
4355:
4348:
4345:
4334:
4330:
4323:
4320:
4309:
4305:
4301:
4297:
4291:
4288:
4278:
4274:
4270:
4264:
4260:
4256:
4252:
4248:
4241:
4238:
4233:
4226:
4223:
4218:
4214:
4210:
4206:
4202:
4198:
4194:
4190:
4186:
4179:
4176:
4165:
4161:
4157:
4151:
4147:
4143:
4138:
4133:
4129:
4125:
4118:
4115:
4110:
4104:
4096:
4092:
4088:
4084:
4077:
4074:
4069:
4065:
4061:
4057:
4052:
4047:
4043:
4039:
4032:
4029:
4018:
4014:
4007:
4004:
3993:
3989:
3982:
3979:
3968:
3964:
3960:
3956:
3953:WIRED Staff.
3949:
3946:
3935:
3931:
3924:
3921:
3909:
3905:
3899:
3896:
3884:
3880:
3876:
3869:
3866:
3854:
3850:
3846:
3839:
3836:
3830:
3825:
3818:
3815:
3804:on 2011-12-16
3800:
3796:
3792:
3788:
3784:
3780:
3776:
3772:
3768:
3761:
3754:
3751:
3740:on 2012-02-13
3739:
3735:
3729:
3726:
3721:
3717:
3713:
3709:
3705:
3701:
3694:
3692:
3688:
3676:
3672:
3668:
3664:
3660:
3653:
3646:
3644:
3640:
3635:
3629:
3625:
3618:
3615:
3611:
3607:
3603:
3599:
3595:
3591:
3587:
3583:
3579:
3575:
3571:
3567:
3563:
3559:
3553:
3550:
3547:
3543:
3539:
3538:Astro Blaster
3533:
3530:
3526:
3522:
3519:
3518:
3511:
3508:
3504:
3500:
3496:
3491:
3488:
3483:
3482:
3475:
3472:
3467:
3466:
3459:
3456:
3444:
3440:
3434:
3431:
3420:
3416:
3412:
3408:
3402:
3399:
3394:
3390:
3386:
3382:
3375:
3372:
3368:
3364:
3358:
3355:
3343:
3339:
3333:
3330:
3326:
3320:
3317:
3313:
3310:Julia Zhang.
3307:
3304:
3300:
3299:Alan W. Black
3294:
3291:
3287:
3283:
3282:Alan W. Black
3278:
3275:
3270:
3268:9780521899277
3264:
3260:
3255:
3254:
3245:
3242:
3237:
3231:
3227:
3226:Penguin Books
3223:
3219:
3213:
3210:
3199:
3195:
3188:
3185:
3180:
3176:
3170:
3167:
3162:
3156:
3152:
3145:
3142:
3137:
3133:
3126:
3123:
3120:
3116:
3112:
3109:
3104:
3101:
3098:
3093:
3090:
3080:
3073:
3070:
3064:
3061:
3058:
3055:
3050:
3044:
3040:
3033:
3030:
3025:
3021:
3017:
3013:
3009:
3002:
2999:
2987:
2981:
2979:
2975:
2963:
2959:
2953:
2951:
2947:
2939:
2936:(3): 1123–6.
2935:
2928:
2921:
2918:
2910:
2906:
2902:
2897:
2892:
2888:
2884:
2877:
2870:
2867:
2864:
2860:
2857:
2852:
2849:
2838:on 2000-04-07
2837:
2833:
2827:
2824:
2811:
2807:
2801:
2798:
2793:
2789:
2782:
2779:
2774:
2770:
2766:
2762:
2758:
2754:
2751:(3): 737–93.
2750:
2746:
2739:
2736:
2725:on 2013-05-12
2721:
2717:
2713:
2706:
2699:
2696:
2689:
2684:
2681:
2677:
2672:
2669:
2664:
2660:
2657:(2): 95–128.
2656:
2652:
2645:
2642:
2637:
2631:
2626:
2625:
2616:
2613:
2608:
2604:
2600:
2596:
2592:
2588:
2581:
2578:
2573:
2567:
2562:
2561:
2552:
2549:
2542:
2537:
2534:
2532:
2529:
2527:
2524:
2522:
2519:
2517:
2514:
2512:
2509:
2507:
2504:
2502:
2499:
2497:
2494:
2493:
2488:
2479:
2470:
2468:
2466:
2462:
2458:
2454:
2450:
2449:voice quality
2445:
2443:
2437:
2433:
2431:
2427:
2423:
2418:
2416:
2415:
2410:
2409:
2404:
2403:
2398:
2396:
2391:
2387:
2383:
2382:
2377:
2373:
2364:
2360:
2356:
2354:
2350:
2346:
2342:
2338:
2334:
2330:
2326:
2317:
2315:
2313:
2308:
2306:
2302:
2298:
2294:
2290:
2286:
2278:
2276:
2274:
2273:
2268:
2264:
2263:
2258:
2254:
2250:
2249:
2244:
2240:
2236:
2228:
2224:
2220:
2216:
2213:
2209:
2205:
2202:
2199:
2198:
2197:
2194:
2192:
2187:
2185:
2184:voice cloning
2181:
2176:
2174:
2170:
2166:
2162:
2154:
2149:
2145:
2142:
2139:
2135:
2131:
2127:
2124:
2121:
2117:
2113:
2110:
2107:
2103:
2099:
2096:
2092:
2089:
2085:
2081:
2077:
2076:Amazon Kindle
2073:
2069:
2066:
2062:
2058:
2054:
2053:
2049:
2044:
2040:
2037:
2034:
2031:
2028:
2025:
2024:
2023:
2021:
2013:
2011:
2009:
2004:
2002:
1998:
1994:
1989:
1987:
1983:
1979:
1975:
1971:
1967:
1966:e-mail client
1963:
1959:
1951:
1949:
1947:
1939:
1937:
1930:
1928:
1911:
1903:
1901:
1899:
1895:
1891:
1887:
1885:
1881:
1877:
1873:
1869:
1865:
1861:
1857:
1851:
1843:
1841:
1837:
1835:
1834:Speak Handler
1831:
1827:
1823:
1819:
1815:
1807:
1787:
1785:
1783:
1779:
1771:
1769:
1767:
1763:
1759:
1755:
1751:
1747:
1743:
1739:
1735:
1731:
1727:
1723:
1719:
1685:
1683:
1681:
1676:
1674:
1670:
1669:1400XL/1450XL
1666:
1645:
1643:
1641:
1637:
1616:
1614:
1611:
1607:
1603:
1602:Intellivision
1600:
1592:
1590:
1588:
1587:
1582:
1581:
1560:
1552:
1550:
1544:
1540:
1537:
1534:
1533:Forrest Mozer
1530:
1527:
1525:
1522:
1520:
1517:
1516:
1511:
1504:
1502:
1500:
1496:
1492:
1488:
1484:
1483:pitch contour
1480:
1476:
1472:
1466:
1462:
1454:
1452:
1449:
1442:
1440:
1437:
1432:
1428:
1426:
1421:
1417:
1413:
1409:
1405:
1401:
1390:
1387:
1379:
1369:
1365:
1361:
1355:
1354:
1350:
1345:This section
1343:
1339:
1334:
1333:
1327:
1325:
1323:
1317:
1313:
1311:
1307:
1303:
1298:
1296:
1292:
1288:
1283:
1280:
1279:abbreviations
1276:
1272:
1264:
1259:
1257:
1255:
1251:
1243:
1239:
1234:
1230:
1223:
1214:
1212:
1210:
1205:
1201:
1199:
1195:
1194:vocal emotion
1191:
1190:browser-based
1187:
1183:
1181:
1177:
1173:
1169:
1168:deep learning
1165:
1161:
1157:
1155:
1136:
1128:
1126:
1124:
1118:
1110:
1108:
1106:
1102:
1098:
1094:
1090:
1086:
1082:
1074:
1072:
1068:
1066:
1062:
1058:
1054:
1049:
1047:
1043:
1039:
1035:
1029:
1021:
1019:
1017:
1013:
1009:
1006:
1002:
998:
993:
991:
987:
983:
979:
975:
974:screen reader
971:
967:
963:
959:
955:
951:
947:
943:
936:
934:
932:
928:
924:
920:
912:
905:
901:
897:
891:
884:
882:
880:
876:
872:
868:
864:
860:
856:
852:
848:
844:
840:
832:
830:
827:
823:
818:
816:
815:decision tree
812:
808:
804:
800:
796:
792:
788:
784:
780:
776:
772:
768:
764:
760:
752:
750:
745:
737:
735:
733:
731:
726:
721:
718:
715:
714:
709:
701:
699:
697:
692:
690:
686:
681:
677:
673:
671:
670:
665:
661:
657:
656:
651:
647:
643:
639:
638:
633:
629:
625:
621:
620:
615:
612:
608:
604:
600:
596:
592:
588:
567:
560:
556:
554:
550:
546:
528:
526:
521:
516:
514:
509:
507:
503:
499:
495:
491:
487:
483:
479:
475:
474:speech coding
471:
467:
465:
461:
457:
456:
450:
446:
442:
438:
434:
430:
426:
422:
418:
414:
406:
401:
394:
392:
390:
386:
381:
377:
373:
369:
367:
363:
359:
355:
351:
346:
344:
340:
336:
332:
328:
324:
320:
316:
312:
308:
304:
300:
296:
293:In 1779, the
291:
290:(1214–1294).
289:
285:
281:
277:
273:
270:
262:
260:
258:
254:
250:
246:
244:
239:
235:
231:
227:
223:
219:
215:
214:
209:
205:
204:
199:
195:
186:
182:
180:
176:
172:
167:
165:
160:
156:
152:
148:
147:concatenating
143:
141:
137:
133:
129:
125:
121:
117:
113:
109:
105:
99:
97:
71:
54:
40:
36:
32:
25:
7066:Concordancer
7002:
6462:Bag-of-words
6396:Self-voicing
6345:Philip Rubin
6224:Applications
6185:Mockingboard
6014:Amazon Polly
5997:Proprietary
5895:
5738:Sample-based
3602:RoadBlasters
2405:fandom, the
2318:Applications
2309:
2283:A number of
2282:
2270:
2267:Tenth Doctor
2182:presented a
2177:
2159:At the 2018
2158:
2120:IBM ViaVoice
1958:applications
1898:call centers
1888:
1880:Windows 2000
1754:Snow Leopard
1606:Intellivoice
1358:Please help
1042:Philip Rubin
915:/ˌklɪəɹˈʌʊt/
843:phonotactics
630:. The first
623:
617:
611:shoot 'em up
601:produced by
594:
585:
549:Dennis Klatt
504:used in the
468:
458:, where the
370:
358:Homer Dudley
347:
292:
280:Silvester II
276:Brazen Heads
213:tokenization
122:products. A
29:This is the
23:
7023:Topic model
6903:Text corpus
6749:Statistical
6616:Text mining
6457:AI-complete
6325:Gunnar Fant
6303:Researchers
6301:Developers/
6241:Dr. Sbaitso
6049:Readspeaker
5928:Gnopernicus
5718:Subtractive
5340:VentureBeat
4812:"gnuspeech"
4566:EE Times. "
3590:Gauntlet II
3570:Road Runner
2692:(in German)
2388:in various
2148:Yamaha FS1R
2116:OS/2 Warp 4
2014:Open source
1997:Readspeaker
1970:web browser
1766:AppleScript
1240:as well as
1209:tone sandhi
1107:criterion.
1089:vocal tract
1034:vocal tract
1012:Atari, Inc.
990:intonations
927:alternation
911:"clear out"
795:spectrogram
708:naturalness
614:arcade game
464:Dave Bowman
445:Max Mathews
325:-operated "
317:sounds (in
311:vocal tract
288:Roger Bacon
253:synthesizer
245:-to-phoneme
164:vocal tract
7177:Categories
6744:Rule-based
6626:Truecasing
6494:Stop words
6266:Voice font
6231:AOLbyPhone
6130:PPG Phonem
6120:Chipspeech
6059:CoolSpeech
5728:Distortion
3542:Space Fury
2543:References
2331:and other
2272:Doctor Who
2265:, and the
2257:Fluttershy
2061:Atari 2600
2057:Atari 5200
1876:Windows 98
1872:Windows 95
1848:See also:
1459:See also:
1376:April 2023
1295:homographs
1271:heteronyms
1260:Challenges
1198:intonation
1186:ElevenLabs
1061:Steve Jobs
896:non-rhotic
685:Ann Syrdal
650:zero cross
607:video game
525:a cappella
441:Daisy Bell
378:built the
301:scientist
269:electronic
96:media help
7053:reviewing
6851:standards
6849:Types and
6275:Protocols
6261:PlainTalk
6105:Alter/Ego
6084:LaLaVoice
6079:Voiceroid
5973:eCantorix
5933:Gnuspeech
5750:Wavetable
3558:Star Wars
2465:dysphonic
2457:phonation
2442:Synthesia
2095:BBC Micro
2039:gnuspeech
2001:Pediaphon
1978:RSS-feeds
1828:'s audio
1822:MacinTalk
1742:VoiceOver
1738:PlainTalk
1734:Macintosh
1726:MacInTalk
1640:Macintalk
1412:linguists
1347:does not
1291:heuristic
1248:In 2023,
1244:services.
1101:waveforms
1065:gnuspeech
826:gigabytes
783:sentences
771:morphemes
767:syllables
732:synthesis
619:Stratovox
555:methods.
527:" style.
518:In 1975,
498:Bell Labs
433:Bell Labs
362:The Voder
350:Bell Labs
335:Pressburg
234:sentences
194:front-end
6969:Wikidata
6949:FrameNet
6934:BabelNet
6913:Treebank
6883:PropBank
6828:Word2vec
6793:fastText
6674:Stemming
6292:VoiceXML
6236:DialogOS
6155:Vocaloid
6150:Vocalina
6135:Realivox
6069:CereProc
6029:Talk It!
6007:Speaking
5999:software
5923:eSpeakNG
5912:Speaking
5755:Granular
5723:Additive
3598:Paperboy
3586:Gauntlet
2489:See also
2417:fandom.
2329:dyslexia
2312:VoiceXML
2235:freeware
2227:lip sync
2219:SIGGRAPH
2191:Symantec
2134:Magellan
1982:podcasts
1952:Internet
1884:Narrator
1776:Used in
1680:Atari ST
1519:Icophone
1416:language
1404:grapheme
1400:spelling
1287:semantic
1256:system.
1123:formants
980:, where
966:waveform
839:diphones
811:run time
791:waveform
763:diphones
646:PET 2001
626:), from
587:Handheld
460:HAL 9000
427:used an
389:phonetic
343:Euphonia
243:grapheme
198:back-end
159:diphones
151:database
120:hardware
116:software
35:reviewed
7140:Related
7106:Chatbot
6964:WordNet
6944:DBpedia
6818:Seq2seq
6562:Parsing
6477:Trigram
6359:Process
6180:Echo II
6173:Machine
6160:Xiaoice
6098:Singing
6019:DECtalk
5966:Singing
5952:FreeTTS
5826:Modular
5798:Formant
5742:Sampler
5713:Scanned
3562:Firefox
2390:fandoms
2378:series
2372:Biglobe
2080:Samsung
2065:Quadrun
1986:podcast
1962:plugins
1946:Android
1940:Android
1856:Windows
1854:Modern
1830:chipset
1814:AmigaOS
1788:AmigaOS
1780:and as
1750:Leopard
1580:Alpiner
1495:plosion
1408:phoneme
1368:removed
1353:sources
1310:corpora
1275:numbers
1176:emotion
1162:uses a
1097:prosody
958:voicing
942:Formant
925:. This
923:liaison
907:/ˈklɪə/
904:"clear"
871:Leachim
847:prosody
779:phrases
730:formant
655:Berzerk
595:Speech+
545:DECtalk
437:vocoder
429:IBM 704
407:in 1999
354:vocoder
323:bellows
263:History
249:prosody
230:clauses
226:phrases
224:, like
7113:(c.f.
6771:models
6759:Neural
6472:Bigram
6467:n-gram
6376:Currah
6350:Yamaha
6246:MBROLA
6195:Phasor
6110:Cantor
5919:eSpeak
5760:Vector
5636:Curlie
3594:A.P.B.
2461:timbre
2408:Portal
2399:, the
2397:fandom
2353:Votrax
2303:) and
2248:Portal
2243:GLaDOS
2165:Google
2144:Yamaha
2138:TomTom
2130:Garmin
2050:Others
2027:eSpeak
1910:Votrax
1904:Votrax
1882:added
1864:SAPI 5
1860:SAPI 4
1772:Amazon
1599:Mattel
1593:Mattel
1586:Parsec
1499:voiced
1277:, and
1008:arcade
982:memory
960:, and
919:French
863:MBROLA
781:, and
759:phones
669:Milton
593:(TSI)
413:et al.
299:Danish
295:German
232:, and
196:and a
155:phones
108:speech
7162:spaCy
6807:large
6798:GloVe
6386:PSOLA
6287:SABLE
6215:TuVox
6089:15.ai
6064:IVONA
5983:Sinsy
5947:Flite
5688:types
2376:anime
2305:SABLE
2269:from
2245:from
2239:15.ai
2171:from
2106:codec
2086:Pro,
2070:Some
2018:Some
1826:Amiga
1778:Alexa
1686:Apple
1646:Atari
1302:above
1160:15.ai
962:noise
859:PSOLA
807:pitch
799:index
797:. An
775:words
687:, at
329:" of
315:vowel
210:, or
134:like
6927:Data
6778:BERT
6200:RIAS
6145:UTAU
5938:Orca
3582:720°
2301:JSML
2255:and
2093:The
2082:E6,
1896:and
1874:and
1862:and
1760:, a
1678:The
1597:The
1583:and
1463:and
1351:any
1349:cite
1250:VICE
1196:and
1053:NeXT
1005:Sega
999:toy
984:and
793:and
727:and
710:and
520:MUSA
492:and
6959:UBY
5740:or
2463:of
2289:XML
2217:In
2126:GPS
2114:'s
2112:IBM
2008:W3C
1972:or
1758:say
1724:'s
1617:SAM
1362:by
1227:An
1091:),
900:"r"
861:or
496:at
480:of
333:of
240:or
173:or
157:or
128:TTS
118:or
37:on
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.