
NETtalk (artificial neural network)


NETtalk is an artificial neural network. It is the result of research carried out in the mid-1980s by Terrence Sejnowski and Charles Rosenberg. The intent behind NETtalk was to construct simplified models that might shed light on the complexity of learning human-level cognitive tasks, and their implementation as a connectionist model that could also learn to perform a comparable task. The authors trained it in two ways, once with Boltzmann machine learning and once with backpropagation.

NETtalk is a program that learns to pronounce written English text by being shown text as input and matching phonetic transcriptions for comparison.

Achievements and limitations

The network was trained on a large number of English words and their corresponding pronunciations, and is able to generate pronunciations for unseen words with a high level of accuracy. The success of NETtalk inspired further research in pronunciation generation and speech synthesis, and demonstrated the potential of neural networks for solving complex NLP problems. The output of the network was a stream of phonemes, which fed into DECtalk to produce audible speech. It achieved popular success, appearing on the Today show.

NETtalk was created to explore the mechanisms of learning to correctly pronounce English text. The authors note that learning to read involves a complex mechanism involving many parts of the human brain. NETtalk does not specifically model the image-processing stages and letter recognition of the visual cortex. Rather, it assumes that the letters have been pre-classified and recognized, and that these letter sequences comprising words are then shown to the neural network during training and during performance testing. It is NETtalk's task to learn proper associations between the correct pronunciation and a given sequence of letters, based on the context in which the letters appear. In other words, NETtalk learns to use the letters surrounding the currently pronounced phoneme as cues to its intended phonemic mapping.
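The context-window idea can be illustrated with a short sketch, given here in Python purely for illustration and not part of the original system: a fixed window of letters is slid over the text, and the centre letter of each window is the one whose phoneme is to be produced. The seven-character width matches the input layer described under Architecture below.

    # Sketch: build 7-character context windows over the input text.
    # The centre character of each window is the one whose phoneme is
    # predicted; three characters of context are taken on each side,
    # padded with word boundaries (spaces) at the edges of the text.

    def context_windows(text, half_width=3):
        padded = " " * half_width + text + " " * half_width
        width = 2 * half_width + 1
        for i in range(len(text)):
            yield padded[i:i + width]   # 7-character window; window[half_width] is the current letter

    for window in context_windows("a cat"):
        print(repr(window))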
Architecture

The network had three layers and 18,629 adjustable weights, large by the standards of 1986. There were worries that it would overfit the dataset, but it was trained successfully. The dataset was a 20,000-word subset of the Brown Corpus, with a manually annotated phoneme and stress mark for each letter.

The input of the network has 203 units, divided into 7 groups of 29 units each. Each group is a one-hot encoding of one character. There are 29 possible characters: 26 letters, comma, period, and word boundary (whitespace).

The hidden layer has 80 units.

The output has 26 units. 21 units encode articulatory features (point of articulation, voicing, vowel height, etc.) of phonemes, and 5 units encode stress and syllable boundaries.

The development process was described in a 1993 interview: it took three months to create the training dataset, but only a few days to train the network.
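As a rough illustration only, the layer sizes above can be wired together as a plain feed-forward pass. The alphabet ordering, sigmoid activation, random untrained weights and the use of NumPy are assumptions of this sketch; the original network's training procedure and exact parameter count are not reproduced here.

    # Minimal sketch of the input encoding and layer sizes described above:
    # 7 groups of 29 one-hot units -> 80 hidden units -> 26 output units.
    import numpy as np
    import string

    # 29 symbols: 26 letters, comma, period, and word boundary (whitespace);
    # the ordering is an arbitrary choice for this sketch.
    ALPHABET = string.ascii_lowercase + ",. "
    CHAR_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}

    def encode_window(window):
        """Encode a 7-character window as a 7 x 29 = 203-dimensional one-hot vector."""
        x = np.zeros(len(window) * len(ALPHABET))
        for slot, ch in enumerate(window):
            x[slot * len(ALPHABET) + CHAR_INDEX[ch]] = 1.0
        return x

    rng = np.random.default_rng(0)
    N_INPUT, N_HIDDEN, N_OUTPUT = 7 * 29, 80, 26   # 203 input units

    W1 = rng.normal(scale=0.1, size=(N_INPUT, N_HIDDEN))
    W2 = rng.normal(scale=0.1, size=(N_HIDDEN, N_OUTPUT))
    b1, b2 = np.zeros(N_HIDDEN), np.zeros(N_OUTPUT)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def forward(window):
        """One untrained forward pass: 26 outputs = 21 articulatory features + 5 stress/boundary units."""
        hidden = sigmoid(encode_window(window) @ W1 + b1)
        return sigmoid(hidden @ W2 + b2)

    print(forward(" a cat ").shape)   # (26,)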
References

Sejnowski, Terrence J., and Charles R. Rosenberg. "Parallel networks that learn to pronounce English text." Complex Systems 1.1 (1987): 145–168.
Dutoit, Thierry (30 November 2001). An Introduction to Text-to-Speech Synthesis. Springer Science & Business Media. pp. 123–. ISBN 978-1-4020-0369-1.
Hinton, Geoffrey (1991). Connectionist Symbol Processing (First ed.). The MIT Press. pp. 161–163. ISBN 0-262-58106-X.
Sejnowski, Terrence J. (2018). The Deep Learning Revolution. Cambridge, Massachusetts; London, England: The MIT Press. ISBN 978-0-262-03803-4.
Talking Nets: An Oral History of Neural Networks. The MIT Press. 2000-02-28. ISBN 978-0-262-26715-1.

External links

Original NETtalk training set
New York Times article about NETtalk

