NETtalk

[Figure: NETtalk structure.]

NETtalk is an artificial neural network. It is the result of research carried out in the mid-1980s by Terrence Sejnowski and Charles Rosenberg. The intent behind NETtalk was to construct simplified models that might shed light on the complexity of learning human-level cognitive tasks, and their implementation as a connectionist model that could also learn to perform a comparable task. The authors trained it in two ways, once by Boltzmann machine and once by backpropagation.

NETtalk is a program that learns to pronounce written English text by being shown text as input and matching phonetic transcriptions for comparison.

The network was trained on a large amount of English words and their corresponding pronunciations, and is able to generate pronunciations for unseen words with a high level of accuracy. The success of the NETtalk network inspired further research in the field of pronunciation generation and speech synthesis and demonstrated the potential of neural networks for solving complex NLP problems. The output of the network was a stream of phonemes, which fed into DECtalk to produce audible speech. It achieved popular success, appearing on the Today show.

The development process was described in a 1993 interview. It took three months to create the training dataset, but only a few days to train the network.

Architecture

The network had three layers and 18,629 adjustable weights, large by the standards of 1986. There were worries that it would overfit the dataset, but it was trained successfully. The dataset was a 20,000-word subset of the Brown Corpus, with manually annotated phoneme and stress for each letter.

The input of the network has 203 units, divided into 7 groups of 29 units each. Each group is a one-hot encoding of one character. There are 29 possible characters: 26 letters, comma, period, and word boundary (whitespace).

The hidden layer has 80 units.

The output has 26 units. 21 units encode for articulatory features (point of articulation, voicing, vowel height, etc.) of phonemes, and 5 units encode for stress and syllable boundaries.
Achievements and limitations

NETtalk was created to explore the mechanisms of learning to correctly pronounce English text. The authors note that learning to read involves a complex mechanism involving many parts of the human brain. NETtalk does not specifically model the image-processing stages and letter recognition of the visual cortex. Rather, it assumes that the letters have been pre-classified and recognized, and these letter sequences comprising words are then shown to the neural network during training and during performance testing. It is NETtalk's task to learn proper associations between the correct pronunciation and a given sequence of letters, based on the context in which the letters appear. In other words, NETtalk learns to use the letters around the currently pronounced phoneme that provide cues as to its intended phonemic mapping.
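The windowed presentation of letter context described above can be sketched as a simple sliding window that centres each letter of a word between three characters of context on either side. This is a minimal illustrative sketch; the helper name is my own, not from the original NETtalk code.

```python
# Illustrative sketch: slide a 7-character window over a word so the network
# pronounces the centre letter using 3 letters of context on either side.
# Word boundaries are padded with spaces, matching the input alphabet.

def centred_windows(word, size=7):
    """Yield (centre_letter, window) pairs, padding with word boundaries."""
    half = size // 2
    padded = " " * half + word + " " * half
    for i, letter in enumerate(word):
        yield letter, padded[i:i + size]

for letter, window in centred_windows("cat"):
    print(letter, repr(window))
```

Each yielded window would be one-hot encoded into the 203 input units, with the network's output giving the phoneme and stress for the centre letter.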
References

Sejnowski, Terrence J., and Charles R. Rosenberg. "Parallel networks that learn to pronounce English text." Complex Systems 1.1 (1987): 145-168.
Thierry Dutoit (30 November 2001). An Introduction to Text-to-Speech Synthesis. Springer Science & Business Media. pp. 123–. ISBN 978-1-4020-0369-1.
Hinton, Geoffrey (1991). Connectionist Symbol Processing (First ed.). The MIT Press. pp. 161–163. ISBN 0-262-58106-X.
Sejnowski, Terrence J. (2018). The Deep Learning Revolution. Cambridge, Massachusetts; London, England: The MIT Press. ISBN 978-0-262-03803-4.
Talking Nets: An Oral History of Neural Networks. The MIT Press. 2000-02-28. ISBN 978-0-262-26715-1.

External links

Original NETtalk training set
New York Times article about NETtalk

Categories: Artificial neural networks; Speech synthesis; Artificial intelligence stubs