Wake-sleep algorithm

The wake-sleep algorithm is an unsupervised learning algorithm for deep generative models, especially Helmholtz machines. The algorithm is similar to the expectation-maximization algorithm, and it optimizes the model likelihood for observed data. The name of the algorithm derives from its use of two learning phases, the “wake” phase and the “sleep” phase, which are performed alternately. It can be conceived as a model for learning in the brain, but it is also applied in machine learning.

Description

The goal of the wake-sleep algorithm is to find a hierarchical representation of observed data. In a graphical representation of the algorithm, data is applied to the algorithm at the bottom, while higher layers form gradually more abstract representations. Between each pair of layers are two sets of weights: recognition weights, which define how representations are inferred from the data, and generative weights, which define how these representations relate to the data.

[Figure: Layers of the neural network. R and G are the recognition and generative weights used by the wake-sleep algorithm to modify data inside the layers.]
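For concreteness, this layered structure can be sketched in a few lines of Python. This is a minimal illustration, not the original formulation: the layer sizes, the use of binary stochastic units, and the omission of bias terms are all simplifying assumptions of the sketch.

    import numpy as np

    rng = np.random.default_rng(0)

    # Sizes of the data layer (bottom) and one hidden layer above it;
    # the values are arbitrary, chosen only for illustration.
    n_visible, n_hidden = 6, 3

    # Between the pair of layers sit two separate sets of weights:
    # R: recognition weights, carrying activity upward (data -> representation)
    # G: generative weights, carrying activity downward (representation -> data)
    R = rng.normal(0.0, 0.1, size=(n_hidden, n_visible))
    G = rng.normal(0.0, 0.1, size=(n_visible, n_hidden))

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def sample(p):
        # Binary stochastic units: each fires with its given probability.
        return (rng.random(p.shape) < p).astype(float)

    d = sample(np.full(n_visible, 0.5))   # stand-in for one observed datum
    h = sample(sigmoid(R @ d))            # upward (recognition) pass
    reconstruction = sigmoid(G @ h)       # downward (generative) pass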
Training

Training consists of two phases – the “wake” phase and the “sleep” phase – which are performed alternately. It has been proven that this learning algorithm is convergent.

The “wake” phase

Neurons are fired by the recognition connections (leading from what would be the input to what would be the output). The generative connections (leading from outputs back to inputs) are then modified to increase the probability that they would recreate the correct activity in the layer below – closer to the actual data from sensory input.
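A minimal sketch of one wake-phase update, using the same toy setup as above (the numpy helpers are repeated so the snippet runs on its own). The local delta rule is the standard wake-sleep generative update, but the learning rate and layer sizes are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    sample = lambda p: (rng.random(p.shape) < p).astype(float)

    n_visible, n_hidden, lr = 6, 3, 0.05
    R = rng.normal(0.0, 0.1, (n_hidden, n_visible))  # recognition (upward)
    G = rng.normal(0.0, 0.1, (n_visible, n_hidden))  # generative (downward)

    def wake_phase(d, R, G, lr):
        # Drive the hidden layer bottom-up through the recognition weights.
        h = sample(sigmoid(R @ d))
        # Let the generative weights predict the layer below from h ...
        p = sigmoid(G @ h)
        # ... and nudge them (delta rule) so the prediction moves toward
        # the activity that actually occurred in the layer below.
        G += lr * np.outer(d - p, h)
        return G

    d = sample(np.full(n_visible, 0.5))  # stand-in for a sensory datum
    G = wake_phase(d, R, G, lr)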
The “sleep” phase

The process is reversed in the “sleep” phase – neurons are fired by the generative connections, while the recognition connections are modified to increase the probability that they would recreate the correct activity in the layer above – further from the actual data from sensory input.
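The mirror-image update can be sketched the same way. Here a uniform prior over the hidden units stands in for the learned top-level generative biases, which this toy model omits; that simplification, like the sizes and learning rate, is an assumption of the sketch.

    import numpy as np

    rng = np.random.default_rng(1)
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    sample = lambda p: (rng.random(p.shape) < p).astype(float)

    n_visible, n_hidden, lr = 6, 3, 0.05
    R = rng.normal(0.0, 0.1, (n_hidden, n_visible))  # recognition (upward)
    G = rng.normal(0.0, 0.1, (n_visible, n_hidden))  # generative (downward)

    def sleep_phase(R, G, lr):
        # "Dream": fantasize a hidden cause (uniform prior as a stand-in
        # for learned top-level biases) ...
        h = sample(np.full(n_hidden, 0.5))
        # ... and run it top-down through the generative weights.
        d = sample(sigmoid(G @ h))
        # The recognition weights try to recover the known cause h from
        # the dreamed datum d, and are nudged toward doing so (delta rule).
        q = sigmoid(R @ d)
        R += lr * np.outer(h - q, d)
        return R

    R = sleep_phase(R, G, lr)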
Extensions

Since the recognition network is limited in its flexibility, it might not be able to approximate the posterior distribution of the latent variables well. To approximate the posterior more closely, it is possible to employ importance sampling, with the recognition network as the proposal distribution. This improved approximation of the posterior also improves the overall performance of the model.
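A sketch of this idea, in the spirit of the reweighted wake-sleep scheme cited below: several hidden configurations are proposed by the recognition network, reweighted by how well the generative model explains them, and the generative update is averaged under those weights. The uniform prior, the sample count K, and all sizes are, again, illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(2)
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    sample = lambda p: (rng.random(p.shape) < p).astype(float)

    n_visible, n_hidden, lr = 6, 3, 0.05
    R = rng.normal(0.0, 0.1, (n_hidden, n_visible))
    G = rng.normal(0.0, 0.1, (n_visible, n_hidden))

    def bernoulli_logpmf(x, p):
        # Log-probability of a binary vector under independent Bernoullis.
        p = np.clip(p, 1e-7, 1.0 - 1e-7)
        return float(np.sum(x * np.log(p) + (1.0 - x) * np.log(1.0 - p)))

    def importance_weighted_wake(d, R, G, lr, K=10):
        q = sigmoid(R @ d)          # recognition net as the proposal q(h | d)
        hs, logw = [], []
        for _ in range(K):
            h = sample(q)
            # Unnormalized importance weight: p(h) p(d | h) / q(h | d),
            # with a uniform prior p(h) standing in for learned biases.
            log_p = (bernoulli_logpmf(h, np.full(n_hidden, 0.5))
                     + bernoulli_logpmf(d, sigmoid(G @ h)))
            logw.append(log_p - bernoulli_logpmf(h, q))
            hs.append(h)
        w = np.exp(np.array(logw) - max(logw))
        w /= w.sum()                # self-normalized importance weights
        grad = np.zeros_like(G)     # generative update, averaged under w
        for wk, h in zip(w, hs):
            grad += wk * np.outer(d - sigmoid(G @ h), h)
        G += lr * grad
        return G

    d = sample(np.full(n_visible, 0.5))
    G = importance_weighted_wake(d, R, G, lr)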
See also

Restricted Boltzmann machine, a type of neural net that is trained with a conceptually similar algorithm.
Helmholtz machine, a neural network model trained by the wake-sleep algorithm.

References

Hinton, Geoffrey E.; Dayan, Peter; Frey, Brendan J.; Neal, Radford (1995-05-26). "The wake-sleep algorithm for unsupervised neural networks". Science. 268 (5214): 1158–1161. Bibcode:1995Sci...268.1158H. doi:10.1126/science.7761831. PMID 7761831. S2CID 871473.
Dayan, Peter. "Helmholtz Machines and Wake-Sleep Learning" (PDF). Retrieved 2015-11-01.
Ikeda, Shiro; Amari, Shun-ichi; Nakahara, Hiroyuki (1998). "Convergence of the Wake-Sleep Algorithm". Advances in Neural Information Processing Systems. 11. MIT Press.
Frey, Brendan J.; Hinton, Geoffrey E.; Dayan, Peter (1996-05-01). "Does the wake-sleep algorithm produce good density estimators?" (PDF). Advances in Neural Information Processing Systems.
Katayama, Katsuki; Ando, Masataka; Horiguchi, Tsuyoshi (2004-04-01). "Models of MT and MST areas using wake–sleep algorithm". Neural Networks. 17 (3): 339–351. doi:10.1016/j.neunet.2003.07.004. PMID 15037352.
Bornschein, Jörg; Bengio, Yoshua (2014-06-10). "Reweighted Wake-Sleep". arXiv:1406.2751.
Maei, Hamid Reza (2007-01-25). "Wake-sleep algorithm for representational learning". University of Montreal. Retrieved 2011-11-01.
Neal, Radford M.; Dayan, Peter (1996-11-24). "Factor Analysis Using Delta Rules Wake-Sleep Learning" (PDF). University of Toronto. Retrieved 2015-11-01.

Category: Machine learning algorithms