Probabilistic latent semantic analysis

Probabilistic latent semantic analysis (PLSA), also known as probabilistic latent semantic indexing (PLSI, especially in information retrieval circles) is a statistical technique for the analysis of two-mode and co-occurrence data. In effect, one can derive a low-dimensional representation of the observed variables in terms of their affinity to certain hidden variables, just as in latent semantic analysis, from which PLSA evolved.

Compared to standard latent semantic analysis which stems from linear algebra and downsizes the occurrence tables (usually via a singular value decomposition), probabilistic latent semantic analysis is based on a mixture decomposition derived from a latent class model.

Model

[Figure: plate notation representing the PLSA model ("asymmetric" formulation). d is the document index variable, c is a word's topic drawn from the document's topic distribution, P(c|d), and w is a word drawn from the word distribution of this word's topic, P(w|c). The d and w are observable variables, the topic c is a latent variable.]

Considering observations in the form of co-occurrences (w,d) of words and documents, PLSA models the probability of each co-occurrence as a mixture of conditionally independent multinomial distributions:

P(w,d) = \sum_{c} P(c) P(d|c) P(w|c) = P(d) \sum_{c} P(c|d) P(w|c)

with c being the words' topic. Note that the number of topics is a hyperparameter that must be chosen in advance and is not estimated from the data. The first formulation is the symmetric formulation, where w and d are both generated from the latent class c in similar ways (using the conditional probabilities P(d|c) and P(w|c)), whereas the second formulation is the asymmetric formulation, where, for each document d, a latent class is chosen conditionally to the document according to P(c|d), and a word is then generated from that class according to P(w|c). Although we have used words and documents in this example, the co-occurrence of any couple of discrete variables may be modelled in exactly the same way.
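To make the equivalence of the two formulations concrete, here is a minimal numerical sketch in Python; the array names and toy dimensions are illustrative assumptions, not part of the original article:

import numpy as np

rng = np.random.default_rng(0)
n_words, n_docs, n_topics = 5, 4, 2

# Symmetric parameterisation: P(c), P(d|c), P(w|c).
P_c = rng.dirichlet(np.ones(n_topics))               # shape (c,)
P_d_c = rng.dirichlet(np.ones(n_docs), n_topics).T   # shape (d, c), columns sum to 1
P_w_c = rng.dirichlet(np.ones(n_words), n_topics).T  # shape (w, c), columns sum to 1

# Joint distribution: P(w,d) = sum_c P(c) P(d|c) P(w|c).
P_wd_sym = P_w_c @ np.diag(P_c) @ P_d_c.T            # shape (w, d)

# Asymmetric parameterisation obtained by Bayes' rule: P(d) and P(c|d).
P_d = P_d_c @ P_c                                    # P(d) = sum_c P(d|c) P(c)
P_c_d = (P_d_c * P_c).T / P_d                        # P(c|d) = P(d|c) P(c) / P(d)

# Same joint distribution: P(w,d) = P(d) sum_c P(c|d) P(w|c).
P_wd_asym = (P_w_c @ P_c_d) * P_d
assert np.allclose(P_wd_sym, P_wd_asym)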
So, the number of parameters is equal to cd + wc. The number of parameters grows linearly with the number of documents. In addition, although PLSA is a generative model of the documents in the collection it is estimated on, it is not a generative model of new documents.

The parameters of the model are learned using the EM algorithm.
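For the asymmetric formulation, the EM updates have a simple closed form: the E-step computes responsibilities P(c|d,w) proportional to P(c|d) P(w|c), and the M-step re-estimates P(w|c) and P(c|d) from count-weighted responsibilities. A minimal sketch, assuming n_wd is a word-by-document count matrix; the function and variable names are introduced here for illustration:

import numpy as np

def plsa_em(n_wd, n_topics, n_iter=100, seed=0):
    """Fit PLSA (asymmetric formulation) to a word-by-document count matrix by EM."""
    rng = np.random.default_rng(seed)
    n_words, n_docs = n_wd.shape
    # Random column-stochastic initialisation of P(w|c) and P(c|d).
    P_w_c = rng.dirichlet(np.ones(n_words), n_topics).T   # (w, c)
    P_c_d = rng.dirichlet(np.ones(n_topics), n_docs).T    # (c, d)
    for _ in range(n_iter):
        # E-step: responsibilities P(c|d,w) proportional to P(c|d) P(w|c).
        num = P_w_c.T[:, :, None] * P_c_d[:, None, :]     # (c, w, d)
        resp = num / num.sum(axis=0, keepdims=True)
        # M-step: re-estimate from count-weighted responsibilities.
        # (A dense (c, w, d) array is kept for clarity; practical
        # implementations loop only over the nonzero entries of n_wd.)
        weighted = resp * n_wd[None, :, :]                # (c, w, d)
        P_w_c = weighted.sum(axis=2).T                    # numerator of P(w|c)
        P_w_c /= P_w_c.sum(axis=0, keepdims=True)
        P_c_d = weighted.sum(axis=1)                      # numerator of P(c|d)
        P_c_d /= P_c_d.sum(axis=0, keepdims=True)
    return P_w_c, P_c_d

# Toy usage on a random count matrix (illustrative only).
counts = np.random.default_rng(1).integers(0, 5, size=(30, 10))
P_w_c, P_c_d = plsa_em(counts, n_topics=3)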
Application

PLSA may be used in a discriminative setting, via Fisher kernels.

PLSA has applications in information retrieval and filtering, natural language processing, machine learning from text, bioinformatics, and related areas.

It is reported that the aspect model used in the probabilistic latent semantic analysis has severe overfitting problems.
Extensions

- Hierarchical extensions:
  - Asymmetric: MASHA ("Multinomial ASymmetric Hierarchical Analysis")
  - Symmetric: HPLSA ("Hierarchical Probabilistic Latent Semantic Analysis")
- Generative models: The following models have been developed to address an often-criticized shortcoming of PLSA, namely that it is not a proper generative model for new documents.
  - Latent Dirichlet allocation – adds a Dirichlet prior on the per-document topic distribution.
- Higher-order data: Although this is rarely discussed in the scientific literature, PLSA extends naturally to higher-order data (three modes and higher), i.e. it can model co-occurrences over three or more variables. In the symmetric formulation above, this is done simply by adding conditional probability distributions for these additional variables. This is the probabilistic analogue to non-negative tensor factorisation (a worked three-mode form follows this list).
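As a worked three-mode form of the higher-order case (the third observed variable x is notation introduced here for illustration), the symmetric formulation becomes

P(w,d,x) = \sum_{c} P(c) P(w|c) P(d|c) P(x|c)

and estimation proceeds exactly as in the two-mode case, with one additional conditional distribution per extra mode.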
History

This is an example of a latent class model (see references therein), and it is related to non-negative matrix factorization. The present terminology was coined in 1999 by Thomas Hofmann.
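The relation to non-negative matrix factorization made precise by Ding, Li and Peng (cited below) can be sketched in matrix form; the symbols W, S and H are notation introduced here, not from the original:

P = W S H^T, where W_{wc} = P(w|c), S = diag(P(c)), H_{dc} = P(d|c),

so the symmetric PLSA decomposition is a non-negative tri-factorization of the joint-probability matrix in which the columns of W and H are constrained to sum to one.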
See also

- Compound term processing
- Pachinko allocation
- Vector space model
References and notes

1. Thomas Hofmann, Learning the Similarity of Documents: an information-geometric approach to document retrieval and categorization, Advances in Neural Information Processing Systems 12, pp. 914–920, MIT Press, 2000.
2. Pinoli, Pietro; et al. (2013). "Enhanced probabilistic latent semantic analysis with weighting schemes to predict genomic annotations". Proceedings of IEEE BIBE 2013. The 13th IEEE International Conference on BioInformatics and BioEngineering. IEEE. pp. 1–4. doi:10.1109/BIBE.2013.6701702. ISBN 978-147993163-7.
3. Blei, David M.; Andrew Y. Ng; Michael I. Jordan (2003). "Latent Dirichlet Allocation" (PDF). Journal of Machine Learning Research. 3: 993–1022. doi:10.1162/jmlr.2003.3.4-5.993.
4. Alexei Vinokourov and Mark Girolami, A Probabilistic Framework for the Hierarchic Organisation and Classification of Document Collections, in Information Processing and Management, 2002.
5. Eric Gaussier, Cyril Goutte, Kris Popat and Francine Chen, A Hierarchical Model for Clustering and Categorising Documents, Archived 2016-03-04 at the Wayback Machine, in "Advances in Information Retrieval -- Proceedings of the 24th BCS-IRSG European Colloquium on IR Research (ECIR-02)", 2002.
6. Chris Ding, Tao Li, Wei Peng (2006). "Nonnegative Matrix Factorization and Probabilistic Latent Semantic Indexing: Equivalence, Chi-Square Statistic, and a Hybrid Method". AAAI 2006.
7. Chris Ding, Tao Li, Wei Peng (2008). "On the equivalence between Non-negative Matrix Factorization and Probabilistic Latent Semantic Indexing".
8. Thomas Hofmann, Probabilistic Latent Semantic Indexing, Proceedings of the Twenty-Second Annual International SIGIR Conference on Research and Development in Information Retrieval (SIGIR-99), 1999.

External links

- Probabilistic Latent Semantic Analysis
- Complete PLSA DEMO in C#

Categories: Statistical natural language processing | Classification algorithms | Latent variable models | Language modeling