In natural language processing, a word embedding is a representation of a word, used in text analysis. Typically, the representation is a real-valued vector that encodes the meaning of the word in such a way that words closer together in the vector space are expected to be similar in meaning. Word embeddings can be obtained using language modeling and feature learning techniques, in which words or phrases from the vocabulary are mapped to vectors of real numbers.

Methods to generate this mapping include neural networks, dimensionality reduction on the word co-occurrence matrix, probabilistic models, explainable knowledge-base methods, and explicit representation in terms of the context in which words appear.

Word and phrase embeddings, when used as the underlying input representation, have been shown to boost performance in NLP tasks such as syntactic parsing and sentiment analysis.
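The notion of "closer in the vector space" is usually made precise with cosine similarity. The following minimal sketch uses invented three-dimensional vectors and a toy vocabulary purely for illustration; real embeddings are learned and typically have hundreds of dimensions.

```python
# Toy illustration of "closer in the vector space means more similar in meaning".
# The 3-dimensional vectors below are invented for the example.
import numpy as np

embeddings = {
    "king":  np.array([0.80, 0.65, 0.10]),
    "queen": np.array([0.78, 0.70, 0.12]),
    "apple": np.array([0.10, 0.05, 0.90]),
}

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

query = embeddings["king"]
neighbours = sorted(
    ((w, cosine(query, v)) for w, v in embeddings.items() if w != "king"),
    key=lambda pair: pair[1], reverse=True)
print(neighbours)  # "queen" ranks above "apple"
```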
Development and history of the approach

In distributional semantics, a quantitative methodological approach to understanding meaning in observed language, word embeddings or semantic feature space models have been used as a knowledge representation for some time. Such models aim to quantify and categorize semantic similarities between linguistic items based on their distributional properties in large samples of language data. The underlying idea that "a word is characterized by the company it keeps" was proposed in a 1957 article by John Rupert Firth, but the approach also has roots in contemporaneous work on search systems and in cognitive psychology.

The notion of a semantic space with lexical items (words or multi-word terms) represented as vectors or embeddings arises from the computational challenge of capturing distributional characteristics and using them in practice to measure similarity between words, phrases, or entire documents. The first generation of semantic space models was the vector space model for information retrieval. Implemented in their simplest form, such vector space models for words and their distributional data result in a very sparse vector space of high dimensionality (cf. curse of dimensionality). Reducing the number of dimensions using linear-algebraic methods such as singular value decomposition led to the introduction of latent semantic analysis in the late 1980s and the random indexing approach for collecting word co-occurrence contexts. In 2000, Bengio et al. reduced the high dimensionality of word representations in contexts by "learning a distributed representation for words" in a series of papers on neural probabilistic language models.
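A minimal sketch of the dimensionality-reduction step just described: starting from a sparse word-context count matrix, a truncated singular value decomposition yields dense, low-dimensional word vectors. The tiny count matrix below is invented for illustration.

```python
# Latent-semantic-analysis-style reduction of a word-context count matrix.
import numpy as np

# rows: words, columns: contexts (here, documents); counts are made up
counts = np.array([
    [2, 0, 1, 0],   # "ship"
    [1, 0, 2, 0],   # "boat"
    [0, 3, 0, 1],   # "tree"
], dtype=float)

U, S, Vt = np.linalg.svd(counts, full_matrices=False)
k = 2                                # keep the top-k latent dimensions
word_vectors = U[:, :k] * S[:k]      # dense k-dimensional word embeddings
print(word_vectors)                  # "ship" and "boat" end up close together
```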
A study published in NeurIPS (NIPS) 2002 introduced the use of both word and document embeddings by applying kernel CCA to bilingual (and multilingual) corpora, also providing an early example of self-supervised learning of word embeddings.

Word embeddings come in two different styles: one in which words are expressed as vectors of co-occurring words, and another in which words are expressed as vectors of the linguistic contexts in which the words occur; these different styles are studied in Lavelli et al. (2004). Roweis and Saul published in Science how to use "locally linear embedding" (LLE) to discover representations of high-dimensional data structures. Most new word embedding techniques after about 2005 rely on a neural network architecture instead of more probabilistic and algebraic models, following foundational work by Yoshua Bengio and colleagues.
The approach was adopted by many research groups after theoretical advances around 2010 on the quality of vectors and the training speed of the model, and after hardware advances allowed a broader parameter space to be explored profitably. In 2013, a team at Google led by Tomas Mikolov created word2vec, a word embedding toolkit that can train vector space models faster than previous approaches. The word2vec approach has been widely used in experimentation, and it was instrumental in raising interest in word embeddings as a technology, moving the research strand out of specialised research into broader experimentation and eventually paving the way for practical application.
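As a concrete illustration, the sketch below trains a tiny skip-gram word2vec model with the gensim library; it assumes gensim version 4 or later is installed, and the three-sentence corpus is invented purely for the example.

```python
# Training a small skip-gram word2vec model with gensim (gensim >= 4.0 assumed).
from gensim.models import Word2Vec

sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["the", "farmer", "grows", "apples"],
]

model = Word2Vec(
    sentences,
    vector_size=50,   # dimensionality of the embeddings
    window=2,         # context window size
    min_count=1,      # keep every word in this tiny corpus
    sg=1,             # 1 = skip-gram, 0 = CBOW
    epochs=200,
)

print(model.wv["king"][:5])                    # first few vector components
print(model.wv.most_similar("king", topn=2))   # nearest neighbours in the space
```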
Polysemy and homonymy

Historically, one of the main limitations of static word embeddings or word vector space models is that words with multiple meanings are conflated into a single representation (a single vector in the semantic space). In other words, polysemy and homonymy are not handled properly. For example, in the sentence "The club I tried yesterday was great!", it is not clear whether the term club relates to the sense of club sandwich, clubhouse, golf club, or any other sense the word might have. The need to accommodate multiple meanings per word in separate vectors (multi-sense embeddings) is the motivation for several contributions in NLP that split single-sense embeddings into multi-sense ones.

Most approaches that produce multi-sense embeddings can be divided into two main categories by their word-sense representation: unsupervised and knowledge-based. Based on word2vec skip-gram, Multi-Sense Skip-Gram (MSSG) performs word-sense discrimination and embedding simultaneously, improving its training time, while assuming a specific number of senses for each word; in the Non-Parametric Multi-Sense Skip-Gram (NP-MSSG) this number can vary with each word. Combining the prior knowledge of lexical databases (e.g., WordNet, ConceptNet, BabelNet) with word embeddings and word sense disambiguation, Most Suitable Sense Annotation (MSSA) labels word senses through an unsupervised, knowledge-based approach, considering a word's context in a pre-defined sliding window. Once the words are disambiguated, they can be used in a standard word embedding technique, so multi-sense embeddings are produced. The MSSA architecture also allows the disambiguation and annotation process to be performed recurrently in a self-improving manner.

The use of multi-sense embeddings is known to improve performance in several NLP tasks, such as part-of-speech tagging, semantic relation identification, semantic relatedness, named entity recognition and sentiment analysis.

As of the late 2010s, contextually-meaningful embeddings such as ELMo and BERT have been developed. Unlike static word embeddings, these embeddings are at the token level: each occurrence of a word has its own embedding. They better reflect the multi-sense nature of words, because occurrences of a word in similar contexts are situated in similar regions of BERT's embedding space.
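A minimal sketch of what "token-level" means in practice, assuming the Hugging Face transformers library and the publicly available bert-base-uncased checkpoint: the two occurrences of "club" below receive different vectors because their contexts differ.

```python
# Contextual (token-level) embeddings: each occurrence of "club" gets its own
# vector. Assumes torch, transformers and the bert-base-uncased checkpoint.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def club_vector(sentence):
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]     # (tokens, 768)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return hidden[tokens.index("club")]

v_golf = club_vector("he bought a new golf club yesterday")
v_sandwich = club_vector("she ordered a club sandwich for lunch")
print(float(torch.cosine_similarity(v_golf, v_sandwich, dim=0)))
# below 1.0: the two occurrences are mapped to different points in the space
```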
For biological sequences: BioVectors

Word embeddings for n-grams in biological sequences (e.g. DNA, RNA, and proteins) for bioinformatics applications have been proposed by Asgari and Mofrad. The representation is named bio-vectors (BioVec) for biological sequences in general, with protein-vectors (ProtVec) for proteins (amino-acid sequences) and gene-vectors (GeneVec) for gene sequences, and it can be widely used in deep-learning applications in proteomics and genomics. The results presented by Asgari and Mofrad suggest that BioVectors can characterize biological sequences in terms of biochemical and biophysical interpretations of the underlying patterns.
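A simplified sketch of the BioVec idea: overlapping 3-grams of a protein sequence are treated as "words" and fed to an ordinary word2vec model. The sequences and settings below are invented for illustration and do not reproduce Asgari and Mofrad's exact training setup; gensim version 4 or later is assumed.

```python
# BioVec/ProtVec-style sketch: protein 3-grams as the vocabulary of word2vec.
from gensim.models import Word2Vec

sequences = ["MKTAYIAKQR", "MKTAFIAKQR", "GGLVPRGSHM"]   # toy amino-acid strings

def three_grams(seq):
    # overlapping 3-grams, e.g. "MKTAY" -> ["MKT", "KTA", "TAY"]
    return [seq[i:i + 3] for i in range(len(seq) - 2)]

corpus = [three_grams(seq) for seq in sequences]
model = Word2Vec(corpus, vector_size=20, window=5, min_count=1, epochs=200)
print(model.wv.most_similar("MKT", topn=3))
```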
Game design

Word embeddings with applications in game design have been proposed by Rabii and Cook as a way to discover emergent gameplay using logs of gameplay data. The process requires transcribing the actions that occur during a game within a formal language and then using the resulting text to create word embeddings. The results presented by Rabii and Cook suggest that the resulting vectors can capture expert knowledge about games like chess that is not explicitly stated in the game's rules.
Sentence embeddings

Main article: Sentence embedding

The idea has been extended to embeddings of entire sentences or even documents, e.g. in the form of the thought vectors concept. In 2015, some researchers suggested "skip-thought vectors" as a means to improve the quality of machine translation. A more recent and popular approach for representing sentences is Sentence-BERT, or SentenceTransformers, which modifies pre-trained BERT with the use of siamese and triplet network structures.
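A minimal sketch of sentence embeddings with the sentence-transformers package (the SentenceTransformers library mentioned above); it assumes the package and the publicly released all-MiniLM-L6-v2 checkpoint are available.

```python
# Sentence embeddings with sentence-transformers; one vector per sentence.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
sentences = [
    "A man is playing a guitar.",
    "Someone is playing an instrument.",
    "The stock market fell sharply today.",
]
vectors = model.encode(sentences)

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(vectors[0], vectors[1]))   # semantically related: higher
print(cosine(vectors[0], vectors[2]))   # unrelated: lower
```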
Software

Software for training and using word embeddings includes Tomáš Mikolov's Word2vec, Stanford University's GloVe, GN-GloVe, Flair embeddings, AllenNLP's ELMo, BERT, fastText, Gensim, Indra, and Deeplearning4j. Principal component analysis (PCA) and t-distributed stochastic neighbour embedding (t-SNE) are both used to reduce the dimensionality of word vector spaces and to visualize word embeddings and clusters.
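A sketch of the visualization step just mentioned, assuming scikit-learn and matplotlib are installed. Random vectors stand in for trained embeddings here; sklearn.manifold.TSNE can be swapped in for PCA in the same way.

```python
# Projecting word vectors to 2-D with PCA and plotting them.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
words = ["king", "queen", "man", "woman", "apple", "orange"]
vectors = rng.normal(size=(len(words), 100))   # placeholder 100-d embeddings

coords = PCA(n_components=2).fit_transform(vectors)
plt.scatter(coords[:, 0], coords[:, 1])
for word, (x, y) in zip(words, coords):
    plt.annotate(word, (x, y))
plt.title("Word embeddings projected to 2-D with PCA")
plt.show()
```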
Examples of application

For instance, fastText is also used to calculate word embeddings for the text corpora in Sketch Engine that are available online.
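A small sketch of computing fastText-style embeddings with gensim's FastText class (gensim version 4 or later assumed; the toy corpus is invented). Because fastText builds word vectors from character n-grams, it can also produce vectors for words that never appeared in the training corpus.

```python
# fastText-style subword embeddings via gensim.
from gensim.models import FastText

sentences = [["the", "corpus", "is", "searchable"],
             ["the", "corpora", "are", "searchable"]]
model = FastText(sentences, vector_size=30, window=2, min_count=1, epochs=100)
print(model.wv["corpuses"][:5])   # out-of-vocabulary word still gets a vector
```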
Ethical implications

Word embeddings may contain the biases and stereotypes contained in the trained dataset. As Bolukbasi et al. point out in the 2016 paper "Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings", a publicly available (and popular) word2vec embedding trained on Google News texts (a commonly used data corpus), which consist of text written by professional journalists, still shows disproportionate word associations reflecting gender and racial biases when word analogies are extracted. For example, one of the analogies generated using this word embedding is "man is to computer programmer as woman is to homemaker".
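Analogies of this kind are typically extracted by vector arithmetic over a pretrained embedding, as in the hedged sketch below. It assumes the gensim package; api.load downloads the large (roughly 1.6 GB) pretrained Google News word2vec model, and the phrase token computer_programmer is assumed to be in that model's vocabulary.

```python
# Analogy extraction by vector arithmetic over a pretrained embedding.
import gensim.downloader as api

vectors = api.load("word2vec-google-news-300")   # pretrained KeyedVectors

# "man is to computer_programmer as woman is to ?"
print(vectors.most_similar(positive=["woman", "computer_programmer"],
                           negative=["man"], topn=3))
```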
Research by Jieyu Zhao et al. shows that applying these trained word embeddings without careful oversight likely perpetuates existing bias in society, which is introduced through unaltered training data. Furthermore, word embeddings can even amplify these biases.
See also

Brown clustering
Distributional–relational database
Sentence embedding
References

Agre, Gennady; Petrov, Daniel; Keskinova, Simona (2019). "Word Sense Disambiguation Studio: A Flexible System for WSD Feature Extraction". Information 10 (3): 97.
Akbik, Alan; Blythe, Duncan; Vollgraf, Roland (2018). "Contextual String Embeddings for Sequence Labeling". Proceedings of the 27th International Conference on Computational Linguistics. Santa Fe, New Mexico, USA: Association for Computational Linguistics, pp. 1638–1649.
Asgari, Ehsaneddin; Mofrad, Mohammad R.K. (2015). "Continuous Distributed Representation of Biological Sequences for Deep Proteomics and Genomics". PLOS ONE 10 (11): e0141287.
Bengio, Yoshua; Ducharme, Réjean; Vincent, Pascal (2000). "A Neural Probabilistic Language Model". NeurIPS.
Bengio, Yoshua; Ducharme, Réjean; Vincent, Pascal; Jauvin, Christian (2003). "A Neural Probabilistic Language Model". Journal of Machine Learning Research 3: 1137–1155.
Bengio, Yoshua; Schwenk, Holger; Senécal, Jean-Sébastien; Morin, Fréderic; Gauvain, Jean-Luc (2006). "A Neural Probabilistic Language Model". Studies in Fuzziness and Soft Computing. Vol. 194. Springer, pp. 137–186.
Bolukbasi, Tolga; Chang, Kai-Wei; Zou, James; Saligrama, Venkatesh; Kalai, Adam (2016). "Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings". arXiv:1607.06520.
Camacho-Collados, Jose; Pilehvar, Mohammad Taher (2018). "From Word to Sense Embeddings: A Survey on Vector Representations of Meaning". arXiv:1805.04032.
Devlin, Jacob; Chang, Ming-Wei; Lee, Kenton; Toutanova, Kristina (2019). "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding". Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics, pp. 4171–4186.
Dieng, Adji B.; Ruiz, Francisco J. R.; Blei, David M. (2020). "Topic Modeling in Embedding Spaces". Transactions of the Association for Computational Linguistics 8: 439–453. arXiv:1907.04907.
Dubin, David (2004). "The most influential paper Gerard Salton never wrote".
"Embedding Viewer". Lexical Computing. Archived from the original on 8 February 2018.
Firth, J.R. (1957). "A synopsis of linguistic theory 1930–1955". Studies in Linguistic Analysis: 1–32. Reprinted in F.R. Palmer, ed. (1968). Selected Papers of J.R. Firth 1952–1959. London: Longman.
Ghassemi, Mohammad; Mark, Roger; Nemati, Shamim (2015). "A visualization of evolving clinical sentiment using vector representations of clinical notes". 2015 Computing in Cardiology Conference (CinC), pp. 629–632.
Globerson, Amir (2007). "Euclidean Embedding of Co-occurrence Data". Journal of Machine Learning Research.
Huang, Eric (2012). "Improving word representations via global context and multiple word prototypes". OCLC 857900050.
Jurafsky, Daniel; Martin, James H. (2000). Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Upper Saddle River, N.J.: Prentice Hall.
Kanerva, Pentti; Kristoferson, Jan; Holst, Anders (2000). "Random Indexing of Text Samples for Latent Semantic Analysis". Proceedings of the 22nd Annual Conference of the Cognitive Science Society, p. 1036. Mahwah, New Jersey: Erlbaum.
Karlgren, Jussi; Sahlgren, Magnus (2001). "From words to understanding". In Uesaka, Yoshinori; Kanerva, Pentti; Asoh, Hideki (eds.), Foundations of Real-World Intelligence. CSLI Publications, pp. 294–308.
Kiros, Ryan; Zhu, Yukun; Salakhutdinov, Ruslan; Zemel, Richard S.; Torralba, Antonio; Urtasun, Raquel; Fidler, Sanja (2015). "Skip-Thought Vectors". arXiv:1506.06726.
Lavelli, Alberto; Sebastiani, Fabrizio; Zanoli, Roberto (2004). "Distributional term representations: an experimental comparison". 13th ACM International Conference on Information and Knowledge Management, pp. 615–624.
Lebret, Rémi; Collobert, Ronan (2013). "Word Embeddings through Hellinger PCA". Conference of the European Chapter of the Association for Computational Linguistics (EACL). arXiv:1312.5542.
Levy, Omer; Goldberg, Yoav (2014). "Linguistic Regularities in Sparse and Explicit Word Representations". CoNLL, pp. 171–180.
Levy, Omer; Goldberg, Yoav (2014). "Neural Word Embedding as Implicit Matrix Factorization". NIPS.
Li, Jiwei; Jurafsky, Dan (2015). "Do Multi-Sense Embeddings Improve Natural Language Understanding?". Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, pp. 1722–1732.
Li, Yitan; Xu, Linli (2015). "Word Embedding Revisited: A New Representation Learning and Explicit Matrix Factorization Perspective". International Joint Conference on Artificial Intelligence (IJCAI).
Lucy, Li; Bamman, David (2021). "Characterizing English variation across social media communities with BERT". Transactions of the Association for Computational Linguistics 9: 538–556.
Luhn, H.P. (1953). "A New Method of Recording and Searching Information". American Documentation 4: 14–16.
Mikolov, Tomas; Sutskever, Ilya; Chen, Kai; Corrado, Greg; Dean, Jeffrey (2013). "Distributed Representations of Words and Phrases and their Compositionality". arXiv:1310.4546.
Mnih, Andriy; Hinton, Geoffrey (2009). "A Scalable Hierarchical Distributed Language Model". Advances in Neural Information Processing Systems 21 (NIPS 2008). Curran Associates, Inc., pp. 1081–1088.
Morin, Fredric; Bengio, Yoshua (2005). "Hierarchical probabilistic neural network language model". In Cowell, Robert G.; Ghahramani, Zoubin (eds.), Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics. PMLR R5, pp. 246–252.
Neelakantan, Arvind; Shankar, Jeevan; Passos, Alexandre; McCallum, Andrew (2014). "Efficient Non-parametric Estimation of Multiple Embeddings per Word in Vector Space". Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics, pp. 1059–1069.
Osgood, C.E.; Suci, G.J.; Tannenbaum, P.H. (1957). The Measurement of Meaning. University of Illinois Press.
Petreski, Davor; Hashim, Ibrahim C. (2022). "Word embeddings are biased. But whose bias are they reflecting?". AI & Society 38 (2): 975–982.
Pires, Telmo; Schlinger, Eva; Garrette, Dan (2019). "How multilingual is Multilingual BERT?". arXiv:1906.01502.
Qureshi, M. Atif; Greene, Derek (2018). "EVE: explainable vector based embedding technique using Knowledge". Journal of Intelligent Information Systems 53: 137–165.
Rabii, Younès; Cook, Michael (2021). "Revealing Game Dynamics via Word Embeddings of Gameplay Data". Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment 17 (1): 187–194.
Reif, Emily; Yuan, Ann; Wattenberg, Martin; Viegas, Fernanda B.; Coenen, Andy; Pearce, Adam; Kim, Been (2019). "Visualizing and measuring the geometry of BERT". Advances in Neural Information Processing Systems 32.
Reimers, Nils; Gurevych, Iryna (2019). "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks". Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 3982–3992.
Reisinger, Joseph; Mooney, Raymond J. (2010). "Multi-Prototype Vector-Space Models of Word Meaning". Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics. Los Angeles, California: Association for Computational Linguistics, pp. 109–117.
Roweis, Sam T.; Saul, Lawrence K. (2000). "Nonlinear Dimensionality Reduction by Locally Linear Embedding". Science 290 (5500): 2323–2326.
Ruas, Terry; Grosky, William; Aizawa, Akiko (2019). "Multi-sense embeddings through a word sense disambiguation process". Expert Systems with Applications 136: 288–303.
Sahlgren, Magnus. "A brief history of word embeddings".
Sahlgren, Magnus (2005). "An Introduction to Random Indexing". Proceedings of the Methods and Applications of Semantic Indexing Workshop at the 7th International Conference on Terminology and Knowledge Engineering (TKE 2005), Copenhagen, Denmark.
Sahlgren, Magnus; Holst, Anders; Kanerva, Pentti (2008). "Permutations as a Means to Encode Order in Word Space". Proceedings of the 30th Annual Conference of the Cognitive Science Society, pp. 1300–1305.
Salton, Gerard (1962). "Some experiments in the generation of word and document associations". Proceedings of the December 4–6, 1962, Fall Joint Computer Conference (AFIPS '62), pp. 234–250.
Salton, Gerard; Wong, A.; Yang, C. S. (1975). "A Vector Space Model for Automatic Indexing". Communications of the ACM 18 (11): 613–620.
Socher, Richard; Bauer, John; Manning, Christopher; Ng, Andrew (2013). "Parsing with compositional vector grammars". Proceedings of the ACL Conference.
Socher, Richard; Perelygin, Alex; Wu, Jean; Chuang, Jason; Manning, Chris; Ng, Andrew; Potts, Chris (2013). "Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank". EMNLP.
Vinkourov, Alexei; Cristianini, Nello; Shawe-Taylor, John (2002). "Inferring a semantic representation of text via cross-language correlation analysis". Advances in Neural Information Processing Systems. Vol. 15.
"word2vec". Google Code Archive.
Zhao, Jieyu; Wang, Tianlu; Yatskar, Mark; Ordonez, Vicente; Chang, Kai-Wei (2017). "Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints". Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 2979–2989.
Zhao, Jieyu; et al. (2018). "Learning Gender-Neutral Word Embeddings". arXiv:1809.01496.