1298:
). The space of documents is then scanned using HDBSCAN, and clusters of similar documents are identified. Next, the centroid of the documents in a cluster is taken as that cluster's topic vector. Finally, top2vec searches the semantic space for word embeddings located near the topic vector to ascertain the 'meaning' of the topic. The word whose embedding is most similar to the topic vector might be assigned as the topic's title, whereas distant word embeddings may be considered unrelated.
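For illustration, the centroid and nearest-word steps can be written in a few lines of numpy. This is a minimal sketch (document vectors, HDBSCAN cluster labels, word vectors and the vocabulary list are assumed to be already available, and all names here are hypothetical), not the top2vec implementation itself:

```python
import numpy as np

def topic_vectors(doc_vecs, cluster_labels):
    # Each topic vector is the centroid of the document embeddings in one HDBSCAN cluster.
    topics = {}
    for label in set(cluster_labels.tolist()):
        if label == -1:                     # HDBSCAN labels noise documents as -1
            continue
        topics[label] = doc_vecs[cluster_labels == label].mean(axis=0)
    return topics

def nearest_words(topic_vec, word_vecs, vocab, k=10):
    # Words whose embeddings lie closest (by cosine similarity) to the topic vector
    # describe the topic; the single closest word could serve as the topic's title.
    sims = word_vecs @ topic_vec / (np.linalg.norm(word_vecs, axis=1) * np.linalg.norm(topic_vec))
    return [vocab[i] for i in np.argsort(-sims)[:k]]
```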
) to refer to biological sequences in general, with protein-vectors (ProtVec) for proteins (amino-acid sequences) and gene-vectors (GeneVec) for gene sequences. This representation can be used widely in applications of machine learning in proteomics and genomics. The results suggest that BioVectors can characterize biological sequences in terms of biochemical and biophysical interpretations of the underlying patterns. A similar variant, dna2vec, has shown that there is correlation between
2794:
(LSA), when it is trained with a medium to large corpus size (more than 10 million words). With a small training corpus, however, LSA showed better performance. Additionally, they show that the best parameter setting depends on the task and the training corpus. Nevertheless, for skip-gram models trained on medium-sized corpora, 50 dimensions, a window size of 15, and 10 negative samples appear to be a good parameter setting.
(OOV) words and morphologically similar words. If the Word2vec model has not encountered a particular word before, it will be forced to use a random vector, which is generally far from its ideal representation. This can be a particular issue in domains like medicine, where synonyms and related words may be used depending on a radiologist's preferred style, and where words may occur infrequently even in a large corpus.
5586:
5566:
2802:
algebraic operations on the vector representations of these words such that the vector representation of "Brother" - "Man" + "Woman" produces a result which is closest to the vector representation of "Sister" in the model. Such relationships can be generated for a range of semantic relations (such as
Country–Capital) as well as syntactic relations (e.g. present tense–past tense).
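With a trained model loaded in the gensim library, for example, the same arithmetic can be queried directly through `most_similar`; a sketch, with a placeholder vector file:

```python
from gensim.models import KeyedVectors

# Load any pre-trained vectors stored in the word2vec format (placeholder file name).
wv = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)

# "brother" - "man" + "woman" should rank "sister" among the top results.
print(wv.most_similar(positive=["brother", "woman"], negative=["man"], topn=3))
```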
(PV-DM), is identical to CBOW except that it also provides a unique document identifier as an additional piece of context. The second architecture, the Distributed Bag of Words version of Paragraph Vector (PV-DBOW), is identical to the skip-gram model except that it attempts to predict the window of surrounding context words from the paragraph identifier instead of from the current word.
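In gensim's Doc2Vec implementation, for instance, the two architectures are selected with the `dm` flag; the following sketch (toy corpus, illustrative parameters only) shows both, together with inference of a vector for an unseen document:

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

corpus = [TaggedDocument(words=text.split(), tags=[i])
          for i, text in enumerate(["the cat sat on the mat",
                                    "dogs chase cats"])]

pv_dm = Doc2Vec(corpus, dm=1, vector_size=100, window=5, min_count=1, epochs=40)    # PV-DM
pv_dbow = Doc2Vec(corpus, dm=0, vector_size=100, min_count=1, epochs=40)            # PV-DBOW

# Estimate an embedding for a new, unseen document.
new_vec = pv_dm.infer_vector("the dog sat".split())
```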
2013:
2819:
When assessing the quality of a vector model, a user may draw on this accuracy test which is implemented in word2vec, or develop their own test set which is meaningful to the corpora which make up the model. This approach offers a more challenging test than simply arguing that the words most similar to a given test word are intuitively plausible.
3145:
2831:
In models using large corpora and a high number of dimensions, the skip-gram model yields the highest overall accuracy, and consistently produces the highest accuracy on semantic relationships, as well as yielding the highest syntactic accuracy in most cases. However, the CBOW is less computationally
2818:
Mikolov et al. (2013) developed an approach to assessing the quality of a word2vec model which draws on the semantic and syntactic patterns discussed above. They developed a set of 8,869 semantic relations and 10,675 syntactic relations which they use as a benchmark to test the accuracy of a model.
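This benchmark is distributed with several word2vec implementations; a copy ships with gensim's test data as questions-words.txt, for example, and can be run against a trained model roughly as follows (a sketch; the vector file is a placeholder):

```python
from gensim.models import KeyedVectors
from gensim.test.utils import datapath

wv = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)

# Returns an overall accuracy plus a per-section breakdown of the
# semantic and syntactic analogy categories.
accuracy, sections = wv.evaluate_word_analogies(datapath("questions-words.txt"))
print(accuracy)
```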
2780:
Levy et al. (2015) show that much of the superior performance of word2vec or similar embeddings in downstream tasks is not a result of the models per se, but of the choice of specific hyperparameters. Transferring these hyperparameters to more 'traditional' approaches yields similar performances in
2666:
doc2vec also has the ability to capture the semantic ‘meanings’ for additional pieces of ‘context’ around words; doc2vec can estimate the semantic embeddings for speakers or speaker attributes, groups, and periods of time. For example, doc2vec has been used to estimate the political positions
2662:
doc2vec estimates the distributed representations of documents much as word2vec estimates representations of words: doc2vec utilizes either of two model architectures, both of which are analogous to the architectures used in word2vec. The first, the Distributed Memory Model of
Paragraph Vectors
2755:
from clinical texts, which include the ambiguity of free-text narrative style, lexical variations, use of ungrammatical and telegraphic phrases, arbitrary ordering of words, and frequent appearance of abbreviations and acronyms. Of particular interest, the IWE model (trained on the one institutional
2003:
The quantity on the left is fast to compute, but the quantity on the right is slow, as it involves summing over the entire vocabulary for each word in the corpus. Furthermore, using gradient ascent to maximize the log-probability requires computing the gradient of the quantity on the right,
1148:
In the continuous skip-gram architecture, the model uses the current word to predict the surrounding window of context words. The skip-gram architecture weighs nearby context words more heavily than more distant context words. According to the authors' note, CBOW is faster while skip-gram does a
2801:
The word embedding approach is able to capture multiple different degrees of similarity between words. Mikolov et al. (2013) found that semantic and syntactic patterns can be reproduced using vector arithmetic. Patterns such as "Man is to Woman as
Brother is to Sister" can be generated through
1144:
The CBOW can be viewed as a ‘fill in the blank’ task, where the word embedding represents the way the word influences the relative probabilities of other words in the context window. Words which are semantically similar should influence these probabilities in similar ways, because semantically
2827:
The use of different model parameters and different corpus sizes can greatly affect the quality of a word2vec model. Accuracy can be improved in a number of ways, including the choice of model architecture (CBOW or Skip-Gram), increasing the training data set, increasing the number of vector
2805:
This facet of word2vec has been exploited in a variety of other contexts. For example, word2vec has been used to map a vector space of words in one language to a vector space constructed from another language. Relationships between translated words in both spaces can be used to assist with
2595:
of sampled negative instances. According to the authors, hierarchical softmax works better for infrequent words while negative sampling works better for frequent words and better with low dimensional vectors. As training epochs increase, hierarchical softmax stops being useful.
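A numpy sketch of the objective that negative sampling optimizes for a single (center word, context word) pair is shown below; this is the standard formulation with a sigmoid over dot products, not the reference implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def negative_sampling_loss(v_center, v_context, v_negatives):
    # The true context word should score high against the center word,
    # while each of the sampled negative words should score low.
    positive = np.log(sigmoid(v_center @ v_context))
    negative = np.sum(np.log(sigmoid(-(v_negatives @ v_center))))
    return -(positive + negative)
```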
2499:
2001:
2835:
Overall, accuracy increases with the number of words used and the number of dimensions. Mikolov et al. report that doubling the amount of training data results in an increase in computational complexity equivalent to doubling the number of vector dimensions.
, top2vec provides canonical 'distance' metrics between two topics, or between a topic and other embeddings (word, document, or otherwise). Together with results from HDBSCAN, users can generate topic hierarchies, or groups of related topics and subtopics.
1841:
2543:
was developed by a team at
Stanford specifically as a competitor, and the original paper noted multiple improvements of GloVe over word2vec. Mikolov argued that the comparison was unfair as GloVe was trained on more data, and that the
2622:
The size of the context window determines how many words before and after a given word are included as context words of the given word. According to the authors' note, the recommended value is 10 for skip-gram and 5 for CBOW.
3378:
2342:
2694:
Furthermore, a user can use the results of top2vec to infer the topics of out-of-sample documents. After inferring the embedding for a new document, the user need only search the space of topics for the closest topic vector.
2240:
2743:
An extension of word vectors for creating a dense vector representation of unstructured radiology reports has been proposed by
Banerjee et al. One of the biggest challenges with Word2vec is how to handle unknown or
2828:
dimensions, and increasing the window size of words considered by the algorithm. Each of these improvements comes with the cost of increased computational complexity and therefore increased model generation time.
1281:
The idea of skip-gram is that the vector of a word should be close to the vector of each of its neighbors. The idea of CBOW is that the vector-sum of a word's neighbors should be close to the vector of the word.
2768:
learning in the word2vec framework are poorly understood. Goldberg and Levy point out that the word2vec objective function causes words that occur in similar contexts to have similar embeddings (as measured by
2675:
Another extension of word2vec is top2vec, which leverages both document and word embeddings to estimate distributed representations of topics. top2vec takes document embeddings learned from a doc2vec model and
1846:
1058:
representations of words. These vectors capture information about the meaning of the word based on the surrounding words. The word2vec algorithm estimates these representations by modeling text in a large
2604:
High-frequency and low-frequency words often provide little information. Words with a frequency above a certain threshold, or below a certain threshold, may be subsampled or removed to speed up training.
2613:
The quality of word embedding increases with higher dimensionality, but after reaching some point the marginal gain diminishes. Typically, the dimensionality of the vectors is set to be between 100 and 1,000.
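These parameters, together with the context window and sub-sampling threshold discussed above, map directly onto the arguments of typical implementations; a gensim sketch with purely illustrative values:

```python
from gensim.models import Word2Vec

sentences = [["the", "quick", "brown", "fox"],
             ["jumps", "over", "the", "lazy", "dog"]]

model = Word2Vec(
    sentences,
    vector_size=300,  # dimensionality of the word vectors
    window=10,        # context window (the authors suggest 10 for skip-gram, 5 for CBOW)
    sample=1e-3,      # sub-sampling threshold for high-frequency words
    sg=1,             # 1 = skip-gram, 0 = CBOW
    negative=10,      # number of negative samples (hs=1 would select hierarchical softmax)
    min_count=1,
)
```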
2337:
1589:
2106:
1492:
3140:, Mikolov, Tomas; Chen, Kai & Corrado, Gregory S. et al., "Computing numeric representations of words in a high-dimensional space", published 2015-05-19, assigned to
2108:
That is, we want to maximize the total probability for the corpus, as seen by a probability model that uses words to predict its word neighbors. We predict each word-neighbor independently, thus
2785:
for text, which involves a random-walk generation process based upon a log-linear topic model. They use this to explain some properties of word embeddings, including their use to solve analogies.
5460:
1401:
961:
1705:
3815:
Altszyler, E.; Ribeiro, S.; Sigman, M.; Fernández Slezak, D. (2017). "The interpretation of dream meaning: Resolving ambiguity using Latent
Semantic Analysis in a small corpus of text".
999:
3193:"We use our insights to construct a new model for word representation which we call GloVe, for Global Vectors, because the global corpus statistics are captured directly by the model."
1250:
1710:
3163:
3947:
1643:
4107:
956:
1152:
After the model has trained, the learned word embeddings are positioned in the vector space such that words that share common contexts in the corpus — that is, words that are
946:
5302:
787:
4708:
3002:
1141:(CBOW) or continuously sliding skip-gram. In both architectures, word2vec considers both individual words and a sliding context window as it iterates over the corpus.
1276:
2563:, which add multiple neural-network attention layers on top of a word embedding model similar to Word2vec, have come to be regarded as the state of the art in NLP.
1082:
of numbers which capture relationships between words. In particular, words which appear in similar contexts are mapped to vectors which are nearby as measured by
994:
5627:
4085:
2681:
2529:
1208:
1188:
1707:, then take the dot-product-softmax with every other vector sum (this step is similar to the attention mechanism in Transformers), to obtain the probability:
951:
802:
533:
3114:
Mikolov, Tomáš; Karafiát, Martin; Burget, Lukáš; Černocký, Jan; Khudanpur, Sanjeev (26 September 2010). "Recurrent neural network based language model".
1034:
837:
4496:
3940:
2756:
dataset) successfully translated to a different institutional dataset which demonstrates good generalizability of the approach across institutions.
2552:
1156:
and syntactically similar — are located close to one another in the space. More dissimilar words are located farther from one another in the space.
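The 'closeness' referred to here is usually cosine similarity between the learned vectors, which reduces to a one-line computation (a minimal numpy sketch):

```python
import numpy as np

def cosine_similarity(u, v):
    # Cosine of the angle between two word vectors: close to 1 for words
    # used in similar contexts, near 0 for unrelated words.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
```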
4818:
4665:
2839:
Altszyler and coauthors (2017) studied Word2vec performance in two semantic tests for different corpus sizes. They found that Word2vec has a steep
913:
462:
2248:
1500:
4701:
2111:
3905:
3503:
2023:
1494:
That is, we want to maximize the total probability for the corpus, as seen by a probability model that uses word neighbors to predict words.
1409:
5491:
4406:
4097:
3933:
2959:
Mikolov, Tomas; Chen, Kai; Corrado, Greg; Dean, Jeffrey (16 January 2013). "Efficient
Estimation of Word Representations in Vector Space".
2535:
Embedding vectors created using the Word2vec algorithm have some advantages compared to earlier algorithms such as those using n-grams and
971:
734:
269:
2587:
and/or negative sampling. To approximate the conditional log-likelihood a model seeks to maximize, the hierarchical softmax method uses a
5632:
5592:
5143:
4880:
4660:
989:
1145:
similar words should be used in similar contexts. The order of context words does not influence prediction (bag of words assumption).
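The following sketch shows how (context, target) training pairs could be drawn from a tokenized sentence under this assumption; the helper is hypothetical and uses a small window for brevity:

```python
def cbow_pairs(tokens, window=2):
    # Yield (context_words, target) pairs as used by CBOW.
    # Skip-gram instead pairs the target with each context word separately.
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if context:
            yield list(context), target

# Example: list(cbow_pairs("the quick brown fox".split()))
```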
4267:
3203:
Joulin, Armand; Grave, Edouard; Bojanowski, Piotr; Mikolov, Tomas (9 August 2016). "Bag of Tricks for
Efficient Text Classification".
2667:
of political parties in various
Congresses and Parliaments in the U.S. and U.K., respectively, and various governmental institutions.
822:
797:
746:
2532:
2013. It also took months for the code to be approved for open-sourcing. Other researchers helped analyse and explain the algorithm.
5404:
5031:
4838:
4694:
4421:
4252:
2980:
2840:
870:
865:
518:
2494:{\displaystyle \sum _{i\in C,j\in N+i}\left(v_{w_{i}}\cdot v_{w_{j}}-\ln \sum _{w\in V}e^{v_{w}\cdot v_{w_{\color {red}i}}}\right)}
1305:
Suppose we want each word in the corpus to be predicted by every other word in a small span of 4 words. We write the neighbor set
5359:
4192:
3900:
528:
166:
2659:
tools (see below), with the Java and Python versions also supporting inference of document embeddings on new, unseen documents.
4609:
4262:
2917:
2728:
2591:
to reduce calculation. The negative sampling method, on the other hand, approaches the maximization problem by minimizing the
923:
5622:
5546:
5486:
5084:
4257:
4002:
2777:. However, they note that this explanation is "very hand-wavy" and argue that a more formal explanation would be preferable.
1027:
687:
508:
3069:
Goldberg, Yoav; Levy, Omer (2014). "word2vec
Explained: Deriving Mikolov et al.'s Negative-Sampling Word-Embedding Method".
1167:
A corpus is a sequence of words. Both CBOW and skip-gram are methods to learn one vector per word appearing in the corpus.
5079:
4768:
4526:
4247:
3737:
Mikolov, Tomas; Yih, Wen-tau; Zweig, Geoffrey (2013). "Linguistic Regularities in Continuous Space Word Representations".
2648:
2514:
898:
600:
376:
5521:
4875:
4828:
4823:
4219:
3186:
2656:
855:
792:
702:
680:
523:
513:
66:
1996:{\displaystyle \sum _{i\in C,j\in N+i}\left(v_{w_{i}}\cdot v_{w_{j}}-\ln \sum _{w\in V}e^{v_{w}\cdot v_{w_{j}}}\right)}
5637:
5572:
4868:
4794:
4564:
4549:
4521:
4386:
4381:
3956:
2688:
2652:
1051:
1006:
918:
903:
364:
186:
3167:
893:
5196:
5131:
4732:
4301:
4272:
4050:
2781:
downstream tasks. Arora et al. (2016) explain word2vec and related algorithms as performing inference for a simple
1308:
1134:
966:
643:
538:
326:
259:
219:
3281:
1648:
5597:
5455:
5094:
4925:
4748:
4144:
3997:
1020:
626:
394:
264:
3226:"On the validity of pre-trained transformers for natural language processing in the software engineering domain"
5496:
4753:
4670:
4594:
4326:
4282:
4167:
4065:
3614:"Radiology report annotation using intelligent word embeddings: Applied to multi-institutional chest CT cohort"
3306:
2902:
2844:
2774:
2677:
2644:
2536:
2518:
1213:
648:
568:
491:
409:
239:
201:
196:
156:
151:
5541:
5526:
5179:
5174:
5074:
4942:
4723:
4574:
4544:
4211:
595:
444:
344:
171:
5501:
5261:
4980:
4975:
4431:
4124:
4102:
4092:
4060:
4035:
2752:
1297:
775:
751:
653:
414:
389:
349:
161:
2643:
pieces of text, such as sentences, paragraphs, or entire documents. doc2vec has been implemented in the
5531:
5516:
5481:
5169:
5069:
4937:
4291:
3767:
2912:
2560:
1597:
729:
551:
503:
359:
274:
146:
5399:
2745:
3137:
5551:
5506:
4952:
4897:
4743:
4738:
4644:
4320:
4296:
4149:
3545:
3016:
2867:
2751:
IWE combines Word2vec with a semantic dictionary mapping technique to tackle the major challenges of
658:
608:
2339:
The probability model is still the dot-product-softmax model, so the calculation proceeds as before.
5126:
5104:
4853:
4848:
4806:
4758:
4624:
4554:
4511:
4467:
4239:
4229:
4224:
4112:
2974:
2807:
1153:
1087:
761:
697:
668:
573:
399:
332:
318:
304:
279:
229:
181:
141:
78:
5565:
2510:
1114:
that are trained to reconstruct linguistic contexts of words. Word2vec takes as its input a large
1068:
5511:
5089:
4918:
4634:
4506:
4371:
4134:
4117:
3975:
3850:
3824:
3753:
3707:
3592:
3535:
3483:
3441:
3409:
3342:
3263:
3237:
3204:
3164:"Yesterday we received a Test of Time Award at NeurIPS for the word2vec paper from ten years ago"
3095:
3070:
3006:
2960:
2887:
2872:
2524:
Word2vec was created, patented, and published in 2013 by a team of researchers led by Mikolov at
1127:
739:
663:
449:
244:
3524:"Continuous Distributed Representation of Biological Sequences for Deep Proteomics and Genomics"
1836:{\displaystyle Pr(w|w_{j}:j\in N+i):={\frac {e^{v_{w}\cdot v}}{\sum _{w\in V}e^{v_{w}\cdot v}}}}
3337:
Le, Quoc; Mikolov, Tomas (May 2014). "Distributed Representations of Sentences and Documents".
5577:
5369:
5021:
4892:
4885:
4639:
4351:
4159:
4070:
3842:
3643:
3573:
3499:
3423:
3419:
3401:
3255:
2770:
2732:
1083:
1079:
832:
675:
588:
384:
354:
299:
294:
249:
191:
5322:
5312:
5119:
4913:
4863:
4858:
4801:
4789:
4516:
4401:
4376:
4177:
4080:
3834:
3717:
3674:
3633:
3625:
3591:
Ng, Patrick (2017). "dna2vec: Consistent vector representations of variable-length k-mers".
3563:
3553:
3491:
3247:
3119:
2877:
2782:
2724:
2584:
1286:
1255:
860:
613:
563:
473:
457:
427:
289:
284:
234:
224:
122:
101:
5435:
5379:
5201:
4843:
4763:
4628:
4589:
4584:
4452:
4182:
4055:
4030:
4012:
3780:
3190:
2793:
888:
692:
558:
498:
3549:
3020:
5409:
5374:
5364:
5189:
4947:
4773:
4336:
4316:
4040:
3794:
3638:
3613:
3568:
3523:
2892:
2882:
2765:
2592:
2588:
2004:
which is intractable. This prompted the authors to use numerical approximation tricks.
1193:
1173:
1111:
1107:
1060:
908:
439:
176:
91:
86:
3910:
3363:
5616:
5354:
5334:
5251:
4930:
4599:
4411:
4391:
4172:
3854:
3267:
2720:
827:
756:
638:
369:
254:
3402:"Gov2Vec: Learning Distributed Representations of Institutions and Their Legal Text"
3379:"Word Embeddings for the Analysis of Ideological Placement in Parliamentary Corpora"
1067:
words or suggest additional words for a partial sentence. Word2vec was developed by
5440:
5271:
4579:
3915:
1138:
1119:
1055:
3123:
3877:
3838:
3558:
3495:
2997:
Mikolov, Tomas; Sutskever, Ilya; Chen, Kai; Corrado, Greg S.; Dean, Jeff (2013).
5536:
5307:
5216:
5211:
4833:
4811:
4536:
4416:
4129:
4045:
4022:
3970:
3612:
Banerjee, Imon; Chen, Matthew C.; Lungren, Matthew P.; Rubin, Daniel L. (2018).
3141:
2862:
1115:
633:
127:
3663:"Improving Distributional Similarity with Lessons Learned from Word Embeddings"
3440:
Angelov, Dimo (August 2020). "Top2Vec: Distributed Representations of Topics".
3225:
5430:
5389:
5384:
5297:
5206:
5114:
5026:
5006:
4139:
3925:
3629:
3039:
2704:
2501:
There is only a single difference from the CBOW equation, highlighted in red.
1123:
1098:
are nearby, as are those for "but" and "however", and "Berlin" and "Germany".
782:
478:
404:
106:
71:
36:
3259:
3251:
5425:
5394:
5292:
5136:
5099:
5036:
4990:
4985:
4970:
4007:
3895:
3180:
2572:
2245:
Products are numerically unstable, so we convert them by taking the logarithm:
1497:
Products are numerically unstable, so we convert them by taking the logarithm:
941:
722:
41:
4686:
3846:
3662:
3647:
3577:
2999:
Distributed representations of words and phrases and their compositionality
3752:
Jansen, Stefan (9 May 2017). "Word and Phrase Translation with word2vec".
3673:. Transactions of the Association for Computational Linguistics: 211–225.
5327:
4482:
4462:
4447:
4426:
4396:
4306:
4187:
3722:
3679:
3040:"Google Code Archive - Long-term storage for Google Code Project Hosting"
2897:
2857:
2545:
2235:{\displaystyle Pr(w_{j}:j\in N+i|w_{i})=\prod _{j\in N+i}Pr(w_{j}|w_{i})}
3871:
2723:
applications has been proposed by Asgari and Mofrad. Named bio-vectors (
2548:
project showed that word2vec is superior when trained on the same data.
1133:
Word2vec can utilize either of two model architectures to produce these
17:
5450:
5287:
5241:
5164:
5064:
5059:
5011:
4619:
4477:
4457:
4331:
4075:
3990:
3890:
3695:
3490:. Lecture Notes in Computer Science. Vol. 7819. pp. 160–172.
3462:
2716:
1064:
717:
2012:
5465:
5445:
5317:
5109:
3985:
3980:
3920:
3467:
3224:
Von der Mosel, Julian; Trautsch, Alexander; Herbold, Steffen (2022).
2551:
As of 2022, the straight Word2vec approach was described as "dated."
2525:
1072:
468:
3339:
Proceedings of the 31st International Conference on Machine Learning
2575:. The following are some important parameters in word2vec training.
3829:
3758:
3712:
3597:
3540:
3446:
3414:
3242:
3209:
5266:
5246:
5236:
5231:
5226:
5221:
5184:
5016:
4675:
4311:
3874:
3484:"Density-Based Clustering Based on Hierarchical Density Estimates"
3347:
3100:
3075:
3011:
2965:
2792:
2540:
2528:
over two papers. The original paper was rejected by reviewers for
2011:
1406:
Then the training objective is to maximize the following quantity:
1296:
712:
707:
434:
5256:
4197:
2907:
2556:
4690:
3929:
3696:"A Latent Variable Model Approach to PMI-based Word Embeddings"
1190:("vocabulary") be the set of all words appearing in the corpus
1106:
Word2vec is a group of related models that are used to produce
4472:
2712:
2708:
3700:
Transactions of the Association for Computational Linguistics
3667:
Transactions of the Association for Computational Linguistics
3314:
Journal of Machine Learning Research, 2008. Vol. 9, pg. 2595
1289:, but the framework allows other ways to measure closeness.
1843:
The quantity to be maximized is then, after simplifications:
3482:
Campello, Ricardo; Moulavi, Davoud; Sander, Joerg (2013).
2332:{\displaystyle \sum _{i\in C,j\in N+i}\ln Pr(w_{j}|w_{i})}
1584:{\displaystyle \sum _{i\in C}\ln Pr(w_{i}|w_{j}:j\in N+i)}
1591:
That is, we maximize the log-probability of the corpus.
1285:
In the original publication, "closeness" is measured by
1000:
List of datasets in computer vision and image processing
3377:
Rheault, Ludovic; Cochrane, Christopher (3 July 2019).
2101:{\displaystyle \prod _{i\in C}Pr(w_{j}:j\in N+i|w_{i})}
1487:{\displaystyle \prod _{i\in C}Pr(w_{i}|w_{j}:j\in N+i)}
2345:
2251:
2114:
2026:
1849:
1713:
1651:
1600:
1503:
1412:
1311:
1258:
1216:
1196:
1176:
2789:
Preservation of semantic and syntactic relationships
1130:
being assigned a corresponding vector in the space.
5474:
5418:
5347:
5280:
5152:
5052:
5045:
4999:
4963:
4906:
4782:
4722:
4653:
4608:
4563:
4535:
4495:
4440:
4362:
4350:
4281:
4238:
4210:
4158:
4021:
3963:
2773:) and note that this is in line with J. R. Firth's
100:
77:
65:
47:
35:
3522:Asgari, Ehsaneddin; Mofrad, Mohammad R.K. (2015).
2843:, outperforming another word-embedding technique,
2639:doc2vec, generates distributed representations of
2583:A Word2vec model can be trained with hierarchical
2521:with a single hidden layer to language modelling.
2493:
2331:
2234:
2100:
1995:
1835:
1699:
1637:
1583:
1486:
1395:
1270:
1244:
1202:
1182:
1090:between the words, so for example the vectors for
3003:Advances in Neural Information Processing Systems
2571:Results of word2vec training can be sensitive to
1594:Our probability model is as follows: Given words
3661:Levy, Omer; Goldberg, Yoav; Dagan, Ido (2015).
3488:Advances in Knowledge Discovery and Data Mining
2832:expensive and yields similar accuracy results.
2739:Radiology and intelligent word embeddings (IWE)
2631:There are a variety of extensions to word2vec.
1078:Word2vec represents a word as a high-dimension
2992:
2990:
995:List of datasets for machine-learning research
4702:
3941:
3182:GloVe: Global Vectors for Word Representation
2680:them into a lower dimension (typically using
1396:{\displaystyle N=\{-4,-3,-2,-1,+1,+2,+3,+4\}}
1028:
8:
2703:An extension of word vectors for n-grams in
1700:{\displaystyle v:=\sum _{j\in N+i}v_{w_{j}}}
1632:
1601:
1390:
1318:
30:
72:https://code.google.com/archive/p/word2vec/
5049:
4709:
4695:
4687:
4359:
4155:
3948:
3934:
3926:
2954:
1035:
1021:
113:
29:
3828:
3757:
3721:
3711:
3678:
3637:
3596:
3567:
3557:
3539:
3445:
3413:
3346:
3241:
3230:IEEE Transactions on Software Engineering
3208:
3099:
3074:
3064:
3062:
3060:
3010:
2964:
2952:
2950:
2948:
2946:
2944:
2942:
2940:
2938:
2936:
2934:
2687:As opposed to other topic models such as
2474:
2469:
2456:
2451:
2435:
2414:
2409:
2394:
2389:
2350:
2344:
2320:
2311:
2305:
2256:
2250:
2223:
2214:
2208:
2177:
2161:
2152:
2128:
2113:
2089:
2080:
2056:
2031:
2025:
1978:
1973:
1960:
1955:
1939:
1918:
1913:
1898:
1893:
1854:
1848:
1816:
1811:
1795:
1776:
1771:
1765:
1735:
1726:
1712:
1689:
1684:
1662:
1650:
1608:
1599:
1554:
1545:
1539:
1508:
1502:
1457:
1448:
1442:
1417:
1411:
1310:
1257:
1245:{\displaystyle v_{w}\in \mathbb {R} ^{n}}
1236:
1232:
1231:
1221:
1215:
1195:
1175:
2020:For skip-gram, the training objective is
1063:. Once trained, such a model can detect
2930:
121:
3795:"Gensim - Deep learning with word2vec"
3776:
3765:
3170:from the original on 24 December 2023.
2972:
2797:Visual illustration of word embeddings
1164:This section is based on expositions.
1110:. These models are shallow, two-layer
27:Models used to produce word embeddings
3694:Arora, S; et al. (Summer 2016).
3517:
3515:
3435:
3433:
3332:
3330:
3092:word2vec Parameter Learning Explained
1301:Continuous Bag of Words model (CBOW).
7:
5628:Natural language processing toolkits
5547:Generative adversarial network (GAN)
4407:Simple Knowledge Organization System
3157:
3155:
3034:
3032:
3030:
3162:Mikolov, Tomáš (13 December 2023).
2517:) with co-authors applied a simple
990:Glossary of artificial intelligence
3461:Angelov, Dimo (11 November 2022).
3118:. ISCA: ISCA. pp. 1045–1048.
1638:{\displaystyle \{w_{j}:j\in N+i\}}
1210:. Our goal is to learn one vector
25:
4422:Thesaurus (information retrieval)
3618:Journal of Biomedical Informatics
2475:
1149:better job for infrequent words.
5585:
5584:
5564:
2814:Assessing the quality of a model
3282:"Parameter (hs & negative)"
2918:Normalized compression distance
1126:, with each unique word in the
1122:, typically of several hundred
5497:Recurrent neural network (RNN)
5487:Differentiable neural computer
4003:Natural language understanding
3400:Nay, John (21 December 2017).
3307:"Visualizing Data using t-SNE"
2883:Neural network language models
2326:
2312:
2298:
2229:
2215:
2201:
2167:
2153:
2121:
2095:
2081:
2049:
1759:
1727:
1720:
1578:
1546:
1532:
1481:
1449:
1435:
1293:Continuous Bag of Words (CBOW)
1086:. This indicates the level of
410:Relevance vector machine (RVM)
1:
5542:Variational autoencoder (VAE)
5502:Long short-term memory (LSTM)
4769:Computational learning theory
4527:Optical character recognition
3124:10.21437/interspeech.2010-343
2979:: CS1 maint: date and year (
2515:Brno University of Technology
899:Computational learning theory
463:Expectation–maximization (EM)
5522:Convolutional neural network
4220:Multi-document summarization
3839:10.1016/j.concog.2017.09.004
3559:10.1371/journal.pone.0141287
3496:10.1007/978-3-642-37456-2_14
2823:Parameters and model quality
1645:, it takes their vector sum
856:Coefficient of determination
703:Convolutional neural network
415:Support vector machine (SVM)
5517:Multilayer perceptron (MLP)
4550:Latent Dirichlet allocation
4522:Natural language generation
4387:Machine-readable dictionary
4382:Linguistic Linked Open Data
3957:Natural language processing
3817:Consciousness and Cognition
2764:The reasons for successful
1135:distributed representations
1052:natural language processing
1007:Outline of machine learning
904:Empirical risk minimization
5654:
5633:Artificial neural networks
5593:Artificial neural networks
5507:Gated recurrent unit (GRU)
4733:Differentiable programming
4302:Explicit semantic analysis
4051:Deep linguistic processing
644:Feedforward neural network
395:Artificial neural networks
5560:
4926:Artificial neural network
4749:Automatic differentiation
4145:Word-sense disambiguation
3998:Computational linguistics
3630:10.1016/j.jbi.2017.11.012
3090:Rong, Xin (5 June 2016),
2775:distributional hypothesis
2735:of dna2vec word vectors.
627:Artificial neural network
4754:Neuromorphic engineering
4717:Differentiable computing
4671:Natural Language Toolkit
4595:Pronunciation assessment
4497:Automatic identification
4327:Latent semantic analysis
4283:Distributional semantics
4168:Compound-term processing
4066:Named-entity recognition
3252:10.1109/TSE.2022.3178469
2845:latent semantic analysis
2537:latent semantic analysis
2519:recurrent neural network
936:Journals and conferences
883:Mathematical foundations
793:Temporal difference (TD)
649:Recurrent neural network
569:Conditional random field
492:Dimensionality reduction
240:Dimensionality reduction
202:Quantum machine learning
197:Neuromorphic engineering
157:Self-supervised learning
152:Semi-supervised learning
5527:Residual neural network
4943:Artificial Intelligence
4575:Automated essay scoring
4545:Document classification
4212:Automatic summarization
2555:-based models, such as
1075:and published in 2013.
345:Apprenticeship learning
4432:Universal Dependencies
4125:Terminology extraction
4108:Semantic decomposition
4103:Semantic role labeling
4093:Part-of-speech tagging
4061:Information extraction
4046:Coreference resolution
4036:Collocation extraction
3775:Cite journal requires
2798:
2753:information extraction
2495:
2333:
2236:
2102:
2017:
1997:
1837:
1701:
1639:
1585:
1488:
1397:
1302:
1272:
1271:{\displaystyle w\in V}
1246:
1204:
1184:
894:Bias–variance tradeoff
776:Reinforcement learning
752:Spiking neural network
162:Reinforcement learning
53:; 11 years ago
5623:Free science software
5482:Neural Turing machine
5070:Human image synthesis
4193:Sentence segmentation
2913:BERT (language model)
2796:
2731:similarity score and
2496:
2334:
2237:
2103:
2015:
1998:
1838:
1702:
1640:
1586:
1489:
1398:
1300:
1273:
1247:
1205:
1185:
1137:of words: continuous
730:Neural radiance field
552:Structured prediction
275:Structured prediction
147:Unsupervised learning
5573:Computer programming
5552:Graph neural network
5127:Text-to-video models
5105:Text-to-image models
4953:Large language model
4938:Scientific computing
4744:Statistical manifold
4739:Information geometry
4645:Voice user interface
4356:datasets and corpora
4297:Document-term matrix
4150:Word-sense induction
3723:10.1162/tacl_a_00106
3680:10.1162/tacl_a_00134
2868:Document-term matrix
2343:
2249:
2112:
2024:
1847:
1711:
1649:
1598:
1501:
1410:
1309:
1256:
1214:
1194:
1174:
1160:Mathematical details
1054:(NLP) for obtaining
919:Statistical learning
817:Learning with humans
609:Local outlier factor
4919:In-context learning
4759:Pattern recognition
4625:Interactive fiction
4555:Pachinko allocation
4512:Speech segmentation
4468:Google Ngram Viewer
4240:Machine translation
4230:Text simplification
4225:Sentence extraction
4113:Semantic similarity
3906:Python (TensorFlow)
3726:– via ACLWEB.
3550:2015PLoSO..1041287A
3021:2013arXiv1310.4546M
2808:machine translation
1088:semantic similarity
762:Electrochemical RAM
669:reservoir computing
400:Logistic regression
319:Supervised learning
305:Multimodal learning
280:Feature engineering
225:Generative modeling
187:Rule-based learning
182:Curriculum learning
142:Supervised learning
117:Part of a series on
32:
5638:Semantic relations
5512:Echo state network
5400:Jürgen Schmidhuber
5095:Facial recognition
5090:Speech recognition
5000:Software libraries
4635:Question answering
4507:Speech recognition
4372:Corpus linguistics
4352:Language resources
4135:Textual entailment
4118:Sentiment analysis
3383:Political Analysis
3189:2020-09-03 at the
2888:Vector space model
2873:Feature extraction
2799:
2579:Training algorithm
2491:
2479:
2446:
2379:
2329:
2285:
2232:
2194:
2098:
2042:
2018:
1993:
1950:
1883:
1833:
1806:
1697:
1679:
1635:
1581:
1519:
1484:
1428:
1393:
1303:
1268:
1242:
1200:
1180:
1071:and colleagues at
1050:is a technique in
330: •
245:Density estimation
37:Original author(s)
5608:
5607:
5370:Stephen Grossberg
5343:
5342:
4684:
4683:
4640:Virtual assistant
4565:Computer-assisted
4491:
4490:
4248:Computer-assisted
4206:
4205:
4198:Word segmentation
4160:Text segmentation
4098:Semantic analysis
4086:Syntactic parsing
4071:Ontology learning
3505:978-3-642-37455-5
2771:cosine similarity
2746:out-of-vocabulary
2733:cosine similarity
2431:
2346:
2252:
2173:
2027:
1935:
1850:
1831:
1791:
1658:
1504:
1413:
1203:{\displaystyle C}
1183:{\displaystyle V}
1084:cosine similarity
1045:
1044:
850:Model diagnostics
833:Human-in-the-loop
676:Boltzmann machine
589:Anomaly detection
385:Linear regression
300:Ontology learning
295:Grammar induction
270:Semantic analysis
265:Association rules
250:Anomaly detection
192:Neuro-symbolic AI
112:
111:
5645:
5598:Machine learning
5588:
5587:
5568:
5323:Action selection
5313:Self-driving car
5120:Stable Diffusion
5085:Speech synthesis
5050:
4914:Machine learning
4790:Gradient descent
4711:
4704:
4697:
4688:
4661:Formal semantics
4610:Natural language
4517:Speech synthesis
4499:and data capture
4402:Semantic network
4377:Lexical resource
4360:
4178:Lexical analysis
4156:
4081:Semantic parsing
3950:
3943:
3936:
3927:
3859:
3858:
3832:
3812:
3806:
3805:
3803:
3801:
3791:
3785:
3784:
3778:
3773:
3771:
3763:
3761:
3749:
3743:
3742:
3734:
3728:
3727:
3725:
3715:
3691:
3685:
3684:
3682:
3658:
3652:
3651:
3641:
3609:
3603:
3602:
3600:
3588:
3582:
3581:
3571:
3561:
3543:
3534:(11): e0141287.
3519:
3510:
3509:
3479:
3473:
3472:
3458:
3452:
3451:
3449:
3437:
3428:
3427:
3417:
3397:
3391:
3390:
3374:
3368:
3367:
3362:Rehurek, Radim.
3359:
3353:
3352:
3350:
3334:
3325:
3324:
3322:
3320:
3311:
3303:
3297:
3296:
3294:
3292:
3278:
3272:
3271:
3245:
3236:(4): 1487–1507.
3221:
3215:
3214:
3212:
3200:
3194:
3178:
3172:
3171:
3159:
3150:
3149:
3148:
3144:
3134:
3128:
3127:
3116:Interspeech 2010
3111:
3105:
3104:
3103:
3087:
3081:
3080:
3078:
3066:
3055:
3054:
3052:
3050:
3036:
3025:
3024:
3014:
2994:
2985:
2984:
2978:
2970:
2968:
2956:
2878:Feature learning
2783:generative model
2729:Needleman–Wunsch
2707:sequences (e.g.
2567:Parameterization
2500:
2498:
2497:
2492:
2490:
2486:
2485:
2484:
2483:
2482:
2481:
2480:
2461:
2460:
2445:
2421:
2420:
2419:
2418:
2401:
2400:
2399:
2398:
2378:
2338:
2336:
2335:
2330:
2325:
2324:
2315:
2310:
2309:
2284:
2241:
2239:
2238:
2233:
2228:
2227:
2218:
2213:
2212:
2193:
2166:
2165:
2156:
2133:
2132:
2107:
2105:
2104:
2099:
2094:
2093:
2084:
2061:
2060:
2041:
2002:
2000:
1999:
1994:
1992:
1988:
1987:
1986:
1985:
1984:
1983:
1982:
1965:
1964:
1949:
1925:
1924:
1923:
1922:
1905:
1904:
1903:
1902:
1882:
1842:
1840:
1839:
1834:
1832:
1830:
1829:
1828:
1821:
1820:
1805:
1789:
1788:
1781:
1780:
1766:
1740:
1739:
1730:
1706:
1704:
1703:
1698:
1696:
1695:
1694:
1693:
1678:
1644:
1642:
1641:
1636:
1613:
1612:
1590:
1588:
1587:
1582:
1559:
1558:
1549:
1544:
1543:
1518:
1493:
1491:
1490:
1485:
1462:
1461:
1452:
1447:
1446:
1427:
1402:
1400:
1399:
1394:
1277:
1275:
1274:
1269:
1251:
1249:
1248:
1243:
1241:
1240:
1235:
1226:
1225:
1209:
1207:
1206:
1201:
1189:
1187:
1186:
1181:
1037:
1030:
1023:
984:Related articles
861:Confusion matrix
614:Isolation forest
559:Graphical models
338:
337:
290:Learning to rank
285:Feature learning
123:Machine learning
114:
61:
59:
54:
33:
21:
5653:
5652:
5648:
5647:
5646:
5644:
5643:
5642:
5613:
5612:
5609:
5604:
5556:
5470:
5436:Google DeepMind
5414:
5380:Geoffrey Hinton
5339:
5276:
5202:Project Debater
5148:
5046:Implementations
5041:
4995:
4959:
4902:
4844:Backpropagation
4778:
4764:Tensor calculus
4718:
4715:
4685:
4680:
4649:
4629:Syntax guessing
4611:
4604:
4590:Predictive text
4585:Grammar checker
4566:
4559:
4531:
4498:
4487:
4453:Bank of English
4436:
4364:
4355:
4346:
4277:
4234:
4202:
4154:
4056:Distant reading
4031:Argument mining
4017:
4013:Text processing
3959:
3954:
3911:Python (Gensim)
3887:
3885:Implementations
3868:
3863:
3862:
3814:
3813:
3809:
3799:
3797:
3793:
3792:
3788:
3774:
3764:
3751:
3750:
3746:
3736:
3735:
3731:
3693:
3692:
3688:
3660:
3659:
3655:
3611:
3610:
3606:
3590:
3589:
3585:
3521:
3520:
3513:
3506:
3481:
3480:
3476:
3460:
3459:
3455:
3439:
3438:
3431:
3399:
3398:
3394:
3376:
3375:
3371:
3361:
3360:
3356:
3336:
3335:
3328:
3318:
3316:
3309:
3305:
3304:
3300:
3290:
3288:
3280:
3279:
3275:
3223:
3222:
3218:
3202:
3201:
3197:
3191:Wayback Machine
3179:
3175:
3161:
3160:
3153:
3146:
3136:
3135:
3131:
3113:
3112:
3108:
3089:
3088:
3084:
3068:
3067:
3058:
3048:
3046:
3044:code.google.com
3038:
3037:
3028:
2996:
2995:
2988:
2971:
2958:
2957:
2932:
2927:
2922:
2853:
2825:
2816:
2791:
2762:
2741:
2701:
2673:
2641:variable-length
2637:
2629:
2620:
2611:
2602:
2581:
2573:parametrization
2569:
2530:ICLR conference
2507:
2470:
2465:
2452:
2447:
2410:
2405:
2390:
2385:
2384:
2380:
2341:
2340:
2316:
2301:
2247:
2246:
2219:
2204:
2157:
2124:
2110:
2109:
2085:
2052:
2022:
2021:
2010:
1974:
1969:
1956:
1951:
1914:
1909:
1894:
1889:
1888:
1884:
1845:
1844:
1812:
1807:
1790:
1772:
1767:
1731:
1709:
1708:
1685:
1680:
1647:
1646:
1604:
1596:
1595:
1550:
1535:
1499:
1498:
1453:
1438:
1408:
1407:
1307:
1306:
1295:
1254:
1253:
1230:
1217:
1212:
1211:
1192:
1191:
1172:
1171:
1162:
1118:and produces a
1112:neural networks
1108:word embeddings
1104:
1041:
1012:
1011:
985:
977:
976:
937:
929:
928:
889:Kernel machines
884:
876:
875:
851:
843:
842:
823:Active learning
818:
810:
809:
778:
768:
767:
693:Diffusion model
629:
619:
618:
591:
581:
580:
554:
544:
543:
499:Factor analysis
494:
484:
483:
467:
430:
420:
419:
340:
339:
323:
322:
321:
310:
309:
215:
207:
206:
172:Online learning
137:
125:
96:
57:
55:
52:
48:Initial release
28:
23:
22:
15:
12:
11:
5:
5651:
5649:
5641:
5640:
5635:
5630:
5625:
5615:
5614:
5606:
5605:
5603:
5602:
5601:
5600:
5595:
5582:
5581:
5580:
5575:
5561:
5558:
5557:
5555:
5554:
5549:
5544:
5539:
5534:
5529:
5524:
5519:
5514:
5509:
5504:
5499:
5494:
5489:
5484:
5478:
5476:
5472:
5471:
5469:
5468:
5463:
5458:
5453:
5448:
5443:
5438:
5433:
5428:
5422:
5420:
5416:
5415:
5413:
5412:
5410:Ilya Sutskever
5407:
5402:
5397:
5392:
5387:
5382:
5377:
5375:Demis Hassabis
5372:
5367:
5365:Ian Goodfellow
5362:
5357:
5351:
5349:
5345:
5344:
5341:
5340:
5338:
5337:
5332:
5331:
5330:
5320:
5315:
5310:
5305:
5300:
5295:
5290:
5284:
5282:
5278:
5277:
5275:
5274:
5269:
5264:
5259:
5254:
5249:
5244:
5239:
5234:
5229:
5224:
5219:
5214:
5209:
5204:
5199:
5194:
5193:
5192:
5182:
5177:
5172:
5167:
5162:
5156:
5154:
5150:
5149:
5147:
5146:
5141:
5140:
5139:
5134:
5124:
5123:
5122:
5117:
5112:
5102:
5097:
5092:
5087:
5082:
5077:
5072:
5067:
5062:
5056:
5054:
5047:
5043:
5042:
5040:
5039:
5034:
5029:
5024:
5019:
5014:
5009:
5003:
5001:
4997:
4996:
4994:
4993:
4988:
4983:
4978:
4973:
4967:
4965:
4961:
4960:
4958:
4957:
4956:
4955:
4948:Language model
4945:
4940:
4935:
4934:
4933:
4923:
4922:
4921:
4910:
4908:
4904:
4903:
4901:
4900:
4898:Autoregression
4895:
4890:
4889:
4888:
4878:
4876:Regularization
4873:
4872:
4871:
4866:
4861:
4851:
4846:
4841:
4839:Loss functions
4836:
4831:
4826:
4821:
4816:
4815:
4814:
4804:
4799:
4798:
4797:
4786:
4784:
4780:
4779:
4777:
4776:
4774:Inductive bias
4771:
4766:
4761:
4756:
4751:
4746:
4741:
4736:
4728:
4726:
4720:
4719:
4716:
4714:
4713:
4706:
4699:
4691:
4682:
4681:
4679:
4678:
4673:
4668:
4663:
4657:
4655:
4651:
4650:
4648:
4647:
4642:
4637:
4632:
4622:
4616:
4614:
4612:user interface
4606:
4605:
4603:
4602:
4597:
4592:
4587:
4582:
4577:
4571:
4569:
4561:
4560:
4558:
4557:
4552:
4547:
4541:
4539:
4533:
4532:
4530:
4529:
4524:
4519:
4514:
4509:
4503:
4501:
4493:
4492:
4489:
4488:
4486:
4485:
4480:
4475:
4470:
4465:
4460:
4455:
4450:
4444:
4442:
4438:
4437:
4435:
4434:
4429:
4424:
4419:
4414:
4409:
4404:
4399:
4394:
4389:
4384:
4379:
4374:
4368:
4366:
4357:
4348:
4347:
4345:
4344:
4339:
4337:Word embedding
4334:
4329:
4324:
4317:Language model
4314:
4309:
4304:
4299:
4294:
4288:
4286:
4279:
4278:
4276:
4275:
4270:
4268:Transfer-based
4265:
4260:
4255:
4250:
4244:
4242:
4236:
4235:
4233:
4232:
4227:
4222:
4216:
4214:
4208:
4207:
4204:
4203:
4201:
4200:
4195:
4190:
4185:
4180:
4175:
4170:
4164:
4162:
4153:
4152:
4147:
4142:
4137:
4132:
4127:
4121:
4120:
4115:
4110:
4105:
4100:
4095:
4090:
4089:
4088:
4083:
4073:
4068:
4063:
4058:
4053:
4048:
4043:
4041:Concept mining
4038:
4033:
4027:
4025:
4019:
4018:
4016:
4015:
4010:
4005:
4000:
3995:
3994:
3993:
3988:
3978:
3973:
3967:
3965:
3961:
3960:
3955:
3953:
3952:
3945:
3938:
3930:
3924:
3923:
3918:
3913:
3908:
3903:
3901:Python (Spark)
3898:
3893:
3886:
3883:
3882:
3881:
3867:
3866:External links
3864:
3861:
3860:
3807:
3786:
3777:|journal=
3744:
3729:
3686:
3653:
3604:
3583:
3511:
3504:
3474:
3453:
3429:
3392:
3369:
3354:
3326:
3298:
3273:
3216:
3195:
3173:
3151:
3129:
3106:
3082:
3056:
3026:
2986:
2929:
2928:
2926:
2923:
2921:
2920:
2915:
2910:
2905:
2900:
2895:
2893:Thought vector
2890:
2885:
2880:
2875:
2870:
2865:
2860:
2854:
2852:
2849:
2841:learning curve
2824:
2821:
2815:
2812:
2810:of new words.
2790:
2787:
2766:word embedding
2761:
2758:
2740:
2737:
2721:bioinformatics
2700:
2697:
2672:
2669:
2636:
2633:
2628:
2625:
2619:
2618:Context window
2616:
2610:
2609:Dimensionality
2607:
2601:
2598:
2593:log-likelihood
1252:for each word
1239:
1234:
1229:
1224:
1220:
1199:
1179:
1161:
1158:
1116:corpus of text
1103:
1100:
1043:
1042:
1040:
1039:
1032:
1025:
1017:
1014:
1013:
1010:
1009:
1004:
1003:
1002:
992:
986:
983:
982:
979:
978:
975:
974:
969:
964:
959:
954:
949:
944:
938:
935:
934:
931:
930:
927:
926:
921:
916:
911:
909:Occam learning
906:
901:
896:
891:
885:
882:
881:
878:
877:
874:
873:
868:
866:Learning curve
863:
858:
852:
849:
848:
845:
844:
841:
840:
835:
830:
825:
819:
816:
815:
812:
811:
808:
807:
806:
805:
795:
790:
785:
779:
774:
773:
770:
769:
766:
765:
759:
754:
749:
744:
743:
742:
732:
727:
726:
725:
720:
715:
710:
700:
695:
690:
685:
684:
683:
673:
672:
671:
666:
661:
656:
646:
641:
636:
630:
625:
624:
621:
620:
617:
616:
611:
606:
598:
592:
587:
586:
583:
582:
579:
578:
577:
576:
571:
566:
555:
550:
549:
546:
545:
542:
541:
536:
531:
526:
521:
516:
511:
506:
501:
495:
490:
489:
486:
485:
482:
481:
476:
471:
465:
460:
455:
447:
442:
437:
431:
426:
425:
422:
421:
418:
417:
412:
407:
402:
397:
392:
387:
382:
374:
373:
372:
367:
362:
352:
350:Decision trees
347:
341:
327:classification
317:
316:
315:
312:
311:
308:
307:
302:
297:
292:
287:
282:
277:
272:
267:
262:
257:
252:
247:
242:
237:
232:
227:
222:
220:Classification
216:
213:
212:
209:
208:
205:
204:
199:
194:
189:
184:
179:
177:Batch learning
174:
169:
164:
159:
154:
149:
144:
138:
135:
134:
131:
130:
119:
118:
110:
109:
104:
98:
97:
95:
94:
92:Word embedding
89:
87:Language model
83:
81:
75:
74:
69:
63:
62:
58:July 29, 2013.
51:July 29, 2013.
49:
45:
44:
39:
26:
24:
14:
13:
10:
9:
6:
4:
3:
2:
5650:
5639:
5636:
5634:
5631:
5629:
5626:
5624:
5621:
5620:
5618:
5611:
5599:
5596:
5594:
5591:
5590:
5583:
5579:
5576:
5574:
5571:
5570:
5567:
5563:
5562:
5559:
5553:
5550:
5548:
5545:
5543:
5540:
5538:
5535:
5533:
5530:
5528:
5525:
5523:
5520:
5518:
5515:
5513:
5510:
5508:
5505:
5503:
5500:
5498:
5495:
5493:
5490:
5488:
5485:
5483:
5480:
5479:
5477:
5475:Architectures
5473:
5467:
5464:
5462:
5459:
5457:
5454:
5452:
5449:
5447:
5444:
5442:
5439:
5437:
5434:
5432:
5429:
5427:
5424:
5423:
5421:
5419:Organizations
5417:
5411:
5408:
5406:
5403:
5401:
5398:
5396:
5393:
5391:
5388:
5386:
5383:
5381:
5378:
5376:
5373:
5371:
5368:
5366:
5363:
5361:
5358:
5356:
5355:Yoshua Bengio
5353:
5352:
5350:
5346:
5336:
5335:Robot control
5333:
5329:
5326:
5325:
5324:
5321:
5319:
5316:
5314:
5311:
5309:
5306:
5304:
5301:
5299:
5296:
5294:
5291:
5289:
5286:
5285:
5283:
5279:
5273:
5270:
5268:
5265:
5263:
5260:
5258:
5255:
5253:
5252:Chinchilla AI
5250:
5248:
5245:
5243:
5240:
5238:
5235:
5233:
5230:
5228:
5225:
5223:
5220:
5218:
5215:
5213:
5210:
5208:
5205:
5203:
5200:
5198:
5195:
5191:
5188:
5187:
5186:
5183:
5181:
5178:
5176:
5173:
5171:
5168:
5166:
5163:
5161:
5158:
5157:
5155:
5151:
5145:
5142:
5138:
5135:
5133:
5130:
5129:
5128:
5125:
5121:
5118:
5116:
5113:
5111:
5108:
5107:
5106:
5103:
5101:
5098:
5096:
5093:
5091:
5088:
5086:
5083:
5081:
5078:
5076:
5073:
5071:
5068:
5066:
5063:
5061:
5058:
5057:
5055:
5051:
5048:
5044:
5038:
5035:
5033:
5030:
5028:
5025:
5023:
5020:
5018:
5015:
5013:
5010:
5008:
5005:
5004:
5002:
4998:
4992:
4989:
4987:
4984:
4982:
4979:
4977:
4974:
4972:
4969:
4968:
4966:
4962:
4954:
4951:
4950:
4949:
4946:
4944:
4941:
4939:
4936:
4932:
4931:Deep learning
4929:
4928:
4927:
4924:
4920:
4917:
4916:
4915:
4912:
4911:
4909:
4905:
4899:
4896:
4894:
4891:
4887:
4884:
4883:
4882:
4879:
4877:
4874:
4870:
4867:
4865:
4862:
4860:
4857:
4856:
4855:
4852:
4850:
4847:
4845:
4842:
4840:
4837:
4835:
4832:
4830:
4827:
4825:
4822:
4820:
4819:Hallucination
4817:
4813:
4810:
4809:
4808:
4805:
4803:
4800:
4796:
4793:
4792:
4791:
4788:
4787:
4785:
4781:
4775:
4772:
4770:
4767:
4765:
4762:
4760:
4757:
4755:
4752:
4750:
4747:
4745:
4742:
4740:
4737:
4735:
4734:
4730:
4729:
4727:
4725:
4721:
4712:
4707:
4705:
4700:
4698:
4693:
4692:
4689:
4677:
4674:
4672:
4669:
4667:
4666:Hallucination
4664:
4662:
4659:
4658:
4656:
4652:
4646:
4643:
4641:
4638:
4636:
4633:
4630:
4626:
4623:
4621:
4618:
4617:
4615:
4613:
4607:
4601:
4600:Spell checker
4598:
4596:
4593:
4591:
4588:
4586:
4583:
4581:
4578:
4576:
4573:
4572:
4570:
4568:
4562:
4556:
4553:
4551:
4548:
4546:
4543:
4542:
4540:
4538:
4534:
4528:
4525:
4523:
4520:
4518:
4515:
4513:
4510:
4508:
4505:
4504:
4502:
4500:
4494:
4484:
4481:
4479:
4476:
4474:
4471:
4469:
4466:
4464:
4461:
4459:
4456:
4454:
4451:
4449:
4446:
4445:
4443:
4439:
4433:
4430:
4428:
4425:
4423:
4420:
4418:
4415:
4413:
4412:Speech corpus
4410:
4408:
4405:
4403:
4400:
4398:
4395:
4393:
4392:Parallel text
4390:
4388:
4385:
4383:
4380:
4378:
4375:
4373:
4370:
4369:
4367:
4361:
4358:
4353:
4349:
4343:
4340:
4338:
4335:
4333:
4330:
4328:
4325:
4322:
4318:
4315:
4313:
4310:
4308:
4305:
4303:
4300:
4298:
4295:
4293:
4290:
4289:
4287:
4284:
4280:
4274:
4271:
4269:
4266:
4264:
4261:
4259:
4256:
4254:
4253:Example-based
4251:
4249:
4246:
4245:
4243:
4241:
4237:
4231:
4228:
4226:
4223:
4221:
4218:
4217:
4215:
4213:
4209:
4199:
4196:
4194:
4191:
4189:
4186:
4184:
4183:Text chunking
4181:
4179:
4176:
4174:
4173:Lemmatisation
4171:
4169:
4166:
4165:
4163:
4161:
4157:
4151:
4148:
4146:
4143:
4141:
4138:
4136:
4133:
4131:
4128:
4126:
4123:
4122:
4119:
4116:
4114:
4111:
4109:
4106:
4104:
4101:
4099:
4096:
4094:
4091:
4087:
4084:
4082:
4079:
4078:
4077:
4074:
4072:
4069:
4067:
4064:
4062:
4059:
4057:
4054:
4052:
4049:
4047:
4044:
4042:
4039:
4037:
4034:
4032:
4029:
4028:
4026:
4024:
4023:Text analysis
4020:
4014:
4011:
4009:
4006:
4004:
4001:
3999:
3996:
3992:
3989:
3987:
3984:
3983:
3982:
3979:
3977:
3974:
3972:
3969:
3968:
3966:
3964:General terms
3962:
3958:
3951:
3946:
3944:
3939:
3937:
3932:
3931:
3928:
3922:
3919:
3917:
3914:
3912:
3909:
3907:
3904:
3902:
3899:
3897:
3894:
3892:
3889:
3888:
3884:
3879:
3875:
3873:
3872:Wikipedia2Vec
3870:
3869:
3865:
3856:
3852:
3848:
3844:
3840:
3836:
3831:
3826:
3822:
3818:
3811:
3808:
3796:
3790:
3787:
3782:
3769:
3760:
3755:
3748:
3745:
3740:
3733:
3730:
3724:
3719:
3714:
3709:
3705:
3701:
3697:
3690:
3687:
3681:
3676:
3672:
3668:
3664:
3657:
3654:
3649:
3645:
3640:
3635:
3631:
3627:
3623:
3619:
3615:
3608:
3605:
3599:
3594:
3587:
3584:
3579:
3575:
3570:
3565:
3560:
3555:
3551:
3547:
3542:
3537:
3533:
3529:
3525:
3518:
3516:
3512:
3507:
3501:
3497:
3493:
3489:
3485:
3478:
3475:
3470:
3469:
3464:
3457:
3454:
3448:
3443:
3436:
3434:
3430:
3425:
3421:
3416:
3411:
3407:
3403:
3396:
3393:
3388:
3384:
3380:
3373:
3370:
3365:
3358:
3355:
3349:
3344:
3340:
3333:
3331:
3327:
3315:
3308:
3302:
3299:
3287:
3286:Google Groups
3283:
3277:
3274:
3269:
3265:
3261:
3257:
3253:
3249:
3244:
3239:
3235:
3231:
3227:
3220:
3217:
3211:
3206:
3199:
3196:
3192:
3188:
3185:
3183:
3177:
3174:
3169:
3165:
3158:
3156:
3152:
3143:
3139:
3133:
3130:
3125:
3121:
3117:
3110:
3107:
3102:
3097:
3093:
3086:
3083:
3077:
3072:
3065:
3063:
3061:
3057:
3045:
3041:
3035:
3033:
3031:
3027:
3022:
3018:
3013:
3008:
3004:
3000:
2993:
2991:
2987:
2982:
2976:
2967:
2962:
2955:
2953:
2951:
2949:
2947:
2945:
2943:
2941:
2939:
2937:
2935:
2931:
2924:
2919:
2916:
2914:
2911:
2909:
2906:
2904:
2901:
2899:
2896:
2894:
2891:
2889:
2886:
2884:
2881:
2879:
2876:
2874:
2871:
2869:
2866:
2864:
2861:
2859:
2856:
2855:
2850:
2848:
2846:
2842:
2837:
2833:
2829:
2822:
2820:
2813:
2811:
2809:
2803:
2795:
2788:
2786:
2784:
2778:
2776:
2772:
2767:
2759:
2757:
2754:
2749:
2747:
2738:
2736:
2734:
2730:
2726:
2722:
2718:
2714:
2710:
2706:
2698:
2696:
2692:
2690:
2685:
2683:
2679:
2670:
2668:
2664:
2660:
2658:
2654:
2650:
2646:
2642:
2634:
2632:
2626:
2624:
2617:
2615:
2608:
2606:
2599:
2597:
2594:
2590:
2586:
2578:
2576:
2574:
2566:
2564:
2562:
2558:
2554:
2549:
2547:
2542:
2538:
2533:
2531:
2527:
2522:
2520:
2516:
2512:
2511:Tomáš Mikolov
1069:Tomáš Mikolov
1066:
1062:
1057:
1053:
1049:
5610:
5441:Hugging Face
5405:David Silver
5159:
5053:Audio–visual
4907:Applications
4886:Augmentation
4731:
4580:Concordancer
4341:
3976:Bag-of-words
3878:introduction
3820:
3816:
3810:
3798:. Retrieved
3789:
3768:cite journal
3747:
3738:
3732:
3703:
3699:
3689:
3670:
3666:
3656:
3621:
3617:
3607:
3586:
3531:
3527:
3487:
3477:
3466:
3456:
3405:
3395:
3386:
3382:
3372:
3357:
3338:
3317:. Retrieved
3313:
3301:
3289:. Retrieved
3285:
3276:
3233:
3229:
3219:
3198:
3181:
3176:
3132:
3115:
3109:
3091:
3085:
3047:. Retrieved
3043:
2998:
2838:
2834:
2830:
2826:
2817:
2804:
2800:
2779:
2763:
2750:
2742:
2702:
2693:
2686:
2674:
2665:
2661:
2640:
2638:
2630:
2621:
2612:
2603:
2600:Sub-sampling
2589:Huffman tree
2582:
2570:
2550:
2534:
2523:
2508:
2244:
2019:
1593:
1496:
1405:
1304:
1284:
1280:
1169:
1166:
1163:
1154:semantically
1151:
1147:
1143:
1139:bag of words
1132:
1120:vector space
1105:
1095:
1091:
1077:
1047:
1046:
914:PAC learning
601:
450:
445:Hierarchical
377:
331:
325:
5589:Categories
5537:Autoencoder
5492:Transformer
5360:Alex Graves
5308:OpenAI Five
5212:IBM Watsonx
4834:Convolution
4812:Overfitting
4537:Topic model
4417:Text corpus
4263:Statistical
4130:Text mining
3971:AI-complete
3823:: 178–187.
3706:: 385–399.
3142:Google Inc.
2863:Autoencoder
2553:Transformer
798:Multi-agent
735:Transformer
634:Autoencoder
390:Naive Bayes
128:data mining
5617:Categories
5578:Technology
5431:EleutherAI
5390:Fei-Fei Li
5385:Yann LeCun
5298:Q-learning
5281:Decisional
5207:IBM Watson
5115:Midjourney
5007:TensorFlow
4854:Activation
4807:Regression
4802:Clustering
4258:Rule-based
4140:Truecasing
4008:Stop words
3916:Java/Scala
3830:1610.01520
3759:1705.03127
3741:: 746–751.
3713:1502.03520
3598:1701.06279
3541:1503.05140
3447:2008.09470
3415:1609.06616
3243:2109.04738
3210:1607.01759
3138:US 9037464
2975:cite arXiv
2925:References
2705:biological
2699:BioVectors
2627:Extensions
1124:dimensions
1065:synonymous
783:Q-learning
681:Restricted
479:Mean shift
428:Clustering
405:Perceptron
333:regression
235:Clustering
230:Regression
107:Apache-2.0
67:Repository
5461:MIT CSAIL
5426:Anthropic
5395:Andrew Ng
5293:AlphaZero
5137:VideoPoet
5100:AlphaFold
5037:MindSpore
4991:SpiNNaker
4986:Memristor
4893:Diffusion
4869:Rectifier
4849:Batchnorm
4829:Attention
4824:Adversary
4567:reviewing
4365:standards
4363:Types and
3855:195347873
3739:HLT-Naacl
3624:: 11–20.
3463:"Top2Vec"
3348:1405.4053
3268:237485425
3260:1939-3520
3101:1411.2738
3076:1402.3722
3012:1310.4546
2966:1301.3781
2513:(then at
2509:In 2010,
2016:Skip-gram
2008:Skip-gram
942:ECML PKDD
924:VC theory
871:ROC curve
803:Self-play
723:DeepDream
564:Bayes net
355:Ensembles
136:Paradigms
42:Google AI
5569:Portals
5328:Auto-GPT
5160:Word2vec
4964:Hardware
4881:Datasets
4783:Concepts
4483:Wikidata
4463:FrameNet
4448:BabelNet
4427:Treebank
4397:PropBank
4342:Word2vec
4307:fastText
4188:Stemming
3847:28943127
3648:29175548
3578:26555596
3528:PLOS ONE
3364:"Gensim"
3319:18 March
3187:Archived
3168:Archived
2898:fastText
2858:Semantle
2851:See also
2760:Analysis
2717:proteins
2546:fastText
1102:Approach
1048:Word2vec
365:Boosting
214:Problems
31:word2vec
18:Word2Vec
5451:Meta AI
5288:AlphaGo
5272:PanGu-Σ
5242:ChatGPT
5217:Granite
5165:Seq2seq
5144:Whisper
5065:WaveNet
5060:AlexNet
5032:Flux.jl
5012:PyTorch
4864:Sigmoid
4859:Softmax
4724:General
4654:Related
4620:Chatbot
4478:WordNet
4458:DBpedia
4332:Seq2seq
4076:Parsing
3991:Trigram
3800:10 June
3639:5771955
3569:4640716
3546:Bibcode
3424:3087278
3291:13 June
3049:13 June
3017:Bibcode
2678:reduces
2671:top2vec
2635:doc2vec
2585:softmax
2505:History
1287:softmax
947:NeurIPS
764:(ECRAM)
718:AlexNet
360:Bagging
102:License
56: (
5466:Huawei
5446:OpenAI
5348:People
5318:MuZero
5180:Gemini
5175:Claude
5110:DALL-E
5022:Theano
4627:(c.f.
4285:models
4273:Neural
3986:Bigram
3981:n-gram
3853:
3845:
3646:
3636:
3576:
3566:
3502:
3468:GitHub
3422:
3266:
3258:
3147:
2725:BioVec
2719:) for
2715:, and
2649:Python
2526:Google
1128:corpus
1080:vector
1073:Google
1061:corpus
1056:vector
740:Vision
596:RANSAC
474:OPTICS
469:DBSCAN
453:-means
260:AutoML
5532:Mamba
5303:SARSA
5267:LLaMA
5262:BLOOM
5247:GPT-J
5237:GPT-4
5232:GPT-3
5227:GPT-2
5222:GPT-1
5185:LaMDA
5017:Keras
4676:spaCy
4321:large
4312:GloVe
3851:S2CID
3825:arXiv
3754:arXiv
3708:arXiv
3593:arXiv
3536:arXiv
3442:arXiv
3410:arXiv
3343:arXiv
3310:(PDF)
3264:S2CID
3238:arXiv
3205:arXiv
3184:(pdf)
3096:arXiv
3071:arXiv
3007:arXiv
2961:arXiv
2903:GloVe
2657:Scala
2541:GloVe
962:IJCAI
788:SARSA
747:Mamba
713:LeNet
708:U-Net
534:t-SNE
458:Fuzzy
435:BIRCH
5456:Mila
5257:PaLM
5190:Bard
5170:BERT
5153:Text
5132:Sora
4441:Data
4292:BERT
3843:PMID
3802:2016
3781:help
3644:PMID
3574:PMID
3500:ISBN
3420:SSRN
3406:SSRN
3389:(1).
3321:2017
3293:2016
3256:ISSN
3051:2016
2981:link
2908:ELMo
2682:UMAP
2653:Java
2651:and
2561:BERT
2559:and
2557:ELMo
1170:Let
1094:and
1092:walk
972:JMLR
957:ICLR
952:ICML
838:RLHF
654:LSTM
440:CURE
126:and
79:Type
5197:NMT
5080:OCR
5075:HWR
5027:JAX
4981:VPU
4976:TPU
4971:IPU
4795:SGD
4473:UBY
3835:doi
3718:doi
3675:doi
3634:PMC
3626:doi
3564:PMC
3554:doi
3492:doi
3248:doi
3120:doi
2713:RNA
2709:DNA
2689:LDA
1096:ran
698:SOM
688:GAN
664:ESN
659:GRU
604:-NN
539:SDL
529:PGD
524:PCA
519:NMF
514:LDA
509:ICA
504:CCA
380:-NN
5619::
3896:C#
3849:.
3841:.
3833:.
3821:56
3819:.
3772::
3770:}}
3766:{{
3716:.
3702:.
3698:.
3669:.
3665:.
3642:.
3632:.
3622:77
3620:.
3616:.
3572:.
3562:.
3552:.
3544:.
3532:10
3530:.
3526:.
3514:^
3498:.
3486:.
3465:.
3432:^
3418:.
3408:.
3404:.
3387:28
3385:.
3381:.
3341:.
3329:^
3312:.
3284:.
3262:.
3254:.
3246:.
3234:49
3232:.
3228:.
3166:.
3154:^
3094:,
3059:^
3042:.
3029:^
3015:.
3005:.
3001:.
2989:^
2977:}}
2973:{{
2933:^
2711:,
2647:,
2539:.
2426:ln
2287:ln
2242:.
1930:ln
1763::=
1656::=
1521:ln
1403:.
1278:.
967:ML
4710:e
4703:t
4696:v
4631:)
4354:,
4323:)
4319:(
3949:e
3942:t
3935:v
3921:R
3891:C
3880:)
3876:(
3857:.
3837::
3827::
3804:.
3783:)
3779:(
3762:.
3756::
3720::
3710::
3704:4
3683:.
3677::
3671:3
3650:.
3628::
3601:.
3595::
3580:.
3556::
3548::
3538::
3508:.
3494::
3471:.
3450:.
3444::
3426:.
3412::
3366:.
3351:.
3345::
3323:.
3295:.
3270:.
3250::
3240::
3213:.
3207::
3126:.
3122::
3098::
3079:.
3073::
3053:.
3023:.
3019::
3009::
2983:)
2969:.
2963::
2655:/
2645:C