Paraphrasing (computational linguistics)

This article is about automated generation and recognition of paraphrases. For other uses, see Paraphrase (disambiguation).

Paraphrase or paraphrasing in computational linguistics is the natural language processing task of detecting and generating paraphrases. Applications of paraphrasing are varied, including information retrieval, question answering, text summarization, and plagiarism detection. Paraphrasing is also useful in the evaluation of machine translation, as well as in semantic parsing and the generation of new samples to expand existing corpora.

Paraphrase generation

Multiple sequence alignment

Barzilay and Lee proposed a method to generate paraphrases through the use of monolingual parallel corpora, namely news articles covering the same event on the same day. Training consists of using multi-sequence alignment to generate sentence-level paraphrases from an unannotated corpus. This is done by:

- finding recurring patterns in each individual corpus, e.g. "X (injured/wounded) Y people, Z seriously", where X, Y, and Z are variables
- finding pairings between such patterns that represent paraphrases, e.g. "X (injured/wounded) Y people, Z seriously" and "Y were (wounded/hurt) by X, among them Z were in serious condition"

This is achieved by first clustering similar sentences together using n-gram overlap. Recurring patterns are found within each cluster by multi-sequence alignment. The position of argument words is then determined by finding areas of high variability within each cluster, that is, the areas between words shared by more than 50% of a cluster's sentences. Pairings between patterns are then found by comparing similar variable words between different corpora. Finally, new paraphrases can be generated by choosing a matching cluster for a source sentence and substituting the source sentence's arguments into any number of patterns in the cluster.

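To illustrate the final step, the following is a minimal sketch of substituting a source sentence's arguments into the paraphrase patterns of a matched cluster. The cluster patterns, slot names, and example arguments are hypothetical stand-ins, not data or code from Barzilay and Lee.

```python
# Minimal sketch of the substitution step: fill a matched cluster's paraphrase
# patterns with the argument words taken from a source sentence.
# The patterns, slot names, and example arguments are hypothetical.

CLUSTER_PATTERNS = [
    "{X} (injured/wounded) {Y} people, {Z} seriously",
    "{Y} were (wounded/hurt) by {X}, among them {Z} were in serious condition",
]

def generate_paraphrases(arguments: dict[str, str], patterns: list[str]) -> list[str]:
    """Substitute the source sentence's arguments into every pattern of the cluster."""
    return [pattern.format(**arguments) for pattern in patterns]

if __name__ == "__main__":
    source_arguments = {"X": "the blast", "Y": "12", "Z": "3"}
    for paraphrase in generate_paraphrases(source_arguments, CLUSTER_PATTERNS):
        print(paraphrase)
```
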
Phrase-based machine translation

Paraphrases can also be generated through the use of phrase-based translation, as proposed by Bannard and Callison-Burch. The chief concept consists of aligning phrases in a pivot language to produce potential paraphrases in the original language. For example, the phrase "under control" in an English sentence is aligned with the phrase "unter Kontrolle" in its German counterpart. The phrase "unter Kontrolle" is then found in another German sentence whose aligned English phrase is "in check", a paraphrase of "under control".

The probability distribution can be modeled as Pr(e_2 | e_1), the probability that phrase e_2 is a paraphrase of e_1, which is equivalent to Pr(e_2 | f) Pr(f | e_1) summed over all f, the potential phrase translations in the pivot language. Additionally, the sentence S is added as a prior to give the paraphrase context. The optimal paraphrase, ê_2, can thus be modeled as:

    \hat{e}_2 = \arg\max_{e_2 \neq e_1} \Pr(e_2 \mid e_1, S) = \arg\max_{e_2 \neq e_1} \sum_{f} \Pr(e_2 \mid f, S) \Pr(f \mid e_1, S)

Pr(e_2 | f) and Pr(f | e_1) can be approximated by simply taking their relative frequencies in the aligned corpora. Adding S as a prior is modeled by calculating the probability of forming the sentence S when e_1 is substituted with e_2.

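To make the pivot marginalization concrete, the following is a minimal sketch that estimates Pr(e_2 | e_1) from raw phrase-alignment counts using relative frequencies. The toy count tables and helper functions are assumptions for illustration, not Bannard and Callison-Burch's implementation, and the sentence prior S is omitted.

```python
# Minimal sketch: estimate Pr(e2 | e1) by marginalizing over pivot-language phrases f,
# using relative frequencies from (hypothetical) phrase-alignment counts.

# count_e_given_f[f][e] = how often English phrase e was aligned to pivot phrase f
count_e_given_f = {
    "unter kontrolle": {"under control": 8, "in check": 4},
}
# count_f_given_e[e][f] = how often pivot phrase f was aligned to English phrase e
count_f_given_e = {
    "under control": {"unter kontrolle": 8, "im griff": 2},
}

def prob(table: dict, given: str, outcome: str) -> float:
    """Relative frequency Pr(outcome | given) from a nested count table."""
    counts = table.get(given, {})
    total = sum(counts.values())
    return counts.get(outcome, 0) / total if total else 0.0

def paraphrase_prob(e2: str, e1: str) -> float:
    """Pr(e2 | e1) = sum over pivot phrases f of Pr(e2 | f) * Pr(f | e1)."""
    pivots = count_f_given_e.get(e1, {})
    return sum(prob(count_e_given_f, f, e2) * prob(count_f_given_e, e1, f) for f in pivots)

print(paraphrase_prob("in check", "under control"))  # (4/12) * (8/10) ≈ 0.267
```
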
Long short-term memory

There has been success in using long short-term memory (LSTM) models to generate paraphrases. In short, the model consists of an encoder and a decoder, both implemented using variations of a stacked residual LSTM. First, the encoding LSTM takes a one-hot encoding of all the words in a sentence as input and produces a final hidden vector, which can represent the input sentence. The decoding LSTM then takes the hidden vector as input and generates a new sentence, terminating in an end-of-sentence token. The encoder and decoder are trained to take a phrase and reproduce the one-hot distribution of a corresponding paraphrase by minimizing perplexity using simple stochastic gradient descent. New paraphrases are generated by feeding a new phrase to the encoder and passing its output to the decoder.

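The following is a minimal PyTorch sketch of such an encoder-decoder, assuming toy dimensions and greedy decoding. Token indices stand in for explicit one-hot vectors, and the residual stacking and perplexity-minimizing training loop of the published models are omitted, so all names and hyperparameters here are illustrative.

```python
# Minimal sketch of an LSTM encoder-decoder for paraphrase generation (PyTorch).
# Vocabulary size, dimensions, and greedy decoding are illustrative assumptions.
import torch
import torch.nn as nn

VOCAB, EMB, HID, EOS = 1000, 64, 128, 1   # toy vocabulary size, dims, end-of-sentence id

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.lstm = nn.LSTM(EMB, HID, num_layers=2, batch_first=True)

    def forward(self, tokens):                   # tokens: (batch, src_len)
        _, state = self.lstm(self.emb(tokens))   # keep only the final (h, c) state
        return state

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.lstm = nn.LSTM(EMB, HID, num_layers=2, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)         # distribution over the vocabulary

    def forward(self, tokens, state):
        output, state = self.lstm(self.emb(tokens), state)
        return self.out(output), state

def greedy_paraphrase(encoder, decoder, src, max_len=20):
    """Encode a source sentence and greedily decode a new sentence until EOS."""
    state = encoder(src)
    token = torch.zeros(1, 1, dtype=torch.long)  # assumed start-of-sentence id 0
    result = []
    for _ in range(max_len):
        logits, state = decoder(token, state)
        token = logits.argmax(dim=-1)            # pick the most probable next word
        if token.item() == EOS:
            break
        result.append(token.item())
    return result

print(greedy_paraphrase(Encoder(), Decoder(), torch.randint(2, VOCAB, (1, 5))))
```
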
Transformers

With the introduction of Transformer models, paraphrase generation approaches improved their ability to generate text by scaling neural network parameters and heavily parallelizing training through feed-forward layers. These models are so fluent in generating text that human experts cannot identify whether an example was human-authored or machine-generated. Transformer-based paraphrase generation relies on autoencoding, autoregressive, or sequence-to-sequence methods. Autoencoder models predict word replacement candidates with a one-hot distribution over the vocabulary, while autoregressive and sequence-to-sequence models generate new text based on the source, predicting one word at a time. More advanced efforts also exist to make paraphrasing controllable according to predefined quality dimensions, such as semantic preservation or lexical diversity. Many Transformer-based paraphrase generation methods rely on unsupervised learning to leverage large amounts of training data and scale their methods.

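As a sketch of the sequence-to-sequence style, the following uses the Hugging Face transformers library to generate paraphrase candidates with beam search. The checkpoint name is only a placeholder; a model actually fine-tuned for paraphrasing would be substituted, and the task prefix and beam settings are arbitrary assumptions.

```python
# Sketch: generating paraphrase candidates with a sequence-to-sequence Transformer.
# "t5-small" is a stand-in; a checkpoint fine-tuned for paraphrasing would replace it.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "t5-small"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

source = "paraphrase: The committee kept the situation under control."
inputs = tokenizer(source, return_tensors="pt")
outputs = model.generate(
    **inputs,
    num_beams=5,              # beam search over candidate word sequences
    num_return_sequences=3,   # several lexically different candidates
    max_new_tokens=32,
)
for candidate in tokenizer.batch_decode(outputs, skip_special_tokens=True):
    print(candidate)
```
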
Paraphrase recognition

Recursive autoencoders

Paraphrase recognition has been attempted by Socher et al. through the use of recursive autoencoders. The main concept is to produce a vector representation of a sentence and its components by recursively applying an autoencoder. Since paraphrases should have similar vector representations, the resulting vectors are processed and then fed as input into a neural network for classification.

Given a sentence W with m words, the autoencoder is designed to take two n-dimensional word embeddings as input and produce an n-dimensional vector as output. The same autoencoder is applied to every pair of words in W to produce ⌊m/2⌋ vectors. The autoencoder is then applied recursively with the new vectors as inputs until a single vector is produced. Given an odd number of inputs, the first vector is forwarded as-is to the next level of recursion. The autoencoder is trained to reproduce every vector in the full recursion tree, including the initial word embeddings.

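The following is a minimal numerical sketch of this recursive folding, assuming a single affine layer with a tanh nonlinearity as the encoder. The weights are random stand-ins rather than trained parameters, and the reconstruction objective used for training is omitted.

```python
# Sketch: recursively fold word embeddings into a single sentence vector,
# collecting every intermediate vector of the recursion tree.
# A random affine map with tanh stands in for a trained autoencoder's encoder.
import numpy as np

rng = np.random.default_rng(0)
n = 4                                        # toy embedding dimension
weights = rng.standard_normal((n, 2 * n))    # untrained stand-in for encoder weights
bias = rng.standard_normal(n)

def encode_pair(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Map two n-dimensional vectors to a single n-dimensional vector."""
    return np.tanh(weights @ np.concatenate([left, right]) + bias)

def fold(vectors: list) -> list:
    """Apply the encoder recursively until one vector remains, keeping all vectors."""
    tree = list(vectors)                     # the initial word embeddings
    level = list(vectors)
    while len(level) > 1:
        nxt = []
        if len(level) % 2 == 1:              # odd count: forward the first vector as-is
            nxt.append(level[0])
            level = level[1:]
        for i in range(0, len(level), 2):
            combined = encode_pair(level[i], level[i + 1])
            nxt.append(combined)
            tree.append(combined)            # record every newly produced vector
        level = nxt
    return tree

words = [rng.standard_normal(n) for _ in range(4)]   # a 4-word toy "sentence"
print(len(fold(words)))                              # 4 word embeddings -> 7 vectors
```
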
Given two sentences W_1 and W_2 of length 4 and 3 respectively, the autoencoders would produce 7 and 5 vector representations, including the initial word embeddings. The Euclidean distance is then taken between every combination of vectors in W_1 and W_2 to produce a similarity matrix S ∈ ℝ^(7×5). Because similarity matrices are not uniform in size among all potential sentence pairs, S is subjected to a dynamic min-pooling layer that splits S into n_p roughly even sections and produces a fixed-size n_p × n_p matrix. The output is then normalized to have mean 0 and standard deviation 1 and is fed into a fully connected layer with a softmax output. The dynamic pooling to softmax model is trained using pairs of known paraphrases.

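A minimal sketch of the similarity matrix and the dynamic min-pooling step is shown below. The sentence vectors are random placeholders and n_p is chosen arbitrarily; only the shapes mirror the 4-word and 3-word example above.

```python
# Sketch: pairwise Euclidean distances between two sentences' vectors,
# followed by dynamic min-pooling down to a fixed n_p x n_p matrix.
import numpy as np

rng = np.random.default_rng(0)
n = 4                                        # toy embedding dimension
vecs_1 = rng.standard_normal((7, n))         # 4-word sentence -> 7 vectors
vecs_2 = rng.standard_normal((5, n))         # 3-word sentence -> 5 vectors

# Similarity (distance) matrix S with S[i, j] = ||vecs_1[i] - vecs_2[j]||
S = np.linalg.norm(vecs_1[:, None, :] - vecs_2[None, :, :], axis=-1)   # shape (7, 5)

def dynamic_min_pool(S: np.ndarray, n_p: int = 3) -> np.ndarray:
    """Split S into n_p roughly even row/column sections and keep each section's minimum."""
    pooled = np.empty((n_p, n_p))
    row_chunks = np.array_split(np.arange(S.shape[0]), n_p)
    col_chunks = np.array_split(np.arange(S.shape[1]), n_p)
    for i, rows in enumerate(row_chunks):
        for j, cols in enumerate(col_chunks):
            pooled[i, j] = S[np.ix_(rows, cols)].min()
    return pooled

pooled = dynamic_min_pool(S)
normalized = (pooled - pooled.mean()) / pooled.std()   # mean 0, standard deviation 1
print(normalized.shape)                                # (3, 3), ready for a classifier
```
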
Skip-thought vectors

Skip-thought vectors are an attempt to create a vector representation of the semantic meaning of a sentence, in a similar fashion to the skip-gram model. Skip-thought vectors are produced through the use of a skip-thought model, which consists of three key components: an encoder and two decoders. Given a corpus of documents, the skip-thought model is trained to take a sentence as input and encode it into a skip-thought vector. The skip-thought vector is used as input for both decoders; one attempts to reproduce the previous sentence and the other the following sentence in its entirety. The encoder and decoders can be implemented with a recurrent neural network (RNN) or an LSTM.

Since paraphrases carry the same semantic meaning as one another, they should have similar skip-thought vectors. Thus, a simple logistic regression can be trained to good performance with the absolute difference and component-wise product of two skip-thought vectors as input.

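The following is a minimal sketch of that classifier using scikit-learn, assuming the skip-thought vectors have already been computed; random vectors and random labels stand in for real encoded sentence pairs.

```python
# Sketch: paraphrase classification from pairs of (precomputed) skip-thought vectors
# using the absolute difference and component-wise product as features.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
dim, pairs = 32, 200                      # toy vector size and number of sentence pairs
v1 = rng.standard_normal((pairs, dim))    # stand-ins for skip-thought vectors
v2 = rng.standard_normal((pairs, dim))
labels = rng.integers(0, 2, size=pairs)   # 1 = paraphrase, 0 = not (toy labels)

features = np.hstack([np.abs(v1 - v2), v1 * v2])   # |v1 - v2| and elementwise v1 * v2

classifier = LogisticRegression(max_iter=1000).fit(features, labels)
print(classifier.predict(features[:5]))
```
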
Transformers

Similar to how Transformer models influenced paraphrase generation, their application to identifying paraphrases has shown great success. Models such as BERT can be adapted with a binary classification layer and trained end-to-end on identification tasks. Transformers achieve strong results when transferring between domains and paraphrasing techniques compared to more traditional machine learning methods such as logistic regression. Other successful methods based on the Transformer architecture include adversarial learning and meta-learning.

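A minimal sketch of adapting a BERT-style model for paraphrase identification with the Hugging Face transformers library follows. The checkpoint is a placeholder, the classification head is freshly initialized, and the fine-tuning loop on labeled pairs is omitted, so the printed probabilities are meaningless until the model is trained.

```python
# Sketch: BERT with a binary classification head scoring a sentence pair.
# The head is untrained here; fine-tuning on labeled paraphrase pairs is omitted.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

checkpoint = "bert-base-uncased"          # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

inputs = tokenizer(
    "The situation is under control.",
    "The situation is in check.",
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits       # shape (1, 2): [not paraphrase, paraphrase]
print(torch.softmax(logits, dim=-1))
```
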
Evaluation

Multiple methods can be used to evaluate paraphrases. Since paraphrase recognition can be posed as a classification problem, most standard evaluation metrics, such as accuracy, F1 score, or an ROC curve, do relatively well. However, F1 scores are difficult to calculate because of the trouble of producing a complete list of paraphrases for a given phrase and because good paraphrases are dependent upon context. A metric designed to counter these problems is ParaMetric. ParaMetric aims to calculate the precision and recall of an automatic paraphrase system by comparing the automatic alignment of paraphrases to a manual alignment of similar phrases. Since ParaMetric simply rates the quality of phrase alignment, it can also be used to rate a paraphrase generation system, assuming that system uses phrase alignment as part of its generation process. A notable drawback of ParaMetric is the large and exhaustive set of manual alignments that must be created before a rating can be produced.

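A minimal sketch of these standard classification metrics, computed with scikit-learn over fabricated predictions, is shown below; the labels and scores are toy values chosen only to illustrate the calls.

```python
# Sketch: standard classification metrics for a paraphrase recognizer's predictions.
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

true_labels = [1, 0, 1, 1, 0, 0, 1, 0]              # 1 = paraphrase pair, 0 = not
predicted = [1, 0, 1, 0, 0, 1, 1, 0]                # hard decisions from a recognizer
scores = [0.9, 0.2, 0.8, 0.4, 0.3, 0.6, 0.7, 0.1]   # its probability estimates

print("accuracy:", accuracy_score(true_labels, predicted))
print("F1:", f1_score(true_labels, predicted))
print("ROC AUC:", roc_auc_score(true_labels, scores))
```
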
The evaluation of paraphrase generation has difficulties similar to those of the evaluation of machine translation. The quality of a paraphrase depends on its context, whether it is being used as a summary, and how it is generated, among other factors. Additionally, a good paraphrase is usually lexically dissimilar from its source phrase. The simplest method used to evaluate paraphrase generation is the use of human judges. Unfortunately, evaluation through human judges tends to be time-consuming. Automated approaches to evaluation also prove challenging, as evaluation is essentially a problem as difficult as paraphrase recognition. While originally used to evaluate machine translations, bilingual evaluation understudy (BLEU) has been used successfully to evaluate paraphrase generation models as well. However, paraphrases often have several lexically different but equally valid solutions, which hurts BLEU and other similar evaluation metrics.

Metrics specifically designed to evaluate paraphrase generation include paraphrase in n-gram change (PINC) and paraphrase evaluation metric (PEM), along with the aforementioned ParaMetric. PINC is designed to be used with BLEU and to help cover its inadequacies. Since BLEU has difficulty measuring lexical dissimilarity, PINC measures the lack of n-gram overlap between a source sentence and a candidate paraphrase. It is essentially the Jaccard distance between the sentences, excluding n-grams that appear in the source sentence in order to maintain some semantic equivalence. PEM, on the other hand, attempts to evaluate the "adequacy, fluency, and lexical dissimilarity" of paraphrases by returning a single-value heuristic calculated using n-gram overlap in a pivot language. However, a large drawback of PEM is that it must be trained using large, in-domain parallel corpora as well as human judges; it is equivalent to training a paraphrase recognition system in order to evaluate a paraphrase generation system.

The Quora Question Pairs Dataset, which contains hundreds of thousands of duplicate questions, has become a common dataset for the evaluation of paraphrase detectors. Consistently reliable paraphrase detectors have all used the Transformer architecture, and all have relied on large amounts of pre-training with more general data before fine-tuning with the question pairs.

See also

- Round-trip translation
- Text simplification
- Text normalization

References

- Barzilay, Regina; Lee, Lillian (2003). "Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment". Proceedings of HLT-NAACL 2003.
- Bannard, Colin; Callison-Burch, Chris (2005). "Paraphrasing with Bilingual Parallel Corpora". Proceedings of the 43rd Annual Meeting of the ACL. Ann Arbor, Michigan. pp. 597–604.
- Callison-Burch, Chris (2008). "Syntactic Constraints on Paraphrases Extracted from Parallel Corpora". Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '08). Honolulu, Hawaii. pp. 196–205.
- Callison-Burch, Chris; Cohn, Trevor; Lapata, Mirella (2008). "ParaMetric: An Automatic Evaluation Metric for Paraphrasing". Proceedings of the 22nd International Conference on Computational Linguistics. Manchester. pp. 97–104.
- Liu, Chang; Dahlmeier, Daniel; Ng, Hwee Tou (2010). "PEM: A Paraphrase Evaluation Metric Exploiting Parallel Texts". Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing. MIT, Massachusetts. pp. 923–932.
- Chen, David; Dolan, William (2011). "Collecting Highly Parallel Data for Paraphrase Evaluation". Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Portland, Oregon. pp. 190–200.
- Socher, Richard; Huang, Eric; Pennington, Jeffrey; Ng, Andrew; Manning, Christopher (2011). "Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection". Advances in Neural Information Processing Systems 24.
- Berant, Jonathan; Liang, Percy (2014). "Semantic Parsing via Paraphrasing". Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers).
- Kiros, Ryan; Zhu, Yukun; Salakhutdinov, Ruslan; Zemel, Richard; Torralba, Antonio; Urtasun, Raquel; Fidler, Sanja (2015). "Skip-Thought Vectors". arXiv:1506.06726.
- Prakash, Aaditya; Hasan, Sadid A.; Lee, Kathy; Datla, Vivek; Qadir, Ashequl; Liu, Joey; Farri, Oladimeji (2016). "Neural Paraphrase Generation with Stacked Residual LSTM Networks". arXiv:1610.03098.
- Devlin, Jacob; Chang, Ming-Wei; Lee, Kenton; Toutanova, Kristina (2019). "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding". Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics. Minneapolis, Minnesota. pp. 4171–4186.
- Liu, Xianggen; Mou, Lili; Meng, Fandong; Zhou, Hao; Zhou, Jie; Song, Sen (2020). "Unsupervised Paraphrasing by Simulated Annealing". Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. pp. 302–312.
- Dopierre, Thomas; Gravier, Christophe; Logerais, Wilfried (2021). "ProtAugment: Intent Detection Meta-Learning through Unsupervised Diverse Paraphrasing". Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). pp. 2454–2466.
- Nighojkar, Animesh; Licato, John (2021). "Improving Paraphrase Detection with the Adversarial Paraphrasing Task". Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). pp. 7106–7116.
- Niu, Tong; Yavuz, Semih; Zhou, Yingbo; Keskar, Nitish Shirish; Wang, Huan; Xiong, Caiming (2021). "Unsupervised Paraphrasing with Pretrained Language Models". Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. pp. 5136–5150.
- Zhou, Jianing; Bhat, Suma (2021). "Paraphrase Generation: A Survey of the State of the Art". Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. pp. 5075–5086.
- Wahle, Jan Philip; Ruas, Terry; Meuschke, Norman; Gipp, Bela (2021). "Are Neural Language Models Good Plagiarists? A Benchmark for Neural Paraphrase Detection". 2021 ACM/IEEE Joint Conference on Digital Libraries (JCDL). Champaign, IL, USA. pp. 226–229.
- Bandel, Elron; Aharonov, Ranit; Shmueli-Scheuer, Michal; Shnayderman, Ilya; Slonim, Noam; Ein-Dor, Liat (2022). "Quality Controlled Paraphrase Generation". Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). pp. 596–609.
- Dou, Yao; Forbes, Maxwell; Koncel-Kedziorski, Rik; Smith, Noah; Choi, Yejin (2022). "Is GPT-3 Text Indistinguishable from Human Text? Scarecrow: A Framework for Scrutinizing Machine Text". Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). pp. 7250–7274.
- Lee, John Sie Yuen; Lim, Ho Hung; Webster, Carol (2022). "Unsupervised Paraphrasability Prediction for Compound Nominalizations". Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. pp. 3254–3263.
- Wahle, Jan Philip; Ruas, Terry; Kirstein, Frederic; Gipp, Bela (2022). "How Large Language Models are Transforming Machine-Paraphrase Plagiarism". Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. pp. 952–963.
- Wahle, Jan Philip; Ruas, Terry; Foltýnek, Tomáš; Meuschke, Norman; Gipp, Bela (2022). "Identifying Machine-Paraphrased Plagiarism". In Smits, Malte (ed.), Information for a Better World: Shaping the Global Future. Cham: Springer International Publishing. pp. 393–413.

External links

- Microsoft Research Paraphrase Corpus - a dataset consisting of 5800 pairs of sentences extracted from news articles, annotated to note whether a pair captures semantic equivalence
- Paraphrase Database (PPDB) - a searchable database containing millions of paraphrases in 16 different languages
- Paraphrase Identification on Quora Question Pairs - benchmark results on the Quora duplicate-question dataset (Papers with Code)