List of software to detect low complexity regions in proteins

Computational methods can analyze protein sequences to identify regions of low complexity (low complexity regions, LCRs), which can have particular properties regarding their function and structure.

| Name | Last update | Usage | Description | Open source? | Reference |
|---|---|---|---|---|---|
| SAPS | 1992 | downloadable / web | It describes several protein sequence statistics for the evaluation of distinctive characteristics of residue content and arrangement in primary structures. | yes | [1] |
| SEG | 1993 | downloadable | It is a two-pass algorithm: it first identifies the LCRs, and then performs local optimization, masking the LCRs with Xs. | yes | [2] |
| fLPS | 2017 | downloadable / web | It can readily handle very large protein data sets, such as might come from metagenomics projects. It is useful in searching for proteins with similar compositionally biased regions (CBRs) and for making functional inferences about CBRs for a protein of interest. | yes | [3] |
| CAST | 2000 | web | It identifies LCRs using dynamic programming. | no | [4] |
| SIMPLE | 2002 | downloadable / web | It facilitates the quantification of the amount of simple sequence in proteins and determines the type of short motifs that show clustering above a certain threshold. | yes | [5] |
| 0j.py | 2001 | on request | A tool for demarcating low complexity protein domains. | no | [6] |
| DSR | 2003 | on request | It calculates complexity using reciprocal complexity. | no | [7] |
| ScanCom | 2003 | on request | It calculates the compositional complexity using the linguistic complexity measure. | no | [8] |
| CARD | 2005 | on request | Based on the complexity analysis of subsequences delimited by pairs of identical, repeating subsequences. | no | [9] |
| BIAS | 2006 | downloadable / web | It uses discrete scan statistics that provide a highly accurate multiple test correction to compute analytical estimates of the significance of each compositionally biased segment. | yes | [10] |
| GBA | 2006 | on request | A graph-based algorithm that constructs a graph of the sequence. | no | [11] |
| SubSeqer | 2008 | web | A graph-based approach for the detection and identification of repetitive elements in low-complexity sequences. | no | [12] |
| ANNIE | 2009 | web | This method creates an automation of the sequence analytic process. | no | [13] |
| LPS-annotate | 2011 | on request | This algorithm defines compositional bias through a thorough search for lowest-probability subsequences (LPSs; Low Probability Sequences) and serves as a workbench of tools now available to molecular biologists to generate hypotheses and inferences about the proteins that they are investigating. | no | [14] |
| LCR-eXXXplorer | 2015 | web | A web platform to search, visualize and share data for low complexity regions in protein sequences. LCR-eXXXplorer offers tools for displaying LCRs from the UniProt/SwissProt knowledgebase, in combination with other relevant protein features, predicted or experimentally verified. Also, users may perform queries against a custom designed sequence/LCR-centric database. | no | [15] |
| XNU | 1993 | downloadable | It uses the PAM120 scoring matrix for the calculation of complexity. | yes | [16] |
| AlcoR | 2022 | downloadable | A compression-based and alignment-free tool for detecting low-complexity regions in biological data. | yes | [17] |

For a comprehensive review of the various methods and tools, see [18].

In addition, a web meta-server named PLAtform of TOols for LOw COmplexity (PlaToLoCo) has been developed for visualization and annotation of low complexity regions in proteins. PlaToLoCo integrates and collects the output of five different state-of-the-art tools for discovering LCRs and provides functional annotations such as domain detection, transmembrane segment prediction, and calculation of amino acid frequencies. Furthermore, the union or intersection of the results of the search on a query sequence can be obtained. [19]

A neural network web server named LCR-hound has been developed to predict the function of prokaryotic and eukaryotic LCRs, based on their amino acid or di-amino acid content. [20]

References

1. Brendel V, Bucher P, Nourbakhsh IR, Blaisdell BE, Karlin S (15 Mar 1992). "Methods and algorithms for statistical analysis of protein sequences". Proc Natl Acad Sci U S A. 89 (6): 2002–2006. Bibcode:1992PNAS...89.2002B. doi:10.1073/pnas.89.6.2002. PMC 48584. PMID 1549558.
2. Wootton JC, Federhen S (June 1993). "Statistics of local complexity in amino acid sequences and sequence databases". Computers and Chemistry. 17 (2): 149–163. doi:10.1016/0097-8485(93)85006-X.
3. Harrison PM (13 Nov 2017). "fLPS: Fast discovery of compositional biases for the protein universe". BMC Bioinformatics. 18 (1): 476. doi:10.1186/s12859-017-1906-3. PMC 5684748. PMID 29132292.
4. Promponas VJ, Enright AJ, Tsoka S, Kreil DP, Leroy C, Hamodrakas S, Sander C, Ouzounis CA (Oct 2000). "CAST: an iterative algorithm for the complexity analysis of sequence tracts. Complexity analysis of sequence tracts". Bioinformatics. 16 (10): 915–922. doi:10.1093/bioinformatics/16.10.915. PMID 11120681.
5. Albà MM, Laskowski RA, Hancock JM (May 2002). "Detecting cryptically simple protein sequences using the SIMPLE algorithm". Bioinformatics. 18 (5): 672–678. doi:10.1093/bioinformatics/18.5.672. PMID 12050063.
6. Wise MJ (2001). "0j.py: a software tool for low complexity proteins and protein domains". Bioinformatics. 17 (Suppl 1): S288–S295. doi:10.1093/bioinformatics/17.suppl_1.s288. PMID 11473020.
7. Wan H, Li L, Federhen S, Wootton JC (2003). "Discovering simple regions in biological sequences associated with scoring schemes". J Comput Biol. 10 (2): 171–185. doi:10.1089/106652703321825955. PMID 12804090.
8. Nandi T, Dash D, Ghai R, B-Rao C, Kannan K, Brahmachari SK, Ramakrishnan C, Ramachandran S (2003). "A new algorithm for detecting low-complexity regions in protein sequences". J Biomol Struct Dyn. 20 (5): 657–668. doi:10.1080/07391102.2003.10506882. PMID 12643768. S2CID 45635217.
9. Shin SW, Kim SM (15 Jan 2005). "A novel complexity measure for comparative analysis of protein sequences from complete genomes". Bioinformatics. 21 (2): 160–170. doi:10.1093/bioinformatics/bth497. PMID 15333459.
10. Kuznetsov IB, Hwang S (1 May 2006). "A novel sensitive method for the detection of user-defined compositional bias in biological sequences". Bioinformatics. 22 (9): 1055–1063. doi:10.1093/bioinformatics/btl049. PMID 16500936.
11. Li X, Kahveci T (15 Dec 2006). "A Novel algorithm for identifying low-complexity regions in a protein sequence". Bioinformatics. 22 (24): 2980–2987. doi:10.1093/bioinformatics/btl495. PMID 17018537.
12. He D, Parkinson J (1 Apr 2008). "SubSeqer: a graph-based approach for the detection and identification of repetitive elements in low-complexity sequences". Bioinformatics. 24 (7): 1016–1017. doi:10.1093/bioinformatics/btn073. PMID 18304932.
13. Ooi HS, Kwo CY, Wildpaner M, Sirota FL, Eisenhaber B, Maurer-Stroh S, Wong WC, Schleiffer A, Eisenhaber F, Schneider G (Jul 2009). "ANNIE: integrated de novo protein sequence annotation". Nucleic Acids Res. 37 (Web server issue): W435–W440. doi:10.1093/nar/gkp254. PMC 2703921. PMID 19389726.
14. Harbi D, Kumar M, Harrison PM (6 Jan 2011). "LPS-annotate: complete annotation of compositionally biased regions in the protein knowledgebase". Database (Oxford). 2011: baq031. doi:10.1093/database/baq031. PMC 3017391. PMID 21216786.
15. Kirmitzoglou I, Promponas VJ (1 Jul 2015). "LCR-eXXXplorer: a web platform to search, visualize and share data for low complexity regions in protein sequences". Bioinformatics. 31 (13): 2208–2210. doi:10.1093/bioinformatics/btv115. PMC 4481844. PMID 25712690.
16. Claverie JM, States D (June 1993). "Information enhancement methods for large scale sequence analysis". Computers Chem. 17 (2): 191–201. doi:10.1016/0097-8485(93)85010-a.
17. Silva JM, Qi W, Pinho AJ, Pratas D (2022-12-28). "AlcoR: alignment-free simulation, mapping, and visualization of low-complexity regions in biological data". GigaScience. 12. doi:10.1093/gigascience/giad101. ISSN 2047-217X. PMC 10716826. PMID 38091509.
18. Mier P, Paladin L, Tamana S, Petrosian S, Hajdu-Soltész B, Urbanek A, Gruca A, Plewczynski D, Grynberg M, Bernadó P, Gáspári Z (2020-03-23). "Disentangling the complexity of low complexity proteins". Briefings in Bioinformatics. 21 (2): 458–472. doi:10.1093/bib/bbz007. ISSN 1467-5463. PMC 7299295. PMID 30698641.
19. Jarnot P, Ziemska-Legiecka J, Dobson L, Merski M, Mier P, Andrade-Navarro MA, Hancock JM, Dosztányi Z, Paladin L, Necci M, Piovesan D (2020-07-02). "PlaToLoCo: the first web meta-server for visualization and annotation of low complexity regions in proteins". Nucleic Acids Research. 48 (W1): W77–W84. doi:10.1093/nar/gkaa339. ISSN 0305-1048. PMC 7319588. PMID 32421769.
20. Ntountoumi C, Vlastaridis P, Mossialos D, Stathopoulos C, Iliopoulos I, Promponas V, Oliver SG, Amoutzias GD (2019-11-04). "Low complexity regions in the proteins of prokaryotes perform important functional roles and are highly conserved". Nucleic Acids Research. 47 (19): 9998–10009. doi:10.1093/nar/gkz730. ISSN 0305-1048. PMC 6821194. PMID 31504783.

Categories: Proteomics · Lists of software
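
As a rough illustration of what "low complexity" means for the tools listed in this article, the sketch below flags stretches of a protein sequence whose sliding-window Shannon entropy falls at or below a cutoff. It is not the algorithm of SEG, CAST, or any other tool named here; the function names, the window length of 12, and the 2.2-bit threshold are assumptions chosen purely for illustration.

```python
import math
from collections import Counter


def shannon_entropy(segment: str) -> float:
    """Shannon entropy (in bits) of the residue composition of a segment."""
    counts = Counter(segment)
    total = len(segment)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def low_complexity_windows(sequence: str, window: int = 12, threshold: float = 2.2):
    """Return merged half-open (start, end) intervals covered by low-entropy windows.

    `window` and `threshold` are illustrative defaults, not values taken
    from any published tool.
    """
    regions = []
    for start in range(len(sequence) - window + 1):
        if shannon_entropy(sequence[start:start + window]) <= threshold:
            end = start + window
            if regions and start <= regions[-1][1]:
                # Overlaps or touches the previous region: extend it.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


if __name__ == "__main__":
    # Hypothetical sequence: a poly-Q tract flanked by compositionally diverse residues.
    example = "ACDEFGHIKLMN" + "Q" * 12 + "PRSTVWYACDEF"
    for start, end in low_complexity_windows(example):
        print(start, end, example[start:end])
```

A fixed window and a single cutoff keep the sketch short; the tools in the table differ precisely in how they define complexity and how they assess significance, so this should be read only as a baseline for intuition.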
