
CatBoost


CatBoost is an open-source software library developed by Yandex. It provides a gradient boosting framework which, among other features, attempts to handle categorical features using a permutation-driven alternative to the classical algorithm. It works on Linux, Windows, and macOS, and is available in Python and R. Models built using CatBoost can be used for predictions in C++, Java, C#, Rust, Core ML, ONNX, and PMML. The source code is licensed under the Apache License and available on GitHub.

InfoWorld magazine named the library among "The best machine learning tools" in 2017, along with TensorFlow, PyTorch, XGBoost and 8 other libraries.

Kaggle listed CatBoost as one of the most frequently used machine learning (ML) frameworks in the world. It ranked eighth among the most frequently used ML frameworks in the 2020 survey and seventh in the 2021 survey.

As of April 2022, CatBoost was installed about 100,000 times per day from the PyPI repository.

History

In 2009 Andrey Gulin developed MatrixNet, a proprietary gradient boosting library that was used in Yandex to rank search results. Since 2009 MatrixNet has been used in different projects in Yandex, including recommendation systems and weather prediction.

In 2014–2015 Andrey Gulin and a team of researchers started a new project called Tensornet, aimed at solving the problem of "how to work with categorical data". It resulted in several proprietary gradient boosting libraries with different approaches to handling categorical data.

In 2016 the Machine Learning Infrastructure team led by Anna Dorogush started working on gradient boosting in Yandex, including MatrixNet and Tensornet. They implemented and open-sourced the next version of the gradient boosting library, called CatBoost, which supports categorical and text data, GPU training, model analysis, and visualisation tools.

CatBoost was open-sourced in July 2017 and is under active development in Yandex and the open-source community.

Application

JetBrains uses CatBoost for code completion
Cloudflare uses CatBoost for bot detection
Careem uses CatBoost to predict future destinations of the rides

Features

CatBoost has gained popularity compared to other gradient boosting algorithms primarily due to the following features:

Native handling for categorical features
Fast GPU training
Visualizations and tools for model and feature analysis
Using oblivious trees or symmetric trees for faster execution
Ordered boosting to reduce overfitting
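
Two of the features listed for CatBoost, the permutation-driven handling of categorical features and the use of oblivious (symmetric) trees, can be illustrated with a small sketch. The code below is a simplified illustration in plain Python, not the library's implementation: the function names, the smoothing weight `a`, and the `prior` argument are assumptions made for this example. The first function encodes each categorical value from the targets of rows that precede it in a random permutation; the second evaluates a tree in which every level shares a single split.

```python
import random


def ordered_target_stats(categories, targets, prior, a=1.0, seed=0):
    """Simplified ordered target statistic (illustrative, hypothetical API).

    Each row's category is encoded using only the targets of rows that come
    earlier in a random permutation, smoothed towards `prior` with weight `a`.
    """
    order = list(range(len(categories)))
    random.Random(seed).shuffle(order)  # the random permutation of rows

    sums = {}    # running sum of targets seen so far, per category
    counts = {}  # running number of rows seen so far, per category
    encoded = [0.0] * len(categories)
    for idx in order:
        cat = categories[idx]
        s = sums.get(cat, 0.0)
        n = counts.get(cat, 0)
        # Only earlier rows contribute, so the row's own target is not used.
        encoded[idx] = (s + a * prior) / (n + a)
        sums[cat] = s + targets[idx]
        counts[cat] = n + 1
    return encoded


def oblivious_tree_predict(x, splits, leaf_values):
    """Evaluate an oblivious tree (illustrative, hypothetical API).

    Every level applies the same (feature_index, threshold) test to all
    nodes, so the leaf index is just the bit pattern of the test results.
    """
    leaf = 0
    for feature_index, threshold in splits:
        bit = 1 if x[feature_index] > threshold else 0
        leaf = (leaf << 1) | bit
    return leaf_values[leaf]


if __name__ == "__main__":
    enc = ordered_target_stats(["a", "a", "b"], [1, 0, 1], prior=0.5)
    print(enc)
    tree_splits = [(0, 0.5), (1, 2.0)]  # one split per level
    print(oblivious_tree_predict([1.0, 1.0], tree_splits, [10, 20, 30, 40]))
```

In this sketch the first row of each category seen in the permutation receives exactly the prior, which is what keeps a row's own label out of its encoding. The tree of depth d needs only d comparisons and one table lookup per prediction, which is one plausible reason symmetric trees execute quickly.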
See also

Machine learning
Gradient boosting
XGBoost
LightGBM
scikit-learn

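Original author(s): Andrey Gulin / Yandex
Developer(s): Yandex and CatBoost Contributors
Initial release: July 18, 2017
Stable release: 1.2.3 / February 23, 2024
Written in: Python, R, C++, Java
Operating system: Linux, macOS, Windows
Type: Machine learning
License: Apache License 2.0
Website: catboost.ai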

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.
