
First-order inductive learner


In machine learning, first-order inductive learner (FOIL) is a rule-based learning algorithm.

Background

Developed in 1990 by Ross Quinlan, FOIL learns function-free Horn clauses, a subset of first-order predicate calculus. Given positive and negative examples of some concept and a set of background-knowledge predicates, FOIL inductively generates a logical concept definition or rule for the concept. The induced rule must not involve any constants (color(X,red) becomes color(X,Y), red(Y)) or function symbols, but may allow negated predicates; recursive concepts are also learnable.

Like the ID3 algorithm, FOIL hill climbs using a metric based on information theory to construct a rule that covers the data. Unlike ID3, however, FOIL uses a separate-and-conquer method rather than divide-and-conquer, focusing on creating one rule at a time and collecting uncovered examples for the next iteration of the algorithm.
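Concretely, both the background-knowledge predicates and the examples can be given extensionally, as sets of tuples for which a predicate holds. The snippet below is a made-up illustration (the names and facts are arbitrary), matching the grandfather example used later in the article:

# Hypothetical toy data: background predicates stored extensionally as sets of tuples.
father = {("adam", "bill"), ("bill", "carl")}
parent = {("adam", "bill"), ("bill", "carl"), ("ann", "bill")}   # ann is bill's mother

# Positive and negative examples of the target concept grandfather(X,Y).
positive_examples = {("adam", "carl")}
negative_examples = {("ann", "carl"), ("bill", "carl"), ("carl", "adam")}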
Algorithm

The FOIL algorithm is as follows:

Input: List of examples and predicate to be learned
Output: A set of first-order Horn clauses

FOIL(Pred, Pos, Neg)
    Let Pos be the positive examples
    Let Pred be the predicate to be learned
    Until Pos is empty do:
        Let Neg be the negative examples
        Set Body to empty
        Call LearnClauseBody
        Add Pred ← Body to the rule
        Remove from Pos all examples which satisfy Body
    Procedure LearnClauseBody
        Until Neg is empty do:
            Choose a literal L
            Conjoin L to Body
            Remove from Neg examples that do not satisfy L
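The outer loop (one new clause per iteration, with covered positives removed) and the inner loop (greedy literal selection until no negatives remain) can be sketched in Python. This is an illustrative outline, not Quinlan's implementation; candidates, gain and covers are assumed helper functions supplied by the caller:

# Minimal sketch of FOIL's separate-and-conquer loop (illustrative, not Quinlan's code).
# candidates(body) proposes literals to try, gain(literal, body, pos, neg) scores them,
# and covers(body, example) tests whether a clause body is satisfied by an example.
def foil(positives, negatives, candidates, gain, covers):
    rules = []
    pos = set(positives)
    while pos:                                        # Until Pos is empty do:
        body, neg = [], set(negatives)
        while neg:                                    # LearnClauseBody: Until Neg is empty do:
            literal = max(candidates(body),
                          key=lambda lit: gain(lit, body, pos, neg))
            body.append(literal)                      # Conjoin L to Body
            neg = {e for e in neg if covers(body, e)}           # keep negatives still covered
        rules.append(body)                            # Add Pred ← Body to the rule
        pos = {e for e in pos if not covers(body, e)}           # keep only uncovered positives
    return rules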
Example

Suppose FOIL's task is to learn the concept grandfather(X,Y) given the relations father(X,Y) and parent(X,Y). Furthermore, suppose our current Body consists of grandfather(X,Y) ← parent(X,Z). This can be extended by conjoining Body with any of the literals father(X,X), father(Y,Z), parent(U,Y), or many others – to create this literal, the algorithm must choose both a predicate name and a set of variables for the predicate (at least one of which is required to be present already in an unnegated literal of the clause). If FOIL extends a clause grandfather(X,Y) ← true by conjoining the literal parent(X,Z), it is introducing the new variable Z. Positive examples now consist of those values <X,Y,Z> such that grandfather(X,Y) is true and parent(X,Z) is true; negative examples are those values <X,Y,Z> for which grandfather(X,Y) is false but parent(X,Z) is true.

On the next iteration of FOIL after parent(X,Z) has been added, the algorithm will consider all combinations of predicate names and variables such that at least one variable in the new literal is present in the existing clause. This results in a very large search space. Several extensions of the FOIL theory have shown that additions to the basic algorithm may reduce this search space, sometimes drastically.

The size of this search space can be estimated as follows. Let Var be the largest number of distinct variables for any clause in rule R, excluding the last conjunct. Let MaxP be the number of predicates with largest arity MaxA. Then an approximation of the number of nodes generated to learn R is NodesSearched ≤ 2 * MaxP * (Var + MaxA – 1)^MaxA, as shown in Pazzani and Kibler (1992).
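At each step the literal actually chosen is the one with the highest information gain. FOIL's gain heuristic (Quinlan, 1990) compares how pure the tuple set is before and after a candidate literal is added; the sketch below uses illustrative parameter names, and the counts in the final line are hypothetical, purely to show the call:

from math import log2

# FOIL's information gain for a candidate literal (Quinlan 1990).
# p0/n0: positive/negative tuples covered before adding the literal,
# p1/n1: positive/negative tuples covered after adding it,
# t: positive tuples covered before that still have an extension afterwards.
def foil_gain(p0, n0, p1, n1, t):
    if p1 == 0:
        return 0.0
    info_before = -log2(p0 / (p0 + n0))
    info_after = -log2(p1 / (p1 + n1))
    return t * (info_before - info_after)

# Hypothetical counts for extending grandfather(X,Y) ← true with parent(X,Z).
print(foil_gain(p0=5, n0=20, p1=6, n1=2, t=4))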
First-order combined learner

The FOCL algorithm (First Order Combined Learner) extends FOIL in a variety of ways, which affect how FOCL selects literals to test while extending a clause under construction. Constraints on the search space are allowed, as are predicates that are defined on a rule rather than on a set of examples (called intensional predicates); most importantly, a potentially incorrect hypothesis is allowed as an initial approximation to the predicate to be learned. The main goal of FOCL is to incorporate the methods of explanation-based learning (EBL) into the empirical methods of FOIL.

Even when no additional knowledge is provided to FOCL over FOIL, however, it utilizes an iterative widening search strategy similar to depth-first search: first FOCL attempts to learn a clause by introducing no free variables. If this fails (no positive gain), one additional free variable per failure is allowed until the number of free variables exceeds the maximum used for any predicate.
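This widening schedule is simple enough to sketch; try_learn_clause and max_vars below are hypothetical stand-ins (a clause-construction attempt with a cap on new free variables, and the largest number of variables used by any predicate), not names from the FOCL paper:

# Illustrative sketch of FOCL's iterative widening (hypothetical helper names).
# try_learn_clause(k) attempts to build a clause while allowing at most k free
# (new) variables, returning None when no literal gives positive gain.
def widening_search(try_learn_clause, max_vars):
    allowed_free_vars = 0
    while allowed_free_vars <= max_vars:
        clause = try_learn_clause(allowed_free_vars)
        if clause is not None:            # a clause with positive gain was found
            return clause
        allowed_free_vars += 1            # widen: permit one more free variable
    return None                           # stop once the limit is exceeded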
Constraints

Unlike FOIL, which does not put typing constraints on its variables, FOCL uses typing as an inexpensive way of incorporating a simple form of background knowledge. For example, a predicate livesAt(X,Y) may have types livesAt(person, location). Additional predicates may need to be introduced, though – without types, nextDoor(X,Y) could determine whether person X and person Y live next door to each other, or whether two locations are next door to each other. With types, two different predicates nextDoor(person, person) and nextDoor(location, location) would need to exist for this functionality to be maintained. However, this typing mechanism eliminates the need for predicates such as isPerson(X) or isLocation(Y), and means FOCL need not consider literals such as livesAt(A,B) when A and B are defined to be person variables, reducing the search space. Additionally, typing can improve the accuracy of the resulting rule by eliminating from consideration impossible literals such as livesAt(A,B) which may nevertheless appear to have a high information gain.

Rather than implementing trivial predicates such as equals(X,X) or between(X,X,Y), FOCL introduces implicit constraints on variables, further reducing the search space. Some predicates must have all variables unique, others must have commutativity (adjacent(X,Y) is equivalent to adjacent(Y,X)), still others may require that a particular variable be present in the current clause, and there are many other potential constraints.
Operational rules

Operational rules are those rules which are defined extensionally, or as a list of tuples for which a predicate is true. FOIL allows only operational rules; FOCL extends its knowledge base to allow combinations of rules called non-operational rules, as well as partially defined or incorrect rules, for robustness. Allowing for partial definitions reduces the amount of work needed, as the algorithm need not generate these partial definitions for itself, and the incorrect rules do not add significantly to the work needed, since they are discarded if they are not judged to provide positive information gain. Non-operational rules are advantageous as the individual rules which they combine may not provide information gain on their own, but are useful when taken in conjunction. If a literal with the most information gain in an iteration of FOCL is non-operational, it is operationalized and its definition is added to the clause under construction:

Input: Literal to be operationalized, List of positive examples, List of negative examples
Output: Literal in operational form

Operationalize(Literal, Positive examples, Negative examples)
    If Literal is operational
        Return Literal
    Initialize OperationalLiterals to the empty set
    For each clause in the definition of Literal
        Compute information gain of the clause over Positive examples and Negative examples
    For the clause with the maximum gain
        For each literal L in the clause
            Add Operationalize(L, Positive examples, Negative examples) to OperationalLiterals

An operational rule might be the literal lessThan(X,Y); a non-operational rule might be between(X,Y,Z) ← lessThan(X,Y), lessThan(Y,Z).
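The recursion can be sketched compactly under assumed data structures: a mapping from each non-operational predicate to its defining clauses, and a gain scorer. The names below are illustrative, not taken from the FOCL paper:

# Illustrative sketch of operationalization (assumed data structures, not FOCL's code).
# definitions maps a non-operational predicate to its defining clauses (lists of literals,
# each literal a (predicate, variables) pair); gain(clause, pos, neg) is an assumed scorer.
def operationalize(literal, pos, neg, definitions, gain):
    pred = literal[0]
    if pred not in definitions:                       # already operational (tuple-defined)
        return [literal]
    best = max(definitions[pred], key=lambda c: gain(c, pos, neg))   # clause with max gain
    operational = []
    for lit in best:                                  # operationalize each literal in turn
        operational.extend(operationalize(lit, pos, neg, definitions, gain))
    return operational

With the between/lessThan example above, operationalizing between(X,Y,Z) would return the two lessThan literals from its definition.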
Initial rules

The addition of non-operational rules to the knowledge base increases the size of the space which FOCL must search. Rather than simply providing the algorithm with a target concept (e.g. grandfather(X,Y)), the algorithm takes as input a set of non-operational rules which it tests for correctness and operationalizes for its learned concept. A correct target concept will clearly improve computational time and accuracy, but even an incorrect concept will give the algorithm a basis from which to work and improve accuracy and time.

References

J.R. Quinlan. Learning Logical Definitions from Relations. Machine Learning, Volume 5, Number 3, 1990.
Michael Pazzani and Dennis Kibler. The Utility of Knowledge in Inductive Learning. Machine Learning, Volume 9, Number 1, 1992.

External links

http://www.csc.liv.ac.uk/~frans/KDD/Software/FOIL_PRM_CPAR/foil.html
