Index that describes the performance of a dichotomous diagnostic test

Youden's J statistic (also called Youden's index) is a single statistic that captures the performance of a dichotomous diagnostic test. (Bookmaker) Informedness is its generalization to the multiclass case and estimates the probability of an informed decision.

Definition

Youden's J statistic is

$$J = \text{sensitivity} + \text{specificity} - 1 = \text{recall}_{1} + \text{recall}_{0} - 1$$

with the two right-hand quantities being sensitivity and specificity. Thus the expanded formula is:

$$J = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}} + \frac{\text{true negatives}}{\text{true negatives} + \text{false positives}} - 1$$

The index was suggested by W. J. Youden in 1950 as a way of summarising the performance of a diagnostic test; however, the formula was earlier published in Science by C. S. Peirce in 1884. Its value ranges from -1 through 1 (inclusive), and is zero when a diagnostic test gives the same proportion of positive results for groups with and without the disease, i.e. the test is useless. A value of 1 indicates that there are no false positives or false negatives, i.e. the test is perfect. The index gives equal weight to false positive and false negative values, so all tests with the same value of the index give the same proportion of total misclassified results. While it is possible to obtain a value of less than zero from this equation, e.g. when classification yields only false positives and false negatives, a value of less than zero just indicates that the positive and negative labels have been switched. After correcting the labels the result will then be in the 0 through 1 range.

Figure: Example of a receiver operating characteristic curve. Solid red: ROC curve; dashed line: chance level; vertical line (J): maximum value of Youden's index for the ROC curve.

Youden's index is often used in conjunction with receiver operating characteristic (ROC) analysis. The index is defined for all points of an ROC curve, and the maximum value of the index may be used as a criterion for selecting the optimum cut-off point when a diagnostic test gives a numeric rather than a dichotomous result. The index is represented graphically as the height above the chance line, and it is also equivalent to the area under the curve subtended by a single operating point.

Youden's index is also known as deltaP' and generalizes from the dichotomous to the multiclass case as informedness.

The use of a single index is "not generally to be recommended", but informedness or Youden's index is the probability of an informed decision (as opposed to a random guess) and takes into account all predictions.

An unrelated but commonly used combination of basic statistics from information retrieval is the F-score, being a (possibly weighted) harmonic mean of recall and precision, where recall = sensitivity = true positive rate. But specificity and precision are totally different measures. F-score, like recall and precision, only considers the so-called positive predictions, with recall being the probability of predicting just the positive class, precision being the probability of a positive prediction being correct, and F-score equating these probabilities under the effective assumption that the positive labels and the positive predictions should have the same distribution and prevalence, similar to the assumption underlying Fleiss' kappa. Youden's J, Informedness, Recall, Precision and F-score are intrinsically unidirectional, aiming to assess the deductive effectiveness of predictions in the direction proposed by a rule, theory or classifier. DeltaP is Youden's J used to assess the reverse or abductive direction (and generalizes to the multiclass case as Markedness), matching well human learning of associations, rules and superstitions as we model possible causation, while correlation and kappa evaluate bidirectionally.

Matthews correlation coefficient is the geometric mean of the regression coefficient of the dichotomous problem and its dual, where the component regression coefficients of the Matthews correlation coefficient are deltaP and deltaP' (that is, Youden's J or Peirce's I). The main article on Matthews correlation coefficient discusses two different generalizations to the multiclass case, one being the analogous geometric mean of Informedness and Markedness. Kappa statistics such as Fleiss' kappa and Cohen's kappa are methods for calculating inter-rater reliability based on different assumptions about the marginal or prior distributions, and are increasingly used as chance corrected alternatives to accuracy in other contexts (including the multiclass case). Fleiss' kappa, like F-score, assumes that both variables are drawn from the same distribution and thus have the same expected prevalence, while Cohen's kappa assumes that the variables are drawn from distinct distributions and referenced to a model of expectation that assumes prevalences are independent.

When the true prevalences for the two positive variables are equal, as assumed in Fleiss' kappa and F-score, that is, when the number of positive predictions matches the number of positive classes in the dichotomous (two-class) case, the different kappa and correlation measures collapse to identity with Youden's J, and recall, precision and F-score are similarly identical with accuracy.
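The definition of J and its use for choosing a cut-off on a numeric test result can be illustrated with a minimal Python sketch. The function names `youden_j` and `best_cutoff`, the rule "score >= cutoff means positive", and the 0/1 label coding are assumptions made for this illustration, not part of any particular library. Ties are resolved by keeping the lowest cut-off that reaches the maximum.

```python
def youden_j(tp, fn, tn, fp):
    """Youden's J = sensitivity + specificity - 1, from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one actual positive and one actual negative")
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity + specificity - 1.0


def best_cutoff(scores, labels):
    """Return (cutoff, J) with the largest J over all observed score values.

    Assumed convention (hypothetical): labels are 1 for diseased and 0 for
    healthy, and a case is called positive when its score >= cutoff.
    """
    best_cut, best_j = None, float("-inf")
    for cutoff in sorted(set(scores)):
        pairs = list(zip(scores, labels))
        tp = sum(1 for s, y in pairs if y == 1 and s >= cutoff)
        fn = sum(1 for s, y in pairs if y == 1 and s < cutoff)
        tn = sum(1 for s, y in pairs if y == 0 and s < cutoff)
        fp = sum(1 for s, y in pairs if y == 0 and s >= cutoff)
        j = youden_j(tp, fn, tn, fp)
        if j > best_j:  # strict comparison keeps the lowest cutoff on ties
            best_cut, best_j = cutoff, j
    return best_cut, best_j


if __name__ == "__main__":
    # Made-up illustrative data, not taken from any study.
    scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
    labels = [0, 0, 0, 1, 0, 1, 1, 1]
    print(youden_j(tp=10, fn=0, tn=10, fp=0))   # perfect test
    print(youden_j(tp=0, fn=10, tn=0, fp=10))   # labels effectively switched
    print(best_cutoff(scores, labels))
```

With these made-up data the perfect test gives J = 1, the fully inverted test gives J = -1, and a test that calls half of each group positive gives J = 0, matching the range described in the definition. Scanning every observed score is quadratic in the number of cases. That is adequate for a sketch, but a sorted single pass would be preferable for large samples.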
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.