Knowledge (XXG)

:Knowledge (XXG) Signpost/2015-11-25/Op-ed - Knowledge (XXG)

of the content, but also of the vocabulary, of the properties of different items, and of the taxonomies used to classify the information. We are deciding how to organise existing information about the world, and we are doing it in an open, participatory manner, as an example of the potential of technology. We know that human knowledge evolves cumulatively, and that Western culture is essentially inherited. Our reality is determined, in a sense, through the technological, social, political, and philosophical advances of those who came before us. This means that today’s generations don’t have to discover electricity all over again, for example. We enjoy the fruits of the efforts of our ancestors. But the Internet, for the first time, allows us to be involved in a phenomenon that will mark human history: we are defining and generating a new information ecosystem that will become the foundation for a possible cognitive revolution. And we are lucky to be able to participate, question, and improve it as it evolves. Together, we can participate in a historic project on a par with humanity’s greatest advances. We can create a new
our fingertips. To ensure that the sum of all this knowledge reaches all human beings in their own language, free of charge, the Wikimedia Foundation runs many projects, with one of the most successful being Knowledge (XXG). The English version of Knowledge (XXG) reached five million entries in October 2015. But this version is culturally biased, with an over-representation of Western culture. In fact, it only includes 30% of the items entered in the other 287 languages that form part of the Knowledge (XXG) project, which now has a total of more than 34 million articles. Many of the articles that refer to a particular culture only exist in the language of that culture, as can be seen just by looking at the maps of geolocated items. There is a lot of work to be done: it is estimated that in order to cover all human knowledge, an encyclopaedia today should have over 100 million articles.
Looking at the four-paragraph intro, it contains tons of information, but only 3 of the claims made in the intro have references. Her founding an anarchist journal? No reference. Her being sentenced to 22 years in prison? No reference. Her date of birth? No reference. There are many more than 15 claims in the intro, but only 3 references. So the 20% of facts in Wikidata having a reference could also be interpreted as a much higher number than what Knowledge (XXG) offers. Much more than half of all claims in Knowledge (XXG) are without reference, probably much more than 90%. Now, obviously, this is no reason to say all is rosy for Wikidata, because Knowledge (XXG) is even worse - but I am questioning whether the metric, as presented here, is very valuable. --

Now more than ever, we need tools that will help us to contextualise information, to develop our own point of view, and to generate knowledge based on this information, in order to promote a society with a strong critical spirit. And we shouldn’t forget that data in itself is not objective either, even though it supposedly purports to be neutral. Data selection is a bias in itself. The decision of whether or not to analyse the gender, origin, religion, height, eye colour, political position, or nationality of a human group can condition the subsequent analysis. Codifying or failing to codify a particular item of information within a data set can both inform and disguise a particular reality. Data is useless without interpretation.

, writing a bot to do it isn't really an option; using templates could work but would be much harder to update than Wikidata's slick user interface is. Out-of-date governance and demographic information is a big problem in geographical articles and Wikidata solves that problem for us; that alone is reason enough to embrace it and welcome it with open arms. Yes, it has flaws, but let's remember it's in its infancy.
When someone views an article and sees a population figure that's 14 years out of date, it doesn't make us look good. So I say let's put the effort in to make Wikidata work for us.
number of users and be updated more quickly. This is one of the strengths of the Wikidata project, given that thousands of volunteers are constantly updating the information. As a result, any application or project based on big data can take advantage of all of this structured knowledge, and do so free of charge. All of this means that we have to reconsider the role that traditional agents of knowledge (universities, research centres, cultural institutions) want to play, and the role or the possible role of the repositories of authorities around the world, now that new tools are
anything present in Wikidata may come to be copied not just across several Wikipedias, but also by Google and multiple third-party sources taking either Google's or Wikidata's or Knowledge (XXG)'s statement on faith. This could lead to widespread contamination of sources everywhere ("citogenesis on steroids"). Insisting on strict sourcing standards is, in my opinion, absolutely vital, given the role envisaged for Wikidata. Otherwise you are not just creating intractable problems for yourselves, some months or years down the line, but also for all reusers.
Now that we know that it is possible and that everything is just a click away, we want to have the biographies of all the Hungarian writers available in a language that we understand, and we want it now. Local wiki communities around the world try to compile their own culture in their own language as best they can, but they often have limited capacity to influence the main body of the overall project. There are thousands of articles about Catalans in the
election results; I'm still finding many articles that list incorrect members of parliament or local councillors because they haven't been updated and there's no central reference of which articles contain such information. Another prime example is census data; many UK geography articles still list the population as at the 2001 census, not the (more recent) 2011 census or any of the subsequent population estimates from the Office for National Statistics.
; again this concerns a snippet of information that could easily have been accommodated in Wikidata's statement structure. (As I pointed out on Wikimedia-l, Wikidata said for five months last year that Franklin D. Roosevelt was also known as "Adolf Hitler" – too obvious to be copied by anyone, unlike the Brazilian aardvark moniker that entered multiple "reliable" sources.) Just today, there is this story on dozens of major news sites:

, given that small communities can have a greater global impact in a more efficient manner. In the medium term, all Wikidata queries will include data from all over the world, not just from the cultures or historical communities with greater power to influence. A search for “doctors who graduated before they turned 20”, for example, will not only display French and English doctors, but also doctors from Taiwan and Andorra.
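As a rough illustration of what a search like “doctors who graduated before they turned 20” has to compute, the sketch below filters a few hypothetical in-memory records for people whose graduation date falls before their twentieth birthday. Everything in it is an assumption made for this example: the `Person` class, the `nth_birthday` and `graduated_before_age` helpers, and the sample names and dates are invented and are not Wikidata content. A real search would be run against Wikidata itself rather than a local list.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Person:
    """Hypothetical record; the fields are assumptions for this sketch."""
    name: str
    country: str
    born: date
    graduated: date  # date the medical degree was awarded


def nth_birthday(born: date, n: int) -> date:
    """Date of the n-th birthday (1 March when a 29 February birthday
    falls in a non-leap year)."""
    try:
        return born.replace(year=born.year + n)
    except ValueError:
        return date(born.year + n, 3, 1)


def graduated_before_age(people: List[Person], age: int = 20) -> List[Person]:
    """Keep only the people who graduated strictly before turning `age`."""
    return [p for p in people if p.graduated < nth_birthday(p.born, age)]


# Invented sample data, for illustration only.
sample = [
    Person("Doctor A", "Taiwan", date(1950, 3, 10), date(1969, 6, 30)),
    Person("Doctor B", "Andorra", date(1960, 2, 29), date(1980, 2, 28)),
    Person("Doctor C", "France", date(1955, 1, 1), date(1980, 7, 1)),
]

for person in graduated_before_age(sample):
    print(person.name, person.country)
```

The point of the sketch is only that the answer depends on two structured dates per person being present and comparable, whatever language the underlying article was written in.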
the world. This means that when a change of government occurs, for example, simply updating the corresponding element on Wikidata will automatically update all the applications that are linked to it, be it Knowledge (XXG) or any other third-party application. It means that we do not have to constantly reinvent the wheel. This collaborative model helps to reduce the effects of the existing cultural
What needs to be referenced, and what not? Etc. Wikidata is still a young project, and it needs to find its rules. Knowledge (XXG)'s citation rules were not as developed in 2004 as they are today, and Wikidata needs the time and the opportunity to find the correct set of rules as well. And every Wikipedian is invited to help at Wikidata.
, and much, much less the English version. How can we disseminate our culture internationally if we’re still trying to compile it in our own language? How can we access information that is not written in any of the languages that we are fluent in? The defense of online multilingualism entails as many challenges as opportunities.

, to name just two of innumerable examples. Wikidata is a new step forward in the democratisation of access to information, which is why the most important thing right now is the questions we ask ourselves: what information do we want to compile? How can we contextualise it? How does this new tool affect knowledge management?
What I want to say is - the percentages you mention are hard to interpret. What would be a good number? Is it really captured in a simple number? What is the comparison coming from Knowledge (XXG)? A lot of the referencing and citation rules on Wikidata still need to mature. What is a good reference?
And why Wikidata and not some other project? Internet standards do not necessarily become accepted because of their ability to generate authority, but because of their capacity to generate traffic, or their capacity to be updated. The winner is not the best, but the one that can assemble the greatest
As was recently pointed out by another contributor in the mailing list discussion, Wikidata's role makes it all the more vital that its statements be referenced, because their content is likely to be copied. Given wikis' open structure, it is not uncommon for people to add false information. See for
Chalberg gives the birth date in the same passage (though it is on page 12, not page 13). Would I think that a birth date like that should be referenced in Wikidata? Absolutely. Similarly, most of the bibliography is verifiable, given that each of her works bar one has its own article, complete with
You are right, I was unfamiliar with that citing convention (and I like the convention a lot). Of the three claims that I mentioned two have indeed references later (the founding of the magazine and the prison sentence) and one does not (the date of birth). But many claims in the body of the article
The impact of the emergence of Knowledge (XXG) on traditional print encyclopaedias is common knowledge. What will be the impact of Wikidata? In line with the wiki philosophy, the work is done collaboratively in an asymmetric but ongoing process. We can all collaborate in the creation and maintenance
For this reason among many others, in 2012 the Wikimedia Foundation created Wikidata: a collaborative, multilingual database that aims to provide a common source for certain types of data such as dates of birth, coordinates, names, and authority records, managed collaboratively by volunteers around
today, I would argue for holding promotion back until at least the ISBN numbers for Goldman's works are included, making verification that these works actually exist a matter of a single click on the ISBN number. Again, if we were in Wikidata, I would consider the addition of a reference like that
With the introduction of the Internet, we now assume that information is just a click away. Thousands of people around the world post their creations online without expecting anything in return: guide books, manuals, photos, videos, tutorials, encyclopaedias and databases. All of it information at
I do not say that each of these has to have references. That would make it so much harder to read, and some claims are just obvious. In Wikidata there are claims like "the first name of Emma Goldman is Emma", which, I mean, does it really need a reference? Or "Living My Life was written by Emma
that he was born in 1821 and died in 1881? Maybe 1881-1821=60 years old. But born 1821-01-01, died 1881-12-31 gives 61 years old, while born 1821-12-31, died 1881-01-01 gives 59 years old. But there are countries where the birth of a child is her first anniversary. But there are lunar years. And
Data in itself is not knowledge. It is information. With the emergence of a new, very dense ecology of data that is accessible to everybody, we run the risk of trying to over-simplify the world: a description, no matter how detailed, will not necessarily make us understand something. Knowing that
Wikidata need not and should not fall into the same ditches that plagued Knowledge (XXG) during its early years, and still continue to plague it to some extent today. Instead, Wikidata would do well to take the lessons learned in Knowledge (XXG)'s early years on board, because the danger is that
Wikidata has some way to go but has the potential to be a massive help to building and maintaining Knowledge (XXG). For me, the biggest advantage is the ability to store information in one place that's referenced in many Knowledge (XXG) articles, and updated all at once. The example was given of
, which can be read and edited by both humans and machines. A lot more free information, accessible to many more people, in their own language. The structure of the Wikidata information system and the open format allow us to make complex, dynamic queries, such as: what are
as 1821–1881, this is even worse. And therefore, the question is not about what is written in the database, but about the confidence we can give to the way the data were collected to build the database. E.g. what does Wikidata say about the death of Kim Hong-do?
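The arithmetic in this comment can be made concrete. The sketch below assumes the Gregorian calendar and counts age in completed years; both are assumptions of this example, not something the comment states, and `completed_years` and `age_range` are hypothetical helper names. Under those assumptions, knowing only the years 1821 and 1881 narrows the age at death to 59 or 60 completed years, and the figure of 61 in the comment corresponds to a sixty-first year that is one day short of complete.

```python
from datetime import date
from typing import Tuple


def completed_years(born: date, died: date) -> int:
    """Age at death in completed years (Gregorian calendar assumed)."""
    years = died.year - born.year
    # Subtract one if the last birthday had not yet been reached.
    if (died.month, died.day) < (born.month, born.day):
        years -= 1
    return years


def age_range(birth_year: int, death_year: int) -> Tuple[int, int]:
    """Smallest and largest possible age when only the years are known."""
    youngest = completed_years(date(birth_year, 12, 31), date(death_year, 1, 1))
    oldest = completed_years(date(birth_year, 1, 1), date(death_year, 12, 31))
    return youngest, oldest


print(age_range(1821, 1881))  # (59, 60)
```

Other conventions the comment mentions, such as counting a newborn as already one year old or using lunar years, would widen this range further, which is the commenter's point about needing to know how the dates were recorded.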
Knowledge (XXG), the 25-year-old student and the prank that fooled Leveson: An American man wrongly named in the Leveson Report as a founder of The Independent newspaper has expressed surprise that a judge would accept without question information on Knowledge
remain without reference - her list of publications, for example. Or if you take the first paragraph of the article body, it has two references but many more claims (although it is admittedly hard to discern what exactly a reference contains).
Cultural institutions, for example, have to deal with the lack of standard, matching criteria used to document artworks in their catalogues, such as: dimensions with frame, without frame, with or without
article became a featured article in 2007, nearly 8 years ago. Quite possibly, it needs some work to make it conform to present-day standards. The birth date certainly should be referenced. Arguably, it
Emma Goldman was born on June 27, 1869. Her father used violence to punish his children, beating them when they disobeyed him. He used a whip only on Emma, the most rebellious of them.<ref: -->
that can serve as an open, transparent key to unlock the secrets of today’s world, and perhaps as a documentary source for future generations or civilisations. Let us take responsibility for it.
A key fact here is that at present, only about 20% of Wikidata content is referenced to a reliable source. About half is unreferenced, and about a third is only referenced to a Knowledge (XXG).
; it is longstanding practice to use citations sparingly in the lead paragraphs. The lead is intended to summarise the article content; it should not contain anything that isn't covered,

, which involved the invention of an author and of books that had never existed. Or see the invention of a film director who had never lived, except on the pages of Knowledge (XXG):
is set to become the main open data repository worldwide. The eagerly awaited promise of linked open data seems to have finally arrived: a multilingual, totally open database in the
It is probably not the best sentence. The post was originally written for a non-wiki audience and was an attempt at storifying the message. I do agree with you.--
, descriptions in text format, number fields… institutions have to bring order to their own data at home before opening up to the world. Being open means
faster than the former ever did. Otherwise +1 to your points, especially "Insisting on strict sourcing standards is, in my opinion, absolutely vital,
My guess is, it's an assertion that other cultures are new or have a new essence. This would be appropriately, pretentiously, silly.
There is an ongoing discussion about Wikidata's quality issues and their wider implications on the Wikimedia-l mailing list:
from all over the world. All of these projects run on the Wikidata engine, which is becoming a new international standard.
(That is a really, really good article, worth reading for its writing as well as the story it's telling.) Or see the
allows users to make thousands of small contributions while playing, even from a mobile phone while waiting for a bus.
Working through articles to find such information and update it is time-consuming and mind-numbingly dull. Because
, whose content could conceivably have been included as a statement in Wikidata. See the Brazilian aardvark story,
was born in 1821 and died in 1881 and that he was an existentialist is not the same as understanding Dostoyevsky or
"For this reason among many others, in 2012 the Wikimedia Foundation created Wikidata": I don't really want to be
This project opens up a whole new world of possibilities, for collaboration and for using the data: the
To be fair, regarding the 20% number: let's take a random Featured Article in Knowledge (XXG). Such as
Also, Andreas, another difference between Knowledge (XXG) and Wikidata is that the latter is growing
Wikidata: the new Rosetta Stone: Wikidata is set to become the main open data repository worldwide.
from around the world are uploading their research databases, and the cultural sector is building
, is insert the reference for Goldman's birth date at the end of that sentence naming it. ;)

, but this is false. We either write Wikimedia Deutschland or "the Wikimedia movement".

, that's based on a lack of familiarity with citing conventions for article leads. See

, is behind one of the groundbreaking projects in this field, which aims to create an

, in the article body. That is where the sources for those statements are found.

“doctors who graduated before they turned 20” – What would this query look like?--
verifiable from the reference present at the end of these three sentences:
What do you mean by "and that Western culture is essentially inherited"?
what remains is something between 58 and 63 years old. When someone is
This 'legend' changed a Knowledge (XXG) page to sneak backstage at gig
offers a new way of visualising history through timelines. Meanwhile,
the number of ministers who are themselves the children of ministers
exploring the links between Wikidata and Google's Knowledge Graph:
With more than fifteen million items compiled in the space of just
(i.e. the ISBN number of the book's first edition) essential.

"Why Does Google Say Jerusalem Is the Capital of Israel?"
and is reprinted here with the permission of the author.
http://www.gossamer-threads.com/lists/foundation/654001
Alfred Wegener Institute for Polar and Marine Research
largest cities in the world with a female lord mayor
Goldman". Again, does this really need a reference?
allows people to share their favourite books, and
Map of geolocated items on Wikidata, October 2015.
bibliographical data. If the biography were at
For wider context, see yesterday's article in
Data is not knowledge. Data is not objective.
Centre de Cultura Contemporània de Barcelona
Archive of marine geological samples of the
. Many institutions are already adapting:
we prefer to write information in prose
Data is beautiful. Data is information.
given the role envisaged for Wikidata
Wikimedia-l discussion, Slate article
, but there are not so many in the
openly collaborating with Wikidata
Catalan version of Knowledge (XXG)
The greatest movie that never was
Does this make any more sense? --
One thing I will now go and do,
and creating a new centrality.
Chalberg, p. 13.</ref: -->
17:47, 30 November 2015 (UTC)
22:40, 30 November 2015 (UTC)
16:45, 30 November 2015 (UTC)
15:41, 30 November 2015 (UTC)
08:25, 30 November 2015 (UTC)
11:26, 4 December 2015 (UTC)
02:31, 4 December 2015 (UTC)
19:25, 3 December 2015 (UTC)
17:46, 3 December 2015 (UTC)
08:53, 3 December 2015 (UTC)
22:25, 2 December 2015 (UTC)
15:54, 1 December 2015 (UTC)
09:56, 7 December 2015 (UTC)
on the CCCB Lab blog of the
has also started using it
, in collaboration with
a database of paintings
told in the New Yorker
How old was someone,
." (emphasis mine).
491:from this article's 310:Barcelona University 302:Museum of Modern Art 204:100 million articles 775:Amelia Bedelia hoax 765:Or see the case of 376:originally appeared 272:mixing and matching 482:Discuss this story 457:Arbitration report 406: 339: 227: 168: 45:← Back to Contents 40: 879: 809: 704: 655: 506:purging the cache 462:Technology report 322:Catalan Modernism 292:managers such as 50:View Latest Issue 954: 930: 889:Want the latest 875: 804: 800: 749: 699: 695: 650: 646: 509: 507: 501: 480: 447:Featured content 424: 416: 414:25 November 2015 409: 392: 320:of all works of 314:Amical Wikimedia 308:. In Catalonia, 306:in its catalogue 286:interoperability 214:, much less the 157: 143: 142: 133: 132: 123: 122: 113: 112: 103: 102: 93: 92: 62: 60: 58: 57:25 November 2015 962: 961: 957: 956: 955: 953: 952: 951: 937: 936: 935: 934: 933: 932: 931: 926: 924: 919: 914: 909: 904: 897: 886: 885: 844: 802: 724:denny vrandečić 697: 668:denny vrandečić 648: 622: 511: 503: 496: 485: 484: 478:+ Add a comment 476: 472: 471: 470: 442:Recent research 417: 412: 410: 407: 396: 395: 390: 368: 346: 345: 331: 330: 236: 231: 228: 212:Spanish version 176: 169: 159: 158: 152: 151: 150: 149: 140: 130: 120: 110: 100: 90: 84: 81: 70: 65: 63: 53: 52: 47: 41: 31: 26: 25: 24: 12: 11: 5: 960: 958: 950: 949: 939: 938: 925: 920: 915: 910: 905: 900: 899: 898: 888: 887: 884: 843: 840: 839: 838: 837: 836: 835: 834: 833: 832: 831: 830: 790: 786: 757: 720: 716: 712: 621: 618: 617: 616: 615: 614: 581: 580: 579: 578: 486: 483: 475: 474: 473: 469: 464: 459: 454: 452:Traffic report 449: 444: 439: 434: 432:News and notes 429: 423: 411: 399: 398: 397: 388: 387: 386: 384: 370: 356:existentialism 347: 340: 332: 329: 326: 290:authority file 235: 232: 229: 220: 170: 161: 160: 148: 147: 137: 127: 117: 107: 97: 86: 85: 82: 76: 75: 74: 73: 68: 67: 66: 64: 61: 48: 43: 42: 33: 32: 27: 15: 14: 13: 10: 9: 6: 4: 3: 2: 959: 948: 945: 944: 942: 929: 923: 918: 913: 908: 903: 895: 892: 883: 880: 878: 873: 872: 869: 866: 
863: 860: 857: 852: 848: 841: 829: 826: 822: 818: 814: 813: 812: 808: 805: 799: 795: 791: 787: 784: 780: 776: 772: 768: 767:Hannibal Fogg 764: 758: 754: 744: 739: 735: 734: 733: 729: 725: 721: 717: 713: 709: 708: 707: 703: 700: 694: 690: 686: 682: 679: 678: 677: 673: 669: 665: 661: 660: 659: 658: 654: 651: 645: 642: 638: 633: 632: 628: 627: 619: 613: 609: 605: 601: 600: 599: 595: 591: 587: 583: 582: 577: 573: 569: 568:Jim.henderson 565: 564: 563: 559: 555: 551: 550: 549: 548: 544: 540: 535: 534: 530: 526: 521: 516: 508: 499: 494: 490: 479: 468: 465: 463: 460: 458: 455: 453: 450: 448: 445: 443: 440: 438: 435: 433: 430: 428: 425: 421: 415: 408:In this issue 403: 394: 385: 383: 381: 377: 374:This article 371: 367: 365: 364:Rosetta Stone 359: 357: 353: 344: 343:Rosetta Stone 336: 327: 325: 323: 319: 318:open database 315: 311: 307: 303: 299: 295: 291: 287: 283: 282: 281:passe-partout 275: 273: 267: 265: 261: 257: 253: 249: 248:Wikidata game 244: 242: 233: 224: 219: 217: 213: 209: 205: 199: 197: 193: 189: 188:public domain 185: 181: 174: 165: 156: 146: 138: 136: 128: 126: 118: 116: 108: 106: 98: 96: 88: 87: 79: 59: 51: 46: 37: 23: 19: 890: 876: 870: 867: 864: 861: 858: 855: 845: 842:Mass updates 820: 816: 742: 738:Emma Goldman 688: 664:Emma Goldman 636: 634: 629: 623: 585: 539:Kopiersperre 536: 519: 514: 512: 437:In the media 426: 420:all comments 389: 373: 372: 369: 360: 348: 279: 276: 268: 245: 237: 200: 177: 928:Suggestions 896:each month? 689:and sourced 685:WP:CITELEAD 489:transcluded 352:Dostoyevsky 256:Histropedia 180:three years 300:, and the 260:scientists 252:Inventaire 83:Share this 78:Contribute 22:2015-11-25 922:Subscribe 604:Kippelboy 493:talk page 393:"Op-ed" → 241:diglossia 155:Kippelboy 941:Category 917:Newsroom 912:Archives 891:Signpost 760:example 586:that guy 520:reported 184:Wikidata 125:LinkedIn 105:Facebook 20:‎ | 798:Andreas 693:Andreas 644:Andreas 515:knowing 115:Twitter 763:(XXG). 
753:WP:FAC 590:Aubrey 554:4nn1l2 216:French 135:Reddit 95:E-mail 907:About 794:Denny 681:Denny 637:Slate 525:Pldx1 427:Op-ed 69:Op-ed 16:< 902:Home 877:TALK 817:much 736:The 728:talk 672:talk 608:talk 594:talk 572:talk 558:talk 543:talk 529:talk 467:Blog 391:Next 341:The 296:are 294:VIAF 145:Digg 807:466 702:466 653:466 194:or 153:By 80:— 943:: 865:ge 825:Ed 803:JN 743:is 730:) 698:JN 674:) 649:JN 610:) 596:) 574:) 560:) 545:) 531:) 324:. 182:, 871:s 868:r 862:g 859:a 856:W 785:. 726:( 670:( 606:( 592:( 570:( 556:( 541:( 527:( 510:. 500:. 422:) 418:( 175:.

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.
