of the content, but also of the vocabulary, of the properties of different items, and of the taxonomies used to classify the information. We are deciding how to organise existing information about the world, and we are doing it in an open, participatory manner, as an example of the potential of technology. We know that human knowledge evolves cumulatively, and that
Western culture is essentially inherited. Our reality is determined, in a sense, through the technological, social, political, and philosophical advances of those who came before us. This means that today's generations don't have to discover electricity all over again, for example. We enjoy the fruits of the efforts of our ancestors. But the Internet, for the first time, allows us to be involved in a phenomenon that will mark human history: we are defining and generating a new information ecosystem that will become the foundation for a possible cognitive revolution. And we are lucky to be able to participate, question, and improve it as it evolves. Together, we can participate in a historic project on a par with humanity's greatest advances. We can create a new
our fingertips. To ensure that the sum of all this knowledge reaches all human beings in their own language, free of charge, the
Wikimedia Foundation runs many projects, free of charge, with one of the most successful being Knowledge (XXG). The English version of Knowledge (XXG) reached five million entries in October 2015. But this version is culturally biased, with an over-representation of Western culture. In fact, it only includes 30% of the items entered in the other 287 languages that form part of the Knowledge (XXG) project, which now has a total of more than 34 million articles. Many of the articles that refer to a particular culture only exist in the language of that culture, as can be seen just by looking at the maps of geolocated items. There is a lot of work to be done: it is estimated that in order to cover all human knowledge, an encyclopaedia today should have over
Looking at the four-paragraph intro, it contains tons of information, but only 3 of the claims made in the intro have references. Her founding an anarchist journal? No reference. Her being sentenced to 22 years in prison? No reference. Her date of birth? No reference. There are many more than 15 claims in the intro, but only 3 references. So the 20% of facts in Wikidata having a reference could also be interpreted as a much higher number than what Knowledge (XXG) offers. Far more than half of all claims in Knowledge (XXG) are without reference, probably much more than 90%. Now, obviously, this is no reason to say all is rosy for Wikidata, because Knowledge (XXG) is even worse - but I am questioning whether the metric, as presented here, is very valuable. --
Now more than ever, we need tools that will help us to contextualise information, to develop our own point of view, and to generate knowledge based on this information, in order to promote a society with a strong critical spirit. And we shouldn't forget that data in itself is not objective either, even though it purports to be neutral. Data selection is a bias in itself. The decision of whether or not to analyse the gender, origin, religion, height, eye colour, political position, or nationality of a human group can condition the subsequent analysis. Codifying or failing to codify a particular item of information within a data set can both inform and disguise a particular reality. Data is useless without interpretation.
writing a bot to do it isn't really an option; using templates could work but would be much harder to update than Wikidata's slick user interface is. Out-of-date governance and demographic information is a big problem in geographical articles and Wikidata solves that problem for us; that alone is reason enough to embrace it and welcome it with open arms. Yes, it has flaws, but let's remember it's in its infancy. When someone views an article and sees a population figure that's 14 years out of date, it doesn't make us look good. So I say let's put the effort in to make Wikidata work for us.
number of users and be updated more quickly. This is one of the strengths of the
Wikidata project, given that thousands of volunteers are constantly updating the information. As a result, any application or project based on big data can take advantage of all of this structured knowledge, and do so free of charge. All of this means that we have to reconsider the role that traditional agents of knowledge (universities, research centres, cultural institutions) want to play, and the role or the possible role of the repositories of authorities around the world, now that new tools are
anything present in
Wikidata may come to be copied not just across several Wikipedias, but also by Google and multiple third-party sources taking either Google's or Wikidata's or Knowledge (XXG)'s statement on faith. This could lead to widespread contamination of sources everywhere ("citogenesis on steroids"). Insisting on strict sourcing standards is, in my opinion, absolutely vital, given the role envisaged for Wikidata. Otherwise you are not just creating intractable problems for yourselves, some months or years down the line, but also for all reusers.
Now that we know that it is possible and that everything is just a click away, we want to have the biographies of all the Hungarian writers available in a language that we understand, and we want it now. Local wiki communities around the world try to compile their own culture in their own language as best they can, but they often have limited capacity to influence the main body of the overall project. There are thousands of articles about Catalans in the
election results; I'm still finding many articles that list incorrect members of parliament or local councillors because they haven't been updated and there's no central reference of which articles contain such information. Another prime example is census data; many UK geography articles still list the population as at the 2001 census, not the (more recent) 2011 census or any of the subsequent population estimates from the Office for
National Statistics.
again this concerns a snippet of information that could easily have been accommodated in Wikidata's statement structure. (As I pointed out on Wikimedia-l, Wikidata said for five months last year that Franklin D. Roosevelt was also known as "Adolf Hitler" - too obvious to be copied by anyone, unlike the Brazilian aardvark moniker that entered multiple "reliable" sources.) Just today, there is this story on dozens of major news sites:
given that small communities can have a greater global impact in a more efficient manner. In the medium term, all Wikidata queries will include data from all over the world, not just from the cultures or historical communities with greater power to influence. A search for "doctors who graduated before they turned 20", for example, will not only display French and English doctors, but also doctors from Taiwan and Andorra.
the world. This means that when a change of government occurs, for example, simply updating the corresponding element on
Wikidata will automatically update all the applications that are linked to it, be it Knowledge (XXG) or any other third-party application. It means that we do not have to constantly reinvent the wheel. This collaborative model helps to reduce the effects of the existing cultural
What needs to be referenced, and what not? Etc. Wikidata is still a young project, and it needs to find its rules. Knowledge (XXG)'s citation rules were not as developed in 2004 as they are today, and
Wikidata needs the time and the opportunity to find the correct set of rules as well. And every Wikipedian is invited to help at Wikidata.
and much, much less the English version. How can we disseminate our culture internationally if we're still trying to compile it in our own language? How can we access information that is not written in any of the languages that we are fluent in? The defense of online multilingualism entails as many challenges as opportunities.
to name just two of innumerable examples. Wikidata is a new step forward in the democratisation of access to information, which is why the most important thing right now is the questions we ask ourselves: what information do we want to compile? How can we contextualise it? How does this new tool affect knowledge management?
What I want to say is - the percentages you mention are hard to interpret. What would be a good number? Is it really captured in a simple number? What is the comparison coming from
Knowledge (XXG)? A lot of the referencing and citation rules on Wikidata still need to mature. What is a good reference?
And why
Wikidata and not some other project? Internet standards do not necessarily become accepted because of their ability to generate authority, but because of their capacity to generate traffic, or their capacity to be updated. The winner is not the best, but the one that can assemble the greatest
As was recently pointed out by another contributor in the mailing list discussion, Wikidata's role makes it all the more vital that its statements be referenced, because their content is likely to be copied. Given wikis' open structure, it is not uncommon for people to add false information. See for
Chalberg gives the birth date in the same passage (though it is on page 12, not page 13). Would I think that a birth date like that should be referenced in
Wikidata? Absolutely. Similarly, most of the bibliography is verifiable, given that each of her works bar one has its own article, complete with
You are right, I was unfamiliar with that citing convention (and I like the convention a lot). Of the three claims that I mentioned two have indeed references later (the founding of the magazine and the prison sentence) and one does not (the date of birth). But many claims in the body of the article
The impact of the emergence of Knowledge (XXG) on traditional print encyclopaedias is common knowledge. What will be the impact of Wikidata? In line with the wiki philosophy, the work is done collaboratively in an asymmetric but ongoing process. We can all collaborate in the creation and maintenance
For this reason among many others, in 2012 the Wikimedia Foundation created Wikidata: a collaborative, multilingual database that aims to provide a common source for certain types of data such as dates of birth, coordinates, names, and authority records, managed collaboratively by volunteers around
today, I would argue for holding promotion back until at least the ISBN numbers for Goldman's works are included, making verification that these works actually exist a matter of a single click on the ISBN number. Again, if we were in Wikidata, I would consider the addition of a reference like that
With the introduction of the Internet, we now assume that information is just a click away. Thousands of people around the world post their creations online without expecting anything in return: guide books, manuals, photos, videos, tutorials, encyclopaedias and databases. All of it information at
I do not say that each of these has to have references. That would make it so much harder to read, and some claims are just obvious. In Wikidata there are claims like "the first name of Emma Goldman is Emma", which, I mean, does it really need a reference? Or "Living my Life was written by Emma
that he was born in 1821 and died in 1881? Maybe 1881-1821=60 years old. But born 1821-01-01, died 1881-12-31 gives 61 years old, while born 1821-12-31, died 1881-01-01 gives 59 years old. But there are countries where the birth of a child is her first anniversary. But there are lunar years. And
Data in itself is not knowledge. It is information. With the emergence of a new, very dense ecology of data that is accessible to everybody, we run the risk of trying to over-simplify the world: a description, no matter how detailed, will not necessarily make us understand something. Knowing that
Wikidata need not and should not fall into the same ditches that plagued Knowledge (XXG) during its early years, and still continue to plague it to some extent today. Instead, Wikidata would do well to take the lessons learned in Knowledge (XXG)'s early years on board, because the danger is that
Wikidata has some way to go but has the potential to be a massive help to building and maintaining Knowledge (XXG). For me, the biggest advantage is the ability to store information in one place that's referenced in many Knowledge (XXG) articles, and updated all at once. The example was given of
which can be read and edited by both humans and machines. A lot more free information, accessible to many more people, in their own language. The structure of the Wikidata information system and the open format allow us to make complex, dynamic queries, such as: what are
as 1821--1881, this is even worse. And therefore, the question is not about what is written in the database, but about the confidence we can give to the way the data were collected to build the database. For example, what does Wikidata say about the death of Kim Hong-do?
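The arithmetic in this comment can be checked mechanically. The sketch below is a minimal illustration in Python, assuming the Gregorian calendar and counting age in completed years; the helper names `completed_years` and `age_bounds` are made up for this example and are not part of Wikidata or any library. Under those assumptions, year-only dates of 1821 and 1881 bound the age at death between 59 and 60 completed years (the oldest case is one day short of a 61st birthday). Other calendars or age-counting conventions would widen the range further, which is the commenter's point.

```python
from datetime import date
from typing import Tuple


def completed_years(born: date, died: date) -> int:
    """Age at death in completed years (Gregorian calendar assumed)."""
    years = died.year - born.year
    # Subtract one if the last birthday had not yet been reached.
    if (died.month, died.day) < (born.month, born.day):
        years -= 1
    return years


def age_bounds(birth_year: int, death_year: int) -> Tuple[int, int]:
    """Smallest and largest possible age when only the two years are known."""
    # Youngest case: born on the last day of the year, died on the first.
    youngest = completed_years(date(birth_year, 12, 31), date(death_year, 1, 1))
    # Oldest case: born on the first day of the year, died on the last.
    oldest = completed_years(date(birth_year, 1, 1), date(death_year, 12, 31))
    return youngest, oldest


if __name__ == "__main__":
    print(age_bounds(1821, 1881))
```

A database that stores only years cannot say which end of that range applies, so the precision recorded with each date matters as much as the date itself.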
Knowledge (XXG), the 25-year-old student and the prank that fooled Leveson: An American man wrongly named in the Leveson Report as a founder of The Independent newspaper has expressed surprise that a judge would accept without question information on Knowledge
remain without reference - her list of publications, for example. Or if you take the first paragraph of the article body, it has two references but many more claims (although it is admittedly hard to discern what exactly a reference contains).
Cultural institutions, for example, have to deal with the challenge of the lack of standard matching criteria used to document artworks in their catalogues, such as: dimensions with frame, without frame, with or without
article became a featured article in 2007, nearly 8 years ago. Quite possibly, it needs some work to make it conform to present-day standards. The birth date certainly should be referenced. Arguably, it
Emma Goldman was born on June 27, 1869. Her father used violence to punish his children, beating them when they disobeyed him. He used a whip only on Emma, the most rebellious of them.<ref: -->
that can serve as an open, transparent key to unlock the secrets of today's world, and perhaps as a documentary source for future generations or civilisations. Let us take responsibility for it.
A key fact here is that at present, only about 20% of Wikidata content is referenced to a reliable source. About half is unreferenced, and about a third is only referenced to a Knowledge (XXG).
it is longstanding practice to use citations sparingly in the lead paragraphs. The lead is intended to summarise the article content; it should not contain anything that isn't covered,
which involved the invention of an author and of books that had never existed. Or see the invention of a film director who had never lived, except on the pages of Knowledge (XXG):
is set to become the main open data repository worldwide. The eagerly awaited promise of linked open data seems to have finally arrived: a multilingual, totally open database in the
It is probably not the best sentence. The post was originally written for a non-wiki audience and was an attempt at storifying the message. I do agree with you.--
descriptions in text format, number fields... institutions have to bring order to their own data at home before opening up to the world. Being open means
faster than the former ever did. Otherwise +1 to your points, especially "Insisting on strict sourcing standards is, in my opinion, absolutely vital,
My guess is, it's an assertion that other cultures are new or have a new essence. This would be appropriately, pretentiously, silly.
There is an ongoing discussion about Wikidata's quality issues and their wider implications on the Wikimedia-l mailing list:
from all over the world. All of these projects run on the Wikidata engine, which is becoming a new international standard.
(That is a really, really good article, worth reading for its writing as well as the story it's telling.) Or see the
allows users to make thousands of small contributions while playing, even from a mobile phone while waiting for a bus.
Working through articles to find such information and update them is time-consuming and mind-numbingly dull. Because
whose content could conceivably have been included as a statement in Wikidata. See the Brazilian aardvark story,
was born in 1821 and died in 1881 and that he was an existentialist is not the same as understanding Dostoyevsky or
"For this reason among many others, in 2012 the Wikimedia Foundation created Wikidata": I don't really want to be
This project opens up a whole new world of possibilities, for collaboration and for using the data: the
To be fair, regarding the 20% number: let's take a random Featured Article in Knowledge (XXG). Such as
Also, Andreas, another difference between Knowledge (XXG) and Wikidata is that the latter is growing
Wikidata: the new Rosetta Stone: Wikidata is set to become the main open data repository worldwide.
from around the world are uploading their research databases, and the cultural sector is building
is insert the reference for Goldman's birth date at the end of that sentence naming it. ;)
but this is false. We either write Wikimedia Deutschland or "the Wikimedia movement".
that's based on a lack of familiarity with citing conventions for article leads. See
is behind one of the groundbreaking projects in this field, which aims to create an
in the article body. That is where the sources for those statements are found.
"doctors who graduated before they turned 20": what would this query look like?--
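One possible shape for such a query, offered as a rough sketch rather than a tested answer: the Python snippet below only builds SPARQL text that could be pasted into the Wikidata Query Service, and it does not contact any server. The identifiers are assumptions to be checked against Wikidata before use (P106 "occupation", Q39631 "physician", P569 "date of birth", P69 "educated at", and P582 "end time" used as a qualifier). Treating the end of an "educated at" statement as the graduation date is a modelling guess. The function name `build_young_graduates_query` is hypothetical, and the age test compares calendar years only, so it is approximate.

```python
# Sketch only: every property and item ID below is an assumption to verify.
QUERY_TEMPLATE = """
SELECT ?person ?personLabel ?birth ?graduated WHERE {
  ?person wdt:P106 wd:Q39631 ;     # assumed: occupation = physician
          wdt:P569 ?birth ;        # assumed: date of birth
          p:P69 ?education .       # assumed: "educated at" statement
  ?education pq:P582 ?graduated .  # assumed: end time qualifier
  FILTER(YEAR(?graduated) - YEAR(?birth) < %(max_age)d)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT %(limit)d
"""


def build_young_graduates_query(max_age: int = 20, limit: int = 100) -> str:
    """Return SPARQL text for people whose studies ended before max_age."""
    return (QUERY_TEMPLATE % {"max_age": max_age, "limit": limit}).strip()


if __name__ == "__main__":
    print(build_young_graduates_query())
```

Results would only be as complete as the underlying statements: a doctor with no recorded end date for their studies would simply not appear, whatever their country.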
verifiable from the reference present at the end of these three sentences:
What do you mean by "and that Western culture is essentially inherited"?
what remains is something between 58 and 63 years old. When someone is
This 'legend' changed a Knowledge (XXG) page to sneak backstage at gig
offers a new way of visualising history through timelines. Meanwhile,
the number of ministers who are themselves the children of ministers
exploring the links between Wikidata and Google's Knowledge Graph:
With more than fifteen million items compiled in the space of just
(i.e. the ISBN number of the book's first edition) essential.
"Why Does Google Say Jerusalem Is the Capital of Israel?"
and is reprinted here with the permission of the author.
http://www.gossamer-threads.com/lists/foundation/654001
Alfred Wegener Institute for Polar and Marine Research
largest cities in the world with a female lord mayor
Goldman". Again, does this really need a reference?
allows people to share their favourite books, and
Map of geolocated items on Wikidata, October 2015.
bibliographical data. If the biography were at
For wider context, see yesterday's article in
Data is not knowledge. Data is not objective.
Centre de Cultura Contemporània de Barcelona
Archive of marine geological samples of the
Many institutions are already adapting:
we prefer to write information in prose
Data is beautiful. Data is information.
given the role envisaged for Wikidata
Wikimedia-l discussion, Slate article
but there are not so many in the
openly collaborating with Wikidata
Catalan version of Knowledge (XXG)
The greatest movie that never was
Does this make any more sense? --
One thing I will now go and do,
and creating a new centrality.
Wikidata: the new Rosetta Stone
748:Chalberg, p. 13.</ref: -->
17:47, 30 November 2015 (UTC)
22:40, 30 November 2015 (UTC)
16:45, 30 November 2015 (UTC)
15:41, 30 November 2015 (UTC)
08:25, 30 November 2015 (UTC)
11:26, 4 December 2015 (UTC)
02:31, 4 December 2015 (UTC)
19:25, 3 December 2015 (UTC)
17:46, 3 December 2015 (UTC)
08:53, 3 December 2015 (UTC)
22:25, 2 December 2015 (UTC)
15:54, 1 December 2015 (UTC)
09:56, 7 December 2015 (UTC)
on the CCCB Lab blog of the
has also started using it
in collaboration with
a database of paintings
told in the New Yorker
How old was someone,
." (emphasis mine).
Barcelona University
Museum of Modern Art
100 million articles
Amelia Bedelia hoax
Or see the case of
originally appeared
mixing and matching
Discuss this story
Catalan Modernism
managers such as
of all works of
Amical Wikimedia
In Catalonia,
in its catalogue
interoperability
much less the
25 November 2015
denny vrandečić
denny vrandečić
648:
622:
511:
503:
496:
485:
484:
478:+ Add a comment
476:
472:
471:
470:
442:Recent research
417:
412:
410:
407:
396:
395:
390:
368:
346:
345:
331:
330:
236:
231:
228:
Spanish version
existentialism
authority file
Hannibal Fogg
Jim.henderson
This article
Rosetta Stone
Rosetta Stone
open database
passe-partout
Wikidata game
public domain
Mass updates
Emma Goldman
Emma Goldman
Kopiersperre
and sourced
WP:CITELEAD
Dostoyevsky
Histropedia
three years
and the
scientists
Inventaire
Kippelboy
diglossia
Kippelboy
example
that guy
reported
Wikidata
Andreas
Andreas
Andreas
knowing
(XXG).
WP:FAC
Aubrey
4nn1l2
French
Denny
Denny
Slate
Pldx1
Op-ed
much
The
The
are
VIAF
or
By
is
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.