talk:Village pump (proposals)/FritzpollBot - Knowledge

The following discussion is archived. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
No consensus - proposal amended by Fritzpoll. This is an archive of the past discussion, please direct any new comments regarding the amended proposal here. xenocidic ( talk ¿ listen ) 17:32, 2 June 2008 (UTC)

Operator response to some issues

Hi there - I actually operate the bot which was set up under the request of User:Blofeld of SPECTRE and User:Editorofthewiki. I've only glanced through the text below, and fully support the community involvement in this process. Let me answer some of the points raised above:

  • The example being cited is now out-of-date. Further to discussions at the BAG proposal, the code has been modified to sort out some stylistic issues and to adjust the references. Looking at the new Afghan articles is probably a better guide.
  • The bot is stupid! Computer programs are not inherently intelligent, and on a task like this, cannot be left to their own devices. Further to that, my proposal, as implemented and approved has the following steps:
  1. Bot extracts data and uploads lists to the subpages of Knowledge:WikiProject_Missing_encyclopedic_articles/Places which lists all the articles to be created to be manually checked or dabbed by editors before the bot is run.
  2. Data is then checked by human editors. This includes checking for disambiguation requirements, any spelling errors not consistent with Knowledge's existing content, and any other issues.
  3. Once the check is complete, I am notified. The bot then scans the list, creating articles for any article that is red-linked - it will not do anything if the page already exists, beyond writing out a log to me of all places that it did not create.
  4. Please note that wikipedia will not be flooded with 2 million articles overnight. Articles will be added several hundred or thousand at a time in a gradual and coordinated approach, rather than the 2 million at once which some people seem to be thinking of. The relevant country WikiProjects will be notified and will help ensure it is done as well as possible, and will also try to expand many of the initial stubs where possible.
  • As part of this process, I have encouraged the inclusion of Wikiprojects in these areas - as I upload, and common errors are checked, I hope to contact all interested parties. This will allow consistency within a project, more specialised templates, categories, etc.
  • Inclusion of others will also make it more likely that we can find extra data, such as census data, etc. that can be built into the article from day one. I am happy to include this where available, provided it is in a readable, accessible and publicly available format.
  • Each area has its own idiosyncrasies in terms of administrative division, etc. so the bot is already being manually adjusted to cope with some of these. For the technically minded out there, I run the code in debug, edit & continue mode to allow intervention with any exceptions that arise, and to monitor the early stages of creation to ensure that things are being done properly. Takes longer, but worthwhile.
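The create-only-if-missing behaviour described in steps 1 to 3 above can be sketched roughly as follows. This is a minimal illustration, not the bot's actual code: the function name `run_checked_list`, the `CreationRun` record, the in-memory `existing_pages` dictionary standing in for the wiki, and the `make_stub` callback are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CreationRun:
    """Outcome of one pass over a human-checked list (hypothetical record)."""
    created: list = field(default_factory=list)
    skipped: list = field(default_factory=list)


def run_checked_list(checked_titles, existing_pages, make_stub):
    """Create a stub for every title that is still a red link.

    checked_titles: titles already reviewed by human editors.
    existing_pages: dict of title -> page text, standing in for the wiki.
    make_stub:      callable that builds stub text for a title.
    Titles that already exist are left untouched and only logged.
    """
    run = CreationRun()
    for title in checked_titles:
        if title in existing_pages:
            # Page exists: do nothing beyond recording it for the operator.
            run.skipped.append(title)
            continue
        existing_pages[title] = make_stub(title)
        run.created.append(title)
    return run


if __name__ == "__main__":
    wiki = {"Langar, Badakhshan": "existing article text"}
    result = run_checked_list(
        ["Langar, Badakhshan", "Exampleville"],  # "Exampleville" is made up
        wiki,
        lambda title: f"{title} is a village.",
    )
    print("created:", result.created)
    print("skipped:", result.skipped)
```

The point of the design is that an existing page is never overwritten; the skipped list plays the role of the log sent back to the operator.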

In short, the bot is only a tool for extraction and article creation. It still requires human input, but in a format where human interaction can be very efficient. The timescale is relatively short, but the articles are only created country-by-country, following human eyes (the community here, not just me) confirming the validity of the data. I hope this addresses some concerns about the bot, but I am of course happy (and in my opinion obligated) to respond to any other comments you may have. Best wishes Fritzpoll (talk) 10:59, 1 June 2008 (UTC)

Can I also say that it might have been better to wait for me before straw-polling, as there appear to be many misconceptions as to what this is doing? :) Fritzpoll (talk) 11:05, 1 June 2008 (UTC)
An up-to-date example of the bot output can be found at Langar, Badakhshan. Geometry guy 15:45, 1 June 2008 (UTC)

Village pump (proposals)/FritzpollBot

User:FritzpollBot was recently approved at Knowledge:Bots/Requests for approval/FritzpollBot to create stub articles for most or all of the documented villages and towns in the world in the style of User:Fritzpoll/GeoBot/Example. The BRFA means that it is approved technically, Tim Starling has confirmed that there will be no adverse technical effects from such a bot, but I don't believe that this is a non-controversial task, so I'm bringing this here for wider review by the community. The following are some pros and cons of the bot, though not an exclusive list:

Pros

  • Articles about verifiable towns are generally considered inherently notable
  • This will greatly increase Knowledge's coverage of geographical places
  • The articles will be very standardized, all will have coordinates and an infobox
  • A new user wishing to write about one of these places won't have to figure out how to start a new article (the infoboxes for places can be complicated)

Cons

  • Many people would rather not have stub articles, this would create close to 2 million new stubs, many of which may not be able to be expanded much more than their original size: Adding new articles like this could be seen as "inflating our article count"
  • There could be adverse effects with pages like Special:Random and the search function
  • The "inherent notability" for geographical places may not apply for very obscure villages.

Options

  1. Implement bot as written, create ~2 million new village articles
  2. Modify bot to only create articles on large villages, X thousands new village articles (this is being done anyway: 2 million is far from covering every place, and google only recognizes the main towns and villages)
  3. Modify bot to create lists of all villages, X thousands new list articles
  4. Modify bot to create merged mini-articles for all villages on articles about townships, X thousands new and expanded township articles
  5. Do not implement bot —Preceding comment was unsigned

Another POV is that the options above are immediatist and that the long-term options are:

  1. Create 2 million new village articles in 4-12 months, beginning immediately
  2. Create tens of thousands of geographical articles in a few months after further programming
  3. Do not allow bots to create any more geographical articles for the next year

Options also exist as to the direction of the relevant notability policy (see Centralized notability discussion). JJB 20:36, 1 June 2008 (UTC)

FritzpollBot Discussion

So, should the bot go forward as planned? So far it's only created 100 pages as a trial before approval; I've asked the operator to hold off running the bot until this discussion concludes. Mr.Z-man 18:53, 31 May 2008 (UTC)

We did something like that with User:Rambot for U.S. places years ago, didn't we? By those standards, it would seem only fair to do it for the rest of the world too. *Dan T.* (talk) 18:57, 31 May 2008 (UTC)
My only comment is about the "inherent notability" question posed above. If there is evidence that the populated place, and so far that's all the bot is supposed to run, can be found in one of the sources they're using, then I think that the place's notability is probably not open to much question, considering that source has a separate listing for it. And I think that there probably could be expansion in the US as well. John Carter (talk) 19:03, 31 May 2008 (UTC)
Iirc, Rambot made 130,000 edits to create 90,000+ articles on places in the US, with populations as small as 3 people. This was back when en.wiki had maybe 100K-200K articles and very few detailed articles. So I really don't see how we can say "no" to expanding such activities to ALL countries in the world, especially when we already have an admitted content bias towards developed nations. MBisanz 19:15, 31 May 2008 (UTC)
We have an admitted content bias towards things that have sources. I have no objection to, say, an Indian census that gives us demographic data being used to make articles in the same way the US census was. But we shouldn't be expanding anything without non-trivial sources.--Prosfilaes (talk) 17:14, 1 June 2008 (UTC)
I would support this bot, something is better than nothing in these cases, and if the experience is anything like that with Rambot, many of these articles will be expanded and improved, which is icing on the cake. Christopher Parham (talk) 19:22, 31 May 2008 (UTC)
I can't see anything wrong with this. Even a very short stub is better than no information at all. It would be a good idea to have an article on every significant place in the world, no matter how short. Will these new articles have to be patrolled? If so it might slow down new page patrollers a bit. Hut 8.5 19:31, 31 May 2008 (UTC)

This is one of the biggest steps towards counteracting systematic bias on wikipedia in its history and should be welcomed gratefully rather than sniffed at. In answer to Hut, of course it won't have to be patrolled; it is a bot sealed with approval, and this discussion is several weeks too late. At the time few people seemed interested. As for stubs, the articles are likely to be more informative and far more consistent than people adding a few stubs without proper infoboxes or references every now and again. I believe that many of them can and will be expanded. It won't happen overnight but real world content should be a strong focus on wikipedia. ♦Blofeld of SPECTRE♦ 20:23, 31 May 2008 (UTC) ♦Blofeld of SPECTRE♦ 20:19, 31 May 2008 (UTC)

It would be nice if the bot consistently used typical comment spacing. Currently, its output is inconsistent, sometimes using <!-- typical comment spacing -->, sometimes <!--unusual comment spacing-->. The lack of blank lines before and after the ==External links== section is also unusual. —Remember the dot 20:47, 31 May 2008 (UTC)
I didn't start this to "sniff at" it. BAG approves bots from a technical point of view - no bugs, not a waste of resources, won't break the site, etc. This discussion is to make sure that the community actually wants this. Creating perhaps as many as 2 million new articles needs a lot more than simple BAG approval. Mr.Z-man 21:30, 31 May 2008 (UTC)
Can I just say that I've had at least 15 people already say what a great idea it is including people I haven't met before from various fields of wikipedia and already there has been an offering of editors to help out with dabbing to prepare for the bot. ♦Blofeld of SPECTRE♦ 22:02, 31 May 2008 (UTC)
I also think this is a great idea, but I also think that this should not be solely under the authority of BAG; this is a huge project, and it should be approved by the general community as well. That said, I hope people do approve it. One note, though: Rambot also used census data to make its articles rather large, rather than just stubs. Would it be possible for Fritzpollbot to do the same, at least in the countries where it's available? I think that could boost the support for this, and would make an even bigger impact on the quality of the encyclopedia. --Rory096 03:59, 1 June 2008 (UTC)

All I have to say (and this was all discussed at the request for approval) is, well, finally. Once this bot creates the millions of articles, I can start working on the FA and GA that I've been dreaming about. A stub is better than nothing at all or a one-sentence substub that says absolutely nothing. These articles will be complete with an infobox, reference, cats and stuff. Also, I thought that bot edits were automatically patrolled, like admins'. I'm an Editorofthewiki 22:18, 31 May 2008 (UTC)

  • I also endorse this bot—our geographical coverage on developing countries must improve if this is to be seen as a thorough, unbiased encyclopædia. Expansion of the articles will be possible in most cases, the only hindrance is that unfortunately we have a lack of editors working at geographical articles. EJF (talk) 22:22, 31 May 2008 (UTC)
  • I agree this is fantastic and everything, and I do believe that eventually these will all develop into full articles, but I'm not so sure about doubling the number of articles on Knowledge overnight. Wouldn't it make sense to modify User:Fritzpoll/GeoBot/Example, so that all of these one-line articles on villages are merged up to the article on the slightly higher-level administrative unit, in this case Waingmaw Township?--Pharos (talk) 22:30, 31 May 2008 (UTC)

I strongly feel that we should Not have a bot creating 2 million stubs. In perspective that is doubling the size of the English Knowledge which some feel is too big already. We're already three times the size of dewiki. I'd rather not have every other Random article be an obscure village. We want to find something more interesting through that. Also, they will be perennial stubs. 90% of these will not grow more than the population and location, even in years as I have seen with other articles. Many claim that all real places are inherently notable. I will definitely agree to this to an extent, but not two million articles. Although not every country has a system like the US, any town with an article, especially if it is very unpopulated, should be the equivalent of an incorporated place. The reference on the example above is only another Knowledge article and the links are only generic map websites! With that, there is not much notability other than existence. The inherent notability argument could be taken much further if necessary. I highly doubt even the best atlases would include these new towns and villages. While I agree that we must be countering systematic bias, we are in no way required to have another two million obscure place articles. For example, if a town in Indonesia does not even have an article on the large Indonesian Knowledge, then I don't feel compelled to have one here. Also, remember that this is by English-speakers for English-speakers, so the Knowledge hit count will no matter what be very low. In response to Christopher, the US place articles were expanded and improved while everything else was too. Remember, they were improved by English-speakers likely living in those places, but there will be few of those for these other places. An alternative to two million stubs would be a few thousand "List of places in XYZ" articles, which I feel are easier to navigate than individual articles for each place.
For the example above, notice how long the article on its township is. Perhaps a list of towns there. A list can still always include the town's elevation and coordinates, the only unique information in the example. Reywas92 22:35, 31 May 2008 (UTC)

The article on the township is short for a reason. It was also missing entirely until just weeks ago. It equates to district of countries which in the western world have full and detailed articles. It isn't any indication whatsoever of notability. It is a step to try to cover the world unevenly. It is absolute nonsense that articles "shouldn't be started" because of this. Why should America have an article on places with 3 people and towns in places like Burma and Bangladesh with a population of 60,000 be ignored???? ♦Blofeld of SPECTRE♦ 23:18, 31 May 2008 (UTC)
That's not quite what I said. I would fully support an article on a 60,000-person town in Burma, and personally, if it isn't incorporated there shouldn't be an article on a 3 person town in the US. I don't think the articles "shouldn't be started" because of that though. A vast majority of the two million towns may have less than 100 however. I'd really like to consider having complete, merged lists of towns in each district rather than one-liners on each town. In theory, something is better than nothing, but there are better ways than having almost nothing. Also, what about population? Or area? Funny how the example is 4.9kb yet 80% of that is forever empty infobox parameters. If this goes through, maybe save some space and only include the important ones that could even ever be filled. Reywas92 23:30, 31 May 2008 (UTC)
Uh, how would you describe "important"? Most of the supposed "important" ones with populations over, say, 10,000 are also missing articles, and due to countries' governments not making full demographic info available, it is sad that the articles will not expand much. But, I cannot see how anyone would object to having information in the encyclopedia, and merging of most of the articles to the district etc. would be basically giving up the fight against systemic bias. If the U.S. has articles on every village with a population of 3 and yet it is incorporated, why can't we do the same to the developing world with unincorporated villages (I don't think they have a policy like the U.S., just random settlements)? Also, I would disagree with your assertion that the majority would have pops of less than 100--consider overpopulation--and in Burkina Faso, where information is available (see http://www.inforoute-communale.gov.bf/list_vill/regionname), most have roughly 900 or so. I could care less about the hit count, for we are only trying to display information, not have the entire world look at us (although that would be great). Most of the other wikis are just as incomplete in the matter as we are, so your suggesting the Indonesian Knowledge is irrelevant. This bot would save myself and the bald guy a bunch of time creating such articles and leave more time for expanding them or focusing on other aspects of the 'pedia. Oh, and there is more notability than existence--the notability of an African farmer who can barely keep his family fed and yet his existence is not even acknowledged, for the Knowledge article has not and will not be created because all of us couldn't give enough of a damn to do so. There is so much missing, a bot would be the best and easiest course of action. Remember, we are not only catering to anglophone countries, but the entire world, even if they couldn't give a damn about the aforementioned African farmer.
If the best atlases don't include this info, then how are we getting the info in the first place? Anyway, the bot was already approved by User:Dihydrogen Monoxide (who by the way is currently only ~20 votes off of having the largest number of supporters in his ongoing RFA), so there is no need to continue discussion. I'm an Editorofthewiki 00:15, 1 June 2008 (UTC)
I am not suggesting that we have a mere "list" in the township articles. I am suggesting that we include everything in User:Fritzpoll/GeoBot/Example in the township article, with a separate section for every village. The information would be equally accessible to the Burmese or African farmer, and equally open to being improved. The minute that the section turns from a substub into a stub, we can spin it out as a new article. Heck, I bet we could even design a bot that would recognize a slightly expanded section on a village, to recommend spinning it off.--Pharos (talk) 00:29, 1 June 2008 (UTC)
As I said above, BAG approves the technical feasibility of a bot. The job of BAG is to make sure that bots aren't wasting resources and breaking pages with bad programming. BAG does not exist as a substitute for a community discussion like this. Or does the community not count anymore? And DHMO's RFA has absolutely nothing to do with this. Personally, I support lists as well. I'd rather have 18000 lists with a hundred sections than 1.8 million stubs. If we make the lists exactly as the stubs, infobox, coordinates, and all, the only things we lose are things that would be exactly replicated over all the articles like the stub tags. Redirects can be created from the village name to the list and we lose nothing in terms of people searching for the information, but we retain pages like Special:Random as useful features. Once the sections begin to expand, presumably at the same pace as the rest of the topics on the project, they can be split off into other articles and we still retain the list as an index. Mr.Z-man 04:34, 1 June 2008 (UTC)

Simply because Rambot made articles previously does not mean that it is necessarily a good idea or that it should be repeated. A lot of these places have questionable notability, and creating a separate stub for each seems pretty illogical. What is the benefit to having these articles? What is the virtue of them? Sure, they show a map and the coordinates, but that is available elsewhere. Knowledge is not a directory. Sure, the information is accurate and verifiable, but that says nothing about the virtue of having a stub that states "X is town in Y," and nothing more. It would be far more logical to create lists of these pages and wait for content contributors to come along and build up the individual articles (if ever that happens). We could similarly create stubs for every single U.S. Supreme Court case or every single television show or every single whatever, but doing so diminishes the value of the project as a whole. It floods the database needlessly and provides little in return. But all of this is beside the point; what is needed, truly, is more discussion. A single BRFA for a project that will / would drastically change the project as a whole? No way. This discussion should be cross-posted to AN, CENT, and anywhere else before this bot starts. Perhaps even a watchlist notice is in order. --MZMcBride (talk) 00:42, 1 June 2008 (UTC)

The problem is that no user probably will ever create all these articles. Also, Mr. McBride, the articles contain coordinates and such, not just a one line substub. Basically what you are saying is that you will reject the information that this bot creates, and IMO villages in Africa are much more notable than "every single U.S. Supreme Court case or every single television show or every single whatever" because they contain People. They would create the articles themselves if they had computers. If we had half as many users contributing to this area as at the ANI or even the Doctor Who wikiproject there would be no need for this. And yet, basically myself, Blofeld, and a select handful of users give enough of a darn to do such. Remember when Blofeld, AlbertHerring, and myself created the 36,000 stubs on French communes? And now look at most of them, thanks to User:Markussep. If you build it, they will come. And if no one else except Blofeld or myself will do so, so be it. For I'm an Editorofthewiki 01:01, 1 June 2008 (UTC)
Note to self: Villages in Africa are not notable. Only villages in Europe and America. Haha! Yeah right. Let the bot go! People who are opposed to it are just short sighted and have no faith in what wikipedia users can accomplish. Wrad (talk) 01:14, 1 June 2008 (UTC)
Villages are notable if and only if we have non-trivial sources on them, just like everything else in Knowledge. All the villages in the US have non-trivial census data on them, and I assume that's true for most of the rest of the developed world. But a set of coordinates is not nontrivial. If Africa wants its villages to be notable, we need source data.--Prosfilaes (talk) 17:11, 1 June 2008 (UTC)

It has to be said that this is the impression people who oppose it give. Why shouldn't we give other places in the world a chance? There is a whole world out there and I'm certain if you saw some of these places you'd see how a full article could actually be written if the barriers in accessing knowledge broke down (which I believe will happen over time). It is incredibly narrow minded to think that a full article can be written on a hamlet in America yet an average article that would be created on a population of about 800 in the developing world couldn't equally have a full article. Any notion that "the article is useless in the rest of the world" displays systematic bias at its very best and is the reason why wikipedia has grown so that America and the UK are ridiculously well covered (still with many articles missing though) and entire countries and regions of the world are missing in content on here. Why would it be so impossible to have an article on a settlement in the developing world where we have basic detail, location and map, and population and economic data when it becomes available? I for one think the encyclopedia would be far stronger to cover the world evenly rather than ignoring 95% of the planet because they are not "developed" like us. There is something very wrong if you can't see it is a major effort to address systematic bias on wikipedia and yes encourage more editors to develop these articles. In ten years time, why is it impossible that many of the initial articles created will have been expanded?? ♦Blofeld of SPECTRE♦ 09:45, 1 June 2008 (UTC)


Funny, if "no user probably will ever create all these articles," then why are they notable or necessary? I have plenty of faith in what Knowledge users can accomplish, you rude Wrad; we've gotten here in only seven years, but there's no way we can double the number of articles and assume enough people care enough (per Editor's quote) or know enough about obscure villages to make these much longer. I'm all for creation of the larger towns, but those with a population of less than 900 aren't particularly notable. I'm not being systematically biased, the same goes for the US. Those stubs on French communes: now they have a full infobox, but they're still only one sentence long. You know, to tell you the truth, my problem is not that they aren't as notable, because many of them are, it's that they aren't as long. Really, do you like reading one-sentence articles? Knowledge may work hard, but we cannot add two million articles and assume that they'll be of decent length in even a few years. Some say one-sentencers are better than nothing, but IMO they're useless. A list, really, would be much more concise, and the articles are of reasonable length. Reywas92 01:40, 1 June 2008 (UTC)
But you're missing the point. I would rather full articles too. But the only thing stopping these "one line articles" becoming fuller articles is access to knowledge, dedicated editors willing to expand them, and development and time. Most of the articles we have on here began as one line stubs and most of those didn't have the referencing and infobox/maps etc that these will have. Trust me in the future access to knowledge will grow increasingly and setting up these articles is a way of planting seeds for the editors on this encyclopedia to sow. In ten to twenty years when we have half decent articles on all of these places, I'm damn sure people will be glad we took the initiative to do this. ♦Blofeld of SPECTRE♦ 09:52, 1 June 2008 (UTC)
You've just proved my point. You don't have faith in wikipedians. You're convinced we can't handle it. Call me rude, but I ain't. I'm just pretty dang right. Wrad (talk) 03:06, 1 June 2008 (UTC)
This information is available elsewhere, but I certainly don't know off the top of my head where to find it. Running this bot will make Knowledge more useful as a resource and make this information more accessible to our users. These same articles produced by a human editor would be perfectly acceptable, and I don't see a big problem with speeding up the process: the faster the articles are created the sooner their growth and improvement stage begins. Christopher Parham (talk) 03:11, 1 June 2008 (UTC)

Yeah go for it. Great idea - townships are inherently notable, and wikipedia as a central information resource should cover them to at least some degree. Viridae 01:19, 1 June 2008 (UTC)

Sounds like a great idea. I certainly think it'd be good for newcomers who want to expand a village's article but can't overcome the burden of creating one. Mvjs (talk) 01:26, 1 June 2008 (UTC)


Yes, this bot is a very good idea.-gadfium 01:31, 1 June 2008 (UTC)

This is not a dichotomy between "bot" or "no bot". I think I have made a viable middle-way suggestion on how to modify the bot so that we can still cover the whole world, without a doubling (and this is a literal doubling) of the number of articles on Knowledge overnight. I would like to hear a response to this suggestion. Thanks.--Pharos (talk) 01:32, 1 June 2008 (UTC)

Well that one is easy, just split the list in half and run it over two nights, problem solved. --Samuel Pepys (talk) 01:37, 1 June 2008 (UTC)

The bot will create articles over a period of several months, not one or two days. It's that massive, even if it makes 10 edits a minute 24/7. I'm an Editorofthewiki 01:39, 1 June 2008 (UTC)
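As a rough check on the timescale, the arithmetic can be sketched as below. It assumes one new article per edit, a total of 2 million articles, and continuous operation; the helper name `days_to_create` is made up for this illustration. The second call uses the trial rate of 87 articles in 7 minutes that is quoted later in this discussion.

```python
def days_to_create(total_articles, articles, minutes):
    """Days of non-stop running needed at a rate of `articles` per `minutes`."""
    per_minute = articles / minutes
    return total_articles / per_minute / (60 * 24)


# 10 edits a minute, assuming each edit creates one article
print(round(days_to_create(2_000_000, 10, 1)))   # 139

# trial rate of 87 articles in 7 minutes
print(round(days_to_create(2_000_000, 87, 7)))   # 112
```

Either way the run takes on the order of four to five months of uninterrupted operation, before any pauses for human checking.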

If that is the case then the 'overnight' problem is solved. --Samuel Pepys (talk) 01:46, 1 June 2008 (UTC)
Um, that's not the point at all. Don't you see serious issues when we double the size of Knowledge over a very short period, with 2 million substubs? What's the disadvantage of just merging the same information (a full merge, not a list) into the township articles?--Pharos (talk) 01:48, 1 June 2008 (UTC)
Pharos, you asked for a solution to a non-issue of 'OverNight' doubling of wikipedia. I don't see the issues here. --Samuel Pepys (talk) 02:03, 1 June 2008 (UTC)
Geez, "overnight" is a figure of speech. Please, what is the disadvantage of just merging the same information (a full merge, not a list) into the township articles?--Pharos (talk) 02:30, 1 June 2008 (UTC)

What about the possible vandalism ramifications of this? I realize we have thousands of editors who religiously watch the recent changes, but this seems like an invitation for some of these pages to have inaccurate or spurious content on them for months or years. If a stub is vandalized and it isn't caught in recent changes, it's likely no one will be watching the article and, especially with the low pageviews many of these will likely have, will stand for years. As a possible solution to this, how feasible is a population limitation? I noticed no population is included in the example, but would the bot be capable of including that? We could start with all cities of, say, 50,000 or more. After we see how that goes, the population limit could be slowly reduced. Newsboy85 (talk) 01:46, 1 June 2008 (UTC)

Forgive me, but I can not find a good basis for this argument. Taken to its logical conclusion it says we should limit the number of articles, even temporarily, based only on concerns of vandalism or concerns of the vandalism life span. --Samuel Pepys (talk) 01:52, 1 June 2008 (UTC)
The difference is, when the majority of articles are created by humans, the number of articles increases at about the same rate as the number of editors. So unless we are expecting a massive influx of new users soon, it is a reasonable argument, especially since the articles are bot-created, they won't even be on the creator's watchlist. Mr.Z-man 04:42, 1 June 2008 (UTC)

I oppose the proposal and support a population limitation. Two million articles on obscure small towns is far too many. Keep only (1) towns with 50,000 or more and (2) towns where something notable has happened, as determined by a human, not a bot. Dirac66 (talk) 01:56, 1 June 2008 (UTC)

I've read some of the "arguments" above, and I definitely support these additions. Although it will drastically increase the number of articles, I can't think of any major issues against doing this. I think they should all be created, but not extremely fast, because a lot of people are going to wonder what's going on when suddenly we have over 4,000,000 articles. Anyway, I would spend lots of time on these as it would give me something to do. :)   jj137 (talk) 02:23, 1 June 2008 (UTC)

Questions
  1. Maybe this has been asked already, but what if the bot is used to create articles on towns in one country/region at a time? A controlled, monitored growth in articles is better to resolve the 2 million expansion issue properly, isn't it?
  2. Why can't we use the Fritzpoll Bot to create a list or directory, instead of separate articles, of towns and villages? The bot simply has to dig up and list the names and locations in a list rather than create an article? Of course we can have localized lists, by county, district, etc. Vishnava talk 01:56, 1 June 2008 (UTC)
I agree with the bot and like the quality of its creations, but disagree with the proposed implementation of 2 million at one go (I hold the same retroactively for Rambot, and proactively with the bot that will someday be able to create every USSC case). One datum which should be considered first is that at the bot's current speed of 87 articles in 7 minutes, it would take 112 days of constant operation to create them all. Wouldn't it be better to take a month or two to program it to write somewhat higher-quality articles on the 100,000 most populous cities, and to group the 2,000,000 towns into 10,000 higher-quality (often extant) region articles, completing in say 90 days? While it's both cool and encyclopedia-building to be able to create the 2M, if implemented as is, imagine how many AFDs would arise, turning quickly to CSDs after the community got sick of arguing them. The following needs to happen before this (heartily welcomed, immensely bias-counteracting) contributor is fully let loose:
  1. Locate a population database pronto and correlate the data. Start by checking the existence of all cities over 1 million residents and create stubs on the omissions (there will be some). Then go down one order of magnitude to 500,000 and continue stepwise to 200,000, etc. If we cannot find the municipality in some global database of population statistics, it must be regarded as lower priority. I see no reason to proceed alphabetically (the bot has opted to create 99 towns starting with A-M in the first alphabetical province of the first alphabetical nation-- typical computerthink).
  2. Tie the appellation of the municipality to its population. All my spot-checked articles say "village", but I presume some of them are larger than that. Have the community comment on what appellations correspond to what population and post the results on the bot's userpage.
  3. If the articles have one-line leads, wouldn't that be nice to have in the edit summary?!
  4. Even if it only makes 100,000 articles, it had better be prepared to have the highest quality of formatting (not irregular formatting, spacing, paragraphing), and had better be prepared to fix (most all of) the 100,000 articles if enough people think they should have a mass formatting improvement.
  5. After those are done, the regional lists can be created as lists with coordinates and elevation. Data like time zone and preference for imperial units pertain to the subregion, not the village. JJB 02:00, 1 June 2008 (UTC)
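For reference, the 112-day figure in the comment above can be reproduced from the stated rate. This is a minimal sketch, assuming constant operation at 87 articles per 7 minutes and a target of 2,000,000 articles; the function name and parameters are illustrative only and are not part of the bot.

```python
def days_to_create(total_articles, articles_per_batch=87, minutes_per_batch=7):
    """Estimate days of nonstop operation needed at a fixed creation rate."""
    rate_per_minute = articles_per_batch / minutes_per_batch
    minutes_needed = total_articles / rate_per_minute
    return minutes_needed / (60 * 24)  # minutes -> days


# Figures quoted in the discussion: 2,000,000 articles at 87 per 7 minutes.
print(round(days_to_create(2_000_000)))  # 112
```

Under the same assumed rate, a run limited to the 100,000 most populous places would take between five and six days of constant operation.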
How many times do I have to say this: It will take place over a period of several months. And for the 1 million query: We do have editors that can fix the problem before the bot. Do you have a link so that I can go through them? I'm an Editorofthewiki 02:06, 1 June 2008 (UTC)
OK, you did not read JJB's comment terribly carefully, like when he says it would take "112 days of constant operation to create them all".--Pharos (talk) 02:22, 1 June 2008 (UTC)
I think we understand the several months part. I feel that overnight or overyear doesn't make a big difference. That's two million articles, doubling our current extremely large size. WP:FA says there is about one FA per every 1,150 articles. With a doubling, that's only 1 per every 2,300. According to this, the depth of enwiki is 357. With two million articles and one edit each, the depth could change 50%. Newsboys's concern of vandalism which Samuel dismissed is a real one. Who is going to be watchlisting two million articles? Reywas92 02:27, 1 June 2008 (UTC)
Oh Dear God. 2.3 million articles is nothing to how this project will look in the future, Knowledge will continue to grow in article count and in several years time we will pass 10 million articles that is for certain, Knowledge will grow and grow whether you like it or not.
I dismissed it because vandalism is and always has been a non-issue in regard to content. Who is watching the stubs now? Who is watching the archive pages? The thing about vandals is that they vandalize, they replace the whole page or curse or do other things that get them caught by bots. They vandalize enough pages that they eventually get caught and all their edits reverted. All vandalism is caught and more vandalism only encourages more and better anti-vandal tools. --Samuel Pepys (talk) 02:47, 1 June 2008 (UTC)
The vandalism can still be caught by RC patrollers, who must catch a fair fraction of it anyway. And I can't imagine why these would be high vandalism targets - vandals are going to go for pages such as George W. Bush rather than an African town they've never heard of. Hut 8.5 08:03, 1 June 2008 (UTC)
  • OK, I don't have time to read through all this discussion, but I'd like to give my opinion: I think this is a good idea and that the pros far outweigh the cons in this situation. People looking for articles about their home town might now be able to find at least a small entry and I think that any officially recognized town is notable and eventually the articles are going to be created anyway and never will be deleted as towns are notable. Sometimes I've seen non-English speaking users create articles that are barely comprehensible on towns and a bot for this task would work great. I do however think that all technicalities should come under close scrutiny in the next few days. The bot has much potential to do a lot of damage, so I assume that all disambig bugs are fixed? In a nutshell, I think it is a good idea to have this bot, because it falls under "difficult tasks that would be tedious to do manually" and they are tasks that are going to eventually be done. The Dominator Edits 02:15, 1 June 2008 (UTC)

One thing that is less on the technical issues and more about appropriateness: coming from the standpoint of notability for other aspects of Knowledge, people always point to the fact that WP has stub articles for all these tiny towns and villages in the US, but we don't allow for stubs on fictional characters, every television episode ever, and much more. Personally, I know what the difference is, but I think there needs to be clear reasoning laid out that notability for the resulting stubs is assured to counteract the arguments of those that feel that other areas also deserve stubs. --MASEM 02:24, 1 June 2008 (UTC)

That sounds like an issue to be raised at the guideline for notability. The bot is only following the guideline, not setting it. --Samuel Pepys (talk) 02:29, 1 June 2008 (UTC)

Well, I think it'd be a good way to get other editors on to help fix them up, at least. Therequiembellishere (talk) 02:28, 1 June 2008 (UTC)

I like the idea of this bot. There are many holes of towns and cities which Knowledge does not have an article on. Captain panda 02:52, 1 June 2008 (UTC)
I agree and support the bot as written. I edit in African topics and it's incredibly embarrassing for the wiki that, seven years in, we are still creating stubs for towns that have populations in the tens of thousands, are connected to major political and military events, and/or are the birthplaces of heads of state. - BanyanTree 03:24, 1 June 2008 (UTC)


There sure are a lot of good ideas here on what to do, but I think we're losing focus. The bot task has already been approved by a group specifically designed to do just that. It has been tested. It has been specifically reviewed by performance experts who aren't a normal part of the review process. All green lights.

Frankly, what takes us only a few seconds to type out from the comfort of our edit screens takes the bot builder hours to code, test, and validate, all to do a task he has already been approved to do. This seems to discourage bot builders from doing tasks that others aren't doing. IMHO if someone is going to require a bot builder to redo work already approved, they better have a damn good reason, or at least help bake the bread. --Samuel Pepys (talk) 02:15, 1 June 2008 (UTC)

Doubling the size of Knowledge is a change quite above the scale of what is normally delegated by the community to the discretion of the bot approvals group.--Pharos (talk) 02:27, 1 June 2008 (UTC)
Perhaps that is an issue to take up with the Bot approvals group on limits to their authority. --Samuel Pepys (talk) 02:31, 1 June 2008 (UTC)
I wholly agree with Pharos, a task of this size requires community consensus before being started. The BAG is a technical and quality-control committee, it doesn't decide which tasks have consensus and which do not. Christopher Parham (talk) 03:06, 1 June 2008 (UTC)
According to WP:BOT, a bot needs to meet all of these criteria:
  1. is harmless
  2. is useful
  3. does not consume resources unnecessarily
  4. performs only tasks for which there is consensus
  5. carefully adheres to relevant policies and guidelines
  6. uses informative messages, appropriately worded, in any edit summaries or messages left for users
For most bots, the BAG is able to evaluate all six: most proposed bots are simply automated ways of doing tasks that humans do on a regular basis. For this bot, however, the BAG has decided that they can't evaluate point #4, which is why this discussion is taking place. --Carnildo (talk) 04:53, 1 June 2008 (UTC)

4 is obvious. If it obeys 1-3, 5 and 6, it does better than most editors and has an inherent general consensus under wikipedia's anyone can edit rule (and that includes bots). The bot is already held to a much higher standard of behavior, and more strictly, than other users. --Samuel Pepys (talk) 05:36, 1 June 2008 (UTC)

4 is obviously not obvious, as various editors have brought up concerns in the discussion above. The bot has been approved (on a technical point of view), but not the bot task (on a non-technical point of view). SyG (talk) 12:48, 1 June 2008 (UTC)

This section just illustrates how some people completely misunderstand the role of BAG. They have absolutely no authority to judge whether a BOT task has community consensus. They desperately want to have that power, but they ultimately do not. You will not find any editor willing to do this task manually, and to suggest otherwise is ridiculous. MickMacNee (talk) 18:20, 1 June 2008 (UTC)

Discussion is hopelessly convoluted

It is impossible to have any sort of structured discussion here. I recommend that the watchlist notice be taken down for now, the conversation be refactored and moved to its own project page, and that we establish distinct sections for the various issues and options. Thanks.--Pharos (talk) 02:14, 1 June 2008 (UTC)

I've made an attempt at structure by listing several #Options at the top of this discussion.--Pharos (talk) 03:01, 1 June 2008 (UTC)
I agree. The first step in assessing how the community wants to take advantage of the potential offered by this bot is to create a well-structured place for discussion. Knowledge:WikiProject GeoBot? (For my part, I also agree with you that, at least with the Example case, it would be more useful to have a combined township article with sections for all the sub-stub town content, rather than separate articles. It's not an issue of how many articles get created, but of what the most useful way to present that information is. Collecting groups of settlements together makes the articles more useful for someone actually trying to find out information on these places, rather than just celebrate arbitrary article number milestones.)--ragesoss (talk) 03:03, 1 June 2008 (UTC)
I think Knowledge:WikiProject GeoBot is a great idea. It would instantly become the most important WikiProject there is, in many respects. This is a huge project we are starting on, and a lot of work awaits us.--Pharos (talk) 03:30, 1 June 2008 (UTC)
Of course the discussion is convoluted; it's about one's ability to focus the discussion and keep it as simple as possible that counts. :) Anthøny 06:32, 1 June 2008 (UTC)

More discussion

I am very much in support of this incredible bot. Note that (as was said above) the articles will not be created overnight but over a period of a year, all articles will have a map, geo coordinates, cat, stub template etc (I am also trying to organize automatic wikiproject tagging on these articles) and this will be a good starting point for many of these articles. It is a lot easier for a random individual to come and edit an already created article and add things to it, rather than someone starting it from scratch. Yes, it's easy for some of us to start from scratch, but for those that have never had wikipedia experience and just came by their town or village, this will definitely make it easier for them. Oh and if you have a problem with these articles' length, I would recommend expanding the hundreds of thousands (if not a million) or so other stubs that are only 1 or 2 sentences long in Knowledge currently. Then consider expanding all the 1 sentence articles on ALL of the language wikiprojects. I kid you not, many of them in different languages are 1 sentence. What do we do about them? Delete them all for being too short?!?! Rubbish! This bot will be one of the best things to come out of wikipedia. Inclusion of more articles and the partial elimination of USA (or ) bias (it won't completely eliminate it but it's a step in the right direction). Cheers!Calaka (talk) 02:37, 1 June 2008 (UTC)

I have no problem with adding 2000000 articles over night. The only question I have is when do I celebrate articles 3000000 and 4000000?. Zginder 2008-06-01T02:32Z (UTC)

They will not be done overnight. But if that be the case, nothing wrong with celebrating twice as hard in the one night Zginder ;). I also want to remind people that a similar concept has occurred with the BOT adding genes/proteins automatically a few months ago. Many people have argued that the articles are only one line each etc etc and that they are non-notable. Notice the patterns... Cheers!Calaka (talk) 02:40, 1 June 2008 (UTC)
Huh? Besides, once some user finds what the 3 millionth article is, it will instantly become at least a B class article because people actually care now that it is a lucky statistic. I'm an Editorofthewiki 02:43, 1 June 2008 (UTC)
We could just lie to the users. Give every 100,000 users a different 3millionth article. Then we'd have a lot more B-etter Class articles. --Samuel Pepys (talk) 02:50, 1 June 2008 (UTC)
Interesting idea. Would never work, of course. Database queries could find the exact 3 millionth article, and Knowledge's too open for something like that. (Yes, I realize it was intended humorously.) Pyrospirit (talk · contribs) 03:15, 1 June 2008 (UTC)
Perhaps, but I'd say the vast majority of users wouldn't notice. The bcabal would observe their normal vow of secrecy --Samuel Pepys (talk) 03:18, 1 June 2008 (UTC)
I think that if they are especially obscure then they should be redirected to a section in a "villages in region" type article. Mike92591 (talk) 03:31, 1 June 2008 (UTC)

Let's do it! We should be focused on expanding our coverage of everything. Yes, quality is important too, but these articles will increase the overall quality of the project simply because they increase our coverage. What encyclopedia can boast of having an article about every village on the planet? This is a big leap forward in the effort to make Knowledge the definitive database of mankind! I'm excited! Okiefromokla 03:21, 1 June 2008 (UTC)

I like it. One of the first things that new users do when they start poking around with Knowledge is check out the article for their hometown. Expanding something that was little more than a stub, when I thought my hometown deserved better than that, is one of the things that got me started editing in the first place. More editors = Better encyclopedia. 'Nuff said. - Ken Thomas (talk) 03:48, 1 June 2008 (UTC)
Support. It's a good idea. ajoy (talk) 18:34, 1 June 2008 (UTC)
Alas I do not share your enthusiasm, and I am one of the more determined Knowledge promoters whenever the topic comes up in conversation. We are not talking about "articles for their hometown", are we? We are talking about a one line stub about Anywheresville (which may well already be a two line stub called "Anywheresville (South Somewhere)", how will it know?). doktorb words 06:21, 1 June 2008 (UTC)

Options two and three Create articles on towns of X thousand or more in population and create "List of towns in " articles as well. -Justin (koavf)TCM03:49, 1 June 2008 (UTC)

I don't like the idea of this bot. To be honest, I think nearly doubling the page number of our articles without adding anything other than directory-style content is bad. I'd rather say we have 2 million (mostly) substantial articles than 2 million substantial articles and 2 million directory entries. If there is more worthwhile content than "X is a location in Y" to be added to these articles, then they will be created in due time. Flooding the encyclopedia with 2 million articles devoid of any real content that will probably never grow to anything more isn't productive. Quality over quantity. MZM has a good suggestion above about creating a list. As for the systemic bias arguments, I would also support deletion of articles like this already in existence. seresin ( ¡? ) 03:59, 1 June 2008 (UTC)
That wouldn't actually solve systemic bias at all, as it's a product of our userbase's bias towards the developed world, particularly the United States. If anything, most of the articles similar to this are already articles on countries where not many users come from, so there isn't much information. Deleting those articles would just increase the systemic bias. --Rory096 04:03, 1 June 2008 (UTC)

I support this bot. Article count shouldn't really matter, this will go a long way towards helping us be as complete and unbiased as we can, and will encourage people to expand articles about places they know. Grandmasterka 04:13, 1 June 2008 (UTC)

Oppose. Articles should be created by people, not bots. The whole point of Knowledge is that at some point a human being thought some topic was notable and interesting enough to write an article. This editorial function is the most valuable thing we can get from Knowledge - an idiot search engine can turn up millions of random references just as easily as a bot on Knowledge with about as much (little) value to the reader. We're drowning in facts, and we don't need bots swamping us with more computer-generated trivia.

To the contrary, almost all of our US place articles were created by a bot, and many (most?) of those, once created, were expanded into very good and complete articles with far more information than just the bot provided. However, the bot started the process, and once the article is actually created, it becomes much easier to incrementally improve it. Additionally, Knowledge's systemic bias is caused by its lack of editors from various parts of the world, so articles that would be considered notable aren't created because we have so few people from the places where people would know about the subjects. A bot would be an effective means to mitigate that bias. --Rory096 04:29, 1 June 2008 (UTC)
Views on whether or not bots should create pages should be taken to the Bot Approvals Group and not held directly against individual bots. --Samuel Pepys (talk) 04:37, 1 June 2008 (UTC)

Oppose. Despite what some may imply, opposition doesn't have to be about xenophobia or superiority. It is simpler. If no human cared enough to create the article, why do we need it? As the editor above said, articles should be created by people, not bots. 2,000,000 more stubs is not the answer. Niteshift36 (talk) 04:24, 1 June 2008 (UTC)

These articles will be created. By bot or by human doesn't matter, because they will be created. So what's left is a question of how to most efficiently use our time. -- Ned Scott 04:27, 1 June 2008 (UTC)

Complete support of bot as written and approved. RyanGerbil10(Kick 'em in the Dishpan!) 04:37, 1 June 2008 (UTC)

Comment: Notability. Maybe I am splitting hairs here, but regardless of whether the act was done by bot or entirely the old fashioned way, isn't this stepping over the line by a foot or a mile on Notability? Not everyone would ever consider every town or village on the planet notable just because they exist any more than are Knowledge's current feelings about anything else. Not every person is notable even if they are very popular in a small area but not well-known in many places. To list every possible map spec even though 99.9% of the rest of the planet knows nothing about each of them and research is probably very limited on most of them because they are so insignificant (no offence meant) seems to me a step in the wrong direction. — CobraWiki 04:52, 1 June 2008 (UTC)

99% of the planet covered? We wish. The places that show up on google maps are only the main towns and villages in the world. For instance there are 28,000 google maps of places in India, Actually there are 638,000. I doubt even after adding two million articles we would have covered 70% of the world. ♦Blofeld of SPECTRE♦ 10:02, 1 June 2008 (UTC)
There's been a general consensus for years that articles on towns are notable; we have an article on every census-designated place in the United States, and that notability doesn't change because of the country. --Rory096 05:04, 1 June 2008 (UTC)

Several issues:

  • Will the bot copy from public domain sites, such as World Factbook, or other sources? How will the bot cite facts, per reliable sources? How do we know if these articles are not copyright violations? I am sorry to have to say this, but having Knowledge's articles written by a bot presents to the public an image of Knowledge editors being lazy and incompetent of adding content if we begin to have articles written by a bot. miranda 05:36, 1 June 2008 (UTC)
You can get an idea of everything from looking at the contribs --Samuel Pepys (talk) 05:42, 1 June 2008 (UTC)
  • Resounding oppose - an incredibly bad idea. Knowledge aims for quality, not quantity; flooding with loads of robot-written articles that amount to nothing is valueless but harmful. ╟─TreasuryTag (talk contribs)─╢ 07:09, 1 June 2008 (UTC)
  • Support - if this has already been implemented for US places, then it would create a strong systematic bias not to do so for the rest of the world. If this is not passed, then a review of the US bot and articles it has created will be needed. --GW_SimulationsUser Page | Talk 08:07, 1 June 2008 (UTC)
  • Strong support to help counter systemic bias, doing it just in the US is a thoroughly bad idea as we already have a bloated coverage of that country, the UK etc. Thanks, SqueakBox 16:47, 1 June 2008 (UTC)
  • Totally good plan! Would get influx from google maps users providing more information. (I actually recently made some small contributions to nl.wikipedia for the first time in years, for precisely that reason :-) ) Also, could the bot check geographic locations in old articles too? Sometimes GPS locations seem to be a tad off. --Kim Bruning (talk) 17:41, 1 June 2008 (UTC)
  • Strong support per SqueakBox and Christopher Parham. -- Quiddity (talk) 19:54, 1 June 2008 (UTC)
  • Strong Oppose Frankly this seems insane. We could choose to create millions of random stubs about things on which we have little or no information for many categories of things. Why not stub every school in the world? or every politician? The only reason to create an article is that someone has sufficient information about a subject to make it worthwhile to commit it to writing. Billsmith453 (talk) 19:51, 1 June 2008 (UTC)
  • Oppose. While I can't comment on every country on earth, let me say why this would be a bad idea with reference to Romania. Romania has some dozens of municipalities and cities, ~3000 communes and ~12000 villages. Only the first three have official status - legal incorporation, city/town government, etc.; villages do not. Villages are almost invariably very small and rather poorly covered in academic literature. Whatever can be said about them almost always can be said in the article on their commune - for instance if commune A contained villages A, B, C, D and E, we could have a lead section covering A as a whole, then five sub-sections covering the villages. But really, at the village level notability is quite hard to claim plausibly. I would definitely support a creation of articles on all incorporated places. But this proposal goes too far; it creates a couple million micro-stubs that will sit there with barely any content for a long, long time to come. No, thanks. Biruitorul 01:24, 2 June 2008 (UTC)

Change in rationale wording

I support this bot task, but I think we need to make it clear that the rationale isn't "inherent notability". We often tell editors that things are not inherently notable, because just being there or existing doesn't always make that thing/place/person notable. However, the places that would be covered in these articles were not simply just.. "born" (for a lack of better words). To be a town/village/whatever that is sourced then that means some basic criteria was met. -- Ned Scott 04:26, 1 June 2008 (UTC)

Nothing here says we can't delete them afterwards. --Samuel Pepys (talk) 04:44, 1 June 2008 (UTC)
Which then begs the question "Why allow a bot to create them all?" I can see it now: 100,000 stubbed locations created by this bot just because it can followed a month or two later by 100,000 AfD proposals because none of the articles contain any information other than one sentence or an infobox showing their location, size, and population, and no one can seem to provide any more useful information. — CobraWiki 05:03, 1 June 2008 (UTC)
100k of 2 million is 5%. That estimates only 5% of the pages would be 'bad'. Likely beats the daily average for new pages created by humans. --Samuel Pepys (talk) 05:07, 1 June 2008 (UTC)
That's irrelevant. The point is that there's 100,000 AfDs, which constitutes an organizational nightmare and utter hell for the 1,000 or so active administrators on the project. Heck, we could have 1,000 extra AfDs from this and that alone would create a very taxing situation for administrators on the project. Sephiroth BCR 09:51, 1 June 2008 (UTC)
Sounds more of a systemic issue within the AfD process. Change the policy, not the bot. --Samuel Pepys (talk) 09:52, 1 June 2008 (UTC)
How is it a systemic issue with AfDs? You're adding 2,000,000 articles to the project. There's going to be articles within that scope that are going to be AfD'd, prodded, or CSD'd inevitably. Even if 1% of these articles end up going this route, that's 20,000 articles administrators have to deal with. And it's the bot that is creating the problem, not the deletion process, which shouldn't have to bend over to accommodate this. Sephiroth BCR 10:01, 1 June 2008 (UTC)
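The percentages traded in this thread can be checked with simple arithmetic. This is a minimal sketch, assuming the round figures quoted above (2,000,000 new articles and roughly 1,000 active administrators); the helper names are hypothetical and used only for illustration.

```python
TOTAL_ARTICLES = 2_000_000  # size of the proposed run, as quoted in the thread
ACTIVE_ADMINS = 1_000       # "1,000 or so" active administrators (approximate)


def share_percent(count, total=TOTAL_ARTICLES):
    """Return what percentage of the total a given count represents."""
    return 100 * count / total


def count_at_percent(percent, total=TOTAL_ARTICLES):
    """Return how many articles a given percentage of the total represents."""
    return round(total * percent / 100)


print(share_percent(100_000))    # 5.0   (100k AfDs as a share of 2 million)
print(count_at_percent(1))       # 20000 (1% of 2 million)
print(100_000 / ACTIVE_ADMINS)   # 100.0 (AfDs per administrator in the 100k case)
```

Both sides' numbers are consistent with each other: 100,000 is 5% of the run, and even a 1% deletion rate would still mean 20,000 articles to process.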

Support creation of all 2 million articles. If we don't like what this does to the random article feature, we should improve the random article feature. I think notability on WP is generally a silly idea and this is a perfect example of how oldfangled ideas about what is "encyclopedic" might limit the utility of WP. Let's not allow that to happen. MaxVeers (talk) 07:00, 1 June 2008 (UTC)

Support the bot as written. Maybe it is possible to prevent the random article feature from presenting stubs. --R.Schuster (talk) 10:13, 1 June 2008 (UTC)
Isn't the very purpose of the Random Article feature to bring you to an article you might be able to improve that you'd never have thought of? That's certainly what i use it for. Cheers, Lindsay 14:23, 1 June 2008 (UTC)

Straw Poll

As the conversation is becoming unwieldy (as opposed to hopelessly convoluted) I move for a straw poll as is consistent on this page--Samuel Pepys (talk) 04:44, 1 June 2008 (UTC)

Users who support the bot as written

  1. Strong support - As above, the best thing we can do to address systematic bias and put wikipedia on the right path that attempts to cover the world evenly, which any decent encyclopedia should. Why is it impossible that any of the articles on these places can be expanded once knowledge can be accessed?? It's already happening with places like Madagascar and Burkina Faso... To allow a bot to do this for every place in America yet deny the rest of the world the right to be covered shows the views of certain people on this site at its most prominent that wikipedia is largely about America or the UK. This is so wrong in my view. The sort of articles that will be created will be places like Agnam-Goly etc, places in the world which exist and actually each have their own stories to tell, but they need to start off from somewhere. Articles such as this show that quite reasonably it would be possible to have a fuller article on each of the places drawn up by the bot eventually. I honestly believe that information will gradually become more available online, this is exactly what wikipedia and information services on the internet is all about - development and access - and we have to be a part of making this happen and we are only in the infancy of what will surely develop into the 21st century on a mass scale. This shows that it is actually possible that articles can be written for the "useless stubs" so why should we deny people the right to try to cover the world evenly. Imagine eventually we have two million half decent articles. Is this a positive development of wikipedia or not? Even as stubs with the infobox and location and province details I think they are adequate additions that just need expansion. It is far better for wikipedia to take a giant leap and recognize these places rather than deliberately ignoring 95% of the world ♦Blofeld of SPECTRE♦ 10:04, 1 June 2008 (UTC)
    Blofeld, your enthusiastic support of this proposal is amazing, but your justifications are quite weak.
    Countering systemic bias in Knowledge to " the views of certain people on this site at its most prominent that wikipedia is largely about America or the UK"? In other words, you are proposing to make a point by disrupting the encyclopedia in perhaps the most extensive way it has ever been disrupted, or could be. According to you, the bias is due to articles on every "census place" in the US having been created by bot some years ago, by comparison depriving similar places elsewhere from having "their own stories" told. Yet my experience with these place articles (encountered randomly) is that they are not telling their stories. They are simply repeating their serial numbers, so to speak. The existing place articles do not add useful, notable, or even interesting information to Knowledge, and neither would the new ones. It is no shame to the rest of the world that their bureaucratic affairs are not documented here, as are the Americans'. Those articles that you hold up as examples of progress on this front are clearly exceptions to this trend, and these isolated cases of success do not justify (to me) the massive inclusion of articles for which you cannot provide any guarantee of a future.
    This is one reason why a lot of other countries hate America & the people in them. Your elitist view on stating how their countries are "unstable" and basically "third world" is evidence enough to not worry about them and they do not need articles here on Knowledge. Of course they will start off as stubs, that's how every article starts off. In time though I know that they will improve. Change occurs slowly. Furthermore just because these villages and towns happen to lack all the technicalities that a first world country like America happens to have (internet, computers, microwaves etc) doesn't mean they deserve any less of an article as they are a real place and they are notable. Just no one from that region has gotten to the internet cafe to write the article.Calaka (talk) 02:05, 2 June 2008 (UTC)
    How dare you put words in my mouth? I used no such phrases; please strike them from your response. In fact, strike your whole response; I never mentioned microwaves or internet cafes either. My point is that if these articles can in fact be expanded, then there really is more encyclopedic information on the places, and that information should be the basis for an initial stub. Such cannot be added by robot; just like the information itself, it requires the human touch to make it more than just rote enumeration. Certainly every place is notable to the people in it, just like every person is notable to himself and his friends (or her friends, of course); however, we have a strict, long-standing policy against articles on non-notable humans, and the prospective contents of these place articles doesn't convince me that there is anything more inherently notable about a random town anywhere on Earth (including in the US) than about myself. Ryan Reich (talk) 03:06, 2 June 2008 (UTC)
    I was interpreting the tone of your comment. You obviously did not say those things but I felt you implied them strongly. Now more seriously: Of course these articles can be expanded. I believe very much they can. Every article on wikipedia can in time be expanded to at least a start-class article. Easily and no question about that.
    Look at this random bunch of articles from South Australia:
    1. American River, South Australia
    2. Andamooka, South Australia
    3. Burra, South Australia
    4. Cockburn, South Australia <--Fewer than 100 people
    5. Freeling, South Australia
    6. Hoyleton, South Australia
    The list goes on and on and on. Now tell me. Are these places notable? They are in a very rich country in comparison to the rest of the world... but are they notable? They are stub/start material and by some of them we can see that they are quite easily expandable. I hence believe that articles in other parts of the world can just as easily be expanded. As for your comment on people and their notability: Well, as it now stands the WP: Bio wikiproject has tagged over half a million individuals (and I wouldn't be surprised if there are another 250 K that are not even tagged). Almost a third of wikipedia consists of people. Are they all notable? Hundreds more people are being added every day. I hence think a town or village or even a city in some place that no one on the internet has ever heard of is very much notable. Of course we think no one has heard of it. But I very much bet a person has. They just might not have even heard of wikipedia to begin with.... Calaka (talk) 04:18, 2 June 2008 (UTC)
    Your entire argument is WP:OTHERSTUFF, is therefore invalid, and I'm not going to continue saying that. The question is not whether some articles can be expanded; the question is whether any stub on any place can be expanded. I say not; I say that without sources and notable information, such articles have no business existing in an encyclopedia. This results in a systemic bias that we have no business correcting: it is a bias in the world, of the lack of reliable data on many places in developing countries or of sources on non-Western countries that are translated into English (seemingly necessary for a source on en.wikipedia) and considered reliable (not knowing the originator, it may be hard to know whether their claims are for real). It is also certainly a bias in the interests and knowledge of our contributors, without whom there can never be any articles here. It's regrettable but it's not the fault of Knowledge, and creating vacuous articles that because of it will most likely remain vacuous is not a solution. This has nothing to do with the wealth of nations, size of the town, or quality of Internet connection. It has even less to do with how much you "feel" that I think it does; do not take your own personal bias into this discussion. Ryan Reich (talk) 05:01, 2 June 2008 (UTC)
    You say that "information will gradually become more available online": well, if it's not available now, why should we write the article now? By definition, if it is not available, no reliable sources exist on the subject of the article, and therefore no claim of notability can be supported. I do not believe that a place is inherently notable; you have said (elsewhere) that this is the case because a village has people in it, as though the personal touch were all that is necessary. Yet we routinely delete articles on non-notable people. There must be some connection to a larger community of interest, and for a place, which is by definition a community, such a connection would appear to be easily provided. Yet the bot you support would never supply more than the driest demographic and geographic data, and you are willing to delegate the responsibility of filling in the necessary details to someone who has the "story to tell". Your inability to provide these details yourself indicates that you could never defend the notability of the articles whose creation you advocate, which (again, to me) casts doubt on the claim that any such notability exists in general.
    What about all the "neighbourhoods" and "boroughs" that have articles in Knowledge... They are part of a city so shouldn't an article on the CITY be enough? Why do we need an article about the neighbourhood "epping" or "cronulla" in Sydney when the article "sydney" is enough. *cough Bias cough*. Surely those neighbourhoods are non-notable? I mean if a town of a few thousand people is not notable in a third world country, then why should a neighbourhood with fewer people be? Just because they have internet access? Calaka (talk) 02:05, 2 June 2008 (UTC)
    That other stuff exists is no excuse. I haven't read any of these neighborhood articles, and perhaps some of them are indeed as vacuous as the proposed two million would be, but if any of them contain data of real human interest, it's exactly what I'm looking for. I have no problem with articles on small places; my problem is with indiscriminately populating Knowledge with such articles containing no secondary sources. That is not encyclopedic. Ryan Reich (talk) 03:06, 2 June 2008 (UTC)
    Now hold on. You do realize that the articles being created will ALL have sources from at least 2 places, proving that they do in fact exist. I agree that the more sources the better, and if government sources (or census data for example) can be provided on the internet with even more data, then all the better. Note that as more time moves on, more information and sources will be available. Now while I am here, I might as well list that other stuff that you believe is not an excuse.
    I present you with 14 places of interest in Australia:
    1. Mount Nasura, Western Australia
    2. Eden Hill, Western Australia
    3. Morley, Western Australia
    4. Ascot, Western Australia
    5. West Leederville, Western Australia
    6. Rossmoyne, Western Australia
    7. Swanbourne, Western Australia
    8. Hammond Park, Western Australia
    9. Cottesloe, Western Australia <--Looks like a very good start class article to me.
    10. Hilton, Western Australia <-- oH Noez a one liner! AfD where are you!?!?!?
    11. Orange Grove, Western Australia
    12. Warwick, Western Australia
    13. The Lakes, Western Australia
    14. Peron, Western Australia
    Again the list goes on! Note that these are all from one city. Yup, all from Perth, Western Australia. Should we speedy mass delete the lot of them? Or maybe merge/redirect onto the Perth article? Calaka (talk) 04:18, 2 June 2008 (UTC)
    The articles will have sources, but they will be merely "primary sources": catalogues of facts, not opinion. The guideline WP:Notability states that "sources" are "secondary sources", which contain ideas as well. You claim that in time more data will arise; see WP:Notability#Notability is not temporary: it is not enough that a subject will one day become notable. When it does, come back and write the article properly. As for your examples: you are correct that Cottesloe is a good article, many of the others have at least something in them, and that Hilton probably should be deleted: it makes no claim of notability other than that it exists, and provides no information about it. Alternatively, since you feel so strongly, you could add to it; if you can't, consider that maybe there's nothing to say about it right now. Ryan Reich (talk) 05:01, 2 June 2008 (UTC)
    You say we are "ignoring 95% of the world". Elsewhere, you say that it is our responsibility to speak for people who, of necessity of circumstance, cannot speak for themselves here. Yet, again, breadth of coverage is no substitute for depth of coverage. We do not write articles on all people, yet we do write articles on all countries, of which many (most? all? I haven't read them all) are extensive and useful. In this sense, we have covered 100% of the world. How fine a mesh must we lay to claim adequate coverage, though? This is where the concept of "notability" becomes a crucial element of Knowledge's weave: the additional detail provided by a more extensive coverage of one town than another is a choice on the part of the author, indeed a judgement of these locations by that author, which is justified by the additional benefit to the encyclopedia provided by the narrative in the article. In indiscriminately (yes, indiscriminately, a direct violation of Knowledge practice) creating articles on every settlement in the world, you provide no narrative and fail to distinguish any of the towns from each other, ultimately denying notability to any of them. Politically correct, even morally compelling as your goal may be, your methods provide not the substance of panacea, but only the appearance. Ryan Reich (talk) 18:45, 1 June 2008 (UTC)
    God bless America. Jokes aside, if we go along that thread, what is stopping us from deleting all the non-notable crap that comes out of the American places? Yes, there is a lot of stuff that we can probably do without. I mean we have an article on the USA, an article on all the major cities, mountains, rivers, hills, mole holes, buffalo, indians, tennis courts, pizza stands, streets, highways, superhighways, super super duper highways, roads, bridges, collapsed bridges, shoot outs, the spoon that Obama used in his press conference, we basically have an article on EVERYTHING American related. All of that is seriously notable? And of course, that village in Randomstan is non notable. Peace! Calaka (talk) 02:05, 2 June 2008 (UTC)
    I can't take you seriously, but I also can't leave your remarks as though I agree with them. Let me make this clear: my objection has nothing to do with nationality. Your presumption that it does is far more offensive than whatever you are decrying, because it makes you guilty of your own crime. I'm sure there is a lot of non-notable crap on Knowledge, and I'm not part of the cabal that likes to delete it; however, I think it's upon me as a good citizen to oppose adding lots more. Ryan Reich (talk) 03:06, 2 June 2008 (UTC)
    Oh of course not. But while I am here why don't we look at some fascinating articles:
    1. Lake Tuscaloosa
    2. DeGray Lake
    3. Trinity Lake
    4. Aliceville Lake <--Oh noez a one liner :s
    5. Bear Creek Reservoir <--Oh noez another 1 liner. But it's ok. You're an American one liner!
    6. Wheeler Lake
    7. Gran Desierto de Altar
    8. Red Desert (Wyoming)
    9. Alvord Desert
    My point is: These are places (villages, towns, cities!). I can only imagine the uproar if people wanted to include a 'lake' from Pakistan or a 'desert' from thirdworldcountrySTAN onto wikipedia. Yet we are more than fine with all of the above lakes and random stuff from America?
    Note: I have nothing against America. I think it would be highly hypocritical of me, given the fact that I am on wikipedia (american), a computer (american brand), an internet connection (american) and drinking a can of Coca-Cola (of course american). I am just saying that as it stands, a world view is not in place in wikipedia and it does not need to remain that way. This bot will SLOWLY (read: over the course of a year or more) implement the addition of places throughout the world, increasing the coverage and greatness of wikipedia. I have no doubt that these articles can go beyond stub status. If those small towns in Australia can, or those suburbs in Perth can, then so can these. Peace! Calaka (talk) 04:18, 2 June 2008 (UTC)
    Once again: this is not about nationality, and you should get over it. You are the only one here ascribing this motive to your opponents. I don't care about the examples of success you can produce; I am interested in the far more numerous failures you are hiding, but which are likely to be the norm for this project. Nor do I care about the examples of failures that already exist that you think justify increasing their number. I don't care whether you think that these articles will have a future; if when they are created the best that can be done for each town is to give its census data, then that town will not meet even the minimal notability criterion that someone outside of it had anything to say about it. If that criterion can be met, then the best way to prove your point is to meet it in each case; if you can't do that, then you are ceding your point.
    I'm done with this argument; the debate has already been suspended for a few days and a request was specifically made to end the divisive poll which we are, somehow, continuing. Think about what I've said and we'll talk again later. Ryan Reich (talk) 05:01, 2 June 2008 (UTC)
Ryan Reich said: "... the debate has already been suspended for a few days and a request was specifically made to end the divisive poll which we are, somehow, continuing. Think about what I've said and we'll talk again later."
I will agree with that. Peace! Calaka (talk) 06:25, 2 June 2008 (UTC)
  1. Bots that satisfy points 1-3&5-6 obviously have a general consensus under 'anyone can edit' WP. Bot is already held far beyond the standard set for other users. Samuel Pepys (talk) 04:44, 1 June 2008 (UTC)
  2. Per WP:NOTPAPER the expansion of the size of Knowledge is a red herring. All other arguments seem spurious. This seems like a fine task, and I see no inherent problem with it. Jayron32.talk.contribs 05:00, 1 June 2008 (UTC)
  3. I like it. MBisanz 05:23, 1 June 2008 (UTC)
  4. Towns and villages are about the closest we get to inherent notability per long standing tradition and consensus on deletion debates. EconomicsGuy (talk) 05:52, 1 June 2008 (UTC)
  5. -- Ned Scott 06:02, 1 June 2008 (UTC)
  6. The purpose and value of this addition to Knowledge is right on. The "sky is falling", and "evil machine" arguments are ludicrous, just as they were in the folk tale, and among the Luddites. —EncMstr (talk) 06:26, 1 June 2008 (UTC)
    Comment I cannot allow that to go without reply. My problem with this insane idea is quite clear. We are, it seems, accepting the right to create 2,000,000 one-line stubs out of the control of ordinary editors. The whole point of this project, I assumed, was to allow everyday ordinary people to edit articles. Now I find a Bot is to come along and create millions on my behalf, for my benefit, for my "greater good". Take a look at Aliabad, supposedly a Bot created article from the long list of towns in Azerbaijan without articles. It takes you to an article about a town in India. Great, wonderful, really useful. Are they all going to be like this? doktorb words 06:32, 1 June 2008 (UTC)
    ReComment Aliabad was not created by a bot, look at its history. --Samuel Pepys (talk) 06:36, 1 June 2008 (UTC)
  7. While I do have concerns regarding the ability of the project to upkeep all the newly created articles, I believe this project should move forward, though at a pace that others can easily monitor...we're not in a race. What I hope for is a rash of editing by folks wishing to further develop the new articles, whereas they may have never thought to create them before. Huntster (t@c) 07:32, 1 June 2008 (UTC)
  8. Go for it per the longstanding consensus on this type of article but create them at a measured pace as per Huntster. This will be a huge step against systematic bias. Davewild (talk) 07:42, 1 June 2008 (UTC)
  9. While it's a major step, I can't see how this would harm Knowledge in any way. Yes, it would add a whole bunch of articles on topics I don't care about, but then, the vast majority of Knowledge is already about topics I don't care about. Towns have long been considered inherently notable, and I for one, would be glad to have these stubs. Tuf-Kat (talk) 07:45, 1 June 2008 (UTC)
  10. D.M.N. (talk) 07:56, 1 June 2008 (UTC)
  11. If Special:Random is a worry, why not alter Special:Random to de-prioritize articles under a given word count? If it was good enough for the USA to have this with Mr. Rambot, it's good enough for the world with Mr. Thisbot. rootology (T) 07:58, 1 June 2008 (UTC)
  12. Support - if this has already been implemented for US places, then it would create a strong systematic bias not to do so for the rest of the world. If this is not passed, then a review of the US bot and articles it has created will be needed. I would suggest setting a low edit rate (one or two per minute), but this doesn't really matter too much. --GW_SimulationsUser Page | Talk 08:07, 1 June 2008 (UTC)
  13. Support Seems like a good idea to plug a large hole in WP that might never be completed by user contributions. Lugnuts (talk) 08:29, 1 June 2008 (UTC)
  14. Strong Support Haven't seen a single convincing opposing argument (no offense meant to anyone). Yes, at this stage in our development quality is preferable over quantity. But does that mean we should stop creating new articles completely? Obviously not. These articles will be created sooner or later - there are at least two very dedicated users that come straight to mind who create many such articles (I've even created a couple myself). How can one argue that comprehensiveness is a trait to be avoided when writing an encyclopedia where space is not a concern? faithless () 08:39, 1 June 2008 (UTC)
  15. Strong Support Calaka (talk) 08:44, 1 June 2008 (UTC). I already had a spiel above about the advantages of this project, but I shall restate a number of key points for each of you all to think about:
    1. This will not create 2 million articles in one day. The fastest it can go is at 600/hour or 20 weeks if it went from Afghanistan to Zimbabwe. It will not go without pause. This will take time. It will take the entire year and probably go on to next year.
    2. Thousands of articles are written every day. Like it or not the 3 millionth, 4 millionth, Nth millionth article will be written given enough time. This is not about increasing the amount of wikipedia articles just to look all fanciful and stuff. This is about eliminating (partially) the bias that is unfortunately a very real thing.
    3. A bot is consistent. Humans are not. Consistency is efficient. Humans are random. Given that this bot is fully controllable and modifiable, the bot can be arranged so it creates articles now to prevent inconsistencies from occurring later. All these villages and towns will be created sooner or later. The bot will have the advantage of putting them in the correct title, giving them a map, a geo coordinate, correct stub, category and (with the help of another bot -->) correct wikiproject template.
    4. Quality over quantity is nonsense. I propose shutting down 100 language wikipedias and deleting 1 million wikipedia articles for lack of quality if that be the case. We are not saying that this will increase the amount of quality in Knowledge. But if this project is NOT to run ahead, it does NOT mean we are all of a sudden going to get an increase of 50 FA a month to 500. It will stay the same so we might as well add them in.
    5. Vandalism argument does not hold much ground. Just because there are twice as many articles for vandals to "play around with" doesn't mean more vandalism. If that be the case you would have to assume that there will be an increase in vandals proportional to the number of articles. Then you will HAVE to assume increased number of editors due to increasing Knowledge coverage.
    6. EXISTENCE of a town/village DOES indicate notability. These are notable places; it's just that no one could be bothered (or had the wikipedia knowledge...... or even dare I say it: Internet ACCESS! to be able to create them). Furthermore, what is to say that my neighbourhood in my city of a few thousand people (currently a B class article) is more notable than a town in a 'third world country' that no one has heard of that has twice as many people? What has happened in my area that's notable? NOTHING! It just happens to be in a 'first world country'. Bias! My neighbourhood has had no famous people live there and nothing super awesome has occurred there.... It is just a regular neighbourhood with regular people... Notable?
    That is all...
    ps. I am not trying to make this personal at all (so sorry if it seems that way). I just have not felt more strongly about such a wikipedia concept before and if in the off chance this is NOT allowed, it will be a low day for Knowledge. Calaka (talk) 08:44, 1 June 2008 (UTC)
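The pace figure in point 1 above (600 articles per hour, roughly 20 weeks) can be checked with a short calculation. This is only a sketch: it assumes the bot runs continuously with no pauses and that the full list is the 2 million articles mentioned in the discussion; the function name `weeks_to_create` is made up for illustration and is not part of the bot.

```python
# Rough check of the "600/hour or 20 weeks" estimate.
# Assumptions (not from the bot's code): nonstop operation, 2,000,000 articles.
ARTICLES = 2_000_000
RATE_PER_HOUR = 600
HOURS_PER_WEEK = 24 * 7


def weeks_to_create(total: int, per_hour: int) -> float:
    """Return the number of weeks needed to create `total` articles at `per_hour`."""
    hours = total / per_hour
    return hours / HOURS_PER_WEEK


weeks = weeks_to_create(ARTICLES, RATE_PER_HOUR)
print(f"{weeks:.1f} weeks")  # 19.8 weeks
```

Since the comment also says the bot will not run without pause, the 20 weeks is best read as a lower bound on the elapsed time.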
    I'm in favour of countering systemic bias, but creating 2 million stubs on one particular aspect of this is overcompensation, even if it takes a year. Current projections suggest that in a year's time, without this measure, there will be just under 3 million articles. With this measure, even if it only results in 1.5 million additional articles, 1/3 of the encyclopedia will be on small towns and villages. The mistakes made during the Rambot episode should not be repeated. Please scale back this plan. I comment in more detail in the "withhold" section. Geometry guy 18:04, 1 June 2008 (UTC)
    But why don't we fix the Rambot episode if it's a problem? Why don't we put on our hunting hats and MASS delete all the American articles with a population of under 100 or a neighbourhood in a major city. Surely the city is satisfactory enough and there is no need for all the "neighbourhood articles" that plague wikipedia with pictures of barns and fountains and war memorials of a bunch of people in some war....Calaka (talk) 02:05, 2 June 2008 (UTC)
    Most neighbourhoods of US cities will survive an AfD because they have significant populations and reliable sources for the information they provide. As for population <100 articles, take each and every one to AfD with my blessing. Some will survive, some won't; it depends upon the sources and the mood at the AfD. Good luck. Now do you want to create the same issue more than twenty times over? Even more good luck. And by the way, I really am on your side on the broader issues. I'm just being realistic. Geometry guy 03:10, 2 June 2008 (UTC)
    I know that they will survive AfD. Hell I would support AfD as I am probably more of an 'inclusionist' wikipedian than the other type. I don't think there will be an issue. As long as there is a reference, why is there a need to delete them. Oh and see the list of places I have listed above for a number of places I would recommend for deletion had my views been of the "delete everything and keep only certain things" kind. Peace! Calaka (talk) 04:18, 2 June 2008 (UTC)
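Geometry guy's "1/3 of the encyclopedia" projection in the thread above follows from simple proportions. The sketch below treats "just under 3 million" as exactly 3,000,000 existing articles (an approximation) and uses the 1.5 million additional articles from the same comment; `stub_share` is a hypothetical helper name.

```python
from fractions import Fraction


def stub_share(existing: int, added: int) -> Fraction:
    """Fraction of the whole encyclopedia made up of the newly added articles."""
    return Fraction(added, existing + added)


# Assumed round figures taken from the comment: ~3M existing, 1.5M bot-created.
share = stub_share(3_000_000, 1_500_000)
print(share)  # 1/3
```

With the full 2 million additions instead of 1.5 million, the same calculation gives two fifths.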
  16. Support Rambot on steroids. The "notability" arguments are irrelevant, as verifiable population centers have been considered for a while to be inherently notable. Additionally, a project like this would be incredibly helpful for downstream reusers of our content, such as Google Earth. However, I'd be happier if the following two feature requests are done:
    1. A log of entries added separated by geographical region or country, so relevant WikiProjects can verify the additions and update their worklists.
    2. Also, I can't find anywhere where it says how articles are going to be created. Will you use alphabetical, geographical or even random order to throttle the creation of these articles? More documentation would be nice. Titoxd 09:10, 1 June 2008 (UTC)
    I am willing to go to a number of relevant wikiprojects and advise them of the creation of articles related to that particular country (I just advised WP:Afghanistan, as when this bot is ready it will be the first country to have all the articles written) and this will be good as they will be more in the know of the articles and possibly even add extra info on to them. As for your second point, I am not too sure myself of any documentation but you can just look around the page here at: Knowledge:WikiProject Missing encyclopedic articles/Places and see if you can find any info. I believe all the Asian countries will have their towns/villages/cities added first in alphabetical order and then followed by Africa. China, India and Russia are all on hold as there is a massive chunk for them (which might constitute a lot of the missing articles). Calaka (talk) 09:26, 1 June 2008 (UTC)
  17. Support in the strongest terms. This is the best means of addressing geographic systemic bias that we've done yet, and remedies an issue that was raised when Rambot was run years ago. Rebecca (talk) 09:41, 1 June 2008 (UTC)
  18. I support and especially like this reason: "A new user wishing to write about one of these places won't have to figure out how to start a new article (the infoboxes for places can be complicated)." This is extremely helpful. It allows all town articles to be standardized with correct infoboxes and coordinates and it is very helpful for a new user to not have to create a new article but simply expand on one that is already created. A random user who searches for their town and sees it doesn't have an article will probably move on to other things, but if they see their town does have an article they may expand it. I agree the two million number is daunting, and I also agree very small villages probably don't need their own article, but if it's true that every single little village and town in the United States has its own article then it would definitely help make the English Knowledge more global and more culturally neutral if we applied the same standard to every country. LonelyMarble (talk) 09:52, 1 June 2008 (UTC)
    Another thing I thought of along the same lines: this will allow many town articles to have a high position on a Google search which is yet another way for them to more likely be expanded. But this does have a down side, it could cause a lot more vandalism on these new pages and a lot of information added that is unreferenced. But I don't think vandalism or unreferencing should deter the project because any page on Knowledge will have those problems. LonelyMarble (talk) 09:59, 1 June 2008 (UTC)
  19. support Strongly. Every UK/US place like this gets an article so surely foreign countries deserve articles. Anonymous101 (talk) 10:08, 1 June 2008 (UTC)
  20. Support. For all the reasons given above - this is in line with well-established notability policy, and merely extends to the rest of the world what has already been done for the US and UK.--Kotniski (talk) 10:27, 1 June 2008 (UTC)
  21. Strong support. My supporting arguments have all been mentioned, but the strongest ones have to do with the fact that these are articles that should be created anyway, and the stubs will now be created in a much more efficient way, leaving the humans to do what they do best, enter actual content, rather than spend time doing things like looking up geographical coordinates (and quite likely get them wrong). Also, this means that the number of "bad" (badly formatted, written in unintelligible English etc) geographical stubs created will decrease. I think this bot is a fantastic idea. --Bonadea (talk) 10:37, 1 June 2008 (UTC)
  22. Support this is a very good idea, as it will fill holes in Knowledge's coverage and reduce systematic bias. To answer some objections: yes we should be aiming for quality and not quantity, but that doesn't mean we can't have both, especially as it will take very little effort to create these articles. Are the bot programmers going to spend their time writing FAs instead? No. Regarding the argument that there is not much information in these articles, there will be more information than Knowledge currently has - none. When a bot was used to create articles on US towns and villages the articles were expanded, and the same will apply here. It will be harder to detect vandalism on these articles, but we never consider vandalism to be a reason not to add content, a lot of vandalism (possibly even most) is detected through Special:Recentchanges and not watchlists, and African villages will not be high priorities for vandals. The argument that bots shouldn't create articles is clearly flawed, as this is certainly not the first bot to create articles. There is a longstanding consensus that real settlements with people living in them are notable, and one of these articles would certainly pass AfD. I see no reason not to proceed, and plenty of reasons to. Hut 8.5 10:43, 1 June 2008 (UTC)
  23. Support assuming enough can be said about each town created to make it at least a good article, which shouldn't be a problem. --Mark J (talk) 10:44, 1 June 2008 (UTC)
  24. Support As a new pages patroller, I often come across badly written articles that someone has just put up about their local town. If such an article already exists, the contributions of such people will be better. This proposal would greatly benefit the community, and the fact that such articles already exist for US/UK means they simply have to exist for the rest of the world as well. Knowledge is not a paper encyclopedia. ninety:one 10:50, 1 June 2008 (UTC)
  25. Articles on these minor towns and villages are usually stubs anyway, even when manually written. Do it, says I. -- Anonymous Dissident 10:52, 1 June 2008 (UTC)
    Note that many of the missing places are actually towns with several thousand people, particularly in Africa and in heavily populated countries like Bangladesh. Believe it or not, I have started many missing articles on cities in Africa which have a population of around 115,000 people. In Bangladesh, for instance, we only have about 40 articles out of the 27,000 that will be created. There must be thousands of articles missing on solid towns, even cities, with tens of thousands of people. 2 million is actually only a small proportion of what is likely to exist in full, and most of the settlements are likely to be villages of several hundred people, which as we know are notable by wikipedia standards already. Basically, I use articles like Agnam-Goly, which are exactly the sort of places missing in their hundreds of thousands in places like Africa and Asia, as proof that such places have their own stories to tell, and the first step towards making things happen is to set these articles up and acknowledge their existence. Whether people like the idea of mass stub creation or not, I think it sends out a good message to those who think wikipedia is infatuated with fictional content and ignorant of real world content. Well, nothing can be more "real world" than covering places in the world we live in. ♦Blofeld of SPECTRE♦ 11:48, 1 June 2008 (UTC)
  26. Strong support - Ninetyone explained it well. Garion96 (talk) 10:58, 1 June 2008 (UTC)
  27. Support. --Heron (talk) 11:17, 1 June 2008 (UTC)
  28. Support SGGH 11:30, 1 June 2008 (UTC)
  29. Strong Support: This is a massive step forward for wikipedia in terms of tackling the bias towards the English-speaking nations. Great job at getting this going! --Borgarde 11:44, 1 June 2008 (UTC)
  30. Support option 4 (preferred) or "as-is" - Option 4 (create merged mini-articles for all villages on articles about townships) sounds best to me, but I'm alright with the larger version, too. What about adding a namespace like Geo: for them? If anyone ever expands the article, they could be moved into mainspace. Or not. Doesn't really matter. The only reason why Geo: or something like it might be nice would be to help with the Random Article issue (suddenly every n-th article is one of these geo-stubs instead of something interesting). I especially like the fact that articles would be live and ready for anyone to edit, rather than having to be created from scratch (and risk being speedy deleted by overzealous new article watchers). --Willscrlt (Talk) 12:02, 1 June 2008 (UTC)
  31. Strongest Support: This will somewhat reduce the systemic bias in wikipedia. Moreover, wikipedia has this goal to culminate in the collection of all human knowledge available for free. Of course, this bot is a good step towards the goal! There is no policy that all articles need to be created by human beings, that rationale sounds pretty romantic though :) It's ultimately the total body of information that matters, not who writes it. Regards.--Dwaipayan (talk) 12:07, 1 June 2008 (UTC)
  32. Hemmingsen 12:11, 1 June 2008 (UTC)
  33. Support per bot operator. Zginder 2008-06-01T12:23Z (UTC)
  34. Strongest support possible per my comments above. I do find it funny that some think that fictioncruft like Doctor Who is more notable and important than improving geographical coverage for places in the developing world. Some of the articles that are not yet even written could be brought up to GA or FA standard. Think about that before opposing a move to make this the most geographically-complete encyclopædia in the world. Regards, EJF (talk) 12:30, 1 June 2008 (UTC)
  35. Strongly support the bot as it is. I really like Blofeld's rationale, that we can always expand, but a stub is a great place to start. Also, the argument raised above of WP:NOTPAPER is applicable. I hate this long convoluted discussion that never goes anywhere... Alex Muller 12:32, 1 June 2008 (UTC)
    me too. The time spent on council discussions while important in some cases the amount of time spent here could easily be put into improving the existing "stubs" that we have. But I guess this issue is particularly important hence the discussion ♦Blofeld of SPECTRE♦ 12:34, 1 June 2008 (UTC)
  36. Strong Support these places are inherently notable. It is going to be very useful to have separate articles for these (rather than lists), since they can be easily referenced in other articles/events/lists etc. It will also invite new editors to expand them. This kind of effort needs to be undertaken in each field (not just geography) so that Random Feature does not just yield geography articles. I was the bot-owner for the bot that created thousands of towns in India. We had gone through a similar discussion at the time. Thanks, Ganeshk (talk) 12:45, 1 June 2008 (UTC)
    Fully agree with you Ganeshk about this being implemented in other areas. Well put. People seem to think that wikipedia in terms of article content is nearly complete, in fact it is just starting and there are many areas which need serious development to even up the encyclopedia. Once the encyclopedia begins to cover topics evenly and address systematic bias then we can try to make every article full quality and hopefully improve quality as they are being created. There seems to be some misconception that wikipedia is a set encyclopedia and should be complete. Knowledge has always been about building an encyclopedia. As for curbing growth of articles because every article should be fully developed, wikipedia will double, triple and quadruple in size over the coming years guaranteed anyway whether there is a bot or not, it is up to us to try to develop each article and ensure an encyclopedia of the finest quality and coverage. ♦Blofeld of SPECTRE♦ 12:57, 1 June 2008 (UTC)
  37. Support Has to be a good idea. --Michael C. Price 12:58, 1 June 2008 (UTC)
  38. Support - would go a long way to countering systemic bias by covering areas of the world that are not adequately represented at the moment. As Blofeld has said, some of these missing places are major towns in their areas. Stubs make it a lot easier for people to add information who would otherwise be put off by the thought of creating a new article. --BelovedFreak 13:00, 1 June 2008 (UTC)
    Not to mention consistency and creating two million references articles with infoboxes that wikipedia is yet to achieve in seven years..... ♦Blofeld of SPECTRE♦ 13:07, 1 June 2008 (UTC)
  39. Great idea I appreciate the comparison to Rambot's US places: all the doomsday predictions about vast levels of vandalism to these pages didn't happen to the US places, even though when those articles were created, there were far fewer editors and admins (and I don't know, were there any antivandal bots then?) to keep track of this. Systematic bias is a big thing, too: we've got tons of US place articles, and I'm working on expanding them slowly (see my recent contributions), but I (and probably tons of other geography-focused editors, too) don't have the sources for non-US places like I do for US places. Nyttend (talk) 13:01, 1 June 2008 (UTC)
  40. Support - quality, referenced stub with infoboxes? Brilliant! I feel sure that this will encourage the expansion of many of these stubs into interesting and full articles. guiltyspark (talk) 13:03, 1 June 2008 (UTC)
  41. Strong Support; I have always thought that, though Knowledge will always be unfinished, one of the project's achievements will be in creating most possible articles in some defined areas. Articles on places are some of the easiest to create, and this proposal goes a long way to addressing the inherent asymmetry in that process, as well as speeding it up and making it more uniform. Of course, by minimising the western bias in geography articles, this creates a new bias favouring geography articles as a whole, but I think that bias of discipline is less unacceptable than bias of nationality. Ross 13:03, 1 June 2008 (UTC)
  42. Support A great idea. If it can be coupled with appropriate Wikiprojects following up and expanding the articles, even better. --Malcolmxl5 (talk) 13:07, 1 June 2008 (UTC)
  43. Strong support. If possible, I think it would be good idea to do a 10,000 edit run, and re-assess, and then do a 100,000 edit run, and re-assess. This will give the community scheduled times to retrospectively determine how well it worked, which should allow minor nits to be discussed without anyone jumping on the big stop button. John Vandenberg 13:11, 1 June 2008 (UTC)
    It essentially is throttled due to the need for human intervention in checking the data, per the spec above Fritzpoll (talk) 13:13, 1 June 2008 (UTC)
  44. Complete support Will go a long way to correcting the lack of coverage of non-English speaking countries. Yes, a lot of them will be stubs for a long time, and yes there is quite some scope for undetected vandalism, but in the long run it will be beneficial. Martocticvs (talk) 13:16, 1 June 2008 (UTC)
  45. Support. If something is notable enough to have an article we shouldn't care who wrote the first stub on it, a human or a bot. Shanes (talk) 13:22, 1 June 2008 (UTC)
  46. Support I suggested the idea in the first place. I'm an Editorofthewiki 13:24, 1 June 2008 (UTC)
  47. Strong support, although: would it not be possible to hide these places from "random article" until someone else than the bot has edited them at least once? --Aqwis (talkcontributions) 13:35, 1 June 2008 (UTC)
  48. Support This bot is awesome. I have faith in wikipedians and believe that we'll see some of these new articles become GAs and FAs soon enough. Wrad (talk) 13:43, 1 June 2008 (UTC)
  49. Support It might not be possible to manually create all these articles - so we will have these articles within a short span of time, they can be slowly upgraded manually as time goes by. Around The Globe 14:21, 1 June 2008 (UTC)
  50. Absolutely. RyanGerbil10(Kick 'em in the Dishpan!) 14:30, 1 June 2008 (UTC)
  51. Strong Support This bot will seriously help out, since being a town or village is, in my opinion, the closest thing you can get to automatic notability. Kivar2 (talk) 15:27, 1 June 2008 (UTC)
  52. Support - helps to combat systematic bias.--PhilKnight (talk) 15:41, 1 June 2008 (UTC)
  53. Support, Implement bot as written, we're not paper and this will help fight our developed-world/USA bias. Tim Vickers (talk) 15:45, 1 June 2008 (UTC)
  54. Support, a big step towards a comprehensive and unbiased encyclopedia. --Skizzik 16:12, 1 June 2008 (UTC)
  55. Strong Support I absolutely 100% approve of this and it's been a long time coming. This will greatly improve Knowledge's quality as a reference work.Orange Tuesday (talk) 16:30, 1 June 2008 (UTC)
  56. Support - Will go a long way in combating large amounts of systematic bias on Knowledge, and a bot will set the basis for a good article with an info box, references section etc. - which will be very helpful in the long-run. A lot of the articles may remain stubs for a long time - but I do ultimately find encyclopaedic information better than no information. Camaron | Chris (talk) 16:36, 1 June 2008 (UTC)
  57. Support - All the articles the bot creates will be notable, and a basic stub is better than nothing at all, so there is no reason not to implement the bot in my opinion. - MTC (talk) 16:50, 1 June 2008 (UTC)
  58. Support very strongly indeed. I am staggered at the ignorant parochialism of comments such as that those on en.wikipedia are all English native speakers living in the English speaking world and who would be interested in places outside the English speaking world anyway. Truly staggered. This proposal can actually counter systemic bias and be a real step towards our actual goal, which is educational. Especially in this modern age of jet travel it feels such an appropriate suggestion and I am really rooting for it to be accepted. Thanks, SqueakBox 16:56, 1 June 2008 (UTC)
  59. Very Strong Support - I agree with most of the assessments above regarding the fact that inherent notability is a given, and this is such a useful tool to get these articles going. I have no doubt that once run, many articles will see immediate expansion. I'm looking forward to seeing it run. Carter | Talk to me 17:00, 1 June 2008 (UTC)
    Articles can be quickly expanded in two minutes flat. There seems to be some idea that they will never be expanded. Imagine how an article like Kushgag will look once proper information becomes available on the web. I fail to see how this doesn't have any benefit for the encyclopedia at all ♦Blofeld of SPECTRE♦ 17:05, 1 June 2008 (UTC)
  60. Support This will counter systematic bias. Vandalism can be picked up by RC patrollers. --Patar knight - /contributions 17:12, 1 June 2008 (UTC)
  61. Support - I've read the arguments here, and the proposal itself (which it seems few objectors have done!), and think this can do nothing but good.--Ukslim (talk) 17:20, 1 June 2008 (UTC)
  62. Support with the caveat of running it only on countries where there is some interest from editors to do the followup work. This will slow things down a little and with User:Blofeld of SPECTRE being so prolific a writer he and his gang will do a lot of that followup work. In fact using the bot to do the grunt work allows for the more detailed work to be done by humans. Agathoclea (talk) 17:22, 1 June 2008 (UTC)
    By involving the wikiprojects before creation, we hope to achieve this very aim. It should also help with the watchlisting situation, since editors from those projects will be more inclined to watchlist these pages and work on them Fritzpoll (talk) 17:26, 1 June 2008 (UTC)
  63. Support Especially since the opposers don't seem to know that we already have many articles created by bots, for plants and animals, and other places. GlassCobra 17:31, 1 June 2008 (UTC)
  64. Support, though I would like it to also do what I brought up below as it would save all kinds of time on the part of the members of various country WikiProjects. ···日本穣 17:39, 1 June 2008 (UTC)
  65. Support. A wonderful tool to help make Knowledge more complete and more useful. Dovi (talk) 17:50, 1 June 2008 (UTC)
  66. Strong support. Helps add content on things Knowledge should cover. Articles that exist are more likely to be improved. Increasing the size of Knowledge is not a compellingly bad thing. --Alynna (talk) 18:01, 1 June 2008 (UTC)
  67. Yep I would rather see a topic covered in a stub fashion than not at all. Similar has been done with the U.S.; the rest of world should get the same respect. §hep¡Talk to me! 18:02, 1 June 2008 (UTC)
  68. Support - why should they NOT be created? They will be better sourced than many other articles... Plrk (talk) 18:12, 1 June 2008 (UTC)
    The most important thing is that they are all well referenced or linked and consistent and avoid any of the mess and unevenness that humans generate - for evidence of this, look through most of the country cities on wikipedia: the vast majority are very inconsistent or unreferenced, and these articles will be created anyway over time. I wonder how many people could say that the two million articles we currently have are either referenced or consistent. ♦Blofeld of SPECTRE♦ 18:23, 1 June 2008 (UTC)
  69. Support - bot created articles have been of great value to Knowledge. Provided the places are notable - and any generally recognised inhabited places is very likely to be - then let's create the articles. We have as long as is needed to improve them, as has happened with the articles on places in the U.S. Warofdreams talk 18:17, 1 June 2008 (UTC)
  70. Bloody good idea. naerii - talk 18:22, 1 June 2008 (UTC)
  71. Support - Expanded coverage = better quality. We'll have a good base for any editor who wants to make an article about any village anywhere. Looking forward to the expansion. Okiefromokla 18:25, 1 June 2008 (UTC)
  72. Strong support - Scarian 18:29, 1 June 2008 (UTC)
  73. Support in the interest of complete coverage. We have already done this for U.S. towns and villages, and should do the same for the rest of the world as well. JBsupreme (talk) 18:38, 1 June 2008 (UTC)
  74. Strong Support Towns are verifiable, and hence should be included in Knowledge. I don't see any reason to reject new articles, that's what Wikipedia is all about! --Falcorian  18:41, 1 June 2008 (UTC)
  75. Support English and French small towns get attention, other towns from other countries need the same attention. Randomblue (talk) 19:04, 1 June 2008 (UTC)
  76. Support. It's the same as Rambot, but international - how could we refuse? But it's sad that someone thought it would be a good idea to deal with this issue by summoning Pollzilla, Destroyer of Consensus. rspeer / ɹəədsɹ 19:12, 1 June 2008 (UTC)
    1. If it were the same as Rambot, I'd have no problem, but all we have for most of these is a set of coordinates. Rambot had a healthy set of census data to work from.--Prosfilaes (talk) 19:22, 1 June 2008 (UTC)
    You're right. The Ethiopia and "Gnaa, Nigeria" examples have convinced me that this needs to be run on a country-by-country basis. rspeer / ɹəədsɹ 22:53, 1 June 2008 (UTC)
  77. Support. The Rambot articles are useful. These would be too. Phil Sandifer (talk) 19:38, 1 June 2008 (UTC)
  78. Support. Let's just do this and stop getting bogged down in analysis paralysis. The time we waste discussing this could be better spent creating and improving articles. --Nricardo (talk) 19:57, 1 June 2008 (UTC)
  79. Support - this isn't really any different than what rambot did. However, limiting it by additional criteria such as population is a good idea. It would be hard to argue for the notability of a town of under 10,000
    This is actually completely unlike what rambot did. RamBot created several-paragraph-long, referenced articles with considerable geographic and demographic information. It did not create one-line stubs from an error-ridden data source, as this proposal says it will. --Delirium (talk) 20:16, 1 June 2008 (UTC)
  80. Support — I'm fine with it; if this turns out to be a colossal failure, it's not too difficult to get a bot to remove the articles. Maxim(talk) 20:20, 1 June 2008 (UTC)
    That is one of the most interesting comments I've seen all day. Kind of like if the article is still a sub stub in exactly a year use a bot to remove it. Hopefully this would encourage more articles to develop it sort of scenario. There is of course no obligation that we are stuck with any article forever. Things can change dramatically. That's interesting.... Mmm. But it would be nice to know that every article could be fully developed relatively quickly and not feel under pressure to develop every one myself. ♦Blofeld of SPECTRE♦ 20:27, 1 June 2008 (UTC)
  81. Support millions of perfectly good new articles. I do not believe the reader exists who, upon searching for a topic, would prefer nothing to one of these pages. If they do exist, they are certainly the exception to the rule. Christopher Parham (talk) 20:38, 1 June 2008 (UTC)
  82. Support The pages that would be created under this project would otherwise most likely be created by new users and/or non-native speakers of English. New page creation is a potentially burdensome step for these editors that would be greatly alleviated through the project. While the potential problems noted earlier in the conversation regarding vandalism may be valid, the other side of that coin is that adding useful material to an existing page is less of a hurdle than starting a new page. This would also provide a standard starting point in terms of format, content, and appearance that most topics would have benefitted from had this kind of activity been possible at the beginning of WP. For the sake of both completeness and consistency, more such projects should be undertaken. Jim Miller (talk) 20:42, 1 June 2008 (UTC)
  83. Support. Technopat (talk) 20:53, 1 June 2008 (UTC)
  84. Support. As an editor on tropical cyclones, I often write on locations worldwide (at least near the coastline). Often coming across redlinks that cannot have an otherwise appropriate link, I would certainly find a use in its implementation. I can see the argument of quality over quantity, but why not both? Worst comes to worst, we could create articles like List of cities in X, but only after the stubs were made (so to make sure we get all of them). ♬♩ Hurricanehink (talk) 22:03, 1 June 2008 (UTC)
  85. Support with reservations. I support this project in theory, but believe there are many hurdles to overcome before it can be implemented. I am commenting here even though polls are evil because I'd like to have my thoughts on record. Because, as others have written, this bot is probably a once-in-Knowledge occurrence (it's much harder to write over bluelinks with confidence), it is important that WikiProjects are involved and even more important that we are confident we are incorporating all available data sources before article creation takes place. I realize that the involvement of WikiProjects has been planned from the beginning, but I want to emphasize how long it will probably take before this bot can go live with any table to start actually making articles. I have been investigating the availability of census data from Cambodia, and have found that while the data is encouragingly complete, placename spelling is hugely inconsistent across data sources. This will take hours and hours of human effort to sort out.
    I am also sympathetic to the objectors' belief that it is not desirable to create many stub articles that will likely not be expanded. I think the notion of some population limitation may be sensible, below which merged articles might be created. Even assuming that sources for the developing world come online, it is unlikely that there will be much to be said about many villages whose populations number in the 100s. As a pure administrability issue, it seems prudent to create higher-level merged articles. These could be more easily monitored and (with the appropriate redirects) provide identical coverage. Today, I created Kouk Romiet as a mock-up of what such an article might look like. Of the 20 villages in this commune, census information was available for 18 and coordinates for 4.
    All this being said, if the question were this bot or nothing, I would support this bot. Go to Kouk Romiet and click on the coordinates next to one of the villages, then zoom in on the google maps satellite view. Doing this summoned some sort of nameless feeling of wonder and awe inside me, seeing that now the rest of the world may be able to know of and be connected to all sorts of people living in similar tiny villages in rural Cambodia. I think this is what wikipedia and the internet are all about. Mangostar (talk) 22:38, 1 June 2008 (UTC)
  86. Strong Support, per many of the reasons discussed above.-Polotet 23:37, 1 June 2008 (UTC)
  87. Strong support not just to counter systematic bias but as a generally great idea. Let's get articles created on these places with appropriate infoboxes etc ASAP! It will be way easier to standardize Knowledge's coverage of places before the articles are created than after. Kalkin (talk) 00:16, 2 June 2008 (UTC)
  88. Strong support Just increase our article count? But someone's going to have to create those articles anyways. -- penubag  (talk) 00:54, 2 June 2008 (UTC)
  89. Support It's a good idea. It will help to spur an influx of information into those articles and wikipedia. I mean these new city/town/village articles will eventually come into existence anyway, so why not just get a head start and create them. El Greco
  90. Support. I believe this is a good idea, and will enhance Knowledge. Polymath618 (talk) 01:15, 2 June 2008 (UTC)
  91. Per my BRFA approval of the bot. (This message is my opinion only, not a BAG statement.) dihydrogen monoxide (H2O) 01:19, 2 June 2008 (UTC)
  92. Support Definitely the right move. Razorflame 01:20, 2 June 2008 (UTC)
  93. Support As I understand the proposal, I think the idea to be good, perhaps with a review to be scheduled in 30 days. rkmlai (talk) 02:54, 2 June 2008 (UTC)
  94. Support - standardized mini-stubs with infobox will be 200% better than mini-stubs created by uncoordinated newbies and other users. Do consider population limits. Renata (talk) 04:02, 2 June 2008 (UTC)
  95. Strongest possible support. A wonderful tool to help fight systematic bias and increase our coverage. If we can have articles on middle-of-nowhere towns in Maine, then there's no reason not to have articles on equivalent locations elsewhere in the world. Celarnor 05:09, 2 June 2008 (UTC)
  96. Support as described at the top of the page. There's a world of difference between editing in a field where you know all the articles are there (even if stub) and one where you may or may not find an article for the given subject. The reference generated is not specific enough, though: WHICH databases were each article pulled from? --Alvestrand (talk) 05:22, 2 June 2008 (UTC)
  97. This cannot in good conscience be opposed, otherwise we are allowing systemic bias in Knowledge. Andre (talk) 06:18, 2 June 2008 (UTC)
  98. Support, improved coverage is a good thing. Standardized stubs for places are also, in my opinion, a good thing - providing a framework on which to build. DuncanHill (talk) 09:07, 2 June 2008 (UTC)
  99. User:Krator (t c) 11:57, 2 June 2008 (UTC)

Users currently withholding support

Users that would support under modification or specific criteria

  1. Will support if lists/tables/sections with redirects are used for really small towns. Mike92591 (talk) 17:27, 1 June 2008 (UTC)
    I would support creating articles for towns above a certain population (to be determined later), with the rest in lists with redirects, or just lists and redirects. Mr.Z-man 06:13, 1 June 2008 (UTC)
  2. Villages and small towns in the English-speaking world, maybe, as they are likely to be created at some stage anyway. The majority of stubs about villages and small towns in the rest of the world—including Europe and Latin America—will never be accessed. Scolaire (talk) 06:29, 1 June 2008 (UTC)
    How would this mitigate systemic bias? If anything it would just further it. --Rory096 07:29, 1 June 2008 (UTC)
    Furthermore, either villages and small towns are notable, or they aren't. If you are happy for all English-speaking ones to be created without further comment, why are these more notable than those ones in countries where people don't speak English? Fritzpoll (talk) 11:41, 1 June 2008 (UTC)
    This is English Knowledge. People will read/edit articles about their own village or town, their family's, their friends' etc. But the majority of villages in the non-English-speaking world will never be accessed, as nobody on en.wikipedia will have any reason to ever have heard of them (unless, of course, the bot will create stubs on villages in French-speaking areas on fr.wikipedia only, etc.). And I said nothing about systematic bias. I am talking only from an efficiency/worthwhile-ness point of view. Scolaire (talk) 16:51, 1 June 2008 (UTC)
    Yes but this is not wikipedia of the English speaking world. Your opinion that most of the even Latin American and continental European is both daft and plain wrong, and actually it may come as a shock to you but wikipedia is an educational project and geography an educational subject and the rather silly comment that people will only access places they already know directly contradicts the educational goal. Thanks, SqueakBox 17:06, 1 June 2008 (UTC)
    Squeakbox, we were asked for comments; I've offered mine. I think that using a bot to create millions of articles on the principle that someone at some time might want to look up some of them is daft (and BTW I didn't say "places they don't already know", I said places they have no reason ever to have heard of, from a geography, education or any other point of view). That's not a reason to patronise me or lecture me. I've registered my vote. Let's not clutter up the discussion further. Thanks, Scolaire (talk) 18:52, 1 June 2008 (UTC)
  3. Create articles by order of population and stop at X, then halt and wait for a new consensus of whether to continue or not and on how that continuance should be conducted. Starylon (talk) 06:58, 1 June 2008 (UTC)
  4. I support the suggestion of JJB above. Start with towns with large populations, and gradually move on to smaller ones. -- Nudve (talk) 07:07, 1 June 2008 (UTC)
  5. Per John J. Bulten - controlled, measured growth. Vishnava talk 07:18, 1 June 2008 (UTC)
  6. Agree with starting with largest populations and working down, stopping at each order of magnitude. I would also like the bot to not create any stubs–like its example–that don't provide a population number, as that is pointless. Such stubs should be nominated for speedy deletion as they don't assert notability even by the generous rules that all populated places are notable. Phlegm Rooster (talk) 07:28, 1 June 2008 (UTC)
  7. While I'm still not thrilled with the idea, I do understand the argument to do this to fight systemic bias. I'm withholding support in the hope that some sort of population limits will be implemented, at least to get the project started. Newsboy85 (talk) 07:56, 1 June 2008 (UTC)
  8. Per comments at #Reference Issue automated construction of article need to be perfect examples of a good article, even if it is a stub. Example User:Fritzpoll/GeoBot/Example does not even meet the most basic requirements of WP:V and WP:CITE#HOW with a reference link to National Geospatial-Intelligence Agency a Knowledge page! Jeepday (talk) 13:21, 1 June 2008 (UTC)
  9. Same opinion as Z-man. Complete mini-articles about all the villages, containing the infobox and everything, should be merged into the township articles. So we get (say) 10,000 relatively substantive new and expanded articles with the same information content, ease of access and ease of editing, instead of 2 million substubs. These can always be spun off later if the article develops from substub to stub (and we can deploy a bot to detect when this happens).--Pharos (talk) 13:27, 1 June 2008 (UTC)
  10. Options two and three seem to be the best. I'm not sure what this is going to do to our random articles or our server lag, though. I would support if the bot will make articles that qualify for sufficient notability, and if sufficient sources can be found, unsuitable articles can be put up for PROD, AfD, or CSD. ~AH1 15:15, 1 June 2008 (UTC)
  11. This is a great idea, but two million articles is too many. A handful of people living in a place doesn't make it notable on its own. Start with the towns with the largest populations, and relegate those below a certain threshold to lists. I wonder, though, whether the threshold should be relative rather than absolute, to ensure that all major towns in each country are covered. ThreeOfCups (talk) 15:24, 1 June 2008 (UTC)
  12. This idea is fine in principle, but the devil is in the details. I think most of the problems could be addressed if the proposal was scaled back by a factor of 10, say. In particular I am against the automatic generation of 2 million new stubs on any topic. Much as I admire the dedicated work of those working to counter systemic bias, this should not be done at the expense of the readership, the wider interests of the encyclopedia and contrary to policy. The rambot exercise was badly done, and the 100000 articles on US villages are stuff that exists as a consequence of that.
    This discussion may be over-influenced by the political nature of the systemic bias issue. I suggest thinking of other examples where this idea applies: for instance, would we support a bot to create stubs on each of the 350000 described species of beetle? Systemic bias comes from the demographics of the editorship. However, to some extent, this reflects the demographics of the readership: this is the English language Knowledge after all. It is a laudable goal to educate a readership mostly coming from North America about the rest of the world and Knowledge can do a much better job at this than traditional encyclopedias, but this should not be taken to an extreme. 2 million articles on world villages would seriously imbalance the encyclopedia, affecting, for instance Special:Random.
    Contrary to popular belief the number of articles is not growing exponentially and there are rough limits on how big it will get: it is limited by what can be attributed to reliable secondary sources. I think there are serious notability and verifiability concerns about articles on tiny villages. We do not normally allow websites as reliable secondary sources, even (or perhaps especially!) governmental intelligence agencies. Furthermore, it is proposed that this data is checked and corrected by Knowledge editors, who are by definition not reliable sources. On the other hand 200000 articles would focus on the more notable places, for which it is more likely that reliable secondary sources can be found, and is a much more reasonable scale for an endeavour of this kind. Geometry guy 16:22, 1 June 2008 (UTC)
    Have you even considered that we are missing thousands of articles on towns in Asia and Africa with over 10,000 people, many over 50,000??? ♦Blofeld of SPECTRE♦ 18:03, 1 June 2008 (UTC)
    Of course I have! That's why I would strongly support a proposal to create by bot c. 200,000 stubs on the more populous or notable places. I'm with you in the basic mission, just opposed to the scale. I know I wrote a lot above, but your reply suggests you didn't really read it. Geometry guy 18:14, 1 June 2008 (UTC)
  13. I would support with a population limit, for many of the reasons given above. I can't see above that anyone has mentioned the large number of duplicate articles this will almost certainly cause, given the varying naming conventions existing articles have. Let's face it, few of 2m new articles will be checked manually in the foreseeable future. I am also concerned at the effect 2m bot-articles will have on public perception of WP. Unlike some, I think the rate of growth of new articles will slow considerably, as many types of article are largely covered. If this goes ahead we will gradually cease being "the encyclopedia written by schoolkids" to our critics, and become "the encyclopedia written by computers" which will be even worse. Johnbod (talk) 17:09, 1 June 2008 (UTC)
  14. Withholding support on current implementation plan, as above. JJB 17:28, 1 June 2008 (UTC)
  15. I think this is a great effort, but as mentioned above I think it should start with large villages only (if up to a million articles or more is fine with me). Then I think we should see if those articles have progressed any or have proved notable, and then decide if we want to go further. I also wonder why Maplandia.com was chosen out of all the sites possible. I am sure people will put a lot of time and thought into the template. Danski14 17:36, 1 June 2008 (UTC)
  16. Withholding support per comments above. Would support starting with larger towns first, as a trial run. Are there samples of the proposed bot pages available? Pete Tillman (talk) 17:42, 1 June 2008 (UTC)
    Answering my own question, it's difficult to see how an article such as Langar, Badakhshan is worth including in Knowledge. Pete Tillman (talk) 18:10, 1 June 2008 (UTC)
  17. After skimming the discussion above, I agree with the proposal to create lists for the places with small populations, and redirects to those lists from the placenames -- that will help keep maintenance of these articles manageable, and any village that someone actually has something to say about beyond population and location can eventually have a full article developed about it. I've been coming across hundreds of apparently bot-generated substubs about French locations that are apparently utterly abandoned, completely orphaned, unreferenced, and with little prospect of being expanded in the next decade. Creating even another 2 million articles like that (so that half of Knowledge's content would be undeveloped place stubs) seems unproductive. Maybe with lists we could limit the total number of articles produced to some tens of thousands instead of 2 million, and each new article would be more likely to have someone interested in taking care of it, too. -- Avocado (talk) 17:55, 1 June 2008 (UTC)
  18. I'm fine with the general idea of bot-created articles, and I'm not really that concerned about thousands of perma-stubs. Some information is better than nothing. However, Llywrch brings up some interesting points about the reliability of the National Geospatial Intelligence Agency that merit further discussion. (His comments are probably the most valuable in this whole debate, yet have been mostly ignored.) Zagalejo^^^ 01:12, 2 June 2008 (UTC)
    I think that site is reliable? We should try and find more places where we can reference just to satisfy more people :) Calaka (talk) 02:27, 2 June 2008 (UTC)
    Llywrch (who's started lots of articles on Ethiopian settlements) seems to have some concerns about it. Zagalejo^^^ 02:50, 2 June 2008 (UTC)
  19. While I like the idea, I'm withholding on the basis that this may flood Special:New Pages. If the bot can avoid Special:New Pages somehow (not sure if that is even possible), I would be in full support. --Michael Greiner 01:15, 2 June 2008 (UTC)
    I think they can flag the bot created articles so that the users looking at new pages don't get the tsunami effect of all these newly created articles. Calaka (talk) 02:23, 2 June 2008 (UTC)
  20. My concern is whether Wikimedia server can physically handle 2 million article creation in a short period of time. (FYI: Total number of articles on ALL wiki is approximately 10 million.) SYSS Mouse (talk) 02:01, 2 June 2008 (UTC)
    Of course it can SYSS Mouse. Knowledge is a beast and it can eat anything you throw at it. But that is not a concern AT ALL as the articles will NOT BE ADDED IN A PERIOD OF ONE DAY. Rather they will be added over a period of this year and most of next year. I bolded/capitalized for emphasis :). Calaka (talk) 02:23, 2 June 2008 (UTC)
  21. With certain criteria. I can't support flooding the site with millions of stubs, most of which won't have much in them at all. If they meet a certain population criteria, or have inherent notability, then yes. Otherwise, no. I am very concerned about the impact of adding articles on every small town in the world. Enigma 06:22, 2 June 2008 (UTC)
  22. I oppose creating the two million substubs, no matter whether it is done literally overnight or over months. You click the link to the article, wait 3 seconds, and you see one sentence about the village. You want to view information about a nearby village, so you click again, and wait for 3 seconds again. What a waste of time! Creating thousands of lists (or merged mini-articles) of all villages is a much better idea. This way we don't lose any information, and browsing will be much easier. You can always split the section into its own article if it gets bigger than, say, 4 kilobytes. --Acepectif (talk) 08:12, 2 June 2008 (UTC)

Users who oppose entirely

Oppose I still feel that existence does not equal notability. As a new pages patroller I struggle daily against a rising tide of non-notability where Knowledge instead of being an encyclopedia is just a weird replacement for the internet itself, where almost all things already have an entry. The country doesn't matter to me; the notability of the entry does. This will just be an automated and officially supported mass-dilution of an already melting encyclopedia. "Hey, I drove through a town with three people in it. Does it have a Knowledge entry? Yes! And what does it say? Hmmm, almost nothing. Yay!" Multiply that experience by 1,000,000 and you'll understand how I feel. Rob Banzai (talk) 05:52, 1 June 2008 (UTC)
3 people??? Most of these places have several hundred people at the very least. In the third world many of the places are likely to have several thousands of people. We are missing thousands of articles on towns in places like Bangladesh and all across Africa, some of which have a population of over 50,000. Not the 3 people settlement you are visualizing. ♦Blofeld of SPECTRE♦ 10:22, 1 June 2008 (UTC)
As far as I understand the proposal, the bot does not intend to create articles only for places that have several thousands of people, but for all places. Hence even places with only 3 people would have an article created by the bot. SyG (talk) 13:12, 1 June 2008 (UTC)
Only 28,000 google maps locations out of a possible 638,000 places in India, and many other countries are likely to be the same. Is this what you call adding an article on every place?? 2 million articles may seem huge but I guarantee that is a far cry from covering every place, hamlet or dwelling, as this proves. ♦Blofeld of SPECTRE♦ 13:22, 1 June 2008 (UTC)
Agreed, 28,000 is not everything. I still think, however, that these 28,000 should be filtered to list only the notable ones. There are various criteria (population, length, ...) that could be used to confer automatic notability. The other ones (i.e. non-notable through the automatic criteria devised) could go into lists. SyG (talk) 13:30, 1 June 2008 (UTC)
I can see that Blofeld is going to hound anyone against this idea to the ends of the earth so I rescind my oppose vote. Why not just make a stub for everything that ever existed anywhere, just in case? Oh right, we already have something for that: THE INTERNET. Rob Banzai (talk) 14:33, 1 June 2008 (UTC)
But I thought that Knowledge was supposed to be the most comprehensive site on the web or in the entire world. Besides, if this bot is rejected Blofeld and myself would simply create the 2 million articles manually. It might take ~10 years, but I hate the ignorance that is shown here in regard to treatment of articles. I'm an Editorofthewiki 14:49, 1 June 2008 (UTC)
Probably that is a critical point of disagreement. I suspect a lot of persons who oppose this Bot believe that Knowledge is supposed to be an encyclopedia, not "the most comprehensive site on the web". For my part, I think there is a huge distinction between those two notions, and I would tend to think the most comprehensive site on the web already exists and is called Google. SyG (talk) 14:59, 1 June 2008 (UTC)
I agree with SyG on this. Wiki is not Google. Wiki is not the Internet. Wiki is not an atlas. I cannot see how a robot creating 2 million articles "on our behalf" will help human editors keep Wiki close to its original purpose. If you want to create the 2 million articles manually, go ahead, but as with all one-line stub articles, they may be subject to AfD if they cannot be shown to be notable, just as the Bot articles would doktorb words 15:12, 1 June 2008 (UTC)
  1. Oppose This seems quite the most bizarre idea I have ever read on Wiki. It seems impossible to follow the discussion in full, but a lot of the points seem to have been made very clearly above. To include millions of stub articles "just because we can" is like writing the words "Shopping List" at the top of every sheet of a writing pad. It may well serve a purpose at the time, but come three days later when you want to write a job application, you've just a writing pad full of potential shopping lists. I guess this proposal is already too far "down the line" for the likes of me to have much sway on the matter, but I have grave concerns about the direction Wiki is taking here. doktorb words 06:18, 1 June 2008 (UTC)
    Addressing systematic bias by giving other same size places in the world as in the United States an equal chance to develop compared to a shopping list??? Now that's bizarre ♦Blofeld of SPECTRE♦ 10:33, 1 June 2008 (UTC)
    You have missed my point entirely. If you write "shopping list" on every page of a writing pad, you have a perfectly good pre-prepared item for jotting down all the little bits of shopping you may need in the future. But when you need to write a job application, all you have is a writing pad you've written "shopping list" on. In other words, it's fine to create 2 million articles as a "just in case", but when you want to use Wiki as an encyclopedia, all you will find is nothing more than you could get from an atlas or a Google search. doktorb words 15:16, 1 June 2008 (UTC)
    The current systematic bias could be solved in other ways, e.g. by deleting articles on non-notable US towns. The bot could also create other forms of systematic bias, e.g. by stating implicitly that any hamlet in the world is notable, while fictional characters are not. SyG (talk) 13:15, 1 June 2008 (UTC)
  2. Oppose Wiki is not paper, but it isn't a dustbin either. As argued above, it will enhance Knowledge's quantity but not its quality. I am strongly in favour of taking the German Knowledge as an example and creating no more robot articles at all. Steinbach (fka Caesarion) 07:23, 1 June 2008 (UTC)
    A dustbin? That's how you refer to articles that cover the world that can be expanded?? What about all those lists of fictional anime characters etc that wikipedia has in the thousands. ♦Blofeld of SPECTRE♦ 10:19, 1 June 2008 (UTC)
    A list with thousands of instances is easier to manage than thousands of articles. For example, if a list with fictional anime characters is non-notable it can go to an AfD, in a much easier way than thousands of dispersed articles. SyG (talk) 13:18, 1 June 2008 (UTC)
  3. Resounding oppose - an incredibly bad idea. Knowledge aims for quality, not quantity; flooding with loads of robot-written articles that amount to nothing is valueless but harmful. ╟─TreasuryTag (talk contribs)─╢ 07:09, 1 June 2008 (UTC)
    And yet all these articles will be referenced--How many articles can say that? A majority, but over 100,000 don't. They will be expanded by myself and Blofeld. I'm an Editorofthewiki 14:51, 1 June 2008 (UTC)
    Two million articles expanded by two people? --Michael White 21:02, 1 June 2008 (UTC)
    Don't make it sound like no one is going to want to work on these articles. It is a whole lot easier to fix a slightly messed up article than create a new one from scratch. --NickPenguin(contribs) 23:58, 1 June 2008 (UTC)
  4. What do you want Knowledge to be? This has a few neat side-effects—such as dishing out coordinates for many more places to third parties that use coordinates in articles—but I don't think we're an atlas. How do you envision these articles developing? If someone with familiarity with a village adds some information to its article, it will be "unreferenced" and therefore de-valued by the community. If little to nothing "encyclopedic" can be said about these locations, do we really want another 2 million permanently empty stubs? Regarding systemic bias, you could just as well argue that this approach is an affront to the cultures we're supposed to be engaging. People counter bias, not bots. –Outriggr § 08:20, 1 June 2008 (UTC)
    "If little to nothing "encyclopedic"". Who are you to judge the world that you don't know about?? Doesn't it seem strange that a full article can be written about an unincorporated village in America which is very encyclopedic but that the same sort of size place in other countries isn't "encyclopedic". "People counter bias not bots"?? A bot is merely a tool to emulate a human editor in creating new pages but far more rapidly and consistently. It is then the responsibility of editors to expand and develop these articles, but the unevenness of which suggests gross systematic bias. We write the articles for sure, but actually a bot is the first step initially to counteracting the bias in coverage on here that humans generate. Quite the opposite to what you've said here ♦Blofeld of SPECTRE♦ 10:30, 1 June 2008 (UTC)
    If an unincorporated village in America has an article while being non-notable, it should go to AfD. If a place in another country is notable but does not have an article, then an article should be created. The common point of these actions is they are both human, because notability is judged by humans. If we want to be sure that no "important" (in terms of size) town is left without its article, then some size-checks could be incorporated in the bot. SyG (talk) 13:27, 1 June 2008 (UTC)
    But most of the articles in America became well developed over time; why wouldn't the same happen to African villages? IF YOU BUILD IT THEY WILL COME. I'm an Editorofthewiki 15:03, 1 June 2008 (UTC)
    So you are saying that is probably best if we allow 2 million stub articles to be created on the promise of some kind of "magnetic attraction"? As has been said before, Wiki is not an atlas. Some of these villages have not had articles created because no one feels they are notable enough for the English Wiki. The idea that building one-line stubs will act as a beacon for future editors seems rather naive, with all due respect. doktorb words 15:10, 1 June 2008 (UTC)
  5. Resounding oppose - Thousands of taggers stand ready to improve these articles by adding false and unverifiable information about villages in Myanmar, Central African Republic, India, Paraguay, and Taiwan. The Guinness Book of Records stands ready to honor the editor with the greatest number of undetected Knowledge vandalisms. Lou Sander (talk) 08:39, 1 June 2008 (UTC)
    See, you admit it yourself-- there are more assessment taggers than article builders. Just look at the 100 articles the bot has already created--they are all referenced. I'm an Editorofthewiki 15:03, 1 June 2008 (UTC)
    I mean "taggers" in the sense of graffiti artists. You know, the guys who like to leave a mark just to piss off the authorities. The bot may be fine, but what about the vandals who post that "Dik Limber is the village headman," with a bogus reference to the leading newspaper in Papua New Guinea? Ain't nobody gonna catch 'em, IMHO. It'll turn into a really exciting hobby. Lou Sander (talk) 18:49, 1 June 2008 (UTC)
  6. Oppose. I don't like the idea of having 2 million articles created by a bot. It is a natural process to have real persons to create articles. These persons are also the most likely to watch the articles for accuracy and relevancy of content being added. It is also quite natural that there are more articles on small towns and villages located in the English speaking world at the English Knowledge. Much the same as there are more articles on Swedish villages at the Swedish Knowledge. It is a matter of supply and demand. I do admit that this bot creates useful and well-formatted stubs/articles, but I don't think it is within the scope of this project to create a complete database of every settlement in the World. --Kildor (talk) 09:28, 1 June 2008 (UTC)
    Why not? The Rambot was not a person, The KotBot was not a person. You admit people couldn't give a darn about most of these--but we are an all-purpose encyclopedia. They are all accurate and sourced.
    I don't know about the Rambot and KotBot, but I still don't think articles should be created by bots. Yes, they are certainly accurate and sourced, for the moment. But who is going to watch them? --Kildor (talk) 15:32, 1 June 2008 (UTC)
  7. Oppose - (Don't support generating the articles or any of the options originally listed. Lists would not be as bad as individual articles.) Would like to widen the debate by considering other options. How about running the bot at user request to create individual entry when a user wants it (make it available/publicize it/use a template to invoke it/whatever). See my more detailed comment below #Generate content dynamically - run bot on user requested page only Zodon (talk) 09:27, 1 June 2008 (UTC)
  8. Oppose. The articles created by the bot are effectively conglomerations of elements of some database, it seems. Creating millions of articles will completely unbalance the striving for a reasonable overall article quality. It has to be expected that the vast majority of the articles created will never get any further edits or attention. Indeed, as pointed out by somebody above, quality, not quantity should be the aim of WP. One thing I would support is a bot which creates such a stub article at the request of an editor, but not automatically. Jakob.scholbach (talk) 10:33, 1 June 2008 (UTC)
    A complete database?? Who said 2 million articles is anywhere near covering the world?? For instance we have 28,000 google maps for India. According to the 2001 census there are 638,000 settlements in India alone. What will be added is far from a complete coverage, they are the main towns and villages from each country which is not a full coverage. If we were to try to cover the world 100% I'd imagine we'd be looking nearer 10 million new articles. 2 million is actually only a proportion of places which google has shown up to be a "notable" place. ♦Blofeld of SPECTRE♦ 10:38, 1 June 2008 (UTC)
    Then maybe we could focus the discussion on deciding whether all places shown in google maps are inherently notable ? Because if there is consensus on that, it seems most of the arguments against the bot would fall off. SyG (talk) 13:33, 1 June 2008 (UTC)
    Populated places are inherently notable anyway so no debate needed ♦Blofeld of SPECTRE♦ 13:38, 1 June 2008 (UTC)
    I would tend to agree for cities and villages (even if I am not aware of a clear Knowledge Guidelines stating that), but not for hamlets. Is there some evidence that the 28,000 places in google maps are at least villages, and not hamlets ? SyG (talk) 13:50, 1 June 2008 (UTC)
  9. Oppose First off, Knowledge is not an indiscriminate collection of information, so there is no reason to have an article about every last one-man hamlet (don't think they call them hamlets across the world, but you get the idea) in the world. I took a look at the example article, and it's not referenced, it contains minimal information, and I'm sure they are all going to be similar to that. As people have said, Random article would be a mess with en.wp's size doubled or better. Who wants to click on it and 3/4 of the time get some eight-foot wide CDP? I don't know how close we are to filling up Knowledge's servers, and, as Blofeld of SPECTRE said, we might end up with 10 million or more articles. This would easily slow down everything, and possibly limit how many potential FAs and GAs we write. Sure, I would like to see stubs on larger towns and cities across the world. I agree with Jakob.scholbach that it would be better for the bot to create a stub on request, not by the millions. I also don't like the idea of an automated bot writing our articles. What good are we as editors then? Heck, come a few years from now, I bet we're going to see bots developing featured content. Also, I would imagine CSD and AFD would be backlogged with these articles on the non-notable locations. And, unlike the majority of the articles that currently exist, most of the stubs can't be expanded past stubs, as there won't be any more information. I am amazed at how advanced our programming and bots are getting, but I don't like this idea one bit. Juliancolton 11:52, 1 June 2008 (UTC)
    What I don't understand is why people automatically assume that because there is a great deal missing that they are naturally hamlets of 3 people. Not at all. Note that actually many missing are actually towns with several thousand, particularly in Africa, and heavily populated places in countries like Bangladesh. How can you just pass them all off as "non notable"?? Believe it or not I have started many missing articles on cities in places in Africa which have a population of like 115,000 people. In Bangladesh for instance we only have about 40 articles out of 27,000 that will be created. There must be thousands of articles missing on solid towns, even cities with tens of thousands of people. I guarantee the average place started is likely to have hundreds of people living in them or at least qualify as a notable inhabited settlement above a hamlet level. Why are you so locked out of the idea that actually something could be written for them over time? For instance India has 28,000 settlements on google maps. Actually there are 638,000 that exist. This for me proves that the articles started are the main towns and villages in these countries. ♦Blofeld of SPECTRE♦ 12:03, 1 June 2008 (UTC)
    Blofeld of SPECTRE, you have badgered almost every oppose so far, saying basically the same thing. I understand you have your opinion, but other people have their own, and you don't need to try to prove everybody wrong. Also, I didn't say all of them are going to be non-notable. But if we have an article about every last documented settlement in the world, some of them are bound to be non-notable. As I have heard other people state, if the Indonesian Knowledge doesn't have an article about a certain village, there is no reason at all for us to have one. Juliancolton 13:08, 1 June 2008 (UTC)
    The example page being cited was the first go! The actual trial created articles with a better sourcing. As for your comments regarding technical matters, I assume these were already considered by BAG who approved the bot's operation. Plus, we don't worry about performance when adding content. I have commented on the usefulness of more information in my response at the top of the page Fritzpoll (talk) 12:00, 1 June 2008 (UTC)
    Using the poor development of another wikipedia, particularly one in a country like Indonesia, is one of the lamest criteria for notability that I have ever seen. A city with a population of 400,000 is a one liner on Indonesian wikipedia so that means, ahhh, not much to know about that then. I'm not doubting that a lot more can be written about some places than others but I'm sure you would find that actually a lot could be written into the encyclopedia on just about any populated place. This has been proved by the many full articles which are even FAs on villages or small places... ♦Blofeld of SPECTRE♦ 13:14, 1 June 2008 (UTC)
    Ah, but the only reason we have FAs on small villages is because we have living people to work on them. The bot is just going to create stubs...short, useless stubs with no more information than can be found on a map. My oppose still stands, especially per Knowledge is not an indiscriminate collection of information in WP:NOT. And while there are indeed going to be notable towns that don't have articles, there are going to be far more with three people. You tell me what good it is going to do by increasing Knowledge's size by more than 2-fold with articles about those places. The only thing it can do for us is fill our servers, backlog our processes and stress the system as a whole. And consider this; with millions and millions of more articles, we're going to have twice as much vandalism. Who's going to spend all day reverting the doubled vandalism rate? Juliancolton 14:56, 1 June 2008 (UTC)
    I've got a fairly strong opinion about this. There are a lot of places in the world where they are possibly unable to create their own pages due to lack of availability of computers. How are they to get their voice on WP? This is not making WP an indiscriminate collection. It is giving people a voice who would not otherwise have one. Carter | Talk to me 17:05, 1 June 2008 (UTC)
    But in the vast majority of these small locations throughout the world, the only people who have ever heard of them are the persons that live there. And if they don't have computer access to write the articles, why do we need them? There is an entire internet for people to write blogs, articles, and news stories about every last bit of information in the entire world. Knowledge is not within that scope. I can't imagine seeing two million one-sentence stubs about non-notable locations in Encyclopedia Britannica. So, since we are just as much an encyclopedia as Britannica, why should we include all that information? Juliancolton 17:18, 1 June 2008 (UTC)
    I don't think you should say, "And if they don't have computer access to write the articles, why do we need them?" Implying that someone who doesn't have computer access' opinion doesn't matter is silly to me. Everyone's opinion belongs as long as it's not vandalism. Everyone should have a say. Carter | Talk to me 17:29, 1 June 2008 (UTC)
    Of course, everyone's opinion is important. Likewise, everyone should indeed have a say in life. However, Knowledge is not the place for this. We're an encyclopedia, not a blog, not a forum, and not a map. People aren't specifically going to be voiced by having an article about their village, so I don't see what the big deal is. Juliancolton 18:35, 1 June 2008 (UTC)
  10. Oppose Citing WP:NOTE in the case of tiny villages; and in the current vandalism climate, articles about notable places should have their defenders. What better way to ensure this than to wait for a human to create the page? In a bid to expand Knowledge beyond the world view of Wikipedians, the quality and reputation of the encyclopedia will be damaged as a result. One lot of censuscruft shouldn't excuse another. Where are all the ADW people? -- Regregex (talk) 12:55, 1 June 2008 (UTC)
    Where did you get the idea they are all tiny villages??? We are missing tens of thousands of articles on towns with thousands of people in. Read the comments above. Not only are humans spending hours checking out these places and planning things first but humans have proved to be extremely useless at creating consistent articles with references for towns and villages, I've spent weeks cleaning up people's mess and trying to make articles by country consistent. The bot is the best way to start them initially and improve considerably the chance of expanding them ♦Blofeld of SPECTRE♦ 13:09, 1 June 2008 (UTC)
    Never said they were all tiny villages. Got your point about consistency but Knowledge is not a data harvester, it needs the human touch. Search engines are the place for scraped automated statistics. To find them in an encyclopedia supposedly hand-crafted by humans is pretty insulting -- way to say "Nobody cares about your town." A non-existent page would be better. -- Regregex (talk) 13:43, 1 June 2008 (UTC)
    Well such an outlook that "nobody gives a damn about your town or village in the world" is precisely the ignorance that has seen wikipedia develop in the way that it has, seeing the world from an Anglo-Centric viewpoint rather than actually how the world should be represented evenly. There are already thousands of articles on here which show or in a search engine as "automated" when on wikipedia we have half decent articles with images. Why would it be impossible eventually for people to expand articles and show we are better and more valuable than the other sites? ♦Blofeld of SPECTRE♦ 13:50, 1 June 2008 (UTC)
    Blofeld, I fully respect the view you are making, but it does not stop those of us who oppose this Bot and its work from being very worried about the consequences of such a massive article creation programme. You have made it very clear that there are thousands, maybe millions, of potential articles in line for creation. If, say, a tenth of these are one-line stubs with an infobox, is that really an improvement? Surely we can get the articles you desire - for every place and settlement on earth - without having to create millions of one-line summaries. Why is it so apparently and seemingly important for this Bot to be let loose on the project, creating articles "on our behalf"? Are human editors no longer required, because that is how it is starting to look to me. This Bot is a very worrying phenomenon and I think we all have the right to express our concern doktorb words 13:59, 1 June 2008 (UTC)
    Note that these articles will be created over a period of a year or more and not all by tomorrow morning. A lot of work still needs to be done. Furthermore I have to say that having "something" is better than "nothing".Calaka (talk) 14:25, 1 June 2008 (UTC)
  11. Oppose How useful is it that people will type in the name of a village and get ... nothing. Then Knowledge will be regarded as a big empty pointless space. This is a community not a computer-generated space to be coloured in. There's a very good reason why the english-language wikipedia is dominated by articles relating to the english-speaking world. If you think about it long enough you'll understand. Why not have a bot make a list, and have an automated creation of one of these perfectly-created pages as soon as someone thinks it's notable enough to be created? Almost-instinct 14:08, 1 June 2008 (UTC)
    Note that there is information provided in these articles on location, administrative area and coordinates. Census information along with other sources can be added in due course. The information is available; unfortunately few editors are willing to volunteer to do this work.
    What is the good reason that our coverage on non-English-speaking places is so woeful? Just because an encyclopædia is written in English does not mean its contents must only relate to people and places in the English-speaking world surely? To answer your final point, you have proposed the system that is already in place. The bot makes a list, we have decided on the notability of villages and towns (that is, they are notable) and then the articles are created. Regards, EJF (talk) 15:32, 1 June 2008 (UTC)
    If, for instance, a typical, unimportant Spanish village is notable enough to have an entry in an encyclopedia, then - in the majority of cases - the people who think it is notable will be Spanish-speaking. The natural place for the article written by these people is NOT the english speaking wikipedia. Or do we think that something has not been noted, unless its been noted in English? Almost-instinct 15:50, 1 June 2008 (UTC)
    Why not? Forgive me, but it appears you are promoting systematic bias, rather than opposing it? How can notability only be confined to one language? It sounds as though you do not think that a subject is notable, unless it has been noted in English. Do we want to break down barriers in the world or build them? Notability transcends language, culture and nationality. EJF (talk) 19:36, 1 June 2008 (UTC)
    No I will not forgive you: I am not promoting systematic bias. I'm pointing out that the English-language wiki will naturally show a bias towards those things that concern all those that speak English; the Spanish-language wiki will naturally show a bias towards those things of interest to Spanish-speakers. It seems to me that people think that Information available in English transcends language. Shame we don't have Edward Said to hand Almost-instinct 19:46, 1 June 2008 (UTC)
    What you have just described is systematic bias; have you no wish to allow us that want to, to counter it? And what would Mr Said have said about the matter? I fail to understand how a Spanish village can be notable in Spain but not in England. English is no more a superior or inferior language to any other, so why should information that is in other languages not be allowed to be in English? EJF (talk) 19:55, 1 June 2008 (UTC)
  12. Oppose Maybe the incrementalist in me is speaking here, but I simply see no point in creating hundreds of thousands of de facto permastubs that no one but a bot wanted to have. If people want to write about an obscure location, they are free to create an article, and no one will mind. Let humans be the selectors, not stupid bots. (I am however neutral on the bot creating lists of locations with incoming redirects.) – sgeureka 14:15, 1 June 2008 (UTC)
    Have you not read any of the proposals???? The content generated is sorted by humans. ♦Blofeld of SPECTRE♦ 14:19, 1 June 2008 (UTC)
  13. Oppose I think the discussion about this Bot is a wonderful opportunity to review entirely our Notability Guidelines about places, and to decide once and for all (or at least once for a while) proper criteria conferring automatic notability on places. Running this bot would be an implicit assumption that all places are notable by mere existence. This should be discussed and turned into a Knowledge Guideline before we act (or not). SyG (talk) 14:22, 1 June 2008 (UTC)
    Again, it is widely accepted that populated places regardless of size are inherently notable. I've seen hundreds of sub-stubs far worse than the bot-type articles thrown out of AfDs with a resounding keep because of WP:SNOW. ♦Blofeld of SPECTRE♦ 14:42, 1 June 2008 (UTC)
    If that is the case, it should be made a Knowledge Guideline. Only then can we allow a Bot to create automatic articles for places without appropriate scrutiny. SyG (talk) 14:54, 1 June 2008 (UTC)
    See Knowledge:Articles_for_deletion/Common_outcomes#Places, "Cities and villages are acceptable, regardless of size". Regards, Ganeshk (talk) 15:43, 1 June 2008 (UTC)
    I'm inclined to agree with this sentiment. If we can gain a formalised expression of this apparently "widely accepted" principle (please don't take this as a dispute that it exists!) then the bot will have more of a leg to stand on, so to speak. Oli Filth 15:18, 1 June 2008 (UTC)
  14. Oppose I decline to give my reasons (though they exist) because virtually every editor who has so far has been "answered" in a badgering, hectoring manner that I don't care to open myself to. Allowing discussion is great, but the same person making the same arguments over and over, whatever is said, is not discussion. Cheers, Lindsay 14:44, 1 June 2008 (UTC)
    Hear hear! Since my points didn't get refuted I can't decide if that means that my contribution was (a) fantastic or (b) beneath contempt ;-) Almost-instinct 14:49, 1 June 2008 (UTC)
  15. Oppose In the nicest possible way, I fail to see what useful purpose this can serve. As far as I understand, what we would be condoning here is data-mining and re-presentation of information that's already readily accessible from other directories, on an unprecedented scale (as far as Knowledge is concerned). Well, we already know that Knowledge is not a directory. The overwhelming majority of this information will never be expanded upon (i.e. the majority of these articles will remain as stubs for an exceedingly long time), so statistically, it's not as if this would be sowing the seeds of greater things to come (quite the opposite from an administrative point of view, as far as I can see). And as for arguments of systematic bias, I feel that would be much better served by getting rid of some of the already-existing US-oriented stubs. So other than the feel-good factor of vastly increasing the article count, and the show-off factor of "what other encyclopaedia lists two million villages?", I just don't see any benefit here. Oli Filth 15:01, 1 June 2008 (UTC)
  16. Oppose Per SyG. Let's decide what is and is not notable. Hamlet with a population of 3 = non-notable (and pace Rambot, I wouldn't mind at all if tiny U.S. towns were deleted, either). IronDuke 15:15, 1 June 2008 (UTC)
  17. Oppose I thought I'd better go on record since my name was mentioned previously in a positive context. I think Rambot was a mistake, and I think this would be a much bigger mistake. Since only very few people (perhaps nobody) will be able to make regular edits to millions of articles, collaboration and incremental improvement to the template will be severely restricted. So the quality will generally be very poor. This is exactly what we saw with Rambot, we had 30,000 articles built from a template which was written and updated by a single person. The template was poor, and this poor quality impinged on public perception of the whole encyclopedia. -- Tim Starling (talk) 15:36, 1 June 2008 (UTC)
  18. Strongest Oppose We need to work on our pre-existing problem of thousands of stubs before throwing another two million on. Reywas92 15:39, 1 June 2008 (UTC)
  19. Oppose let's focus on quality rather than quantity. In addition, as noted by somebody else above, I doubt many of these places (especially smaller towns and villages) are actually notable on their own. Better to cover them, when necessary, on the corresponding municipality's article (if any, and only when actually useful). --Angelo (talk) 16:23, 1 June 2008 (UTC)
  20. Strongly oppose -- WP needs to increase its coverage of non-Western issues, but bot-generated stubs are not the way to do it. I would be willing to consider creating pages for collections of locations (villages, etc.) that are approximately article length with each notable location redlinked. I think the automatic addition of articles on American geographical locations was also a bad idea--most of the locations would have had articles created eventually anyhow, but now the prose is written around a bunch of census data that doesn't need to be there. -- Myke Cuthbert (talk) 16:48, 1 June 2008 (UTC)
  21. Strong oppose: Harebrained at worst, misguided at best. Who's watching all these useless, one-sentence pages? Cannot see any justification for adding 2 million substub articles, I'm sorry but an infobox is not content. Rambot created tens of thousands of articles, with minimal, albeit more, content and years later there are tens of thousands of articles that have nary another edit, languishing in obscurity to say "Bob Smith likes men" for all time. I think the amount of time and resources that Wikipedians have has been badly overestimated. Sure they're notable but doubling the size of the project is a bad idea unless you plan on doubling the number of active editors, it took what, 7 years to amass 2 million articles? And now we are going to double that overnight? Doesn't make much sense to me. (Also I don't mean overnight literally, so spare me the lecture). IvoShandor (talk) 17:19, 1 June 2008 (UTC)
  22. Strong oppose. It may be possible to create articles in an automated fashion, but it should be recognized that Knowledge is maintained by humans. Already now, we have rather too many articles to maintain; about 16% are flagged for cleanup, and the backlogs seem endless. We rather need to focus on improving existing articles than creating new ones. At the moment, the scope of Knowledge is at least limited by the time of people who create articles - with mass creations by bots like these, the idea of having a maintained encyclopedia becomes entirely ridiculous. --B. Wolterding (talk) 17:22, 1 June 2008 (UTC)
  23. Oppose Awful, awful, AWFUL idea. I don't see the utility in adding millions of pages that are nothing more than a one-sentence description of a red dot on an orange map. The overwhelming majority of these articles will NEVER be expanded beyond that, and no one's going to be looking up a 43-person town in western Zimbabwe, anyway. I'm all for countering systemic bias, but this doesn't do that. Systemic bias is not determined by what has articles in WP, but by how things are covered in WP. The article for New York City will ALWAYS have more information than 99% of the articles threatening to be created by this bot. This is a horrible idea -- plain and simple. — MusicMaker5376 17:35, 1 June 2008 (UTC)
    Just like Kushgag. Yeah there will be no editors at all willing to actually write any decent article whatsoever. Automatically created by the bot of course means that they are locked from ever developing for life. How do you think wikipedia ever started? What we have today is precisely the result of limited starter articles ♦Blofeld of SPECTRE♦ 17:51, 1 June 2008 (UTC)
    We currently have 2.4 million articles, and barely enough people to destub the articles we currently have. Doubling that number -- no matter how slowly -- without the concurrent bump in personnel that would normally come with it is a bad idea. A bad, bad idea. — MusicMaker5376 19:47, 1 June 2008 (UTC)
  24. Strongly oppose — This proposal is a massive accession to the other stuff exists argument. Many of the more reasonable supporters claim that adding a stub article on every village in the world would counter systemic bias in Knowledge, apparently because we have already made such a robotic accounting of US places. Some of the less reasonable ones have argued that this accounting is rightful compensation for our tolerating "fictioncruft" or whatever proliferation of articles on characters from various books and games; that this self-serving position has a place alongside the other argument suggests to me that the whole rationale is weak. Why should we create an article on a subject in which it is not known that a single person has an interest? I don't believe that the concept of "notability" could ever apply to such an article: the word itself is an inherent claim of interest to some people. A priori, the only people interested in an article on a random village are the inhabitants of the village, and for many places, in fact this is the permanent extent of interest. More people live in the apartment building across the street from me than live in many of the places that would get an article under this proposal; furthermore, that building is better documented than these places, its residents have more personal and professional connections than their residents, and it quite possibly exemplifies numerous aspects of artistic, architectural, and urban planning paradigms, not to mention historical trends, that no sleepy hamlet anywhere in the world could lay claim to. Yet it would be absurd to write an article on every structure in New York City. We must decide when writing the article that the subject is interesting; otherwise, we rightfully delete articles which concern, for example, not-yet-notable academic figures (who might nonetheless produce good stuff in five years). This proposal is also a knowing violation of the spirit of "Knowledge is not a dictionary". 
An article on a random town, containing only population information and location, with negligible chance of ever being more than this, is exactly the same thing as an article on an obscure term containing only its definition. To my mind, the prohibition on dictionary definitions is a statement of what an encyclopedia should be: not an enumeration of facts (which are merely true, something that is the case for more statements than we would ever consider including here), but a compilation of knowledge: facts which reflect research, conversation, the interaction of ideas, human intellectual accomplishment. If we often miss this mark, that is no excuse for tossing the idea entirely. Ryan Reich (talk) 17:49, 1 June 2008 (UTC)
  25. Oppose – millions of unreferenced stubs with no further oversight, watchlisted by nobody and prone to vandalism. And I don't quite understand where the documentation comes from. Is it really that reliable, without mistakes? Most unlikely. Thus it is an extremely dangerous proposal. Colchicum (talk) 17:51, 1 June 2008 (UTC)
    + problems with romanization. Colchicum (talk) 17:55, 1 June 2008 (UTC)
    Unreferenced stubs???? Dear God have you even read the proposal? Is three links not adequate still? ♦Blofeld of SPECTRE♦ 17:54, 1 June 2008 (UTC)
    Are the links really independent from each other? And what about the rest (growing number of stubs nobody really watches, where even obvious vandalism can persist for months, let alone sophisticated mis/disinformation)? Colchicum (talk) 18:02, 1 June 2008 (UTC)
  26. Strong Oppose - decline to give reasons, due to badgering and heckling, per Lindsay above. Fin© 18:19, 1 June 2008 (UTC)
  27. Oppose; if we have real information, like the census information we used for American cities, then we should create the articles. Until then, we have nothing more than trivial mentions going into articles that an atlas would handle much more concisely.--Prosfilaes (talk) 18:22, 1 June 2008 (UTC)
  28. Oppose The proposal is a classic 'because we can' and not 'because we must'. 95% of the created articles will remain stubs until the heat death of the universe, or until Jimbo sells wikipedia to Microsoft, whichever comes first. A monumental waste of effort, and a positive damnation of all the people who can't see it as such. Seriously, of all the things that need doing to articles that already exist, and they think this is a priority? Get some freekin perspective. MickMacNee (talk) 18:27, 1 June 2008 (UTC)
  29. Oppose. Databases are wonderful things, and so are encyclopaedias, but the two are different. Knowledge is not an indiscriminate collection of information, but it would be if it contained exactly the same information as a database (minus the ability to perform advanced queries, of course). Jakew (talk) 18:32, 1 June 2008 (UTC)
  30. Strong oppose, the Rambot stubs are bad enough, let's not have more of them. Such listings may perhaps be workable as a "List of places in X", as this meets the almanac argument (we incorporate some features of an almanac, but an almanac would not devote a full page to a non-notable village of three people, they would list it in a table). If the bot intends to create millions of permastubs, I cannot and do not support it. Seraphimblade 18:37, 1 June 2008 (UTC)
  31. Strong oppose per above. Knowledge already has far too many neglected stub articles that need to be fleshed out. A Bot creating millions more on non-notable hamlets and such is a step backwards. And the badgering and hectoring displayed above only increases my conviction that this is being pushed by editors who want their way, rather than what's good for Knowledge. Shawn in Montreal (talk) 20:09, 1 June 2008 (UTC)
    Like developing Canadian films? The first person who can provide an argument that I haven't tried to do my uttermost hardest to improve wikipedia for everybody is clearly mistaken. If I didn't have a plan to vastly improve wikipedia in the long term I would seriously not bother doing what I do here and would leave. ♦Blofeld of SPECTRE♦ 20:17, 1 June 2008 (UTC)
  32. Oppose. I would support country-by-country addition of articles if a data source could be found for the country that: 1) had some reasonable reliability; and 2) had enough content to produce at least a minimal stub article with some demographic information. I wouldn't support mass addition of nearly empty articles from data sources with lots of errors. --Delirium (talk) 20:12, 1 June 2008 (UTC)
  33. Strong oppose - creating two million extra stubs means that 99% of them will never be improved past their micro-stub and frankly useless status. The "List of places in X" is acceptable, but the giant amount of stubs is not. I've also stressed that the administrative backlog that this will create from mass deletions is practically untenable, as is the strain we're going to put on our recent changes patrollers in dealing with the flood of vandalism. That and misinformation suddenly becomes far easier to do, especially considering that most recent changes patrollers will take it as a good faith edit and leave it alone, only for it to sit there forever, which will be the case with practically all these articles. I still think this proposal goes against WP:NOT#INFO and WP:NOT#DIRECTORY also. Sephiroth BCR 20:16, 1 June 2008 (UTC)
  34. Oppose - a terrible idea - we should be deleting/redirecting unmaintained stubs, not creating new ones. Now, if you wanted to just create them all as redirects to List of places in X articles or to the name of the city/county/state/whatever that they are in, that's a more useful endeavor. But having unmaintained content out there just makes it more likely that vandalism will go unnoticed. --B (talk) 20:39, 1 June 2008 (UTC)
  35. Oppose, I've looked through some of the articles created in the bot's trial run, and they only give the country, province, latitude, longitude and elevation. The latitude and longitude are approximations (Darayem, Afghanistan is typical, but most of the time Google maps can't get in that close), and sometimes have the village perched on a mountainside, or in the middle of a river (Chasnud-e `Olya, Dudgah). The bot would be flooding Knowledge with incorrect information. Fee Fi Foe Fum (talk) 22:42, 1 June 2008 (UTC)
  36. The more I think about this, the more I don't like it. Even creating only lists isn't all that great. What's wrong with just letting Knowledge expand at a natural rate? The Volapük Knowledge was heavily criticised (and IIRC almost deleted and restarted) for using bots just like this to create thousands of stubs (I think they were even geographical places too). Why would we want to bring that here? Knowledge has content based on what its users want, because it's created by its users. This will likely lead to a systemic bias toward English-speaking areas. I think that, as a problem, it's overhyped. I would venture a guess that the Arabic Knowledge is biased toward the Middle East and the Russian Knowledge is biased toward Russia. It's perfectly natural and, while it does need to be addressed somewhat, using a bot like this as some sort of full frontal assault against bias isn't a very good idea. We also have to consider how this will affect Knowledge's image. We generally make a press release for every X millionth article; how is it going to look when we release 2 of those a month apart and the articles are near-identical bot-created stubs? Yes, we'll have 4 million articles, but only half will be created in the real wiki tradition. Storden, Minnesota (a random rambot article) is a good example of what's likely to happen to most of these articles - since it was created almost 6 years ago, 17 of its 21 edits have been by bots, only 1 human edit has actually added any new text, 1 sentence about a highway, and even that wasn't done until earlier this year. Mr.Z-man 00:06, 2 June 2008 (UTC)
  37. Strong oppose. The idea of inherent notability is inherently repulsive. Not every little boondocks in every little backwater deserves an article. Take my island, for example. 14,000 people scattered in about two dozen settlements, and we don't all agree on where one starts and the other stops. One day I noticed that Boven Bolivia got its own article, and was quite surprised to find that there was a notable settlement 5 km from my house. Of course, it's not notable ... it's an old farmhouse. I'd never heard anyone call it by name, although I eventually located the old plantation name on a map. Apparently important enough for Knowledge.
    The article on Bonaire shows the right way to handle trivial locations: a short list in a parent article. If the parent article isn't notable enough to exist, then the subordinates shouldn't exist either. Giant lists of trivial spots aren't much of an answer, either.
    Just as Knowledge isn't a cookbook or a television series trivia guide, it isn't an atlas. Kww (talk) 00:58, 2 June 2008 (UTC)
  38. Oppose. If a place is not notable enough for a human to actually spend the time to sit down and write an article about it, then it's not notable enough. Knowledge is NOT an indiscriminate collection of information, an atlas, etc. - Merzbow (talk) 01:27, 2 June 2008 (UTC)
  39. Vehement oppose for reasons outlined above, and because notability is not inherent on tiny, unincorporated settlements (be they in the USA or Malawi), and finally because I'd rather have people decide on these matters than a bot. Biruitorul 01:28, 2 June 2008 (UTC)
  40. Oppose as outlined above. Also, if we include every city and village in the world, then why not every person? We need to draw the line somewhere. We can't flood the encyclopedia with new stubs on non-noteworthy towns. Z1720 (talk) 02:00, 2 June 2008 (UTC)
  41. Oppose This would mean that the majority of all Knowledge articles would be very short and computer written which is not good for our image and would also result in a lot of administrative work. Articles like User:Fritzpoll/GeoBot/Example are uninteresting for most of the readers. Vints (talk) 08:16, 2 June 2008 (UTC)
  42. Oppose What a horrible notion. 1.8 million more bad-quality articles at en-WP. And there is already a lot of rubbish here. At most other language versions such substubs would be deleted. EN-WP is the biggest of all Wikipedias, but not the best, and with such actions the quality level will sink to the bottom. Marcus Cyron (talk) 10:16, 2 June 2008 (UTC)
  43. The most extreme and most extensive oppose in Knowledge's history. It is impossible, improbable and it will be a massive headache. Having an article about every town and city in the Philippines is enough. If you add unnotable barangays, then it's a huge pain in the rear. Add more unnotable places on planet earth and Knowledge can easily crash. There's too much stuff to fix in the English Knowledge, starting with unnotable persons and places. Cebuano Knowledge recently had a huge problem: articles/stubs about French communes. Please spare the English Knowledge from that gruesome fate. -iaNLOPEZ1115 · TaLKBaCK · Vandalize it 10:46, 2 June 2008 (UTC)
    For a young lad from the Philippines who has admitted he never edits on this site, that is quite a statement to make about English Wikipedia. It is actually possible, and the articles will not be on substandard barangays of the Philippines, thank you. ♦Sir Blofeld ♦ 10:53, 2 June 2008 (UTC)
    I have to say, the editor's nationality and frequency of editing has no relevance in this discussion whatsoever. IvoShandor (talk) 13:29, 2 June 2008 (UTC)
    Right. @ Blofeld: Could you stop commenting on the votes of other people? It's arrogant behavior. Marcus Cyron (talk) 19:36, 2 June 2008 (UTC)
  44. Strong Oppose. This is a terrible, terrible idea. We're talking about 2 million 1 sentence articles that will likely never be expanded. If a place is notable, it will get an article, we don't need to manufacture them. Kaldari (talk) 16:35, 3 June 2008 (UTC)

Voting is evil

  1. EVIL!. I support this bot so long as the articles it creates are more than "such-and-such is a hamlet in Wherezistan" micro-stubs. --Carnildo (talk) 04:56, 1 June 2008 (UTC)
  2. Indeed, Poles are quite evil. I support the proposal as is, but I would extra bonus support it if it were a more complex proposal in which it would gather as much data as possible for each country and work from there to create more complete articles. It would be more work, and extra code would need to be done for each source, and a new generic article would have to be created for each country, but it would contribute that much more to the encyclopedia (and it would probably get more people to support this). --Rory096 05:04, 1 June 2008 (UTC)
  3. Don't trust polls (but don't trust bots either). Doczilla STOMP! 05:14, 1 June 2008 (UTC)
  4. Consensus was moving towards yes, until someone started a poll. ;-) --Kim Bruning (talk) 17:54, 1 June 2008 (UTC)
  5. I won't vote; but I will express my general support for this proposal. I don't think that one-sentence articles (such as the examples) have much value, and I expect that most of the articles created will take years to be expanded at all. Still, if our goal is to sum all the world's knowledge, and to do so in a fair and unbiased fashion, the benefits of running this bot quite assuredly outweigh the costs. I also support, at least initially, running the bot first on larger villages, and then, if all goes well, on the smaller ones. Perhaps the largest issue is referencing: can we find (ever) reliable, independent and published references on each of the towns to be created? -- Rmrfstar (talk) 19:03, 1 June 2008 (UTC)
    "To sum all the world's knowledge" is a goal frequently cited in this debate, and apparently just as frequently misinterpreted. Knowledge is not simply fact, or documentation of fact: it is also opinion, comparison, and research, the sort of material that is contained in a secondary source; fact alone is the province of primary sources, and as Knowledge is by intent a tertiary source, simply documenting the facts is presumptuous and, actually, a kind of original research. It is not our mandate to provide exposure ("notability") to topics which otherwise have none; in fact, we generally frown on uncitable promotion.
    You also claim that we must sum this knowledge in a "fair and unbiased fashion". Complete fairness is impossible: even if we created all 2e6 articles as proposed, they would remain for the vast majority in a pitiful state of neglect, while the articles which reflect our "systemic bias" towards Western topics would flourish all the same. The reason is that we do have a systemic bias, and it's a bias in our population of editors, their nationalities, and their interests. The cost of running this bot is to burden us with a mountain of material of minimal value, all of which must be defended from vandalism, argued on AfD, categorized, and endlessly encountered to general annoyance on Special:Random; in general, it would dilute our efforts by distracting us, seriously impair our appeal to casual users, and not contribute much at all other than by way of making us feel good about ourselves. The perfect is the enemy of the good, and in an attempt at approaching perfection, this proposal would write a lot of bad articles. Ryan Reich (talk) 20:03, 1 June 2008 (UTC)
  • As you admit, knowledge comprises fact, and this bot should add facts, if nothing else: that is a plus. Its additions should also, though less certainly, encourage users to add the rest of knowledge (this is a wiki). I quote Joe Schroeder in asserting "You don't have to get it right. You just have to get it going."
  • This bot performs no kind of original research. It should do a better job of referencing.
  • A lot of articles in "pitiful neglect" is better than what we have now. Certainly, our current biases will continue to exist; but if they are ameliorated just a bit, we have done Good.
  • The burdens you speak of are trivial when compared to the benefits earned by improving our global coverage. I do not see how this bot would "seriously impair our appeal to casual users".
  • The perfect is the enemy of the good, but only given limited resources. Knowledge has (practically) infinite resources. -- Rmrfstar (talk) 21:15, 1 June 2008 (UTC)
  • Knowledge includes fact, but the entire contribution this bot would make would be to provide a "dictionary definition" of some millions of geographic places: location and possibly population, but nothing else. This is not knowledge, because at no point did any intelligence intervene in it. We are not a dictionary, and by that standard I believe we should stay out of the atlas business as well. This kind of information has its place on the Internet, but just like definitions, quotes, source code, and textbooks have their own Wikimedia projects because their contents are too rote to be included here, this content likewise belongs elsewhere.
  • The bot performs no original research, true. However, the designation of millions of undistinguished (and also undifferentiated) towns as "notable" is a claim of the sort that would otherwise be established only by recourse to secondary sources. In other contexts, making claims that belong in secondary sources is original research, and this cannot be considered differently. This is again a facet of "what is knowledge": we do not merely compile facts, and we do not evaluate facts that have been elsewhere compiled; that evaluation, of which the determination of notability is one example, is original research. If a town is notable, someone else by definition will have had to agree in public.
  • An acknowledgement that these places exist is not an amelioration of any kind; the bias is still explicit in our neglect of their detailed coverage. If you put lipstick on a pig, it's still a pig, and maybe our bias is ugly, but going through the motions only seems like an improvement. What you say seems to me like a statement "We must do something; this is something; therefore we must do this".
  • The burdens I speak of would be trivial if the benefits were as substantial as you think. These articles will be no coverage whatsoever, yet since they were created by a bot, no human would ever be responsible for the vast, vast majority of them (the numbers alone prevent this). Since I don't believe that anyone supporting this proposal actually knows anything about the towns in question, even if one of them were watching an article that got vandalized, they wouldn't know. Especially if it were subtle misinformation: I'm thinking of a Chinese–Tibetan rivalry, or Israeli–Palestinian, the sort of thing where half the population to whom the subject matters cares only about putting their POV stamp on it. And in the mean time, those articles which are being neither expanded nor vandalized (that would be about 1.9 million) would comprise half of all of en.wikipedia, making Special:Random a joke; now, what's the point of having the sum of all human knowledge in one place if anyone wanting to just "flip through the book" sees only page after page of geography? That wouldn't be so bad (after all, knowledge of geography is very important) but the only thing those pages would teach them would be the locations of towns with no information about any of them! Considering the existing poor coverage of these towns already, I doubt there is a population here of any substance capable of writing about them, and the chance of any one of those people hitting the page for a town they are familiar with is about one in a million. In my opinion, absolutely no encyclopedic benefit would arise from this project.
  • The perfect is always the enemy of the good, and the more resources are available, the more abuse can be perpetrated in the name of the perfect. You think that our infinite resources absolve us of the necessity of sacrificing quality to save space; however, this project will sacrifice quality to gain space. We will have more of less. The nearly blank articles will be so numerous that neither any one of them, nor anything else, will be possible to find unless you're looking for it, but if you're looking for a particular town, you already have the will to do what proponents think will happen to these stubs: write about it. You think that our enormous population of editors will improve our ability to cover these towns? Only a small number of people is ever interested in writing on a particular subject, and they might as well create the articles by hand right now, because when they're done (and it's probably going to be sooner than when this bot would finish) they will have improved Knowledge by as much as we have the capacity to do at this time, and probably by a lot more than they would have if two million bad, vandal-prone stubs had been created first. Ryan Reich (talk) 22:10, 1 June 2008 (UTC)

Suggest tables with redirects

This is highly similar to a current discussion at Knowledge talk:WikiProject Astronomical objects#Asteroid articles, and I'll suggest the same solution here as I've suggested there. Create lists of places, with tables containing the (frankly, quite sparse) amount of information that will be in these stubs and a redirect from the place name to the specific entry in a table in one of these list articles. If anyone ever cares to, the redirect can be replaced with an actual (non-stub) article, the place name exists in Knowledge (as it should), and Knowledge contains whatever data is known about the place (in tabular format, rather than as text in a stub). Short of spitting the delegates based on ... (sorry - wrong forum) - doesn't this solution address all concerns? -- Rick Block (talk) 04:42, 1 June 2008 (UTC)

The main differences are that (1) those asteroids are not notable enough for their own articles and (2) they can never be expanded, since no further information exists about each asteroid. Every city can be expanded with more information, and I believe every city inherits notability (if it is a city and not just 3 people). Why not just have the articles there so people can add to them? But I do agree most of these will stay stubs for a very long time. -- Coasttocoast (talk) 05:02, 1 June 2008 (UTC)
What is the essential difference between a redirect to a table entry that contains the information that would be in a stub, and having that information in a stub? -- Rick Block (talk) 05:13, 1 June 2008 (UTC)
To copy (and tweak) part of my comment in the mess above: Personally, I support lists as well. I'd rather have 30000 lists with a few dozen sections than 2 million stubs. If we make the lists exactly as the stubs: infobox, coordinates, and all, the only things we lose are things that would be exactly replicated over all the articles like the stub tags. Redirects can be created from the village name to the list and we lose nothing in terms of people searching for the information, but we retain pages like Special:Random as useful features. Once the list sections begin to expand, presumably at the same pace as the rest of the topics on Knowledge, they can be split off into other articles and we still retain the list as an index. Mr.Z-man 06:06, 1 June 2008 (UTC)
This is not about abstract notability. We would all agree that Early life of Alexander the Great would be notable, but if the Alexander the Great article had only one line about his childhood, no one would think of spinning it off. I personally agree that all villages are notable, and should get full articles eventually. But, if there is a logical place to merge these village substubs (into the township articles), this should be done for now, until the content develops further.--Pharos (talk) 15:24, 1 June 2008 (UTC)
  • This is the best idea. Create a list of these places, so that if there is indeed more information to be added on a location, it can be expanded. If this solution does not pass, I oppose the bot entirely, for reasons I have stated above. seresin ( ¡? ) 08:39, 1 June 2008 (UTC)
I concur with the list idea. The individual redirects can then be created into articles if an editor actually takes the time to create a half-decent stub/article out of it. I oppose the bot as it stands if this isn't the case, as I foresee an absolute nightmare as these articles are AfD'd, prodded, and CSD'd by the hundreds. That and we're doubling the size of the encyclopedia with a bunch of micro-stubs, which certainly doesn't improve our image, or correlate with WP:NOT#INFO or WP:NOT#DIRECTORY. Sephiroth BCR 08:47, 1 June 2008 (UTC)
Support lists/tables with redirects I think making a separate stub for every tiny village would be overkill. I think the best solution would be to create lists at the 2nd-lowest level of the political hierarchy. That is, if a city is big enough to have multiple distinct wards/neighborhoods/boroughs with their own distinct names (and in this I do NOT mean numbered wards, e.g. Ward 6 or the like), then give the city its own article with the wards listed in it. If the city/town is not big enough to be divided like that, then it would not have its own article, but would be listed on the township/county/district page which contains that city/town.
So, to restate, here's my standard for notability of inhabited places: If an inhabited location is subdivided into distinct parts which are each named (not numbered) separately, then it is inherently notable and deserves its own article. If it is not, then it needs to meet one of the other notability criteria in order to be considered notable. Thoughts? --Aervanath's signature is boring 09:52, 1 June 2008 (UTC)
  • Or, the bot could create the article as it desired, and then over-write it with the redirect to the list of townships, etc. This way, if someone ever comes along and says, Gee, we need an article, not a redirect, they just rollback the bot's second edit and work on the stubbed article. xenocidic (talk) 15:47, 1 June 2008 (UTC)
  • Support. Furthermore, I'd go back and do the same to American towns as well - if the article is still the original stub, then replace it by a re-direct into a parent list. In this way there is no inherent geographical bias - all small towns, anywhere in the world, are treated exactly the same way. LouScheffer (talk) 05:23, 2 June 2008 (UTC)
  • Support. No information is lost then, maintenance is much easier. I would even support this as a general policy of Knowledge -- create a section within the parent article (with redirects) rather than a sub-stub. The large and growing number of abandoned sub-stubs is a real problem, we simply don't have enough active editors to watch them, and probably never will. Though it is not always obvious how to determine the parent article. Colchicum (talk) 10:03, 2 June 2008 (UTC)
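The "create the stub, then overwrite it with a redirect" idea suggested above can be sketched roughly as follows. This is only a toy model in Python, not FritzpollBot's code and not the MediaWiki API: the `Page` class, the function name, and the list title used in the example are all hypothetical. The only real syntax assumed is the wikitext redirect form `#REDIRECT [[Target#Section]]`.

```python
class Page:
    """Toy stand-in for a wiki page with a linear revision history (hypothetical)."""

    def __init__(self, title):
        self.title = title
        self.revisions = []

    def save(self, text):
        # Every save adds a new revision; nothing is ever overwritten in place.
        self.revisions.append(text)

    def current(self):
        return self.revisions[-1] if self.revisions else None

    def undo_last_edit(self):
        # A revert re-saves the previous revision as a new one.
        if len(self.revisions) >= 2:
            self.revisions.append(self.revisions[-2])


def create_stub_then_redirect(title, stub_text, list_title):
    """First edit: the stub. Second edit: a redirect to the entry in the list."""
    page = Page(title)
    page.save(stub_text)
    page.save(f"#REDIRECT [[{list_title}#{title}]]")
    return page
```

Under these assumptions, an editor who wants a real article only has to undo the second edit: the stub text is still in the history, and the list entry keeps serving readers in the meantime.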

Generate content dynamically - run bot on user requested page only

Oppose bot in current form By running this to create a bunch of static articles we would be using up our one shot at a unique opportunity. The creation of a new page is a unique opportunity - especially for a bot. Since bots can't understand what is there (except in a very limited sense), once we create a page it is hard for an automated system to improve it. So when somebody comes up with a new data source or a better bot design it will be much harder (or impossible) for them to use their elegant design to come up with better pages for these cities/towns. Instead of generating and storing these static pages, we should consider making the tool available (and accessible) so that somebody who wants to make a page on wherever can get a nice starter page to work on easily, but don't create them until somebody wants one. Much in the way that MediaWiki handles links to outside data sources (DOIs, etc.), automatically generating data like this should be handled through templates/linking in other data, rather than as static article text. Only make it static if you have to when somebody wants to edit it. (If we must make it a static page, how about a single template that generates the page on the fly?)

  • As bot and data sources evolve the user will get the best page that we can generate at that time. (Can integrate bug fixes, ideas for improvement, etc.)
  • In the meantime it won't clutter up the random article selector, won't be as much of a target for vandals, etc.

Insert your favorite joke or quote about not having to deal with an installed base here.

Other issues:

  • How does it handle disambiguation pages? 2 million new articles will mean a lot of extra disambiguation entries for some terms that people will have to sort through, and they won't know which bits are stubs.
  • It doesn't seem to do a very good job of categorization. (Should do better.)
  • At the very least, it should make it painfully easy for other bots to parse the whole article so somebody could come by later and improve things for the articles that people haven't edited. Zodon (talk) 09:11, 1 June 2008 (UTC)
I don't know how possible the dynamic-generation suggestion is, but I would support that if it's technically feasible. However, I think the bot-parsing and "what if we come up with a better format" objections are red herrings. The bot is going to make these pages in a standard format, and therefore any other bot can be programmed to come along, automatically parse that standard format, and re-format or update the article as needed.--Aervanath's signature is boring 10:06, 1 June 2008 (UTC)
    • Or, as I suggested above, the bot could create the articles, then redirect them to the list of townships. The user who came along wanting to create the article could just undo the last edit and be on their way. Of course, this would require that the user know this is possible. xenocidic (talk) 15:50, 1 June 2008 (UTC)
Something could probably be set up on the toolserver to do this, similar to the tool Commons has for uploading Flickr images. Mr.Z-man 19:43, 1 June 2008 (UTC)

Reading above, that "a bot sealed with approval and this discussion is several weeks too late", I don't wish to waste my breath. I don't even suppose that there will be a "notability" aspect incorporated in the bot: will it stop at the level of commune and not include what Italians call frazioni or hamlets? Doubtful, even if I thought it should. So Knowledge is to be a gazetteer after all. I hope we can stop crowing over "three-millionth-article" etc. --17:32, 1 June 2008 (UTC)

List form

Why not have the bot generate lists of articles with the potential to be created, and allow those articles to be tagged by template for creation with the available data? That way, a human touch will be required, but the pre-made data and infobox available by bot would increase the convenience of creating the article. Nihiltres 17:57, 1 June 2008 (UTC)

I wouldn't have any objections to lists of places in a table with available data, but my feeling is that editors would create these articles anyway whether a bot is used or not and without proper references or infoboxes. The concern for me is that there is a great deal to write about many of these places potentially and it isn't encouraging people to write fuller articles for them in the way that a list would. I think it's important that we have equal article coverage of places around the world, in the belief or maybe optimism that eventually they will all develop ♦Blofeld of SPECTRE♦ 19:12, 1 June 2008 (UTC)
Perhaps you misunderstand. This is my idea:
  1. Bot generates lists of possible articles for it to generate
  2. Human tags article on list for creation
  3. Bot creates bot article
  4. Human improves bot article.
This way, there would be a guarantee that humans would do something with the bot-generated articles. Nihiltres 23:26, 1 June 2008 (UTC)
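The four steps above can be sketched as a small Python model. It is a minimal illustration under stated assumptions, not the bot's actual implementation: the `Candidate` class and all function names are hypothetical, and "creating" an article is reduced to flipping a flag.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One entry on the bot-generated list (hypothetical structure)."""
    title: str
    tagged_by_human: bool = False
    created: bool = False


def generate_candidates(titles, existing):
    """Step 1: list only places that do not have an article yet."""
    return [Candidate(t) for t in titles if t not in existing]


def tag_for_creation(candidates, title):
    """Step 2: a human marks one list entry for creation."""
    for c in candidates:
        if c.title == title:
            c.tagged_by_human = True


def create_tagged(candidates):
    """Step 3: the bot creates only entries a human has tagged, once each."""
    created = []
    for c in candidates:
        if c.tagged_by_human and not c.created:
            c.created = True
            created.append(c.title)
    return created
```

Step 4, the human improvement of the bot article, is deliberately outside the sketch. The point of the design is that nothing is created unless a person has asked for it.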

Reference Issue

I notice in the example that the bot claims to be using the National Geospatial-Intelligence Agency as a reference. However, the reference is just a wikilink to that Knowledge article. The bot needs to reference the actual page which the information was retrieved from, and NOT the NGIA wiki-article. The NGIA page (in this case) is here. If the NGIA wiki-link is necessary, put it in a See also section. Also, I notice that Maplandia.com is listed in the External Links section, and has exactly the same info on the linked page as the example article does. Is the bot retrieving info from Maplandia (which would be a copyright violation) or the NGIA? (Not that anyone would be able to tell, if it's exactly the same.)

If you will check some of the trial article additions from the bot's contribs, notice that the NGA GNS Search is being cited. Not ideal, but it seems acceptable. Huntster (t@c) 10:58, 1 June 2008 (UTC)
The bot extracts coordinates from the National Geospatial-Intelligence Agency, which is public domain. Maplandia happens to be one of the most comprehensive sites on the web for places and satellite maps. This is why it is linked externally to help people. I happen to be a member of that site anyway. ♦Blofeld of SPECTRE♦ 12:25, 1 June 2008 (UTC)
As is mentioned at the top of this section, to comply with WP:V the reference link needs to go to the specific and detailed source, not to a Knowledge page. Additionally the citation should be correctly formatted using a fully populated template (see User:Jeepday/Cite). If you are going to build 2 million articles they should be very well formatted and referenced. Jeepday (talk) 13:12, 1 June 2008 (UTC)
Which it will... ♦Blofeld of SPECTRE♦ 13:15, 1 June 2008 (UTC)
This is the point, BoS, the lack of evidence at the moment is fueling suspicion about this Bot and its aims doktorb words 14:22, 1 June 2008 (UTC)
Show me, I don't see an example. Jeepday (talk) 13:22, 1 June 2008 (UTC)
Example: Kushgag. We dumped the example in favor of this. The FritzpollBot has already created 100 articles. I'm an Editorofthewiki 15:06, 1 June 2008 (UTC)

How many databases?

I could get behind this if the bot used more than one source, say two or three different databases, and then only created stubs for the locations present in both/all databases searched. Using only one database, we run the risk of repeating typos and other errors in the original database. Only creating articles for places found in multiple databases would also increase both the "notability threshold" and the "verifiability threshold" for the articles created. In other words, the claim that "Aju is a village in Waingmaw Township in Myitkyina District in the Kachin State of north-eastern Burma" is more credible if Aju is listed in two or three databases rather than just one. —Angr 11:32, 1 June 2008 (UTC)
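The "present in both/all databases" rule amounts to a set intersection. A minimal Python sketch, assuming each database can be reduced to a list of place names that are already spelled identically (in practice, matching transliterations would be the hard part); the function name is hypothetical and nothing here reflects how the bot actually queries its sources:

```python
def places_in_all_sources(*sources):
    """Return the place names that appear in every supplied source, sorted."""
    if not sources:
        return []
    common = set(sources[0])
    for source in sources[1:]:
        # Keep only names that the next source also confirms.
        common &= set(source)
    return sorted(common)
```

With this rule a typo that exists in only one database never produces a stub, at the cost of skipping real places that only one source happens to list.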

Currently, I use maplandia to reference the NGIA database. If there is no match, it doesn't get written in and logs are then output for me to look at. Not many hits so far, and more to do with maplandia than anything else. More sources would be good (always good) as I say in my response at the top of the page Fritzpoll (talk) 11:39, 1 June 2008 (UTC)
What does that mean, though? Does Maplandia have its own database that's separate from NGIA? I'm talking about only creating articles for places that are in the logical conjunction of two or three different databases in the first place, not creating articles for all places in a single database and then going in after the fact with additional sources. —Angr 11:50, 1 June 2008 (UTC)
Hello what is this?? Amurn etc etc etc ... created by the bot trial of 100 has ahem how many database links or references?... I count three. Perfectly adequate from notable sites. A google search shows tons of hits ♦Blofeld of SPECTRE♦ 12:20, 1 June 2008 (UTC)
Well, good. Three different sources indicate it exists, so that's fine. But if the bot creates articles on places whose existence is confirmed by only one source, we could be getting into trouble. We don't want Stephen Colbert's prediction of Wikiality to come true. —Angr 13:30, 1 June 2008 (UTC)

What's in a name?

One of the challenges of this project not mentioned so far is adherence to various language and/or country specific naming conventions. For example, WP:MON regulates naming for places in Mongolia. With places in multilingual countries, things will get even trickier. Are there procedures in place that will make sure no articles are created with non-conforming names? Or even worse, articles duplicated under slightly varying names? Only dealing with places in the US, Rambot didn't have to face this problem, so we don't have any related experience. With a goal of several million new stubs, this is a substantial risk, and if anything goes wrong, cleaning up the resulting mess later might amount to a gargantuan task. --Latebird (talk) 11:47, 1 June 2008 (UTC)

Note that we are setting up an organization area as part of The missing encyclopedia articles group where each country can be sorted and discussed with the relevant wikiprojects before the bot is run for each country. I don't know why people keep assuming we'd be happy to just jump in there and not organize it first so we don't create a "huge mess". There is also the misconception that suddenly overnight we will be "plagued or flooded with two million articles". Not at all. Each country will be added gradually and discussed between human editors rather than unleashing an uncontrollable bot as is being implied ♦Blofeld of SPECTRE♦ 12:13, 1 June 2008 (UTC)
Sorry for sighing here, but did you read the way the bot operates at the top of the page where I respond to these comments? The bot does not just rip from a source and automatically create an article - it is checked by humans first. Please read the notice, and feel free to ask any questions Fritzpoll (talk) 12:12, 1 June 2008 (UTC)
I didn't assume anything. It's just that the issue wasn't mentioned anywhere before, so I considered it worth bringing it up. I wrote this question before your explanations were added, and posted it here right after the discussion was moved to a separate page. After that, I noticed the new entry at the top, which does indeed give an (at least implicit) answer. Therefore, please direct your sigh at those who started this discussion without first preparing a document outlining all the relevant information. --Latebird (talk) 12:44, 1 June 2008 (UTC)
Well, actually, you added the question 40 mins after that section had been added. My sigh was directed at those who haven't read the approvals page that was linked at the start of this discussion, which mentioned most of these points. Apologies if the tone seemed off, but I'm finding the need to repeat myself throughout the page quite tiring :) Fritzpoll (talk) 12:54, 1 June 2008 (UTC)

What Would Jimbo Do?

It was Jimbo, among others, who said we should move from a focus on quantity to a focus on quality... Perhaps his comments can focus this discussion and remind us what is most important. -- Rmrfstar (talk) 11:59, 1 June 2008 (UTC)

Well ideally we'd like to have both. ♦Blofeld of SPECTRE♦ 12:10, 1 June 2008 (UTC)
And if we have quantity, all of us users like Blofeld and myself can start focusing on the quality of these articles. Please, everybody, start reading about the bot proposal! I'm an Editorofthewiki 13:37, 1 June 2008 (UTC)
The creation of many articles which may not be improvable beyond stub status is a move towards quantity. That having been said, automating article creation should certainly allow for a greater focus by editors on quality. I don't think that creating many, many one-line articles is useful if they cannot be improved beyond that point. Blofeld suggests this is not the case. I only appeal to Jimbo to gauge how we should best direct our efforts.-- Rmrfstar (talk) 14:37, 1 June 2008 (UTC)
Jimbo said to focus on quality, not to cease making new articles. Obviously we're never going to be done working on either quality or quantity. What would Jimbo do? He'd let the bot go. Obviously if Knowledge is going to be the sum total of all human knowledge and whatnot, we need to create these articles. Wrad (talk) 17:25, 1 June 2008 (UTC)
Well said, Wrad. Okiefromokla 18:29, 1 June 2008 (UTC)
I agree: well said. -- Rmrfstar (talk) 19:04, 1 June 2008 (UTC)

Suggestion: Create preview of new page names

I believe this stub generation is generally a good idea. It might be useful to create a list of all page names that are proposed to be created by this bot. Users with local geographical knowledge, or experience with transliteration schemes, or other interest could review the list and offer suggestions for improvement if any are identified. --Ghewgill (talk) 12:37, 1 June 2008 (UTC)

See Knowledge:WikiProject Missing encyclopedic articles/Places and our manual system of checking - please please read about what the bot is doing and the proposal first before leaving such questions. Everything is actually a lot more organized already than people think. Thanks ♦Blofeld of SPECTRE♦ 12:50, 1 June 2008 (UTC)

(e/c)If you'd read the actual way the bot operates at the top of this page, you'd see that this has always been the plan, as has already been done, with special emphasis on human inclusion! The bot is too dumb to be able to do this automatically - it needs people! :) Fritzpoll (talk) 12:52, 1 June 2008 (UTC)

Sample some of the bot's previous work

Maybe worth having a look at some of the recent pages the bot has created via the following link . I had a quick look through a few articles at random, and they all seem to conform to the same standards. Lugnuts (talk) 13:56, 1 June 2008 (UTC)

Thanks for the link. I just did the same, and the first dozen were all variants of Khandud -- so I'll discuss that one.
The Khandud article body, in its entirety, reads: "Khandud is a village in Badakhshan Province in north-eastern Afghanistan." This is referenced to the "NGA GeoName Database", but this is a generic reference -- it pulls up no information on Khandud without you typing in a query. Querying "Khandud" gives you a lat-long location, and variant spellings.
The article also gives two links: one leads directly to a satellite Google map, with Khandud located. The second is a generic link to the Encarta atlas. Typing in "Khandud" yields a planimetric area map.
I suppose this does no harm, but what's the point? There's no information here except the place name & location. How are we adding value for the user? Pete Tillman (talk) 20:40, 1 June 2008 (UTC)

Limit by minimum population?

While I think this is a great tool that should be used, I think the concerns of those opposed are at least worth considering. Maybe I've missed it, but I haven't seen a response as to whether it would be feasible to limit new articles to those with a minimum population? I'm not totally aware of how big of a hole this is in wikipedia? Are there significant numbers of cities more than 50,000 people without an article? Larger cities and towns could be created first, and we wait and see what develops. After those are done we could decide whether to do everything smaller. That said, the bot could be set to create articles based on other criteria - do we have articles for all regions/counties? All regional/provincial capitals? If the process is going to take months anyway, and is going to require direct participation by editors anyway, it might as well be done in a more systematic way than simply starting with Afghanistan and finishing with Zimbabwe. - TheMightyQuill (talk) 15:18, 1 June 2008 (UTC)

This has been suggested a couple times above, but I agree that it's quite probably the best way of going about it. If the problem is that there's these places with large populations and no articles, do them first. Then trickle down to ones with less and less. Makes sense to ME anyway... ♫ Melodia Chaconne ♫ (talk) 15:29, 1 June 2008 (UTC)
I don't think it's feasible. Getting reliable, up-to-date information about the population of towns and villages in all countries of the world is really, really difficult. —Angr 15:34, 1 June 2008 (UTC)

The population information doesn't have to be particularly accurate, as it doesn't have to be included in the article. ie. We don't need to know the exact population of a town, just whether it's bigger or smaller than, say, 50 thousand people.

I agree. Even if it is not a recent number, and even if it is not included (although I think it should be, even if it's a 1990 population estimate), it is a good way to limit the spread of the articles. After all, an article about a city of 50,000 is much more likely to receive additional edits than a town of 1,000. Even if the threshold is later reduced, a population control seems to me to be the best way to get the project started while mitigating the concerns many editors seem to have. Newsboy85 (talk) 18:00, 1 June 2008 (UTC)

Again, I'm not saying that smaller articles shouldn't be created eventually, just that doing it in order makes more sense. To compare, I've never heard of a neighbourhood article (and there are a lot of them) being created when the related city article is still a stub. That doesn't mean all neighbourhood articles are non-notable and should be erased, but obviously cities are generally MORE notable than towns. If more people live there, more people must know about it.

As someone who lives in a Midwest town of less than 50,000 (which has a pretty good article) I am very opposed to the idea that 50,000 is somehow a magic number that accurately determines notability. I find it kind of condescending and POV, personally. Wrad (talk) 18:03, 1 June 2008 (UTC)
I live in a city of 35,000. We're using 50,000 as a suggestion and illustration; it's not intended as a magic number, and as I mentioned, whatever number we pick could then be reduced. Call it 20,000 or 10,000 if you like. The argument remains the same. The fewer the people, the less likely there will be additional edits or additional notable information to add. That doesn't mean some towns with 100 people won't be the birthplaces of great people or great ideas. It just means that it is less likely, and those towns could be created manually to fulfill that need. Newsboy85 (talk) 18:08, 1 June 2008 (UTC)
Alright then, this might be a good way to slow the growth a bit so the humans can keep up. Wrad (talk) 18:24, 1 June 2008 (UTC)
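The "largest places first, then lower the threshold" ordering discussed in this section could look roughly like this in Python. It is a sketch only: the function name, the input format (a mapping from place name to a rough population estimate, with `None` where no estimate exists), and the default of 50,000 are assumptions for illustration, the last taken from the figure used as an example above.

```python
def creation_order(places, threshold=50_000):
    """Return names at or above the threshold, largest estimate first.

    Places with no population estimate (None) are skipped rather than guessed.
    """
    eligible = [
        (name, population)
        for name, population in places.items()
        if population is not None and population >= threshold
    ]
    # Sort by descending population; break ties alphabetically for stable output.
    eligible.sort(key=lambda item: (-item[1], item[0]))
    return [name for name, _ in eligible]
```

Because only the comparison against the threshold matters, a dated or approximate estimate is enough, and a later pass can simply call the function again with a lower threshold.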

Example

As it appears a lot of discussions are firing off up in the main body of this page, I thought I'd take a look at one of the examples cited by the "pro" lobby. Kushgag tells me that Kushgag is a small town in Afghanistan. There are very few pages on the net, it seems, that tell me anything more. This place is just a point on a map, it's not Leyland or Dublin or Cockermouth. There is nothing else to say about this otherwise (and hitherto) non-notable settlement. This is a permastub, and there's 1,999,999 permastubs to come. I'd say that fails WP:N, wouldn't you? And Angr says up at 15:34, getting up-to-date info about the population of towns and villages across the world is really difficult...Imagine trying to argue at AfD the notability of Kushgag... doktorb words 15:37, 1 June 2008 (UTC)

Villages are inherently notable. The only reason there is nothing more to be written about this particular stub is that most Afghanis don't have computers and cannot create Websites or add info about their town from, say a local newspaper. I'm an Editorofthewiki 15:56, 1 June 2008 (UTC)
No, you are looking at this from the wrong perspective. There is nothing to be written about this stub because there is nothing to write about. Kushgag is one of a total of 2 million potential permastubs. The ideal, I know, would be Kushgag to have a lot going on to write about. But it does not. It is not notable. It may not even be notable within Afghanistan! doktorb words 16:00, 1 June 2008 (UTC)
I just took a note of this discussion because someone had the idea of making a link to it on every (?) editor's watchlist. doktorb, you did not get the initial point here. Any village in Afghanistan is about as notable as any village in the USA or somewhere else in the Western world. The only problem is that the people in Afghanistan usually don't have access to the Internet. This is some sort of structural bias in WP, with the result that some places receive less coverage than others. And I would be of the opinion that a stub is better than no article in these cases, although this might very well result in 2 million stubs. Zara1709 (talk) 16:33, 1 June 2008 (UTC)
The fact that consensus currently holds that villages are inherently notable keeps coming up. However, I would argue that the current consensus holds that villages which someone chooses to write about are inherently notable. Supposedly, if a person creates an article on a village or city, even as a stub, it is because they have more information than just the geographic coordinates and a map. The idea that all villages are inherently notable, especially in light of this bot, needs to be challenged. I understand the need to counter systemic bias. What I fail to see is how a stub accomplishes this on the individual level. Yes, we will have two million of them. But does this actually make Knowledge less biased? Or does it just provide an illusion of a decrease in bias? After all, most of the U.S. and U.K related articles are much longer, and they will remain that way. Newsboy85 (talk) 17:56, 1 June 2008 (UTC)
Everyone needs to listen to this guy. If a town only has an article because some bot created a stub with coordinates, then that town is not notable, and the fact that it is a place where people live doesn't change this. When sources appear, the article will be written. We have very general and reasonable notability standards, namely that the subject be discussed in a secondary source, and we should not pretend that by granting waivers to certain topics we are in any way improving the encyclopedia. Rather than rushing to flood the encyclopedia with material of dubious value, we should establish on what basis we are proposing to include it. In the long run, this principle will be of more value than the stubs FritzpollBot will create. Ryan Reich (talk) 04:08, 2 June 2008 (UTC)
A short-term viewpoint that defines an article about a city with little presence in current electronic media as a "permastub" ignores the fact that paper sources will exist and the number of computers in the developing world can and will change. Arguing from Google hits that a town in Afghanistan cannot have a good article written about it is therefore both incorrect and short-sighted. Tim Vickers (talk) 18:51, 1 June 2008 (UTC)
There is nothing wrong with a permastub anyway. If you read Britannica recently you would note that half of it by pages is 65,000 stubs the other-half 699 multipage articles from 2 to 310 pages in length. Zginder 2008-06-01T20:18Z (UTC)

Projects tags

If the bot is approved to create these articles, how difficult would it be to program it to add the appropriate country WikiProject tag to the talk page of the article? This would allow the country WikiProjects to more easily find the articles, I think. Most countries have these projects. ···日本穣 17:37, 1 June 2008 (UTC)


Compromise: create another Wikimedia project for this?

Simply put, Knowledge doesn't seem like the proper place for millions of stub articles on villages of dubious significance. Nonetheless, I think it's impossible to argue that the information isn't useful or relevant or interesting to someone. It just seems that Knowledge isn't the right place for it. What about creating a WikiAtlas project that is intended to document every geographical location on Earth? It would be a good compromise - it would keep Knowledge free of encyclopedia articles that are unlikely to be maintained, but it would still be a place where geographic information could be shared, accumulated, and improved. Of course, if a location attains encyclopedic importance/notability, an article on it could exist, but a new project seems like the best place for this information. - Chardish (talk) 17:42, 1 June 2008 (UTC)

That is just not the answer. Villages and cities are already inherently notable (see WP:notability). Knowledge already has the responsibility to cover this and passing the buck is not a good idea. Wrad (talk) 17:49, 1 June 2008 (UTC)
I agree with Wrad, these articles pass current policy. Whether a human or a bot makes them, they should exist on Knowledge. --Falcorian  18:42, 1 June 2008 (UTC)
Notability is not the only criterion for inclusion in this project. Knowledge was never intended to be a repository of all information about every thing. The reason for sister Wikimedia projects is precisely to address that. - Chardish (talk) 18:46, 1 June 2008 (UTC)
Knowledge is meant to cover everything that meets notability requirements. Cities and villages meet notability requirements. A=B=C. Wrad (talk) 20:07, 1 June 2008 (UTC)
Further it is verifiable from reliable sources. --Falcorian  20:22, 1 June 2008 (UTC)
Notability has never been the sole criterion for inclusion in Knowledge; moreover there are many notable, verifiable things that just don't belong here. Why do other Wikimedia projects exist, if Knowledge is intended to store all verifiable information about everything notable? - Chardish (talk) 22:55, 1 June 2008 (UTC)

You know, were Knowledge the perfect encyclopedia I wished it to be, I would completely agree with this suggestion and advocate it as much as possible, since it makes more sense at least for me. What we presently have, however, is a gigantic database stuffed with articles for every single Pokémon ever created, for every single Star Trek episode ever conceived, for stupid webcomics that nobody reads, and many many other atrocious examples. Taking this into consideration, then I have to vote that yes, do create a stub for every city and village in existence.--Ivo (join Project Portugal) 03:37, 2 June 2008 (UTC)

Well, actually, most of the Pokemon articles have been merged at this point. (Yeah, not really relevant here, but yours is a common misconception, and I thought I would correct it.) Zagalejo^^^ 03:45, 2 June 2008 (UTC)

Units of Measurement

The example I've seen so far uses Imperial and metric units without conversions. Since in most if not all of these countries metric units are common if not universal, and since WP:UNITS says that metric units should be primary outside the US & UK, will the bot use metric units where appropriate (with conversion to imperial)? If this is going to happen, we'd better be sure we get it right - we don't want large sections of our audience not understanding us on 2000000 articles because they weren't brought up with the set of units used. Pfainuk talk 19:43, 1 June 2008 (UTC)

Not the bot's work - this stub has been expanded since creation with the units you describe. Take a look at the history. Fritzpoll (talk) 20:05, 1 June 2008 (UTC)
OK, well if it's not an issue it's not an issue! Thanks, Pfainuk talk 20:13, 1 June 2008 (UTC)

Another possibility

There is already wikimapia and such, and places can be encyclopedic, but a real possibility is a separate geographical wiki for small places. Reywas92 15:42, 1 June 2008 (UTC)

Not a good idea. Villages are already inherently notable and we don't need a whole new project to cover what it is already wikipedia's mission to cover. Wrad (talk) 17:50, 1 June 2008 (UTC)

Why the debate?

Why is this even being debated? BE BOLD and unleash that damn bot! Indeed, may many bots of its type follow! Observe what Knowledge has achieved in seven years. Now take yourself forward another seven. The actions of this bot will be minuscule compared to what WP will be in 2015. Those who disagree fail to see the momentum this project has. Suicup (talk) 15:59, 1 June 2008 (UTC)

See Knowledge:Articles for deletion/Common outcomes#Places. I'm an Editorofthewiki 16:01, 1 June 2008 (UTC)
Again, why the debate? According to that link, there is no problem. Suicup (talk) 16:04, 1 June 2008 (UTC)
Whilst technically feasible, given BAG approval and the fact that the bot has been flagged by a bureaucrat, the community is entitled to discussion to establish consensus for what is potentially a large change. My job is to keep clarifying issues surrounding the bot. Fritzpoll (talk) 16:06, 1 June 2008 (UTC)
The way I see it, WP Notability policy explicitly states that all villages and towns are notable. Thus, there is no problem with creating articles for all of them, whether bot or human created. Hence there is no problem with unleashing this bot. Arguments about how this will distort the size of WP, how having a million stubs is bad, we shouldn't conduct such large interventions etc etc are irrelevant to the issue at hand. As long as the current WP policy regarding notability remains, there is no reason why this bot can't proceed. Theoretically, these articles are going to be created eventually in the long run (because villages and towns are notable, so ipso facto they will eventually be written about on WP), so why does it matter if a bot creates them en masse right now? Unfortunately as we have seen here, you will never get agreement when people talk past each other and miss the entire point of the discussion. Hence this entire debate is pointless - proceed with the bot. Regards Suicup (talk) 16:37, 1 June 2008 (UTC)
I have two problems with that:
  1. Knowledge:Articles for deletion/Common outcomes#Places is not a WP Policy. Hence my suggestion to have an official policy stating bluntly something like "all places inhabited are notable".
  2. The Bot would include not only town and villages, but possibly also hamlets. (please correct me if I am mistaken on this point). And I see nowhere that hamlets are inherently notable.
SyG (talk) 16:53, 1 June 2008 (UTC)
With regards to point one: this page, while still a proposal, meshes with the opinion of the AFD guidelines, and frankly it seems inevitable that the key point (that all villages/towns are notable) will be upheld. I'm not sure about the status of 'hamlets', you'd be best to take that up with the bot creator. Personally, I don't have a problem, and I'm sure the majority of others don't too. Regards Suicup (talk) 17:01, 1 June 2008 (UTC)
With regards to point one, thanks for the new page you pointed to; it is good to see that an actual proposal is developing. I also noticed on its Talk page that a user (User:Carlossuarez46) proposed to put a clear statement like "all inhabited places are inherently notable" but that was not accepted by another user (User:Exit2DOS2000), although the reasons are a bit obscure to me. Why not push this proposal through before the bot is launched? SyG (talk) 17:53, 1 June 2008 (UTC)

On a tangential point, "be bold" generally doesn't apply to mass bot edits, because they're much more difficult to undo, and we don't want bot edit wars across thousands of pages. --Delirium (talk)

Presumed notability

We cannot assume all places are notable because they exist and there's a passing mention of documentation. According to the general WP:Notability statement, "If a topic has received significant coverage in reliable sources that are independent of the subject, it is presumed to be notable." With these examples, there is not significant coverage whatsoever. With the exact same argument there are significantly more references to individual people than any of these places. Surely elementary schools have more coverage than any of these places, but it was decided that they are not as notable. Retrospectively it was a bad idea to create the US articles a few years ago, and we cannot assume we are required to have articles on everything because we did that in the past. Reywas92 15:54, 1 June 2008 (UTC)
Yes and how many of those are American or British oriented. Would you please stop cluttering this page ♦Blofeld of SPECTRE♦ 16:57, 1 June 2008 (UTC)

An article on Alma High School (Alma, Arkansas) but not an article on a town with 60,000 in Bangladesh? Ahh but of course American wikipedia is for the shallow minded with little scope for widening coverage of the world. Virtually everything listed in that box proves my point completely. People think wikipedia is about America and the UK and popular culture. How many in that box are not related to this?? Pirates in popular culture? Dear God help us. This is exactly what needs doing to get rid of this outlook and start focusing on what is important, real world content. Basically you have just illustrated to me that your ideal goal is to have a full article on a season of some baseball or NFL team rather than attempting to address the problem with ignoring 95% of the world land mass. That box looks to me like it has been drawn up by American teenagers. ♦Blofeld of SPECTRE♦ 17:01, 1 June 2008 (UTC)

Would you please stop cluttering this page? You have many more comments here, and I have the right to give my opinion. First, I said no to Alma High School, as it is now merged, as many other articles should be if it will be only one sentence. Please, I want thousands of articles on Bangladeshi towns with 60,000 people, just not two million villages with less than 1000 people, and that goes for the US and UK, too. Are you addressing me personally with the American teenage part? I'm just pointing out that there are other things we can do than add two million stubs. Yes, they will be expanded eventually, but that's a lot of articles, and that would take many years. I don't want to ignore most of the world land mass, I just don't appreciate one-sentence articles. Reywas92 17:38, 1 June 2008 (UTC)
Well you didn't imply this. It looked to me as if you thought adding any articles on any town, whether it has thousands or not, is a bad thing. I was referring to the fact that that to-do list revolves primarily around the UK and the US and popular culture rather than evenly ♦Blofeld of SPECTRE♦ 19:02, 1 June 2008 (UTC)
I get real tired of people telling me what I should read and write, and what I should care about and what I shouldn't. Yes, if 95% of the world's land mass is basically undocumented, then we shouldn't be creating articles that just include a pair of coordinates that amount to all we know about that part of the land mass. We're leaving 99.9% of the galaxy undocumented, instead of putting articles on every star in the galaxy in Knowledge.
But hey, it's easy to attack the labor of people who are actually putting information in Knowledge that they care about and that people want to read, based solely on their country and age. That'll teach them about bigotry in the real world, an important lesson.--Prosfilaes (talk) 17:31, 1 June 2008 (UTC)


How 'bout some quality, not quantity.

This whole quality vs. quantity does not persuade me that this proposal should not go ahead. Let's assume (and I believe it is a relatively safe assumption) that many, if not most, of these articles will eventually get created by humans, whether by residents of the towns/villages, following the global spread of the internet, or by current wikipedians. Human created stubs do not always follow guidelines, have maps, suitable categories etc; they often require "fixing" i.e. they are not always of good quality. These bot created articles, in comparison, will follow the manual of style and overall be consistent. In my opinion they will be of higher quality than the equivalent human created stubs. If you disagree with the assumption that many of these stubs will eventually be created by humans I guess you are welcome to oppose this proposal! :) Suicidalhamster (talk) 17:04, 1 June 2008 (UTC)
The articles created by the bot will be created anyway, guaranteed; it would just take years longer and be done haphazardly without the consistency that a bot has (e.g. sub-stubs without an infobox and coordinates or details on provinces etc., which take weeks to sort out the human-generated mess in). Articles can be quickly expanded in two minutes flat. There seems to be some idea that they will ALL never be expanded. Imagine how articles like Kushgag will look once proper information becomes available on the web. I fail to see how this doesn't have any benefit for the encyclopedia at all. For many places, information on population etc. should be available ♦Blofeld of SPECTRE♦ 17:07, 1 June 2008 (UTC)
To Reywas92:-
Are you suggesting that the articles on Gladiators (British TV show) and Pirates in popular culture are more important than historic cities in the developing world? There is significant coverage, much of it however is not on the internet due to the infrastructural problems in the developing world. Much of the information that could expand these articles is still written on paper. Once these countries start computerising information to a larger extent (could be within a few years) then the expansion work can really start. Even with these problems there are missing articles on towns and cities that could be taken to GA or FA with the sources available, we are simply lacking manpower, and need more people to dedicate themselves to these sort of topics, instead of fancruft like Star Wars or some random cartoon. EJF (talk) 17:13, 1 June 2008 (UTC)
Why should people write what you want them to write? Isn't that slavery? Isn't this supposed to be a volunteer effort? Why is it that articles that people read are less important than articles that no one has bothered to create yet?--Prosfilaes (talk) 17:27, 1 June 2008 (UTC)
Please don't put a spin on my words. I am not forcing anyone to write anything, and to compare my point of view to a horrific crime against humanity is a nonsense. I am suggesting that perhaps articles about real places and real people are of more interest in an educational point of view (the Wikimedia Foundation being an educational charity) than an article on a random TV character. Imagine being Knowledge's supposed target audience, "the child in Africa", who receives one of the $100 laptops - what would the child want to learn about our world? About its people and places, or a B-rate porn movie character? EJF (talk) 17:38, 1 June 2008 (UTC)
You want to demand that volunteers (people who aren't getting paid for their work) work on what you want them to work on. How would you describe that? I would say that your opinion that articles about real places and real people are of more interest in an education point of view to be a minority one, given the amount of time most students spend studying literature. Those that aren't studying literature, tend to be studying technical subjects of practical use on the job, not geography.
At best, "the child in Africa" is one of Knowledge's supposed audiences, though I've never read that claim before. "A B-rate porn movie character" wouldn't last one second as a Knowledge article. Furthermore, I suspect that child would rather learn about heroics in a galaxy far, far away, than learn that "Azaman is twelve miles north of Debra, which is twelve miles north of Kelemore", like in fact most children worldwide have preferred. Stories of heroics and action have always captured the mind above mundane information presented without interesting details. If and when we can say something interesting about their locale, they may be interested, though many still won't care.--Prosfilaes (talk) 18:18, 1 June 2008 (UTC)
The number of straw men in this discussion dumbfounds me. No one is trying to force anyone to create any articles they don't want to. On the contrary, this proposal is about allowing those who want to create articles to create articles in which they are interested. Mangostar (talk) 18:50, 1 June 2008 (UTC)
Actually, this discussion, like many others, is being used to complain about the badwrong people who are adding information to Knowledge that's not formal enough.--Prosfilaes (talk) 19:15, 1 June 2008 (UTC)

One thing's for sure: Kushgag is notable now. Personally I would be wary of making accusations of behaving like a schoolchild if I had a signature like some scattered through this page. Almost-instinct 18:22, 1 June 2008 (UTC)

Centralized notability discussion

Was: AfD nomination of Amurn

An article that you have been involved in editing, Amurn, has been listed for deletion. If you are interested in the deletion discussion, please participate by adding your comments at Knowledge:Articles for deletion/Amurn. Thank you. Do you want to opt out of receiving this notice?

Devil's advocacy. JJB 17:25, 1 June 2008 (UTC)

Lulz CWii(Talk|Contribs) 17:59, 1 June 2008 (UTC)
That AFD's already been closed.--Aervanath's signature is boring 18:31, 1 June 2008 (UTC)
Thanks, but I believe the notability issue is a separate and distinct issue that needs separate discussion from support/withhold/oppose polling re the bot. Regarding the AFD, independent notability discussion is certainly not happening at Knowledge talk:Notability (Places and transportation), nor is there any impetus for it to happen in the same way the welcome notice generated this pile of discussion in one day. Notability has not been clearly established and an individual article chosen at random would face a challenging AFD if nominated on its own merits. For the sake of discussion, assume 10,000 articles have been created alphabetically, and an established user picks one for deletion asserting nonnotability. How would it go? The following is an expansion of User:Phlegm Rooster's excellent point above:
  • Article does not assert that village has a verifiable notation in multiple atlases.
  • Even if bot were reprogrammed to add Falling Rain, Blofeld's method of inserting Falling Rain into Amurn did not assert notability because it was not provided with a verifiable link.
  • Even if the bot were programmed to do better, two atlases should not be taken as automatically fulfilling the "multiple atlases" criterion for two million cases; it may well be literal, but is certainly treading on the spirit of WP:NPT.
  • By linking to one site which (if user enters search criteria correctly) states that village has at one time been a "populated" place, article fails to assert sufficiently that village is a currently populated place, because site provides no "as of" dates.
  • Even if site were shown to be current, no policy or guideline has been brought by anyone (I looked!) to show that all currently populated places are notable; WP:NPT does not assert "inherent notability", and there is no consensus for "inherent notability" in this discussion either.
  • By linking the name with only a dot, article fails to assert that the village has specific boundaries, instead of being a loosely defined neighborhood (which would fail WP:NPT).
  • Finally, by being unable to assert the population number itself, article fails to demonstrate that the population itself is sufficient enough to have been censused at any time.

Per User:Phlegm Rooster, a fresh bot article does not assert notability. I see that my attempt to get this discussion going properly will need me to consider deletion review and/or renomination of another article later, i.e., to be discussed on the village's own merits rather than as a representative of the two million villages. This would not be a case of pressing the WP:POINT; I believe my good-faith call for focused discussion is legitimate and supported by many other editors above and especially below. It would be gratifying to me if users would regard this question as just as worthy of polling as Pepys's question, and would treat it as a question of "delete", "merge and redirect", or "keep". Please consider that in your response. JJB 19:44, 1 June 2008 (UTC)

AFD-style comments

  • Merge and redirect as nonnotable, to appropriate national district or other subdivision, for the reasons above. JJB 19:44, 1 June 2008 (UTC)
  • Comment — Here is a comment on "inherent notability" which I left at Knowledge talk:Notability (Places and transportation)#Inhabited places: Regardless of whether it is the community's norm (and regardless of whether it is already done in other policies), I do not think it is a good idea to declare, merely by administrative fiat, that a broad class of topics is inherently notable. This amounts to an abdication of responsibility for the quality of the encyclopedia. Consider how notability can be proved absent such a declaration: given a topic, one consults first primary sources which document it, and then secondary sources which, in commenting on the primary sources, relate it to other ideas and provide the crucial aspect of notability: human commentary. This is why Knowledge is a tertiary source, merely weaving together the contents of secondary sources as a unified testament to human interest in a subject. The insistence on secondary sources is at the core of two of our core policies: no original research, and notability. By requiring that we rely on secondary sources, we force our articles to incorporate only documented facts and opinions, preventing us from being what we can never successfully be: a forum for the original publication of new ideas. But it also ensures that we only write about topics which have been demonstrated to matter outside themselves: that's what the existence of secondary sources (commentary) proves. That's the basis for our notability criteria. By declaring something to have inherent notability, we give license to circumvent secondary sources and, therefore, sacrifice true notability. Essentially, a permissive notability policy is original research. Ryan Reich (talk) 20:12, 1 June 2008 (UTC)
Everyone needs to read Ryan's comment above, for it most eloquently sums up what the core issue is here - human interest. A robot creating articles about places nobody cares about (and speculation about people who may or may not care about them but do not have internet access is a thought experiment, not a keep argument) based on a primary source is not asserting notability. - Merzbow (talk) 01:39, 2 June 2008 (UTC)

Sign up for the FritzpollBot bet

I for one bet that after one year of this bot's running, through examination of a statistically significant sample, less than 0.05% of its created articles will have been touched in any meaningful way. And by the time of the heat death of the universe, or when jimbo sells wikipedia to microsoft (well it's not like google needs a database of villages is it?), whichever comes sooner, I bet that less than 5% will have been touched. MickMacNee (talk) 18:44, 1 June 2008 (UTC)
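
For scale, here is a rough sketch of how large a random sample would have to be to check a figure like 0.05%. It assumes the simplest case: the sample is drawn uniformly at random and contains no meaningfully edited articles at all. The function names and the 95% confidence level are illustrative assumptions, not anything proposed in the discussion.

```python
import math

def zero_hit_upper_bound(n, confidence=0.95):
    """One-sided upper confidence bound on the true proportion of edited
    articles when a random sample of n articles contains none.
    Derived from (1 - p)**n = 1 - confidence; roughly 3/n at 95%."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n)

def sample_size_for(threshold, confidence=0.95):
    """Smallest sample size for which a sample with zero edited articles
    puts the upper bound at or below `threshold` (illustrative helper)."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - threshold))

if __name__ == "__main__":
    n = sample_size_for(0.0005)  # 0.05% expressed as a proportion
    print(n, zero_hit_upper_bound(n))
```

Under these assumptions the required sample runs to several thousand articles, and any edited article found in the sample would push the requirement higher.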

Wasn't there an old policy that said "avoid creating stubs"?

Whatever happened to that? - Chardish (talk) 18:49, 1 June 2008 (UTC)

On the contrary. WP:IA says: "If you do not have the time to write a full article, consider writing a "stub". Stubs are very short articles - generally just a few sentences. These are the "ugly ducklings" of Knowledge. With effort, they can grow into "swans"." Mangostar (talk) 18:52, 1 June 2008 (UTC)
No, I'm asking what happened to the old policy. I remember it being around when I got started, sometime in 2004 or so. - Chardish (talk) 18:56, 1 June 2008 (UTC)
I think you are misremembering. Christopher Parham (talk) 20:26, 1 June 2008 (UTC)

Suspend this discussion until WP:NPT is finalized

It seems to me that this discussion is not going to get resolved until consensus is formed on what Notability (Places and Transportation) will define as the standard of notability for localities. I'm not sure what the proper forum for debating that proposal is (probably WP:VPP or the proposal's talk page), but I think the discussion here should go on temporary hold until that issue gets resolved. I think it'll make the debate over here a lot less convoluted.--Aervanath's signature is boring 19:01, 1 June 2008 (UTC)

I'd have to agree here. Singularity 19:11, 1 June 2008 (UTC)
I disagree. There are far more issues here than the notability of the pages. Telling people to stop discussing and come back later is not at all conducive to any sort of productive discussion. Mr.Z-man 19:34, 1 June 2008 (UTC)
The editors are correct that notability is a separate and important issue and must be discussed separately. Discussion is not happening at WP:NPT, and this is WP:VPP. Independent WP:N discussion needs to be jump-started with the same strength that this initial discussion was. Please comment at Centralized notability discussion above. I think I can rely on even the esteemed Mr.Z-man to do so. Thank you. JJB 19:51, 1 June 2008 (UTC)
I guess that this discussion seems to be the vehicle for both the policy and the bot. Singularity 21:00, 1 June 2008 (UTC)

Page move

Yes, they did say millions: 2 to 3 million was the figure I saw. Why was this page moved to thousands? Scolaire (talk) 19:23, 1 June 2008 (UTC)

User:Blofeld of SPECTRE did it, obviously because it helps his case.--Prosfilaes (talk) 19:25, 1 June 2008 (UTC)
Clearly a POV move. He seems to be profligate on this page. MickMacNee (talk) 19:27, 1 June 2008 (UTC)
Is there a WP:Banging on and on page? Almost-instinct 19:28, 1 June 2008 (UTC)

Uh, can we slow down on this issue for a moment?

(ec) I just learned about this bot a few minutes ago, & have some feedback based on my own work in creating articles on every settlement in Ethiopia.

  • First, I have no problem with the idea of a bot creating place articles. I made a serious attempt at setting something like this up for my contributions to Ethiopia back in 2006.
  • However, I ended up discarding the idea, even for my small corner of Knowledge. I found that this only automated about 10-20% of the total work: creating articles with any useful content requires a lot of manual labor.
    • For example, with the 500-odd articles on individual woredas of Ethiopia (the equivalent of a county in the US or a civil parish in the UK), first I had to put the data in a structured form -- which took up the majority of my Knowledge-related time over several weeks. (This included not only removing duplicates, alternate names, & alternate spellings from the National Geospatial Intelligence Agency, but also finding & extracting population figures for these Ethiopian locations.) Then I had to compile a list of adjacent woredas in a useful order (here I followed Ram-man's recommendation for US county articles). Oh, & I also needed to proof the output of the script I was using to pull all of the structured data together; I had many cases of GIGO which I could not detect at the data-cleaning stage. It was after creating the first 20-40 articles that I found that having a bot add the content to Knowledge would not save me any appreciable amount of time, so I started adding the articles as I finished them. (And towards the end I discovered another source of information for these local units -- which was not in a structured or even standardized format -- which meant that the last 90-100 were the slowest part of this effort.)
  • Further, the data from the NGIS is not that reliable: it is a compendium of place names from a wide assortment of unattributed sources of uneven quality. A. B. quite convincingly proved this in the case of one village in Nigeria. In other words, although the bot would create hundreds of thousands of articles that Knowledge might not otherwise have, it will also create hundreds of articles that need to be deleted because the places they describe can be shown not to exist.
    • However, if the various WikiProjects affected were to organize reliable information into a structured format (e.g., into a spreadsheet) & pass this to the bot, this would produce the result I believe all of us want. They should be able to do this, since they should know what sources are reliable, & which are not. For example, I could deliver to this bot-master in the next day or two a spreadsheet containing names & population data on the 500-odd towns & villages which don't yet have articles. (I need the time to locate the actual file & verify that all of the settlements with existing articles are removed.)
  • Lastly, for some of these settlements, even if reliable information exists to show that they exist, to provide more information than that they exist -- even basic information such as geographical data, population figures -- will be difficult if not impossible. I speak from personal experience -- & frustration -- here. One invaluable source I have used is a compendium of local Ethiopian history at the Nordic African Institute; although it has provided me with information on dozens of towns and villages I would not otherwise have, this treasure trove is beginning to fail me on an increasing basis. In short, of the 750-odd settlements I can provide some verifiable information about, I can only write more than a brief, skeletal stub for a third of them; most of those I have not yet started an article about fall into this stub category. Even fewer articles have any reasonable hope of becoming a B-class or better article in the present conditions of the information available.
    • As a further note, for most of these articles I have been the only contributor. And in many cases, I have found obvious mistakes I made in writing them that have gone unfixed for over a year. Folks, the sound you hear me make is what the bottom of the barrel sounds like when one scrapes it.

As laudable as the goals of running this bot are, I have serious fears that running it without doing more research first will result in our needing to push through the deletion process many thousands of articles -- & I doubt anyone here will seriously argue that AfD can be trusted to accurately sort out which ones we should delete or keep. -- llywrch (talk) 19:34, 1 June 2008 (UTC)
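
The cleaning steps described above (collapsing duplicates and alternate spellings, attaching population figures, and setting aside anything that still needs a human check) might look roughly like the following sketch. It is not llywrch's script or the bot's code: the record layout, the function names, and the matching rule (case, accent and punctuation folding only) are all assumptions made for illustration, and real transliteration variants would need much more than this.

```python
import unicodedata
from collections import defaultdict

def normalize(name):
    """Fold case, accents and punctuation so simple spelling variants match."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return "".join(ch for ch in no_accents.lower() if ch.isalnum())

def merge_records(gazetteer, populations):
    """Group gazetteer rows by normalized name and attach population figures.

    gazetteer:   iterable of (name, lat, lon) tuples, possibly with duplicates
    populations: dict mapping a place name to a population count
    Returns (clean, needs_review). Rows without a population figure go to
    needs_review so an editor can check them by hand.
    """
    pop_by_key = {normalize(n): p for n, p in populations.items()}
    grouped = defaultdict(list)
    for name, lat, lon in gazetteer:
        grouped[normalize(name)].append((name, lat, lon))

    clean, needs_review = [], []
    for key, rows in grouped.items():
        name, lat, lon = rows[0]  # keep the first spelling encountered
        alternates = sorted({r[0] for r in rows[1:]} - {name})
        record = {
            "name": name,
            "lat": lat,
            "lon": lon,
            "alternates": alternates,
            "population": pop_by_key.get(key),
        }
        if record["population"] is None:
            needs_review.append(record)
        else:
            clean.append(record)
    return clean, needs_review
```

Even in this toy form, everything that lands in needs_review is manual work, which is consistent with the point that automation covered only a small share of the effort.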

I agree strongly with that. The reasons RamBot produced articles of at least minimally useful quality are that: 1) the data source was reasonably reliable (and referenced!); and 2) it actually contained enough data to produce an article with a minimal amount of (referenced) factual information. I think the best place to go from there would be to find similar data sources for other countries on a country-by-country basis, as you did for some of the subdivisions of Ethiopia. Just creating a bunch of articles with virtually no content, and questionable accuracy on what content is there, is not useful. --Delirium (talk) 20:09, 1 June 2008 (UTC)

Thanks for reporting your Ethiopia experience. Not encouraging for going worldwide bot-happy! Cheers, Pete Tillman (talk) 22:45, 1 June 2008 (UTC)

God I wish I could rate sections. — Dispenser 03:32, 2 June 2008 (UTC)

Motion to recess by the operator

I have found a series of extra sources that might alleviate many of the concerns here, or at least allow some ideas to be implemented. I have found a link to a vast amount of country census data at http://www.census.gov/main/www/stat_int.html and I need time to look through it all and see what information is available and in what format. This would give population data, which can either expand the articles proposed, or allow them to be restricted if there is consensus for such a restriction. I'm not going to do this right now, but I will look at it over the next few days - no deadline, right? That said, don't get your hopes up - some countries will probably not have the required data. Fritzpoll (talk) 20:36, 1 June 2008 (UTC)

This is the right direction to go, in my view. I found this 2007 article, which can be used to estimate the relation between minimum population size and number of articles. I've commented further on Blofeld's talk. The article doesn't contain the precise data we need, but based on it, I would guess that if we used a minimum population of about 1000 as a first approximation to the criteria for creating a stub using this bot, then the number of articles created would be between 100 and 200 thousand. With WikiProject input to ensure the stubs are reliable, this seems to be a reasonable target for a year's work. Dare I second the motion to recess? I think I do :-) Geometry guy 21:17, 1 June 2008 (UTC)
THANK YOU!! JJB 21:24, 1 June 2008 (UTC)
If even part of that is directed at me, you're welcome. I know it may have seemed like I was opposing your point of view at times, but I do passionately believe in the need for consensus. Give me a few days, and we'll resume discussion when the more salient points can hopefully be addressed Fritzpoll (talk) 21:27, 1 June 2008 (UTC)
I support a recess for this purpose. Adding further references would alleviate many concerns voiced here (by me and others). "Thank you" indeed! -- Rmrfstar (talk) 21:29, 1 June 2008 (UTC)
All I can say is finally we are making progress rather than this undesirable poll and conflict which doesn't do anybody any favors. From now on can we please discuss the best course of action rationally and how it could be done in the most efficient and productive way possible without what happened earlier. Thanks ♦Blofeld of SPECTRE♦ 21:46, 1 June 2008 (UTC)
Agreed. Give me a few days, and I'll resume discussion under a different heading here or on a new page Fritzpoll (talk) 21:48, 1 June 2008 (UTC)
Myself and Maxim have moved this discussion to a more neutral title. I hope this will stand, but I also hope that all editors will accept this motion to recess, and not contribute further to the divisive poll. Geometry guy 21:53, 1 June 2008 (UTC)
I like the idea of putting the discussion on hold. When there is significant objection to an idea, the original idea should be re-evaluated. Debate rarely accomplishes anything; the genesis of new ideas drives this project. - Chardish (talk) 22:57, 1 June 2008 (UTC)
If this had been the proposal all along, I don't think it would have gotten much opposition. But I suppose, if you didn't have the data you couldn't do it. But clearly there is no controversy that we need articles for villages over for example 10,000 in population. The only opposition is on the small end of the scale. So the obvious solution is to start at the top and work down. That will give more than enough time to work out any issues and won't create any controversial articles. Anyway, a recess works, but I would certainly recommend the top down solution as a way forward. - Taxman 02:33, 2 June 2008 (UTC)
Since the bot operator asked for a suspension of discussion (and discussion is really pointless without his input), can someone please remove the watchlist notice??-RunningOnBrains 04:55, 2 June 2008 (UTC)
Support recess Not much point in debating further until Fritzpoll decides how he wants to reconfigure the bot. Also, will give the community some time to come to a basic consensus on the proposed guideline Knowledge:Notability (Places and transportation), which is its own issue and should be treated as such.--Aervanath's signature is boring 05:36, 2 June 2008 (UTC)
Support recess. — Athaenara 06:30, 2 June 2008 (UTC)

Create wikiproject, use bot, create articles

  1. Determine order in which to create the articles (by country/region, maybe)
  2. Create a wikiproject prepared to deal with the thousands of new articles
  3. Create the articles over the period of several months/years
  4. Recruit dozens of users to use Google and fill up/verify the articles, thereby vastly improving the {{worldwide}} view on Knowledge
  5. Lather, rinse, repeat

Sounds like the right approach to me, when do we start? --NickPenguin(contribs) 22:39, 1 June 2008 (UTC)

Sounds like a plan! TreveX 01:05, 2 June 2008 (UTC)
Looks like someone beat me to it, Knowledge:WikiProject Cities already exists. Now's the time to join, since I'm sure this bot will be working closely with members of that project. --NickPenguin(contribs) 02:02, 2 June 2008 (UTC)

In progress: amended proposal

In the next 72 hours, I will post up a new, amended proposal. This will deal with the reliability of the sources, the use of multiple sources, and the inclusion of a wider dataset including population. This should address many of the concerns of those opposed or neutral to the bot (though not all) and allow us to set limits on the bot's operation. For now, please do not comment further on the current implementation - I will formulate a proposal and create a single new example article with the bot for discussion. I hope this is acceptable to everyone Fritzpoll (talk) 23:01, 1 June 2008 (UTC)

I would suggest running a dozen (or more) examples, as a fairer sample of the new proposal. No more Khanduds, please! Thanks and look forward to it, Pete Tillman (talk) 23:29, 1 June 2008 (UTC)
May I suggest to limit the uploads to the localities already linked from (a certain number of) other articles, or at least mentioned elsewhere in Knowledge? This would certainly have a lesser impact on systemic bias (if any at all), but such sub-stubs are much more likely to be taken care of in the future. Colchicum (talk) 23:48, 1 June 2008 (UTC)
Can we make sure the Encarta link is removed, please? Neıl 10:33, 2 June 2008 (UTC)

Suggestion

I'd like to bring up the example of a potential alternate approach to this.

Although it hasn't been entirely free of controversy, the emerging Canadian practice in recent months has been to move away from necessarily giving every geographic entity an article just on the basis that it exists, and toward prioritizing incorporated municipalities, with smaller settlements being redirected to their parent municipalities until such time as properly referenced articles can be written about the individual communities as separate topics. In the Canadian situation, for example, a documented census population figure can always be found from Statistics Canada for actual municipalities, whereas it's difficult at best to find a properly sourced population figure for an unincorporated place.

I'd also call attention to the fact that back in March, a user went through List of communities in Ontario and created well over a hundred unreferenced boilerplate stubs about communities — see Conway, Ontario for just one example of what they look like — that s/he didn't even categorize except for the {{Ontario-geo-stub}} template. So finding and fixing them has been an exhausting process of going through each individual article in the stub category to review whether it's one of those or not, figuring out which census division it's actually in (all places in Ontario are subcategorized by their county), doing a web search to determine whether there are valid sources out there or not, redirecting it to a parent municipality if the sources aren't there, and then ducking the arrows of people who think that redirecting an unreferenced stub is some kind of mortal crime against Holy Wikidom, because it's all about the number of articles we have on here, so who gives a crap about their actual quality?

There are still a lot of incorporated municipalities in Canada which don't have their own articles yet (the number of redlinks in List of municipalities in Quebec alone is dauntingly outrageous!), and the Canadian contingent would absolutely welcome bot assistance in getting that rectified. (In fact, we've discussed how to go about doing that very thing in the past, but it never came to fruition.) However, precisely because of the need to balance that with issues like WP:RS, I'd prefer that any bot helping out with Canadian places concentrated on incorporated cities, towns, villages and townships, and left the unincorporated hamlets alone for a human editor to determine whether the topic can stand alone as an article or not. A bot can easily go to the Canada 2006 Census data website, extract objective data and fill it into an infobox. A bot can easily do a quick Google crawl to insert a town's or city's website as an external link. But I'd prefer not to see the bot do anything where potential sources have to be evaluated for quality beyond a straight data dump.

So could that be a potential avenue for compromise here? Have a bot assist on places that actually have legal municipal status as cities, towns, villages, etc., and leave unincorporated places out of it for now? Bearcat (talk) 23:35, 1 June 2008 (UTC)

This sounds like a great idea. Unincorporated places can exist as content within the larger regional/city article, and when they prove to have enough content to warrant splitting, they can be split. I don't think anyone will deny that countries should have articles about major cities and regions; it just seems to be the lower end that some disagree about. --NickPenguin(contribs) 23:48, 1 June 2008 (UTC)
I strongly agree with you and I know what you're talking about, but the problem is that other countries don't have systems like the US and Canada. That's where the population question comes in. Reywas92 00:05, 2 June 2008 (UTC)
I understand that, but I also don't know if a universal population cutoff is the right answer, either. It also sounds to me like some people here think that we're talking about the bot just suddenly doing a mass run through two million new articles in the space of a few days, when it's obviously not going to be that quick. So I'd like to suggest another corollary to my original idea: instead of doing a mass bot run, allow this bot to do one country at a time, based on whatever standards and sources are appropriate and necessary for that particular country. For countries like Canada and the US, for example, we should certainly let the bot loose on any redlinked place, regardless of population, that is legally incorporated as a city, town, village or other kind of municipal government. For countries with different systems of local governance, different standards will likely have to be used based on local needs and the accessibility of local sources. But it might have to be that we give the bot a cutoff of 50,000 people in one country, 10,000 in another, no lower limit in a third, 25,000 in a fourth, 500 in a fifth, don't touch country #6 at all, etc., because I don't know if using one universal bottom number is going to suit every country's individual needs. Bearcat (talk) 00:20, 2 June 2008 (UTC)
This is why I think creating a WikiProject to guide the rate of article creation would be advisable, because standards relevant to particular regions can be created when they are needed, which certainly won't be all at once. --NickPenguin(contribs) 00:32, 2 June 2008 (UTC)
I'd support that entirely. And while I don't have the programming skills to create a bot myself, I'll definitely chip in to provide whatever sources and links would be needed to do a Canadian run — I'm so eager to find a less exhausting way to get all those redlinked Quebec paroisses done. And even if this proposal doesn't result in anything coming to fruition, I would still be interested in working with a bot programmer to set a Canada-specific geobot loose on Canadian municipal redlinks, if at all possible. Bearcat (talk) 00:40, 2 June 2008 (UTC)
The plan was to do it on a country-by-country basis, not as a "mass run" as you described, Bearcat. So after all the disambiguation was completed and all the articles were understood to be separate places, the bot was then to run. Humans will then check each article to make sure there are no errors and then move on to the next country. Cheers. Calaka (talk) 02:15, 2 June 2008 (UTC)
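
The per-country standards Bearcat describes above could be expressed as a small rule table that the bot consults before creating anything. What follows is only a minimal sketch under stated assumptions: the names `CountryRule`, `RULES` and `eligible` are hypothetical, the placeholder countries and their cutoffs simply echo the illustrative numbers in the comment, and nothing here reflects the actual FritzpollBot code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CountryRule:
    """Hypothetical per-country standard agreed by the relevant WikiProject."""
    min_population: Optional[int] = None  # None means no lower limit
    require_incorporated: bool = False    # True means legal municipal status is required


# Illustrative values only; a country missing from the table is left untouched.
RULES = {
    "Canada": CountryRule(require_incorporated=True),
    "Country A": CountryRule(min_population=50_000),
    "Country B": CountryRule(min_population=10_000),
    "Country C": CountryRule(),  # no lower limit
}


def eligible(country: str, population: Optional[int],
             incorporated: bool, page_exists: bool) -> bool:
    """Return True if a redlinked place passes its country's rule."""
    if page_exists:
        return False  # only redlinks are candidates
    rule = RULES.get(country)
    if rule is None:
        return False  # no agreed standard yet, so do not touch this country
    if rule.require_incorporated and not incorporated:
        return False
    if rule.min_population is not None:
        # A place with no sourced population cannot clear a numeric cutoff.
        if population is None or population < rule.min_population:
            return False
    return True
```

Keeping the rules in data rather than in code would let each country's standard be changed after discussion without touching the bot itself.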

A wikiproject set up for a project on this huge a scale would be advisable. When I initiated the proposal I wasn't aware how many would respond to it like this ♦Blofeld of SPECTRE♦ 09:18, 2 June 2008 (UTC)

I mentioned it above, but I'll say it here too: Looks like someone beat me to it, Knowledge:WikiProject Cities already exists. Perhaps this would be best as a task force under that project, and if this idea is really flying, perhaps we could put up a site wide message requesting help to verify these newly created articles. --NickPenguin(contribs) 11:40, 2 June 2008 (UTC)

Redirect stubs

Redirects by their nature hide all content after the redirect. Instead of leaving this space blank, we could include the stub information that the bot would've provided. That way, if somebody wants to change the redirect into an article, they would already have a good starting point. — Dispenser 04:38, 2 June 2008 (UTC)

More thought required

Strongly oppose as it stands: There is no point in just replicating a list from one data set into another, even if it changes format. This adds no value. Will increasing the article count in en:Knowledge by 2,000,000 very basic micro stub articles fill it with 2,000,000 more pieces of encyclopedic information? IMHO, no, it will not. It would be much better, for example, to automatically add content to geo categories along the lines of "If the location is missing in the list of pages/articles below, see the (external) list". Adding quantity does NOT add quality.

Creating redirects to lists / tables of the scant data might be a better idea too, (suggested elsewhere in this discussion).

Someone also needs to check out possible copyright violations. Some (many?) publicly available, freely usable data sets cannot be reproduced in their entirety without permission? And this is just a straight full reproduction. (Note that full reproduction is different from use / processing in full.)

I would support automatically generated (non micro) stub articles (but preferably start class articles) if they could add some value over their source data, for example combining two or more sets of source data into the one article. This does add value for the reader. Also, it is then NOT just a full reproduction (of something readily available elsewhere).

Another problem might be monitoring vandalism. "Normally created" articles usually have people watching them: those who created or edited them. While the bots can pick up some vandalism, who will monitor millions of micro stub articles for vandalism, spamming, etc.?

Will the bot trawl for changes to the dataset/s it used to generate the micro stubs, and update the stubs?

Should automatically generated articles be in a new category hierarchy of some form showing their automatic status, then manual update status?

Peet Ern (talk) 03:20, 2 June 2008 (UTC)

See Knowledge:Bot_requests/Archive_20#Some type of GeoBot for discussion on copyright. — Dispenser 04:52, 2 June 2008 (UTC)

What about regional considerations?

I am wondering if this bot will take into account the different standards for articles in different areas? For example, in Australia, the infobox is always Infobox Australian Place, and of course IAP has its own rules and regulations as well, including the use of metric values and the non-use of maps for example. It would be very annoying if a bot was to go through and add many Australian articles that used a different infobox, and while it is a developed nation, Australia has many towns that are not yet on wikipedia. (see: List_of_towns_in_the_Adelaide_Hills for a few) --TheJosh (talk) 03:38, 2 June 2008 (UTC)

Took you long enough

I patrol Special:Newpages and I've seen Blofeld of SPECTRE create thousands of geography location stubs. It's a pain in the ass trying to see through them to find other articles to patrol. I'm glad you guys are finally using a bot to do this, it makes it a hell of a lot easier for the rest of us. --Hemlock Martinis (talk) 03:57, 2 June 2008 (UTC)

Recommend doing this in phases

I recommend doing this in phases based on data source and cutoff sizes. For example, if the data source is India's census bureau, start with articles for places with more than 10,000 people. Then go back and check the results before doing the smallest communities. Then do places with more than 1,000 people and spot-check those results. When you get done with the India census bureau results, go on to the next data source. Pause a few hours between runs to allow time for editor review and tweaking. Pause for a day or two at the 25%, 50%, and 75% mark and as needed to allow incorporating "lessons learned". Also, be sure to give the New Article Patrol people a heads-up so they can ignore bot-created articles if they want to.

Assuming we do this in at least 2 "chunks" per data source and have at least 100 data sources, that's 200 chunks averaging 10,000 articles each. If we average 1 chunk per day including several hours of time for review, that's 200 days, which isn't unreasonable for a task of this size. Is it possible or desirable to have this work done at the back-end, to lessen the load on the servers and speed things up a bit?

davidwr/(talk)/(contribs)/(e-mail) 04:16, 2 June 2008 (UTC)
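
The phasing and the back-of-the-envelope estimate in the comment above can be checked with a short sketch. This is an illustration only: the function names `split_into_phases` and `estimate_run` are hypothetical, and the default figures (100 data sources, 2 chunks per source, 10,000 articles per chunk, 1 chunk per day, cutoffs of 10,000 and 1,000 people) are the ones quoted in the comment rather than agreed limits.

```python
def split_into_phases(places, cutoffs=(10_000, 1_000)):
    """Split (name, population) pairs into phases, largest places first.

    One phase is produced per cutoff ("more than 10,000 people", then
    "more than 1,000"), followed by a final phase holding the remainder.
    """
    remaining = list(places)
    phases = []
    for cutoff in sorted(cutoffs, reverse=True):
        phases.append([p for p in remaining if p[1] > cutoff])
        remaining = [p for p in remaining if p[1] <= cutoff]
    phases.append(remaining)  # smallest communities, done last
    return phases


def estimate_run(data_sources=100, chunks_per_source=2,
                 articles_per_chunk=10_000, chunks_per_day=1.0):
    """Reproduce the arithmetic: chunks, total articles, and days needed."""
    chunks = data_sources * chunks_per_source
    return {
        "chunks": chunks,
        "articles": chunks * articles_per_chunk,
        "days": chunks / chunks_per_day,
    }
```

With the default figures, `estimate_run()` gives 200 chunks, 2,000,000 articles and 200 days, matching the numbers in the comment; pauses for review at the 25%, 50% and 75% marks would add to that total.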

This is exactly what we intend to do. Before we had a chance to explain, suddenly a poll was set up and people who know nothing about the proposal started jumping to conclusions that millions of useless stubs would be made overnight. This is a far cry from what will actually happen. We will use as much data as is available and plan each country so we get it right the first time, working several thousand at a time in chunks in a coordinated fashion, rather than unleashing a perma-sub-stub creating monster non-stop until suddenly we have 4 million articles ♦Blofeld of SPECTRE♦ 09:13, 2 June 2008 (UTC)
Actually, people jumped to the conclusion that the example article was a fair example of what was going to be created and that the claimed number of articles was going to be created in a reasonably short period of time (up to and including months).--Prosfilaes (talk) 10:37, 2 June 2008 (UTC)
Yeap, and it has few rewards. We can't elaborate that many villages we don't know that fast. Just take this bit by bit.Ruennsheng (Talk) 04:53, 2 June 2008 (UTC)
It could also give admins the chance to see how many of these articles are going to show up on Special:Unwatchedpages and stay there. John Nevard (talk) 07:30, 2 June 2008 (UTC)

Images

Hm... this bot would not only add millions of articles, it would also add a similar number of millions of images of maps to illustrate those articles. While I'm not sure that's a problem, I haven't seen it mentioned. — PyTom (talk) 06:58, 2 June 2008 (UTC)

Not really, just one per country would be sufficient, and these mostly exist. The markers are added as overlays. — Andrwsc (talk · contribs) 07:03, 2 June 2008 (UTC)
If you look at the example, I think it's a standard national image with an HTML overlay. Pseudomonas(talk) 07:04, 2 June 2008 (UTC)

Regional extra data sources

Different countries/regions will have different census data (of varying availabilities), political data (if the locality is a political constituency), linguistic data such as place names in different languages and alphabetizations, perhaps mapping data whose licensing varies from place to place, and various other numerical and textual information that will not be uniformly available across the globe, and may not even be applicable to other countries. Using this information will increase the value of the articles (and of Knowledge) significantly, since it's quite possible that it may not be synthesized in any other places.

I would support collaboration with the local wikiprojects, down to as fine-grained a level as is practical, to collate the sources. I would strongly prefer this to be done before the creation of the articles rather than after, as the process will inevitably lead to some decisions to rename/uninclude.

Given this, I'm in favour of creation of pages for large places or places with large amounts of supplemental data, and merged articles (and redirects) by area for small villages with little supplemental data.

With regard to the vandalism concerns (an area of interest of mine), I think in general the present procedures will apply as to other small and little-examined articles. I would like to canvass ideas on how to reference the articles to make number-change and similar subtle vandalism easier to check (following a single link to check a number is easier than searching government site and digging through PDFs, for instance). I'd prefer "Population in the 2002 Foobar regional census" to "Population" in an infobox heading, for similar reasons. Pseudomonas(talk) 06:52, 2 June 2008 (UTC)

Why not begin with Commonwealth or English-speaking countries?

If there's a Knowledge article on a US town with less than a hundred people, there is a pretty good chance that at least a few of those people will search for their town and come up with the Knowledge article and make an attempt to improve it. They are also certain to have a high level of fluency in English. If they become involved in Knowledge, they're going to have the capability to greatly improve articles.

On the other hand, while access to computers is a limiting factor on whether people in less developed countries are able to improve Knowledge, English language proficiency is going to be a higher barrier in the way of writing coherent articles. With that in mind, restricting the explosion of small town articles to those countries where residents are equipped to contribute seems sensible. John Nevard (talk) 07:44, 2 June 2008 (UTC)

Beginning with countries that have large and active Wikiprojects could also be a good idea. I feel it ought to be a case of "these first" rather than "these only", to do otherwise would really compound the systemic bias. Pseudomonas(talk) 08:34, 2 June 2008 (UTC)

The answer is that the focus of the bot really is away from English-speaking countries, in order to cover parts of the world where coverage is grossly uneven, e.g. the third world. But English-speaking countries will all be covered eventually ♦Blofeld of SPECTRE♦ 09:12, 2 June 2008 (UTC)

Perhaps the focus of the bot needs to be rethought -- especially since the bot's sample product so far doesn't look very useful. Cheers, Pete Tillman (talk) 18:46, 2 June 2008 (UTC)

Keep in mind

Sometimes the line between cities/towns/villages/hamlets and counties is difficult to distinguish, particularly in China. Also, we need stubs made for all of China's counties (as was done recently in bot-assisted fashion for all of Vietnam's counties, which worked quite well!). Badagnani (talk) 09:55, 2 June 2008 (UTC)

Also keep in mind that for China, many dabs will need to be made, as there are many settlements with the same names (and many settlements have had names that have changed, as well as informal place names that don't appear in sources, etc.). It can become a nightmare. Badagnani (talk) 10:02, 2 June 2008 (UTC)

I think it may be less of a nightmare to do it all at once, with discussion about transliteration schemes, history, informal naming, and so on, rather than have each article named (as approximately at present) as what seems good to the creator of that article at that time. Later editors adding content won't have to worry about it so much. Pseudomonas(talk) 11:29, 2 June 2008 (UTC)

Encarta spam

When this was raised elsewhere (here), the feeling was the MSN Encarta link is inappropriate spam. Particularly as it does not link to a specific location on Encarta, rather it links to the front page. Two million pages on Knowledge all with a direct link to MSN Encarta's front page? No thanks. At least the Maplandia one goes to a more defined destination.

The infobox already has a direct link to geohack; this is both sufficient, and better. I am strongly opposed to this bot running unless the Encarta link is removed. Neıl 10:38, 2 June 2008 (UTC)

I only just logged in, only just responded there. Give me a chance to respond in one place before racing off all over the place, please Fritzpoll (talk) 10:40, 2 June 2008 (UTC)

If wikipedia had a decent atlas the link wouldn't need to be there. It can be taken out ♦Sir Blofeld ♦ 10:45, 2 June 2008 (UTC)

I was just coming back here to scratch this given Fritzpoll has indicated it will be removed. Thanks guys. Neıl 10:51, 2 June 2008 (UTC)
Might be worth considering getting wikipedia a similar atlas though, The mini atlas is useful but a far cry from a decent atlas encyclopedia you see in books ♦Sir Blofeld ♦ 10:55, 2 June 2008 (UTC)

Reliability?

How are we to judge the reliability of our sources for places outside the US? Google Earth has had a lot of problems with toponyms that had been out of use for decades, like Adolf-Hitler-Berg. Speaking about Mongolia (again), it is not so uncommon that settlements are relocated, and the NGA database can apparently have a lag of some decades, too. A search for "bürentogtoh" at http://geonames.nga.mil/ggmagaz/geonames4.asp gets you coordinates that roughly match this place, which was abandoned at some point in the 70s or so. The settlement still exists, just that it is now located about 15km further north, at a place formerly (?) (until the late 1930s?) known as Zagzuu Dugang. Yaan (talk) 10:51, 2 June 2008 (UTC)

For a similar case, see Tariat (in Arkhangai). It must have moved in the late 1950s; Owen Lattimore mentions this in his Nomads and Commissars, which appeared in 1962. Yaan (talk) 11:02, 2 June 2008 (UTC)
You can never be 100% certain of names, but we certainly try to work with the wikiprojects such as Mongolia as much as possible and ensure we have everything sorted in advance with names, sources, and articles ready for starting. When we get around to Mongolia we will fully discuss the best course of action with you, Late bird and Bogotemolv and the project as with other wikiprojects before articles are implemented. Thank you ♦Sir Blofeld ♦ 11:05, 2 June 2008 (UTC)
My question was not specifically about Mongolia. I just chose Mongolia as example because that is a place where I am able to point out the problems. Yaan (talk) 11:11, 2 June 2008 (UTC)

Will the bot also faithfully create articles for all those places that now reside at the bottom of lakes after the Sichuan earthquake? There seems to be no plan for the dynamic nature of settlements, which again shows this bot up as a useless exercise in creating articles that will never be touched, just because we can. MickMacNee (talk) 11:52, 2 June 2008 (UTC)

And naturally of course you can predict every edit that will be made in the future of wikipedia? If the articles are never touched has it not occurred that a bot could also remove them if needs be later if they don't develop? There seems to be some idea going that articles are rigidly stuck in place for ever. This is a wiki, which means fast, also referring to how "fast" articles can develop. Ernst Stavro Blofeld 11:57, 2 June 2008 (UTC)

Archive, please!

Can we archive this page (I lack the skills to do so) so that we can start afresh with my refined proposal? Since it is substantially different, I'd rather people didn't get confused? Fritzpoll (talk) 12:05, 2 June 2008 (UTC)

The above discussion is preserved as an archive. Please do not modify it. Subsequent comments should be made in a new section.

Why was this archived?

I fail to understand the proffered justification for the archiving of the above, and the proposal of a "refined" proposal.

As can be seen from both the "archived" proposal above and the discussion on it, the proposal had huge support.

The present proposal is terribly cumbersome; I gave up reading through it and don't plan to support it though I supported the initial one. Though continuing discussion is always fine, and revised proposals should be proffered if the initial one does not have wide support, I think it is ridiculous that the whole proposal was archived in this particular case. And that yet further proposals, discussion, and straw polls are needed all over again, as if starting from zero.

A strong majority of people agreed that the initial proposal would have been a fantastic addition to Knowledge. I propose ditching the current discussion and going with it. Dovi (talk) 13:35, 4 June 2008 (UTC)

9-4 is not great support; whatever you feel about the new proposal, it's better supported by consensus than this one. Keeping this discussion open when a new proposal was floated would merely have created more heat than light.--Prosfilaes (talk) 15:36, 4 June 2008 (UTC)
