User talk:Matilda/Archive 2

From Simple English Wikipedia, the free encyclopedia

Note this is an archive - please don't add comments here but start a new thread on my current talk page


Adminship?

Hello Matilda,

I've seen you actively editing here at Simple English Wikipedia for quite a while now. I've looked through most of your contributions, and I must say I am very impressed. You do a lot of work in mainspace, you have very good judgement, you are very civil in discussion, you revert vandalism, you tag articles for deletion (not as many as I expected, but they're all accurate!), you work in various places, and you always use edit summaries (and you use them well). Obviously your experience from en.wikipedia and Commons has helped you here. You've done a lot to help Simple English Wikipedia, and the community appreciates it very much. Fantastic!

Now that I've got that off my chest... let's get to the point of why I said all that. I came by to ask you something: Are you interested in adminship here at Simple English Wikipedia? You probably already knew I was going to ask you that from the title of this header, but still. Anyway, I think you'll make a fantastic administrator for Simple English Wikipedia. I don't see any instance where you will abuse the tools. You as an administrator would only benefit Wikipedia.

I realize you haven't yet hit the three month mark of activity that is suggested for adminship at WP:CFA, so I'll be asking again in about two weeks or so, that is, if you accept my offer. For other information on my past nominations, refer to User:RyanCross/RfA_noms. If you have any questions, just ask. If you decline my offer to have an RfA in a few weeks, that's no problem at all. I'll understand completely.

Have a nice day. — RyanCross (talk) 09:28, 11 November 2008 (UTC)

I'd like to add that I too think you would make a good candidate for adminship. I know we did have our run-ins but I think we've worked them over and we together have built a very workable DYK process :D. If Ryan agrees, I'd like to actually co-nominate you as I feel that you would make a good candidate. And I will of course support regardless of whether I co-nom or not ;). — This unsigned comment was added by Bluegoblin7 (talk • changes).
Of course you can co-nominate, Bluegoblin! Matilda, if you accept, I suggest having only 2–3 nominators. — RyanCross (talk) 09:59, 11 November 2008 (UTC)
Cheers Ryan ;). Matilda: Ping me if you accept Ryan's nom and would like me to co-nom you. I'll probably see it anyway (I patrol RecentChanges when I'm online and when I log on I look at RC back to my last edit). Cheers, BG7even 10:04, 11 November 2008 (UTC)
  • Thank you both for your kind words. My initial response was that I am very happy being an active editor and if I want to "play" with the tools I should help out more at enwp. On the other hand, yes, at the appropriate time (ie after 3 months of active editing), I am happy to accept additional tools if the community agrees. They may come in useful, as sometimes I am editing at a time of day when no other admin is around. Although I have had disagreements with several editors, I don't wheel war and don't intend to start! I hope my disagreements are seen to remain within the bounds of civility / our guideline on being kind.--Matilda (talk) 23:02, 12 November 2008 (UTC)
I'd like to be the third co-nominator.--   ChristianMan16  23:12, 12 November 2008 (UTC)
  • Very well then. I'm happy you have accepted. I'll alert you when the time is right (ie after three months of active editing). That would be in about two weeks, or I might wait a tad longer. Your choice. — RyanCross (talk) 01:16, 13 November 2008 (UTC)
I will leave the timing to your judgment and convenience. Regards and thank you --Matilda (talk) 02:49, 13 November 2008 (UTC)


Reading ease

Hello there, I just wanted to drop a little comment regarding GAs and VGAs. All of those articles underwent a strict community review, which is not based on syllable and word counts. If you want a Flesch-Kincaid (published paper dates from the mid-1950s) score that is strictly lower than 8.0, this means that we would need to look at almost all VGAs again (except for the Chili Peppers, Spurgeon, and Hendrix) and that half the GAs (Warner, Caffeine, Muhammad, Pipe organ, Lenzburg, Graham, Baseball uniform, the Mouthpiece, and the Premier League) would need revising. I don't know, but in the science I come from, we are told not to use material that is more than 5 years old; you try to tell us that something that is over 50 years old is still up to date?

We are currently about 30 regular editors; less than half participate in the VGA/GA process. Any ideas how we can handle this? (And especially, before we embark the project on such a mission, we should have very good reasons to do so; we currently don't)

Just wanted to bring those things up. And yes, I can supply you with the statistics; send me a mail so I have your address. --Eptalon (talk) 10:53, 15 November 2008 (UTC)

A chainsaw is better than an axe for cutting down trees (though even chainsaws are relatively old technology - see en:Chainsaw#History), but we don't have the money for the chainsaw approach to readability. Axes are still used even though they are ancient, and so are wheels. I don't think the tools are too out of date as they haven't been replaced with something better and cost-effective. We haven't found a tool better than the wheel either. In your thinking, please remember I am not proposing it is the only assessment of an article's usefulness. Nor am I proposing articles be deleted. I am suggesting though that until they are in line with the project aims of simple English, they are not promoted on the main page. I will email you. --Matilda (talk) 21:45, 15 November 2008 (UTC)
In reply to "We are currently about 30 regular editors; less than half participate in the VGA/GA process. Any ideas how we can handle this? (And especially, before we embark the project on such a mission, we should have very good reasons to do so; we currently don't)": I think all new VGA and GA reviews should take readability into account. You already have stats on the older VGAs, and we should review those for improvements in readability, starting with those which have the poorer scores. In the first instance I propose collaboration at ST. Ultimately I would suggest demotion if the articles do not fit within the aim of this wikipedia. As said numerous times elsewhere, this wikipedia's defining characteristic is simplicity of language, and readability scores are a measure (not the only measure) of that. Poor readability scores need to be justified - for example Oklahoma scores badly because of the number of syllables in the state's name - you can actually test to exclude that. --Matilda (talk) 21:50, 15 November 2008 (UTC)
Automatically determining how "readable" an article is, or how much "trouble" a certain target group would have understanding the article is extremely difficult to do. The methodologies you propose have two "critical" failures:
  • They neither look at the grammar of the sentence (is the word I am currently checking a verb, a noun, or an adverb?) nor do they do a lexical analysis (ever tried searching google for "the"?) nor a semantic one (what does the word actually mean?). All they have to rely on is a method of splitting words into syllables, and counting the number of syllables per word, and the number of words per sentence.
  • I am sure there are other methodologies (dating from the 1980s and 1990s, at least) that could be used to automatically determine readability; these would solve at least some of the issues.
So rather than using methods that are over 50 years old, I would not use such methods at all. We do have capable editors who are quite easily able to tell you if something is hard to understand. --Eptalon (talk) 22:34, 15 November 2008 (UTC)
Regarding the (V)GA reassessment: we cannot judge the articles that currently meet the criteria with other rules than those that don't. --Eptalon (talk) 22:34, 15 November 2008 (UTC)
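For reference, the surface counting described in the first bullet above (syllables per word, words per sentence) can be sketched roughly as follows. This is only an illustration: the vowel-group syllable heuristic, the function names and the sample sentence are assumptions, not the method used by any particular tool mentioned in this thread.

```python
import re

def count_syllables(word: str) -> int:
    """Very rough heuristic: one syllable per run of vowels (assumption).

    It overcounts words with a silent final 'e' (for example 'state').
    """
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def surface_counts(text: str) -> dict:
    """Return the only inputs the syllable-based formulas look at."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return {
        "sentences": len(sentences),
        "words": len(words),
        "words_per_sentence": len(words) / len(sentences),
        "syllables_per_word": syllables / len(words),
    }

# Hypothetical sample text, not taken from any article revision.
sample = "Oklahoma is a state. It is in the United States."
counts = surface_counts(sample)
print(counts)
```

Nothing in this sketch looks at grammar, word meaning or word frequency, which is the limitation the bullet points out.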
I think we have to disagree on whether old methods are better than no methods. I would be happy to replace old methods with newer methods. I am not happy to not use any method because it is too old.
I don't think a readability score is the only thing but a high readability score needs to be explained as to how the article then complies with the guideline.
For example, Oklahoma was recently promoted to VGA - see Wikipedia:Proposed_very_good_articles/Archive_4#Oklahoma. In the discussion I focussed on readability. I think the work we did to improve readability, including the discussion at Talk:Oklahoma, improved the article for this wikipedia. Obviously the scores were not the only thing in revising the article. During these 37 edits, CPacker and I added material as well as improving readability. I think the article is significantly more suitable for our intended audience at SimpleWP than before. If you score Oklahoma with a single-syllable substitute name (eg Oak) then readability improves - no surprise. A similar method could be used for other reviews to separate out improving readability from things that cannot be changed, such as using the state's name in an article about the state.
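The substitution idea in the paragraph above could be tried along these lines. This is a minimal sketch under stated assumptions: the syllable heuristic, the helper names and the sample text are hypothetical, and the placeholder "Oak" is simply the one suggested above.

```python
import re

def count_syllables(word: str) -> int:
    # Rough heuristic (assumption): one syllable per run of vowels.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def syllables_per_word(text: str) -> float:
    words = re.findall(r"[A-Za-z]+", text)
    return sum(count_syllables(w) for w in words) / len(words)

def substitute_topic(text: str, topic: str, placeholder: str = "Oak") -> str:
    # Replace whole-word occurrences of the article topic only.
    # Derived words such as "Oklahomans" are left alone in this sketch.
    return re.sub(rf"\b{re.escape(topic)}\b", placeholder, text)

# Hypothetical sample text, not an actual article revision.
sample = "Oklahoma is a state. Oklahoma City is the capital of Oklahoma."
before = syllables_per_word(sample)
after = syllables_per_word(substitute_topic(sample, "Oklahoma"))
print(round(before, 2), round(after, 2))
```

Scoring both versions separates the part of a poor score that comes from the unavoidable topic name from the part that editing can actually change.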
I think VGAs should meet policy and guidelines. One of our guidelines is that the articles be written in Simple English. That articles have not been reviewed against guidelines in the past is no reason not to do so now. I think Wikipedia:Requirements for very good articles probably needs reviewing. At the moment a featured article on enwp transwikied here with redlinks eliminated would pass - surely this would not be acceptable. It would not be acceptable in my view because this is not just another wikipedia in English; it is a Wikipedia in Simple English.
Any review needs to be carefully handled so as not to offend editors.
Note enwp has extensively reviewed featured articles that no longer meet standards although they did in the past. It is a wiki!
I don't really feel the VGA review comments and !votes are often against criteria - they seem to be often based on sentiment. We need to start providing some guidance on looking at criteria for expressing an opinion.--Matilda (talk) 00:41, 17 November 2008 (UTC)

(outdent) There must be no templates pointing to the fact that the article needs improvement. These templates include {{complex}}, {{cleanup}}, {{stub}}, {{unreferenced}} and {{wikify}}. The article also should not need them. (Item 9 of the VGA criteria, Item 8 of the GA criteria) --Eptalon (talk) 16:39, 17 November 2008 (UTC)

Part of the issue is when articles should be tagged as {{complex}} - there is not necessarily agreement on this and I think it is this agreement I am trying to get to. But at least thank you for pointing out that the VGA criteria do include readability, even if a little obliquely.--Matilda (talk) 20:45, 17 November 2008 (UTC)

Reading ease part 2 - suggested tools?

So: if we rely on an automated tool for assessment, we should use current (<10 years) methodology ... which then would you suggest?

  • Dale-Chall [1] revised in 1995 (13 years old)

    The New Dale-Chall readability formula calculates the U.S. grade level of a text sample based on sentence length and the number of unfamiliar words. Unfamiliar words are ones that do not appear on a specially designed list of common words that are familiar to most 4th-grade students.
    The original familiar word list was only 763 words; however, Professors Chall and Dale extended this list to 3,000 for the revised version of this formula in 1995. Note that the revised version of this formula is what Readability Studio uses.
    Because this formula is based on the usage of familiar words (rather than syllable or letter counts), it is often regarded as a more accurate test for younger readers.
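    To illustrate the word-list idea described above, here is a minimal sketch of how the share of unfamiliar words might be computed, with an option to accept extra words (such as the article topic) as easy. The tiny word set below is a hypothetical stand-in for the real list of about 3,000 words, and the function name and the lack of any handling for word endings are assumptions.

```python
import re

# Hypothetical stand-in for the familiar-word list (the real one has ~3,000 entries).
FAMILIAR = {"is", "a", "state", "in", "the", "part", "of", "it", "has", "large", "city"}

def percent_unfamiliar(text, familiar=FAMILIAR, accepted=frozenset()):
    """Return (percentage of words not on the list, the unmatched words).

    `accepted` holds extra words the reviewer chooses to treat as easy.
    No stemming is done, so 'states' does not match 'state' in this sketch.
    """
    words = [w.lower() for w in re.findall(r"[A-Za-z]+", text)]
    known = set(familiar) | {w.lower() for w in accepted}
    unmatched = [w for w in words if w not in known]
    return 100.0 * len(unmatched) / len(words), unmatched

sample = "Oklahoma is a state in the southern part of the United States."
pct_all, missed_all = percent_unfamiliar(sample)
pct_topic_ok, missed_topic_ok = percent_unfamiliar(sample, accepted={"Oklahoma"})
print(round(pct_all, 2), pct_topic_ok)
```

    Because the score depends on a vocabulary lookup rather than on syllable counts, accepting a word changes only the percentage of unmatched words, not the sentence-length term.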

    This seems quite good to me: although originally developed in 1948 [2], it appears to have been kept current. The word list feature addresses concerns beyond mere syllable, word length, and sentence length statistics.
    The tool can be used here http://www.interventioncentral.org/htmdocs/tools/okapi/okapi.php The wordlist is at http://www.interventioncentral.org/htmdocs/tools/okapi/okapimanual/dalechalllist.php and the formula at http://www.interventioncentral.org/htmdocs/tools/okapi/okapimanual/dalechall1.php
    • I tested it on the lead of Oklahoma, removing the bit about the pronunciation and highlighting Oklahoma and Oklahomans as words to be accepted as easy (since that was the article topic). The results were:

      Total Words in Sample: 202
      Total Sentences in Sample: 18
      Average Number of Words Per Sentence: 11.22
      Words Not Matched to Dale Familiar 3000-Word List: 40
      Percentage of Words Not Matched to Dale Familiar 3000-Word List: 19.80
      Dale-Chall Readability Index: 7.31 Raw Score; 9-10th Grade Level

      The text was as follows, with the words found difficult underlined (I have put them in italics):

      Oklahoma is a state that is in the southern part of the Central United States. It had a population of about 3,617,000 people in 2007. The state has a land area of about 68,667 sq mi (177,847 km²). Oklahoma is the 28th largest state by population. It is the 20th largest state by area. The name of the state comes from the Choctaw words okla and humma. It means "Red People". It is also known by its nickname, The Sooner State. The state was formed from Indian Territory on November 16, 1907. It was the 46th state to become part of the United States. The people who live in the state are known as Oklahomans. The state's capital and largest city is Oklahoma City. Oklahoma is a large producer of natural gas, oil and food. It has large industries in aviation, energy, telecommunications, and biotechnology. The state has one of the fastest growing economies in the nation. Between 2005 and 2006, it had the third highest percentage of income growth and the highest percentage in gross domestic product growth. Oklahoma City and Tulsa are the main economic areas of Oklahoma. Almost 60 percent of Oklahomans live in these two metropolitan statistical areas.

My conclusions based on this are still in progress - will get back to this later this morning! I want to assess the words they found difficult against our combined basic word list or appropriate wikilinks and understand the impact of removing those words. --Matilda (talk) 20:45, 17 November 2008 (UTC)

I have rerun the test using the following:

*Oklahoma is a state that is in the southern part of the Central United States. It had a population of about 3,617,000 people in 2007. The state has a land area of about 68,667 sq mi (177,847 km²). Oklahoma is the 28th largest state by population. It is the 20th largest state by area. The name of the state comes from the *Choctaw words *okla and *humma. It means "Red People". It is also known by its *nickname, The Sooner State. The state was formed from *Indian *Territory on November 16, 1907. It was the *46th state to become part of the United States. The people who live in the state are known as *Oklahomans. The state's capital and largest city is Oklahoma City. Oklahoma is a large producer of *natural gas, oil and food. It has large industries in *aviation, *energy, *telecommunications, and *biotechnology. The state has one of the fastest growing *economies in the *nation. Between 2005 and 2006, it had the third highest percentage of income growth and the highest *percentage in *gross *domestic *product growth. Oklahoma City and Tulsa are the main economic areas of Oklahoma. Almost 60 percent of Oklahomans live in these two *metropolitan *statistical *areas.

That is, I have put an asterisk (*) in front of additional words that were explained via wikilinks and that I wanted the OKAPI! tool to accept as 'easy' words. The results were:

Total Words in Sample: 202
Total Sentences in Sample: 18
Average Number of Words Per Sentence: 11.22
Words Not Matched to Dale Familiar 3000-Word List: 20
Percentage of Words Not Matched to Dale Familiar 3000-Word List: 9.90
Dale-Chall Readability Index: 5.75 Raw Score; 5-6th Grade Level
Dale-Chall Readability Formula for This Passage =
(0.0496 * 11.22 Avg. Number of Words Per Sentence) +
(0.1579 * 9.90 Percent of Words in Sample Not Found on Dale Familiar Word List) +
3.6365 = 5.75 Raw Score = 5-6th Grade Level
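
The arithmetic printed by the tool can be checked with a few lines of code. This is a minimal sketch that follows the formula exactly as shown above; the function name is an assumption, and the small difference from the tool's 7.31 and 5.75 is presumably due to rounding or truncation on the tool's side.

```python
def dale_chall_raw(words: int, sentences: int, unfamiliar: int) -> float:
    """Raw score using the coefficients printed in the tool output above."""
    avg_sentence_len = words / sentences
    pct_unfamiliar = 100.0 * unfamiliar / words
    return 0.0496 * avg_sentence_len + 0.1579 * pct_unfamiliar + 3.6365

# First run: 40 of 202 words not matched; second run: 20 of 202.
first_run = dale_chall_raw(words=202, sentences=18, unfamiliar=40)
second_run = dale_chall_raw(words=202, sentences=18, unfamiliar=20)
print(round(first_run, 2), round(second_run, 2))
```

Halving the number of unmatched words lowers the raw score by about 1.56 (0.1579 times 9.90), which is what moves the passage from the 9-10th grade band to the 5-6th grade band.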

My conclusions are that using such a tool can sort out readability issues including allowing for explanations of terms via wikilinks. I am not sure that all the wikilinks provided are excellent, but that is another project. Using http://www.online-utility.org/english/readability_test_and_improve.jsp the same text scored :

Number of characters (without spaces) : 921.00
Number of words : 206.00
Number of sentences : 19.00
Average number of characters per word : 4.47
Average number of syllables per word : 1.58
Average number of words per sentence: 10.84
Indication of the number of years of formal education that a person requires in order to easily understand the text on the first reading :
Gunning Fog index : 9.77
Approximate representation of the U.S. grade level needed to comprehend the text :
Coleman Liau index : 7.77
Flesch Kincaid Grade level : 7.31
ARI (Automated Readability Index) : 5.05
SMOG : 10.00
Flesch Reading Ease : 61.95
List of sentences which we suggest you should consider to rewrite to improve readability of the text :
* Between 2005 and 2006, it had the third highest percentage of income growth and the highest percentage in gross domestic product growth.
* It has large industries in aviation, energy, telecommunications, and biotechnology.
* Almost 60 percent of Oklahomans live in these two metropolitan statistical areas.
* Oklahoma City and Tulsa are the main economic areas of Oklahoma.
* Oklahoma is a state that is in the southern part of the Central United States.

Still useful information but I think the 5-6th Grade Level is probably more realistic in this case because we have spent some time in providing the links to explain the terms. We could possibly do more as per these sentence suggestions.
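
As a rough cross-check of the figures reported by the second tool, the standard published formulas for Flesch Reading Ease, Flesch-Kincaid Grade level and ARI can be applied to the counts it printed. This is a sketch, not the tool's actual implementation: the function names are assumptions, and the syllables-per-word value is the already rounded 1.58 from the output, so small differences are expected.

```python
def flesch_reading_ease(words_per_sentence: float, syllables_per_word: float) -> float:
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word

def flesch_kincaid_grade(words_per_sentence: float, syllables_per_word: float) -> float:
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59

def automated_readability_index(chars_per_word: float, words_per_sentence: float) -> float:
    return 4.71 * chars_per_word + 0.5 * words_per_sentence - 21.43

# Counts as reported in the tool output above.
chars, words, sentences = 921, 206, 19
syllables_per_word = 1.58  # rounded value printed by the tool

wps = words / sentences
cpw = chars / words

fre = flesch_reading_ease(wps, syllables_per_word)
fk = flesch_kincaid_grade(wps, syllables_per_word)
ari = automated_readability_index(cpw, wps)
print(round(fre, 2), round(fk, 2), round(ari, 2))
```

All three depend only on character, syllable, word and sentence counts, which is why marking words as explained by wikilinks changes the Dale-Chall result but none of these.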

The Dale-Chall Readability Formula is only a tool - of course the article still has to be reviewed to check that it makes sense and that it complies with the lack-of-bias and verifiability policies! However, use of this tool could help to resolve issues about whether an article is too complex or sufficiently readable for our target audience. This is particularly so because it allows for adjustments to be made about unfamiliar words and because the tool is vocabulary based. I think this tool (Dale-Chall Readability Formula and the http://www.interventioncentral.org/htmdocs/tools/okapi/okapi.php site) might meet your requirements to be modern and thoughtful - I would appreciate your response --Matilda (talk) 00:57, 18 November 2008 (UTC)

If you're up to it, please add in the end date upon transclusion. If you don't think you're ready, I will delete it until you are. :) Synergy 19:37, 18 November 2008 (UTC)

I think I am ready but I am embarrassed (as well of course as extremely flattered) by offers of noms and conoms (see above) --Matilda (talk) 20:57, 18 November 2008 (UTC) (blushing)
Looks like Ryan has been one-upped again (Majorly nominated me for adminship, but Ryan had offered before then). alexandra (talk) 20:58, 18 November 2008 (UTC)
Retreats in embarrassment into the corner and says - let's just wait til 25 November as per community convention. --Matilda (talk) 21:02, 18 November 2008 (UTC)
I'm a stickler for not following every rule we have, when I know what's right. If the RfS started today, you'd have been here exactly 3 months the day you received the bit (approx. and given it's a successful request). I don't think it will be a problem if you ran a few days shy (an awful oppose to boot), but if you really want to wait the 7 days I won't stop you. ^_^ Would you like me to delete the page? If so, I will not be upset. If Ryan was here first, I can let him do the nomination (it's entirely up to you; I won't hold it against you). I just thought that Ryan was taking too long and I misread the date (ugly formatting) and thought you'd been here since the 8th, not the 25th <rolls eyes>. Synergy 22:00, 18 November 2008 (UTC) Or maybe we can wait just 2, 3 or 4 more days? Just so it's not as many as 7 days off. :)~
  • I will wait til the 25th and give Ryan a chance to conom plus Bluegoblin and ChristianMan too if they still wish. While I appreciate the sentiment about not following "every" rule we have, when I "know what's right", I think it's important in an RfA to follow rules, guidelines and conventions to demonstrate to the community that I do really respect the rules, guidelines and conventions of this community. Many thanks --Matilda (talk) 00:44, 19 November 2008 (UTC)
Good, that's the answer I was looking for. :) Synergy 00:47, 19 November 2008 (UTC)
  • Cassandra's right, I have been one-upped again... but anyway, I don't mind co-nominating, unless Synergy would be fine if I would nom instead of co-nom since I asked first... but it doesn't matter much. :) I'll start writing my nomination shortly... but I won't add it until it is close for the RfA to become live. — RyanCross (talk) 01:02, 19 November 2008 (UTC)
I don't mind deleting and letting you recreate... Synergy 01:05, 19 November 2008 (UTC)
If you insist. Thanks, — RyanCross (talk) 01:10, 19 November 2008 (UTC)
 Done Perfectly reasonable to let you do the honors. ^_^ Synergy 01:16, 19 November 2008 (UTC)
Thank you, Synergy. I'll be sure to recreate it when my nomination is done, and I'll let Bluegoblin, ChristianMan, and you co-nom if you would still like. Though, I think three nominations is enough, but I don't mind having four if Matilda doesn't mind. It's her RfA anyway. ;) — RyanCross (talk) 01:24, 19 November 2008 (UTC)
I'm still up for co-nom. You're the first person I've felt truly deserves it since American Eagle.--  CM16  02:16, 19 November 2008 (UTC)
I am very flattered by the support and touched by the sentiments expressed, and thus will have as many co-noms as people choose to make. Even though it isn't a convention, I would find it impolite to decline such kind gestures that have been offered. Many thanks for the good wishes. --Matilda (talk) 02:39, 19 November 2008 (UTC)
Good response. :) — RyanCross (talk) 03:18, 19 November 2008 (UTC)
As long as no-one else takes the first co-nom, I'm not fussed ;). I would actually have nommed but RyanCross beat me to it... and I am too nice to steal it from him ;) XD . Can someone ping me on my talk page when the main nom is written? Thanks, BG7even 11:21, 19 November 2008 (UTC)

Quick heads up

After quickly checking your edit count, just wanted to say that you should bone up on your #article edits before a RfA. You have high numbers of Wikipedia namespace editing, but to be better safe than sorry, try maybe getting article edits to 50%+? Don't give anyone vague foundations of an {{oppose}}! --Gwib -(talk)- 06:49, 19 November 2008 (UTC)

Well it is one way to keep me away from ST and DYK, let alone nagging people about their edit summaries and signatures ;-)
I guess I am not doing anything to become an admin (nor to stop becoming one) - but I do appreciate the friendly advice and the spirit in which it was offered. --Matilda (talk) 06:54, 19 November 2008 (UTC) + another non-article edit :-(

Addendum Reading Ease

Hello, just a quick note: Could you imagine getting an article to VGA, so you see for yourself how it works and how much work it actually is? I am proposing this because, if we are supposedly changing the system, you can then judge for yourself how much work it would be to "re-approve" a certain number of articles that fail the new criteria. --Eptalon (talk) 13:50, 19 November 2008 (UTC)

I did in fact assist with getting Oklahoma to VGA specifically on the issue of reading ease. With some focus, it wasn't that hard. I query whether this is a new criterion or rather one that has not been applied rigorously enough in the past. --Matilda (talk) 20:10, 19 November 2008 (UTC)
I think you are not quite right there. It took 78 changes to get Gothic architecture to Good Article, from the nomination; see here for the change summary (note this does not reflect the current revision; the article has been changed since and is nominated for Very good article). I suggest you get an article to Very good article, for exactly that reason.--Eptalon (talk) 20:33, 19 November 2008 (UTC)
Ummm - I think you have ignored my point several times concerning Oklahoma (including in the section above). In 37 edits over about 3 days (of which 23 were mine, the others mainly CPacker and there was one from Tholly) I improved readability from Flesch Reading Ease 35.9 -> 44.7 and Flesch-Kincaid Grade level 13.1 -> 10.5 (measured using Microsoft Word on printable versions of earlier iterations of articles excluding header and references - 1st version of diff was 1956 words, the second 2177 words). The current version is 2194 words, with a Flesch Reading Ease of 44.6 and a Flesch-Kincaid Grade level of 10.7. I don't think that is good enough; in particular the 25% passive sentences need to be reviewed. The point I am making is (a) I have contributed to a VGA and (b) I improved readability by nearly 3 grade levels (13.1 to 10.5) in that process.--Matilda (talk) 20:58, 19 November 2008 (UTC)
Note: There is also the question of filling red links (I remember creating around 50 articles on various cathedrals with Eptalon for Gothic architecture), filling broken templates and a number of other tasks to be tackled outside of the article itself before it meets VGA criteria. --Gwib -(talk)- 21:05, 19 November 2008 (UTC)
Ummm - your point is? One moment I am discussing readability and the next I am being told about how much work another editor has done on redlinks. I don't mean to belittle the redlink work but the discussion is about readability, how we assess that (what measure or test to use), and what standard to apply. Furthermore, can an article that is below certain parameters be a GA or a VGA on this wikipedia (which is for Simple English)? It would seem that the criteria for GA and VGA include that an article shouldn't be tagged or be deserving of a tag, and the {{complex}} tag is in scope. When do we tag for complex? This is the question I am seeking some consensus on.--Matilda (talk) 21:13, 19 November 2008 (UTC)
'Off-topic'

In response to above, it was just the "With some focus, it wasn't that hard" [to get Oklahoma to VGA]. There is far more to do than simply add to/simplify the article itself. A tad off-topic, but relevant to that sentence none-the-less (I've always wanted to say that!). --Gwib -(talk)- 21:17, 19 November 2008 (UTC)

As you note it is off-topic, and I reiterate, I am not talking about the effort to add content (which I appreciate is considerable), I am talking about the effort to ensure that articles, including articles already promoted to VGA and GA, are suitably non-complex - ie are in Simple English not just English. Eptalon raised the issue about reviewing GA and VGA articles and I am trying to say, it takes some effort to improve readability but it should not be too much. But I still want to know, what is our standard here at Simple English wikipedia for readability and how do we measure it? Content expressed in complex language might as well be at enwp. --Matilda (talk) 22:22, 19 November 2008 (UTC)
Getting an article to VGA

You yourself say that it wasn't that hard to get an article to VGA - so please do it (to an article of your choice that is not yet a GA, a VGA, or a candidate for either flag). My rationale behind making you do this is that you find out for yourself how much work has been spent on getting a number of articles to that standard. As Gwib pointed out, this is not just about language simplification; it is also about extending an article content-wise, finding the right amount of links to place, referencing, fixing the red-links that turn up, etc. There is a reason we allow so much time for the process. --Eptalon (talk) 22:15, 19 November 2008 (UTC)

See my reply above to Gwib - your argument is an ad hominem argument and does not deal with the question. If you can't answer the question, that is OK - I will make up my own answer and test the community response. I have tried repeatedly to engage in a conversation and feel as though I am getting nowhere. My next step will be to draft a guideline for community consideration.--Matilda (talk) 22:25, 19 November 2008 (UTC)
My argument is not an ad hominem (en:ignoratio elenchi to be exact). I have reason to believe you are not aware of the work involved in getting an article to GA or VGA level; I therefore propose that you do so before you comment on how the criteria should be changed; besides, your proposition above sounds like an en:Argumentum ad populum.
To my knowledge, we do not have an explicit requirement for the article to be in simple language, whatever that translates to; however, "Within one week of being listed under the voting section, 80% of named editors must agree that the article is indeed very good. There is a required minimum of 6 named voters." (GAs: 9 criteria, 5 votes min, 70% support).
Rather generally, an assessment of reading difficulty must meet the following criteria, in my opinion:
  • It must be predictable: texts with similar predicted readability levels should get similar scores; a small change in the text should also only lead to a small change in the score.
  • It should be tuneable to different texts and audiences (as stated, scientific articles should be able to use scientific terms, without detriment to the readability score)
  • Ideally the scoring should rely on certain statistical data; filler words (the, a(n),..) should not be considered for the score.
  • No assumptions should be made about socio-economic or political background of the readers.
If we rely on the antiquated Flesch Reading Ease and Flesch-Kincaid (with the limitations as cited), we get medians of 58.7 (FRE) and 9.1 (FK) for Very Good Articles, and 66.7/8.1 for Good Articles; please do not argue we should take the mean/average; it is prone to be influenced by extreme values. If we take those values as given, one aim could be to raise all articles below the median to the level of the current median; the problem is just that both FRE and FK are bean-counting approaches: they visibly do not incorporate statistical text analysis, and they also do not filter out filler words (so the exercise is doubtful at best).
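The point about the mean being pulled by extreme values can be illustrated with a small sketch. The scores below are made-up illustrative numbers, not the actual per-article statistics, which are not given in this discussion.

```python
from statistics import mean, median

# Hypothetical Flesch Reading Ease scores for five articles (illustrative only).
scores = [55.0, 57.5, 58.7, 60.2, 61.0]

# Add one hypothetical article with an extremely low score.
with_outlier = scores + [20.0]

median_shift = median(scores) - median(with_outlier)
mean_shift = mean(scores) - mean(with_outlier)

print(median(scores), median(with_outlier))
print(round(mean(scores), 2), round(mean(with_outlier), 2))
```

In this example a single extreme article moves the mean by several points while the median barely changes, which is the reason given above for quoting medians.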
Anyway, I am looking forward to your proposal, and wonder what the rest of the community thinks about it.--Eptalon (talk) 10:19, 20 November 2008 (UTC)
Your argument is an ad hominem argument because instead of dealing with the topic raised (readability), you (Eptalon and Gwib) are apparently attacking my credibility as an editor (ie per the en article which is perhaps clearer as it is more detailed you are indeed replying to an argument or factual claim by attacking or appealing to a characteristic or belief of the person making the argument or claim, rather than by addressing the substance of the argument or producing evidence against the claim. What you describe as mere red herring (the English equivalent of ignoratio elenchi), I see as ad hominem - you are suggesting by raising the arguments that since I have not contributed significantly to creating a VGA or GA here, thus not aware of the work involved in getting an article to GA or VGA level, in your view I should not comment on changing the criteria. They are two separate things - the criteria and my own activities as an editor. As per the article here on ad hominem arguments, whether or not I am poor and illiterate should have no bearing on the arguments pertaining to illegal abortion - the arguments on illegal abortion should be considered on their own merits without regard to the bank account of the person making those arguments.
I do indeed know what it means to get something to a featured article on enwp - for example Riverina (direct result of Wikipedia:WikiProject Riverina), but that is not in fact where I put my energies here or even usually there. I have several good articles to my name at en - several of which were almost entirely my own - see for example Snowball marches or Arthur Upfield. I also, as pointed out above, know something about what it takes to revise a candidate for VGA - and it wasn't just readability that was up the spout, try spelling, facts missing, ...
You have said we do not have an explicit requirement for the article to be in simple language. Ummm - would the community allow a VGA equivalent on say German Wikipedia to be excellent in every way except that it is not written in German? Surely the requirement for Simple English is so implicit it doesn't have to be made explicit?
One of the challenges is the lack of definition around what is Simple English. If we do not define that as a community, we should not exist. There are various alternatives to adopt and we can fail to specify because they all have their good and bad points. What they all have is limited vocabulary, limited verb forms, and anything written in those forms of simple English would score highly on almost any readability test you like to name, regardless of its age or bias.
I am not about to argue about means, medians, modes or standard deviations - I do not in fact think that this community's past efforts are useful in setting standards. The community has a disproportionate number of active editors who are native English speakers of limited educational attainment (still in school), some of whom are refugees from enwp because they have been banned or blocked long term. Their interest in being on this project has nothing to do with writing Simple English but is to do with the community and Wikimedia projects in general. For that reason !votes in favour of VGA or GA candidates are skewed, not least because there is a lack of expectation that criteria should be used. WP:ILIKEIT or ILIKETHEEDITOR is reason enough to !vote in favour. The fact that, as you state, there is no explicit requirement for the article to be in simple language means that there is of course no reason why such !votes should take the readability / language complexity into account. Above you have said that an article should not be tagged, or able to be tagged, with any issues, including {{complex}}.
You have not commented on the model I discussed above. In my view it meets the four criteria you propose. ... Got to go and be busy in real-life --Matilda (talk) 01:11, 21 November 2008 (UTC)

Explanations

  • When I said that we did not have an explicit criterion for the text to be in simple language, I meant: it is not stated explicitly that the article should be in simple language. Implicitly, the "Several revisions, possibly by different editors" criterion (item 2, I think) should fulfill this task.
  • As to your I know that editor statement: With a community of roughly 30 editors, you tend to know the regulars (about 15) after about a week.
  • As to Dale-Chall: A quick search for revised Dale-Chall does indeed reveal that it might be usable; this article looks at texts from medical literature and tries to predict which words of the medical vocabulary are unlikely to be understood by a lay reader (that is, one who is not familiar with medical terminology). Their method is based on looking up the frequency of the given word in a number of texts; more frequent terms are supposed to be more familiar.

We do not have a corpus of texts (that is: we do, but we cannot currently exploit it). A possible automated method (largely based on revised Dale-Chall) could therefore be:

  1. Filter out punctuation, and non-words (hyphen, quote-marks,...); also get rid of wiki-markup (discussion item: what about references?)
  2. Take one of the wordlists from En Wiktionary frequency list - filter out the top 200 (to be discussed): If it is in this wordlist, it is certainly understood. (filler)
  3. Filter out all words in another list (there comes our BE850, BE1500,...) (base vocabulary of an ...-grade student)
  4. For the remaining words:
    • Count their frequency in the remaining text; and from that their chance of appearing (a word that appears 10 times in a 100 word text has a 10% chance, or 0.1)
    • The words with a frequency of less than (probably, to be discussed) a certain chance (e.g. 10%) (or a certain number: 10 times) are probably hard to understand. A text that contains 20 such words is probably harder to understand than one that contains 10 of them.
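
The steps above could be sketched roughly as follows. This is only an illustration under several assumptions: the function names `tokenize` and `hard_words`, the `max_share` parameter, and the two word lists passed in are all hypothetical; the markup stripping is deliberately crude; references are not handled at all (that is still an open discussion item); and the "chance" of a word is computed against the words left over after filtering, which is one possible reading of step 4.

```python
import re
from collections import Counter


def tokenize(text):
    """Step 1: strip simple wiki markup and punctuation, return lower-case words."""
    # Drop templates such as {{complex}} (nested templates are not handled).
    text = re.sub(r"\{\{.*?\}\}", " ", text, flags=re.DOTALL)
    # Replace [[target|label]] or [[target]] with the visible label.
    text = re.sub(r"\[\[(?:[^|\]]*\|)?([^\]]*)\]\]", r"\1", text)
    # Keep alphabetic words only; lower-casing makes every comparison case-insensitive.
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def hard_words(text, common_words, base_vocabulary, max_share=0.10):
    """Steps 2 to 4: return the words that are probably hard to understand.

    common_words    -- assumed stand-in for the top entries of a frequency list
    base_vocabulary -- assumed stand-in for a list such as BE850 or BE1500
    max_share       -- assumed threshold; words whose share of the remaining
                       text is below it count as hard
    """
    known = {w.lower() for w in common_words} | {w.lower() for w in base_vocabulary}
    # Steps 2 and 3: filter out filler words and base vocabulary.
    remaining = [w for w in tokenize(text) if w not in known]
    # Step 4: relative frequency of each word in the remaining text.
    counts = Counter(remaining)
    total = len(remaining)
    return sorted(w for w, n in counts.items() if n / total < max_share)
```

The number of entries returned, `len(hard_words(...))`, would then be the indication for a text: more hard words, harder text. Nothing here normalises for text length.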

Drawbacks of the method:

  • No sentence length, syllables per word counts
  • Where do we get the other list (base vocabulary..) from?
  • Longer texts are more likely to contain different words (and more unknown words); is there a way to normalize (so the indication becomes independent of text length)?

Of course, the comparisons should be case-insensitive. - Ideas? --Eptalon (talk) 10:30, 21 November 2008 (UTC)

Don't mean to be argumentative (who am I kidding, I love arguing), but my hook isn't boring.

"Did you know that Adriana Lima is trying to reunite with her father, who left her when she was 6 months old?"

Her father left her and her mother when she was 6 months old, and she is making efforts to find and meet with him almost 25 years after having never seen him. That's rather interesting, and I'd be willing to bet a fair amount that you didn't know that. --Gwib -(talk)- 22:20, 22 November 2008 (UTC)

Did you know that I don't know who Adriana Lima is? I could read the article but this hook doesn't tempt me. Why is she famous - why should I want to read about her? I don't want to read about people who were left by their fathers - very sad but ...? --Matilda (talk) 02:14, 23 November 2008 (UTC)
Nor did I, but special:Random brought her up. I'm sure you've heard of Victoria's Secret, and probably have heard of Brazil. She is a Victoria's Secret model from Brazil. People would associate her with scantily-clad situations, but the fact that she is making attempts to contact her father after 25 years of silence would not, in my opinion, be described as 'boring'. --Gwib -(talk)- 08:57, 23 November 2008 (UTC)
Speaking of DYKs I fixed mine.--  CM16  02:19, 23 November 2008 (UTC)

RfA reminder

Hello. I just wanted to remind you that tomorrow is November 25, the day you said you would run for RfA. If you haven't noticed already, I created Wikipedia:Requests for adminship/Matilda with the nominations in it (though, I am waiting for BG7's nom). Don't forget to accept the nomination when the time comes tomorrow. I wish you luck, and I'm sure it will go fine. – RyanCross (talk) 00:06, 24 November 2008 (UTC)

Will do tomorrow --Matilda (talk) 01:12, 24 November 2008 (UTC)

Re: IP Userpage

I have replied at my talk page, Regards, Kennedy (talk) 09:30, 24 November 2008 (UTC)

Well, there's only so much I can do at 6AM :) --  Da Punk '95  talk  19:50, 24 November 2008 (UTC)

T:TDYK, just on top of CM16's. --  Da Punk '95  talk  19:56, 24 November 2008 (UTC)

I agree with your summary here. There's nothing on Google and no article at EN WP. I would support a nomination for deletion. Malinaccier (talk) (Rev) 01:20, 25 November 2008 (UTC)

Thank you for letting me know that you agreed with my suspicions. Given that your opinion agreed with mine and the creator did not respond to my tagging or the query on her/his talk page, I nominated it for quick deletion and it has been deleted. Regards --Matilda (talk) 02:43, 25 November 2008 (UTC)
Get used to checking the logs for the pages. :) I deleted it once before as a possible hoax. It's been salted indef until an autoconfirmed user can establish its notability. Synergy 02:49, 25 November 2008 (UTC)

I've nominated Tom for DYK. Please check and comment. --  Da Punk '95  talk  23:57, 25 November 2008 (UTC)


New discussion

I'd like to inform you of a new discussion here at Simple talk. God bless.--  CM16  06:56, 26 November 2008 (UTC)

Hi, I saw your comment and oppose on PGA for France. Note, I have done the refs you mentioned. If you will please tell me what else you think is not referenced (or other things), then I'll be happy to add them. Thanks. (Oh and by the way, good luck on your RFA, I'm still undecided whether to support or not, but I think I will). Yotcmdr (talk) 13:56, 26 November 2008 (UTC)

Thanks for your prompt attention to the refs. I think you need to look for anything with a number and make sure there is a reference to support the number. You need to review the article and ask: how do I know? If the wikilink is to a referenced article, or it is something very obvious like the date of the French revolution (even if the article on the French revolution is not referenced), then no reference is required. I will have another look tomorrow.--Matilda (talk) 14:01, 26 November 2008 (UTC)
Ok, thanks. Yotcmdr (talk) 14:02, 26 November 2008 (UTC)

thanks matilda, you rule. Spencerk (talk) 19:34, 26 November 2008 (UTC)

Re: Welcome!

Hi. Thanks! I just finished omitting all the red links from the India page! I am also at the regular Wikipedia, so talk to me there, okay? I only visit here when I need a quick summary of a subject, mostly because a lot of these articles are stubs. Thank you again! -Blue Caper (talk) 22:04, 26 November 2008 (UTC)