Tatoeba.org Native Speakers

Short, easy-to-remember URL to this page
http://bit.ly/nativespeakers

Table: The number of sentences on tatoeba.org that were owned by usernames on this list in their own native languages.
Date | Native Speaker Sentences | % Native | Total Sentences | Languages with Native Speaker Contributors | Identified Native Speakers | Identified Native Speakers Who Have Contributed | Native Speakers Who Have Added Sentences Since the Previous Check
2016-01-30 | 3,448,112 | 74.95% | 4,600,531 | 82 (out of 283) | 4,113 | 2,620 | 353
2015-12-26 | 3,384,438 | 74.77% | 4,526,272 | 81 (out of 274) | 3,598 | 2,493 | 405
More: See the complete table at the bottom of this page.
  • This list limits each member to one language: the language they are most fluent in (native language, strongest language, dominant language or primary language).
  • The best way to help us is to translate sentences by native speakers into your own native language.
  • Current Trustworthiness of the Tatoeba Corpus = 74.95% (See note at the bottom of this page.)
  • The sentence counts are based on data from the January 30, 2016 sentences-detailed.csv file.
  • "Added since last check" shows an increase in the number of sentences owned by the username. Users without this number may still have added new sentences, but had enough duplicates removed that their count did not increase.
  • You can read more info at the bottom of this page, including how to get your username on this page.

  • I have also harvested the usernames of all members who claimed only one native language on the users_languages page on tatoeba.org, if they weren't already listed.
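
As a rough illustration of the counting described above, here is a minimal Python sketch. It assumes the export is tab-separated with the columns id, language code, text, username (an assumption about the file layout, not something stated on this page), and the usernames alice and bob are made up. It counts, for each listed member, only the sentences written in that member's one listed language, and compares two checks to find who shows an increase.

```python
import csv
import io
from collections import Counter

def count_native_sentences(rows, native_language):
    """Count sentences each listed member owns in their own listed language.

    rows: iterable of records laid out as (id, lang, text, username, ...).
          This column order is an assumption about the export file.
    native_language: dict mapping username -> that member's one language code.
    """
    counts = Counter()
    for row in rows:
        if len(row) < 4:
            continue  # skip malformed or truncated lines
        lang, username = row[1], row[3]
        if native_language.get(username) == lang:
            counts[username] += 1
    return counts

def added_since(previous, current):
    """Return {username: increase} for members whose count rose since the last check."""
    return {
        user: current[user] - previous.get(user, 0)
        for user in current
        if current[user] > previous.get(user, 0)
    }

# Tiny in-memory sample standing in for the real export (hypothetical data).
sample = "1\teng\tHello.\talice\n2\tfra\tBonjour.\talice\n3\tfra\tSalut.\tbob\n"
native = {"alice": "eng", "bob": "fra"}  # hypothetical members

rows = csv.reader(io.StringIO(sample), delimiter="\t", quoting=csv.QUOTE_NONE)
current = count_native_sentences(rows, native)
increases = added_since({"alice": 1}, current)
```

In this sample, alice's French sentence is not counted because her listed language is English. A member whose count stayed flat or dropped (for example after duplicates were removed) simply does not appear in the result of added_since.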

Information

Why Is a List of Native Speakers Important?

  • Knowing who the native speakers are will help you find trustworthy sentences that you can translate into your own native language.
  • Of course, native speakers sometimes make mistakes, too, but their mistakes are usually typing mistakes.
  • Non-native speakers are not only more likely to make grammar and vocabulary mistakes, but also often write sentences that aren't the most natural way to say something.

Disclaimer

  • This is not an official list. This may not be accurate.

More Data

Contact

  • Contact me via tatoeba.org/eng/private_messages/write/CK, ...
    • ... if you find any errors.
    • ... if you want to let me know your native language so I can include you.
    • ... if you know the native language of a member who is not listed here.

More

  • Who Is Included?
    • These are all the native speakers I know about, either from what was written in their profiles or what I've been told. I have used this information under the (perhaps false) assumption that Tatoeba Project members are honest.
    • I have listed a member's strongest language, which in some cases may not actually be their "native language." (It happens.)
    • I have limited this to one native language per member. I know that there are some members who perhaps grew up speaking more than one language, but even for those people, one language is usually stronger. For most people, it's the language in which they did their formal education, have read the most, written the most, watched the most TV, listened to the most radio, and had the most conversations.
  • There is also a page showing sentence counts of Languages Without Identified Native Speakers.

Stats

Table: The number of sentences on tatoeba.org that were owned by usernames on this list in their own native languages.
Date | Native Speaker Sentences | % Native | Total Sentences | Languages with Native Speaker Contributors | Identified Native Speakers | Identified Native Speakers Who Have Contributed | Native Speakers Who Have Added Sentences Since the Previous Check
2016-01-30 | 3,448,112 | 74.95% | 4,600,531 | 82 (out of 283) | 4,113 | 2,620 | 353
2015-12-26 | 3,384,438 | 74.77% | 4,526,272 | 81 (out of 274) | 3,598 | 2,493 | 405
2015-11-08 | 3,309,489 | 74.77% | 4,426,189 | 80 (out of 259) | 3,168 | 2,354 | 386
2015-10-10 | 3,228,420 | 74.32% | 4,343,500 | 77 (out of 183) | 2,913 | 2,235 | 335
2015-09-12 | 3,156,892 | 73.86% | 4,273,949 | 77 (out of 183) | 2,726 | 2,138 | 503
2015-06-20 | 2,959,343 | 72.90% | 4,059,266 | 72 (out of 181) | 2,088 | ? | 407
2015-05-02 | 2,828,634 | 72.17% | 3,919,447 | 70 (out of 181) | 1,615 | ? | 307
2015-03-21 | 2,696,054 | 67.62% | 3,767,681 | 67 (out of 181) | 1,471 | 1,471 | 283
2015-01-24 | 2,547,952 | 70.85% | 3,596,201 | 63 (out of 179) | 1,413 | 1,413 | 192
2014-12-27 | 2,466,809 | 70.09% | 3,519,131 unique | 59 (out of 178) | 1,201 | 1,201 | 298
2014-09-13 | 2,265,376 | 67.42% | 3,359,882 | 54 (out of 178) | 1,002 | 1,002 | 282
2014-04-26 | 2,055,007 | 66.77% | 3,077,830 | 53 (out of 171) | 929 | 929 | ?
2014-02-22 | 1,953,634 | 66.58% | 2,934,118 | 53 (out of 131) | 891 | 891 | ?
2013-11-30 | 1,814,143 | 65.67% | 2,762,219 | 52 (out of 131) | 861 | 861 | ?
2013-11-02 | 1,765,812 | 65.43% | 2,698,441 | 52 (out of 131) | 840 | 840 | ?
2013-09-07 | 1,699,912 | 65.40% | 2,599,211 | 51 (out of 131) | 807 | 807 | ?
2013-06-15 | 1,533,251 | 64.56% | 2,374,820 | 49 | 765 | 765 | ?
2013-04-13 | 1,461,406 | 64.58% | 2,263,055 | ? | ? | ? | ?
2012-03-10 | 809,337 | ? | ? | 39 | 478 | 478 | ?
2011-11-13 | 632,531 | ? | ? | 36 | 398 | 398 | ?
earlier | ? | I didn't archive this info. | ?

Notes

  • Current Trustworthiness of the Tatoeba Corpus
    This is based on the assumption that all native speaker sentences should be trusted without a doubt and non-native speaker sentences should not be blindly trusted. This also assumes that all links between languages are accurate. We've all seen bad native speaker sentences, good non-native speaker sentences, and bad links between sentences, so this percentage is not meant to measure accuracy. It reflects perceived trustworthiness under the stated assumption.
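
    Judging by the figures in the Stats table, this percentage appears to be the native speaker sentence count divided by the total sentence count. The short Python sketch below checks that reading against the 2016-01-30 row. The function name native_share is made up for illustration.

```python
def native_share(native_sentences, total_sentences):
    """Percentage of all sentences owned by identified native speakers,
    rounded to two decimals like the "% Native" column."""
    return round(100 * native_sentences / total_sentences, 2)

# Figures from the 2016-01-30 row of the Stats table.
share = native_share(3_448_112, 4_600_531)
print(share)  # 74.95
```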