Tatoeba.org Native Speakers



  • To find identified native speakers, click the most recent "Native Language Sentence Counts" button.
  • The other button shows all contributors. This also includes those with no identified native language. I don't update this page as often.
  • I have stopped updating the "Tatoeba.org Native Speakers" page at this URL, since the "sentence count" page(s) also have most of this same information.
  • I now keep a list of all identified native speakers on my own computer. Sometimes, I create a public list of members who contribute sentences in their own native languages. See the button on the right.

The last version of this page, from 2016-07-09, is below.

Table: The number of sentences on tatoeba.org that were owned by usernames on this list in their own native languages.
Date | Native Speaker Sentences | % Native | Total Sentences | Languages with Native Speaker Contributors | Identified Native Speakers | Identified Native Speakers Who Have Contributed | Native Speakers Who Have Added Sentences Since the Previous Check
2016-07-09 | 3,743,023 | 75.51% | 4,956,610 | 91 (out of 310) | 6,023 | 3,214 | 240
2016-06-04 | 3,691,023 | 75.45% | 4,891,870 | 86 (out of 301) | 5,714 | 3,118 | 349
See the complete table at the bottom of this page.
  • This list limits each member to one language: the language they are most fluent in (native language, strongest language, dominant language or primary language).
  • The best way to help us is to translate sentences by native speakers into your own native language.
  • Current Trustworthiness of the Tatoeba Corpus = 75.51% (See note at the bottom of this page.)
  • The sentence counts are based on data from the July 9, 2016 sentences-detailed.csv file.
  • "Added since last check" shows an increase in the number of sentences owned by the username. Users without this number may have added new sentences but had some duplicates removed.
  • You can read more info at the bottom of this page, including how to get your username on this page.

  • I have also harvested all the usernames who claimed only one native language on the users_languages page on tatoeba.org, if they weren't already listed.
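
The counting described in the notes above can be sketched roughly as follows. This is only an illustration, not the script used to build this page. It assumes the export is tab-separated with the sentence ID, language code, text and username in the first four columns (an assumption about the file layout), and `native_language` is a hypothetical mapping from username to that member's single listed language. The sample rows are invented.

```python
import csv
import io
from collections import Counter


def count_native_sentences(rows, native_language):
    """Count the sentences each listed user owns in their own native language.

    rows: iterable of records shaped like (sentence_id, lang, text, username, ...).
    native_language: hypothetical dict mapping username -> language code.
    """
    counts = Counter()
    for row in rows:
        if len(row) < 4:
            continue  # skip malformed lines
        lang, username = row[1], row[3]
        if native_language.get(username) == lang:
            counts[username] += 1
    return counts


def added_since_last_check(previous, current):
    """Return only the users whose count went up, with the size of the increase."""
    return {
        user: n - previous.get(user, 0)
        for user, n in current.items()
        if n > previous.get(user, 0)
    }


# Invented sample data in the assumed column layout.
SAMPLE = (
    "1\teng\tHello.\talice\n"
    "2\tfra\tBonjour.\talice\n"
    "3\tfra\tSalut.\tbob\n"
    "4\teng\tHi.\talice\n"
)
natives = {"alice": "eng", "bob": "fra"}

# QUOTE_NONE keeps quotation marks inside sentences from being treated as CSV quoting.
reader = csv.reader(io.StringIO(SAMPLE), delimiter="\t", quoting=csv.QUOTE_NONE)
current = count_native_sentences(reader, natives)
previous = Counter({"alice": 1, "bob": 1})
increase = added_since_last_check(previous, current)
```

A user whose count stayed the same or dropped (for example, after duplicates were removed) simply does not appear in `increase`, which matches the note about users without an "added since last check" number.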


Only about 53% of these identified native speakers have actually contributed sentences in their native language.
Non-contributors are hidden by default. Languages shown without usernames have non-contributing native speakers.


Why Is a List of Native Speakers Important?

  • Knowing who the native speakers are will help you find trustworthy sentences that you can translate into your own native language.
  • Of course, native speakers sometimes make mistakes, too, but their mistakes are usually typing mistakes.
  • Non-native speakers are not only more likely to make grammar and vocabulary mistakes, they also often write sentences that aren't the most natural way to say something.


  • This is not an official list. This may not be accurate.

More Data


  • Contact me via tatoeba.org/eng/private_messages/write/CK, ...
    • ... if you find any errors.
    • ... if you want to let me know your native language so I can include you.
    • ... if you know the native language of a member who is not listed here.


  • Who Is Included?
    • These are all the native speakers I know about, either from what was written in their profiles or what I've been told. I have used this information under the (perhaps false) assumption that Tatoeba Project members are honest.
    • I have listed a member's strongest language, which in some cases may not actually be their "native language." (It happens.)
    • I have limited this to one native language per member. I know that there are some members that perhaps grew up speaking more than one language, but even for those people, one language is usually stronger. For most people, it's the language in which they did their formal education, have read the most, written the most, watched the most TV, listened to the most radio, and had the most conversations.
  • There is also a page showing sentence counts of Languages Without Identified Native Speakers.


Table: The number of sentences on tatoeba.org that were owned by usernames on this list in their own native languages.
Date | Native Speaker Sentences | % Native | Total Sentences | Languages with Native Speaker Contributors | Identified Native Speakers | Identified Native Speakers Who Have Contributed | Native Speakers Who Have Added Sentences Since the Previous Check
2016-07-09 | 3,743,023 | 75.51% | 4,956,610 | 91 (out of 310) | 6,023 | 3,214 | 240
2016-06-04 | 3,691,023 | 75.45% | 4,891,870 | 86 (out of 301) | 5,714 | 3,118 | 349
2016-05-07 | 3,636,909 | 75.33% | 4,827,523 | 85 (out of 298) | 5,356 | 2,991 | 333
2016-04-09 | 3,587,732 | 75.30% | 4,764,619 | 85 (out of 257) | 5,100 | 2,906 | 335
2016-03-12 | 3,526,049 | 75.18% | 4,689,829 | 82 (out of 284) | 4,771 | 2,786 | 424
2016-01-30 | 3,448,112 | 74.95% | 4,600,531 | 82 (out of 283) | 4,113 | 2,620 | 353
2015-12-26 | 3,384,438 | 74.77% | 4,526,272 | 81 (out of 274) | 3,598 | 2,493 | 405
2015-11-08 | 3,309,489 | 74.77% | 4,426,189 | 80 (out of 259) | 3,168 | 2,354 | 386
2015-10-10 | 3,228,420 | 74.32% | 4,343,500 | 77 (out of 183) | 2,913 | 2,235 | 335
2015-09-12 | 3,156,892 | 73.86% | 4,273,949 | 77 (out of 183) | 2,726 | 2,138 | 503
2015-06-20 | 2,959,343 | 72.90% | 4,059,266 | 72 (out of 181) | 2,088 | ? | 407
2015-05-02 | 2,828,634 | 72.17% | 3,919,447 | 70 (out of 181) | 1,615 | ? | 307
2015-03-21 | 2,696,054 | 67.62% | 3,767,681 | 67 (out of 181) | 1,471 | 1,471 | 283
2015-01-24 | 2,547,952 | 70.85% | 3,596,201 | 63 (out of 179) | 1,413 | 1,413 | 192
2014-12-27 | 2,466,809 | 70.09% | 3,519,131 unique | 59 (out of 178) | 1,201 | 1,201 | 298
2014-09-13 | 2,265,376 | 67.42% | 3,359,882 | 54 (out of 178) | 1,002 | 1,002 | 282
2014-04-26 | 2,055,007 | 66.77% | 3,077,830 | 53 (out of 171) | 929 | 929 | ?
2014-02-22 | 1,953,634 | 66.58% | 2,934,118 | 53 (out of 131) | 891 | 891 | ?
2013-11-30 | 1,814,143 | 65.67% | 2,762,219 | 52 (out of 131) | 861 | 861 | ?
2013-11-02 | 1,765,812 | 65.43% | 2,698,441 | 52 (out of 131) | 840 | 840 | ?
2013-09-07 | 1,699,912 | 65.40% | 2,599,211 | 51 (out of 131) | 807 | 807 | ?
earlier | ? | I didn't archive this info. | ?


  • Current Trustworthiness of the Tatoeba Corpus
    This figure is based on the assumption that all native speaker sentences should be trusted without a doubt and that non-native speaker sentences should not be blindly trusted. It also assumes that all links between languages are accurate. We've all seen bad native speaker sentences, good non-native speaker sentences, and bad links between sentences, so this percentage is not an attempt to measure accuracy. It reflects perceived trustworthiness under the stated assumptions.
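
The percentage itself appears to be the native speaker sentence count divided by the total sentence count. That reading is inferred from the table, not stated on this page, so treat the following as a minimal sketch under that assumption. The function name `native_share` is made up for illustration, and the input numbers are taken from the two most recent table rows.

```python
def native_share(native_sentences, total_sentences):
    """Percentage of all sentences that are owned by identified native speakers."""
    if total_sentences <= 0:
        raise ValueError("total_sentences must be positive")
    return 100.0 * native_sentences / total_sentences


# Counts from the 2016-07-09 and 2016-06-04 rows of the table.
share_2016_07_09 = native_share(3_743_023, 4_956_610)
share_2016_06_04 = native_share(3_691_023, 4_891_870)

print(f"2016-07-09: {share_2016_07_09:.4f}%")
print(f"2016-06-04: {share_2016_06_04:.4f}%")
```

Both results agree with the "% Native" column to within 0.01 percentage points, which suggests the published figures are this ratio with the extra digits dropped.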