Tatoeba Project

Sunday, April 6, 2014

Tatoeba update (April 6, 2014): Chinese converter fixes and 21 new languages

- The logo no longer says "beta", nor does our discussion of the list feature.

- The pinyin converter now works again.

- The Chinese traditional/simplified converter now works again.

- Updated icons for Kurdish and Telugu.

- The Spanish user interface is now fully translated.

- When editing a comment using the German interface, the text for the "Abbruch" link no longer overlaps the "absenden" button.

New languages:

- Bashkir
- Chuvash
- Hausa
- Hawaiian
- Hill Mari
- Kinyarwanda
- Kyrgyz
- Lakota
- Luxembourgish
- Macedonian
- Mambae
- Mon
- Nogai
- Ottoman Turkish
- Pipil
- Shona
- Shuswap
- Somali
- Yakut
- Yoruba
- Zulu

These 21 new languages, added to the 146 we had previously, give us a total of 167.

Sunday, March 30, 2014

Tatoeba update (March 30, 2014): 15 new languages

We have just updated the website again. Tatoeba now has 15 new languages, for a total of 146. The new languages are:

- Amharic
- Awadhi
- Bhojpuri
- Chavacano
- Middle English
- Middle French
- Haitian Creole
- Juhuri (Judeo-Tat)
- Greenlandic
- Meadow Mari
- Nahuatl
- Pennsylvania German
- Sinhala
- Turkmen
- Walloon

Thank you to those who gave us the sentences and information to fulfill these requests. Note that the procedure for requesting a new language (which involves supplying at least five sentences in that language) can be found via the Tatoeba menu under "More"/"Tatoeba Wiki"/"How to Request a New Language", or at this link.

Sunday, March 23, 2014

Tatoeba update (March 23, 2014)

We are pleased to announce a set of updates to the site. In addition to the differences that you'll see when you visit the site, we have some major changes behind the scenes that make it easier for us to attract and work with developers around the world.

- Contributors can now edit their comments on sentences or the Wall.

User Interface
- Added link to friendlier search instructions.
- Improved UI text (fixed misspellings, etc.) in English.
- Incorporated updates to UI translations from the past year or longer, most notably in Japanese and German (which is now 100% translated!).
- Internationalized several strings so that they can now be translated.
- Changed remaining references to "tatoeba.fr" into references to "tatoeba.org".
- Renamed "Modern Greek" to "Greek".

- Empty passwords are no longer accepted.

- Now accepts profile photos with uppercase file extensions as well as lowercase.

- Moved repository from Subversion on Assembla to Git on GitHub.
- Added scripts for adding languages and incorporating updated translations.
- Fixed various issues that appeared on developers' machines.

- All sentences have been indexed, so they will appear in the search results.

Even more important than the changes to the code is the fact that the team behind it is stronger and more responsive than it has been in a long time. We are especially looking forward to working with our Google Summer of Code participants, once we know who they will be.

Whether you are interested in contributing sentences, translating the user interface, developing code, testing the site, or all of the above, we hope you will join the team!

Saturday, March 1, 2014

Why We Need You to Help Beyond Adding Sentences

al_ex_an_der wrote: "I'd find it helpful if you could explain if possible in plain English why a newly added sentence can be found by Google already one minute later but by Tatoeba only one month later." I thought this was worth some discussion in a thread of its own.

First of all, I did a little experiment to determine whether a Google search for a word contained in a sentence that I had added a minute earlier really would succeed. Answer: no, though in one case, it remarkably took only about fifteen minutes before a search ("incontrovertible site:tatoeba.org") found it. But searches for words that I added in sentences seventeen hours and one hour ago came up empty.

To address Alexander's larger point: Why is it that Google indexes words so quickly, and Tatoeba takes so long? It comes down to differences in the hardware, software, human resources, and project management available to Google (a corporation with US$59 billion of revenue in 2013) and Tatoeba (a nonprofit whose budget is somewhat smaller). Google has vast "farms" of machines. Tatoeba has one. Even two machines would be a big improvement because one could index while the other was still actively handling requests and adding sentences. Getting from one to two, however, requires more funding, which demands organization, not just in terms of assembling a proposal for a grant or plans for fundraising, but for putting the money to use if and when it actually comes through. It also requires someone to write the code that can handle interaction between two computers operating in parallel. Software can accomplish what seems like magic, but it's not written by magic.

Tatoeba.org can never hope to replicate the money or machines that Google.com has at its disposal, but we can do a far better job (even beyond the impressive things we already do) if we get a lot more participation in everything that makes the site run, beyond the operations of adding, commenting on, modifying, and deleting sentences. Many of the people essential to Google are not software developers, and much of what we're missing at Tatoeba can be provided by people who are not developers, either.

In my last long post, I called for volunteers for testing, either at a high level (putting together a test plan and coordinating other volunteers), or simply working through some screens and determining whether they work. I also asked for someone to coordinate the translators who work on the code at Launchpad. Of course, I would have been glad if someone proposed to help in some way that I didn't even mention. But I was disappointed that no one responded at all. I want you to understand why people stepping up to help are not just nice to have, but essential.

We've undergone some changes in the way we store code, and we need to undergo some changes in the way we put it on the server. If we don't test before and after we make these changes, we could easily break something without knowing that we've broken it. But testing takes time. If I am responsible for doing every level of test planning and testing, as well as planning how to move the code without losing anything, it will take weeks longer to get to the point where we can move it. It will also become likely that something else will change in the interim, so we'll have to begin the cycle again without making any progress.

People who work at Google are motivated by some combination of enjoyment of the tasks involved in their jobs, satisfaction from accomplishing the assignments that they're not initially able to do, and financial incentives for doing their work. Their jobs require them to learn new skills and to do what has to be done, not just what they know they already enjoy doing.

Tatoeba can't provide financial incentives, but we can give you everything else, including the chance to move beyond what you already know you can do to tackle what has to be done (write up a test plan, collect bug reports and enhancement requests from the Wall, fix code written in PHP even if your favorite programming language is Python), and feel proud of what you've accomplished. You can also feel pleased that you're keeping Tatoeba going so that you can continue to add sentences to the corpus.

There is one more reason why we need a coordinated team to bridge the gaps: You don't want anyone to burn out because they're asked to do too much. We all have commitments, and are limited in how much time we can contribute. People who sense that they don't have enough time to do a job right will drop out entirely. Let's make sure that we take full advantage of the incredibly talented people who've gotten us this far, and those who have yet to join us, by making sure that all the pieces fit together.

Please send me a note telling me how you'd like to help. Many thanks!

Sunday, February 23, 2014

Update on development

I just wanted to give an update on development at Tatoeba. There has been a lot going on behind the scenes.

To begin with, we're rebuilding the team. Developers who were involved in the past but had to take a break are now part of the crew again. Others are learning new skills so that they can help perform new tasks and make life easier for others. Thanks to lool0, we have a new mailing list so that people can communicate via e-mail and continue to refer to our collective wisdom throughout eternity.

In terms of changes that make life better for developers: First, pep has created a virtual machine, which means that regardless of whether developers are running on Windows, Mac, or Linux, they can recreate Tatoeba on their own machines, and can test both how it works now and how their changes affect it. This is a big deal. Second, lool0 has moved our repository (the place where we store our code and our problem reports) to GitHub, which is easier for our developers in all countries to reach, and opens us up to collaboration with people who find out about us there. Finally, I've been working with the various translations of the user interface so that the good work already done by the UI translation teams at Launchpad, as well as their translations for all our new features, can be seen live. We are also getting closer to getting new languages and audio onto the site once again. And perhaps it will even become possible to use Tatoeba on a smartphone.

But developers shouldn't have all the fun! I'm hoping that some of you will help with various tasks that don't require you to know how to write code. For instance, I would love for someone to take on the role of translation czar (monarch?), who communicates with the various Launchpad translation teams and sees how we're coming along with making the Tatoeba user interface available in dozens of languages. It would also be great if we could get people involved in testing, whether at the top level (creating a test plan, recruiting and coordinating with other testers), or simply stepping through a list of functions and seeing how well they work. Finally, it would be nice to have a king/queen of collecting audio so that Tatoeba can be heard as well as seen.

Please send me a private message at Tatoeba (or e-mail me at alanf . tatoeba AT gmail) if you're interested in getting involved. And if you want to be part of the development team, visit (and join) the mailing list at https://groups.google.com/forum/?hl=en#!forum/tatoebaproject . I look forward to hearing from you!

Sunday, January 12, 2014

Hello, team!

Hello, everyone! Trang asked me to write this post to let you know that I'm going to be coordinating between the administrators, the developers and the contributors who form the Tatoeba community. As she explained in her previous post, she and sysko are very busy now with other responsibilities. Thus, they can't be involved in all the same ways, and to the same extent, as they were in the past. However, she will maintain an active advisory role in which she promptly answers the questions that I pass to her from the developers. In turn, I will make sure to relay those answers quickly back to the developers. In addition, as I've been doing for months now, I will make sure that the problems and requests posted by contributors on the Wall make their way into help tickets so that they can be tracked and solved in an organized way.

I will also be working with the developers on ways to make the site robust. We will be documenting and spreading the collected knowledge that prevents problems and that helps us recover from them quickly when they do happen. At the same time, we will be planning how to restart development.

I will soon be contacting people who have contributed to development, or have helped us get back on our feet after problems have occurred, or both. But just as importantly, I urge you to contact me if you can help with the technology on the site, whether or not you have done so in the past. You can always send me a private message via Tatoeba to tell me you would like to get involved, along with ways that you can be contacted and a description of your skills, interests, and experience. (There will be other ways you can contact me in the future.) I'll let you know how the current software on the site is set up and how you can jump in.

I'm excited about our getting together to make this site, which I love so much, stronger and better in every way. Join us in making 2014 a happy new year for Tatoeba!

Saturday, January 11, 2014

We need a better team!

Alright, so we have a problem.

Last month Tatoeba crashed and was then unavailable for more than a week, which is a pretty long time for a website that is used every day by thousands of people. After we managed to get Tatoeba back up, there were still issues with the site being very slow, and there was some additional downtime. Only now, after 3-4 weeks, are things stable again (at least they seem to be). It's a good thing that we got everything to work again, but it's not a good thing that it took so long. Now that I don't have to worry about Tatoeba not responding, or being too slow, I'd like to take some time to talk about the current situation.

I don't want to sound dramatic, but the current situation isn't good. There was a time when Tatoeba was actually growing, as opposed to the past 2 years, during which things have been stagnant. There was a time when users would report bugs, and they would be fixed within a few days, sometimes within a few hours. There was a time when users would request new features, and they would be implemented and released the next week. And if Tatoeba crashed, we could be working on it before you would even notice it was down. Basically, that was the time when sysko and I were both very involved in the project, and had the necessary time, motivation, energy, and passion to work on it.
But things have changed and now it feels like the project is going to fall apart if we don't do anything. Not right now, but one day. I mean, it's still working, a lot of people still love it, but nobody can maintain it properly anymore, and nobody can make it grow anymore. There are so many things that we could do, that we should do, but neither sysko nor I can (or want to) do them, and they will never be done unless someone other than us is willing to take care of them.

So my priority right now is to make a better team for Tatoeba. We need more people to take on tasks/responsibilities that would usually be done by sysko or myself. I'm talking about things like accessing the server and updating Tatoeba's code to include bug fixes or new features, writing on the blog and on Twitter to keep users informed of what's going on or what we're working on, replying to emails that are sent to team@tatoeba.org, etc.
As far as I'm concerned, I know that I will never be able to dedicate as much time and energy to Tatoeba as I used to, and neither will sysko. But I still want this project to keep growing and be more successful, and I know it's not going to happen if only sysko and I are in charge of these "higher responsibility" kinds of tasks.
Of course we wouldn't give such responsibilities to complete strangers, and there are some people who are on my "need to talk to" list. But whoever you are, if you're reading this and feel that you would want to participate in this project on a higher level, then contact us and let us know about it!

Now, before I end this article, there is another topic that I'd like to mention briefly: donations. It has been suggested a few times on the Wall and also in our IRC channel that we should start a donation campaign, or try to raise money through Kickstarter or by selling goodies. I will talk more about this in another article, but my short answer is: yes, I agree. And this is probably going to be one of the next priorities. But first, we need a better team, and hopefully 2014 will be a better year for Tatoeba.

Happy New Year by the way :)