Sunday, January 6, 2019

New Year, New Tatoeba

Happy New Year everyone :)

In a couple of weeks we will be releasing a new version of Tatoeba! The deployment is currently scheduled for January 19th. On the surface, you won’t notice any difference. Same look, same features (kind of). But there will actually be some major changes.

We’re handling a new license: Creative Commons Zero

It will be possible for Tatoeba contributors to choose between Creative Commons Attribution (CC-BY) and Creative Commons Zero (CC0) when submitting new sentences.

The difference between CC-BY and CC0:
  • With CC-BY anyone can reuse the data for any purpose, but is required to mention where they got the data from.
  • With CC0 there is no requirement at all, no need to say where the data comes from.
As a contributor, if you do not wish to use CC0, you will not have to. You can continue contributing as you used to, nothing will change for you. Your sentences will keep being released under CC-BY.

If, however, you want your contributions to be reusable in other projects without any strings attached, you’ll be able to contribute new sentences under CC0, as well as switch the license of your existing sentences to CC0 under some conditions.

All of this will be detailed further once we deploy the new Tatoeba.

We’re migrating to CakePHP 3

Tatoeba is built on top of a framework called CakePHP. We’ve always been lagging behind, using much older versions than the latest available. The current website is still based on version 2.9, while version 3 was released almost four years ago. But we’ve finally been able to catch up and migrate our code to CakePHP 3.6.

There are still a few features to migrate, but we should be ready to deploy in two weeks!

For our non-tech-savvy users, this migration will perhaps feel like we went backwards. There will be nothing new, but some features may be broken and some may work more slowly than they used to. We will be fixing all of that in the following weeks, so please bear with us.

This migration was an important task for the longer term, for the same reasons as when we migrated from CakePHP 1 to CakePHP 2 a couple of years ago: there are various technical benefits, and Tatoeba can now hopefully look more attractive to developers out there who want to contribute to an open source project.

If you are one of these developers, we will be more than happy to welcome you on board. Don’t be afraid to contact us.

We’re growing as an organisation

Looking back at when we had our “big crash” in 2017 and people were a bit worried about the state of Tatoeba, and looking at where we are now, Tatoeba has made a big step forward as an organisation.

Back then, Tatoeba was funded only with donations. These donations helped us pay for the server, but we never ran big campaigns and could not do much more with our money. Hiring staff was completely out of reach.

Thanks to Mozilla Open Source Support (MOSS), this has changed. We heard of the MOSS program after Mozilla Common Voice approached us to explore ways of collaborating. We applied for it and got accepted. We were awarded $25,000 and were able to hire our first employee.

This made a huge difference for us. Not only were the integration of the CC0 license and the migration to CakePHP 3 possible thanks to this award, but we were also able to fix many bugs and implement many improvements.

We will undoubtedly apply for MOSS again, but we will also look into other ways to get funding. The next big goal would be to find a sustainable flow of income for the decades to come.

2018 was a pretty good year for us. Let's hope the trend continues in 2019 :)

Friday, June 1, 2018

Tatoeba's first employee

With the grant we are receiving from the Mozilla Open Source Support program, we are able to hire our very first employee!

If you're a veteran at Tatoeba, you surely know him: it's gillux.

He contributed a lot to Tatoeba as a developer a couple of years ago and he is now back, as official staff, starting today :)

Sunday, May 13, 2018

MOSS award for Tatoeba

I can finally share some big news with you. Tatoeba will be receiving $25,000 via the Mozilla Open Source Support (MOSS) program. This was a long process, but it's now finally official :)

A little bit of background story.

Back in October last year, folks from Mozilla got in touch with us to explore possible ways of collaborating. They're working on a project called Common Voice and with this project they basically want to collect people's voices. A lot of them.

To achieve this, they need sentences for people to read. Someone told them about Tatoeba... And that's how it started.

But it's not that simple.

One of the requirements of Common Voice is to be able to release their data under CC0 (the Creative Commons version of public domain). Tatoeba's data is CC-BY. Common Voice cannot reuse CC-BY sentences to record audio that they'll publish as CC0. They can only reuse sentences that are in the public domain or CC0.

So there's quite some work to do there, if we want to let Common Voice reuse sentences from Tatoeba. This is what the MOSS award is for. We cannot change our CC-BY license for the data we've released so far. But we can evolve Tatoeba to handle more licenses than just CC-BY.

I'll explain in more detail later on exactly what changes we plan to make. But until then, I would really like to get an idea of where the Tatoeba community stands on this matter.

Would you consider putting part (or all) of your sentences under CC0? Why, or why not? Let me know via this form:

Thursday, July 27, 2017

What's up with Tatoeba now?

It's been a month and a half since our SSD incident and while we managed to bring Tatoeba back online, there are still many features not working and the website is overall very slow to use.

I know many people wished the situation could improve faster, but as it stands now, we don't really have the manpower to get things done more quickly. Fixing everything and getting back to a stable situation will take a long time. Perhaps another couple of months. Perhaps more.

To be honest, the main reason is that I (as the founder of Tatoeba) am in a phase where I wouldn't want to dedicate more than a few hours per week.
There have been times where I could spend as much time as a part-time job working on Tatoeba (maybe even as much as a full-time job?), and things were evolving at a fast pace. Then there have been times where I was completely absent and the project could not really move forward.
Right now, I'm on the low side, which is a big part of why Tatoeba is much slower to recover.

There have been a few questions asked on the Wall, regarding what can be done to improve the situation, and what can be done to keep Tatoeba healthy in the future. I'd like to answer them here to give people a clearer idea of how Tatoeba is functioning, and I'd also like to give a few updates about what is being worked on at the moment.

Also, do you have any idea how much it would cost if tatoeba moved to a better web-hosting with better support? 2 weeks to restore functionality - that's a bit too much, and that's not the first time something like this happened to the site. 
I can contribute 10 euros every year, which is probably a drop in the ocean, but we could probably find 100 people like me among our active users.
It wouldn't cost a whole lot more to move to a web host with better support.

To be fair, the long time it took to restore Tatoeba was not entirely due to our host. They are not in charge of maintaining Tatoeba as a whole; they are only responsible for taking care of the machine where Tatoeba is hosted. It stopped working because the SSD died, and they were definitely quite slow to react (it took 5 days before they replaced it), but it's not their fault that it took an extra week to restore the system and the data on the new SSD.

Still, we're definitely planning to move to another host. We have ordered a new server and hopefully will manage to move Tatoeba to a new home by the end of the month.

Money is not an issue, at least not for paying for web hosting. We currently spend around 35€/month for our server, and even if we had to spend twice as much, we could still afford it without extra donations.

Is there any way to avoid a potential following breakdown or for that to save all the data?
Yes, there are ways.

Tatoeba is currently hosted on a dedicated server and if we move to a VPS (which is the plan), we would no longer have to worry about hardware failure.

Besides that, our data recovery could have gone a lot better if we had invested more effort in backups before the incident. We would have lost only one day of data had we checked our backups properly.
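To illustrate what "checking our backups" could look like in its simplest form, here is a minimal Python sketch. It is not what Tatoeba actually runs: the function names, the idea of one directory holding the dump files, and the one-day threshold are all assumptions made for the example. It only looks at file modification times, so it would catch a backup job that silently stopped, but it would not detect a dump whose contents are stale or incomplete.

```python
from datetime import datetime, timedelta
from pathlib import Path
from typing import Optional


def newest_backup_age(backup_dir: Path, now: datetime) -> Optional[timedelta]:
    """Return the age of the most recently modified file in backup_dir.

    Returns None if the directory contains no files at all.
    (Hypothetical helper; the directory layout is an assumption.)
    """
    mtimes = [p.stat().st_mtime for p in backup_dir.iterdir() if p.is_file()]
    if not mtimes:
        return None
    return now - datetime.fromtimestamp(max(mtimes))


def backup_is_fresh(
    backup_dir: Path,
    now: datetime,
    max_age: timedelta = timedelta(days=1),  # assumed threshold: one day
) -> bool:
    """True only if at least one backup file exists and is recent enough."""
    age = newest_backup_age(backup_dir, now)
    return age is not None and age <= max_age
```

A check like this could be run daily and raise an alert whenever `backup_is_fresh(...)` returns `False`. Verifying that a dump can actually be restored, and that it contains recent rows, would be a separate and more thorough test.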

But keep in mind that securing the data is only a small part of the problems we have to solve. Tatoeba has grown into a complex system, which is becoming more and more challenging to maintain, the more features and content we put into it.

Which leads to the last question.

How is Tatoeba funded? It looks like the recent problems, and the current half-working status are the direct result of not having permanent staff (a.k.a money).
Tatoeba is funded with donations only. We have more than enough to pay for the server; however, we don't have nearly enough to hire permanent staff. And indeed, not having permanent staff is a handicap.

With the way we are operating, when something's not working, it takes time before it gets noticed by someone who can do something about it, because we don't monitor Tatoeba 24/7. Then it takes time to solve the issue because everyone is a volunteer, and Tatoeba is just a side project for all of us. Problems can occur while we're at work, while we're traveling, while we're sick or just too tired to work on it. Some problems are actually quite difficult to solve.

Ideally, we would need a small team of 2 to 4 people working on Tatoeba at least as a part-time job, to ensure that Tatoeba keeps running smoothly at all times and continues to evolve in a sustainable way.

I don't think we can raise enough money for this via donations or crowdfunding. To be fair, I have never tried, so I could be wrong. But we're talking about 50k-100k euros per year, to secure a team. It's a completely different scale from what we're currently dealing with. Honestly, I don't have the "marketing" skills, nor would I have the energy, to raise this kind of money. If not me though, I'm not sure who else would do this.

But even if someone walked up to us and threw millions at us, our issues wouldn't magically disappear. It would still take time to build a team, to find the right people with the right skills to solve those issues.

While I would like Tatoeba to be as independent from money as possible, I do think that one day or another, Tatoeba will need permanent staff, which is quite difficult to achieve without money. We can keep the project alive with volunteer staff, but we cannot make it grow much bigger than it is now.

Anyway, this is more of a long term discussion.

In the shorter term, what's happening?

Currently, we're lucky to have pep (a.k.a. Ppjet6 on Tatoeba) who stepped up to help on the whole sysadmin/devops part. He'll be the main person working on migrating Tatoeba to the new server.
If you have any knowledge in these areas and wish to get involved in one way or another, you're more than welcome to join the Tatoeba IRC channel (#tatoeba) on freenode.

In fact, you're welcome to join even if you're just gonna be lurking, or if you'd like to give some moral support. You can potentially learn a lot about how Tatoeba works by just hanging around in the IRC channel. And even if you won't get involved for the time being, who knows, maybe in 6 months, in a year or in two years, Tatoeba will need you to save it from a dire death.

Also, you should be aware that there are other places than the Wall where you can gather and interact with other Tatoeba members. When Tatoeba is down, the Wall is down too, so it's important to have other channels of communication. Those are:
Feel free to use those external communication channels to discuss Tatoeba. They are not restricted to developers only.

Last but not least, we now have a status page to communicate information about the status of Tatoeba (whether it's online, or down, or experiencing issues, or undergoing maintenance, etc):
For now it's very minimal, but the idea is that if you're trying to access the Tatoeba website but can't, or if something doesn't work anymore on the website, you'll be able to find information about it on the status page.

Monday, June 26, 2017

Tatoeba is back

We can finally bring Tatoeba back online, but be aware: things will probably be a little bit shaky for the next couple of weeks.

What happened?

Basically, the SSD of our server died. Last time it happened (back in February), our web host was able to recover the data. This time however, we were not as lucky. As a result, we lost everything on that SSD and had to set up everything from scratch.

On the bright side, we had some backups, so we didn't lose absolutely everything.

On the not-so-bright side, our database backup, for some reason, doesn't contain data past mid-February.

We were still able to recover some data from the weekly CSV exports that we share on the Downloads page, but there's data we won't be able to recover at all. We also couldn't get the latest exports for everything. Some of the CSV files were from June 10, and others from June 3.

So, roughly:
  • Any private message that was sent after Feb 19 is lost.
  • Sentences/translations that were added after June 10 are lost.
  • Contributions made after June 3 don't have logs.
  • Comments posted on sentences after June 3 are lost.
  • The Wall messages have not been re-imported yet, and any message posted after June 3 is lost.
  • Users who have registered after Feb 19 will not be able to log in, nor will they be able to reset their password. They will have to contact us so we can fix it, case by case. Or they will have to create a new account.
  • Accounts of users who registered after June 3 are lost.

This is not an exhaustive list. There are a few other things missing here and there.

But anyway, Tatoeba is at least partially usable and we can re-enable the website and see how it goes while we slowly fix the remaining issues. We may have to shut it down again temporarily if there's any major issue, but any further maintenance shouldn't last more than a few hours (hopefully!).

We're very sorry for this incredibly long downtime, and we're very sorry we couldn't recover more data.

Thanks a lot for your patience and your support!

Monday, June 12, 2017

Tatoeba is temporarily down (June 11th, 2017)

Tatoeba is temporarily down due to issues with our service provider. We have filed a ticket. We will write another post when the website is available again.

Tuesday, February 28, 2017

Tatoeba is back up (February 28th, 2017)

As of February 28, 2017, Tatoeba is back up. Thank you for your patience yesterday.