Tatoeba Project

Sunday, November 23, 2014

Tatoeba update (November 23rd, 2014)

  • Regarding the link feature for advanced contributors: it is now possible to drag-and-drop the icons (instead of the sentence text) into the link icon in the menu.
  • Our asset files (images, CSS, JavaScript) now have a timestamp, so that the browser knows whether it needs to update them. This means you should no longer have to worry about clearing your browser's cache.
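
For the curious, the timestamp idea can be sketched in a few lines of Python. This is only an illustration, not the code Tatoeba uses: the function name `versioned_url` and the choice of putting the timestamp in a query string are assumptions.

```python
import os


def versioned_url(url_path: str, file_path: str) -> str:
    """Append the file's last-modification time to its URL.

    When the file changes, its modification time changes, so the URL
    changes too and the browser fetches the new version instead of
    reusing the cached one.
    """
    mtime = int(os.path.getmtime(file_path))
    return f"{url_path}?{mtime}"
```

As long as the file stays the same, the URL stays the same and the browser keeps using its cached copy.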

Development website

Gillux recently set up a development (dev) website. The purpose of the dev website is to let members test new features and check the interface translations BEFORE they are released on the production (prod) website, that is, the actual Tatoeba website.


We are planning to disable Tatoeba temporarily next weekend (November 29) for maintenance.
The maintenance consists of changing the storage engine of our MySQL database from MyISAM to InnoDB. This operation requires stopping all access to the database, which is why we need to shut down Tatoeba. It should take around 3 hours.
We need to make this change in order to run the sentences deduplication script. More about this below.
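
In MySQL, the storage engine is changed one table at a time with an `ALTER TABLE ... ENGINE=InnoDB` statement. The small Python sketch below only builds such statements as text. The table names are hypothetical and the actual migration procedure may well differ.

```python
def engine_migration_statements(tables, engine="InnoDB"):
    """Build one ALTER TABLE statement per table name."""
    return [f"ALTER TABLE `{name}` ENGINE={engine};" for name in tables]


# Hypothetical table names, for illustration only.
for statement in engine_migration_statements(["sentences", "sentences_translations"]):
    print(statement)
```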

Sentences deduplication

First of all, note that the deduplication script will not run during the maintenance, but afterwards. The script can run while Tatoeba remains available. It is not yet certain whether we will run the script next weekend or later, as we are still debugging it.

There was a first test of the script on the dev website. It took 9.5 hours to complete. You can help us make sure that the script works well by checking the dev website. Duplicates that were removed can be identified by the fact that they were deleted by Horus (the current name of the deduplication bot).
If you notice any issue, such as sentences that were deleted when they shouldn't have been, or information that was not re-linked properly, report the problem to us on the Wall of the real website (not on the dev one, please) or on our Google group.
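
To give an idea of what "deleted" and "re-linked" mean here, below is a simplified Python sketch of a deduplication pass. It is not the actual script: the data layout (a dict of sentences and a set of links) and the rule of keeping the lowest sentence number are assumptions made for the example.

```python
from collections import defaultdict


def deduplicate(sentences, links):
    """Merge sentences that have the same language and the same text.

    sentences: dict mapping sentence id -> (language, text)
    links:     set of (id_a, id_b) translation links
    Returns the remaining sentences and the re-linked set of links.
    """
    # Group sentence ids by (language, text).
    groups = defaultdict(list)
    for sid, key in sentences.items():
        groups[key].append(sid)

    # In each group, keep the lowest id and map every id to its keeper.
    remap = {}
    for ids in groups.values():
        keeper = min(ids)
        for sid in ids:
            remap[sid] = keeper

    kept = {sid: sentences[sid] for sid in set(remap.values())}

    # Re-link: point each link at the kept sentence, drop self-links.
    new_links = set()
    for a, b in links:
        a2, b2 = remap[a], remap[b]
        if a2 != b2:
            new_links.add((a2, b2))
    return kept, new_links
```

In this sketch, a translation that was linked to a removed duplicate ends up linked to the sentence that was kept. This is the kind of re-linking worth checking on the dev website.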

Sunday, November 16, 2014

Tatoeba update (November 16th, 2014)

Link to any sentence

This new feature affects only advanced contributors and corpus maintainers. It is now possible to link a sentence to any other sentence, and not just to its indirect translations. You will find an additional icon in the sentence's menu, which looks the same as the "link" icon next to the translations. Clicking on this icon opens a text input where you can indicate the target sentence.
You can enter either the sentence number or copy-paste the sentence URL.
You can also drag-and-drop a sentence's URL into the icon.

Linking and unlinking refreshes all the translations

There were some inconsistencies with the list of indirect translations displayed after linking or unlinking a sentence. This is now fixed. Whenever you link or unlink, you will see the correct list of indirect translations without having to refresh the page.

Contributions logs

The design of the logs has been revised to take into account the feedback we received. If you do not see any change, try to empty your cache and/or refresh again.
Note that the date is now clickable and will redirect you to the sentence's page. The sentence itself is left as plain text so that people can copy-paste it - or part of it - more easily.

We won't implement any option to choose between the new and old design, but for those who are very attached to the old design, here's some CSS code that you can use with the Stylish extension.
Our member CK also has a page about using Stylish with Tatoeba, with some code snippets that you can reuse.
I encourage you to learn about CSS and customize the looks to your own taste, not only for the logs but for any other part of Tatoeba. And if you do come up with something that looks a lot better, don't hesitate to share with the rest of the community!

Search fix for sentences translated into the same language

If you have ever tried to search from and into the same language (for instance, searching "fish" from English to English), you may have noticed that the results include many sentences that do not have any translation. In case you are wondering: yes, it is possible in Tatoeba to have two sentences of the same language linked to each other.
This kind of search now only returns sentences that do have translations. So searching "fish" from English to English will only return sentences that have at least one direct or indirect translation in English.
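
The rule can be sketched in Python as follows. This is a simplified illustration, not the real search code: the `langs` and `links` dictionaries are assumed data structures, and "indirect" is taken to mean a translation of a translation.

```python
def has_translation_in(sentence_id, target_lang, langs, links):
    """Return True if the sentence has a direct or indirect translation
    in target_lang.

    langs: dict mapping sentence id -> language code
    links: dict mapping sentence id -> set of directly linked sentence ids
    """
    direct = links.get(sentence_id, set())

    # Indirect translations are the translations of the direct ones,
    # excluding the sentence itself and its direct translations.
    indirect = set()
    for translation in direct:
        indirect |= links.get(translation, set())
    indirect -= direct | {sentence_id}

    return any(langs[t] == target_lang for t in direct | indirect)


langs = {1: "eng", 2: "fra", 3: "eng", 4: "eng"}
links = {1: {2}, 2: {1, 3}, 3: {2}}

# Sentence 1 reaches English sentence 3 through French sentence 2.
print(has_translation_in(1, "eng", langs, links))
# Sentence 4 has no translation at all, so it would be filtered out.
print(has_translation_in(4, "eng", langs, links))
```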

(Edit: forgot to mention one thing)
Fixed message not submitted after changing UI language

This update also fixes an annoying bug that prevented people from sending comments, wall posts, translations, private messages, etc. whenever the interface language had been changed from a page other than the one they were submitting from. The symptom was a never-ending loading icon that replaced the text you wanted to submit, while nothing was actually submitted.

Tuesday, November 11, 2014

Tatoeba update (November 11th, 2014)

Contributions logs
  • The contributions logs have been redesigned. 
  • There is a small additional visual feature: log entries that are obsolete are displayed a bit differently (with a dotted line and grey text), to indicate that the sentence was modified again afterwards.
  • The latest contributions page now also includes the list of users who participated in the latest contributions. It is the same list that you would find in the Members page.

New platform for UI translations

We moved to a platform called Transifex to manage our interface translations. Hopefully this will help us build a more cohesive translators team.
For those who were previously translating on Launchpad: we do not use Launchpad anymore. Don't worry, the translations that were made in Launchpad were exported to Transifex, so no translation was lost.

If you would like to join the translators team, simply go to this page, click on "Help translate Tatoeba website", create an account, log in and apply to the language(s) in which you'd like to translate. If the language is not listed, you can request it to be added. Once your application is validated, you will be able to submit translations.

Sunday, October 19, 2014

Tatoeba update (October 19, 2014)

Search results sorted by sentence length

Shorter sentences will have higher priority over longer ones in the search results. Even though the length of a sentence does not necessarily imply that it's a better example sentence, this should make the results more relevant overall.
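
As a rough picture of the effect, here is the idea in Python. It is only a sketch of the principle, not how the search engine computes its ranking.

```python
def rank_by_length(results):
    """Order matching sentences from shortest to longest.

    sorted() is stable, so sentences of equal length keep the order
    they already had in the original result list.
    """
    return sorted(results, key=len)


results = [
    "I would like to eat some fish.",
    "I like fish.",
    "Fish swim.",
]
for sentence in rank_by_length(results):
    print(sentence)
```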

Possibility to comment on deleted sentences

The comment form was displayed on deleted sentences, but the comment was not saved after submission. This has been fixed and it is now possible to post comments on deleted sentences.

Script to remove duplicate sentences

This is just a little note that there has been good progress on the deduplication script. We'll hopefully be able to clean up the corpus soon :)

Other fixes
  • Fixed truncation of long URLs containing non-Latin characters.
  • Long words or links that exceed their container box are now wrapped onto a new line instead.
  • Fixed a bug where part of a URL would be converted into a link to a sentence.
  • Fixed a bug where some Wall message previews were displayed as empty on the homepage.

Saturday, October 4, 2014

Tatoeba update (October 4th, 2014)

Sphinx 2.1.9

We have upgraded the search engine to Sphinx 2.1.9. This fixes an issue where searching the word "why" would return no result, despite the fact that many sentences in the database use this word.

New sentences quickly available to search

You will no longer have to wait weeks before you can find, through the search, a sentence that you have added. We know that many people have been wondering why they cannot find a sentence that they recently added. The reason, in short, is that sentences need to be indexed before you can find them through the search. We couldn't index too often, because it would take too long and use too many resources.
But with the new server, and with gillux's work on implementing a "delta index", we can now provide search results that are much more up-to-date. New sentences will appear in the search results within an hour or less.
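
The principle of a delta index can be illustrated with a toy Python example. This is not Sphinx's implementation or configuration, and the class and method names are made up for the illustration. The large main index is rebuilt rarely, a small delta index covering only the sentences added since then is rebuilt often, and a search looks in both.

```python
class DeltaIndexedSearch:
    """Toy word index made of a big 'main' part and a small 'delta' part."""

    def __init__(self):
        self.main = {}
        self.delta = {}
        self.max_main_id = 0  # highest sentence id covered by the main index

    @staticmethod
    def _build(sentences):
        index = {}
        for sid, text in sentences.items():
            for word in text.lower().split():
                index.setdefault(word, set()).add(sid)
        return index

    def rebuild_main(self, sentences):
        """Expensive: index every sentence. Done rarely."""
        self.main = self._build(sentences)
        self.max_main_id = max(sentences, default=0)
        self.delta = {}

    def rebuild_delta(self, sentences):
        """Cheap: index only sentences added after the last main rebuild."""
        recent = {sid: text for sid, text in sentences.items()
                  if sid > self.max_main_id}
        self.delta = self._build(recent)

    def search(self, word):
        word = word.lower()
        return sorted(self.main.get(word, set()) | self.delta.get(word, set()))


database = {1: "I like fish", 2: "Fish swim"}
search = DeltaIndexedSearch()
search.rebuild_main(database)

database[3] = "The fish is fresh"  # a newly added sentence
search.rebuild_delta(database)     # quick, so it can run often
print(search.search("fish"))
```

Because the delta part only contains recent sentences, rebuilding it is fast, which is what makes frequent updates affordable.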

Sentences of a user visible to everyone

We have fixed a bug where the page listing the sentences of a specific user was only accessible to logged-in users. The page is now visible to any user, logged in or not.

Saturday, September 27, 2014

Tatoeba update (September 27th, 2014)

Improvement of the search feature
  • The priority in which sentences are displayed in a search result has been improved: sentences with an owner will be displayed before sentences without any owner, and unapproved sentences (whether they have an owner or not) will be displayed last.
  • Uppercase letters with diacritics are now properly matched with their lowercase versions in a search. The problem was that searching for "ça va", for instance, would not return sentences containing "Ça va" (with the ç in uppercase). This should now be fixed.
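
The diacritics problem is easy to reproduce in Python. The sketch below assumes, purely for illustration, that the old behaviour came from lowercasing only the letters A to Z; the actual cause in the search engine setup is not described here.

```python
def ascii_only_lower(text):
    """Lowercase only A-Z, leaving letters such as 'Ç' untouched."""
    return "".join(
        chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch
        for ch in text
    )


def matches(query, sentence):
    """Case-insensitive substring match that is aware of Unicode."""
    return query.casefold() in sentence.casefold()


# With ASCII-only lowercasing, 'Ç' stays uppercase and the search misses.
print("ça va" in ascii_only_lower("Ça va ?"))
# With Unicode-aware case folding, the sentence is found.
print(matches("ça va", "Ça va ?"))
```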

Furigana and romaji
  • Furigana for Japanese sentences is now displayed properly (this was fixed last week). It was previously displayed in katakana and on all the words. It is now displayed in hiragana, and only on words containing kanji.
  • The tool to convert Japanese text into romaji now displays romaji properly. It was previously displaying the output in katakana instead of Latin letters.
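
For readers curious about the katakana and hiragana part: in Unicode, each katakana character from ァ (U+30A1) to ヶ (U+30F6) sits exactly 0x60 code points after its hiragana counterpart, so the conversion can be sketched as below. This is an illustration in Python, not the code Tatoeba uses, and the kanji check only covers the main CJK Unified Ideographs block.

```python
KATAKANA_START, KATAKANA_END = 0x30A1, 0x30F6
KATAKANA_TO_HIRAGANA_OFFSET = 0x60


def katakana_to_hiragana(text):
    """Convert katakana to hiragana, leaving other characters as they are."""
    return "".join(
        chr(ord(ch) - KATAKANA_TO_HIRAGANA_OFFSET)
        if KATAKANA_START <= ord(ch) <= KATAKANA_END
        else ch
        for ch in text
    )


def contains_kanji(word):
    """Return True if the word has at least one kanji (U+4E00 to U+9FFF)."""
    return any(0x4E00 <= ord(ch) <= 0x9FFF for ch in word)


print(katakana_to_hiragana("タベル"))
# Only words containing kanji would get furigana.
print(contains_kanji("食べる"), contains_kanji("ありがとう"))
```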

Other fixes and changes
  • The random feature has been fixed for the following languages: Amharic, Cherokee, Lao, Mon, Sinhala, Tamil, Telugu, Tibetan.
  • References to a sentence number are now converted properly into a link if they are the first word of the message.
  • The "translate" button has been disabled on unapproved sentences.
  • An option was added in the settings to choose whether the last selected list is remembered. By default it is disabled.

Friday, September 12, 2014

Tatoeba update (September 12th, 2014)

What's new
  • New look for the comments form. Clear your cache or refresh the page again if your form looks strange.
  • New data available in export files: sentences with audio.
  • The anchor in links to comments is back, so that you jump directly to the right comment when you click on the "#" link.
  • The confirmation popup when deleting a comment is back as well.
  • The text for downloading lists has been reviewed to be clearer.
  • The text on the page "Improve sentences" was made translatable. It will take a bit more time until the strings appear in Launchpad for translation.