Tatoeba Project

Sunday, October 19, 2014

Tatoeba update (October 19, 2014)

Search results sorted by sentence length

Shorter sentences now rank higher than longer ones in the search results. A shorter sentence is not necessarily a better example sentence, but this should make the results more relevant overall.
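As a rough illustration of the idea, here is a minimal Python sketch of length-based ordering. It is not the actual search engine configuration; the helper name `rank_by_length` and the sample sentences are made up for this example.

```python
def rank_by_length(sentences):
    """Return the matching sentences, shortest first.

    Hypothetical helper for illustration only; on the site the ranking
    is done by the search engine, not by code like this.
    """
    # sorted() is stable, so sentences of equal length keep their original order.
    return sorted(sentences, key=len)


matches = [
    "Why did you not tell me about this earlier?",
    "Why?",
    "Why not?",
]
print(rank_by_length(matches))
```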


Possibility to comment on deleted sentences

The comment form was displayed on deleted sentences, but the comment was not saved after submission. This has been fixed and it is now possible to post comments on deleted sentences.


Script to remove duplicate sentences

This is just a little note that there has been good progress on the deduplication script. We'll hopefully be able to clean up the corpus soon :)


Other fixes
  • Fixed truncation of long URLs containing non-Latin characters.
  • Long words or links that exceed their container box are now wrapped onto new lines instead.
  • Fixed a bug where part of a URL would be converted into a link to a sentence.
  • Fixed a bug where some Wall message previews were displayed as empty on the homepage.

Saturday, October 4, 2014

Tatoeba update (October 4th, 2014)

Sphinx 2.1.9

We have upgraded the search engine to Sphinx 2.1.9. This fixes an issue where searching for the word "why" returned no results, despite the fact that many sentences in the database use this word.


New sentences quickly available to search

You no longer have to wait weeks before a sentence that you have added shows up in the search. Many people have wondered why they could not find a sentence they had recently added. The short answer is that sentences need to be indexed before the search can find them, and we could not index very often because it took too long and used too many resources.
But with the new server, and with gillux's work on implementing a "delta index", we can now provide search results that are much more up-to-date. New sentences will appear in the search results within an hour or less.
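To give an idea of what a delta index does, here is a toy Python model. It is only a sketch of the general technique (a large index that is rebuilt rarely, plus a small one that is rebuilt often), not gillux's implementation; the class and method names are hypothetical.

```python
class DeltaIndexedSearch:
    """Toy model of a main index plus a small delta index (illustrative only)."""

    def __init__(self):
        self.main = {}     # word -> set of sentence ids, rebuilt rarely
        self.delta = {}    # word -> set of sentence ids, rebuilt often
        self.pending = []  # sentences added since the last main rebuild

    @staticmethod
    def _index(sentences):
        # Build a simple inverted index from (id, text) pairs.
        index = {}
        for sentence_id, text in sentences:
            for word in text.lower().split():
                index.setdefault(word, set()).add(sentence_id)
        return index

    def add_sentence(self, sentence_id, text):
        # New sentences are not searchable until the delta is rebuilt.
        self.pending.append((sentence_id, text))

    def rebuild_delta(self):
        # Cheap: only the recently added sentences are reindexed.
        self.delta = self._index(self.pending)

    def rebuild_main(self, all_sentences):
        # Expensive: the whole corpus is reindexed, then the delta is emptied.
        self.main = self._index(all_sentences)
        self.pending = []
        self.delta = {}

    def search(self, word):
        # A query looks in both indexes and merges the results.
        word = word.lower()
        return self.main.get(word, set()) | self.delta.get(word, set())


engine = DeltaIndexedSearch()
engine.rebuild_main([(1, "Why not"), (2, "Good morning")])
engine.add_sentence(3, "Why is the sky blue")
engine.rebuild_delta()
print(sorted(engine.search("why")))  # [1, 3]
```

Because the delta only covers recent additions, it can be rebuilt frequently without reindexing the whole corpus.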


Sentences of a user visible to everyone

We have fixed a bug where the page listing the sentences of a specific user was only accessible to logged-in users. The page is now visible to any user, logged in or not.

Saturday, September 27, 2014

Tatoeba update (September 27th, 2014)

Improvement of the search feature
  • The priority in which sentences are displayed in a search result has been improved: sentences with an owner will be displayed before sentences without any owner, and unapproved sentences (whether they have an owner or not) will be displayed last.
  • Uppercase letters with diacritics are now properly matched with their lowercase version in a search. The problem was that searching for "ça va", for instance, would not return sentences containing "Ça va" (with the ç in uppercase). This should now be fixed.
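The diacritics problem can be reproduced in a few lines of Python. This is only a sketch of the kind of bug described above, not the site's actual code; `naive_lower` and `matches` are hypothetical helpers, and the example assumes precomposed characters (Ç as a single code point).

```python
def naive_lower(text):
    """Lowercase only A-Z, leaving letters such as 'Ç' untouched."""
    return "".join(
        chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch
        for ch in text
    )


def matches(query, sentence, normalize):
    """Check whether the normalized query occurs in the normalized sentence."""
    return normalize(query) in normalize(sentence)


query = "ça va"
sentence = "Ça va ?"
print(matches(query, sentence, naive_lower))   # False: 'Ç' is never lowercased
print(matches(query, sentence, str.casefold))  # True: Unicode-aware case folding
```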

Furigana and romaji
  • Furigana for Japanese sentences is now displayed properly (this was fixed last week). It was previously displayed in katakana and on all the words. It is now displayed in hiragana, and only on words with kanji.
  • The tool to convert Japanese text into romaji now displays romaji properly. It was previously displaying the output in katakana instead of Latin letters.

Other fixes and changes
  • The random feature has been fixed for the following languages: Amharic, Cherokee, Lao, Mon, Sinhala, Tamil, Telugu, Tibetan.
  • References to a sentence number are now converted properly into a link if they are the first word of the message.
  • The "translate" button has been disabled on unapproved sentences.
  • An option was added in the settings to choose whether the last selected list is remembered. It is disabled by default.

Friday, September 12, 2014

Tatoeba update (September 12th, 2014)

What's new
  • New look for the comments form. Clear your cache or refresh the page again if your form looks strange.
  • New data available in export files: sentences with audio.
Fixes
  • The anchor in links to comments is back, so that you jump directly to the right comment when you click on the "#" link.
  • The confirmation popup when deleting a comment is back as well.
Other
  • The text for downloading lists has been reviewed to be clearer.
  • The text on the page "Improve sentences" was made translatable. It will take a bit more time until the strings appear in Launchpad for translation.

Wednesday, September 3, 2014

Tatoeba update (September 3rd, 2014)

New design for messages
  • The main change in this update is the new design for comments on sentences, messages on the Wall, and private messages.
  • You will now also be able to use the sentence URL syntax everywhere, and not only for comments on sentences. The syntax doesn't require brackets anymore (you can simply type #123 instead of [#123]). If you were not aware of this feature: for instance typing #123 in your message will be displayed as a link to the corresponding sentence.
  • The button to send a private message to the author of a message is now present on all messages, and not just on comments on sentences.
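For readers curious how the #123 syntax might be turned into links, here is a minimal Python sketch. It is not Tatoeba's actual code: the function name, the regular expression, and the `base_url` value are all assumptions made for this example.

```python
import html
import re

# Assumption: a reference is '#' followed by digits, and it is not preceded
# by a word character, '&' or '/', so URL fragments such as "page#123" and
# HTML entities are left alone.
SENTENCE_REF = re.compile(r"(?<![\w&/])#(\d+)\b")


def link_sentence_refs(message, base_url="https://example.org/sentences/"):
    """Escape the message, then turn '#123' into a link (hypothetical helper)."""

    def to_link(match):
        number = match.group(1)
        return f'<a href="{base_url}{number}">#{number}</a>'

    # Escape first so that user input cannot inject markup of its own.
    return SENTENCE_REF.sub(to_link, html.escape(message))


print(link_sentence_refs("#123 has a typo, see also #456."))
print(link_sentence_refs("Not a reference: http://example.com/page#123"))
```

The lookbehind also lets a reference match when it is the very first word of a message.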
Other fixes
  • The tags count is updated properly. It was previously not incremented/decremented when adding a tag to a sentence.
  • The sentence count is updated properly. As with tags, the count was not incremented/decremented properly when adding a sentence.

Sunday, August 24, 2014

Tatoeba update (August 24, 2014)

Upgrade to CakePHP 1.3 and other small fixes
  • We have upgraded to CakePHP 1.3. This will not have any visible impact for users. The next step would be to upgrade to version 2.x, but there is no clear plan for it at the moment.
  • We have changed the links to the wiki to make it less confusing for non-English users. The links currently point to the English version only, because too much of the wiki content is untranslated. When the links were not forced to English, non-English users would be redirected to the non-English version of the page and find an empty page or a page requesting them to log in, because the page was not translated or didn't exist.
  • We have fixed various graphical bugs.

Donations news

We recently received a very large donation of 1939€. The donor wished to remain anonymous. We also received another non-negligible 100€, thank you Ray!

In total we have now received 2334€ in donations, which means Tatoeba can afford to pay for a dedicated server for the next few years. We should therefore be much less likely to run into the slowness and downtime that you may have experienced in the past.
With these donations we can also start considering using platforms such as Bountysource to get things done faster.

Saturday, August 16, 2014

Tatoeba update (August 16, 2014)

New download URLs

We have changed the URL of the download files, which contain the data that we redistribute. The files are now also compressed. The old URL is still available for the time being, but will no longer contain the latest data.

International targeting

We've included the necessary HTML tags for Google to display the results in the relevant language, and not systematically in English.
On a related topic, we still (and will always) need people to help us translate Tatoeba's interface into other languages. If you would like to help, check out the instructions here.

Donation news

We'd like to thank our two latest donors, Dmitriy and Aleksandr. We've had 8 donations so far, amounting to a total of 295€. The top donation was 100€.