Sunday, June 27, 2010

Tatoeba update (Jun 27th, 2010)

What's new
  • A page that lists all the tags. NOTE: It's not organized at all; it exists only for the sake of having a page that displays all the existing tags.
  • A page that lists all the sentences in a specific language, with the option to show only those that are NOT yet translated into a certain language. For instance, Japanese sentences not yet translated into English. A useful feature for contributors =)
  • The option to filter by language on the page that lists sentences with a certain tag.
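
For the curious, here is a rough Python sketch of how a "not yet translated into X" filter could work. It is only an illustration: the Sentence record, its field names and the language codes are assumptions made up for this example, not Tatoeba's actual code or schema.

```python
from dataclasses import dataclass, field


@dataclass
class Sentence:
    """Hypothetical in-memory record; not Tatoeba's real schema."""
    id: int
    lang: str   # language code (assumed format, e.g. "jpn")
    text: str
    translation_ids: set[int] = field(default_factory=set)  # ids of direct translations


def untranslated(sentences: dict[int, Sentence],
                 source_lang: str,
                 target_lang: str) -> list[Sentence]:
    """Return sentences in source_lang that have no direct translation in target_lang."""
    result = []
    for s in sentences.values():
        if s.lang != source_lang:
            continue
        # Collect the languages of all known direct translations of this sentence.
        langs = {sentences[t].lang for t in s.translation_ids if t in sentences}
        if target_lang not in langs:
            result.append(s)
    return result


corpus = {
    1: Sentence(1, "jpn", "ありがとう。", {2}),
    2: Sentence(2, "eng", "Thank you.", {1}),
    3: Sentence(3, "jpn", "おはよう。", {4}),
    4: Sentence(4, "fra", "Bonjour.", {3}),
}

# Japanese sentences not yet translated into English.
print([s.id for s in untranslated(corpus, "jpn", "eng")])  # prints [3]
```

Sentence 1 is skipped because it already has an English translation, while sentence 3 only has a French one.
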
What's next
  • The ability to import sentences from a CSV file. This feature won't be available to normal users. To start with (and, I think, for a long time), only moderators will have access to it, so anyone who wants to import sentences from a file will have to make a request. Anyway, the main point is that as soon as we have this feature, we will add a massive number of new sentences =]

Friday, June 11, 2010

Tatoeba update (June 12th, 2010)

What's new

I am glad to announce that we are finally introducing... tags!! :D

This will provide a way for people to add meta-data to sentences. For instance "proverb", "formal", "informal", "male", "female", etc. Such information can be very useful for language learners because they cannot necessarily guess such things just by reading the sentence.

Tags will be restricted for a short period of time. Only trusted users will be able to add tags, but everyone can see the tags associated with a sentence. When we feel the feature is ready for everyone, we will allow everyone to add tags.

People will be free to tag sentences with whatever they want. We don't really have any strict rules yet because tags are still new, and we want to see how people use them. But I can at least suggest some basic tags:
  • proverb, archaic, slang
  • formal, informal
  • male, female (to indicate whether the sentence is said by a man or a woman)
  • to delete, to correct, checked (I will talk more about these)
  • controversial, unsafe (to mark sentences that can cause problems, are not suitable for kids, etc.)
  • easy, intermediate, difficult (to indicate the level of difficulty of a sentence)
So these are only my suggestions. Again, the tag feature is new, so we will necessarily go through a phase of experimentation before we can clearly set any rules. We count on everyone to try it out and help us figure out what works best. Feel free to discuss issues related to tags on the Wall.

A few more things you need to know about tags:
  • You can see the list of sentences associated with a certain tag by clicking on the tag.
  • You can remove a tag from a sentence only if you were the one who added it.
  • Moderators can remove any tag.
  • It's not possible to add the same tag twice to a sentence. If someone has already added "proverb", you can't re-add "proverb".
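
To make these rules concrete, here is a small Python sketch of them. The class and field names are hypothetical, and treating moderators as allowed to add tags is my assumption for the example; this is not how Tatoeba is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """Hypothetical user record with the two roles mentioned in this post."""
    name: str
    is_trusted: bool = False
    is_moderator: bool = False


@dataclass
class TaggedSentence:
    text: str
    # Maps tag name -> name of the user who added it.
    tags: dict[str, str] = field(default_factory=dict)

    def add_tag(self, user: User, tag: str) -> bool:
        """Add a tag; returns False if the user may not tag or the tag already exists."""
        # Assumption: moderators may add tags just like trusted users.
        if not (user.is_trusted or user.is_moderator):
            return False
        if tag in self.tags:
            return False  # the same tag cannot be added twice
        self.tags[tag] = user.name
        return True

    def remove_tag(self, user: User, tag: str) -> bool:
        """Remove a tag; only its author or a moderator may do so."""
        if tag not in self.tags:
            return False
        if user.is_moderator or self.tags[tag] == user.name:
            del self.tags[tag]
            return True
        return False


alice = User("alice", is_trusted=True)
bob = User("bob", is_trusted=True)
mod = User("mod", is_moderator=True)

s = TaggedSentence("A rolling stone gathers no moss.")
print(s.add_tag(alice, "proverb"))     # prints True
print(s.add_tag(bob, "proverb"))       # prints False (already tagged)
print(s.remove_tag(bob, "proverb"))    # prints False (bob did not add it)
print(s.remove_tag(mod, "proverb"))    # prints True (moderators can remove any tag)
```

Storing who added each tag alongside the tag itself is what makes the "only the person who added it can remove it" rule easy to check.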

"to delete" tag

Tags like "to delete" and "to correct" will help moderators in their work. At the moment, in Tatoeba, only moderators can delete sentences. The traditional way of requesting a deletion was to add a comment to the sentence, pointing out that it should be deleted and explaining why. But the flow of comments has increased a lot, and it has become harder for moderators to keep track.

So if you come upon a sentence that you feel should be deleted, tag it with "to delete" so that moderators can easily find it and clean Tatoeba of entries that are not valid. Anything that is gibberish is not valid. Anything that is not a complete sentence is not valid. But then again, we haven't decided what exactly counts as a "sentence", so this is debatable.

"to correct" tag

In Tatoeba, it is not possible to modify a sentence that doesn't "belong" to you. The sentences that belong to you are typically those you have added yourself. Almost no one besides you can touch them. If someone sees a mistake in your sentence, all they can do is post a comment, and you have to correct it.

But certain members contribute sentences with mistakes and never come back. And for now, no one can correct their mistakes... except moderators. So if you want to help moderators: whenever you come across a sentence that needs to be corrected and has a comment asking for a correction, but has still not been corrected after two weeks, you can tag it with "to correct".
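
The two-week rule can be written down as a simple check. This is just a Python sketch of the guideline: the function name and parameters are invented for the example, and treating any edit made after the request as "corrected" is a simplifying assumption.

```python
from datetime import datetime, timedelta
from typing import Optional

# Waiting period before a sentence may be tagged "to correct".
GRACE_PERIOD = timedelta(weeks=2)


def may_tag_to_correct(correction_requested_at: Optional[datetime],
                       last_edited_at: Optional[datetime],
                       now: datetime) -> bool:
    """Return True if a correction was requested at least two weeks ago and
    the sentence has not been edited since (hypothetical helper)."""
    if correction_requested_at is None:
        return False  # nobody has asked for a correction yet
    if last_edited_at is not None and last_edited_at > correction_requested_at:
        return False  # assumption: an edit after the request means it was fixed
    return now - correction_requested_at >= GRACE_PERIOD


now = datetime(2010, 6, 27)
print(may_tag_to_correct(datetime(2010, 6, 1), None, now))                   # prints True
print(may_tag_to_correct(datetime(2010, 6, 20), None, now))                  # prints False
print(may_tag_to_correct(datetime(2010, 6, 1), datetime(2010, 6, 5), now))   # prints False
```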

"checked" tag

Before I explain further, I must stress that this tag is experimental. Many times people have asked for a way to tell whether a sentence can be trusted or not. Okay, so now we can tag a sentence as "checked" to indicate that it has been proofread and validated as a correct sentence.

Of course, this raises some problems...
  • What if a user tags a sentence as "checked" just for the fun of it?
  • What if a user tags a sentence as "checked" but was tired and overlooked a mistake?
Well, we can't guarantee 100% accuracy. A sentence that is tagged "checked" will simply be more reliable than one that isn't, but it won't be 100% reliable (no one can guarantee that anyway).

What's next
  • We will make tags available to everyone.
  • We will add a page that lists all tags, to enable people to easily browse by tags.
  • We will provide a way to merge tags.
  • And many other things, but I will talk about them when the time comes.
In the meantime, enjoy :)