[Mageia-i18n] www pages: about and main index

Fri May 25 23:02:57 CEST 2012

On Fri, 25 May 2012 22:49:37 +0300, Romain d'Alverny
<rdalverny at gmail.com> wrote:

> On Fri, May 25, 2012 at 6:41 PM, Yuri Chornoivan <yurchor at ukr.net> wrote:
>> Just of curiosity, a couple of questions.
>>
>> 1. What was the reason to choose unsustainable format that is not
>> recognized by the existing toolchains (lang)?
>
> My past experience with gettext (4 years, I dropped it about 3 years
> ago, things may have changed since) has been terrible (mostly, because
> gettext is [was?] not thread-safe and the server implementation made
> that hugely problematic). So we had erratic server behaviour regarding
> locale management on our websites.

I am not talking about the server side at all. Why not extract all the strings
from these .lang files into a reliable format? It should be dead easy with
the current format (drop the first two lines and extract every line with ";"
at the beginning). It can be done even via PHP. Then generate the .lang files
for the locales using cron with minimal task priority.
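
The extraction step described above can be sketched as follows. This is only a minimal Python sketch under assumptions: that each source string sits on a line starting with ";", that the first two lines of a .lang file are a header to skip, and that a plain list of msgid entries is enough for a .pot template. The function names (extract_msgids, po_escape, to_pot) are hypothetical and not part of any existing Mageia tool.

```python
def extract_msgids(lang_text, header_lines=2):
    """Collect source strings (lines starting with ';') in file order.

    Assumption: the first `header_lines` lines are a header and are dropped.
    Duplicates are kept only once, at their first position.
    """
    msgids = []
    seen = set()
    for line in lang_text.splitlines()[header_lines:]:
        if line.startswith(";"):
            msgid = line[1:].strip()
            if msgid and msgid not in seen:
                seen.add(msgid)
                msgids.append(msgid)
    return msgids


def po_escape(text):
    """Escape backslashes and double quotes for a PO string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_pot(msgids):
    """Render a very small .pot template: a header entry, then one entry per string."""
    entries = ['msgid ""\nmsgstr ""\n"Content-Type: text/plain; charset=UTF-8\\n"\n']
    for msgid in msgids:
        entries.append('msgid "%s"\nmsgstr ""\n' % po_escape(msgid))
    return "\n".join(entries)


if __name__ == "__main__":
    # Hypothetical sample content, not copied from the real SVN files.
    sample = (
        "# header line 1\n"
        "# header line 2\n"
        ";Mageia 2\n"
        "Mageia 2\n"
        "\n"
        ';About "us"\n'
        "Про нас\n"
    )
    print(to_pot(extract_msgids(sample)))
```

A cron job could run something like this over the English .lang files and write the result to a single .pot file in SVN.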

No mistakes in quotation, no formatting breaks, no extra manual work, no
additional load on the server for every single translation update, no
additional pages with lists, no wiki explanations: just tell the
translators the name of the directory in SVN and the single pot file
(generated by cron).

> I could have migrated to .po files nonetheless and use a library
> reimplementing gettext over it. But frankly, the past struggles with
> gettext (in a Web context - it does perfectly its job in a local app
> context) made me want to try something else, lighter (so less featured
> as well obviously). I know Mozilla used to have .lang files too, I
> checked it out a bit and thought it would be worth the try; and
> perhaps would we be able to reuse their own libs in this regard.
>
>> 2. Consider someone decided to change something on a webpage. How can we
>> (translators) know about the changes (Manually parse svn diffs, copy and
>> paste diffs from PHP page? What about fuzzy matches?) ?
>
> First, someone would be identified - I commit about 95% of what goes
> on www, 5 other percent are between obgr and dams. That's for the
> code, content, layout, etc. (I'm not happy with this either). I wrote
> quickly report and diff utilities so you can have a look (and we can
> rework them, just tell me how), but I expect to sneak peak at
> Mozilla's codebase.

Thanks. Great work, imho.

But with all respect, Mozilla (currently) has some language teams whose
membership exceeds the number of all Mageia translators. They can afford to
waste their resources, at least in the current state.

> So, upon modified or added contents on a webpage, that would happen in
> English only (that's the pivot language). That will need to extract
> the string from the source, inject it in the lang file, and the diff
> will appear on the report page then. No need for anyone to play with
> PHP code, you will only edit the .lang files.

That's what I see now:

украї́нська мо́ва (wrong, by the way. It should be "Українська". I wrote to
Oliver about this, but nobody cares :'( ):

3 / 95 100% (looks like a cipher, 3 files, 95 messages, I guess)

2 untranslated (heh? ";page_title" and ";Mageia 2". Penalty for  
copy-pasting diff to the .lang, I guess ;) )

Français (fr):

3 / 93 98%

7 missing 8 untranslated (15/98*100 + 98% = 113.3%)

looking further:
OK+1 (must be bonus points for 13.3% ;) )

> And there won't be fuzzy matches. The string will be blankly added, or  
> removed.

Very sad. It's not a big problem to translate even the whole of KDE or GNOME
with their docs and wikis (believe me, at least about KDE ;) ). The problem
is to keep them translated.

The Rosetta situation, where a whole string is discarded because of one comma
or one space, is somewhat unacceptable when there is no chance of having at
least a minimal translation memory.
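
To illustrate what a minimal translation memory could mean here, the sketch below looks up the closest previously translated English string for a changed one, so that the old translation can be offered as a fuzzy suggestion instead of being thrown away. It is an assumption-laden Python sketch: it uses difflib from the standard library, the function name suggest and the 0.8 cutoff are arbitrary choices, and the sample strings are invented.

```python
import difflib


def suggest(new_source, memory, cutoff=0.8):
    """Return (old_source, old_translation) for the closest known string, or None.

    `memory` maps previously translated English strings to their translations.
    `cutoff` is the minimum similarity ratio (0..1) for a match to be offered.
    """
    matches = difflib.get_close_matches(new_source, list(memory), n=1, cutoff=cutoff)
    if not matches:
        return None
    return matches[0], memory[matches[0]]


if __name__ == "__main__":
    # Hypothetical memory built from an older .lang file.
    memory = {"Download Mageia 2 now": "Завантажити Mageia 2 зараз"}
    # The English string gained one character; the old translation is still offered.
    print(suggest("Download Mageia 2 now!", memory))
    # An unrelated string yields no suggestion.
    print(suggest("Contact the board", memory))
```

A suggestion found this way would still need review by the translator, as a fuzzy entry does in gettext.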

>> P.S. Written when struggling to realize where to place the new strings
>> from http://www.mageia.org/langs/report.php
>> Thinking about left this without translation in the future.
>
> ? didn't understand. What was the issue?

There are no big issues now. But it will end badly for the following
reasons:

1. Copy-pasting from the report page can break (and will break) the sequence
of messages, making a total mess of outdated/new/existing translations.
2. No translation memory, hence no uniform translations and no time saved
when something has already been translated earlier.
3. Fragmentation with minimal scaling. What is the future? A directory with
dozens of files from different releases (2.uk.lang, 2-1.uk.lang,
3.uk.lang, 3beta.uk.lang, etc.)? Dozens of directories for different
releases? A table with 70 language columns and dozens of rows?

> Note that I'm all to improve and ease the translation process for you
> - but being alone, I do it with what my experience has been as well -
> that will explain as well why I chose a dead boring simple homemade
> framework for the Web site, and not an existing CMS - the situation
> wouldn't be the same with an active 3 or 4-seats Web team.
>
> And again, tell me if I'm missing something or if something can be
> improved for you.

Ok.

1. If, for lack of manpower, nothing can be done about the
above-mentioned issues, can we have an automatic RSS/Atom feed for the
report page to automate the update process, or automatic messages to
this list (with no manually sent warnings needed) whenever
English strings are changed or added?

2. If it is not hard to do, can the .lang files be regenerated
automatically (keeping the order of the current English pivot)?
Identifying untranslated strings may become harder, but at least it will
keep the logical order of the messages.
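
The regeneration asked for in point 2 might look roughly like the Python sketch below. It assumes a .lang entry is a ";source" line followed by one translation line, and that an untranslated entry is written by copying the English string. Both are assumptions about the format, and parse_lang and regenerate are hypothetical names.

```python
def parse_lang(text):
    """Map each source string to its translation.

    Assumption: an entry is a ';source' line followed by one translation line.
    """
    translations = {}
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith(";"):
            source = line[1:].strip()
            following = lines[i + 1].strip() if i + 1 < len(lines) else ""
            # A ';' line right after another one means the translation is missing.
            if following.startswith(";"):
                following = ""
            translations[source] = following
    return translations


def regenerate(pivot_sources, old_lang_text):
    """Rebuild a locale file in the order of the English pivot strings.

    Existing translations are reused; new strings are written with the English
    text as a placeholder; strings no longer in the pivot are dropped.
    """
    old = parse_lang(old_lang_text)
    entries = []
    for source in pivot_sources:
        translation = old.get(source) or source
        entries.append(";%s\n%s\n" % (source, translation))
    return "\n".join(entries)


if __name__ == "__main__":
    # Hypothetical old Ukrainian file and new English pivot order.
    old_uk = ";About us\nПро нас\n\n;Old string\nСтарий рядок\n"
    pivot = ["Mageia 2", "About us"]
    print(regenerate(pivot, old_uk))
```

With this approach the translator never pastes anything by hand, so the order of messages always follows the English file.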

Thanks for your answer and efforts to improve the current workflow.

Best regards,
Yuri