[lac-discuss-en] Translation Software Status

kent kent at icann.org
Thu Sep 22 07:29:19 UTC 2011


Hello all -- I tried to send this a couple of days ago, but apparently 
it didn't go through.

As you know, the Google interface we were using with some success a few 
months ago was discontinued.  Google released a commercial version not 
too long ago, and I have been working on incorporating that into the 
translation software.  This has been a bigger deal than I would have 
thought, for several reasons.

The previous version of the software used a general purpose translation 
interface that happened to support Google, but unfortunately that 
general purpose interface did not, last time I looked, work with the new 
commercial API.  So it was necessary to re-code the system to use the 
new API directly.  This was probably a good thing to do on general 
principles, but it wasn't as easy as one would like.  However, I now 
have this working.

A further problem emerged in my testing, though: the new API is a 
commercial product.  They carefully meter how much text you have 
translated, and they are stricter about the amount of text that you can 
translate in a single request.  In particular, the default limit for a 
request is now around 2000 bytes, and a significant part of that is 
consumed in the protocol boilerplate.  Moreover, the transmitted text 
must be encoded in a way that can significantly expand the number of 
bytes sent (for those who care: the interface is a REST interface, the 
text must be URI-escaped, and all the other parameters count toward the 
limit, which is 2K for the entire URL).
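
To illustrate the size problem, here is a rough sketch in Python.  It is 
not the translation software's code: the endpoint, the parameter names, 
and the key are placeholders made up for the example, and the limit is 
simply taken as 2000 characters for the whole URL.

```python
from urllib.parse import quote, urlencode

# Hypothetical values, for illustration only (not the real service).
BASE_URL = "https://translation.example/v2"  # placeholder endpoint
URL_LIMIT = 2000                             # approximate limit for the entire URL


def request_url(text, source="es", target="en", key="PLACEHOLDER-KEY"):
    """Build a GET URL with the text URI-escaped as a query parameter."""
    params = {"key": key, "source": source, "target": target, "q": text}
    # quote_via=quote percent-encodes spaces as %20 and non-ASCII as UTF-8 bytes.
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


def fits(text):
    """True if the complete URL, boilerplate included, stays within the limit."""
    return len(request_url(text)) <= URL_LIMIT


# One accented character becomes six characters once escaped.
escaped = quote("ñ")
overhead = len(request_url(""))  # cost of the URL before any text is added
```

Under these assumptions, 400 accented characters already exceed the 
limit on their own, before counting the rest of the URL.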

Previously we just returned an error when the text to be translated was 
too large, but this new limit is so small that it really is necessary to 
handle the error in a more graceful way.  This has now been 
implemented, and the software has been updated to include all of these 
changes.  It is still possible to get a "too large" error if the email is 
formatted in such a way that it appears there is a sentence longer than 
about a thousand characters, but the general restriction on size has 
been relaxed.
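
One way this kind of handling could work is sketched below in Python.  
This is an assumption about the general approach, not a description of 
the actual software: the sentence splitter is deliberately naive, and the 
names chunk_text, split_sentences, and TooLargeError are invented for 
the example.  The text is split into sentences, sentences are packed into 
pieces that fit the budget, and an error is raised only when a single 
sentence is too long by itself.

```python
import re

MAX_CHARS = 1000  # rough per-request text budget (assumed from "about a thousand")


class TooLargeError(ValueError):
    """Raised when a single sentence cannot fit in one request."""


def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' when followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def chunk_text(text, max_chars=MAX_CHARS):
    """Pack whole sentences into chunks of at most max_chars characters."""
    chunks, current = [], ""
    for sentence in split_sentences(text):
        if len(sentence) > max_chars:
            raise TooLargeError(
                "sentence of %d characters exceeds the limit" % len(sentence)
            )
        candidate = sentence if not current else current + " " + sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be sent as its own request and the translated 
pieces joined back together in order.  An email with no sentence 
punctuation at all looks like one very long sentence to a splitter like 
this, which is the case where the error would still appear.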

Please bear in mind that there have been significant internal changes, 
so there may well be issues.  Please contact me (kent at icann.org) with 
any problems you may find.

Best Regards
Kent Crispin


