
Mozilla releases a machine-translation plugin

Mozilla has announced the release of a translation plugin for Firefox as part of the Project Bergamot initiative.

The ultimate goal of this consortium was to build a set of neural machine translation tools that would enable Mozilla to develop a website translation add-on that operates locally, i.e. the engines, language models and in-page translation algorithms would need to reside and be executed entirely in the user’s computer, so none of the data would be sent to the cloud, making it entirely private.



Mozilla releases a machine-translation plugin

Posted Jun 3, 2022 7:31 UTC (Fri) by cdamian (subscriber, #1271) [Link] (19 responses)

I hope this will end up on mobile too.

Mozilla releases a machine-translation plugin

Posted Jun 3, 2022 7:41 UTC (Fri) by pabs (subscriber, #43278) [Link] (18 responses)

I hope this will end up in Linux i18n frameworks (gettext), desktops and mail clients too. It would be useful when manual translations aren't available, such as for languages that are not fully translated, or when you get legit or spam mails in a language you don't understand.

Mozilla releases a machine-translation plugin

Posted Jun 3, 2022 18:08 UTC (Fri) by alexander.batischev (guest, #122369) [Link] (16 responses)

> It would be useful when manual translations aren't available

It might be useful for long-form texts like manpages, but for short strings like gettext message catalogues it'd be worse than nothing. Even human translators manage to miss some of the context when translating (I did that myself more than once!) Machines stand very little chance there, at least until they can *use* the software to figure out the meaning of messages.

Mozilla releases a machine-translation plugin

Posted Jun 4, 2022 2:49 UTC (Sat) by pabs (subscriber, #43278) [Link] (15 responses)

There are a lot of people who speak no English. I'd wager they would prefer machine translation over zero translation, and we shouldn't exclude them from using Free Software because we haven't attracted translators for their language.

I know that is what I feel when I visit a country where English isn't common; non-English OCR and TTS plus MT of interfaces and OCR/TTS results to English would be excellent.

Mozilla releases a machine-translation plugin

Posted Jun 4, 2022 10:30 UTC (Sat) by Wol (subscriber, #4433) [Link] (11 responses)

> There are a lot of people who speak no English. I'd wager they would prefer machine translation over zero translation, and we shouldn't exclude them from using Free Software because we haven't attracted translators for their language.

Time flies like an arrow, fruit flies like a banana.

Without context, that statement is just NOT TRANSLATABLE. Machine translation is worse than zero translation - it's impossible even to tell the difference between a noun, a verb and an adjective! Quick - is "flies" a noun or verb?

And English is probably one of the worst languages to translate, given its complex conjugation and massive vocabulary. But other languages have got their quirks, too.

If you limit machine translation to areas it works well (mostly technical, I guess), then great, but once you start using it to translate prose, or even worse poetry, it's going to have a very hard time of it.

Cheers,
Wol

Mozilla releases a machine-translation plugin

Posted Jun 4, 2022 14:38 UTC (Sat) by mathstuf (subscriber, #69389) [Link] (6 responses)

FWIW, I'd consider handling such ambiguities to be a requirement for translation anyways. Sure, it's not translatable by anything that does "word -> word" translation, but I do think the bar is far higher than that these days.

Of course, you could also be in a strange sci-fi universe where bananas fly and arrows are food for a certain kind of fly (feels kind of Douglas Adams-y to me given the "flooping" of certain mattresses and such). *That* kind of context definitely needs more than just a sentence.

Mozilla releases a machine-translation plugin

Posted Jun 4, 2022 15:10 UTC (Sat) by mpr22 (subscriber, #60784) [Link] (5 responses)

The funniest part is that, "time flies like an arrow" makes "fruit flies like a banana" easier to translate, because it primes the mind to think of "fly" as a verb.

Whereas without a wider context, "fruit flies like a banana" is ambiguous.

(Bananas, like pigs, fly just fine if you throw them hard enough.)

Mozilla releases a machine-translation plugin

Posted Jun 4, 2022 17:56 UTC (Sat) by hvd (guest, #128680) [Link] (4 responses)

The idea of that sentence is that to parse it correctly, the first "flies" should be parsed as a verb, the second as a noun. It's not meant to be ambiguous, it's meant to be hard to parse. Fruit does not fly as bananas do. That's grammatically correct but makes no sense, fruit does not fly. The verb in the second sentence is "like", as fruit flies are animals that like bananas.

Mozilla releases a machine-translation plugin

Posted Jun 5, 2022 20:58 UTC (Sun) by JoeBuck (guest, #2330) [Link] (3 responses)

Google Translate also has trouble with this sentence:

English to French:

le temps passe comme une flèche, mais les fruits volent comme une banane.

Translating this back to English gives

time flies like an arrow, but fruits fly like a banana.

Mozilla releases a machine-translation plugin

Posted Jun 5, 2022 21:12 UTC (Sun) by Wol (subscriber, #4433) [Link] (2 responses)

Which throws up another quirk of English - many words (fruit included) either have weird plurals or are number-indefinite. A similar example is sheep.

I'm guessing (like with die/dice, thou/you), the singular has simply fallen into disuse, although I have no clue what the singular might have been for fruit/sheep if that guess is correct.

Cheers,
Wol

Mozilla releases a machine-translation plugin

Posted Jun 6, 2022 16:20 UTC (Mon) by rgmoore (✭ supporter ✭, #75) [Link] (1 responses)

I think nouns like fruit and sheep were originally uncountable, like water. That means you'd talk about a quantity of them rather than a number, so there wouldn't really be a singular or plural.

Mozilla releases a machine-translation plugin

Posted Jun 17, 2022 9:59 UTC (Fri) by nix (subscriber, #2304) [Link]

> I think nouns like fruit and sheep were originally uncountable, like water.

Something like that: for sheep at least they were similar in some cases at one time, but that was because of loss of a trailing vowel which *did* indicate a plural, presumably because you could usually figure out the number from contextual clues anyway. The OED says:

> The prehistoric plural *skǣpu normally lost its final vowel in Old English, so that nominative and accusative singular and plural became identical.

Mozilla releases a machine-translation plugin

Posted Jun 4, 2022 15:22 UTC (Sat) by rsidd (subscriber, #2582) [Link] (1 responses)

This is because "fruit flies like a banana" is unidiomatic even in English. You would say "fruit flies like bananas" except in this context of tripping someone up.

That said, both the Mozilla and the Google translators translate "fruit flies like bananas" as "moscerini della frutta come banane" (fruit flies such as bananas). Google translates "gorillas like bananas" correctly though (ai gorilla piacciono le banane). Odd.

My point is, Google and, as far as I have seen, the Mozilla translator handle individual sentences just fine, so it would be fantastic to use them for i18n where possible. Where there are errors, native readers can figure it out, and not go through life thinking that all kinds of fruit travel through the air in the manner of a banana.

Mozilla releases a machine-translation plugin

Posted Jun 5, 2022 17:01 UTC (Sun) by Wol (subscriber, #4433) [Link]

> This is because "fruit flies like a banana" is unidiomatic even in English. You would say "fruit flies like bananas" except in this context of tripping someone up.

Unidiomatic? In American, maybe. I don't actually use that sort of language much, it feels perfectly normal to me ...

Cheers,
Wol

Mozilla releases a machine-translation plugin

Posted Jun 4, 2022 15:25 UTC (Sat) by david.a.wheeler (subscriber, #72896) [Link]

> If you limit machine translation to areas it works well (mostly technical, I guess), then great, but once you start using it to translate prose, or even worse poetry, it's going to have a very hard time of it.

In UI frameworks, the text tends to be technical and thus easier to handle. Modern machine language translators are now doing a better job at prose, too. They are obviously not as good as a human, but they are much better than being completely unable to access the information. There's an argument that poetry isn't fully translatable, even by humans fluent in both languages... I don't see why that limitation should mean we can't use the technology in other ways.

I'd rather have half a loaf than starve.

Mozilla releases a machine-translation plugin

Posted Aug 1, 2022 12:30 UTC (Mon) by immibis (subscriber, #105511) [Link]

Reading "clock insects prefer darts, vegetable insects prefer bananas" is still much preferable to "%$^(%&@(#$*@#)$&^$%*&$^#" which the text may as well be if you don't understand the language. You can now skim the text and delve deeper into only the parts that don't make sense, instead of painstakingly looking up every single word in a dictionary.

Mozilla releases a machine-translation plugin

Posted Jun 9, 2022 13:38 UTC (Thu) by tbelaire (subscriber, #141140) [Link] (2 responses)

I was just translating a karaoke program recently, and "queue" being used as a verb (to add to the queue) as well as the noun (title of the page displaying the queue) was impossible to even translate with `gettext`. I ended up having to ask upstream to use "enqueue" for the verb form to get it translated.

I think getting the verb / noun versions wrong (in some languages they do *not* overlap) is more confusing than leaving it in English, at least for the target audience I was translating for (my partner's parents know a little English, but strongly prefer Chinese).

Mozilla releases a machine-translation plugin

Posted Jun 9, 2022 16:13 UTC (Thu) by nye (subscriber, #51576) [Link] (1 responses)

Excuse my gettext ignorance, but why can't contexts be used to solve this? I thought the point of them is to allow the same text to be translated in different ways according to the context. Surely gettext can't get this that badly wrong?

This is such a common standard issue in translation that any translation tool that can't handle it is barely even a toy IMO.

Mozilla releases a machine-translation plugin

Posted Jun 9, 2022 17:40 UTC (Thu) by tbelaire (subscriber, #141140) [Link]

Ok, well here it is.

https://github.com/vicwomg/pikaraoke/blob/master/template...

I have it hooked up with pybabel and flask_babel, and jinja2.ext.i18n for inflating the templates. I see the context correctly when editing the translation files, but I don't think the context is used for a lookup? That would be pgettext right?

https://docs.python.org/3/library/gettext.html#gettext.pg...

Oh, I see that the jinja2 extension mentions pgettext

https://jinja.palletsprojects.com/en/3.1.x/extensions/#i1...

But I'm not sure how to wire it up to the {% trans %} blocks, and it was easier to just upstream it. I was doing this as a hobby and I'm not a pro in this area.
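For what it's worth, the lookup side of the noun/verb "queue" problem can be sketched with nothing but Python's standard `gettext` module. This is a minimal sketch, not pikaraoke's or Flask-Babel's actual setup: the `build_mo` helper, the context names "noun" and "verb", and the Chinese strings are all illustrative assumptions. It relies on the fact that a message context is stored in the compiled catalog as `context + "\x04" + msgid`, which is what `pgettext` looks up (in a .po file the same thing is written as a `msgctxt` line above the `msgid`).

```python
import gettext
import io
import struct


def build_mo(entries):
    """Hypothetical helper: serialize {msgid: msgstr} into GNU .mo bytes."""
    items = sorted((k.encode("utf-8"), v.encode("utf-8")) for k, v in entries.items())
    count = len(items)
    ids_table = 7 * 4                   # the header is seven 32-bit integers
    strs_table = ids_table + count * 8  # each table row is (length, offset)
    offset = strs_table + count * 8     # string data starts after both tables

    id_rows, str_rows, blob = [], [], b""
    for msgid, _ in items:
        id_rows.append((len(msgid), offset + len(blob)))
        blob += msgid + b"\0"
    for _, msgstr in items:
        str_rows.append((len(msgstr), offset + len(blob)))
        blob += msgstr + b"\0"

    # Magic number, format version 0, string count, table offsets, no hash table.
    header = struct.pack("<7I", 0x950412DE, 0, count, ids_table, strs_table, 0, 0)
    tables = b"".join(struct.pack("<2I", *row) for row in id_rows + str_rows)
    return header + tables + blob


# Illustrative catalog: the same English msgid under two different contexts.
catalog = build_mo({
    "": "Content-Type: text/plain; charset=UTF-8\n",  # header entry sets the charset
    "noun\x04Queue": "队列",        # assumed wording for the page title
    "verb\x04Queue": "加入队列",    # assumed wording for "add to the queue"
})
translations = gettext.GNUTranslations(io.BytesIO(catalog))

print(translations.pgettext("noun", "Queue"))
print(translations.pgettext("verb", "Queue"))
print(translations.gettext("Queue"))  # no context-free entry, so the msgid comes back
```

The last call is the interesting one: a plain `gettext` lookup ignores contexts entirely, which would explain seeing the context in the translation files while the running application still fails to distinguish the two strings.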

Mozilla releases a machine-translation plugin

Posted Jun 4, 2022 18:55 UTC (Sat) by JanC_ (guest, #34940) [Link]

I don’t think it would be very useful to replace gettext translations right now, as (for now) most languages that have machine translations available are generally well-translated already, but it could be useful to assist the existing human translators (especially if it can learn from corrections somehow).

Mozilla releases a machine-translation plugin

Posted Jun 3, 2022 8:12 UTC (Fri) by rsidd (subscriber, #2582) [Link] (6 responses)

I gave this a try. Its language coverage is far behind Google, but on some supported languages, it seems about as good as Google Translate, and done on your computer not via the cloud. Impressive.

But these days Google Translate does a decent job with Indian languages which have an entirely different syntax, word ordering, etc (some are not even from the Indo-European family). Mozilla's is still restricted to European languages (and not all: it omits French!) I wonder if it will be feasible to do other languages on a desktop CPU. I feel the Indian government and other interested governments should fund further development, as the EU has so far.

Mozilla releases a machine-translation plugin

Posted Jun 3, 2022 17:58 UTC (Fri) by developer122 (guest, #152928) [Link]

I've found *far* better translation performance from deepl or even bing, so I wonder how it stacks up against those?

For the language pairs I care about (e.g. English<->Japanese) Google has really let their translation quality slip.

Mozilla releases a machine-translation plugin

Posted Jun 3, 2022 18:01 UTC (Fri) by nybble41 (subscriber, #55106) [Link] (2 responses)

> on some supported languages, it seems about as good as Google Translate, and done on your computer not via the cloud

Google Translate can work offline as well. The Android app, at least, supports downloading data packs for specific languages and can use them to translate text, audio, and images without an Internet connection.

Mozilla releases a machine-translation plugin

Posted Jun 5, 2022 4:17 UTC (Sun) by k8to (guest, #15413) [Link] (1 responses)

Certainly Google Translate's technology does not require a huge compute cloud for the user. But do you trust Google apps with your privacy? I certainly don't.

Mozilla releases a machine-translation plugin

Posted Jun 5, 2022 15:03 UTC (Sun) by rsidd (subscriber, #2582) [Link]

I had forgotten that this is possible on Android. But it is not sufficiently transparent. Indeed, once you are back online you don't know what is being sent upstream.

Competition is good. I hope the offline translation space improves.

Need Ukrainian language too.

Posted Jun 3, 2022 18:09 UTC (Fri) by Alterego (guest, #55989) [Link] (1 responses)

Incredible that the EU has funded this project and did not require it to have German and French!

But the most needed today is Ukrainian, for refugees and mainly kids in schools who often cannot pay the bill for the unlimited 4G connection they need.

Need Ukrainian language too.

Posted Jun 4, 2022 8:03 UTC (Sat) by tlamp (subscriber, #108540) [Link]

It has German though?! But there are other European languages missing too, so I'm not sure why German and French should be singled out. I'm sure this will be further extended in the future; IMO there's no point in waiting for every possible language combination to be supported before doing a first public release.
Note also that EU Horizon funds often do not cover the full project, but "only" part of the total funds.

I, for one, am extremely happy to get solutions that are independent of companies with a proprietary focus, especially those residing in countries whose laws require them to turn over all data of foreign (and sometimes national) users, and am thankful to the creators and those funding it.

Mozilla releases a machine-translation plugin

Posted Jun 3, 2022 11:49 UTC (Fri) by MattBBaker (guest, #28651) [Link]

This especially has potential if they can find a way to leverage the browser engine to suss out context for the translation. That's always seemed to be a major stumbling block for machine translations.

Mozilla releases a machine-translation plugin

Posted Jun 5, 2022 10:00 UTC (Sun) by zoobab (guest, #9945) [Link] (6 responses)

Why not start with a CLI client?

Mozilla releases a machine-translation plugin

Posted Jun 5, 2022 14:43 UTC (Sun) by atnot (guest, #124910) [Link] (5 responses)

At this point with modern tooling and wasm it isn't really significantly more work to create a simple web app than a CLI, and will be significantly easier for most people to quickly try out.

Mozilla releases a machine-translation plugin

Posted Jun 7, 2022 4:54 UTC (Tue) by JanC_ (guest, #34940) [Link] (2 responses)

But ideally we get a library/service that can be called from/used by any language/application, as browsers are not the only applications people use that could benefit from machine translation…

Mozilla releases a machine-translation plugin

Posted Jun 7, 2022 14:07 UTC (Tue) by atnot (guest, #124910) [Link] (1 responses)

You can just use it as a C++ library natively, they have instructions in the repo: https://github.com/mozilla/bergamot-translator#using-nati...

Mozilla releases a machine-translation plugin

Posted Jun 23, 2022 2:45 UTC (Thu) by JanC_ (guest, #34940) [Link]

Great!

And sorry for not checking this out further myself first, but all the PR was rather empty on things like that, and your answer to zoobab seemed to indicate that Mozilla only cared about the in-browser use mostly…

So, good to see we were wrong about that.

Mozilla releases a machine-translation plugin

Posted Jun 7, 2022 12:19 UTC (Tue) by ballombe (subscriber, #9523) [Link] (1 responses)

It works both ways:
wasm makes it easy to convert a CLI tool to a simple web app, but not the other way round,
so why not do both?

Mozilla releases a machine-translation plugin

Posted Jun 7, 2022 14:12 UTC (Tue) by atnot (guest, #124910) [Link]

I'm not really sure what you mean, wasm is just a compiler target like any other. If you want to use this library natively, you can just compile it for your preferred architecture. You still need to wrap it with a UI of course, be that a CLI or some JS for a website.

It looks like the repo already contains a basic CLI for testing purposes either way: https://github.com/mozilla/bergamot-translator/blob/main/...

Mozilla releases a machine-translation plugin

Posted Jun 5, 2022 17:15 UTC (Sun) by flussence (guest, #85566) [Link] (5 responses)

The UX needs work - it pops up randomly offering to translate pages from a language they clearly aren't in and there's no way to turn that off, occasionally the popup does something that pushes *the entire browser chrome* off the right edge of the window, and there's no way to translate only part of a page.

But the few opportunities I've managed to use it properly, it actually works. Amazing.

Mozilla releases a machine-translation plugin

Posted Jun 7, 2022 5:04 UTC (Tue) by JanC_ (guest, #34940) [Link] (4 responses)

There is also no way (that I could see) to disable it for languages that you do understand (and thus never need translations for), or even better only activate it with the translations button in the URL bar.

And some websites seem to have an allergic reaction to text being translated (I assume it triggers some JavaScript the wrong way?). Maybe there should be a simple “translations don’t work on this page” button or menu option or whatever?

Mozilla releases a machine-translation plugin

Posted Jun 7, 2022 18:04 UTC (Tue) by Wol (subscriber, #4433) [Link] (3 responses)

I had similar fun with automatic redirection - browser address bars are a bloody nightmare nowadays - give them an address and they want to search for it!

Anyways, I typed in "amazon.de", and found myself on amazon.co.uk. FFS, if I give you a *real* web address, in the address bar, JUST TAKE ME THERE!!!

Cheers,
Wol

Mozilla releases a machine-translation plugin

Posted Jun 18, 2022 7:30 UTC (Sat) by dr@jones.dk (subscriber, #7907) [Link]

A hostname is not a web address: It lacks protocol.
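The distinction is easy to see with a URL parser. As a small sketch using Python's standard `urllib.parse` (the `describe` helper is only illustrative, and this says nothing about what any browser's address bar actually does with the input), a bare hostname parses as a path with no scheme and no host, while the same name with a protocol parses as expected:

```python
from urllib.parse import urlsplit


def describe(text):
    """Illustrative helper: split the input the way a URL parser sees it."""
    parts = urlsplit(text)
    return {"scheme": parts.scheme, "host": parts.hostname, "path": parts.path}


bare = describe("amazon.de")           # no protocol given
full = describe("https://amazon.de/")  # protocol included

print(bare)  # scheme is empty and host is None; the text lands in "path"
print(full)  # scheme is "https" and host is "amazon.de"
```

Anything that accepts the bare form has to guess the missing protocol, which is where the address-bar heuristics come in.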

Mozilla releases a machine-translation plugin

Posted Jun 23, 2022 2:35 UTC (Thu) by JanC_ (guest, #34940) [Link] (1 responses)

That might have been a redirect from Amazon itself, of course.

FWIW: I can go to both of those, and get no redirect, but I’m not located in either Germany or the UK…

Also: in Firefox you can disable search in the URL bar, and use a separate search bar, if that’s what you prefer (It’s what I do in most of my Firefox profiles).

Mozilla releases a machine-translation plugin

Posted Jun 23, 2022 9:19 UTC (Thu) by Wol (subscriber, #4433) [Link]

> Also: in Firefox you can disable search in the URL bar, and use a separate search bar, if that’s what you prefer (It’s what I do in most of my Firefox profiles).

How do you do that? Can you get back the old functionality where it assumed the address bar was an address bar and searched if it couldn't find it?

I find the current setup where - if you start typing in the search bar it takes you to the address bar to search - somewhat ... well I'd like to be bloody rude about but can't think of any words to describe the idiocy ... - and then it assumes if you type an address into the address bar you want to search :-(

Cheers,
Wol


Copyright © 2022, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds