Damages i18n has done?

Posted Dec 9, 2024 17:10 UTC (Mon) by raven667 (subscriber, #5198)
In reply to: Damages i18n has done? by mbunkus
Parent article: Debian opens a can of username worms

Clearly all error messages should be standardized to Esperanto (j/k)

What might be useful in that case is to promote unique ASCII identifiers for error messages, which keep them searchable across languages and text edits, e.g. %FOO-PLORT-12345. This is a well-worn solution to this problem.

Damages i18n has done?

Posted Dec 10, 2024 11:14 UTC (Tue) by taladar (subscriber, #68407)

That only really works if the parameters in the error message (e.g. filenames, ports,...) do not matter to finding others with the problem though.

Damages i18n has done?

Posted Dec 10, 2024 11:46 UTC (Tue) by farnz (subscriber, #17727)

The well-worn solution has messages that look like %FOO-PLORT-12345:"filename","example.com","2001:db8:1::42/64"%. The idea is that you look up %FOO-PLORT-12345 in your catalogue of possible messages, and get told that it means "could not download {1} over HTTP from https://{2}/ (resolved IP {3})". You can then fill in the parameters (by hand, back in the day; a computer can do it now) and discover what the error meant.
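
The lookup-and-fill step can be sketched in a few lines of Python. This is only an illustration: the CATALOGUE dictionary and the expand() function are hypothetical names, and treating the parameter list as one row of quoted, comma-separated fields is an assumption based on the example message above, not a documented format.

```python
import csv
import io

# Hypothetical catalogue: message ID -> template with 1-based placeholders.
CATALOGUE = {
    "FOO-PLORT-12345": "could not download {1} over HTTP from https://{2}/ (resolved IP {3})",
}

def expand(raw: str) -> str:
    """Turn '%ID:"a","b"%' into the human-readable text from the catalogue."""
    body = raw.strip().strip("%")
    # Split at the first colon only; parameters may contain colons themselves.
    msg_id, _, params = body.partition(":")
    template = CATALOGUE.get(msg_id)
    if template is None:
        return raw  # unknown ID: hand the raw message back unchanged
    # Assumption: the parameters form a single CSV row of quoted fields.
    args = next(csv.reader(io.StringIO(params)), [])
    # The leading None occupies slot {0}, so {1} is the first parameter.
    return template.format(None, *args)

print(expand('%FOO-PLORT-12345:"filename","example.com","2001:db8:1::42/64"%'))
```

The raw message stays searchable by its ID in any locale, while the catalogue alone decides what text the reader sees.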

Damages i18n has done?

Posted Dec 11, 2024 9:45 UTC (Wed) by taladar (subscriber, #68407)

While that does seem like a good solution to the issue I can't say I have ever encountered a program using that in 20+ years of professional Linux administration.

Damages i18n has done?

Posted Dec 11, 2024 10:50 UTC (Wed) by farnz (subscriber, #17727)

The UNIX world never went this way; I encountered it interacting with mainframes and minicomputers, back 30-odd years ago.

Damages i18n has done?

Posted Dec 11, 2024 12:13 UTC (Wed) by Wol (subscriber, #4433)

That sounds exactly like what I was thinking of. Of course, using a database, there was a MESSAGES file, keyed on message id and language, which the error function searched; it then printed the appropriate error message for the locale, prefixed by the message id to make it easy to search for or report a problem. If you're dealing with a support team who speak a different language, the message id makes much more sense than the error message.
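
As a rough sketch of that scheme in Python (the MESSAGES dictionary, the message id DB-OPEN-0042, and format_error() are all made up for illustration; the real thing was a database file, not a dictionary):

```python
# Hypothetical stand-in for the MESSAGES file, keyed on (message id, language).
MESSAGES = {
    ("DB-OPEN-0042", "en"): "could not open file {name}",
    ("DB-OPEN-0042", "de"): "Datei {name} konnte nicht geöffnet werden",
}

def format_error(msg_id, lang, fallback="en", **params):
    """Return the localized text for msg_id, prefixed by the id itself."""
    # Try the requested language first, then the fallback language.
    text = MESSAGES.get((msg_id, lang)) or MESSAGES.get((msg_id, fallback))
    if text is None:
        return msg_id  # no text in any language: the bare id is still reportable
    return f"{msg_id}: {text.format(**params)}"

print(format_error("DB-OPEN-0042", "de", name="kunden"))
print(format_error("DB-OPEN-0042", "fr", name="kunden"))  # falls back to English
```

Whatever language the text comes out in, the prefix is identical, which is the part you quote to the support team.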

Cheers,
Wol

Damages i18n has done?

Posted Dec 11, 2024 18:42 UTC (Wed) by raven667 (subscriber, #5198)

I've seen this convention in various vendor software; e.g. Cisco IOS and its variants use a similar kind of system:

https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/16_xe/s...
https://www.cisco.com/c/en/us/td/docs/ios/12_2/sem2/syste...

I also found an example from an IBM system in this style, where every log message is numbered:

https://publibz.boulder.ibm.com/epubs/pdf/ispzmc90.pdf

Grabbing one at random, ISRB0001 is searchable and leads to further docs at https://www.ibm.com/docs/en/zos/2.4.0?topic=codes-ispf-me... and it would remain a searchable tag even if the text of the message were localized or changed between versions.

Damages i18n has done?

Posted Dec 10, 2024 17:23 UTC (Tue) by mbunkus (subscriber, #87248)

It's not just about error messages, though. If you don't speak any of the languages a tool is available in (including its help output, man page etc.), then that tool is most likely completely unusable for you.

It's kind of hard to know how many people across the world do not speak English. Several statistics out there say that up to 1.45 billion people speak English[1], but there are 8.2 billion people across the globe (or something like that), so the large majority do not, for whatever reason: lack of education (or even educational possibilities), too young, too old, learning disabilities, socio-economic pressure & limitations etc. etc. "Just learn English" is not going to cut it just yet, maybe never.

For example, I started using computers when I was eight, I think. I was able to learn to program because the manual my computer came with was in German, even though the software itself was in English. I could not speak English at that point, but having documentation in my native language enabled me to at least associate several English words (PRINT, IF…) & short phrases (SYNTAX ERROR IN…) with their German counterparts, but only because I had the German stuff to learn from. If that hadn't been available, I might only have started doing stuff with computers years later, if ever at the scale I'm doing it now. Having stuff available in your own language enables you to learn, to use, to create. Saying things like "everyone needs to learn English in our field" and "i18n has cost businesses a lot" is really thinking from inside a certain bubble, and it's really excluding & limiting.

All I'm asking for here is to be more open to make software, especially Open Source software, available and usable to all, not just the English-speaking system admin clique.

[1] https://www.statista.com/statistics/266808/the-most-spoke...

Damages i18n has done?

Posted Dec 11, 2024 9:57 UTC (Wed) by taladar (subscriber, #68407)

Please note that I was specifically talking about error messages and auto-switching of output to another language on relatively low level interfaces, the kind most likely used directly only by relatively skilled computer users.

I am absolutely in favor of translating interfaces used by laymen (but only those parts they want to skip over anyway, like error messages) and documentation.

I have the opposite experience to yours, though. When I was younger, in the 1990s, a lot of computer books were translated by clueless translators, so every publishing house had a different German version of the standardized English IT terminology. Some of the coding examples in programming books were broken because the translators didn't understand how to translate e.g. a regex replacing part of a string.

Similarly, even in entertainment media, once I learned English I noticed how many of the German dubs contain English idioms that do not exist in German and were just translated word for word (presumably to make the lip-sync work).

I am also not talking about the cost to business here. I am talking about the cost i18n imposes on the communication itself by making it worse, not the financial cost.

Damages i18n has done?

Posted Dec 11, 2024 12:20 UTC (Wed) by Wol (subscriber, #4433)

> Similarly, even in entertainment media, once I learned English I noticed how many of the German dubs contain English idioms that do not exist in German and were just translated word for word (presumably to make the lip-sync work).

This! As someone whose German is passable, and whose French has mostly been forgotten (plus ancient smatterings of Russian and Khmer), I find that so much information is passed *by reference* in conversation that, if you're not a native speaker, it's extremely easy to miss what is actually being said. Or (as has happened to me) the "meaning as written" can be very different to the "meaning as understood", so you end up saying something completely different from what you thought you had said!

Cheers,
Wol

Damages i18n has done?

Posted Dec 11, 2024 16:50 UTC (Wed) by mbunkus (subscriber, #87248)

> Please note that I was specifically talking about error messages and auto-switching of output to another language on relatively low level interfaces

You're trying to enforce permanence on human language here. Error messages may change for a number of reasons, including being unclear or even plain wrong, having to be extended to include additional information or examples showing the user how to fix the error or use the program correctly, or just stylistic changes. Even error messages written in English might contain non-ASCII characters if they include user-generated content, and that might not even be validly encoded (e.g. a file name). Note that all of those can happen with English as well.
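
To illustrate the file name case, here is a minimal Python sketch, assuming the tool expects UTF-8 and receives the name as raw bytes. printable() is a hypothetical helper name; the "backslashreplace" error handler for decoding is standard Python (3.5 and later).

```python
def printable(name: bytes) -> str:
    """Decode a file name for display without failing on invalid UTF-8."""
    # Undecodable bytes are rendered as visible \xNN escapes instead of
    # raising UnicodeDecodeError in the middle of an error message.
    return name.decode("utf-8", errors="backslashreplace")

print(printable("café.txt".encode("utf-8")))  # valid UTF-8 passes through
print(printable(b"caf\xe9.txt"))              # Latin-1 byte, invalid as UTF-8
```

Either way the message text is no longer pure ASCII or guaranteed stable, no matter which language the surrounding words are in.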

If you want "I don't want to have to change my things, ever", then you're in the well-trodden territory of e.g. REST APIs & similar. Argue for your low-level tools to implement best practices from those APIs, including:

- structured, versioned output
- a status indicator
- machine-parseable, stable error codes (that don't change) alongside human-readable error messages (that are subject to change & translation)
- one imposed language on all identifiers, most likely English (e.g. hash keys, status strings etc.)

That gets you everything you want while also allowing the tools to be translated, their messages changed in whatever way, to be easier to use by more people. This is something that I would very much like to see as well.
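
A minimal Python sketch of what such output could look like, under stated assumptions: the field names, the E_FILE_NOT_FOUND code, the TEXTS table, and error_report() are all invented for illustration and are not taken from any existing tool.

```python
import json

SCHEMA_VERSION = 1  # assumed versioning scheme for the output format

# Hypothetical translations; only the human-readable part is localized.
TEXTS = {
    "E_FILE_NOT_FOUND": {
        "en": "file not found: {path}",
        "de": "Datei nicht gefunden: {path}",
    },
}

def error_report(code, lang="en", **details):
    """Build a JSON error: stable English keys and code, translatable message."""
    texts = TEXTS[code]
    message = texts.get(lang, texts["en"]).format(**details)
    report = {
        "version": SCHEMA_VERSION,
        "status": "error",
        "error": {"code": code, "message": message, "details": details},
    }
    return json.dumps(report, ensure_ascii=False, sort_keys=True)

print(error_report("E_FILE_NOT_FOUND", lang="de", path="/etc/example.conf"))
```

A script matches on "status" and "code", which are the same in every locale, while the message can be reworded or translated freely without breaking anything.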

As for two examples, the "ip" tool & the "restic" backup command have JSON output in addition to the well-known, default human-readable one. It's easy to handle. Unfortunately in both cases error messages (and in the case of Restic certain verbose status messages) are still printed as human-readable messages instead of using JSON for it as well, falling short of what I'd like to see.

Damages i18n has done?

Posted Dec 16, 2024 10:28 UTC (Mon) by taladar (subscriber, #68407)

Oh, I would absolutely be for machine-readable output for all of those situations.

Unfortunately, as long as some of the output is a free-form value rather than fully pre-specified (like an enum), you would soon get the feature request to translate those parts of the output too, because someone wants to build some sort of user-facing UI on top of the machine-readable output.

My argument is more that certain messages should not be translated because translations are literally hurting communication when compared to the use of a single language.

Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds