Gentoo bans AI-created contributions

Posted Apr 20, 2024 12:11 UTC (Sat) by donald.buczek (subscriber, #112892)
In reply to: Gentoo bans AI-created contributions by gmgod
Parent article: Gentoo bans AI-created contributions

> Prompting an AI tool to do "say that the app was missing a feature about how to handle numbers in a commit message" is going to generate a novel's worth of text that the person with broken English won't be able to vet.

Like all tools, LLMs can be used wrongly and correctly. You gave an example of a bad prompt and usage. I agree, that many people use LLMs wrongly with bad outcome and I see some dangers for society.

However, LLMs can also be used correctly. For example, you can pipe a text you've wrote through a LLM and prompt it to optimize the text or make suggestions for improvement. I sometimes use gpt4-turbo for that. In my experience the model is able to make good suggestions, fix errors and optimize the text while preserving its meaning.

I've tried a few local models but, unfortunately, these seem not yet to be on the same level as gpt4-turbo.

True, you can't trust the output, you need to manually review it sentence by sentence to make sure, it still says what you want to say and only that. But errors and hallucinations are the exception not the rule. Often the optimized text is just better and you can take it as is.

It is a service to your readers if your text is free of errors, well structured, polite and to the point. You can very much fine-tune what you get if you tell the model what you want and what recipients you address.

===

As an example, I wrote the above text without thinking too much about it. There are probably a bunch of typos and some bad style in it. I then asked gpt4-preview via the API to "Optimize the following text, which is a comment in lwn.net for clarity and conciseness."

Here's the output, which I intentionally left untouched for demonstration purposes.

===

LLMs, like any tool, can be misused or used effectively. Misuse can lead to negative outcomes and societal risks. However, when used correctly, such as refining or suggesting improvements to a text, LLMs like gpt4-turbo can enhance clarity, fix errors, and maintain the original meaning. While local models may not yet match gpt4-turbo's capabilities, it's crucial to review the LLM's output to ensure accuracy and avoid misinformation. Properly optimized texts benefit readers by being error-free, well-structured, and clear, especially when the model is guided by specific instructions and audience considerations.

===

If I were to use that model for my "real" reply, I would keep working on it iteratively. I'm not happy that it didn't mention that you provided a bad example. I'm not happy about "Misuse can lead to .. societal risk", which is not what I wanted to say. I didn't specify what societal risk I see or that misuse is a cause of it. In the end, I would probably just drop the vague side-note, which would be an improvement.

Not being a native English speaker, I'm not sure about the "While local models ..., it's crucial to ..." construct. Does it make a connection between the statements of the two clauses, which I intended to be totally independent? I would either research that question (learning something along the way) or replace the construct because of my uncertainty [1].

So, IMO, LLMs can very well be used to optimize text, for example commit messages. But you need to write the text yourself in the first place, then use the LLM, then carefully review the result. If you don't like it, amend the prompt, rinse and repeat.

[1] However, now that I've mentioned it, I think I don't need to research it to satisfy my curiosity, because I more or less expect a lecture from a certain member of this community :-)
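
The "carefully review the result" step can be partly mechanized. The following is only a minimal sketch, not anything used in this thread: it assumes a naive sentence splitter and uses Python's standard difflib to line up the sentences you wrote with the sentences that came back, so dropped or changed statements stand out. The function names are made up for illustration.

```python
import difflib
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def review_pairs(original: str, rewritten: str) -> list[tuple[str, str, str]]:
    """Align the two texts sentence by sentence.

    Returns (tag, original_part, rewritten_part) tuples, where tag is one of
    difflib's opcodes: 'equal', 'replace', 'delete' or 'insert'.
    """
    a = split_sentences(original)
    b = split_sentences(rewritten)
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]


if __name__ == "__main__":
    mine = "LLMs are tools. You gave a bad example. Review the output."
    theirs = "LLMs are tools. Review the output."
    for tag, old, new in review_pairs(mine, theirs):
        if tag != "equal":
            print(f"{tag}: {old!r} -> {new!r}")
```

A 'delete' entry is exactly the kind of thing complained about above: a statement that silently went missing. Anything tagged 'replace' or 'insert' still has to be read by a human to decide whether it says what was intended.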

Gentoo bans AI-created contributions

Posted Apr 20, 2024 13:36 UTC (Sat) by Wol (subscriber, #4433) [Link] (12 responses)

> [1] However, now that I've mentioned it, I think I don't need to research it to satisfy my curiosity, because I more or less expect a lecture from a certain member of this community :-)

Well, the LLM's output certainly feels "stilted", or off, somehow.

Cheers,
Wol

Gentoo bans AI-created contributions

Posted Apr 20, 2024 16:14 UTC (Sat) by donald.buczek (subscriber, #112892) [Link] (4 responses)

> Well, the LLM's output certainly feels "stilted", or off, somehow.

I'm a bit disappointed by your restrained reply, I hope I didn't offend you, which was surly not indented. That was supposed to be a harmless joke.

@gpt3-5 (with gpt4-turbo currently being unavailable to me for yet unknown reasons) : "The following text feels a bit stilted. Make it more colloquially without changing the meaning. Avoid passive constructs. Drop the reference to the risk for society. Drop the comparison of gpt4-turbo with local models.<ret><ret><text from above>"

===

LLMs are like any other tool - you can either use them right or mess things up. If you misuse them, you'll end up with a lot of problems. But if you use them the right way, like tweaking or suggesting improvements to a text, LLMs can make things clearer, fix mistakes, and keep the original meaning intact. It's super important to check the stuff LLMs churn out to make sure it's right and avoid spreading false info. When texts are optimized properly, it helps readers by being error-free, well-organized, and easy to understand, especially when the model follows specific instructions and considers the audience.
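
The prompt layout used above (an instruction, extra constraints, two returns, then the text) is easy to assemble in code. This is a minimal sketch in Python under that assumption only; the helper name is hypothetical, and sending the prompt to a model is deliberately left out because it depends on whichever client you use.

```python
def build_prompt(instruction: str, text: str, constraints: tuple[str, ...] = ()) -> str:
    """Join the instruction and constraints on one line, then add a blank line and the text."""
    parts = [instruction.strip(), *(c.strip() for c in constraints)]
    return " ".join(parts) + "\n\n" + text.strip()


if __name__ == "__main__":
    prompt = build_prompt(
        "The following text feels a bit stilted. Make it more colloquial without changing the meaning.",
        "LLMs, like any tool, can be misused or used effectively.",
        constraints=(
            "Avoid passive constructs.",
            "Drop the reference to the risk for society.",
        ),
    )
    print(prompt)
```

Keeping the constraints as separate items makes the "amend the prompt, rinse and repeat" loop a matter of adding or removing one entry.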

Gentoo bans AI-created contributions

Posted Apr 21, 2024 7:36 UTC (Sun) by Wol (subscriber, #4433) [Link] (3 responses)

> > Well, the LLM's output certainly feels "stilted", or off, somehow.

> I'm a bit disappointed by your restrained reply, I hope I didn't offend you, which was surly not indented. That was supposed to be a harmless joke.

Well, my reply was - while true - intended to be tongue-in-cheek. And of course, when you're worried about your own English, and the LLM gets it worse, of course that's funny :-)

And no, I'm not offended. Speaking four languages to varying ability (and a brit to boot!), I know how hard it can be to express yourself in another language. It's when native speakers can't be bothered to try and get it right that I get upset. After all, English is reputed to be one of the hardest languages to learn (and then you've got American, Strine, Pidgin, and all the others with their quirks on top :-)

Cheers,
Wol

Gentoo bans AI-created contributions

Posted Apr 21, 2024 12:43 UTC (Sun) by donald.buczek (subscriber, #112892) [Link] (2 responses)

> > > Well, the LLM's output certainly feels "stilted", or off, somehow.
> > I'm a bit disappointed by your restrained reply, I hope I didn't offend you, which was surly not indented. That was supposed to be a harmless joke.
> Well, my reply was - while true - intended to be tongue-in-cheek. And of course, when you're worried about your own English, and the LLM gets it worse, of course that's funny :-)

"Indented". OMG, how comes you only see your mistakes after you send/post/print something? I should have had it reviewed by an LLM or spouse; the former would have pointed the error out. And thank you for you not picking on it.

Btw: We diverted a bit into whether LLMs can help when you need to communicate in a language which you are not perfect in. In my opinion, they can. But it's not only with foreign languages. I occasionally ask an AI to review my German texts, too, and most of the time the reviews are helpful and the suggestions are good. I often accept them.

And I've replied to a comment which mentioned AI-tooling for text (commit messages). I don't want to open a whole new discussion with some of the arguments being the same, but I do think that using code-trained AIs can be very helpful in the programming domain, too.

Gentoo bans AI-created contributions

Posted Apr 21, 2024 15:35 UTC (Sun) by Wol (subscriber, #4433) [Link] (1 responses)

> "Indented". OMG, how comes you only see your mistakes after you send/post/print something?

> surly not indented

There's an even bigger typo in there - I'm really sure you didn't mean that !!! :-)

Cheers,
Wol

Gentoo bans AI-created contributions

Posted Apr 21, 2024 19:49 UTC (Sun) by donald.buczek (subscriber, #112892) [Link]

My error rate of 66% surely makes me surly.

Gentoo bans AI-created contributions

Posted Apr 20, 2024 16:56 UTC (Sat) by atnot (subscriber, #124910) [Link] (6 responses)

> Well, the LLM's output certainly feels "stilted", or off, somehow.

AI writing always makes me feel like those eerie pictures of thousands of faces averaged together. Yes, that may be a face, and one my brain recognizes as somewhat pleasant, but it's definitely not human.

It's somewhat worse for LLMs because one voice they have been specifically trained with is corporate American "as per my last email" customer service politeness, which leaves everything coming out of them with the characteristic charm and wit of a moist car insurance sales brochure.

Now, I'm sure there's people out there whose writing could be improved with the vacuous input of a thousand first year copywriting hires whose cover letter said they were "very enthusiastic about sewage lift pumps". But I really really don't want to live in a world where that is normalized or encouraged, or where it is necessary to suck the life out of everything that way. At least, more so than it already is. A writer's voice is one of the great joys of reading, it's why I subscribe e.g. to this website, and it kills me inside a little every time I see someone think so little of the art of writing as to sterilize their voice in this way of their own free will.

Gentoo bans AI-created contributions

Posted Apr 20, 2024 18:44 UTC (Sat) by NYKevin (subscriber, #129325) [Link] (2 responses)

The unfortunate reality is that much writing is already like this with or without LLMs. If you work for a large corporation, you both read and write* like this every day. The object of the game is not to produce good quality writing, it's to produce writing that communicates specific information or opinions, to as wide of an audience as possible, in a tone and style that is unlikely to confuse or offend anyone.

Of course, many people detest that game. But, on the other hand, money. Corporate jobs generally pay well and provide good benefits. If I have to write something bland and inoffensive every day, that's an entirely fair price of entry in my eyes.

* Freudian typo I actually made here: "writhe"

Gentoo bans AI-created contributions

Posted Apr 21, 2024 7:46 UTC (Sun) by Wol (subscriber, #4433) [Link] (1 responses)

> Of course, many people detest that game. But, on the other hand, money. Corporate jobs generally pay well and provide good benefits. If I have to write something bland and inoffensive every day, that's an entirely fair price of entry in my eyes.

That game is very good for corporates - it's very good at blame shifting.

And it's both encouraged - and damaging - in Science. It lends a "third person" air of authority which is often not deserved. Which is why the raid wiki style guide says "Write in the first person. Take personal responsibility for what you write". It's much pleasanter to read, as well :-)

Cheers,
Wol

Gentoo bans AI-created contributions

Posted Apr 21, 2024 15:39 UTC (Sun) by Wol (subscriber, #4433) [Link]

Oh - I'll add - Michael Faraday was a "first person" person. Which was one of the reasons he was a successful/famous scientist - he could communicate.

Cheers,
Wol

Gentoo bans AI-created contributions

Posted Apr 20, 2024 21:04 UTC (Sat) by donald.buczek (subscriber, #112892) [Link]

> customer service politeness

Yes, I understand. Nonetheless, in the domain of technical references and descriptions, I don't want to see an author's personality shine through, whether polite, rude or funny. Just condensed, right-to-the-point technical information. The text this subthread started with was a commit message.

But any other text, even one that should show emotion and personality, can be improved if unintentional errors are removed. LLMs are good with text; they can point to possible errors, from spelling through grammar to even more abstract levels. Spell checkers on steroids. You can very much choose to what degree and how exactly you use the tool. You don't need to let it rewrite your text into something without a soul.

---
Rephrase the following text into the style of a pirate from a fictional, funny, over the top pirate story. <above text>
---
Arrr, me hearties, listen well! When we be navigatin' the treacherous waters of technical manuscripts and parchments, I yearn not for a glimpse of the author's spirit, be it courteous, churlish, or full of jest. Give me straight, no-nonsense treasure maps of knowledge, I say, starting with the scrawl on the log that kicked off this here discussion.

But mark me words, any other bilge, even that which ought to be brimming with life and character, could stand to lose a few barnacles if it means cleaning up unintended blunders. Us sea dogs have our sea-lore chroniclers - LLMs, the mighty beasts of text, capable of sniffing out a whole array of missteps. From the cursed spellin' errors to the murky depths of grammar and beyond, they're like our spell-checkin' parrots on a hefty dose of grog. Ye be the captain of this vessel, choosing how to steer with this tool at your disposal. There's no need to set a course for blandness, devoid of the soul. Avast! Let your message fly true, without losin' its heart!

Gentoo bans AI-created contributions

Posted Apr 20, 2024 21:12 UTC (Sat) by kleptog (subscriber, #1183) [Link] (1 responses)

> It's somewhat worse for LLMs because one voice they have been specifically trained with is corporate American "as per my last email" customer service politeness,

So tell it to use a different voice. It can't read your mind you know. If you want it to talk in Southern US slang, it can do that.

There are services out there that will, for a fee, look up all public communications you've ever made and any private communications you provide, and then create a customised chatbot that will respond with your exact writing style. That's the thing that makes the transformer architecture so revolutionary: it's so easy to customise, anyone can do it.

People sometimes forget it's just a computer and therefore doesn't have all the subtle context cues that conversations with people have. So if you don't tell it to produce an output appropriate for a Linux kernel commit message, it's not going to figure that out itself. Learning how to configure an LLM to produce output suitable for the context is not very hard, and can be learned by anyone in an afternoon.

The idea you're going to be able to tell someone is using an LLM is on the same level as whether you can tell someone is using a spell checker. If done well, you're not going to notice at all.

Gentoo bans AI-created contributions

Posted Apr 21, 2024 2:26 UTC (Sun) by khim (subscriber, #9252) [Link]

> If done well, you're not going to notice at all.

And even if you do notice, that may not be a bad thing.

Have you ever tried to write to the support of Chinese companies in the era before LLMs?

Those answers were surely written by humans back then but oh, boy, if you think customer service politeness is something to complain about… you haven't seen what the majority of the population of this small globe called Earth produces in writing.

Gentoo bans AI-created contributions

Posted Apr 20, 2024 16:18 UTC (Sat) by rgmoore (✭ supporter ✭, #75) [Link] (29 responses)

> So, IMO, LLMs can very well be used to optimize text, for example commit messages. But you need to write the text yourself in the first place, then use the LLM, then carefully review the result. If you don't like it, amend the prompt, rinse and repeat.

And one might reasonably ask if this is worth the bother compared to revising the text yourself. If you can't trust the AI to do a good job, you'll probably spend as much time reviewing its work as you save by having it do that work for you. Maybe some future version will be better, but for now it's of questionable benefit.

Gentoo bans AI-created contributions

Posted Apr 20, 2024 16:43 UTC (Sat) by donald.buczek (subscriber, #112892) [Link]

> And one might reasonably ask if this is worth the bother compared to revising the text yourself.

To me it is. Its difficult to see our own errors or find alternative structures and expressions, once you've settled for something. That's the reason why we sometimes give important texts to someone else to proofread it or make suggestions for improvements from another perspective.

With LLMs you don't need to bother your spouse :-)

===
Point out any error or bad style in the following text.<ret><ret><above text>
===

In the text provided, there is a spelling mistake in the second sentence. It should be "It's difficult" instead of "Its difficult." Additionally, the phrase "once you've settled for something" could be rephrased for clarity and conciseness.

The sentence "That's the reason why we sometimes give important texts to someone else to proofread it or make suggestions for improvements from another perspective" is a run-on sentence and can be divided into two separate sentences for better readability.

The informal smiley face ":-)" used at the end could be considered unprofessional in some contexts.

Gentoo bans AI-created contributions

Posted Apr 21, 2024 2:36 UTC (Sun) by khim (subscriber, #9252) [Link] (27 responses)

> If you can't trust the AI to do a good job, you'll probably spend as much time reviewing its work as you save by having it do that work for you.

Where does this level of arrogance come from, and why didn't this argument work in the era before AI, when spellcheckers were introduced?

Sure, AI couldn't write text better than most native speakers.

But for every native English speaker there are three non-native speakers and twenty more who don't know English at all.

Which means that for 95% of Earth's population AI **already** does the job better than they can do it themselves.

I, for example, don't know where and how to use “a” and “the”. And would, probably, never learn because my native language doesn't even have a notion of article. As in: such part of language don't exist, I have no idea why anyone would want to use it, where and why.

If AI could just add these correctly, it would already be worth it for me to reread what it wrote. And it can do more.

Gentoo bans AI-created contributions

Posted Apr 21, 2024 7:43 UTC (Sun) by Wol (subscriber, #4433) [Link] (22 responses)

> I, for example, don't know where and how to use “a” and “the”. And would, probably, never learn because my native language doesn't even have a notion of article. As in: such part of language don't exist, I have no idea why anyone would want to use it, where and why.

And how many other languages have THREE different forms of the present tense? Which one do you use where? That's another classic "this guy is a foreigner" giveaway ...

(btw, khim, the difference between "a" and "the" is easy to explain and hard to apply. If you (could) have several, and don't care which, then it's "a". If you have several and *do* care which you're talking about, then it's "the". "I saw a/the cat in the street". "a" means there are a lot of cats around, and it's a casual comment you saw one of them. "the" means you were talking about a cat earlier, and it's that specific cat.)

Cheers,
Wol

Gentoo bans AI-created contributions

Posted Apr 21, 2024 8:20 UTC (Sun) by khim (subscriber, #9252) [Link] (20 responses)

> btw, khim, the difference between "a" and "the" is easy to explain and hard to apply

The problem is not that it's hard to understand, the problem is that to do the right choice I need to think about something that I normally don't care about at all.

It's similar to palatalization to English speaker: compare ugol' to ugol: these are certainly sounding not completely identically, but would you care about that difference enough to hear and reproduce that difference in casual speech? I'm yet to see any English speaker who can reliably do that. Simply because that's not something they are trained to perceive.

I know the difference between “a” and “the” and if life (or, more, likely, my work permit) would depend on that difference I can use them correctly… most of the time. Just like people (even non-native speakers) can spell words correctly… most of the time. But similarly to how spellchecker effortlessly catches cases where there are no ambiguity with spelling so AI does that for me with articles: instead of deciphering my own text and looking at it from angle that is just not natural for me I may spend my mental efforts on something else.

English have it easy, BTW. Try to ensure that you are using articles in German correctly some day.

> And how many other languages have THREE different forms of the present tense?

You would be surprised. Latin has six tenses, and because it was used in so many countries for so long, that craziness leaked out into them, too. Only it has mutated in the process, and different languages got similar yet different ideas about how time passes.

Gentoo bans AI-created contributions

Posted Apr 21, 2024 13:36 UTC (Sun) by Wol (subscriber, #4433) [Link] (19 responses)

> English have it easy, BTW. Try to ensure that you are using articles in German correctly some day.

German is my second language. I know :-)

> > And how many other languages have THREE different forms of the present tense?

> You would be surprised. Latin has six tenses, and because it was used in so many countries for so long, that craziness leaked out into them, too. Only it has mutated in the process, and different languages got similar yet different ideas about how time passes.

Well, I was taught we have the same 6 tenses. It makes perfect sense to me.

But another reply tells me we have FOUR present tenses - I'm not aware of the fourth ... that's for just ONE Latin present tense ...

(I program, I am programming, I do program. I'm not aware of any other European language with multiple present tenses - doesn't mean there aren't any, I've just never heard of any...)

Cheers,
Wol

Gentoo bans AI-created contributions

Posted Apr 21, 2024 16:55 UTC (Sun) by malmedal (subscriber, #56172) [Link] (7 responses)

> But another reply tells me we have FOUR present tenses - I'm not aware of the fourth ... that's for just ONE Latin present tense ...

They are:
Present simple: I work
Present continuous: I am working
Present perfect: I have worked
Present perfect continuous: I have been working

These are common, e.g. equivalent in Spanish:
Yo trabajo
Yo estoy trabajando
Yo he trabajado
Yo he estado trabajando

(apologies if I am messing up the conjugation)

German also has present simple and present perfect, it is missing the continuous forms.

Anyway, as someone who has studied multiple foreign languages, English is by *far* the easiest to deal with.

Gentoo bans AI-created contributions

Posted Apr 21, 2024 19:11 UTC (Sun) by NYKevin (subscriber, #129325) [Link] (2 responses)

This is complicated enough that linguists usually divide it up into three (or occasionally four) parts:

* Tense - Usually past, present, and future.
* Aspect - Simple ("perfective"), progressive (continuous), and perfect. Some languages have imperfect or other aspects, which English lacks.
* Mood or modality - Anything marked with a modal verb (the auxiliary verbs used for the progressive and perfect do not count as modal in this formalism). Sometimes subcategorized into indicative and subjunctive modalities (but we can further categorize into conditional, counterfactual, normative or deontic, etc.).
* Evidentiality - Not used in English, and for that matter not used in most/all Indo-European languages. Some languages mark verbs to indicate why the speaker believes the asserted fact to be true, for example distinguishing between something directly witnessed and something indirectly reported. Some authorities consider this an extension of modality.

One confusing thing about English is that English has no future tense in this formalism. The future is sometimes marked with the modal verb "will," so you could say that the future is a modality. But modalities are optional in a way that tenses are not, and so we can have sentences like "Tomorrow, I'm buying a new laptop," which is semantically happening in the future but has no grammatical marker indicating as much (and if you chop off the "tomorrow" prefix, it's a perfectly good present continuous sentence without a whiff of the future).

On the other hand, there are languages (such as Mandarin Chinese) that have only one tense in this formalism. Those languages either treat all time information as modal, or do not have grammaticalized time markers at all. Of course, speakers of those languages are perfectly capable of distinguishing between past, present, and future. Every language can do that. But in these languages, time information is truly optional. You can say "I buy a new laptop," as a discrete event (rather than the habitual or indefinite sense that English simple present would normally imply), without specifying when that happens.

Another "fun" property of English is do-support: There are some constructions in English which grammatically require a modal verb (e.g. turning a declarative sentence into a yes-no question), but modals are semantically optional in English, so the word "do" (or "does") is used as a placeholder when no modal is required. In general, English does a lot of fronting and other grammatical rearrangement when building different constructions, and I would imagine that this annoys speakers of Spanish just as much as Spanish verb conjugations tend to annoy speakers of English.

Gentoo bans AI-created contributions

Posted Apr 21, 2024 19:57 UTC (Sun) by donald.buczek (subscriber, #112892) [Link]

It fills me with great satisfaction that I am now receiving the prophesied English lecture after all.

Gentoo bans AI-created contributions

Posted Apr 21, 2024 20:31 UTC (Sun) by malmedal (subscriber, #56172) [Link]

I find https://www.ithkuil.net/ fascinating; not fascinating enough to actually learn it, but still interesting to see what sort of concepts you can cram into a language.

Gentoo bans AI-created contributions

Posted Apr 21, 2024 20:55 UTC (Sun) by donald.buczek (subscriber, #112892) [Link] (3 responses)

> German also has present simple and present perfect, it is missing the continuous forms.

No, German has that, too. "I am working" would be "Ich bin arbeitend". Yes, this isn't used much and sounds a bit strange but it is valid. It strongly indicates "right at this very moment".

Now, to make things even more complicated: A male "worker" is an "Arbeiter" and a female worker is an "Arbeiterin". A worker or a group of workers with unknown, irrelevant or mixed gender would be "Arbeiter", too. But many people now reject the generic masculine.

So currently, several gender neutral forms compete with the old generic masculine and with each other. For plural, the substantivized form of the verb in the present continuous tense is often used: "Arbeitende".

So the seldom-used tense got a little revival lately.

I don't like it, because to me the natural interpretation of "Arbeitende" would be "people, who work at this very moment".

Gentoo bans AI-created contributions

Posted Apr 21, 2024 21:14 UTC (Sun) by malmedal (subscriber, #56172) [Link] (2 responses)

> "I am working" would be "Ich bin arbeitend".

Mmmm, I believe arbeitend functions as an adverb, in this sentence. That is "I am something" and the something that I am is "working".

Gentoo bans AI-created contributions

Posted Apr 22, 2024 7:25 UTC (Mon) by donald.buczek (subscriber, #112892) [Link] (1 responses)

> Mmmm, I believe arbeitend functions as an adverb, in this sentence. That is "I am something" and the something that I am is "working".

After reading a bit I have to admit that you are more correct than I am. The word form 'arbeitend' is known as 'Partizip I' (Present Participle) in German, which functions as a hybrid between a verb and an adjective/adverb.

This is from the beginning of the German Wikipedia page on "Participle" / "Partizip":

> A participle (Latin participium, from particeps "participating"; plural: participles) is a grammatical form (participial form) that is derived from a verb and thereby partially acquires the properties of an adjective, but also retains some properties of a verb. The term "participle" and likewise the German term Mittelwort express this characteristic of participating in two categories at the same time, namely verb and adjective. [...]
> German examples of participles are the forms ending in -end like spielend (to the verb spielen; called "Present Participle") and the forms starting with ge- like gespielt (called "Past Participle"). In traditional grammar, participles were often listed as a separate part of speech alongside verbs, adjectives, nouns, etc.; however, this view is not shared in modern linguistics, where participles are considered as words or even constructions that contain varying proportions of verbal and adjectival components.

You are correct; the Present Participle ('Partizip I') is not typically used in natural speech as a verb to denote an 'immediate' tense. "Ich bin arbeitend" can be said to be wrong and is not listed in tables with verb tenses. At least, it sounds yoda-ish.

As you mentioned, the Present Participle is used as an adjective or adverb to describe a state. It's also used, in a slightly more verb-like way, to indicate simultaneous actions, for example, "Die Kinder kamen lachend aus der Schule" ("The children came out of the school laughing").

Gentoo bans AI-created contributions

Posted Apr 27, 2024 2:19 UTC (Sat) by gutschke (subscriber, #27910) [Link]

Just to throw another wrinkle into this discussion, I believe that in "ich bin arbeitend", the "arbeitend" would be a predicative expression, which is different both from an adverb and from a way of expressing what English would do with present continuous. It simply describes a state that you are in. For a better discussion, see https://en.wikipedia.org/wiki/Predicative_expression

I understand why it is tempting to say that the present participle is used to form a present continuous. That would feel very natural to an English speaker who is familiar with Latin. And it feels almost but not quite as if German should do the same. In fact, I wouldn't be surprised if some regional German dialects did this. There is a lot of cross pollination between all of these languages, but in the process grammatical concepts get repurposed and subtly change.

Gentoo bans AI-created contributions

Posted Apr 21, 2024 18:07 UTC (Sun) by khim (subscriber, #9252) [Link] (10 responses)

> I program, I am programming, I do program.

And where do you see different present tenses?

> Well, I was taught we have the same 6 tenses. It makes perfect sense to me.

Well… to some degree it even makes sense to me. Just different from what sense it makes to you. Please open that Wikipedia link, scroll down literally dozen of lines and read:

> English has only two morphological tenses: the present (or non-past), as in he goes, and the past (or preterite), as in he went.

WHA… what happened to these “six tenses”? That's also explained right there, too:

> The study of modern languages has been greatly influenced by the grammar of the Classical languages, since early grammarians, often monks, had no other reference point to describe their language. Latin terminology is often used to describe modern languages, sometimes with a change of meaning, as with the application of "perfect" to forms in English that do not necessarily have perfective meaning, or the words Imperfekt and Perfekt to German past tense forms that mostly lack any relationship to the aspects implied by those terms.

> I'm not aware of any other European language with multiple present tenses - doesn't mean there aren't any, I've just never heard of any...

English tenses are “different” for the same reason English inches, feet and miles are different from the meters that everyone else uses. The difference is not in the language per se, it's in how it's taught. Just why do you say that “I am programming” is a separate tense while “I love programming” is not?

In reality most European languages may also use verbs to adjust time perception; they just don't [try to] pretend it's, somehow, a grammatically different tense, and they are [slowly] adopting the rules that actual linguists designed instead of [trying to] pretend that everyone is talking in a variant of Latin.

Gentoo bans AI-created contributions

Posted Apr 21, 2024 22:14 UTC (Sun) by Wol (subscriber, #4433) [Link] (9 responses)

> > I program, I am programming, I do program.

> And where do you see different present tenses?

Because they have completely different meanings?

"I am a programmer, I program" - it's my job, I do it all the time, I may - OR MAY NOT - be doing it right now (I'm not - I'm busy writing right now :-)

"I am programming" - I'm not, I'm not doing it now, I'm writing.

"I do (not) act" - this variant is almost always either emphatic or negative - and when negative it implies "never".

So the first variant is the continuous present: I may not be doing it right now, but it happens past, present and future. (In the positive it also does not necessarily imply "right now".)

The second variant is the present - it's happening right now.

The third variant - I'm not sure what it's called - is almost always used to imply "never".

Three clearly different meanings.

To jump on your mention of "English has "he goes" and "he went"", what do you understand by the two sentences

"Jim is going to the gym" and "Jim goes to the gym". I was taught they are two - clearly different - present tenses. Because they have two - clearly different - meanings.

Cheers,
Wol

Gentoo bans AI-created contributions

Posted Apr 22, 2024 4:38 UTC (Mon) by khim (subscriber, #9252) [Link] (8 responses)

> And where do you see different present tenses?

Sure. And “I love programming” would be another meaning and “I teach programming” yet another one.

Does it mean there are a bazillion tenses in English? Loving present, teaching present and so on?

No, there are two tenses and many verbs, of which a select few are interpreted by teachers as “yet another tense”.

English is not unique and not even particularly hard WRT how it treats time (other languages have many other and different ways to talk about the passage of time, too). What is unique is the absolute refusal to change anything anywhere for any reason in the teaching of said language.

Gentoo bans AI-created contributions

Posted Apr 22, 2024 7:38 UTC (Mon) by Wol (subscriber, #4433) [Link] (7 responses)

> No, there are two tenses and many verbs, of which a select few are interpreted by teachers as “yet another tense”.

Let's apply some very simple logic.

"I program" = "I am programming"

therefore true = false

Unless you live in Crete, these two cannot be same, therefore while they are both present, they have to be different present tenses. And I don't know about you, but this confusion is one of the absolutely standard ways by which we detect foreign speakers ... it's a VERY common mistake. (Coupled with the occasional giveaway of "I programming" which simply doesn't exist in standard English.)

Cheers,
Wol

Gentoo bans AI-created contributions

Posted Apr 22, 2024 10:29 UTC (Mon) by kleptog (subscriber, #1183) [Link] (4 responses)

This is one of the classic mistakes Dutch people with poor English make.

The phrase "ik ga naar school" in Dutch can mean either "I go to school" or "I am going to school (now)" depending on the context. For some reason Dutch people often throw in the "am" when it is unnecessary, but other Dutch don't pick up the problem either. Once you point it out to them it usually corrects fairly quickly, but it's fascinating that the same type of error keeps popping up.

Gentoo bans AI-created contributions

Posted Apr 22, 2024 11:00 UTC (Mon) by paulj (subscriber, #341) [Link] (2 responses)

I'm confused, how is the am in "I am going to school" unnecessary in English? "I go to school" in English - at least in most of the Celtic Isles - would sound a little foreign. Indeed, it sounds... Dutch. ;)

Also (and ICBW, I've never really had native /adult/ Dutch, and it's been a long time since I had native child's Dutch), but could a Dutch person not be more precise with "Ik ga nu naar school" for "I am going to school now"? Also, "Ik ga zo naar school" for "I am going to school shortly"? Part of the problem with Dutch is it has become very terse, and dropped a lot of constructs - even in my lifetime AFAIK. (??).

Gentoo bans AI-created contributions

Posted Apr 22, 2024 11:45 UTC (Mon) by Wol (subscriber, #4433) [Link]

> I'm confused, how is the am in "I am going to school" unnecessary in English? "I go to school" in English - at least in most of the Celtic Isles - would sound a little foreign. Indeed, it sounds... Dutch. ;)

My daughter goes to school - and she's 40. She's a deputy head :-)

The "am" is WRONG (not unnecessary, wrong) if it's school holidays :-) "I go to school" typically means "I am a student", while "I am going to school" means I'm on my way right now.

Cheers,
Wol

Gentoo bans AI-created contributions

Posted Apr 22, 2024 21:04 UTC (Mon) by kleptog (subscriber, #1183) [Link]

Yeah, I guess I'm not explaining myself very well. It's not that the "am" is unnecessary in general, but that most of the time they mean the variant without. So you get conversations like:

A: What do you do during the day?
B: I am working.
A: (confused) Clearly you are sitting here having a drink? Oh you mean "I work".

It's not that some languages cannot express certain tenses; given enough words you can express any tense in any (sufficiently advanced) language. It's whether certain tenses have a special status in the grammar of a language. Generally, similar concepts in different languages are linked in different ways, which leads to people learning the language using words in ways a native speaker finds confusing.

But yes, this is a post about Gentoo, so better leave it at that.

Gentoo bans AI-created contributions

Posted Apr 22, 2024 13:48 UTC (Mon) by khim (subscriber, #9252) [Link]

> Once you point it out to them it usually corrects fairly quickly, but it's fascinating that the same type of error keeps popping up.

What's fascinating about that? You are using a less flexible language and are forcing someone to pick between two choices that, to him (or her), are almost indistinguishable. Of course there would be mistakes!

It's like someone attempting to write a Perl program for the first time. Learning when you should use `$` and when you should use `@` with array names is non-trivial, to say the least.

Gentoo bans AI-created contributions

Posted Apr 22, 2024 13:43 UTC (Mon) by khim (subscriber, #9252) [Link] (1 responses)

I don't know whether continuing is constructive at this point. You are spouting the same kind of nonsense that you spouted when undefined behavior was discussed and ignore everything except what you believe to be true. Even if your beliefs don't even remotely match reality.

> Let's apply some very simple logic.

If by now “logic”, in English, means “random sequence of letters without any clear meaning”, then I guess I learned some kind of wrong English.

> "I program" = "I am programming"

> therefore true = false

> Unless you live in Crete, these two cannot be same

Sure, they are not the same, but neither are the sentences “I am”, “I like”, “I like programming”, “I teach”, “I teach programming” and many others.

> therefore while they are both present

Yes. And they are present in most other human languages. Or do you believe other languages couldn't distinguish between a person who programs for a living and a person who is programming something right now, this very second? They can; that's not a reason to introduce some nonsense bazillion present tenses.

Why would it be surprising that not all things that may happen in the present have the same meaning… or why have you decided that all these sequences of words should be split into three semi-randomly picked present tenses?

> they have to be different present tenses

Why? Why don't “I like programming” or “I teach programming” need different present tenses, while “I am programming” does?

> And I don't know about you, but this confusion is one of the absolutely standard ways by which we detect foreign speakers ... it's a VERY common mistake.

Yes, but is it because English has “more present tenses” or… because it has “fewer present tenses”? I would say that it's because it has fewer.

It's the same story as with articles: just as the difference between “a” and “the” is meaningless most of the time (it can be picked up from the context easily and can be easily conveyed if needed), the difference between “I program” and “I am programming” exists but is not useful! Of course other languages can distinguish between these two forms if needed; it's just that most of the time there is no need to distinguish them.

Worse: the form that conveys the more often needed meaning (that I'm programming right now) is longer and more complicated.

English is similar to BASH here: just as in BASH you may write $* or "$@" and, most of the time, the short form is not needed and not used, so English insists on the longer form where the difference between the two forms is meaningless (e.g. on a programmers' forum, saying that you know how to program is not useful, but saying that you are in the process of writing a program is useful).

> Coupled with the occasional giveaway of "I programming" which simply doesn't exist in standard English.

Indeed. English grammar is extremely inflexible, rigid and, I would even say, “strange”. It takes a long time for a speaker of some other language where words don't come in a sentence in any particular order to adjust to it.

English, of course, has no choice, because it has words that may sound identical when used as a noun and as a verb, but, again, the problem arises not when English offers you more capabilities (you may just ignore them) but when it doesn't have capabilities that other languages have (similarly to how translating a program from a statically-typed language to a dynamically-typed one is easy but going in the other direction is not).

On continuing

Posted Apr 22, 2024 13:57 UTC (Mon) by corbet (editor, #1) [Link]

Indeed, this conversation has gone fairly far afield, and it seems like a good time to wind it down.

Remember Gentoo? ... this is an article about Gentoo ...

Gentoo bans AI-created contributions

Posted Apr 21, 2024 9:29 UTC (Sun) by malmedal (subscriber, #56172) [Link]

> And how many other languages have THREE different forms of the present tense? Which one do you use where? That's another classic "this guy is a foreigner" giveaway ...

English has four, I believe, and this is the case for pretty much every language in the Indo-European language group. It is not rare.

In fact English has lost a number of constructs common in related languages, e.g. "he is arrived" vs. "he has arrived".

Gentoo bans AI-created contributions

Posted Apr 22, 2024 3:17 UTC (Mon) by ssmith32 (subscriber, #72404) [Link] (3 responses)

It's not arrogance.

If you don't understand the language, you won't be able to review the output.

If you barely understand the language, it's either going to take a loooong time to review or you're just not doing it right.

It's like writing code vs testing and debugging. Writing it out is the easy part (relatively speaking).

Gentoo bans AI-created contributions

Posted Apr 22, 2024 4:54 UTC (Mon) by khim (subscriber, #9252) [Link] (2 responses)

> It's like writing code vs testing and debugging. Writing it out is the easy part (relatively speaking).

How much code in Haskell or Rust have you written?

Whether writing code is the easy part or not depends very much on what you are writing and how.

Sure, if you are using a language which allows you to write something like [] + {} and get nonsense output without any errors, then writing code is easy and testing and debugging is the tedious and time-consuming part.

If you use something like Haskell or Rust then writing the code is the majority of your work, and if you are using something like WUFFS then writing code that the compiler accepts is 99% of the work.

> If you don't understand the language, you won't be able to review the output.

Sure, but that's where AI and human complement each other: for a human it's easier to understand an unfamiliar language than to write a sentence in an unfamiliar language, while for AI it's the opposite. So by allowing AI to create something that is “looking nice” (a task which current generative AIs already perform better than non-native speakers) and giving the human the task that s/he does well, you reduce the time needed to create the final result. That's true both for a programming language like Python or Ruby and for a regular language like English or Chinese.

I have no idea why it is so hard to accept when it's obvious. It's the exact same reason spellcheckers work, after all.

Heck, do an experiment: try to write some simple program in a language that you have never used before (Haskell, Scheme, or maybe APL or MUMPS) and compare that to the time needed to first learn said language and then write something.

Gentoo bans AI-created contributions

Posted Apr 22, 2024 7:23 UTC (Mon) by anselm (subscriber, #2796) [Link] (1 responses)

> If you use something like Haskell or Rust then writing the code is the majority of your work

I don't buy that. There will still be loads and loads of bugs in people's code even if the compiler accepts it. Haskell and Rust may be nice languages but it's not as if they made debugging unnecessary.

Gentoo bans AI-created contributions

Posted Apr 22, 2024 7:43 UTC (Mon) by khim (subscriber, #9252) [Link]

> Haskell and Rust may be nice languages but it's not as if they made debugging unnecessary.

This, of course, depends to a large degree on how you structure your code and, even more importantly, how you structure your data.

The determined Real Programmer can write FORTRAN programs in any language, after all.

But if you structure your code to embed enough domain knowledge in the data types then yes, debugging becomes mostly unnecessary. And even if you do need to debug things you know where to go: to those pesky few corner cases which you cut because you had a poor understanding of what your program should actually do.
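
As a rough illustration of what "embedding domain knowledge in the data types" can look like, here is a minimal sketch. It is written in Python rather than Haskell or Rust, so the rule is only checked at runtime (or by an optional static checker), not by a compiler. The names `Draft`, `Reviewed`, `review` and `publish` are hypothetical, invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Draft:
    """Text that nobody has reviewed yet."""
    text: str


@dataclass(frozen=True)
class Reviewed:
    """Text that a human has checked; by convention built only via review()."""
    text: str
    reviewer: str


def review(draft: Draft, reviewer: str) -> Reviewed:
    # The intended path from Draft to Reviewed: the domain rule
    # "nothing is published unreviewed" is expressed in the types.
    if not reviewer:
        raise ValueError("a reviewer name is required")
    return Reviewed(text=draft.text, reviewer=reviewer)


def publish(item: Reviewed) -> str:
    # The annotation lets a static checker flag publish(Draft(...)).
    # The isinstance check is a runtime stand-in for what a Haskell
    # or Rust compiler would reject before the program ever runs.
    if not isinstance(item, Reviewed):
        raise TypeError("only Reviewed text can be published")
    return f"{item.text} (reviewed by {item.reviewer})"
```

With this shape, passing an unreviewed `Draft` to `publish` fails immediately at the call instead of surfacing later as a bug to be hunted down; in a statically typed language the same mistake would not get past the compiler at all.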