
Human extinction from alignment problems

Posted May 6, 2023 0:00 UTC (Sat) by david.a.wheeler (subscriber, #72896)
In reply to: Much ado about *censored* by roc
Parent article: Google "We Have No Moat, And Neither Does OpenAI" (SemiAnalysis)

The *current* crop of AI/ML won't lead to human extinction due to lack of alignment.

But I do think that in the long term this is a legitimate concern. Not because the AI/ML becomes "evil", but because the program does what it was asked to do yet does it in an unexpected way. Computers do what they're told to do, not what we *meant* to tell them to do, and we humans are really bad at being precise about what we mean. What concerns me most are variations of the "paperclip maximizer". That is, an AI is told to make as many paperclips as it can, and it turns the humans / Earth / the universe into paperclips. See this clicker game built on that premise: https://www.decisionproblem.com/paperclips/

I have no idea how to address this problem. I hope someone else figures it out...!



Human extinction from alignment problems

Posted May 6, 2023 4:48 UTC (Sat) by roc (subscriber, #30627)

For quite a long time people like LeCun said "why would an AI want to take over the world or destroy humanity? That's ridiculous." Turns out one answer is "because people will ask it to, for the lulz if for no other reason" --- see ChaosGPT.

So I don't think we're going to reach a state where paperclip-maximizer misalignment is the crucial problem. That issue is going to be swamped by people providing their own bad goals.

Like other LWN readers I'm a dyed-in-the-wool open source enthusiast in general, but here I feel like it's going to be more like open-source nukes-for-all. I am not enthusiastic about that.

Human extinction from alignment problems

Posted May 8, 2023 9:34 UTC (Mon) by ssokolow (guest, #94568)

Robert Miles has a good video named "We Were Right! Real Inner Misalignment" which really drives home the problem of misalignment, not in terms of politics, but in terms of how difficult it is to be sure that these systems have actually learned what you tried to train them to do.

...and it's got this great comment:

Turns out the Terminator wasn’t programmed to kill Sarah Connor after all, it just wanted clothes, boots and a motorcycle.
-- Luke Lucos

Human extinction from alignment problems

Posted May 10, 2023 22:14 UTC (Wed) by JoeBuck (subscriber, #2330) (4 responses)

I don't think that this is the near-term threat. Long before AI is good enough to independently take over the world, it might be good enough that management can fire most of the programmers, writers, artists, and middle management, have AI replace their functions, have a skeleton crew to clean up any problems in what the AI generates, and the stockholders keep all the money.

Human extinction from alignment problems

Posted May 16, 2023 0:06 UTC (Tue) by ras (subscriber, #33059) (3 responses)

> Long before AI is good enough to independently take over the world, it might be good enough that management can fire most of the programmers, writers, artists, and middle management,

I'm not sure about that. I don't think there is much doubt that, in time, AI will be able to do any "thinking" job better than a human, given they already do a lot of things better than humans now. Their one downside is the enormous cost of training and running. So the ideal task is something that generates large rewards for intelligence. That doesn't sound like a taxi driver, programmer or writer. The thing that seems to fit the bill best is ... replacing upper management.

So my prediction for where we end up is: we all work for AIs whose loss function (the thing they are trying to optimise) is to maximise profit.
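
A minimal sketch of what that would mean in code (my own illustration, not anything from the thread): an optimiser minimises a loss, so "maximise profit" just becomes "minimise negative profit".

def loss(revenue, costs):
    profit = revenue - costs
    return -profit  # lower loss == higher profit, so minimising the loss maximises profit

print(loss(120.0, 100.0))  # -20.0: a profitable quarter scores a lower loss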

Human extinction from alignment problems

Posted May 16, 2023 13:20 UTC (Tue) by Wol (subscriber, #4433) (2 responses)

> So the ideal task is something that generates large rewards for intelligence. That doesn't sound like a taxi driver, programmer or writer. The thing that seems to fit the bill best is ... replacing upper management.

I think you're making a mistake in assuming management is intelligent ... managers typically have high EQ but low IQ - they are good at manipulating people, but poor at thinking through the consequences of their actions.

Mind you, given that studies show that paying over-the-top bucks to attract talent is pouring money down the drain (your typical new CEO - no matter their pay grade - typically underperforms for about 5 years), an AI might actually be a good replacement.

Cheers,
Wol

Human extinction from alignment problems

Posted May 16, 2023 22:47 UTC (Tue) by ras (subscriber, #33059)

> I think you're making a mistake in assuming management is intelligent

Guilty as charged. But in the (smallish, successful, run by the person who founded them) companies I've been associated with, the top level people have always been smart. Not perhaps as good as a top engineer at abstract reasoning, but they are definitely much better than most at thinking through the consequences of actions and planning accordingly.

Your characterisation does seem accurate for the middle management in large organisations. When the smallish organisation I worked for got taken over by a $4B company, I got to experience what it was like to work for middle "IT" management. After 2 years I could not stand it any more, and resigned.

> they are good at manipulating people

Yes, but AIs can be too - as demonstrated by AIs besting humans at playing Diplomacy. In fact that seems to imply an AI can be better at manipulating people than people are. So future AIs could have both higher EQs and IQs than most humans, and have the extraordinary general knowledge ChatGPT displays. But to be useful they would have to be trained continuously as new conditions arise. The CPU and power requirements would be enormous - so big you could only justify it for something like the CEO of a large company.

Compared to a human CEO, an AI CEO would know every aspect of the company's operations and everybody's contribution - even in a company with tens of thousands of employees. (A thing that amazes me about ChatGPT is its breadth of knowledge. Ask it about some question that is only covered on a few obscure pages on the internet - and it often knows about it. I find it amazing that so much of the information on the internet can be condensed into a "mere" trillion 16-bit floats.) I'm guessing such breadth of knowledge about the company would give it an enormous advantage over a human. Why would it need middle management, for a start?
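
As a rough back-of-the-envelope check on that figure (taking "a trillion 16-bit floats" at face value - it's the number quoted above, not a verified parameter count):

params = 1_000_000_000_000   # 1e12 parameters, as quoted above
bytes_per_param = 2          # 16 bits = 2 bytes (fp16/bf16)
total_bytes = params * bytes_per_param
print(total_bytes / 1e12)    # 2.0, i.e. about 2 terabytes of raw weights

So the raw weights would fit on a couple of commodity SSDs, which is part of what makes that breadth of knowledge so striking.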

Where merely copying an AI works, you can share the training expense across a lot of instances. That's what may happen for taxi drivers and other "cookie cutter" jobs. I'm not sure programming and other engineering jobs fall into the same class. Good programmers have a lot of domain knowledge about the thing they are writing software for, which means the cookie-cutter approach doesn't work so well.

Human extinction from alignment problems

Posted May 17, 2023 12:10 UTC (Wed) by pizza (subscriber, #46)

> I think you're making a mistake in assuming management is intelligent ... managers typically have high EQ but low IQ - they are good at manipulating people, but poor at thinking through the consequences of their actions.

I don't think that's fair, or accurate.

It's probably much more accurate to state that, for most management in large-ish organizations, the incentives in place reward very short-term gains at the cost of long-term consequences. So management is rationally optimizing for what gives them the most benefit.

