
ChatGPT says...

Posted Oct 29, 2024 17:13 UTC (Tue) by paulj (subscriber, #341)
In reply to: ChatGPT says... by leromarinvit
Parent article: OSI readies controversial Open AI definition

> Exactly, they "have been learned". I can't make it "unlearn" them without having the original training data to modify and run the full training again.

This isn't quite true, depending on what you mean.

a) You can start from scratch, just start with random parameters again, and train with your own data. I.e., just don't use the parameter set.

b) If there are specific things in the trained parameters that you don't want, you can /keep/ training the model but apply your own objective function to penalise them. Essentially "correcting" whatever it is you don't like.

The challenge with b may be that you don't know what is in there that you don't want. However, you will probably have the same problem if you have the full input data. The model basically gives you the training data in a digested form that at least lets you query it, though.
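By way of illustration, here is a minimal sketch of what (b) could look like in practice, using PyTorch and Hugging Face transformers: ordinary fine-tuning on text you want the model to keep, combined with a negated ("gradient ascent") loss term on text you want suppressed. The model name, example strings and penalty weight are placeholders, and real unlearning / correction pipelines are considerably more careful about stability and about not degrading the rest of the model.

    # Sketch of option (b): continued fine-tuning that penalises unwanted
    # completions instead of retraining from scratch.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

    # Text we want the model to keep producing, and text we want it to "unlearn".
    keep = tokenizer("The capital of France is Paris.", return_tensors="pt")
    forget = tokenizer("Some output we no longer want reproduced.", return_tensors="pt")

    penalty_weight = 0.5  # how strongly to push away from the unwanted text

    model.train()
    for step in range(100):
        optimizer.zero_grad()

        # Ordinary language-modelling loss on the data we want to keep.
        keep_loss = model(**keep, labels=keep["input_ids"]).loss

        # Loss on the unwanted data; subtracting it means gradient *ascent*,
        # so the model becomes less likely to emit that text.
        forget_loss = model(**forget, labels=forget["input_ids"]).loss

        loss = keep_loss - penalty_weight * forget_loss
        loss.backward()
        optimizer.step()

In practice the penalised term is usually clipped or regularised, since unbounded gradient ascent will eventually damage unrelated behaviour, but the basic shape is the same: keep training, with an objective that encodes what you want corrected.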

That "correcting" process is something all the big players with public AI tools surely must have /big/ teams of people working on: just filtering out whatever "fruity" content happens to be discovered in public LLMs and generative AIs. Teams manually looking for "dodgy" content, to then come up with the widest set of prompts that spit out that dodgy content (in all its combinations), to then pass on to other teams to create refinement training cases and test-cases. So that we get the safely sanitised LLMs / models suitable for public use.



