
Gentoo bans AI-created contributions

Posted Apr 18, 2024 17:54 UTC (Thu) by snajpa (subscriber, #73467)
In reply to: Gentoo bans AI-created contributions by atnot
Parent article: Gentoo bans AI-created contributions

Umm, haven't they said the same thing about shared e-scooters, ride-sharing, couch-sharing, etc.? That it would solve itself? :)

As long as there are always new investors ready to pour resources in, it won't solve itself, certainly not in the way you think. They might actually manage to make inference dirt cheap, so they could afford to stay at these subscription levels while even turning a profit. I don't see why not. The hardware hasn't even really started moving in the direction of cheaper inference yet, but it will.



Gentoo bans AI-created contributions

Posted Apr 18, 2024 17:58 UTC (Thu) by snajpa (subscriber, #73467) [Link] (5 responses)

btw, the improved autocomplete from GitHub is $100/year, not $100/month; and so far, at least for me, it's been worth every penny :)

Gentoo bans AI-created contributions

Posted Apr 18, 2024 18:00 UTC (Thu) by snajpa (subscriber, #73467) [Link] (2 responses)

(*and* I have three RTX 3090s sitting around here just so that I can play around with these so-called improved autocompletes :D they weren't even that expensive, second-hand from a miner)

Gentoo bans AI-created contributions

Posted Apr 19, 2024 18:30 UTC (Fri) by intelfx (subscriber, #130118) [Link] (1 responses)

> I got three RTX 3090 sitting around here just so that I can play around these so-called improved autocompletes

Is there anything of that sort (I mean, LLM-powered code assistance, Copilot-grade quality) that can actually be used locally? Any pointers?

(There is JetBrains' FLCC which runs on the CPU, but it is really not much better than lexical autocompletion. I'm talking about more powerful models.)

Gentoo bans AI-created contributions

Posted Apr 19, 2024 22:28 UTC (Fri) by snajpa (subscriber, #73467) [Link]

So far the closest to the Copilot experience has been phind-codellama-34b-v2.Q4_K_M (GGUF format, which llama.cpp and its derivatives can load; it fits on one 3090, and bigger models are too slow to respond, IMO) plus the Twinny extension for VS Code. Next time I get to it (i.e., when my ISP has an outage and I have to fall back to flaky backup LTE), I'm going to give the Continue extension another shot. phind-codellama-34b-v2.Q4_K_M isn't as good as Copilot, but I haven't tried modifying the prompts the plugins feed it; judging from the behavior I get, I think there's a lot of room for optimization there.
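(For anyone wondering what this plumbing looks like: extensions like Twinny basically send a fill-in-the-middle prompt to a local HTTP server. Here's a minimal sketch in Python, assuming you've started something like llama.cpp's server with the GGUF model on localhost; the port, endpoint path, and the CodeLlama-style `<PRE>`/`<SUF>`/`<MID>` prompt template are illustrative assumptions, not the exact setup I use.)

```python
import json
import urllib.request

# Assumed local endpoint of an OpenAI-compatible llama.cpp-style server
# serving a CodeLlama-family GGUF model; adjust host/port to your setup.
COMPLETION_URL = "http://127.0.0.1:8080/v1/completions"

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Build a CodeLlama-style fill-in-the-middle prompt from the code
    before and after the cursor (template is an assumption here)."""
    return f"<PRE> {prefix} <SUF>{suffix} <MID>"

def build_payload(prefix: str, suffix: str, max_tokens: int = 64) -> dict:
    """Assemble the JSON request body for a completion call."""
    return {
        "prompt": build_fim_prompt(prefix, suffix),
        "max_tokens": max_tokens,
        "temperature": 0.2,   # low temperature for stable code suggestions
        "stop": ["<EOT>"],    # CodeLlama's end-of-infill marker
    }

def complete(prefix: str, suffix: str) -> str:
    """Send the request to the local server and return the suggested text."""
    req = urllib.request.Request(
        COMPLETION_URL,
        data=json.dumps(build_payload(prefix, suffix)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"]
```

Swapping the prompt template or the stop tokens is exactly the kind of per-model tuning the extensions mostly don't expose, which is where I suspect the optimization headroom is.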

Outside of code completion, people really ought to try the miqu-1-70b "leak", which can fit onto two 24G cards, to see where the state of the art is (or was, not that long ago) relative to how many resources it needs to run. Text generation with this thing is just about the most boring thing one can do; IMHO it doesn't deserve as much attention as it is getting. When we finally get open- (or at least published-) weights models with the current extended "up to 1M"-class context window sizes, combined with QLoRA, I think people are going to make some amazing things. For me, the 32k context size is currently the most limiting factor.

Gentoo bans AI-created contributions

Posted Apr 18, 2024 18:55 UTC (Thu) by atnot (subscriber, #124910) [Link] (1 responses)

Sorry, but $100 is just nowhere near enough to cover the cost of running these things. Microsoft charges their enterprise customers roughly 4x that and not even they have remotely turned a profit on it. In fact to my knowledge, not a single company has ever turned a profit with an LLM offering at any price point. And they'd be yelling it from the rooftops if they did.

It's also notable that even at that price, they have to give deep discounts to enterprise customers so that they can proudly announce companies like McKinsey getting on board. Not because those companies have any use for it either, mind you, but to be able to "better answer our customers' questions about AI".

Gentoo bans AI-created contributions

Posted Apr 18, 2024 21:17 UTC (Thu) by snajpa (subscriber, #73467) [Link]

At that scale, they also have massive opportunities to optimize and cut the total amount of work they need to do, just by looking at the data that goes through and balancing it against the compute costs (using heuristics such as how often the suggested code is accepted, etc.).
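To make the heuristic concrete, here's a hypothetical sketch of that kind of cost balancing: track per-user acceptance rates and route low-value traffic to a cheaper model. The model names, threshold, and warm-up count are all made up for illustration; a real provider would use far richer signals.

```python
from collections import defaultdict

# Hypothetical model tiers; names are placeholders.
CHEAP_MODEL = "small-code-model"
EXPENSIVE_MODEL = "large-code-model"

class SuggestionRouter:
    """Route completion requests by how often a user accepts suggestions."""

    def __init__(self, threshold: float = 0.25, warmup: int = 20):
        self.accepted = defaultdict(int)
        self.shown = defaultdict(int)
        self.threshold = threshold  # acceptance rate below this -> cheap model
        self.warmup = warmup        # use the big model until we have data

    def record(self, user: str, was_accepted: bool) -> None:
        """Log one suggestion shown to a user and whether it was accepted."""
        self.shown[user] += 1
        if was_accepted:
            self.accepted[user] += 1

    def pick_model(self, user: str) -> str:
        """Choose a model tier for the user's next request."""
        if self.shown[user] < self.warmup:
            return EXPENSIVE_MODEL
        rate = self.accepted[user] / self.shown[user]
        return EXPENSIVE_MODEL if rate >= self.threshold else CHEAP_MODEL
```

The point isn't this particular policy (you could just as well argue for the opposite routing); it's that once you see every request and its outcome, even a trivial heuristic like this can shave a large fraction off the per-request compute bill.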


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds