
About KeePassXC's code quality control (KeePassXC blog)

The KeePassXC project recently updated its contribution policy and README to document its stance on contributions created with generative AI tools. The project's use of such tools, including GitHub Copilot, has raised a number of questions and concerns, to which the project has responded:

There are no AI features inside KeePassXC and there never will be!

The use of Copilot for drafting pull requests is reserved for very simple and focused tasks with a small handful of changes, such as simple bugfixes or UI changes. We use it sparingly (mostly because it's not very good at complex tasks) and only where we think it offers a benefit. Copilot is good at helping developers plan complex changes by reviewing the code base and writing suggestions in markdown, as well as boilerplate tasks such as test development. Copilot can mess up (e.g., by committing a full directory of rubbish, which we identified and fixed), and we catch that in our standard review process. You can review our copilot instructions. Would we ever let AI rewrite our crypto stack? No. Would we let it refactor and rewrite large parts of the application? No. Would we ask it to fix a regression or add more test cases? Yes, sometimes.

Emphasis in the original. See the full post to learn more about the project's processes and pull requests that have been created with AI assistance.




Copilot is most useful for the smaller stuff

Posted Nov 10, 2025 13:49 UTC (Mon) by kleptog (subscriber, #1183) [Link] (2 responses)

> Copilot is good at helping developers plan complex changes by reviewing the code base and writing suggestions in markdown, as well as boilerplate tasks such as test development.

Honestly, I haven't even had much luck with that. I've had Copilot enabled for a while and it's most helpful for things like:

* you're restructuring a list into a dict (or vice versa); after you've done two entries by hand, it offers to convert all the rest as well.

* you're adding an error check; after you type the if statement, it completes the rest of the error handling based on similar blocks nearby.

* you've typed the wrong variable name, and it notices and suggests the correct one.

* you're writing a k8s YAML config, and it helpfully suggests the keys you need to add, with useful values.
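A hypothetical sketch (all names made up) of the first pattern above: once you have converted an entry or two of a tuple list into a dict by hand, tab completion will usually offer to finish the rest in the same shape. The end result of that mechanical restructuring looks something like:

```python
# Original shape: a list of (name, host, port) tuples.
servers = [
    ("web-1", "10.0.0.1", 8080),
    ("web-2", "10.0.0.2", 8080),
    ("db-1", "10.0.1.1", 5432),
]

# Restructured shape: a dict keyed by name, which is the kind of
# repetitive, pattern-following edit tab completion is good at.
servers_by_name = {
    name: {"host": host, "port": port}
    for name, host, port in servers
}

print(servers_by_name["db-1"]["port"])  # 5432
```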

I wouldn't even know how to document how I used it; I don't know what prompts it used, or whether it does something smarter than that.

I wouldn't try it on anything larger; certainly my attempts to use it for refactoring haven't been great (very slow), and I suspect this is a common theme: https://www.reddit.com/r/ProgrammerHumor/comments/1kvlj4m...

It doesn't seem to understand code bases at all at a larger scale. But maybe I'm using it wrong.

Copilot is most useful for the smaller stuff

Posted Nov 10, 2025 14:52 UTC (Mon) by Baughn (subscriber, #124425) [Link] (1 response)

It’s a naming thing.

As in, there are multiple tools named copilot. The article is talking about the chat mode, most likely, where the data is sent to Sonnet 4.5 or similar. You’re talking about tab completion.

Neither option is as good at coding as e.g. Claude Code, so we’re also talking about quite limited use cases.

Copilot is most useful for the smaller stuff

Posted Nov 10, 2025 23:18 UTC (Mon) by intgr (subscriber, #39733) [Link]

Copilot IDE integration also has a chat panel.

But there are lots of ways to invoke it to create a PR, including from the IDE or from the main GitHub web site: https://docs.github.com/en/copilot/how-tos/use-copilot-ag...

Generative AI for PRs

Posted Nov 12, 2025 14:41 UTC (Wed) by emk (subscriber, #1128) [Link] (1 responses)

High-end generative AI (e.g., Claude Code with Sonnet 4.5) can be genuinely useful in the hands of a senior engineer. It occupies roughly the same space as a diligent intern. If you give it clear instructions, achievable goals, and sufficient feedback, it can write a lot of OKish code quickly. But to actually use that code in production, you'll need to review and understand it, and you'll need to impose architectural discipline. The entire process feels extremely familiar to me, because it's the same stuff I need to do to scale up a development team.

I also get much better results if I design the code for heavy, automatic testing. This includes things like property tests, which can easily generate a few million test cases. I also install linters, formatters, pre-commit hooks, CI, and all the other tools I'd set up for a team. Typically, I need more of these tools than I'd need with an all-human team.
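A minimal sketch of the kind of property testing described above, using only the standard library's random module (dedicated frameworks such as Hypothesis add smarter generation and failure shrinking on top of the same idea): generate many random inputs and check invariants, rather than enumerating cases by hand. The function under test here is a made-up example.

```python
import random

def dedupe_preserving_order(items):
    """Remove duplicates while keeping first-occurrence order."""
    seen = set()
    # seen.add() returns None, so the `or` both records and keeps x
    # the first time it appears.
    return [x for x in items if not (x in seen or seen.add(x))]

random.seed(0)
for _ in range(10_000):  # scale this up toward millions if desired
    xs = [random.randint(-50, 50) for _ in range(random.randint(0, 30))]
    out = dedupe_preserving_order(xs)
    assert len(out) == len(set(xs))             # no duplicates remain
    assert set(out) == set(xs)                  # no elements lost
    assert dedupe_preserving_order(out) == out  # idempotent
```

Randomized invariant checks like these are a cheap way to give machine-generated code the "heavy, automatic testing" safety net described above.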

I absolutely have used Claude Code on one or two open source projects I maintain. I do read all the code carefully, and don't hesitate to scrap or rework it as necessary. This is fine, as processes go. As long as an actual human can be trusted to take responsibility for the code, these tools can be useful in specific circumstances.

(And honestly, if I were getting spammed with AI CVEs referring to non-existent functions, I would be sorely tempted to set up Claude Code as a first pass filter to draft rejection notices for me. Reporting bugs in non-existent functions would lead to a permanent ban on that submitter.)

Generative AI for PRs

Posted Nov 13, 2025 15:26 UTC (Thu) by khim (subscriber, #9252) [Link]

> It occupies roughly the same space as a diligent intern.

Can you clarify? A diligent intern is someone you spend time on today in the hope of getting a good colleague in the future.

I fail to see how spending your time on an LLM can be a good time investment.

> If you give it clear instructions, achievable goals and sufficient feedback, it can write a lot of OKish code quickly.

There's a problem with that idea: it's actually faster to write the code yourself than to give clear instructions, achievable goals, and sufficient feedback, unless your code includes a bazillion more-or-less identical copy-pasted lines.

And if your code does include a bazillion copy-pasted lines, then the proper fix is to eliminate that duplication, not to bring an intern or an LLM into your project.

I guess both an intern and an LLM can be useful for writing tests, where some copy-paste may be acceptable to keep each individual test simple in isolation… but that's it.

> This includes things like property tests, which can easily generate a few million test cases.

And how is that an improvement? Every time I need to pass VK-GL-CTS I want to cry, because there are a million tests whose only effect is to report tens of thousands of errors when one mistake is made. So I need to spend a thousand times more resources to get the exact same result.

And it's like that everywhere: most of the "improvements" from the use of LLMs are a net negative. They do and improve things that were not needed in the first place.

If, once in a while, there is something nice in the pile of generated garbage, that's a good thing, sure, but most of the time it's just garbage!


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds