How to fix the whole catagory of shell injection

Posted Dec 6, 2024 19:51 UTC (Fri) by excors (subscriber, #95769)
In reply to: How to fix the whole catagory of shell injection by raven667
Parent article: Abusing Git branch names to compromise a PyPI package

As far as I can tell, shell expansion is not relevant in this case:

GitHub Actions workflows are defined in YAML files, where you can use syntax like ${{github.ref}} in values. Those will be expanded by the action runner when it executes the workflow. (https://docs.github.com/en/actions/writing-workflows/choo...)

One of the values in the YAML file is a shell script to execute. If you use ${{github.ref}} in there, the attacker-provided branch name will be inserted before the script is passed to the shell. The attacker doesn't even need to use $(...) in the branch name, they can simply use "; to exit from the current shell command and start a new one, and the shell sees a totally normal sequence of basic commands. (See PoC in https://github.com/advisories/GHSA-7x29-qqmq-v6qc , which is a different but similar vulnerability in exactly the same project a few months ago.)

I think the specific vulnerability here is fixed in https://github.com/ultralytics/actions/commit/89c1f380073... by moving all the ${{...}} out of the shell script block and into the 'env' block where they can't do any harm (although actually the branch name was removed entirely; presumably they didn't need it at all).

GitHub warns about this problem at https://docs.github.com/en/actions/security-for-github-ac...

How to fix the whole catagory of shell injection

Posted Dec 6, 2024 20:37 UTC (Fri) by wahern (subscriber, #37304) [Link] (2 responses)

This highlights the real problem, mixed data and code. Emphasis on sanitization seems misguided. Sanitization has value in the sense of defense in depth, but it's the wrong way to think about things. Where ever data and code are intermixed, the proper fix is always to separate them, preferably before an exploit is ever identified. Too many people jump straight to the sanitization issue, but not only does it not address the root problem, often it's a much thornier problem to resolve. In this case, as in most, you don't have to decide what characters should be valid in a branch name to solve an issue like this. More over, if you don't fix the issue properly, you leave the door open--maybe you got the character filter wrong, or maybe the semantics of evaluation changed (or were misdiagnosed) so it doesn't rely on special characters.

How to fix the whole catagory of shell injection

Posted Dec 6, 2024 21:14 UTC (Fri) by jwarnica (subscriber, #27492) [Link]

The SQL world mostly figured this out decades ago (if not always used) with parameters, basically making injection attacks impossible.

How to fix the whole catagory of shell injection

Posted Dec 7, 2024 11:02 UTC (Sat) by epa (subscriber, #39769) [Link]

The attacker only has to be lucky once. You need to sanitize input because there will be a mistake somewhere in the separation of data and code. And you need to separate data and code because there will be missing sanitation somewhere. As you say, defence in depth: but that’s the only possible approach in a system that might be exposed to a public network.

But even without any attacker, constraining inputs is still good practice, I feel. It saves you time debugging later when a stray space character got into a value and (even though you were not mixing data and code) no longer compared equal to the value without a leading space. Something like a git branch name (or a username) is not the place to allow the full flowering of human creativity with special characters. There are plenty of commit message and comment fields for that.

How to fix the whole catagory of shell injection

Posted Dec 7, 2024 5:59 UTC (Sat) by raven667 (subscriber, #5198) [Link] (2 responses)

yeah, I just read through those details and that matters a bit in this case, if you are using another template system to programmatically create a shell script then that provides more avenues for code injection. In this specific case, wouldn't it make sense for GitHub to scan actions yaml files and flag any `run:` sections which have `${{` variable expansions in them and at least link to their warning document.

What if multiple commands on a single line using `;` was also a flagged feature and not enabled by default or able to be disabled? I'm not smart enough to figure out a whole system on my own but I keep thinking we should find something as effective as SQL bind variables are in stopping unexpected command interpretation that can be applied to shell scripting, maybe the wholesale separation of the shell implementation used for interactive CLIs and for scripting and automation, to make the little bits of automation that are sprinkled _everywhere_ safer.

How to fix the whole catagory of shell injection

Posted Dec 7, 2024 10:02 UTC (Sat) by NYKevin (subscriber, #129325) [Link] (1 responses)

The proper solution is to stop trying to interact with shells programmatically. Everything you can do with a shell can also be done with your favorite non-shell programming language. There are only two "good" reasons to use a shell:

* Interactive use by a human.
* Scripting of things that would otherwise be done interactively by a human (as a timesaving measure for human sysadmins, not as something you programmatically invoke in production).

Everything else is an antipattern, where the shell has been co-opted to replace a missing feature in another program (often using the shell's -c flag).

How to fix the whole catagory of shell injection

Posted Dec 7, 2024 20:15 UTC (Sat) by raven667 (subscriber, #5198) [Link]

I think that is where my half-formed thought was going, in this specific case the commands could just be an array of yaml strings and the ci could parse into an argv before applying template variable replacement. That wouldn't work when you have some if/then logic but a shell-like dsl with a very restricted feature set could make transition easier, or just building expression evaluation into the yaml along with your template engine, similar to how Ansible has Python/Jinja `when:` expressions, there is some customization/integration to be sure but they didn't have to invent a whole custom system, and experience with the library transfers between different tools in the same ecosystem.

I'm just frustrated at seeing this same kind of problem again

How to fix the whole catagory of shell injection

Posted Dec 7, 2024 10:57 UTC (Sat) by fishface60 (subscriber, #88700) [Link]

> moving all the ${{...}} out of the shell script block and into the 'env' block where they can't do any harm

I see this bug everywhere, these yaml templating engines should require a schema define how to escape interpolated values (e.g. like ninja shell escapes everything in a command context) or that part of your data format doesn't support interpolation.