
Vulnerabilities in various GTK-based PDF readers

Michael Catanzaro has disclosed a command-injection vulnerability affecting a number of GTK-based PDF readers; exploits included:

They contain a script for building malicious polyglot PDFs that are simultaneously both valid PDF files and also valid ELF binaries. When the user opens the PDF in the PDF viewer and clicks on a malicious link embedded in the PDF, the PDF abuses the command injection vulnerability to load itself as a GTK module using the `--gtk-module` command line flag. It can then execute arbitrary code via its library constructor. That flag was removed in GTK 4, which is why the vulnerability is much less serious for Papers than it is for Evince, Atril, and Xreader.



A careful programmer...

Posted May 22, 2026 3:56 UTC (Fri) by ccchips (subscriber, #3222) [Link]

should be able to use an AI to suggest fixes. A not-so-careful programmer could allow the AI to fix the problem itself and make mistakes!

A hurdle for the attacker?

Posted May 22, 2026 5:56 UTC (Fri) by rrolls (subscriber, #151126) [Link] (4 responses)

While I don't want to understate the severity of this flaw, I think it's worth pointing out something, if I'm reading the writeup correctly: the malicious PDF has to contain its own absolute path for the exploit to work.

That would usually mean having to guess/know the victim's username if it's going to be delivered via a browser download (as it'd likely end up in /home/username/Downloads), and any delivery that gives it a random temp name will not work out for obvious reasons. Likely still very useful for spear phishing, but perhaps we won't be seeing this one used for indiscriminate opportunistic attacks.

A hurdle for the attacker?

Posted May 22, 2026 6:06 UTC (Fri) by josh (subscriber, #17465) [Link]

Wouldn't have to be full spearphishing. If you know the user's name, or GitHub handle, or similar information, you could guess some likely possibilities for their local username. Many sites might have enough information to guess that.

"Try my fun tool to generate artwork in a PDF. Create an account to try the advanced features."

A hurdle for the attacker?

Posted May 22, 2026 8:14 UTC (Fri) by cjwatson (subscriber, #7322) [Link]

Do any of the PDF readers keep a file descriptor open on the PDF? If so, perhaps /proc/self/fd/... would also work.

A hurdle for the attacker?

Posted May 22, 2026 14:02 UTC (Fri) by mcatanzaro (subscriber, #93033) [Link] (1 responses)

Unfortunately that's the first version of the exploit. The follow-up version removes that requirement using %f, which works because the command line is actually processed as a desktop file Exec line.

A hurdle for the attacker?

Posted May 22, 2026 20:16 UTC (Fri) by hvd (guest, #128680) [Link]

Even with the first version, although in a technical sense it may have needed to be an absolute path, in a practical sense it did not: if a malicious file exploit.pdf knows it will likely reside at /home/user/Downloads/exploit.pdf and likely be opened by an application with /home/user as the current working directory, it can force an access to /proc/self/cwd/Downloads/exploit.pdf and not need to know the user's name.
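
As a small illustration of the mechanism described here: on Linux, /proc/self/cwd is a symbolic link that resolves to the working directory of whichever process opens it, so a path built on it contains no username. This is only a sketch; the helper name cwd_via_proc is made up for the example, and it returns None where procfs is unavailable.

```python
import os


def cwd_via_proc():
    """Return the directory /proc/self/cwd points to, or None without procfs."""
    link = "/proc/self/cwd"
    if not os.path.islink(link):
        return None  # not Linux, or /proc is not mounted
    return os.path.realpath(link)


# On Linux these two lines should print the same directory.
print(cwd_via_proc())
print(os.path.realpath(os.getcwd()))
```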

With hindsight, it was a code smell anyway

Posted May 22, 2026 6:34 UTC (Fri) by epa (subscriber, #39769) [Link] (8 responses)

The exploit is ingenious but an audit of the code base, by hand or perhaps with some simple heuristics, would surely have spotted this:
g_string_append_printf (cmd, " --page-label=%s",
     ev_link_dest_get_page_label (dest));   // unquoted
String pasting like that should always be protected by paranoia about the characters contained (in a scripting language, I would make a regular expression matching known safe strings), and that it’s obviously constructing a shell command line is an even bigger red flag.

Perhaps it would make more sense to run the AI over the codebase looking for this kind of problem: code which is wrong in principle, and file bugs for that. The code could be quietly tightened up over a few months. Only if the project refuses to fix the bug (and it is a bug, anyway, if a filename containing shell characters does something weird) would it be necessary to demonstrate an exploit.
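
A rough sketch of why the unquoted append matters, using Python's shlex module as a stand-in for the command-line parser. The program name "viewer" and the label value are invented for illustration, and shlex follows POSIX shell splitting rules, which only approximates how the affected readers parse the line.

```python
import shlex


def build_unquoted(label):
    # Mirrors the pattern in the snippet above: the label is pasted in verbatim.
    return "viewer --page-label=" + label


def build_quoted(label):
    # shlex.quote wraps the value so that it stays a single word when split.
    return "viewer --page-label=" + shlex.quote(label)


# Hypothetical attacker-chosen page label containing a space and an extra flag.
hostile_label = "1 --gtk-module=/tmp/example.so"

print(shlex.split(build_unquoted(hostile_label)))
print(shlex.split(build_quoted(hostile_label)))
```

The first list has three elements, with the injected flag as an argument of its own; the second has two, because the whole label stays inside --page-label.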

With hindsight, it was a code smell anyway

Posted May 22, 2026 6:37 UTC (Fri) by epa (subscriber, #39769) [Link]

P.S. I do take the point that with widely accessible LLMs there is not much point in embargoes. Someone else could find the same exploit. I guess what I’m saying is that all of these code smells ought to be fixed, not just those for which we have managed to find an exploit so far.

With hindsight, it was a code smell anyway

Posted May 22, 2026 16:49 UTC (Fri) by NYKevin (subscriber, #129325) [Link] (6 responses)

I would go further. There is way too much use of the shell to do things that you could just as easily do by hand. In my book, there are only two legitimate reasons to invoke the shell:

* You are implementing a launcher or the like, and passing a user-input command line directly to the shell with little or no modification.
* You are executing a shell script (possibly one that you wrote), and that shell script has gone through the usual linters (shellcheck and whatnot).

The following cases are not legitimate:

* system(3) is more convenient than fork(2)/execlp(3) (get over it and write the extra 5-10 lines of code).
* You have some convoluted command line and don't want to do it all by hand (write a very short shell script instead).
* Nobody told you that execlp(3) exists, and you're just following what the LLM says (I'm willing to forgive this, but not excuse it - please educate yourself before you write code that will be used by other people).
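
For readers who have not seen the pattern, here is a minimal sketch of the fork/exec/wait sequence, written in Python for brevity (os.waitstatus_to_exitcode needs Python 3.9 or later). The helper name run_argv is hypothetical. It deliberately leaves out the signal handling that system(3) also performs, so treat it as an illustration rather than a drop-in replacement.

```python
import os
import sys


def run_argv(argv):
    """Run argv[0] with the given argument list, bypassing the shell entirely."""
    pid = os.fork()
    if pid == 0:
        # Child: replace the process image; argv is passed through untouched.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)  # conventional "command not found" status
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


# The child exits with the number of arguments it received.
# "a b; c" arrives as one argument, since no shell ever splits or interprets it.
child_code = "import sys; sys.exit(len(sys.argv))"
print(run_argv([sys.executable, "-c", child_code, "a b; c"]))
```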

With hindsight, it was a code smell anyway

Posted May 23, 2026 9:48 UTC (Sat) by MarcB (subscriber, #101804) [Link] (1 responses)

> I would go further. There is way too much use of the shell to do things that you could just as easily do by hand.

This is, unfortunately, true. I suspect some languages, and their communities, dragged countless young developers to the dark side. PHP in particular is horrible in this regard. AFAIK, it only added a way to spawn processes without a shell to its standard library with PHP 7.4 (2019!) - and still in a horrible way.
The advice to new developers usually was "use escaping!", which is a fool's errand given the potential of non-standard shells and the existence of Windows. Those developers then carried those patterns to other languages.

As a positive example, Perl makes it very easy to get it right - and young developers usually are also taught to do it right. Assuming $cmd is not attacker controlled, this is safe in Perl on a POSIX system: system($cmd, @args). DOS-lineage systems may need the slightly annoying system { $cmd } ($cmd, @args). POSIX systems can also use this; they have to if they want to override ARGV0.

This is basically fork() + execlp() + wait().

With hindsight, it was a code smell anyway

Posted May 23, 2026 13:57 UTC (Sat) by dskoll (subscriber, #1630) [Link]

Tcl's exec command also avoids invoking the shell, instead constructing its own pipelines with fork/execve/dup2

With hindsight, it was a code smell anyway

Posted May 23, 2026 21:36 UTC (Sat) by ppisa (subscriber, #67307) [Link]

By the way, why not use posix_spawn() or posix_spawnp() directly? They remove the need for fork, can close what should be closed, and the second one even takes PATH into account.
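
As a sketch of what that looks like, Python exposes the same call as os.posix_spawnp (Python 3.8 or later, on platforms that provide it). The wrapper name spawn_and_wait is made up for this example; descriptors to close would be passed through the file_actions parameter, which is not used here.

```python
import os
import sys


def spawn_and_wait(argv):
    """Start argv[0] via posix_spawnp (PATH lookup, no shell); return its exit code."""
    pid = os.posix_spawnp(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


# The argument list is handed to the child as-is; nothing is parsed by a shell.
print(spawn_and_wait([sys.executable, "-c", "raise SystemExit(5)"]))
```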

With hindsight, it was a code smell anyway

Posted May 24, 2026 14:11 UTC (Sun) by epa (subscriber, #39769) [Link]

Agreed. Not invoking the shell for anything is a good rule. But it can also be expressed in more black-box terms. If the test suite for all programs routinely tested operation using “bad” filenames, a lot of shell-injection bugs would be caught, though perhaps not all of them (and probably not this one).

system() is indeed a banana skin. Perl provides a second form of that call which takes a list of arguments and doesn’t go via the shell. That could be carried back into the C library. But even with good hygiene making the command, you might still get tripped up by the program’s own argument parsing. A filename beginning - could be interpreted as an option. Libraries like GTK can add their own peculiar command line options, as happened here. So even without shell interpretation, running anything at all is tricky enough to get right.
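
The option-parsing trap can be sketched without any shell at all. Below, Python's argparse stands in for the target program's parser; the program name, the stand-in --gtk-module option, the filename, and the guard_path helper are all assumptions made for the example. Whether a real program honors "--" depends on that program.

```python
import argparse


def make_parser():
    parser = argparse.ArgumentParser(prog="viewer")
    parser.add_argument("--gtk-module")  # stand-in for a library-provided option
    parser.add_argument("files", nargs="*")
    return parser


def guard_path(path):
    """Prefix a relative path that starts with '-' so it cannot look like an option."""
    return "./" + path if path.startswith("-") else path


hostile_name = "--gtk-module=payload.so"  # hypothetical filename

# Passed bare, the "filename" is consumed as an option.
plain = make_parser().parse_args([hostile_name])
# After "--", everything is treated as a positional argument.
separated = make_parser().parse_args(["--", hostile_name])
# With a "./" prefix, the name no longer starts with a dash.
guarded = make_parser().parse_args([guard_path(hostile_name)])

print(plain.gtk_module, plain.files)
print(separated.gtk_module, separated.files)
print(guarded.gtk_module, guarded.files)
```

Neither guard helps when the attacker controls text that lands before the separator, which is closer to what happened in this case.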

With hindsight, it was a code smell anyway

Posted May 24, 2026 15:22 UTC (Sun) by iabervon (subscriber, #722) [Link]

This code is using g_app_info_create_from_commandline(), not system(), and there doesn't seem to be an execlp()-like equivalent available. That does limit how much a mistake can mess up the parsing, but not enough to avoid passing arbitrary arguments to a program that's not designed for that.

With hindsight, it was a code smell anyway

Posted May 25, 2026 18:07 UTC (Mon) by turistu (guest, #164830) [Link]

> system(3) is more convenient than fork(2)/execlp(3) (get over it and write the extra 5-10 lines of code).

It's not just fork + execlp. It's also wait, SIGINT, SIGQUIT, SIGCHLD blocking / ignoring (then restoring), waiting for the exit status, etc. Most programmers get all those details wrong 90% of the time.

Also notice that your execlp() may end up calling the shell, too, if execve fails with ENOEXEC (different from plain execl()).

A better idea would be to standardize a function which takes a list of arguments (like execve), and does everything system(3) does *except* ever calling a shell. Just like the "system" from perl when called with multiple arguments.

> You have some convoluted command line and don't want to do it all by hand (write a very short shell script instead).

That's just kicking the can down the alley. Your "very short" shell script will most certainly be exploitable via its arguments. Nobody knows how to write "secure" shell scripts when their arguments are controlled by a hostile third party.

Browser

Posted May 22, 2026 8:57 UTC (Fri) by patrick_g (subscriber, #44470) [Link] (9 responses)

What's the point of Evince or Atril when you can open all pdf files directly in your browser?
The firefox viewer (pdf.js) is sandboxed.

Browser

Posted May 22, 2026 10:16 UTC (Fri) by grawity (subscriber, #80596) [Link]

Originally – they predate the (built-in) ability to open pdf files in-browser by a few years. (Not counting the Acrobat plugin for Netscape...)

These days – I'd say that "not in browser" is very much the point in itself. Both Firefox and Chrome make decent PDF readers nowadays, sure, but sometimes you want a standalone app in principle, or it doesn't have the performance overhead, or you want to open something temporarily *in a different context* and not get lost in your 100 open tabs. (Like how image viewers exist even though browsers can display images, or how sometimes you 'cat' or 'less' a file instead of opening it in your full IDE session.)

Browser

Posted May 22, 2026 10:27 UTC (Fri) by tux3 (subscriber, #101245) [Link] (1 responses)

pdf.js is very nice and featureful, but it is "stateless".
It won't remember what page number you were reading, you can't leave little bookmarks in your ebooks or manuals and come back later.

You can annotate the PDF if you save your changes, but outside of the file itself pdf.js has amnesia.

Browser

Posted May 22, 2026 13:59 UTC (Fri) by vasvir (subscriber, #92389) [Link]

In that tangent,

You can have bookmarks (highlights), annotations (notes), and collections of them, in pages but also in PDFs, in a collaborative environment (comments, users, and groups) with https://web.hypothes.is/

I found it useful in a research setting where you have to read a number of PDFs and later can't remember where is what.

It's a very cool project that is not widely known, so I thought I would bring it up. I am not affiliated with them in any way.

Browser

Posted May 22, 2026 10:37 UTC (Fri) by parametricpoly (subscriber, #143903) [Link]

pdf.js is not very efficient for complex PDF files. I have rather fast machines (e.g. a Ryzen 5000 series laptop and desktop), but Evince still feels a lot faster. MuPDF is much, much faster and lighter than Firefox + pdf.js.

Browser

Posted May 22, 2026 16:16 UTC (Fri) by atai (subscriber, #10977) [Link] (4 responses)

a local native viewer is always lighter than a full browser

Browser

Posted May 23, 2026 5:23 UTC (Sat) by ametlwn (subscriber, #10544) [Link] (3 responses)

even if $browser is already running?

Browser

Posted May 24, 2026 1:39 UTC (Sun) by jmalcolm (subscriber, #8876) [Link] (2 responses)

Probably. Even individual browser tabs are far from lightweight.

Browser

Posted May 25, 2026 1:43 UTC (Mon) by k8to (guest, #15413) [Link] (1 responses)

The PDF reader I use at least typically consumes less than 10MB, while a browser tab is usually over 100MB. They probably make very different tradeoffs.

Browser

Posted May 25, 2026 5:05 UTC (Mon) by DemiMarie (subscriber, #164188) [Link]

Chromium’s PDF viewer is written in C++. No DOM involved.

important thing

Posted May 22, 2026 17:49 UTC (Fri) by atai (subscriber, #10977) [Link]

any news on security updates for these viewers?

Be careful what you wish for

Posted May 23, 2026 9:43 UTC (Sat) by cyperpunks (subscriber, #39406) [Link] (3 responses)

It's worth noting that .note.gnu.build-id, which is a kind of security feature (it creates a link between the source code/build process and the final binary), is used here to make a standard ELF binary a valid PDF file.

Be careful what you wish for

Posted May 23, 2026 19:20 UTC (Sat) by chmod (subscriber, #169510) [Link] (2 responses)

I wouldn't say that the .note.gnu.build-id has anything to do with security. It is intended for debugging and profiling to have a key to lookup ELF binaries/debuginfo/sources, e.g. it can be used to query debuginfod. Even in the standard use-case, it is not cryptographically tied to the ELF content, it can be random or user-controlled, e.g.
echo 'int main() { return 42; }' |gcc -xc - -Wl,--build-id=0x0123456789012345678901234567890123456789
From what I have understood, the only requirement is to place 9 "magic" bytes (%PDF-1.4\n) in the first 1024 bytes of the ELF/PDF. I guess there are plenty of other possibilities aside from the build ID, like other notes or just between ELF segments.
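
To make the header rule concrete, here is a minimal detection sketch. It assumes the rule exactly as stated in this comment (a PDF marker anywhere in the first 1024 bytes), which may not match what every reader accepts; the function name detect_formats and the sample byte strings are made up for the example.

```python
ELF_MAGIC = b"\x7fELF"  # ELF files must start with these four bytes
PDF_MAGIC = b"%PDF-"    # PDF header marker, followed by a version number
PDF_WINDOW = 1024       # search window taken from the comment above, not from a spec


def detect_formats(data):
    """Return the set of formats whose magic bytes appear where a parser would look."""
    formats = set()
    if data.startswith(ELF_MAGIC):
        formats.add("elf")
    if PDF_MAGIC in data[:PDF_WINDOW]:
        formats.add("pdf")
    return formats


# A header that satisfies both checks at once: ELF magic first, PDF marker later.
polyglot_header = ELF_MAGIC + b"\x00" * 100 + b"%PDF-1.4\n"
print(sorted(detect_formats(polyglot_header)))
```

A file scanner could flag any input for which both formats are reported, since an ordinary document has little reason to start with ELF magic.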

Be careful what you wish for

Posted May 24, 2026 9:49 UTC (Sun) by adobriyan (subscriber, #30858) [Link] (1 responses)

> the only requirement is to place 9 "magic" bytes (%PDF-1.4\n) in the first 1024 bytes of the ELF/PDF

What's the logic behind _not_ placing magic header in the beginning of the file like ELF does?

Be careful what you wish for

Posted May 24, 2026 10:59 UTC (Sun) by pizza (subscriber, #46) [Link]

> What's the logic behind _not_ placing magic header in the beginning of the file like ELF does?

Just a guess, but probably so PDF files wrapped in PJL (and the like) JustWork(tm).


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds