Treating Python's debugging woes
Debugging in Python is not like it is for some other languages, as there is no way to attach a debugger to a running program to try to diagnose its ills. Pablo Galindo Salgado noticed that when he started programming in Python ten years ago or so; it bugged him enough that he helped fill the hole. The results will be delivered in October with Python 3.14. At EuroPython 2025, he gave a characteristically fast-paced and humorous look at debugging and what will soon be possible for Python debugging—while comparing it all to medical diagnosis.
When he started with Python, he came from the compiled-language (C, C++,
and Fortran) world, where you can attach a debugger like GDB to a running
program. That would allow stopping the execution, poking around to see
what the program is doing, then letting it continue to execute. Python has
the pdb debugger, but when he asked around about why it could not attach
to running programs like GDB does, people said "Python does not work like
this". Ten years later, "now I am here to tell you 'yeah, it actually
works like this'", he said with a laugh.
Medicine
He showed a picture of a magnetic-resonance-imaging (MRI) machine, noting
that it was an amazing piece of equipment that can be used to look inside
a person to see how well they are working. He is a physicist, so he needed
to study how they work; they use magnetic fields, "which are just light".
So they are technically using light to "get a precise map of what is wrong
with you without actually even cutting you open". MRI machines produce
enormous magnetic fields, much larger than Earth's; it is simply amazing
that humans can "produce magnetic fields that are 60,000 times the
magnetic field of a planet". We use them "to charge you a lot of money and
try to find out what is wrong with you".
Unfortunately, the Python debugging experience is not like an MRI machine;
it is instead more like the Hieronymus Bosch painting Cutting the Stone
(seen at left). The painting depicts a hapless patient having their skull
opened up by a "doctor" in medieval times. Galindo Salgado renamed the
painting: "Two senior engineers and a manager debugging a live
application". The manager was easily spotted as the one with the book on
their head "because he looks like he is helping, but he is not", he said
with a laugh.
He asked his manager whether he could use the joke; the manager admitted
it was pretty funny. They looked into the background of the painting and
found out that Bosch actually intended that figure in a similar way: it is
meant to represent the church, which is trying to help but does not know
how. Jokes aside, the Python debugging experience is "kinda bad", Galindo
Salgado said.
Debugging a complicated application may require restarting many different
components after adding some debugging output and hoping that the problem
happens again. It would be like killing a sick patient, resurrecting them
on the exam table, and hoping they get the same sickness so it can be
observed. "You laugh because it sounds stupid, because it is stupid." With
Python 3.14, things will be much better, and not just for pdb; "this is
going to open a new field of tools that can do very cool things with the
Python interpreter".
Debugging
Attaching pdb to a running Python program is not possible, prior to Python 3.14, but one can attach a native debugger, like GDB, to the running interpreter. On Linux, ptrace() is used by GDB to attach to a process; macOS and Windows have similar facilities. The target process stops executing once the debugging process attaches to it, so the debugger can use other ptrace() calls to retrieve information (e.g. register contents) about the target. From the register values, the debugger can determine where the program is executing, what the values of local variables are, and so on.
Beyond that, other ptrace() calls can be used to examine and modify the
memory of the program. That includes the registers, so the debugger can,
in principle, change the instruction pointer to execute a different
function. At that point, the ptrace() continue request (PTRACE_CONT) can
be used to cause the program to start executing again. "And the whole
thing explodes."
It explodes for various reasons, but the basic problem is that the running
interpreter is not prepared for that kind of manipulation. For example,
malloc() has a lock, so if the interpreter was trying to allocate memory
when it was interrupted, a subsequent call to malloc() will deadlock. In
Python there are other locks ("actually we have a big one", he said with a
grin) that are similarly affected. "Python is fundamentally unsynchronized
with this mechanism", so GDB cannot be used to debug that way.
The Memray memory profiler that he works on does attach to running Python
programs, but it has a long list of functions that it sets breakpoints
for. When a breakpoint in, say, malloc() is triggered, it may be safe to
manipulate the program. "The list is like 80 functions long." The
technique is fragile, since missing one or more functions may result in an
explosion, which is particularly bad when working in production. "All of
this is horrible, it's just disgusting."
What is needed is for the interpreter to be able to tell the debugger "now
it is safe to attach".
One could imagine a debugger that sends bytecode to the target program to
debug at the Python level. But GDB and other native debuggers do not speak
bytecode; "these two processes are basically speaking two different
languages". One of them uses and recognizes the C language calls in the
interpreter, while the other uses Python in the target program, so some
other technique is required.
It turns out, Galindo Salgado said, that various profilers use the
process_vm_readv() system call to access the memory of a target process.
It allows reading memory, based on an address and length, without even
stopping the target process. After learning about that call, he started
thinking about the movie Inception, and wondered if you can put something
into memory as well. He found the process_vm_writev() call, which allows
just that; being able to write is "where the fun starts". With those calls
(and their equivalents for other operating systems), an interface for
safely debugging running Python programs can be developed.
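For a sense of what those calls look like from Python, here is a minimal
sketch, assuming a Linux system and a caller with ptrace-level permission
on the target; it is not code from Memray or CPython, and the pid,
address, and length would come from the target's memory layout in
practice.

    # Sketch only: read `length` bytes at `address` from process `pid`
    # using process_vm_readv(), called through ctypes on Linux.
    import ctypes

    class iovec(ctypes.Structure):
        _fields_ = [("iov_base", ctypes.c_void_p),
                    ("iov_len", ctypes.c_size_t)]

    libc = ctypes.CDLL(None, use_errno=True)
    libc.process_vm_readv.restype = ctypes.c_ssize_t
    libc.process_vm_readv.argtypes = [ctypes.c_int,
                                      ctypes.POINTER(iovec), ctypes.c_ulong,
                                      ctypes.POINTER(iovec), ctypes.c_ulong,
                                      ctypes.c_ulong]

    def read_remote(pid, address, length):
        buf = ctypes.create_string_buffer(length)
        local = iovec(ctypes.addressof(buf), length)
        remote = iovec(address, length)
        nread = libc.process_vm_readv(pid, ctypes.byref(local), 1,
                                      ctypes.byref(remote), 1, 0)
        if nread < 0:
            raise OSError(ctypes.get_errno(), "process_vm_readv() failed")
        return buf.raw[:nread]

Writing works the same way with process_vm_writev(), which is what lets a
debugger place data into the target without stopping it.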
PEP
He authored PEP 768 ("Safe external debugger interface for CPython") with
two of his Bloomberg colleagues: Matt Wozniski and Ivona Stojanovic. It is
a complicated PEP, Galindo Salgado said, because it covers security
implications and "all sorts of different boring things, unless you are
into security, in which case I'm very sorry for you".
The first step is for the debugger to call a new function in the sys module called remote_exec(); it takes a process ID and the file name of a script to be run by the target process. The target Python program needs to find a safe point when it can run the script, or else the result will be the explosions he mentioned earlier.
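As an illustration of the kind of script that might be passed in, the
following hypothetical /tmp/dump_stacks.py (not an example from the talk)
simply dumps the stack of every thread in the target process:

    # Hypothetical script to be run *inside* the target process via
    # sys.remote_exec(); prints the current stack of every thread.
    import faulthandler
    faulthandler.dump_traceback(all_threads=True)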
The interpreter main loop, which steps through the bytecode of the program as it executes, has an "eval breaker" that is checked periodically to handle certain events, such as signals. A ctrl-c is not immediately processed by CPython because the interpreter cannot just be interrupted anywhere; the CPython core developers also found out, painfully, that the garbage collector needs to be restricted to only running when the eval breaker is checked, he said. That makes for a safe place to run the script that gets passed to remote_exec().
Starting with 3.14, all CPython processes will have a new array to hold the script name that remote_exec() will run. The eval breaker will check a flag to see if it should execute the file and, if so, it will run the code. The trick is that the remote_exec() call, which is running in the debugging Python process, needs to be able to find the array in the memory of the target Python process so that the script name can be copied there.
The key to that is finding the PyRuntime structure "that contains all the
information about the entire interpreter that is running"; it also
includes information about any subinterpreters that are active. Since
CPython 3.11, a symbol has been placed into the binary as an ELF section
(_PyRuntime) that contains the offset of the structure from the start of
the binary. He suggested using "readelf -h" on the binary to see it, but
it is not present in the Python 3.13 binary on my Fedora 42 system.
That offset is not sufficient to find the structure, however, due to address-space-layout randomization (ASLR), which changes the address where the CPython interpreter gets loaded each time it is run. Figuring out that address is different for each platform (though Windows does not do ASLR for reasons unknown to him, he said); for Linux, the /proc/PID/maps file gives all of the information needed (see our report from a PyCon US talk that included information about processing the file from Python). From that, the address of the interpreter binary can be extracted; adding the offset of the structure results in the address of the target process's PyRuntime structure.
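A rough sketch of that lookup on Linux, assuming the interpreter path in
the maps file simply contains "python" (the real code has to be more
careful, for example when Python is embedded or built as a shared
libpython), might look like:

    # Sketch only: find the load address of the interpreter binary for a
    # process by parsing /proc/PID/maps; the PyRuntime offset from the ELF
    # section would then be added to this base address.
    def interpreter_base_address(pid):
        with open(f"/proc/{pid}/maps") as maps:
            for line in maps:
                # Format: start-end perms offset dev inode pathname
                fields = line.split()
                if len(fields) >= 6 and "python" in fields[5]:
                    return int(fields[0].split("-")[0], 16)
        raise RuntimeError("no python mapping found")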
At that point, process_vm_readv() can be used to look at the structure; the array for the script name is not contained there, but there is a _Py_DebugOffsets structure that contains information that will allow the debugging Python process to correctly access objects in the memory of the target program. A new _debugger_support structure has been added to _Py_DebugOffsets with an offset to the actual array where the script name can be copied; it is an offset from a thread-specific structure because each thread can be debugged separately. He quickly went through the path needed to find the interpreter and thread state; from there, the place to write the script name can be found.
He showed some code from the interpreter for the eval breaker, with the handling for the new remote-debugger script added. Once the script name has been written to the proper location, a flag is set in the structure, which is checked by the eval breaker. If it is set, the audit system is consulted (PySys_Audit()) and, if execution is allowed, PyRun_AnyFile() is called on the open file. It took a lot to get all of that working, but it has been done at this point, so users can simply run:
sys.remote_exec(1234, '/tmp/script.py')
He gave a quick demo of running a Python program in one window and, in another, doing:
$ python -m pdb -p PID
That stopped the other program and gave him a "(Pdb) " prompt, where he
could get a Python stack backtrace with "bt", step through with the next
command ("n"), and so on. "Awesome, it only took a year of work", he said
with a laugh, to applause for the demo.
He noted that Mark Shannon likes to call the feature "remote execution as
a service", so there are ways to disable it. "There are a bunch of
increasingly nuclear options to deactivate this", Galindo Salgado said.
Though he wanted to call it PYTHON_NO_FUN_MODE, the
PYTHON_DISABLE_REMOTE_DEBUG environment variable can be set to a value in
order to disable the feature for any Python started in that environment. A
more targeted approach is to start Python with the "-X
disable-remote-debug" flag. Sites that do not want the feature available
at all can configure the Python build using the "--without-remote-debug"
option. He joked that it was a boring option; "people will not invite you
to parties and things like that".
Future
The remote_exec() call is meant to provide a building block for debuggers
and profilers in the future. While it could be used for some kind of
interprocess-communication (IPC) or remote-procedure-call (RPC) mechanism,
he warned against using it that way. Beyond debuggers and profilers,
though, he wanted to show some examples of "the tiny tools that you can
do". Something he has learned as a core developer is that building blocks
are a great way to add features, because "people are kind of weird" and
will find interesting and unexpected ways to use them. For example, there
are various uses of remote_exec() for introspection tasks in the standard
library for Python 3.14.
If a web application is having a problem, but the logging level is not showing enough to diagnose it, for example, remote_exec() can be used to change the logging level while it is running—and change it back once the problem is found. The application could have a diagnostic report of some sort that can be triggered via the remote-debugging interface. He showed a web server that dumped information about its active connections.
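The script injected for the logging case could be as small as the sketch
below; the logger name "myapp" is an assumption, and a second script would
be sent later to restore the original level:

    # Hypothetical script passed to sys.remote_exec() to turn on debug
    # logging in a running application; "myapp" stands in for the real
    # logger name.
    import logging
    logging.getLogger("myapp").setLevel(logging.DEBUG)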
Another application might be for memory-allocation debugging. Memory
profilers exist, he said, but they normally need to observe allocations as
they happen; if there is a program that is already using too much memory,
"bad luck, because the profiler has not seen anything". But with the
remote debugging, the garbage collector can be queried for all of the live
objects in the program. He showed sample reports of the object types with
the most allocated objects and of the objects with a size larger than
10MB. Trying to interpret that kind of data with a native debugger like
GDB is effectively impossible, he said.
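A simple illustration of that idea, not the tool shown in the talk, could
walk the objects that the garbage collector knows about from inside the
target process and summarize them:

    # Sketch of a live-object report to be run in the target via
    # remote_exec(): count objects by type and flag any larger than 10MB.
    import gc
    import sys
    from collections import Counter

    objects = gc.get_objects()
    counts = Counter(type(obj).__name__ for obj in objects)
    print("Most common object types:")
    for name, count in counts.most_common(10):
        print(f"  {name}: {count}")

    print("Objects larger than 10MB:")
    for obj in objects:
        size = sys.getsizeof(obj)
        if size > 10 * 1024 * 1024:
            print(f"  {type(obj).__name__}: {size} bytes")

Note that sys.getsizeof() only reports shallow sizes, so a real tool would
need to do more work; the point is that the report runs inside the live
process.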
Another example showed all of the modules that were loaded by the program. It could be extended to show the module versions as well. For a company that runs lots of Python programs, a script could poll all of the programs to gather the full picture of module use throughout production. And so on.
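A sketch of such a script follows; the __version__ attribute is optional,
so many modules will report nothing for the version column:

    # Sketch of a module-inventory script run inside the target: list every
    # loaded module and, where available, its version.
    import sys

    for name, module in sorted(sys.modules.items()):
        print(name, getattr(module, "__version__", ""))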
In conclusion
He likes to think of the feature as "the Python MRI moment" because "it is
the technology that unblocks inspection". As with an MRI machine, the
feature does not actually diagnose anything; a doctor or programmer is
needed to interpret the output. Unlike an MRI, remote_exec() allows
changing things within the patient, which "maybe is not a great idea, but
maybe you know what you are doing".
It will allow other Python debuggers, such as the one in Visual Studio
Code (VSCode), to drop "their megacode that injects crazy things" into
Python in favor of using remote_exec(). He is excited to see that, but is
also interested in what kinds of tools users come up with. Python is the
most popular programming language right now and has been cutting into
heads for too long; it is time to use the MRI machine instead, he
concluded.
The first question was about attaching multiple times to the same program,
which Galindo Salgado said can be done, but sequentially. He noted that
the script is running in the context of the program, so it can import
modules if needed, but only if they are already installed. If they are not
installed, "you could shell out to pip maybe ... whoops", he said with a
worried laugh.
Another question was about forward compatibility: would a Python 3.14
program be able to call remote_exec() for a target program running on
3.15? Is the method for finding PyRuntime and the other pieces of the
interpreter state version-specific? Galindo Salgado said that the protocol
is "technically forward-compatible", but currently the same major version
of Python needs to be used on both sides. That may change eventually, but
there are a lot of installed Pythons out there that will not work, so the
idea is to avoid user confusion. The protocol itself could be used to
implement a debugger in Rust, if desired; all of the information needed to
interpret the Python objects is provided in _Py_DebugOffsets.
The subinterpreter support in the new feature was the next question. There is no real support for subinterpreters yet, Galindo Salgado said. The remote_exec() in the standard library selects the main subinterpreter currently, but the protocol makes it possible for other implementations to choose different subinterpreters. The protocol allows stepping through the list of interpreters to pick the one to target; that may eventually be added for remote_exec() as well.
[I would like to thank the Linux Foundation, LWN's travel sponsor, for travel assistance to Prague for EuroPython.]
| Index entries for this article | |
|---|---|
| Conference | EuroPython/2025 |
| Python | Debugging |