Free software's not-so-eXZellent adventure
Posted Apr 4, 2024 11:22 UTC (Thu) by farnz (subscriber, #17727) In reply to: Free software's not-so-eXZellent adventure by epa
Parent article: Free software's not-so-eXZellent adventure
The problem with "safe linking" as a concept is indirect dependencies. I might turn off safe linking because I know that one of my dependencies uses a C++ static global; that then means that I have to (at the very least) trust it to not have dependencies that should have used safe linking, but didn't, and depending on the implementation, I might have to trust all my dependencies to not have unwanted startup code.
It'd be better to take advantage of the fact that, at its strongest, the C++ standard has required statics to be initialized before the first reference by the program to a symbol from the translation unit the static is defined in. You can, under C++ rules, delay running the initializer code until the dynamic linker has determined that a given library actually provides the desired symbol (which avoids shenanigans with weak symbols where the final link is done against a different library).
Doing this consistently (including ensuring that IFUNCs only come into play once you've resolved the IFUNC as the definition of this symbol, avoiding games with weak symbols there, too) would ensure that "merely linking" a library is harmless up until you use a symbol from that library; there's no specification that prevents this, it's just a lot of changes to make to ELF and to compilers.
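For illustration only, here is a language-level analogy to the difference between eager and deferred initialization. It is a minimal sketch, not the dynamic-linker change described above: the names `init_log`, `Noisy`, `eager`, and `lazy` are hypothetical, and the comment about `eager` running before `main` assumes a typical ELF toolchain.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: records the order in which initializers run.
// The log is a function-local static, so it exists by the time anyone uses it.
std::vector<std::string>& init_log() {
    static std::vector<std::string> log;
    return log;
}

struct Noisy {
    explicit Noisy(const char* name) {
        assert(name != nullptr);
        init_log().push_back(name);  // side effect that makes initialization visible
    }
};

// Namespace-scope object with dynamic initialization. On typical ELF
// toolchains its constructor runs at startup, before main(), whether or
// not anything ever refers to it.
static Noisy eager{"eager"};

// Function-local static: the constructor runs the first time lazy() is
// called, and never if it is not called.
Noisy& lazy() {
    static Noisy instance{"lazy"};
    return instance;
}
```

Calling `init_log()` at the top of `main` would show only `eager`; `lazy` is appended after the first call to `lazy()`. The proposal above would, roughly, make whole libraries behave like `lazy` rather than `eager`.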
Posted Apr 4, 2024 12:07 UTC (Thu)
by khim (subscriber, #9252)
[Link]
> there's no specification that prevents this
There absolutely is a specification that prevents it. It's called the System V Application Binary Interface. And the C++ standard is pretty explicit about it, too: “It is implementation-defined whether the dynamic initialization of a non-block non-inline variable with static storage duration is sequenced before the first statement of main or is deferred.”
For better or for worse, ELF has picked the “sequenced before the first statement of main” option. You can't just go and change it without defining a new platform. And if you are defining a new platform and asking everyone to adopt it anyway, then you may as well declare that this particular part of the standard doesn't apply to that new platform. But that wouldn't help apps that are written for the existing platform.
It's incredibly common for C++ programs to rely on a registry that is filled in by global constructors (think Abseil Flags), and you may expect significant pushback to such a proposal. OpenBSD may do that; they don't care about being usable by Joe Average and they may just patch all the programs they care about. Linux couldn't.
It's the definition of a new platform, first and foremost. It's not normal GNU/Linux anymore.
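To make the registry pattern concrete, here is a minimal sketch of a self-registering flag table. It is only loosely in the style of flag libraries and is not Abseil's actual API; `flag_registry`, `FlagRegistrar`, `flag_value`, and the flag names are all hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical registry. A function-local static is used so the map
// exists before any registrar touches it.
std::map<std::string, int>& flag_registry() {
    static std::map<std::string, int> registry;
    return registry;
}

// Each registrar adds one entry as a side effect of its constructor.
struct FlagRegistrar {
    FlagRegistrar(const char* name, int default_value) {
        bool inserted = flag_registry().emplace(name, default_value).second;
        assert(inserted && "duplicate flag name");
        (void)inserted;  // silence unused-variable warnings under NDEBUG
    }
};

// In real programs these objects usually live in other translation units
// or libraries, and no code ever refers to them by name.
static FlagRegistrar register_verbose{"verbose", 0};
static FlagRegistrar register_jobs{"jobs", 4};

// Lookup is by string, as a command-line parser would do it.
int flag_value(const std::string& name) {
    return flag_registry().at(name);
}
```

The lookup goes through a string, not a symbol, so nothing forces a reference to the object that performs the registration. If initialization were deferred until the first use of a symbol from the defining library, a flag defined in an otherwise untouched library would simply be missing from the registry.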
Posted Apr 4, 2024 17:22 UTC (Thu)
by epa (subscriber, #39769)
[Link] (3 responses)
Posted Apr 4, 2024 17:27 UTC (Thu)
by farnz (subscriber, #17727)
[Link] (2 responses)
That opens up a different problem; if only the main binary can indicate dependencies (so no indirect dependencies), how do you remove your liblzma dependency when libsystemd stops depending on it? How do you cope when a future version of libsystemd adds a dependency on libzstd2?
Remember that we're talking about the dynamic linker at this point, not the static linker - so you can't rely on the version of libsystemd you used at compile time being the one used at runtime.
Posted Apr 4, 2024 18:54 UTC (Thu)
by epa (subscriber, #39769)
[Link] (1 responses)
In C programming, there are those who argue that header files should not include other header files. Rather, if a header needs something then each user should #include its dependencies explicitly. I don’t have a view on the wisdom of that for C development but it’s a similar idea to what I am suggesting at the linking step.
Perhaps given a difference between “safe linking”, which can only define new symbols and not replace them or execute arbitrary code at startup, and the full-fat kind that can do things merely by including a library, transitive dependencies could be included by default if they are “safe”, but if you have a dependency of a dependency which wants to have ifunc hooks or startup code, you must explicitly ask for it or the linker will complain.
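As a rough illustration of that rule, here is a toy model of the check such a linker might perform. It is a sketch under stated assumptions, not the behavior of any existing linker: the `Library` type, the `has_startup_code` flag, and the `unapproved` function are all hypothetical.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy description of a library (hypothetical model).
struct Library {
    std::vector<std::string> deps;  // direct dependencies
    bool has_startup_code;          // constructors, IFUNC hooks, symbol overrides
};

// Walks the transitive dependencies of `root` and returns every library
// that has startup code but was not explicitly allowed by the main binary.
std::set<std::string> unapproved(const std::map<std::string, Library>& libs,
                                 const std::string& root,
                                 const std::set<std::string>& allowed) {
    assert(libs.count(root) == 1 && "root must be a known library");
    std::set<std::string> seen, rejected;
    std::vector<std::string> stack{root};
    while (!stack.empty()) {
        std::string name = stack.back();
        stack.pop_back();
        if (!seen.insert(name).second) continue;  // already visited
        auto it = libs.find(name);
        if (it == libs.end()) continue;           // unknown library: ignored in this toy model
        if (name != root && it->second.has_startup_code && allowed.count(name) == 0)
            rejected.insert(name);
        for (const auto& dep : it->second.deps) stack.push_back(dep);
    }
    return rejected;
}
```

With an `app -> libsystemd -> liblzma` chain in which only `liblzma` is marked as having startup code, `unapproved` returns `{"liblzma"}` until the main binary lists it in `allowed`; "safe" transitive dependencies pass through without any opt-in.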
Posted Apr 5, 2024 9:38 UTC (Fri)
by smurf (subscriber, #17840)
[Link]
There's a reason for this attitude, which is that C is somewhat brain dead and doesn't have namespaces, an explicit list of things included files export, or anything else along these lines. Thus any random macros which X exports to Y are visible in Z, and the programmer working on Z should be aware of that.
Fortunately, many public headers of non-system libraries use prefixed symbols, and the linker has gotten better at complaining when globals collide. Thus this is not much of an issue.
On the other hand, this practice severely hampers upwards compatibility. When A includes B, which includes C, which requires a new system header D, should we force A to add some random include file? Not really, IMHO.
Posted Apr 7, 2024 17:19 UTC (Sun)
by mathstuf (subscriber, #69389)
[Link]
FWIW, macOS does this, so while it is not compatible with the SysV ABI, as khim points out, anything cross-platform that needs reliable static initialization needs to handle this case already. I'd be fine with an opt-in compile flag to make this doable in ELF…