Not a "bloated, monolithic system"?

Posted Feb 4, 2019 20:44 UTC (Mon) by jccleaver (guest, #127418)
In reply to: Not a "bloated, monolithic system"? by Cyberax
Parent article: Systemd as tragedy

> The problem was that this daemon sometimes hanged on shutdown, ignoring anything short of targeted SIGKILL but still having the port open. So our tests periodically failed because of that.

Well, I mean it sounds like the problem was more that you were sending buggy code through the system. I'd have yelled first at the daemon dev, and secondly at the test writer for not cleaning it up itself, potentially with kill -9 if it wasn't responsive.

> The fix way back then was to "lsof | xexec kill" at the start of the test.

A test shouldn't have left it hanging, so I'd run that at the end. But if the problem was the blocked port, then, sure this would work too.
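For illustration, the lsof-and-kill cleanup quoted above might look roughly like this in Python. This is only a sketch under assumptions: the helper names (`parse_pids`, `list_listeners`, `kill_port_holders`) and the example port are made up here, and it assumes `lsof` is installed and that the test runner has permission to signal the stuck process.

```python
import os
import signal
import subprocess
from typing import Callable, List


def parse_pids(lsof_output: str) -> List[int]:
    """Turn `lsof -t` output (one PID per line) into a de-duplicated list."""
    pids: List[int] = []
    for token in lsof_output.split():
        if token.isdigit() and int(token) not in pids:
            pids.append(int(token))
    return pids


def list_listeners(port: int) -> List[int]:
    """Ask lsof which PIDs are listening on a TCP port (empty list if none)."""
    # lsof exits non-zero when nothing matches, so do not treat that as an error.
    result = subprocess.run(
        ["lsof", "-t", f"-iTCP:{port}", "-sTCP:LISTEN"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_pids(result.stdout)


def kill_port_holders(
    port: int,
    lister: Callable[[int], List[int]] = list_listeners,
    killer: Callable[[int, int], None] = os.kill,
) -> List[int]:
    """SIGKILL every process holding the port; return the PIDs signalled."""
    killed: List[int] = []
    for pid in lister(port):
        if pid == os.getpid():
            continue  # never kill the test runner itself
        try:
            killer(pid, signal.SIGKILL)
            killed.append(pid)
        except ProcessLookupError:
            pass  # process exited between listing and killing
    return killed
```

Calling something like `kill_port_holders(8080)` from the test setup (8080 is a placeholder port) would clear a leftover listener before the daemon is started again. It only catches processes that still hold the port, and it does nothing for a test that dies before reaching its own cleanup.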

Congratulations, you fixed the blocker. I'd much prefer that approach, which is clean, easy to understand, and easy for a human to debug, than *ripping out PID1 and replacing it with something 50x more complicated* just because someone left a hanging process lying around.



Not a "bloated, monolithic system"?

Posted Feb 4, 2019 20:57 UTC (Mon) by Cyberax (✭ supporter ✭, #52523)

> Well, I mean it sounds like the problem was more that you were sending buggy code through the system. I'd have yelled first at the daemon dev, and secondly at the test writer for not cleaning it up itself, potentially with kill -9 if it wasn't responsive.
This was a closed-source binary from a big vendor whose name starts with O and ends with "racle".

> A test shouldn't have left it hanging, so I'd run that at the end. But if the problem was the blocked port, then, sure this would work too.
Except that a test could also die in the middle of its run. Sometimes from OOM.

> Congratulations, you fixed the blocker. I'd much prefer that approach, which is clean, easy to understand, and easy for a human to debug, than *ripping out PID1 and replacing it with something 50x more complicated* just because someone left a hanging process lying around.
The correct decision here is EXACTLY to create a generic solution that can be used to make sure that no bad code can cause damage.

This is why we have protected memory.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds