Debugging?

Posted Nov 19, 2010 14:35 UTC (Fri) by zdzichu (subscriber, #17118)
In reply to: Debugging? by Yenya
Parent article: systemd v12 released

You can redirect all output from a misbehaving application into syslog. You can ask systemd to display almost everything at runtime (systemd.log_level=). You can start an emergency shell at any point (kbrequest.target). You can change almost any of the more than a hundred unit properties ("systemctl show --all ntpd.service | wc -l" prints 117 here).

List the exact modifications you had to make to the script, and we will try to provide the systemd equivalent.

Debugging?

Posted Nov 19, 2010 14:48 UTC (Fri) by jackb (guest, #41909)

How does that emergency shell work? The biggest problem I've had with OpenRC is that if a service fails to come up for some reason, the boot can hang forever and never give me a terminal where I can log in and fix the problem.

Debugging?

Posted Nov 19, 2010 18:12 UTC (Fri) by mezcalero (subscriber, #45103)

Note that by default all services have a timeout assigned, which is 1min for native and 3min for sysv services. You can change or disable the timeout individually. In fact there are even two timeouts in place: one will terminate a service that fails to start after a certain amount of time, and the other will simply dictate that jobs depending on it will not wait any longer for it to finish. You can control both timeouts individually.

Since timeouts are opt-out, not opt-in, even if some daemon freezes the worst that happens in most cases is that your boot is delayed for a minute and you can then introspect what was going on afterwards with "systemctl list-units", "systemctl status" and dmesg/syslog.
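A minimal sketch of how the two timeouts might be set for one service. It assumes they correspond to the TimeoutSec= (service start/stop) and JobTimeoutSec= (queued job) settings; the unit name exampled.service, the daemon path and the values are hypothetical. The script only writes the unit file to a scratch directory and installs nothing.

```shell
#!/bin/sh
# Write a hypothetical unit file to a scratch directory for inspection.
unit_dir=$(mktemp -d)
unit_file="$unit_dir/exampled.service"

cat > "$unit_file" <<'EOF'
[Unit]
Description=Example daemon (hypothetical)
# Assumed setting: give up on the queued job after 30 seconds
JobTimeoutSec=30

[Service]
ExecStart=/usr/sbin/exampled
# Assumed setting: how long the service itself may take to start or stop
TimeoutSec=90
EOF

echo "wrote $unit_file"
```

After a delayed boot, "systemctl list-units" and "systemctl status exampled.service" would be the places to look for what timed out.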

Debugging?

Posted Nov 19, 2010 15:51 UTC (Fri) by Yenya (subscriber, #52846)

> List exact modification you had to do in script
> and we will try to provide you systemd equivalent.

OK, here are a few I can think of right now:

- using "set -x" to see where things went wrong.

- loading a different firmware to the storage controller before non-root volumes are scanned

- adding a sleep command in order to give the disks a chance to get detected properly

- adding another md rescan in order to build a raid-0 of raid-1 volumes (poor man's raid-10, before the native raid-10 was available)

- lowering the insane TCQ depth of the 3ware controller in order to make it smaller than the iosched queue length and give the iosched a chance to do anything useful before the root is fscked and mounted r/w

- loading a non-standard crypto module in order to have a non-root volume accessible by LUKS (before the non-root volumes are probed)

etc. Thanks for providing systemd equivalents of these tasks!
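For readers unfamiliar with the first item, here is a small sketch of the kind of ad hoc "set -x" tracing meant above. The helper name trace_step and the commands passed to it are hypothetical and do not come from any real init script.

```shell
#!/bin/sh
# Run one step of a (hypothetical) boot script with tracing enabled.
# "set -x" makes the shell print each command to stderr before running it;
# the subshell keeps the tracing from leaking into the rest of the script.
trace_step() {
    ( set -x; "$@" ) 2>&1
}

# Hypothetical steps: pause briefly for slow disks, then report.
trace_step sleep 0
trace_step echo "disks settled"
```

Each traced command shows up prefixed with "+", which is usually enough to see which step of a script hangs or fails.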

Debugging?

Posted Nov 19, 2010 18:20 UTC (Fri) by mezcalero (subscriber, #45103)

Use systemd.log_level=debug on the kernel cmdline to figure out what exactly systemd is doing.

The correct place to load firmware into controllers is a udev hook. The same goes for adjusting your TCQ depth.

There's no need to add sleep commands and the like, since systemd is fully dynamic and binds fsck/mount to the device actually showing up. That is, systemd waits for exactly what it needs (i.e. what is listed in fstab, ...) before proceeding with the boot, so hacks such as adding sleeps everywhere are unnecessary.

If you want to statically load arbitrary modules, simply list them in /etc/modules-load.d/foobar.conf (replace foobar with whatever you like), and they will be loaded at the same time as udev loads all other modules. However, it's usually a better approach to fix the kernel module so that it is loaded implicitly on request. In fact most kernel modules already work like that, and statically loading modules is only needed in exceptional cases.

Note that at this time we still rely on the old scripted raid setup code anyway. You can continue to edit that as necessary. In the future stc will hopefully support this more correctly and dynamically and set things up as needed without resorting to manual intervention.
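A rough sketch of what the modules-load.d and udev suggestions above could look like on disk. The module name example_cipher, the file names, the device match and the depth value 32 are all assumptions made for illustration, and the script writes into a scratch staging directory instead of the real /etc.

```shell
#!/bin/sh
# Stage the two hypothetical config files under a scratch directory.
stage=$(mktemp -d)
mkdir -p "$stage/etc/modules-load.d" "$stage/etc/udev/rules.d"

# Static module loading: one module name per line.
# "example_cipher" is a placeholder, not a real module.
cat > "$stage/etc/modules-load.d/crypto-extra.conf" <<'EOF'
example_cipher
EOF

# udev rule: set the queue depth when a disk appears.
# The KERNEL match and the value 32 are assumptions; adjust to the controller.
cat > "$stage/etc/udev/rules.d/60-tcq-depth.rules" <<'EOF'
ACTION=="add", SUBSYSTEM=="block", KERNEL=="sd[a-z]", ATTR{device/queue_depth}="32"
EOF

echo "staged files under $stage"
```

Because the rule runs when the device is added, the depth would be in place before fsck and mount act on that device, without a dedicated boot-script step.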

Debugging?

Posted Nov 20, 2010 9:39 UTC (Sat) by quotemstr (subscriber, #45331)

Of course you can do things the "right way" by hooking into configuration infrastructure. But the OP's point was that conventional scripts make ad hoc modifications easy. It's perfectly reasonable to create an interim local solution without needing to dive into the details of arcane subsystems. In theory, you shouldn't need band-aids like sleep(1) invocations, but in practice, the need comes up once in a while. It's better to accommodate that need instead of denying it.

Debugging?

Posted Nov 20, 2010 17:02 UTC (Sat) by foom (subscriber, #14868)

Most of the times I've needed to edit an init.d script, it was because the init.d script sucked, not the thing it was starting. IMO systemd has the potential to eliminate that problem entirely...

Debugging?

Posted Nov 20, 2010 19:51 UTC (Sat) by quotemstr (subscriber, #45331)

Erm, so your assertion is that systemd will magically lead to bug-free initialization that doesn't need workarounds? "systemd works fine as long as the initialization configuration is perfect" is a rather weak argument.

Debugging?

Posted Nov 21, 2010 4:12 UTC (Sun) by foom (subscriber, #14868)

No, my point was that most of the problems I've had have been because of all the duplicate crud in init.d shell scripts which have been copied incorrectly while being cargo-culted from program to program. So, my problems won't have a chance to happen, because the description of how to manage a daemon in systemd is so much shorter and harder to screw up.

Obviously it's possible to have other kinds of problems too... If you want to call sleep 1 before starting a daemon, there's nothing stopping you from replacing the start command line with a shell script that calls sleep 1 before starting the daemon. Even in systemd that should work...
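A minimal sketch of such a wrapper, written as a shell function. The name start_delayed and the daemon path in the comment are hypothetical; using exec is a design choice so that the daemon replaces the wrapper process and does not run as its child.

```shell
#!/bin/sh
# Hypothetical wrapper: wait a fixed number of seconds, then become the daemon.
# Example call: start_delayed 1 /usr/sbin/exampled --foreground
start_delayed() {
    delay=$1
    shift
    sleep "$delay"
    # exec replaces this shell with the daemon, so no extra process lingers.
    exec "$@"
}
```

Saved as a small script that calls the function with its arguments, this could stand in for the daemon's original start command line.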

Debugging?

Posted Nov 19, 2010 18:09 UTC (Fri) by mezcalero (subscriber, #45103)

Let's not forget that with systemd.confirm_spawn=1 on the kernel cmdline you can enforce an interactive boot where the spawning of every process needs to be OK'ed by the user. Also, after boot you can introspect what exactly happened to a service that failed by typing "systemctl status foobar.service". It will automatically record exit code/signal and timestamp. Also, by default the output of all system services is redirected to syslog.