This is why we can't have safe cancellation points

Posted Apr 15, 2016 0:11 UTC (Fri) by luto (guest, #39314)
In reply to: This is why we can't have safe cancellation points by neilbrown
Parent article: This is why we can't have safe cancellation points

Using it correctly is a real pain. Using it correctly in C++ is even worse.

AFAICT the only way to use it safely is to have cancellation off *except* at very carefully selected points and to turn it on at those points. Every cancellation point then needs to be aware that the thread can go away without unwinding.
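
A minimal sketch of that pattern, assuming a POSIX threads implementation. The names `guarded_worker` and `run_guarded_demo` are invented for illustration, and the 10 ms `nanosleep()` merely stands in for whatever blocking wait the real thread would perform:

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

/* Hypothetical worker: cancellation is off everywhere except one chosen wait. */
static void *guarded_worker(void *arg)
{
    const struct timespec nap = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    int oldstate;

    (void)arg;
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate);
    for (;;) {
        /* ... work that may hold locks or descriptors; a cancellation
         * request that arrives here simply stays pending ... */

        /* Chosen point: nothing is held, so vanishing here is harmless. */
        pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &oldstate);
        nanosleep(&nap, NULL);  /* cancellation point; stand-in for a blocking wait */
        pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &oldstate);
    }
    return NULL;
}

/* Returns 1 if the worker ended by cancellation, 0 if not, -1 on error. */
int run_guarded_demo(void)
{
    pthread_t t;
    void *result;

    if (pthread_create(&t, NULL, guarded_worker, NULL) != 0)
        return -1;
    if (pthread_cancel(t) != 0 || pthread_join(t, &result) != 0)
        return -1;
    return result == PTHREAD_CANCELED;
}
```

Calling `run_guarded_demo()` from a `main()` built with `-pthread` should show the request being honoured only at the chosen wait.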

ISTM any code that actually does this would be better off using ppoll, etc.

This is why we can't have safe cancellation points

Posted Apr 15, 2016 0:38 UTC (Fri) by neilbrown (subscriber, #359)

This is assertion without substance.

Surely I can:
1/ create a data structure that contains a list of all resources I might hold (file descriptor, byte range locks).
2/ register a cleanup handler which walks that data structure and frees everything.
3/ write simple wrappers for open/accept/whatever which record the results in the data structure
4/ just call those wrappers, never the bare API.

Then if I ever get canceled, everything will be cleaned up nicely.
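
The four steps could be sketched roughly as follows. Everything named here (`struct held`, `tracked_open`, `release_all`, the limit of 16, the use of `/dev/null`) is hypothetical, and the semaphore and counter exist only so the demonstration is repeatable. The sketch also assumes the C library never acts on a cancellation request after `open()` has already succeeded, which is the very property under discussion:

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>

#define MAX_HELD 16              /* arbitrary bound for this sketch */

/* Step 1: per-thread record of every descriptor the thread holds. */
struct held {
    int fds[MAX_HELD];
    int count;
};

static sem_t opened;             /* demo only: canceller waits until files are open */
static int closed_by_handler;    /* demo only: read after pthread_join() */

/* Step 2: cleanup handler that walks the record and releases everything. */
static void release_all(void *arg)
{
    struct held *h = arg;

    while (h->count > 0) {
        close(h->fds[--h->count]);
        closed_by_handler++;
    }
}

/* Step 3: wrapper that records what open() returned. */
static int tracked_open(struct held *h, const char *path, int flags)
{
    int fd;

    if (h->count == MAX_HELD)
        return -1;
    fd = open(path, flags);       /* cancellation point */
    if (fd >= 0)
        h->fds[h->count++] = fd;  /* no cancellation point since open() returned */
    return fd;
}

static void *tracked_worker(void *arg)
{
    struct held h = { .count = 0 };

    (void)arg;
    pthread_cleanup_push(release_all, &h);
    /* Step 4: only the wrappers are called, never bare open(). */
    tracked_open(&h, "/dev/null", O_RDONLY);
    tracked_open(&h, "/dev/null", O_RDONLY);
    sem_post(&opened);            /* not a cancellation point */
    for (;;)
        pause();                  /* cancellation point: the thread dies here */
    pthread_cleanup_pop(1);       /* never reached; closes the scope opened by push */
    return NULL;
}

/* Returns how many descriptors the handler closed, or -1 on error. */
int run_tracked_demo(void)
{
    pthread_t t;

    closed_by_handler = 0;
    if (sem_init(&opened, 0, 0) != 0 ||
        pthread_create(&t, NULL, tracked_worker, NULL) != 0)
        return -1;
    sem_wait(&opened);
    pthread_cancel(t);
    pthread_join(t, NULL);
    sem_destroy(&opened);
    return closed_by_handler;
}
```

The record lives on the worker's own stack, so no locking is involved; that fits the "largely independent threads" case rather than one where the table is shared.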

I would need to disable cancellation while manipulating a data structure shared with other threads, but I see cancellation more as being appropriate for largely independent threads.

What specific risks do you see if cancellation is mostly enabled?

This is why we can't have safe cancellation points

Posted Apr 15, 2016 7:21 UTC (Fri) by khim (subscriber, #9252)

> What specific risks do you see if cancellation is mostly enabled?

I think the problem is simple inefficiency.

Cancellation support is not free - even if it's not used.

And even your "simple" scheme includes many steps and couldn't arrive in a random program by accident.

Surely if you change the design of your program that much to make it possible to use cancellation, you could just as well create a wrapper for pthread_create which will call pthread_setcancelstate(p), too?
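
One way such a wrapper might look, as a sketch only. `create_cancellable`, `enable_then_run` and `struct start_args` are invented names. Note that under current POSIX rules a new thread already starts with cancellation enabled, so the explicit call only matters in the scenario being argued for here, where cancellation would be off unless a thread opts in:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper type: carries the real start routine to the new thread. */
struct start_args {
    void *(*fn)(void *);
    void *arg;
};

static void *enable_then_run(void *p)
{
    struct start_args a = *(struct start_args *)p;
    int oldstate;

    free(p);
    /* Opt this thread in to cancellation before running user code. */
    pthread_setcancelstate(PTHREAD_CANCEL_ENABLE, &oldstate);
    return a.fn(a.arg);
}

/* Same arguments and return convention as pthread_create(). */
int create_cancellable(pthread_t *t, const pthread_attr_t *attr,
                       void *(*fn)(void *), void *arg)
{
    struct start_args *a = malloc(sizeof *a);
    int rc;

    if (a == NULL)
        return EAGAIN;
    a->fn = fn;
    a->arg = arg;
    rc = pthread_create(t, attr, enable_then_run, a);
    if (rc != 0)
        free(a);
    return rc;
}

static void *idle(void *arg)
{
    (void)arg;
    for (;;)
        pause();   /* cancellation point */
    return NULL;
}

/* Returns 1 if a thread started through the wrapper could be cancelled. */
int run_wrapper_demo(void)
{
    pthread_t t;
    void *result;

    if (create_cancellable(&t, NULL, idle, NULL) != 0)
        return -1;
    if (pthread_cancel(t) != 0 || pthread_join(t, &result) != 0)
        return -1;
    return result == PTHREAD_CANCELED;
}
```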

This is why we can't have safe cancellation points

Posted Apr 15, 2016 8:00 UTC (Fri) by neilbrown (subscriber, #359)

> I think the problem is simple inefficiency.

Specifically? The solution used by musl costs almost nothing except on x86_32 and the change to make it work well on x86_32 has zero extra performance cost.

> And even your "simple" scheme includes many steps and couldn't arrive in a random program by accident.

I'm failing to parse that... Certainly you wouldn't put any code in any program by accident (I hope) ??
The random program probably never cancels threads so it wouldn't want these steps anyway.

> Surely if you change a design of your program that much to make it possible to use cancellation you could as well go and create wrapper for pthread_create which will call pthread_setcancelstate(p), too?

I fail to see how this would solve anything at all.

Many applications never cancel any threads. They are irrelevant. They need do nothing and they suffer no cost (maybe a couple of instructions per syscall; if you can't afford that, hand-code your system calls).

Some applications do find value in the ability to cancel threads. Those threads clearly need to be prepared to be canceled. Being prepared is not zero work, but it is not too onerous.

If the thread is not doing any resource allocation, maybe just computing pi to a few million bits, then it can deliberately request async cancellation and go about its business.
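
A sketch of the pure-computation case, with invented names (`compute_forever`, `run_async_demo`) and a counter loop standing in for the real arithmetic. It is only reasonable because the loop allocates nothing, takes no locks and calls no library functions:

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>

static void *compute_forever(void *arg)
{
    volatile unsigned long acc = 0;  /* volatile so the loop is not optimised away */
    int oldtype;

    (void)arg;
    /* Ask to be cancellable at any instruction, not just at cancellation points. */
    pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &oldtype);
    for (;;)
        acc += 1;                    /* stand-in for the computation; no cancellation point */
    return NULL;
}

/* Returns 1 if the computing thread ended by cancellation, -1 on error. */
int run_async_demo(void)
{
    pthread_t t;
    void *result;

    if (pthread_create(&t, NULL, compute_forever, NULL) != 0)
        return -1;
    if (pthread_cancel(t) != 0 || pthread_join(t, &result) != 0)
        return -1;
    return result == PTHREAD_CANCELED;
}
```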
If the thread is allocating resources then it naturally needs to make sure they get de-allocated. Any code already needs to worry about this. Code that can be canceled needs to do maybe 10% more work.
It can disable cancellation over a short allocate/use/deallocate sequence that won't block. Or it can register a cleanup helper and record the allocation in some array or something.
If your allocations follow a strict LIFO discipline you can even
- alloc
- push cleanup handler
- use the allocation
- pop the cleanup handler

Which makes for nice clean code with the certainty that the cleanup handler will run even if the thread is canceled.
The point of deferred cancellation is that this can be done with no locking, no extra system calls. It just needs a little care - like not logging any messages between the allocation and pushing the cleanup handler.
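
The LIFO sequence can be sketched like this. The buffer size, the five-second `nanosleep()` and the names `lifo_worker`, `free_buffer` and `handler_runs` are all made up for the example; the counter exists only so the result can be checked after the join:

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

static int handler_runs;   /* demo only: read after pthread_join() */

static void free_buffer(void *p)
{
    free(p);
    handler_runs++;
}

static void *lifo_worker(void *arg)
{
    const struct timespec wait = { 5, 0 };
    char *buf;

    (void)arg;
    buf = malloc(4096);                       /* alloc */
    if (buf == NULL)
        return NULL;
    /* Nothing that might be a cancellation point (logging, say) may go here. */
    pthread_cleanup_push(free_buffer, buf);   /* push cleanup handler */
    memset(buf, 0, 4096);                     /* use the allocation ... */
    nanosleep(&wait, NULL);                   /* ... across a cancellation point */
    pthread_cleanup_pop(1);                   /* pop; 1 runs the handler on the normal path too */
    return NULL;
}

/* Returns how many times the handler ran, or -1 on error. */
int run_lifo_demo(void)
{
    pthread_t t;

    handler_runs = 0;
    if (pthread_create(&t, NULL, lifo_worker, NULL) != 0)
        return -1;
    pthread_cancel(t);
    pthread_join(t, NULL);
    return handler_runs;
}
```

Whether the thread is cancelled inside `nanosleep()` or runs to the `pthread_cleanup_pop(1)`, the buffer is freed exactly once.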

This is why we can't have safe cancellation points

Posted Apr 15, 2016 15:10 UTC (Fri) by khim (subscriber, #9252)

> Many applications never cancel any threads. They are irrelevant.

99.9% of all programs (and I've picked a conservative number) are irrelevant? That's a novel idea to me.

> They need do nothing and they suffer no cost (maybe a couple of instructions per syscall. If you can't afford that, hand-code your system calls).

When the proportion is this skewed, even these two instructions make no sense: why should 99.9% of all apps suffer at all if this could be avoided?

The natural response would be: because I could just take bits and pieces from these 99.9% of apps and use them to build the rare few apps which do use cancellation.

But as you've shown, you couldn't just take random working code from a working library, plug it into a program which uses cancellation, and hope that the end result would work.

ALL code must be carefully designed in such a program. And if ALL code is specifically written for such a program, then the additional burden of adding a couple of pthread_setcancelstate calls here and there wouldn't be large at all!

The argument that "hey, I don't know where and how threads are created in this large program" wouldn't fly: if you don't know even that much about your program/library/whatever, then how could you be sure that you control enough of it to even attempt to use cancellation of threads?

> It just needs a little care - like not logging any messages between the allocation and pushing the cleanup handler.

Sure. But if that's called "a little care" then "you also need to call pthread_setcancelstate(PTHREAD_CANCEL_ENABLE) in each thread" wouldn't be a large problem...

This is why we can't have safe cancellation points

Posted Apr 15, 2016 17:17 UTC (Fri) by nix (subscriber, #2304)

Quite. The program I work on for my day job does precisely that, with subsidiary threads whose primary job is ptrace()ing and waitpid()ing, but which may be commanded to go away by a controlling 'master' thread with many more jobs. All the work of such subsidiary threads is associated with a single structure relating to a specific subprocess under monitoring, and it is easy (and good style) to make sure that this structure is properly freeable at all times (you need that for decent error handling anyway). The cleanup handler then just needs to do the same 'shut down and free everything associated with the process structure' that we have to do when the subprocess dies anyway (indeed, we simply call the cleanup handler by hand in that situation).
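
A rough sketch of that shape, not the actual program: `struct monitored`, its fields, and the pipe and label it owns are all invented, and `nanosleep()` stands in for the `waitpid()` loop. The idea illustrated is that the structure is put into a tear-down-able state before anything is acquired, so the same routine serves as cancellation handler and as ordinary shutdown:

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical per-subprocess state. */
struct monitored {
    int pipe_fds[2];
    char *label;
};

/* Put the structure in a state that teardown can always handle. */
static void monitored_init(struct monitored *m)
{
    m->pipe_fds[0] = m->pipe_fds[1] = -1;
    m->label = NULL;
}

/* Valid at any moment after monitored_init(), and harmless to call twice. */
static void monitored_teardown(void *arg)
{
    struct monitored *m = arg;
    int i;

    for (i = 0; i < 2; i++) {
        if (m->pipe_fds[i] >= 0) {
            close(m->pipe_fds[i]);
            m->pipe_fds[i] = -1;
        }
    }
    free(m->label);
    m->label = NULL;
}

static void *monitor_thread(void *arg)
{
    const struct timespec wait = { 5, 0 };
    struct monitored *m = arg;

    pthread_cleanup_push(monitored_teardown, m);
    if (pipe(m->pipe_fds) == 0)
        m->label = strdup("child");
    nanosleep(&wait, NULL);     /* stand-in for waitpid(); cancellation point */
    pthread_cleanup_pop(0);     /* subprocess "died": drop the handler ... */
    monitored_teardown(m);      /* ... and call the same routine by hand */
    return NULL;
}

/* Returns 1 if the structure ended up fully released, 0 if not, -1 on error. */
int run_monitor_demo(void)
{
    struct monitored m;
    pthread_t t;

    monitored_init(&m);
    if (pthread_create(&t, NULL, monitor_thread, &m) != 0)
        return -1;
    pthread_cancel(t);          /* the controlling thread tells it to go away */
    pthread_join(t, NULL);
    return m.pipe_fds[0] == -1 && m.pipe_fds[1] == -1 && m.label == NULL;
}
```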

I've had problems with the multithreading in that code, but they were all races associated with mutexes and condition variables. The nature of synchronous cancellation has caused me zero problems.