The real question there is whether you're dealing with Python objects; if so, you need the GIL. If not, dropping the GIL is easy-peasy, and most CPython extensions do exactly that (particularly around blocking syscalls).
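To make the "released around blocking calls" point concrete, here's a rough sketch using only the standard library. `time.sleep` stands in for any blocking call that lets go of the GIL while it waits; the helper names `blocking_call` and `run_in_threads` are made up for illustration.

```python
import threading
import time


def blocking_call(delay):
    # time.sleep releases the GIL while waiting, much like the
    # blocking syscalls wrapped by CPython's C code.
    time.sleep(delay)


def run_in_threads(n_threads, delay):
    """Start n_threads blocking calls and return the wall-clock time taken."""
    threads = [
        threading.Thread(target=blocking_call, args=(delay,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = run_in_threads(4, 0.2)
    # If the GIL were held during the sleep, this would be close to 0.8s.
    print(f"4 threads x 0.2s blocking call took {elapsed:.2f}s")
```

The waits overlap because none of the threads needs to touch Python objects while it is blocked; swap the sleep for a pure-Python loop and the threads would take turns instead.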
If you're writing strictly pure Python, then chunking the work into processes is the usual route. It can be annoying at times, but the forced separation also does wonders for crappy, entangled code bases.
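A minimal sketch of the chunk-into-processes route, assuming the work splits cleanly into independent pieces. `chunked_sum` is a hypothetical helper, and the builtin `sum` is used as the worker only to keep the example short; real work would normally be your own module-level function so it can be pickled and sent to the workers.

```python
import math
from concurrent.futures import ProcessPoolExecutor


def chunked_sum(values, workers=2):
    """Split values into roughly equal chunks and sum each chunk in its own process."""
    size = max(1, math.ceil(len(values) / workers))
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # Each chunk is pickled, sent to a worker process, and summed there,
        # so every worker runs under its own interpreter and its own GIL.
        partial_sums = list(pool.map(sum, chunks))
    return sum(partial_sums)


if __name__ == "__main__":
    print(chunked_sum(list(range(1000)), workers=2))
```

Everything that crosses the process boundary has to be pickled, which is exactly the forced separation: shared mutable state simply stops being an option.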
That said, I don't quite see where hardware transactional memory fits in here, beyond possibly a liking for buzzwords ;)