Szorc: Mercurial's Journey to and Reflections on Python 3
Posted Jan 17, 2020 16:51 UTC (Fri) by excors (subscriber, #95769)
In reply to: Szorc: Mercurial's Journey to and Reflections on Python 3 by anselm
Parent article: Szorc: Mercurial's Journey to and Reflections on Python 3
You'll have an issue in Python when a filename contains bytes that aren't valid in the locale's encoding and you say print("Opening file %s" % sys.argv[1]) or print(*os.listdir()): it raises UnicodeEncodeError instead of printing something that looks nearly correct.
You can see the file in ls, tab-complete it in bash, pass it to Python on the command line, pass it to open() in Python, and it works; but then you call an API like print() that doesn't use surrogateescape by default and it fails. (It works in Python 2 where everything is bytes, though of course Python 2 has its own set of encoding problems.)
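To make that concrete, here is a minimal sketch of the failure mode, assuming a UTF-8 locale and a made-up filename containing a Latin-1 byte. The helper names (decode_name, strict_encode_fails, safe_for_display) are hypothetical and exist only for illustration. The strict encode stands in for roughly what print() does when stdout uses the default "strict" error handler.

```python
# Sketch only: the filename and the helper names are made up for illustration.

def decode_name(raw: bytes) -> str:
    """Decode the way Python 3 decodes argv and filenames under a UTF-8 locale."""
    return raw.decode("utf-8", "surrogateescape")

def strict_encode_fails(name: str) -> bool:
    """True if a strict UTF-8 encode (roughly what print() does) rejects the name."""
    try:
        name.encode("utf-8")  # errors="strict" is the default
    except UnicodeEncodeError:
        return True
    return False

def safe_for_display(name: str) -> str:
    """Turn lone surrogates into visible backslash escapes so printing cannot fail."""
    return name.encode("utf-8", "backslashreplace").decode("utf-8")

raw = b"caf\xe9.txt"     # 0xE9 is Latin-1 'é', which is not valid UTF-8
name = decode_name(raw)  # the str now carries a lone surrogate, U+DCE9

print(strict_encode_fails(name))                        # True
print(name.encode("utf-8", "surrogateescape") == raw)   # True: round-trips for open()
print(safe_for_display(name))                           # caf\udce9.txt
```

The round trip is why open() keeps working, and the strict encode is why print() doesn't. Something like the backslashreplace step has to be remembered at every place a filename is displayed or logged, which is the fragility I'm complaining about.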
Anyway, I think this thread started with the comment that Mercurial's maintainers didn't want to "use Unicode for filenames", and I still think that's not nearly as simple or good an idea as it sounds. Filenames are special things that need special handling, and surrogateescape is not a robust solution. Any program that deals seriously with files (like a VCS) ought to do things properly, and Python doesn't provide the tools to help with that, which is a reason to discourage use of Python (especially Python 3) for programs like Mercurial.