Python 2.8?

Posted Jan 23, 2017 11:12 UTC (Mon) by Jonno (subscriber, #49613)
In reply to: Python 2.8? by adobriyan
Parent article: Python 2.8?

> Ironically, Rust assumes everything is valid UTF-8 including Unix filenames.

Rust does not. While Rust Strings are always UTF-8, Rust OsStrings are not. They are "arbitrary sequences of non-zero bytes" (on Unix) or "arbitrary sequences of non-zero 16-bit values" (on Windows).

Directory listings use OsStrings, not Strings, for filename components, and File::open() will accept anything from which Rust knows how to build a path, including both Strings *and* OsStrings.
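
To make that concrete, here is a minimal sketch using only the standard library. The helper name `list_names` is my own invention for illustration, and the "Cargo.toml" path is just a placeholder; the point is that no UTF-8 conversion happens anywhere between listing and opening.

```rust
use std::ffi::OsString;
use std::fs::{self, File};
use std::io;
use std::path::Path;

/// Hypothetical helper: collects the file names in `dir` as OsStrings,
/// never converting them to UTF-8.
fn list_names(dir: &Path) -> io::Result<Vec<OsString>> {
    let mut names = Vec::new();
    for entry in fs::read_dir(dir)? {
        // DirEntry::file_name() returns an OsString, not a String.
        names.push(entry?.file_name());
    }
    Ok(names)
}

fn main() -> io::Result<()> {
    let dir = std::env::current_dir()?;
    for name in list_names(&dir)? {
        // Path::join and File::open both take any AsRef<Path>,
        // so the OsString is passed through untouched.
        let path = dir.join(&name);
        match File::open(&path) {
            Ok(_) => println!("opened {:?}", name),
            Err(e) => println!("skipped {:?}: {}", name, e),
        }
    }
    // A plain String is accepted just as well (placeholder file name).
    let _ = File::open(String::from("Cargo.toml"));
    Ok(())
}
```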

There are convenience methods to convert an OsString to a String (which will fail if the OsString does not contain valid Unicode), as well as to convert a String to an OsString (which will fail if the String contains any "U+0000 NULL" characters), but there is no requirement that you use them.

In fact, in most circumstances you should not. Keep the OsString for path manipulations, and if you need a pretty UTF-8 string to show the user, use the heavier OsString::to_string_lossy() method to get a string with any invalid Unicode sequences replaced with "U+FFFD REPLACEMENT CHARACTER".
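
A rough sketch of the difference between the strict and the lossy conversion, assuming a Unix target (OsStringExt::from_vec is Unix-only). The helper name `display_name` is made up for the example.

```rust
use std::ffi::{OsStr, OsString};
use std::os::unix::ffi::OsStringExt; // Unix-only extension trait

/// Hypothetical helper: produces a string that is safe to show to the user,
/// with invalid sequences replaced by U+FFFD.
fn display_name(name: &OsStr) -> String {
    name.to_string_lossy().into_owned()
}

fn main() {
    // The byte 0xFF can never appear in valid UTF-8.
    let raw = OsString::from_vec(vec![b'f', b'o', 0xFF, b'o']);

    // The strict conversion fails and hands the original OsString back.
    assert!(raw.clone().into_string().is_err());

    // The lossy conversion always succeeds.
    println!("{}", display_name(&raw));
}
```

Keep `raw` around for anything that touches the filesystem; the lossy string no longer names the same file.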


Python 2.8?

Posted Jan 23, 2017 12:22 UTC (Mon) by ssokolow (guest, #94568)

Actually, OsString is a superset of String and whatever the OS offers. It'll carry NULL characters just fine.

Here's a Rust Playground link demonstrating that.
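
In case the link rots, a minimal sketch along the same lines (not necessarily the linked code; the helper name `to_os` is mine). It relies only on the String-to-OsString conversion being infallible, and contrasts it with CString, which is where an interior NUL actually gets rejected.

```rust
use std::ffi::{CString, OsString};

/// Hypothetical helper: String-to-OsString conversion, which cannot fail.
fn to_os(s: &str) -> OsString {
    OsString::from(s)
}

fn main() {
    // A string with an embedded U+0000 converts without complaint.
    let os = to_os("a\0b");
    println!("{} bytes: {:?}", os.len(), os);

    // The round trip back to String keeps the NUL as well.
    assert_eq!(os.clone().into_string().unwrap(), "a\0b");

    // An interior NUL is only rejected once a C string is required.
    assert!(CString::new("a\0b").is_err());
}
```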

