A literal string type for Python
Posted Apr 14, 2022 16:31 UTC (Thu) by tialaramex (subscriber, #21167) In reply to: A literal string type for Python by milesrout
Parent article: A literal string type for Python
"EAT BABIES" // clearly expresses your intent, it's your fault, you wrote that.
"EAT" + " BABIES" // I can see why they felt like they should make this work, they've cited examples that do this and it genuinely is still clear at this point what your intent was, although I think it should be discouraged anyway.
doComplicatedStuffBasedOnUserInput("DO NOT EAT BABIES", input) // this still type checks as LiteralString, and might be EAT BABIES yet we can hardly claim now that we're reflecting clear programmer intent when that happens.
The reason to want literals here rather than allowing arbitrary strings is to get closer to requiring intent. I'd rather give up the second example than, as this PEP does, allow the third example the opportunity to set fire to everything and pretend that's "safe".
Rust of necessity has to require actual literals in formatting (not merely constant strings) because the formatting work is done via the macro system, and the macro system can't see inside variables. But even though more sophisticated behaviour would be welcomed by many Rust programmers, I personally prefer the literal requirement.
Posted Apr 14, 2022 16:45 UTC (Thu) by mb (subscriber, #50428)
Posted Apr 15, 2022 20:28 UTC (Fri) by tialaramex (subscriber, #21167)
Posted Apr 16, 2022 8:56 UTC (Sat) by mb (subscriber, #50428)
It's not supposed to prevent the programmer from hardcoding the wrong string.
Posted Apr 16, 2022 21:28 UTC (Sat) by tialaramex (subscriber, #21167)
doComplicatedStuffBasedOnUserInput("DO NOT EAT BABIES", input)
"DO NOT EAT BABIES" is blessed as a LiteralString because it is. No problem so far. But "cleverly" this proposal allows operations (such as truncation, concatenation, duplication and splitting) on LiteralString to produce a LiteralString, and so if doComplicatedStuffBasedOnUserInput has a bug, as it may well do, it can end up producing quite unexpected results, such as "EAT BABIES" and yet they're blessed as LiteralString anyway via this rationale.
Thus, the program user in fact gets arbitrary control over these strings in at least some cases, whereas that's definitively not the situation in languages where there's an actual literal string type. In exchange, Python gets to write "WO" + "RDS" and have that be a LiteralString whereas in the other languages it is not. I think that's a bad trade, despite being very clever.
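To make the concern concrete, here is a minimal sketch in Python. The helper name `do_complicated_stuff` and its `user_offset` parameter are hypothetical, standing in for the function in the example above; the sketch assumes, as described in the comment, that truncating (slicing) a `LiteralString` is still typed as a `LiteralString`.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only the type checker needs this import; LiteralString is in
    # typing from Python 3.11 onwards.
    from typing import LiteralString


def do_complicated_stuff(template: LiteralString, user_offset: int) -> LiteralString:
    # Hypothetical helper. The slice result is still typed as
    # LiteralString, even though user_offset is an ordinary int that
    # may come from user input.
    return template[user_offset:]


# Pretend 7 arrived from the user rather than from the source code.
result = do_complicated_stuff("DO NOT EAT BABIES", 7)
print(result)
```

Nothing here needs a suppression comment: every character of the result did come from a literal, but which characters survive is decided by the input.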
Posted Apr 19, 2022 4:58 UTC (Tue) by NYKevin (subscriber, #129325)
Posted Apr 19, 2022 11:06 UTC (Tue) by mathstuf (subscriber, #69389)
If there were a LiteralNumber, one might be able to do that, but without it, there's no difference between a literal 7 and a 7 coming in from "the outside" through a variable. Though there are a number of other methods that take SupportsIndex that might now be suspicious to me…
Posted Apr 19, 2022 15:23 UTC (Tue) by gbleaney (guest, #158077)
If a developer wants to circumvent the protections of 'LiteralString', they can easily do it. They don't even need fancy functions like the example we gave; they can just add a '# pyre-ignore' (or the equivalent lint suppression comment for their typechecker of choice). The goal is to protect against accidental mistakes, not malicious or implausible behaviour by developers.
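As a rough illustration of that escape hatch, the sketch below uses a hypothetical `run_query` function that demands a `LiteralString`. The `# type: ignore` comment is the generic suppression form; checker-specific ones such as `# pyre-ignore` play the same role.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only the type checker needs this import (typing, Python 3.11+).
    from typing import LiteralString


def run_query(sql: LiteralString) -> str:
    # Hypothetical stand-in for an API that only accepts literal strings.
    # It just returns the query so the effect is visible.
    return sql


user_input = "1 OR 1=1"  # imagine this came from a request
query = "SELECT * FROM t WHERE id = " + user_input  # typed as plain str

# A type checker should reject passing a plain str here; the suppression
# comment silences that report, and nothing stops the call at runtime.
executed = run_query(query)  # type: ignore
```

The annotation is enforced only by the type checker, so one comment is enough to opt out deliberately.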
Posted Apr 24, 2022 13:39 UTC (Sun) by tialaramex (subscriber, #21167)
If I'm correct, the proof would of course likely arrive too late: this PEP succeeds, everybody gets used to the behaviour as documented, and then a hole is found in some code, say, a popular Django app, where users can manipulate a LiteralString so as to cause mischief. I'm certain that the instinct will be to blame the app programmer, but that misses the whole point of these protections: programmers are human and as such lack foresight.
To be quite fair, the other way forward can also be dangerous. In C++ for example std::format() resolutely insists on a constant format string, so that's pretty safe (it needn't be a literal, but it can't be sensitive to user input as that's not constant), but it necessitates providing std::vformat() which does not take a constant format string, and so programmers may be tempted to call std::vformat() rather than re-factor some code to ensure the format strings are actually constant... Defensive programming is possible, maybe even encouraged, but it's probably easier to do the Wrong Thing™ in many cases than it should be.
Posted Apr 25, 2022 7:50 UTC (Mon) by farnz (subscriber, #17727)
To a large extent, though, these sound like the same problem as unsafe in Rust; sure, I can wrap all sorts of crawling horrors in unsafe, and have a Safe Rust API on top so that when you look at my crate's documentation, it's not obvious that I've done this.
And similar to Unsafe Rust, the answer is tool-assisted review of code you're planning to use that highlights the areas of code that need extra attention - just as a Rust-aware review system calls out unsafe wherever it appears for extra human attention, so a Python-aware review system needs to call out manipulation of LiteralString that results in a LiteralString typed output for extra human attention.
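A very rough sketch of what such a review aid could look like, using only the standard `ast` module. The function name `flag_literalstring_returns` is made up for illustration, and flagging every function annotated as returning `LiteralString` is only a crude proxy for "manipulation that results in a LiteralString"; a real tool would need type information.

```python
import ast


def flag_literalstring_returns(source: str) -> list[tuple[int, str]]:
    """Return (line, name) for each function annotated as returning LiteralString."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.returns is not None:
            # Compare against the annotation's source text (ast.unparse needs Python 3.9+).
            if "LiteralString" in ast.unparse(node.returns):
                flagged.append((node.lineno, node.name))
    return sorted(flagged)


SAMPLE = '''
def greet(name: str) -> str:
    return "hi " + name

def trim(template: LiteralString, n: int) -> LiteralString:
    return template[n:]
'''

for line, name in flag_literalstring_returns(SAMPLE):
    print(f"line {line}: {name}() returns LiteralString, review how it is built")
```

Only `trim` is reported here; `greet` returns a plain `str` and is left alone.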
https://peps.python.org/pep-0675/#appendix-b-limitations