A literal string type for Python

Posted Apr 26, 2022 15:25 UTC (Tue) by nye (subscriber, #51576)
In reply to: A literal string type for Python by ovitters
Parent article: A literal string type for Python

The problem with taint checking is that experience has shown that - even if it's always correct, which it often isn't - it leads a surprising number of programmers to assume that because the evil bit isn't set, the data must therefore be good. In other words, taint checking separates data into "definitely unsafe" and "might be safe assuming you're using it correctly, whatever that might mean", whereas many developers treat it as meaning "maybe unsafe" versus "definitely safe".

By restricting the feature to simply "is this a literal string, or derived from literal strings purely by means of concatenation"[0], the meaning is well-defined and easier to understand. In other words, it depends less upon programmer education, which is a strategy that has been repeatedly proven ineffective.

There is some discussion about this in https://wiki.php.net/rfc/is_literal if you're interested - that's the proposal for a very similar feature in PHP, which sadly did not pass for reasons I've not yet investigated.

[0] This PEP is a bit broader than that and does include some operations that create substrings, which makes me uncomfortable.
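
As a rough illustration of the "literal, or derived from literals by concatenation" rule, here is a minimal sketch using `typing.LiteralString` from PEP 675 (Python 3.11+). The `run_query` and `find_user` helpers, the table and the column names are hypothetical, made up for this example. The annotation is only enforced by a static type checker; nothing is checked when the code actually runs.

```python
try:
    from typing import LiteralString  # available from Python 3.11
except ImportError:
    # Assumption: on older interpreters fall back to plain str so the
    # sketch still runs (a type checker would then not enforce anything).
    LiteralString = str


def run_query(sql: LiteralString, params: tuple = ()) -> tuple:
    """Hypothetical helper: the SQL text must be built from literals only."""
    # A real implementation would hand sql and params to a database driver.
    return sql, params


def find_user(user_name: str, only_active: bool) -> tuple:
    sql = "SELECT * FROM users WHERE name = ?"
    if only_active:
        # Literal + literal is still a LiteralString for the type checker.
        sql += " AND active = 1"
    # The user-supplied value travels separately as a bound parameter.
    return run_query(sql, (user_name,))


# A type checker is expected to reject the call below, because user_name
# is an arbitrary str rather than a literal:
#     run_query("SELECT * FROM users WHERE name = '" + user_name + "'")
```

The rule the checker applies is only "was this written in the source code?", which is narrower than what a taint flag tries to claim.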


A literal string type for Python

Posted Nov 9, 2022 17:00 UTC (Wed) by craig.francis (guest, #162085) [Link]

Hi nye, bit weird to see you mention the PHP RFC for is_literal(), I'm the author :-)

I completely agree with everything you said - taint checking is flawed, concatenation is fine, and the extra functions PEP 675 includes make me feel a bit uncomfortable as well (but, to be fair, I cannot think of a vulnerability from them, I just can't say with 100% confidence they will be fine for every single context).

Anyway... I'm just looking at how the Python implementation works, now 3.11 is out, because I need to go back to the PHP Internals Developers to try again.

As I'm not a Python developer, do you think the following is a good example of this feature being used:

https://github.com/craigfrancis/php-is-literal-rfc/blob/m...

https://eiv.dev/python-pyre/

---

As to reasons for the PHP RFC rejection... it was not clear. Most people who voted against did not comment (bit weird, considering RFC stands for "Request for Comments"). Two people didn't want it to support string concatenation (they believe leaving it out would help find issues, but I've found that hasn't been the case; instead, supporting it makes adoption much easier due to the amount of existing code that uses concatenation). Three people believe these checks should only be done by Static Analysis (the most optimistic stat I can find is that 33% of PHP developers use Static Analysis[0], which I support, and this can now be done with the `literal-string` type in Psalm and PHPStan, but I don't believe usage will ever get to 100%). One person believed this should be solved through better documentation... and someone thought the idea was flawed, because a *malicious* developer could write the user-value into a new PHP file (e.g. `<?php return "$user_value"; ?>`), and execute it.

[0] https://www.jetbrains.com/lp/devecosystem-2021/php/


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds