
Fedora and Python 2


Posted Apr 14, 2018 6:26 UTC (Sat) by dvdeug (guest, #10998)
In reply to: Fedora and Python 2 by peniblec
Parent article: Fedora and Python 2

If Python normalized text by default, the text editor would have a hard time doing that.



Fedora and Python 2

Posted Apr 14, 2018 11:48 UTC (Sat) by peniblec (subscriber, #111147)

OK. Let’s say Python’s string type uses normalization/grapheme clusters/nanomachines to correctly compare sequences of Unicode characters. Would that necessarily make a text editor overzealously normalize your whole file, thus polluting your patch?

I don’t know how actual text editors do it, but I imagine that their representation of your file’s content is more nuanced than simply “whatever open(filename) returned”. I would assume that they represent a “file” as sequences of opaque “word” or “line” objects, each of those objects having methods to

  • get their position in the file’s byte-stream (start and end offset, cached once decoded), so that the editor knows where to apply changes;

  • get their “canonical” Unicode representation, so that the editor can do whatever an editor is supposed to do with meat-space characters (comparison for search-and-replace, length computation for line-wrapping).

So with such a design, I don’t think “Python’s str canonicalizing behind your back” would necessarily lead to “OMG this commit is full of extraneous crap introduced by this dumb Python text editor”. Again, I might not have thought enough about this; maybe the above does nothing to solve the problem.
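To make the idea a bit more concrete, here is a rough sketch of what such “line” objects might look like. Everything in it (the Line class, split_lines, replace_line) is a made-up name for illustration, not how any real editor works, and it assumes the file is valid UTF-8 and that NFC is the normalization we care about.

```python
import unicodedata
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    """Hypothetical opaque 'line' object: raw bytes plus their position in the file."""
    raw: bytes   # exact bytes as read from disk, never rewritten implicitly
    start: int   # byte offset of the first byte of this line
    end: int     # byte offset one past the last byte of this line

    @property
    def canonical(self) -> str:
        # Normalized view, used only for search, comparison and width computation.
        return unicodedata.normalize("NFC", self.raw.decode("utf-8"))


def split_lines(data: bytes) -> list:
    """Cut the byte stream into Line objects, remembering each line's offsets."""
    lines, offset = [], 0
    for raw in data.splitlines(keepends=True):
        lines.append(Line(raw, offset, offset + len(raw)))
        offset += len(raw)
    return lines


def replace_line(data: bytes, line: Line, new_text: str) -> bytes:
    """Re-encode only the edited span; every other byte is copied verbatim."""
    return data[:line.start] + new_text.encode("utf-8") + data[line.end:]


# The first line uses a decomposed "e" + combining acute accent.
data = "cafe\u0301\nna\u00efve\n".encode("utf-8")
lines = split_lines(data)

# Searching goes through the canonical view, so the precomposed form matches.
found = "caf\u00e9" in lines[0].canonical

# Editing the second line leaves the first line's original bytes untouched.
edited = replace_line(data, lines[1], "naive\n")
```

The point is that normalization only ever happens on the way out to the search/display code, while writes go through byte offsets, so untouched lines keep whatever encoding quirks they had on disk.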

(Congratulations, you’ve nerd-sniped me into designing a text editor ;) )

Alternative workaround: teach our diffing tools to normalize text before computing differences :D

They do already let us skip whitespace changes, for example, which is a subclass of the more general category of “things computers care about despite being mostly irrelevant to meatbags”.
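A minimal sketch of that workaround, using only Python’s standard difflib and unicodedata modules. The normalized_diff function is a hypothetical helper, not an option of any existing diff tool, and picking NFC as the default form is just an assumption.

```python
import difflib
import unicodedata


def normalized_diff(old: str, new: str, form: str = "NFC") -> list:
    """Unified diff that ignores differences in Unicode normalization."""
    # Normalize both sides first, so equivalent sequences compare equal.
    old_lines = unicodedata.normalize(form, old).splitlines(keepends=True)
    new_lines = unicodedata.normalize(form, new).splitlines(keepends=True)
    return list(difflib.unified_diff(old_lines, new_lines, "old", "new"))


# Decomposed vs. precomposed "é": only the real change should be reported.
old = "cafe\u0301\nhello\n"
new = "caf\u00e9\nworld\n"
for line in normalized_diff(old, new):
    print(line, end="")
```

One caveat: the output shows the normalized lines rather than the original bytes, so this is fine for reviewing a change but would not be safe as input to something like patch.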


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds