Improving Python's SimpleNamespace
Python's SimpleNamespace class provides an easy way for a programmer to create an object to store values as attributes without creating their own (almost empty) class. While it is useful (and used) in its present form, Raymond Hettinger thinks it could be better. He would like to see the hooks used by mappings (e.g. dictionaries) added to the class, so that attributes can be added and removed using either x.a or x['a']. It would bring benefits for JSON handling and more in the language.
A SimpleNamespace provides a mechanism to instantiate an object that can hold attributes and nothing else. It is, in effect, an empty class with a fancier __init__() and a helpful __repr__():
>>> from types import SimpleNamespace
>>> sn = SimpleNamespace(x = 1, y = 2)
>>> sn
namespace(x=1, y=2)
>>> sn.z = 'foo'
>>> del(sn.x)
>>> sn
namespace(y=2, z='foo')
Hettinger proposed his idea to the python-dev mailing list in mid-April. He described it as follows:
catalog = json.load(f, object_hook=SimpleNamespace)
print(catalog['clothing']['mens']['shoes']['extra_wide']['quantity'])  # currently possible with dict()
print(catalog.clothing.mens.shoes.extra_wide.quantity)  # proposed with SimpleNamespace()
print(catalog.clothing.boys['3t'].tops.quantity)  # would also be supported
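To make the proposal concrete, here is a minimal sketch of the dual access written as a subclass rather than as a change to SimpleNamespace itself. The names MappingNamespace and to_namespace are hypothetical, and the sample JSON is invented for illustration; this is not Hettinger's implementation, just one way the mapping hooks might be wired up.

```python
import json
from types import SimpleNamespace


class MappingNamespace(SimpleNamespace):
    """Hypothetical subclass: attributes are also reachable as items."""

    def __getitem__(self, key):
        # vars(self) is the instance __dict__, so a miss raises KeyError.
        return vars(self)[key]

    def __setitem__(self, key, value):
        vars(self)[key] = value

    def __delitem__(self, key):
        del vars(self)[key]


def to_namespace(d):
    # json passes each decoded JSON object to object_hook as a plain dict.
    ns = MappingNamespace()
    for key, value in d.items():
        ns[key] = value
    return ns


text = '{"clothing": {"boys": {"3t": {"tops": {"quantity": 7}}}}}'
catalog = json.loads(text, object_hook=to_namespace)

# Dot access where the key is a valid identifier, brackets where it is not.
print(catalog.clothing.boys['3t'].tops.quantity)
print(catalog['clothing']['boys']['3t']['tops']['quantity'])
```

Keys such as '3t' are stored like any other attribute but can only be reached with brackets, which is the mixed style shown in the last line of Hettinger's example.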
There are examples of production code that does this sort of thing, he said, but each user needs to reinvent the wheel: "This is kind of [a] bummer because the custom subclasses are a pain to write, are non-standard, and are generally somewhat slow." He had started with a feature request in the Python bug tracker, but responses there suggested adding a new class.
Guido van Rossum thought that kind of usage was not particularly Pythonic, and was not really in favor of propagating it:
Kyle Stanley wondered if it made sense for the feature to reside in the json module; "that seems like the most useful and intuitive location for the dot notation". He thought that JSON users would not be surprised by that style of usage, but Van Rossum disagreed:
Several others agreed that the duality of object and dictionary access was not a good fit for Python, but there is still a problem to be solved, as Hettinger noted: "working with heavily nested dictionaries (typical for JSON) is no fun with square brackets and quotation marks". Victor Stinner listed a handful of different projects from the Python Package Index (PyPI) that provide some or all of the features that are desired, but he did not see that any of those had "been battle-tested and gained enough popularity" that they should be considered for the standard library.
Stinner (and others in the thread) pointed to the glom library as one that might be of use in working with deeply nested JSON data. But the "AttrDict" pattern is rather popular, as Hettinger pointed out. glom can do lots more things, but it is not able to freely mix and match the two access types as Hettinger wants.
There were some who thought it might be reasonable for the json module to provide the functionality, as Stanley had suggested, including Van Rossum, who seemed to come around to the idea. Glenn Linderman supported adding the feature in a bug report comment; he thinks it is useful well beyond just JSON. "Such a feature is just too practical not to be Pythonic." Similarly, Cameron Simpson thought it would make a good addition:
It is true that adding dictionary-like functionality to SimpleNamespace should not affect existing code, but most in the thread still seem to be against adding the feature to that class. Eric Snow put it this way:
Perhaps the most radical suggestion came from Rob Cliffe. He thought it might make sense to add a new operator to the language (perhaps "..") with no default implementation. That would allow classes to define the operator for themselves:
obj..abc..def..ghi
Still fairly concise, but warns that what is happening is not normal attribute lookup.
As Stinner pointed out, though, that and some of the other more speculative posts probably belonged in a python-ideas thread instead. It does not seem particularly likely that SimpleNamespace will be getting this added feature anytime soon—or at all. There is enough opposition to making that change, but there is recognition of the problem, so some other solution might come about. It would, presumably, need the PEP treatment, though; a visit to python-ideas might be in the offing as well.
Posted Apr 29, 2020 17:18 UTC (Wed)
by NYKevin (subscriber, #129325)
[Link] (3 responses)
(I'm also not a huge fan of JSON's object_hook, since it's slightly too dumb to actually deal with complicated JSON object structures correctly - the type of an object is context-sensitive and needs to be recursively deduced based on the parent type, unless you were clever and preemptively tagged all of your JSON objects with a type hint. Unfortunately, I don't see that sort of tagging much in practice. So you end up just converting everything to dicts and then manually parsing it into the actual object type. A declarative way of writing these schemata, and passing them directly to json.load(), would be Nice To Have. Perhaps just pass the dataclass of the root object as an argument or something like that? dataclasses already have all the introspection support required for json to figure the rest out on its own.)
Posted Apr 29, 2020 19:44 UTC (Wed)
by martin.langhoff (subscriber, #61417)
[Link] (2 responses)
In other words -- it's dynamic typing, applied to complex variables.
Just like dynamic typing, it works well for small projects, breaks down eventually because they are not the same thing.
Posted Apr 29, 2020 22:03 UTC (Wed)
by NYKevin (subscriber, #129325)
[Link] (1 responses)
Dynamically-typed JSON seems like a really Bad Idea to me. JSON is an interaction point with the outside world, and very likely to contain untrusted (or only marginally trusted) data. At a bare minimum, you should be checking the types of the parsed objects, recursively in all sub-objects. Otherwise, it's ridiculously easy for an attacker to cause all sorts of headaches. Consider:
And I'm sure there are more "interesting" examples than just OOMing the client. But by the time you're doing recursive type checking, it's really hard to justify not using a "real" statically-checkable type like a dataclass. The only excuse I can think of is that the standard library lacks a facility to do it automatically, which is a shame because I could probably bang that out in an hour or two (with the bulk of that time devoted to re-normalizing the weird type objects in typing.py back into classes that you can hand to isinstance()).
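As a rough illustration of the kind of facility described above, the following sketch loads parsed JSON into a dataclass while checking types recursively. The helper name load_as and the Ham class are hypothetical, and only plain classes, nested dataclasses, and List[...] annotations are handled; anything richer (Optional, Union, Dict) is assumed to be out of scope here.

```python
import dataclasses
import json
import typing
from dataclasses import dataclass, field
from typing import List


def load_as(tp, value):
    """Hypothetical helper: check `value` against `tp`, building dataclasses."""
    if dataclasses.is_dataclass(tp):
        if not isinstance(value, dict):
            raise TypeError(f"{tp.__name__}: expected a JSON object")
        hints = typing.get_type_hints(tp)
        unknown = set(value) - set(hints)
        if unknown:
            raise TypeError(f"{tp.__name__}: unexpected fields {sorted(unknown)}")
        # Missing required fields are reported by the dataclass constructor.
        return tp(**{k: load_as(hints[k], v) for k, v in value.items()})
    if typing.get_origin(tp) is list:
        if not isinstance(value, list):
            raise TypeError("expected a JSON array")
        (item_type,) = typing.get_args(tp)
        return [load_as(item_type, item) for item in value]
    # bool is a subclass of int, so reject it unless bool was asked for.
    if isinstance(value, bool) and tp is not bool:
        raise TypeError(f"expected {tp.__name__}, got bool")
    if not isinstance(value, tp):
        raise TypeError(f"expected {tp.__name__}, got {type(value).__name__}")
    return value


@dataclass
class Ham:
    eggs: int
    tags: List[str] = field(default_factory=list)


ham = load_as(Ham, json.loads('{"eggs": 5}'))
print(ham.eggs * 1000000000)  # safe: eggs is known to be an int

try:
    load_as(Ham, json.loads('{"eggs": [1, 2, 3]}'))
except TypeError as exc:
    print("rejected:", exc)
```

With a check like this in front of the arithmetic, a list where a number was expected is rejected at load time instead of being multiplied.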
Posted Apr 30, 2020 7:11 UTC (Thu)
by smurf (subscriber, #17840)
[Link]
Posted Apr 29, 2020 20:21 UTC (Wed)
by amarao (guest, #87073)
[Link] (21 responses)
Posted Apr 30, 2020 2:22 UTC (Thu)
by dtlin (subscriber, #36537)
[Link] (19 responses)
Posted Apr 30, 2020 5:12 UTC (Thu)
by neilbrown (subscriber, #359)
[Link] (18 responses)
I wonder what you mean by that exactly. For example, does it contravene some part of the Zen of Python?
Simple is better than complex.
All seem to support it.
The article mentioned a ".." suggestion which is much like this "!" suggestion. Would ".." feel less foreign??
Posted Apr 30, 2020 6:19 UTC (Thu)
by dtlin (subscriber, #36537)
[Link] (17 responses)
Obviously there's other forms of syntactic sugar in Python. But it seems to me like other sugar has more benefit than saving 3 characters - or 2 characters, in the case of
Posted Apr 30, 2020 7:16 UTC (Thu)
by smurf (subscriber, #17840)
[Link] (16 responses)
It's also not just readability but also typing. One dot is one keystroke with pretty much any keyboard layout ever. On a German keyboard, however, [''] requires eight (brackets require AltGr while single quotes need Shift). Owch.
Posted Apr 30, 2020 10:42 UTC (Thu)
by mbunkus (subscriber, #87248)
[Link] (7 responses)
Not only that, for pressing { with one hand you really need to do funky acrobatics as it's on AltGr+7. Doing that for several hours a day _hurt_! Look at images of German keyboard layouts to get an idea how you have to contort your hand for that.
The result was that I switched to English layouts, even with German keyboards. I then spent hours on implementing some way to write German Umlauts & ß without too much hassle (I also switched to using ergonomic keyboards, but that's a different topic).
smurf is right, having to type asd['qwe']['whatever'] requires a LOT of changing states of different modifier keys, it slows down typing significantly: a s d press&hold AltGr 8 release AltGr press&hold Shift # release Shift q w e press&hold Shift # release Shift press&hold AltGr+9 etc. etc.
I'm pretty sure other non-English languages have similar problems typing something like that.
Posted Apr 30, 2020 14:12 UTC (Thu)
by NAR (subscriber, #1313)
[Link] (2 responses)
that it's accessing a struct or a map. It also has an interesting interplay with command line expansion - it is possible to create atom names with space (crazy idea, but possible):
Then when I type
Of course,
Posted May 4, 2020 12:14 UTC (Mon)
by ballombe (subscriber, #9523)
[Link] (1 responses)
Posted May 4, 2020 12:58 UTC (Mon)
by mathstuf (subscriber, #69389)
[Link]
% cat /etc/udev/99-kb-capslock.hwdb
which means that it also works on the TTY and not just when X is running (and apps can't sniff the fact that it is Caps Lock behind my back and do the wrong thing).
Posted May 1, 2020 8:03 UTC (Fri)
by knuto (subscriber, #96401)
[Link]
\documentstyle[12pt]{report}
would show up on Norwegian screens as
ØdocumentstyleÆ12ptÅæreportå
and 'hopeless' == 'håpløst' in Norwegian
I ended up writing a small preprocessor for TeX (138 lines of C code), which I used for all my early years of TeX and subsequently LaTeX work. It used /<> instead of \{}, translated all the æøå variants to the right escape sequences, and provided an escape construct @( @) and special handling of \begin{verbatim} .. \end{verbatim}.
Posted May 7, 2020 22:24 UTC (Thu)
by flussence (guest, #85566)
[Link]
But now I can empathise with the plight of users who get this experience by default. A lot of programming's still too ASCII-centric.
Posted May 7, 2020 23:01 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link] (1 responses)
Posted May 9, 2020 16:42 UTC (Sat)
by smurf (subscriber, #17840)
[Link]
Posted May 2, 2020 4:20 UTC (Sat)
by NYKevin (subscriber, #129325)
[Link] (7 responses)
One of the more interesting suggestions in that thread, from Chris Angelico:
My solution to that has usually been something along the lines of:
Will often be custom-tweaked to the situation, but the basic idea is the same.
My 2 cents: I would much prefer forward slashes as the separator (by analogy with filesystem paths), but otherwise that looks quite reasonable to me. It also solves the foo["names with multiple words aren't valid identifiers"] problem. And as an extra bonus, this requires no changes to core Python or the standard library, and can traverse any dict-like object that supports __getitem__(), rather than being a class in its own right (no ugly multiple inheritance if you want to combine functionality with another mapping type).
Posted May 2, 2020 7:12 UTC (Sat)
by dtlin (subscriber, #36537)
[Link] (4 responses)
Seems to me that it makes more sense to keep splitting the responsibility of the caller:
The caller should know what an appropriate separator is, and could even build the path up from multiple parts split in different ways if that's appropriate.
Although that reminds me of how convenient Perl's
Posted May 2, 2020 8:47 UTC (Sat)
by smurf (subscriber, #17840)
[Link] (1 responses)
One more character and it works today.
w/'hello world'
Writing the three-line singleton object 'w', with an appropriate dunder method, is left as an exercise to the reader.
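For readers who would rather not do the exercise, one possible answer is sketched below. The class name _Words is made up, and overloading the division operator this way is only an illustration of the trick, not a recommendation.

```python
class _Words:
    """Hypothetical stand-in for a w-string: w/'a b' splits on whitespace."""

    def __truediv__(self, text):
        return text.split()


w = _Words()

print(w/'hello world')  # ['hello', 'world']
```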
Posted May 2, 2020 10:30 UTC (Sat)
by dtlin (subscriber, #36537)
[Link]
Yeah, that would be simple, but precedence doesn't work out entirely in our favor:
would result in
Posted May 3, 2020 14:35 UTC (Sun)
by kleptog (subscriber, #1183)
[Link] (1 responses)
I've often ended up coding methods to do this, but it'd be cool if there was something standard.
Posted May 7, 2020 5:45 UTC (Thu)
by njs (subscriber, #40338)
[Link]
Posted May 2, 2020 8:03 UTC (Sat)
by PhilippWendler (subscriber, #126612)
[Link] (1 responses)
Posted May 3, 2020 11:24 UTC (Sun)
by mathstuf (subscriber, #69389)
[Link]
Posted May 1, 2020 1:18 UTC (Fri)
by moxfyre (guest, #13847)
[Link]
In my vpn-slice and wtf utilities, I have long used an even simpler version of SimpleNamespace, dubbed slurpy:
It's very simple and performs well for a pure-Python implementation, and it even throws KeyError or AttributeError appropriately so that callers/REPLs don't get confused by the “wrong” kind of exception.
(See https://github.com/dlenski/vpn-slice/blob/HEAD/vpn_slice/util.py#L13-L22 and https://github.com/dlenski/wtf/blob/HEAD/wtf.py#L10-L18 for some context as to how this is useful.)
Posted Apr 30, 2020 7:27 UTC (Thu)
by LtWorf (subscriber, #124958)
[Link] (3 responses)
@dataclass
typedload.load(data, A)
It tells mypy the output type and does the runtime checks to make sure the type is actually correct; the exceptions offer a way to figure out exactly where in the data the error happened.
It has a number of options, for example to disallow unknown fields in the dictionaries that are not in the classes, and it allows defining custom functions to load into whatever type.
Personally I'd rather be able to access fields that exist for sure than expect exceptions to happen all over the place.
Posted May 1, 2020 9:52 UTC (Fri)
by tamasrepus (subscriber, #33205)
[Link] (2 responses)
Posted May 1, 2020 15:01 UTC (Fri)
by LtWorf (subscriber, #124958)
[Link]
I'd trust mine more because it has tests running on all the python versions that are supported and on mypy.
jsons doesn't seem to use mypy, because at a casual glance I found some typing errors.
jsons supports more types, but typedload has better code for unions and is more customisable.
Pretty similar but jsons is MIT licensed and typedload is GPL3 so I guess I lose on the license. Which is pretty deliberate because I don't want my free time work to be used for free by proprietary projects.
Ah, mine already exists in Debian, so that's a slight advantage.
Posted Aug 7, 2022 21:48 UTC (Sun)
by LtWorf (subscriber, #124958)
[Link]
It seems to be very low quality. For example loading 1.1 into a Union[int, float] returns 1, which is obviously wrong.
It's also 10x to 40x slower than typedload.
Despite this it has 8x more downloads :)
>>> ham = json.loads(r'{"eggs": 5}')
>>> ham['eggs'] * 1000000000 # Convert seconds to nanoseconds
5000000000
>>> spam = json.loads(r'{"eggs": [1, 2, 3]}')
>>> spam['eggs'] * 1000000000 # Convert tiny list to MemoryError
That's an interesting idea. Using something like obj!abc!def!ghi to mean obj['abc']['def']['ghi'] wouldn't break any existing code. Still feels pretty foreign to Python though.
Readability counts.
Now is better than never.
There should be one-- and preferably only one --obvious way to do it.
obj..abc..def..ghi. IMO obj['abc']['def']['ghi'] already scores reasonably well along simple, readable, and now measures, so a proposal should be substantially better.
Typing costs of non-English keyboard layouts in programming languages
This is the reason why I use "hunglish" layout: English layout, extra Hungarian characters available by AltGr (mostly) on the right side of the keyboard (e.g. 0-=[];'\), so it's fairly easy to type them. I never understood people who can use Hungarian layout for programming...
BTW Elixir has maps (the usual associative arrays) in the language. It also has structs, but those are implemented using maps with a special __struct__ field containing the type, and the field names of the structs are keys in the map (as atoms). The generic syntax for accessing maps is the usual map[key], while for structs it is struct.attribute. So far so good, it's probably what people from other languages expect. However, for some reason if the keys of a map are atoms, the "struct syntax" also works - which sometimes drives me nuts as I can't tell by looking at code like this:
some.thing
iex(8)> m3 = %{:"a b" => "c"}
%{"a b": "c"}
m3. followed by TAB, the shell helpfully extends the field name, so I get
iex(9)> m3.a b
** (CompileError) iex:9: undefined function b/0
m3."a b" or m3[:"a b"] works.
evdev:input:b0011v0001p0001eAB41-*:
 KEYBOARD_KEY_70039=backspace
Yes, same issue in Norwegian, both {} and [] require AltGr. But when iso8859-1 emerged, it was still an enormous relief compared to back in the 7 bit ascii days when a construct like
would have to be written as the less readable 'hØaaplØost'
def get(obj, path):
    for step in path.split("-"):
        obj = obj[step]
    return obj

print(get(catalog, 'clothing-mens-shoes-extra_wide-quantity'))
import functools
import operator

def deep_getitem(obj, *path):
    # Apply obj[step] for each step of the path, left to right.
    return functools.reduce(operator.getitem, path, obj)

deep_getitem(catalog, *'clothing mens shoes extra_wide quantity'.split())
qw(...) and Ruby's %w[...] are. I wonder if there might be interest in some hypothetical w-string in Python, such that
w'hello world' == ["hello", "world"]
w/'hello world'[0]
"h" instead of "hello".
When dealing with complex data structures from a database or client it's useful to have a kind of "deep get" like you have, but one that gracefully handles missing entries well. Otherwise you end up with horrors like:
a.get('foo', {}).get('bar', {}).get('baz', None)
Ideally you'd like something that also handled arrays, but Python doesn't have an easy way to index an array with a default when you go off the end.
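A minimal sketch of such a "deep get", assuming that any failed lookup along the path should simply yield a default, might look like the following. The name deep_get and the sample data are invented for illustration; the function walks both mappings and sequences.

```python
def deep_get(obj, *path, default=None):
    """Follow `path` through nested dicts and lists; return `default` on a miss."""
    for step in path:
        try:
            obj = obj[step]
        except (KeyError, IndexError, TypeError):
            # KeyError: missing dict key; IndexError: off the end of a list;
            # TypeError: the current value (e.g. None or an int) is not indexable.
            return default
    return obj


a = {'foo': {'bar': [{'baz': 1}]}}

print(deep_get(a, 'foo', 'bar', 0, 'baz'))             # 1
print(deep_get(a, 'foo', 'nope', 'baz'))               # None
print(deep_get(a, 'foo', 'bar', 5, 'baz', default=0))  # 0
```

One consequence of this design is that a stored None is indistinguishable from a missing entry unless the caller passes a different default.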
I agree. It's a matter of convenient access, whether you're populating contents “on the fly” (more a use case for dict) or with a relatively small set of fixed names (more a use case for a custom class or attrs).
# Quacks like a dict and an object
class slurpy(dict):
    def __getattr__(self, k):
        try:
            return self[k]
        except KeyError as e:
            raise AttributeError(*e.args)
    def __setattr__(self, k, v):
        self[k]=v
This allows you to create an object like d = slurpy(foo="bar", baz=1) and then refer to any of its members/contents either by member access (.foo) or by item access (["foo"]).
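A short usage sketch follows, with the class repeated so that the snippet runs on its own. The values are arbitrary, and the json.loads() call is an added illustration of how such a class could serve as an object_hook; it is not taken from the utilities mentioned above.

```python
import json


# Quacks like a dict and an object (repeated from the comment above)
class slurpy(dict):
    def __getattr__(self, k):
        try:
            return self[k]
        except KeyError as e:
            raise AttributeError(*e.args)

    def __setattr__(self, k, v):
        self[k] = v


d = slurpy(foo="bar", baz=1)
print(d.foo, d["foo"])  # both styles reach the same entry

d.new = 2               # attribute assignment lands in the dict
print(d["new"])

try:
    d.missing
except AttributeError as exc:
    print("AttributeError:", exc)

# Because slurpy accepts a dict, it can be used directly as an object_hook.
catalog = json.loads('{"clothing": {"shoes": {"quantity": 3}}}', object_hook=slurpy)
print(catalog.clothing["shoes"].quantity)
```

Note that the class defines no __delattr__, so entries are removed with del d["foo"] rather than del d.foo.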
class A:
    a: int
    b: List[str] = field(default_factory=list)