Re: Impact of Namedtuple on startup time

From:  Raymond Hettinger <raymond.hettinger-AT-gmail.com>
To:  Antoine Pitrou <antoine-AT-python.org>
Subject:  Re: Impact of Namedtuple on startup time
Date:  Mon, 17 Jul 2017 07:59:51 -0700
Message-ID:  <E2D91FB4-AF77-4FF5-A588-88BDFB49BB9A@gmail.com>
Cc:  "Python-Dev-AT-Python. Org" <python-dev-AT-python.org>


> On Jul 17, 2017, at 6:31 AM, Antoine Pitrou <antoine@python.org> wrote:
> 
>> I think I understand well enough to say something intelligent…
>> 
>> While actual references to _source are likely rare (certainly I’ve never
>> used it), my understanding is that the way namedtuple works is to
>> construct _source, and then exec it to create the class. Once that is
>> done, there is no significant saving to be had by throwing away the
>> constructed _source value.

There are considerable benefits to namedtuple being able to generate and match its own source.

* It makes it really easy for a user to generate the code, drop it into another module,
and customize it.

* It makes the named tuple factory function completely self-documenting. 

* The verbose/_source option teaches you exactly what named tuple does.  That makes the tool
relatively easy to learn, understand, and debug.
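
As a minimal sketch of what inspecting the generated code looks like, assuming an interpreter whose named tuple classes expose the _source attribute discussed in this thread. The Point class and the getattr fallback are illustrative only; the fallback keeps the snippet from failing on an interpreter where the attribute is absent.

```python
from collections import namedtuple

# Illustrative class; any named tuple would do.
Point = namedtuple("Point", ["x", "y"])

# Assumption: _source is available on the interpreter in use.
# getattr() with a default avoids an AttributeError if it is not.
source = getattr(Point, "_source", None)

if source is not None:
    # The text printed here is the class definition that was exec'd.
    print(source)
else:
    print("Point._source is not available on this interpreter")
```

Where the attribute is present, the printed text can be pasted into a module and edited like any other class definition.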

I really don't want to throw away these benefits to save a couple of milliseconds.   As Nick
Coghlan recently posted, "Speed isn't everything, and it certainly isn't adequate justification for
breaking public APIs that have been around for years."

FWIW, the template/exec implementation has had excellent benefits for maintainability, making it
very easy to fix and update.  As other parts of Python have changed (limitations on number of
arguments, what is allowed as an identifier, etc.), it mostly automatically stays in sync with the
rest of the language.
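
The template/exec approach can be sketched roughly as follows. This is a simplified illustration, not the code in the collections module: make_record, _TEMPLATE, and _PROPERTY are hypothetical names, the template is much smaller than the real one, and it assumes at least one field.

```python
import keyword

# Hypothetical, heavily simplified class template.
_TEMPLATE = """\
class {typename}(tuple):
    '{typename}({arg_list})'
    __slots__ = ()
    _fields = {field_names!r}

    def __new__(cls, {arg_list}):
        return tuple.__new__(cls, ({arg_list},))

    def __repr__(self):
        return '{typename}(' + ', '.join(
            f + '=' + repr(v) for f, v in zip(self._fields, self)) + ')'
{properties}
"""

# One read-only property per field, indented to sit in the class body.
_PROPERTY = """\
    {name} = property(lambda self: self[{index}])
"""


def make_record(typename, field_names):
    """Build a tuple subclass by filling in a template and exec'ing it."""
    field_names = tuple(field_names)
    # Reject anything that is not a plain identifier before it reaches exec().
    for name in (typename,) + field_names:
        if not name.isidentifier() or keyword.iskeyword(name):
            raise ValueError("not a valid identifier: %r" % (name,))
    source = _TEMPLATE.format(
        typename=typename,
        field_names=field_names,
        arg_list=", ".join(field_names),
        properties="".join(
            _PROPERTY.format(name=n, index=i)
            for i, n in enumerate(field_names)),
    )
    namespace = {"__name__": "record_" + typename}
    exec(source, namespace)      # the compiler parses the generated class
    cls = namespace[typename]
    cls._source = source         # keep the generated text for inspection
    return cls


Point = make_record("Point", ["x", "y"])
p = Point(1, 2)
print(p, p.x, p.y)
```

Because the generated text goes through the ordinary compiler, rules about argument lists and identifiers are enforced by the language itself rather than re-implemented by hand.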

ISTM this issue is being pressed by micro-optimizers who are being very aggressive and not
responding to actual user needs (it is more an invented issue than a real one).  Named tuple has
been around for a long time and users have been somewhat happy with it.

If someone truly cares about the exec time for a particular named tuple, the _source option makes
it trivially easy to just replace the generator call with the expanded code in that particular
circumstance.
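
For illustration, replacing the factory call with expanded code might look like the abridged class below. It is hand-written in the style of the generated source, not verbatim output, and it omits helper methods such as _make, _replace, and _asdict; Point and its fields are assumed names.

```python
from operator import itemgetter


# Stands in for: Point = namedtuple('Point', ['x', 'y'])
class Point(tuple):
    'Point(x, y)'

    __slots__ = ()            # no per-instance __dict__
    _fields = ('x', 'y')

    def __new__(cls, x, y):
        'Create a new instance of Point(x, y)'
        return tuple.__new__(cls, (x, y))

    def __repr__(self):
        'Return a nicely formatted representation string'
        return 'Point(x=%r, y=%r)' % self

    # Field access by name, backed by tuple indexing.
    x = property(itemgetter(0), doc='Alias for field number 0')
    y = property(itemgetter(1), doc='Alias for field number 1')


p = Point(3, 4)
print(p, p.x + p.y)
```

A class written out this way is compiled with the rest of the module, so no exec call is made when the module is imported.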


Raymond


P.S. I'm fully supportive of Victor's efforts to build out structseq to make it sufficiently
expressive to do more of what collections.namedtuple() does.  That is a perfectly reasonable path
to optimization. We've wanted that for a long time and no one has had the spare clock cycles to
make it come true.

  
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/python...




Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds