Native Python support for units?

By Jake Edge
July 12, 2022

Back in April, there was an interesting discussion on the python-ideas mailing list that started as a query about adding support for custom literals, a la C++, but branched off from there. Custom literals are frequently used for handling units and unit conversion in C++, so the Python discussion fairly quickly focused on that use case. While ideas about a possible feature were batted about, it does not seem like anything that is being pursued in earnest, at least at this point. But some of the facets of the problem are, perhaps surprisingly, more complex than might be guessed.

Custom literals

On April 1, Will Bradley posted a presumably non-joking query about custom literal support for Python: "has this been considered and rejected, or is there a reason it's unpopular?" According to Stephen J. Turnbull, user-defined syntax for Python has been a hard sell, in general, though literal syntax for units (e.g. "10m" for meters or "1.2kW" for kilowatts) has gotten somewhat further. In addition, the idea of adding a literal syntax for fixed-point constants that use the decimal package also crops up with some frequency, he said. His recollection is that "in general Python has rejected user-defined syntax on the grounds that it makes the language harder to parse both for the compiler and for human beings".

Brian McCall warned that he was getting up on his soap box, but said that he strongly supported Python (and, indeed, all programming languages) having native units. "It's not often that I would say that C++ is easier to read or more WYSIWYG than Python, but in this case, C++ is clearly well ahead of Python." He suggested that "lack of native language support for SI units" is particularly problematic because of the prevalence of scientific computing today. SI units are the international system of units, which are often referred to as the metric system.

Anyone who has ever dealt with units will immediately recognize a problem associated with enshrining only the SI units into Python (or anywhere else): other measurement systems exist, including imperial and US units and even other metric systems. As Ricky Teachey put it:

BUT- SI units isn't enough. Engineers in the US and Canada (I have many colleagues in Canada and when I ask they always say: we pretend to use SI but we don't) have all kinds of units.
Give us native, customizable units, or give us death! Who's with me??!!

Units

Beyond the conflict over measuring-system support, there are other fundamental questions that need to be resolved before new syntax could be added, Chris Angelico said. The number "4K" might mean four kelvins, 4000, or 4096, depending on context, for example. Angelico suggested that an existing bit of syntax could be repurposed for units:

But I would very much like to see a measure of language support for "number with alphabetic tag", without giving it any semantic meaning whatsoever. Python currently has precisely one such tag, and one conflicting piece of syntax: "10j" means "complex(imag=10)", and "10e1" means "100.0". (They can of course be combined, 10e1j does indeed mean 100*sqrt(-1).)
[...] In Python, I think it'd make sense to syntactically accept *any* suffix, and then have a run-time translation table that can have anything registered; if you use a suffix that isn't registered, it's a run-time error. Something like this:
    import sys
    # sys.register_numeric_suffix("j", lambda n: complex(imag=n))
    sys.register_numeric_suffix("m", lambda n: unit(n, "meter"))
    sys.register_numeric_suffix("mol", lambda n: unit(n, "mole"))
[...] Using it would look something like this:
    def spread():
        """Calculate the thickness of avocado when spread on
        a single slice of bread"""
        qty = 1.5mol
        area = 200mm * 200mm
        return qty / area
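
Since today's Python cannot parse a literal like "1.5mol", the run-time lookup that Angelico describes can only be approximated on strings. What follows is a minimal sketch under that assumption: register_numeric_suffix() and parse_literal() are hypothetical names made up for illustration (no such function exists in the sys module), and the "unit" values are plain tuples rather than a real unit type.

```python
import re

# Hypothetical stand-in for the proposed sys.register_numeric_suffix();
# nothing like this exists in Python today.
_SUFFIXES = {}


def register_numeric_suffix(suffix, factory):
    """Associate an alphabetic suffix with a callable that builds a value."""
    _SUFFIXES[suffix] = factory


# A number (digits, optional fraction) followed by a purely alphabetic tag.
_LITERAL = re.compile(r"^([0-9]+(?:\.[0-9]+)?)([A-Za-z]+)$")


def parse_literal(text):
    """Translate text such as '1.5mol' through the suffix table."""
    match = _LITERAL.match(text)
    if match is None:
        raise ValueError(f"not a suffixed number: {text!r}")
    number, suffix = match.groups()
    try:
        factory = _SUFFIXES[suffix]
    except KeyError:
        # An unregistered suffix is a run-time error, as in the proposal.
        raise LookupError(f"unregistered numeric suffix: {suffix!r}") from None
    return factory(float(number))


register_numeric_suffix("m", lambda n: (n, "meter"))
register_numeric_suffix("mol", lambda n: (n, "mole"))

print(parse_literal("1.5mol"))
```

In this sketch, a tag such as "4K" parses syntactically but fails at lookup time until someone registers "K", which is the point at which the kelvin-versus-kilo question would have to be answered.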

Greg Ewing objected to the idea of a global registry since two libraries might want to use the same suffix for different units. But there are other reasons modules might need their own definitions, as Steven D'Aprano pointed out. There are multiple units called a "mile" (e.g. Roman, international, nautical, US survey, imperial, ...), for one thing, but even the definitions of the "same" unit may have changed over time: "a kilometre in 1920 is not the same as a kilometre in 2020, and applications that care about high precision may care about the difference". He thinks that units should be scoped like variables or, at least, have a separate per-module namespace where libraries can register their own units if they wish to.

Angelico did not agree, at least in part because he did not see the need to make things more complex for what he sees as a rare use case. He seemed to be the only participant in the thread who thought that way, however, as D'Aprano, Ewing, and others were all fairly adamant that some kind of unit namespace would be needed. Though the reply is perhaps a bit on the rude side, D'Aprano put it this way:

Units are *values* that are used in calculations, not application wide settings. The idea that libraries shouldn't use their own units is as silly as the idea that libraries shouldn't use their own variables.
Units are not classes, but they are sort of like them. You wouldn't insist on a single, interpreter wide database of classes, or claim that "libraries shouldn't create their own classes".

Other possibilities

Ewing suggested a solution that could perhaps be done with no (or minimal) syntax changes:

Treating units as ordinary names looked up as usual would be the simplest thing to do.
If you really want units to be in a separate namespace, I think it would have to be per-module, with some variant of the import statement for getting things into it.
    from units.si import units *
    from units.imperial import units inch, ft, mile
    from units.nautical import units mile as nm

Teachey had mentioned the Pint library in his reply. It is perhaps the most popular Python Package Index (PyPI) library for working with units. Pint comes with a whole raft of units, which can be easily combined in various ways. For example:

    >>> import pint
    >>> ureg = pint.UnitRegistry()
    >>> speed = 17 * ureg.furlongs / ureg.fortnight
    >>> speed
    <Quantity(17.0, 'furlong / fortnight')>
    >>> speed.to('millimeter/second')
    <Quantity(2.82726756, 'millimeter / second')>
    >>> d = 1 * ureg.furlong
    >>> d.to('feet')
    <Quantity(660.00132, 'foot')>
    >>> d.to('mile')
    <Quantity(0.12500025, 'mile')>

At least on my system, Pint seems to have a slightly inaccurate value for a furlong, which is defined as 1/8 mile, or 660 feet; a fortnight is, of course, two weeks or 14 days. That oddity aside, Pint has much of the functionality users might want, but it (and other Python unit-handling libraries) have "so many shortfalls", Teachey said, mostly because they are not specified and used like real-world units are. Beyond Python libraries, the venerable Unix units utility has similar capabilities and can be used directly from the command line (its man page is where the classic "furlongs/fortnight" example comes from). As D'Aprano noted, units has over 3000 different units, which can be combined in a truly enormous number of ways.
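
The Pint numbers above can be checked with plain arithmetic. The sketch below is not taken from the discussion; it assumes, as one possible explanation, that the furlong in question is built on the US survey foot (exactly 1200/3937 meter) rather than the international foot (exactly 0.3048 meter). The function and constant names are made up for illustration.

```python
from fractions import Fraction

# Exact definitions, in metres.
INTERNATIONAL_FOOT = Fraction(3048, 10000)  # 0.3048 m
US_SURVEY_FOOT = Fraction(1200, 3937)       # about 0.3048006 m

FEET_PER_FURLONG = 660
SECONDS_PER_FORTNIGHT = 14 * 24 * 60 * 60


def furlongs_per_fortnight_to_mm_per_s(speed, foot):
    """Convert furlongs/fortnight to millimetres/second for a given foot."""
    millimetres = speed * FEET_PER_FURLONG * foot * 1000
    return float(millimetres / SECONDS_PER_FORTNIGHT)


# A survey-foot furlong, measured in international feet.
survey_furlong_in_feet = float(
    FEET_PER_FURLONG * US_SURVEY_FOOT / INTERNATIONAL_FOOT
)

print(round(survey_furlong_in_feet, 5))
print(furlongs_per_fortnight_to_mm_per_s(17, INTERNATIONAL_FOOT))
print(furlongs_per_fortnight_to_mm_per_s(17, US_SURVEY_FOOT))
```

Under that assumption, the survey-foot furlong comes out two parts per million longer than 660 international feet, which would account for the trailing digits in the 660.00132 and 0.12500025 results.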

Ethan Furman started a new thread from Teachey's message in order to focus specifically on native support for units. He floated his own suggestion for new syntax:

Well, if we're spit-balling ideas, what about:
    63_lbs
or
    77_km/hr
? Variables cannot start with a number, so there'd be no ambiguity there; we started allowing underbars for separating digits a few versions ago, so there is some precedent.

Teachey wondered about the behavior of the "simple tags" being suggested for units:

[...] What should the behavior of this be?
    height = 5ft + 4.5in
Surely we ought to be able to add these values. But what should the resulting tag be?

He also wondered if a more natural-language-like formulation (e.g. 5ft 4.5in) should be supported. Overall, he thinks that figuring out a solution for units in Python would be a "massive contribution" to the engineering world, "but boy howdy is it a tough [nut] of a problem to crack". Angelico said that the "5ft 4.5in" syntax was a step too far in his mind, but using addition should work. "It's not that hard to say '5ft + 4.5in', just like you'd say '3 + 4j' for a complex number."
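
Teachey's question about the resulting tag comes down to a policy choice. One option, sketched below, is for the sum to keep the left operand's unit, which matches the behavior shown in the Pint examples in this article. The Length class and its conversion table are hypothetical and cover only three length units.

```python
from dataclasses import dataclass

# Conversion factors to metres; an illustrative subset only.
_TO_METRES = {"ft": 0.3048, "in": 0.0254, "m": 1.0}


@dataclass(frozen=True)
class Length:
    value: float
    unit: str

    def to(self, unit):
        """Return the same length expressed in another unit."""
        metres = self.value * _TO_METRES[self.unit]
        return Length(metres / _TO_METRES[unit], unit)

    def __add__(self, other):
        if not isinstance(other, Length):
            return NotImplemented
        # Policy choice: the result keeps the left operand's unit.
        return Length(self.value + other.to(self.unit).value, self.unit)


height = Length(5, "ft") + Length(4.5, "in")
print(height)            # a length in feet
print(height.to("in"))   # the same length in inches
```

Other policies are equally defensible, such as always converting to a base unit or to the smaller of the two units, which is part of why the question is harder than it first looks.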

Angelico went on to describe the benefits of the syntax change over simply defining constants for units, as Ewing suggested. Since, for example, "m" would only be valid as a unit when it was used as a suffix, it would not pollute the namespace for using "m" as a variable. It is also more readable:

If this were accepted, I would fully expect that libraries like pint would adopt it, so this example:
    >>> 3 * ureg.meter + 4 * ureg.cm
    <Quantity(3.04, 'meter')>
could look like this:
    >>> 3m + 4cm
    <Quantity(3.04, 'meter')>
with everything behaving the exact same after that point. Which would YOU prefer to write in your source code, assuming they have the same run-time behaviour?

But Ken Kundert said that units are primarily useful on input and output, not in the calculations within programs and libraries.

The idea that one carries units on variables interior to a program, and that those units are checked for all interior calculations, is naive. Doing such thing adds unnecessary and often undesired complexity.

His QuantiPhy library provides a means for "reading and writing physical quantities". It effectively adds units as an attribute to Python float values so that they can be used when converting the value to a string. But he said that it might make sense to incorporate scale factors and units into Python itself for readability purposes:

For example, consider the following three versions of the same line of code:
    virt /= 1048576
    virt /= 1.048576e6
    virt /= 1MiB
The last is the easiest to read and the least ambiguous. Using the units and scale factor on the scaling constant results in an easy to read line that makes it clear what is intended.
Notice that in this case the program does not use the specified units, rather the act of specifying the units clarifies the programmers intent and reduces the chance of misunderstandings or error when the code is modified by later programmers.
But this suggests that it is not necessary for Python to interpret the units. The most it needs do is to save the units as an attribute so that it is available if needed later.
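
The "save the units as an attribute" idea can be approximated today with a float subclass. The sketch below is only an illustration of that idea and is not QuantiPhy's actual API; the Tagged class and the MiB constant are hypothetical names.

```python
class Tagged(float):
    """A float that remembers a unit string but never interprets it."""

    def __new__(cls, value, units=""):
        self = super().__new__(cls, value)
        self.units = units  # stored for display only
        return self

    def __str__(self):
        return f"{float(self)!r} {self.units}".strip()


# A scale factor with a readable name and a unit tag.
MiB = Tagged(1024 ** 2, "B")

virt = 3_221_225_472  # a size in bytes
virt /= MiB           # reads much like the proposed "virt /= 1MiB"

print(MiB)
print(virt, type(virt).__name__)
```

Because the arithmetic falls through to ordinary float division, the result is a plain float and the tag is dropped, which is consistent with Kundert's point that the units serve the reader rather than the computation.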

Library deficiencies?

Beyond just Pint and QuantiPhy, the units module was also mentioned in the thread, so Paul Moore wondered why none of those solutions was acceptable. He pointed out that the @ matrix-multiplication operator was added to the language because of arguments from the NumPy community; "language changes are *more likely* based on a thriving community of library users, so starting with a library is a positive way of arguing for core changes". Turnbull echoed that and also wondered if the typing features could be harnessed to help solve the problem.

In a lengthy message, McCall tried to answer Moore's question, though it is not clear that he really changed any minds. He laid out a complicated calculation, with many different units, and showed how it looked using various existing libraries and the syntax proposed by Angelico; for each he listed a set of pros and cons. One could perhaps quibble with his analysis, but that is not really the point, Moore said; what it shows is that "the existing library solutions might not be ideal, but they do broadly address the requirement". Each has its own pain points, so:

Maybe that suggests that there's room for a unified library that takes the best ideas from all of the existing ones, and pulls them together into something that subject experts like yourself *would* be happy with (within the constraints of the existing language). And if new syntax is a clear win even with such a library, then designing a language feature that enables better syntax for that library would still be possible (and there would be a clear use case for it, making the arguments easier to make).

A somewhat late entrant into the syntax derby (though others had shown similar constructs along the way) came from Matt del Valle who suggested that the numeric types (e.g. int, float) could gain units by way of Python's subscript notation, which might look something like:

    from units.si import km, m, N, Pa

    3[km] + 4[m] == 3004[m]  # True
    5[N]/1[m**2] == 5[Pa]  # True
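
In current Python, an expression like 3[km] fails at run time because int does not support subscripting, so the notation can only be approximated with a wrapper object. The sketch below does that with a hypothetical Q class whose __getitem__() attaches the unit; it handles lengths only, and none of the names come from the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    name: str
    metres: float  # size of one unit, in metres


@dataclass(frozen=True)
class Quantity:
    metres: float  # every length is stored in metres

    def __add__(self, other):
        if not isinstance(other, Quantity):
            return NotImplemented
        return Quantity(self.metres + other.metres)


class Q:
    """Wrap a number so that subscripting it attaches a unit."""

    def __init__(self, number):
        self.number = number

    def __getitem__(self, unit):
        return Quantity(self.number * unit.metres)


km = Unit("km", 1000.0)
m = Unit("m", 1.0)

print(Q(3)[km] + Q(4)[m] == Q(3004)[m])
```

The extra Q(...) wrapper is exactly the noise that a language-level `number[annotation]` syntax would remove.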

Moore thought that looked like a plausible syntax, but reiterated his belief that any change would necessarily need to come by way of a library that supporters of "units for Python" developed—and rallied around. That can all be done now, without any need for a PEP or core developer support. After that, a language change could be proposed if it made sense to do so:

Once that library has demonstrated its popularity, someone writes a PEP suggesting that the language adds support for the syntax `number[annotation]` that can be customised by user code. This would be very similar in principle to the PEP for the matrix multiplication @ operator - a popular 3rd party library demonstrates that a well-focused language change, designed to be generally useful, can significantly improve the UI of the library in a way which would be natural for that library's users (while still being general enough to allow others to experiment with the feature as well).
[...] But the library would be useful even if this doesn't happen (and conversely, if the library proves *not* to be useful, it demonstrates that the language change wouldn't actually be as valuable as people had hoped).
[...] So honestly, I'd encourage interested users to get on with implementing the library of their dreams. By all means look ahead to how language syntax improvements might help you, but don't let that stop you getting something useful working right now.

Those who want to try out different syntax changes without actually having to hack on the CPython interpreter directly may be interested in the ideas library. André Roberge, who developed the library, suggested using it as a way to prototype the changes. Ideas modifies the abstract syntax tree (AST) on the fly to enable changes to the input before handing it off to CPython. In another message, he noted that he had implemented the subscript notation for ideas so that it could be tested using Pint or astropy.units.

So far at least, it does not seem like there is a groundswell of activity toward yet another library for units, but one focused in the way that Moore suggested could lead to changes to the language. It may be that the disparate ideas of what unit support would actually mean—and how it would be used—make it hard to coalesce around a single solution. It may also be that the need for additional solutions for Python unit handling is not as pressing as some think. It seems likely that the idea will crop up again, however, so proponents may well want to consider Moore's advice and come up with a unified library before pursuing language changes.

Native Python support for units?

Posted Jul 12, 2022 22:34 UTC (Tue) by logang (subscriber, #127618) [Link]

Units would be nice but seems too ambitious.

What I'd really like to see is simply a standard library way to print and parse both SI and binary prefixes.

I feel like I've had to rewrite that kind of thing dozens of times -- I can never justify adding a dependency for that, especially for simple scripts that are meant to be portable.

So many command line tools support taking sizes with a k/M/G suffix, it would be nice if argparse just had that option built-in as well as an easy way to print numbers with suffixes using the .format syntax.

Native Python support for units?

Posted Jul 13, 2022 0:23 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]

Just rename Python to Perl.

SI ≠ metric

Posted Jul 13, 2022 4:46 UTC (Wed) by rsidd (subscriber, #2582) [Link] (23 responses)

> SI units are the international system of units, which are often referred to as the metric system.

This is disappointing sloppiness from LWN. Metre, centimetre, kilometre, gram, milligram, kilogram are all part of the metric system. SI units are specifically metre for length, kilogram for weight, second for time (second isn't part of the metric system), etc.

SI ≠ metric

Posted Jul 13, 2022 5:28 UTC (Wed) by k8to (guest, #15413) [Link]

Ehh. "Metric system" can mean many things across time, but in modern practicality, the SI form is what is meant.

SI ≠ metric

Posted Jul 13, 2022 8:01 UTC (Wed) by pbonzini (subscriber, #60935) [Link] (19 responses)

> second isn't part of the metric system

Are you sure? My understanding is that metric systems are systems like MKS, CGS etc., all of which include the second. The French revolution indeed introduced only the meter and the kilogram, but that's because the second was in use even before. In fact the first attempt of the definition of a meter was "the length of a pendulum that oscillates every second".

SI ≠ metric

Posted Jul 13, 2022 8:24 UTC (Wed) by smurf (subscriber, #17840) [Link] (18 responses)

In fact they also tried to decimalize the day's subdivisions, hence change the length of a second.

The problem with that idea was that meters and kilograms solved a real problem – it is impossible to determine how many different pounds and feet, or whatever they were called, existed at the time –, while everybody (… for some arbitrary value of "everybody" of course …) already agreed that the day has 24h / 60min / 60sec. Hence that attempt died.

French Republican time and dates

Posted Jul 13, 2022 13:33 UTC (Wed) by tialaramex (subscriber, #21167) [Link] (17 responses)

And they wanted to divide the year up differently too, although interestingly still into 12 months. Given the Roman months we have are stupid (December is the 12th month, really? Yes I know why that happened, it's still stupid) the French Republican calendar seems better, at least the names and regularity of month length - Fructidor in particular appeals to me.

Whether these changes work or not comes down to popularity. Time changes involved considerable infrastructure overhaul which the Revolutionaries could not get done. So you get a dispensation, OK, your town is down with the Republican agenda but you've got this massive town clock keeping time the old way, ordering you to dismantle the clock would turn you against us when we can't handle any more enemies, so, keep your clock "for now" and just do the other stuff nearer the top of our list. A year later, the Republic is looking a bit wobblier, your clock is still standing, and "Replace your clock" isn't even on the list any more.

For the calendar, you've got a problem because the Republican calendar requires the same trick as Lunar calendars popular in Islam and Judaism - somebody has to go look at the sky to check whether it's next year yet. Some people will insist that what matters is whether the appointed astronomer saw the requisite signs, others are content to rely on a model of the universe. The astronomer might not notice, and the model could be wrong. It's a mess.

What we do today is pretty stupid (the IERS astronomers add or remove seconds from UTC to approximate UT1, even though moment by moment we actually track TAI because unlike the Earth's spin it isn't varying) but still less so than the Republican calendar.

The Republicans also tried ten day weeks which blew up for a different reason. Workers used to a six on/ one off pattern were now given nine on/ one off. The revolutionaries said aha, but we give you half of the fifth day extra, and that's better - but the sort of person who can do arithmetic well enough to confirm that yes, strictly 1.5/10 is better than 1/7 is not usually a manual labourer, *half* a day off doesn't feel much like a day off at all, and so it lasted only a few weeks (of either kind) before being abandoned as unworkable.

French Republican time and dates

Posted Jul 13, 2022 16:57 UTC (Wed) by rgmoore (✭ supporter ✭, #75) [Link] (5 responses)

French Republican calendar seems better, at least the names and regularity of month length - Fructidor in particular appeals to me.

The month names are specific to the climate of France; I think they're actually specific to the climate of Paris, so people in Provence would disagree about, e.g. whether Nivôse was actually snowy. They certainly have nothing to do with the seasons here in Southern California, and I assume anyone in the Southern Hemisphere would object strenuously.

French Republican time and dates

Posted Jul 13, 2022 17:03 UTC (Wed) by pbonzini (subscriber, #60935) [Link] (4 responses)

Several Slavic countries name months in ways that resemble the French calendar. In Czech for example January is "ice", May is "blossom", June is "red" (don't ask why), August is "sickle", November is "falling leaves". Other months have similarly bucolic names whose translation I don't remember. I guess people don't care because Czech isn't widely spoken outside the Czech Republic.

French Republican time and dates

Posted Jul 13, 2022 20:35 UTC (Wed) by rgmoore (✭ supporter ✭, #75) [Link] (3 responses)

Some of the Roman month names came from the same kind of naming scheme. For example, March was the month devoted to Mars, the god of war, and it was the traditional start to the military campaigning season. June was named for Juno who, among other things, was the goddess of marriage, and June is still the most popular month for weddings. But if you want your calendar to be a truly universal one, it's probably better not to give your months names that are tightly bound to the seasonal happenings in the country where it started. If it does go on to be universal, it will be because those original meanings have become lost over time and people treat the names as if they're arbitrary.

Relating back to the units thing, one can think of the names of units. It's kind of fun that units are often named after scientists who did important work that's related to the thing they're measuring- Watt for power, Ampere for current, Kelvin for temperature, etc.- but plenty of people can use the units just fine without knowing that, say, Henri Becquerel is credited with discovering radioactivity. That's why we can mix units named for people with ones named for other things- meters, moles, and seconds- without too much problem, though it helps to understand why some are capitalized and others not.

French Republican time and dates

Posted Jul 13, 2022 21:02 UTC (Wed) by excors (subscriber, #95769) [Link] (2 responses)

> It's kind of fun that units are often named after scientists who did important work that's related to the thing they're measuring- Watt for power, Ampere for current, Kelvin for temperature, etc.- but plenty of people can use the units just fine without knowing that, say, Henri Becquerel is credited with discovering radioactivity. That's why we can mix units named for people with ones named for other things- meters, moles, and seconds- without too much problem, though it helps to understand why some are capitalized and others not.

I think your examples are wrong (although I might be misinterpreting your intent) - none of the unit names should be capitalised when written in full, so it should be watt, ampere, kelvin, becquerel, etc. (Except for "degrees Celsius" which should keep its C). It's the abbreviations that use capitals iff they're named after a person (W, A, K, Bq, etc) (except for litres which can be L or l for typographic reasons). (See e.g. https://www.bipm.org/utils/common/pdf/si-brochure/SI-Broc... section 5.2, 5.3)

French Republican time and dates

Posted Jul 14, 2022 18:15 UTC (Thu) by kpfleming (subscriber, #23250) [Link] (1 responses)

And because litre can be either 'l' or 'L', a group of us had quite a laugh at a restaurant whose after-dinner menu included '75ML' portions of popular aperitifs and fortified wines.

French Republican time and dates

Posted Jul 16, 2022 9:33 UTC (Sat) by edeloget (subscriber, #88392) [Link]

> And because litre can be either 'l' or 'L', a group of us had quite a laugh at a restaurant whose after-dinner menu included '75ML' portions of popular aperitifs and fortified wines.

"... and to finish your night with a rapid drink, here is a small pool of 4 megaliters of vodka..."

French Republican time and dates

Posted Jul 13, 2022 21:33 UTC (Wed) by Wol (subscriber, #4433) [Link] (10 responses)

> And they wanted to divide the year up differently too, although interestingly still into 12 months. Given the Roman months we have are stupid (December is the 12th month, really? Yes I know why that happened, it's still stupid) the French Republican calendar seems better, at least the names and regularity of month length - Fructidor in particular appeals to me.

Well, it's for exactly the same reason that the UK tax year is the 6th April. Until only 250 years ago, the 6th April (actually I think it's the 7th now) WAS New Year's Day, and December WAS the tenth month. We have Pope Gregory to thank for the modern mess.

250 years isn't really long enough to get it into people's heads that the name no longer reflects reality (actually, names rarely reflect reality at all :-)

Cheers,
Wol

French Republican time and dates

Posted Jul 13, 2022 23:09 UTC (Wed) by rgmoore (✭ supporter ✭, #75) [Link] (5 responses)

It's getting pretty deep in the weeds, but the Roman calendar started the year on 1 January even before Julius Caesar reformed it into the Julian system. So even though they had 12 months, the Romans called the last month in the calendar year 10th Month. Supposedly their original calendar had only 10 months, with a period of about 50 days that weren't part of any month, and January and February were added later. It's possible that the calendar originally started in March, with January and February being the 11th and 12th months, and it was only later switched to start in January, but the civil year starting in January was well established before the Julian reforms.

French Republican time and dates

Posted Jul 14, 2022 7:27 UTC (Thu) by Wol (subscriber, #4433) [Link] (4 responses)

> but the civil year starting in January was well established before the Julian reforms.

Of course, there had to be a good reason for Gregory to change the New Year, but ...

Why (and I've seen evidence of this) are many dates between 1550 and 1750 "double year" dates - when the two calendars ran in parallel - but there's no evidence of this before or after?

For evidence, Pepys diaries, and Derby Cathedral - I was a bit shocked at that, I saw a monument and it took me a while to realise *why* the date - something like "20 January 1740/41" - read like that.

Cheers,
Wol

French Republican time and dates

Posted Jul 14, 2022 10:53 UTC (Thu) by pbonzini (subscriber, #60935) [Link] (1 responses)

http://dpgi.unina.it/giudice/calendar/Adoption.html says Great Britain's adoption of the Gregorian calendar was in 1752 and at the same time the beginning of the year was changed from March 25 to January 1, commencing with the year 1752. So you might see double years between 1582 and 1752? And before 1582 they just didn't care?

French Republican time and dates

Posted Jul 16, 2022 13:03 UTC (Sat) by Wol (subscriber, #4433) [Link]

Well, before 1582 New Year's day was 25th March (the Spring Equinox - yes I know that's now March 25th) all over Europe, so it's not that they didn't care; it just wasn't the case. Jan and Feb and most of March were at the end of the year.

The reason they used both years between 1582 and 1752, was because Europe was on the New System, and we were on the old, so as soon as anything showed any hint of Internationalism you had to make sure it was clear which system you meant.

(Oh - as for the Roman calendar, January is named after Janus, the Roman God who looked both backwards at the old year, and forward to the new. So March was the first month of the year, December was the last, and yes what is now Jan and Feb was "winter".)

Cheers,
Wol

French Republican time and dates

Posted Jul 15, 2022 22:55 UTC (Fri) by NYKevin (subscriber, #129325) [Link] (1 responses)

If you look at Eastern European sources, you may see this going on for significantly longer, because Russia only adopted the Gregorian calendar in 1918 (and then proceeded to do some weird stuff with the work week, because USSR, but that's a different story). This is presumably a consequence of the Eastern Schism (the Gregorian reform was the pope's idea).

French Republican time and dates

Posted Jul 16, 2022 8:05 UTC (Sat) by Wol (subscriber, #4433) [Link]

Yup - which is why the October revolution happened in November ... :-)

Cheers,
Wol

French Republican time and dates

Posted Jul 14, 2022 9:14 UTC (Thu) by james (subscriber, #1325) [Link] (1 responses)

Actually, before Britain adopted the Gregorian calendar, the beginning of the tax and legal year was Lady Day (25 March, the Church feast celebrating the angel Gabriel telling the Virgin Mary that she was to conceive Jesus: nine months before Christmas).

When Britain adopted the Gregorian calendar in 1752, eleven days were "lost" to the calendar. The lawyers saw no reason why the tax year should remain fixed to the Church calendar, but some very good ones why the tax year should remain 365 or 366 days long. That moved it to 5 April.

Then 1800 would have been a leap year under the old Julian calendar, but not under the Gregorian calendar, and the Treasury pushed the start of the tax year to 6 April, to match Lady Day in the old Julian calendar.

French Republican time and dates

Posted Jul 18, 2022 15:21 UTC (Mon) by Wol (subscriber, #4433) [Link]

Which is why I said I thought it was now the 7th - because 1900 also wasn't a leap year in the new calendar ... :-)

I was a bit vague (as always) about 5th/6th/7th April being the Julian 25th March, but I didn't want to overcomplicate by spelling it all out ... and in another 80(ish) years it'll be the 8th April :-) Although I doubt the Treasury will bother moving the tax year.

Cheers,
Wol

French Republican time and dates

Posted Jul 14, 2022 9:36 UTC (Thu) by nowster (subscriber, #67) [Link]

Actually March 25 "Lady Day" was New Year in the British Isles, especially for financial purposes. Then the switch to the Gregorian calendar moved it later in the calendar by eleven days, and a further change moved it later by another day.

https://theconversation.com/why-the-uk-tax-year-begins-on...

The year number used to change on March 25th too, which makes historic date handling rather complex.

French Republican time and dates

Posted Jul 14, 2022 12:27 UTC (Thu) by nsheed (subscriber, #5151) [Link]

Aul Eel, Christmas Day, Old Style (6 Jan.)

is still well understood in my part of the world (N.E. Scotland).

SI ≠ metric

Posted Jul 13, 2022 15:12 UTC (Wed) by intelfx (subscriber, #130118) [Link] (1 responses)

> This is disappointing sloppiness from LWN. Metre, centimetre, kilometre, gram, milligram, kilogram are all part of the metric system

But the SI units system is indeed *often referred to* as the metric system. Even if it is not technically correct. So there's no sloppiness, the article says exactly what it wanted to say.

SI ≠ metric

Posted Jul 13, 2022 16:17 UTC (Wed) by pbonzini (subscriber, #60935) [Link]

After going down a metrology rabbit hole I think I am quite sure that there are many metric systems, one of them being the SI, but all of them share the units that were listed in the article. Where they vary is in the "size" of the unprefixed derived units (e.g. dyne vs newton or erg vs joule) and especially in the definition of electrical units.

Native Python support for units?

Posted Jul 13, 2022 5:40 UTC (Wed) by vulpicastor (subscriber, #122452) [Link]

The Astropy package, popular in the astronomy community, has an excellent and robust subpackage for dealing with units. It seems to me that it’s not too much of an imposition to write 1 * u.cm rather than 1cm, after one writes import astropy.units as u. Other than the one-letter variable u, it avoids the namespacing problem of having units implemented by incompatible libraries. There are also features provided by astropy.units that is imo difficult or impossible to express with custom literals.

Custom (numeric) literals don’t fully solve the problem of giving units to arrays, which is essential for many numerical computations. Having different syntax for literals with units and arrays with units seems inelegant. In addition, Astropy overloads the << operator to express the idea of attaching units to an array without copying, so my_array << u.kg is much faster than my_array * u.kg for large arrays. For the same reason, even for an array with literal values given in code, it might make more sense to write numpy.array([1, 2, 3]) << u.kg, which results in an implementation-defined wrapper class around the Numpy array with units, than numpy.array([1kg, 2kg, 3kg]), which results in a Numpy array of object pointers to quantities with units.

The composition of units is naturally done through the *, /, and ** operators. The unit of the Newtonian gravitational constant, for example, is kg⁻¹ m³ s⁻² in base SI units. This kind of semantics seems difficult to express flexibly and extensibly using a custom literal system, unless one defines a whole new syntax for composing literal suffixes that essentially duplicates the three *, /, and ** operators anyway. This is made even worse by the possibility of fractional powers of units. An example would be the unit of electrical charge in the non-SI Gaussian cgs system, which can be expressed as cm^(3/2) g^(1/2) s⁻¹.
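As a rough illustration of that point, here is a minimal sketch of unit composition using only the three operators and the standard library. The Unit class and its powers attribute are hypothetical names made up for this example; this is not how astropy.units is implemented.

```python
from fractions import Fraction


class Unit:
    """A product of unit symbols raised to rational powers (hypothetical)."""

    def __init__(self, powers):
        # Drop zero exponents so that equal units compare equal.
        self.powers = {s: Fraction(p) for s, p in powers.items() if p != 0}

    def __mul__(self, other):
        merged = dict(self.powers)
        for symbol, power in other.powers.items():
            merged[symbol] = merged.get(symbol, Fraction(0)) + power
        return Unit(merged)

    def __truediv__(self, other):
        return self * other ** -1

    def __pow__(self, exponent):
        exponent = Fraction(exponent)
        return Unit({s: p * exponent for s, p in self.powers.items()})

    def __eq__(self, other):
        return isinstance(other, Unit) and self.powers == other.powers

    def __repr__(self):
        return " ".join(f"{s}^{p}" for s, p in sorted(self.powers.items())) or "1"


kg, m, s = Unit({"kg": 1}), Unit({"m": 1}), Unit({"s": 1})
cm, g = Unit({"cm": 1}), Unit({"g": 1})

# Unit of the Newtonian gravitational constant in base SI units.
G_unit = m**3 / (kg * s**2)

# Unit of charge in Gaussian cgs, with fractional powers.
charge_unit = cm ** Fraction(3, 2) * g ** Fraction(1, 2) / s
```

Fractional exponents fall out of the ** operator for free here, which is the part that looks hard to reproduce with literal suffixes alone.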

I am sure there are other contexts that only require simple units with primitive numeric types where the custom literal approach might work out. But the approaches mentioned in the article, for reasons detailed above, seem inadequate for the current needs of scientific numerical computing.

Contexts

Posted Jul 13, 2022 6:05 UTC (Wed) by smurf (subscriber, #17840) [Link] (20 responses)

I wonder whether anybody in that discussion has thought of using contexts for specifying the meaning of suffixes.

There's some precedent for that, e.g. the way floats are handled (rounding, errors etc.).
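Presumably the precedent meant here is something like the standard library's decimal module, where an active context controls precision and rounding. That reading is an assumption, but a small sketch shows how such a context changes the meaning of the same code depending on where it is called from:

```python
from decimal import Decimal, localcontext


def third():
    # The result depends on whichever context is active when this runs,
    # not on where the function was defined.
    return Decimal(1) / Decimal(3)


with localcontext() as ctx:
    ctx.prec = 5
    short = third()

with localcontext() as ctx:
    ctx.prec = 10
    longer = third()
```

A suffix-to-unit mapping held in a context like this would behave the same way: the caller's context, not the source text, decides what a suffix means.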

Contexts

Posted Jul 13, 2022 6:43 UTC (Wed) by NYKevin (subscriber, #129325) [Link] (19 responses)

I dislike that, because the meaning of "1K" should be lexically scoped, not dynamically scoped.

Contexts

Posted Jul 14, 2022 7:30 UTC (Thu) by Wol (subscriber, #4433) [Link] (18 responses)

Which is why we have KiB, MiB, GiB etc.

Officially, we now have the Ki, Mi, Gi prefixes to mean base 2. Which of course now causes its own problems when I believe the official definition of "a gigabyte of disk" is now an MKiB?

Cheers,
Wol

Contexts

Posted Jul 14, 2022 10:09 UTC (Thu) by geert (subscriber, #98403) [Link] (17 responses)

I think you're mixing up with the "1.44 MB" floppy, where "MB" means "1000 * 1024 bytes"?

The packaging of the last hard drive I bought says "When referring to drive capacity, one gigabyte, or GB, equals one billion bytes and one terabyte, or TB, equals one thousand billion bytes." (some online variant used "trillion" instead of "thousand billion").

We can still debate the meaning of "billion" and "trillion" ;-)

Contexts

Posted Jul 14, 2022 15:32 UTC (Thu) by esemwy (guest, #83963) [Link] (4 responses)

I really wish American English used “milliard.” I just think it sounds cool.

Contexts

Posted Jul 28, 2022 16:35 UTC (Thu) by sammythesnake (guest, #17693) [Link] (3 responses)

The "long scale" is used in most of continental Europe and most places where the main language comes from Europe (or at least it used to be; as in the UK, the "short scale" is taking over there too, though more slowly)

Million: 1,000,000
Milliard: 1,000 million
Billion: 1,000 milliard (a million million - a "bi-illion")
Billiard: 1,000 billion
Trillion: 1,000 billiard (a million million million - a "tri-illion")
Trilliard: 1,000 trillion etc.

I prefer this style for the cool extra words, because it feels fractionally more logical to my brain, but also because it gives a lot more headroom before running out of names I can remember :-D

Sadly, the "short scale" (with billion=1000 million, trillion=1000 billion etc.) is very definitely winning, partly because of US culture being so influential internationally, but also because it's the norm in scientific usage.

Now that I think of it, I wonder how we ended up with two conventions on this space in the first place...

Contexts

Posted Jul 28, 2022 17:08 UTC (Thu) by Wol (subscriber, #4433) [Link] (2 responses)

The short scale has won, the official definition of billion is 10^9, and trillion 10^12.

And I've never heard of your long scale, to me a billion was a million^2, a trillion was a billion^2. Easily described, you can have a million billion no problem ... (apart from it being a huge amount of whatever :-)

Cheers,
Wol

Contexts

Posted Jul 28, 2022 18:13 UTC (Thu) by mpr22 (subscriber, #60784) [Link]

I've never heard of your long scale, and neither has Wikipedia:

https://en.wikipedia.org/wiki/Long_and_short_scales#Long_...

American English has been using short scale since before the USA was a country.

France, bizarrely, switched from short scale to long scale in the 20th century (this being officially confirmed in 1961).

British official usage was declared to be short scale in 1974, on the occasion of the Tory member for Tiverton asking Harold Wilson if he was going to affirm British official usage to be long scale.

"Winning" long vs. short scale

Posted Jul 30, 2022 15:39 UTC (Sat) by smurf (subscriber, #17840) [Link]

Depends on your locale. In German we use long scale, which admittedly causes no end of confusion when people "translate" English news snippets, but I don't see that changing any time soon.

Contexts

Posted Jul 14, 2022 18:23 UTC (Thu) by anselm (subscriber, #2796) [Link]

The packaging of the last hard drive I bought says "When referring to drive capacity, one gigabyte, or GB, equals one billion bytes and one terabyte, or TB, equals one thousand billion bytes."

Of course it does – it makes the drive seem bigger! At least to people who naïvely assume that one terabyte is 2⁴⁰ bytes (what SI calls “one tebibyte”).
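To put a number on the gap, here is a small sketch that formats a byte count with either decimal (kB, MB, ...) or binary (KiB, MiB, ...) prefixes. The function name format_bytes and its binary flag are made up for this illustration.

```python
def format_bytes(n, binary=True):
    """Format a byte count with binary (KiB) or decimal (kB) prefixes."""
    base = 1024 if binary else 1000
    prefixes = ["", "Ki", "Mi", "Gi", "Ti"] if binary else ["", "k", "M", "G", "T"]
    value = float(n)
    index = 0
    # Divide down until the value fits under the next prefix.
    while value >= base and index < len(prefixes) - 1:
        value /= base
        index += 1
    return f"{value:.2f} {prefixes[index]}B"


four_tb = 4 * 10**12  # a "4 TB" drive as labelled on the packaging

as_decimal = format_bytes(four_tb, binary=False)
as_binary = format_bytes(four_tb, binary=True)
```

The same drive comes out noticeably smaller when counted in tebibytes, which is the roughly 9% difference being complained about.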

Contexts

Posted Jul 18, 2022 15:26 UTC (Mon) by Wol (subscriber, #4433) [Link]

I probably am, but I understood they had carried that confusion forward into the definition of GB et al ...

(Certainly when fdisk tells me how big my 4TB hard drive is, it comes out at rather more than 4 billion (that is 4x(10^6)^2 :-)

Cheers,
Wol

Contexts

Posted Jul 19, 2022 22:50 UTC (Tue) by nybble41 (subscriber, #55106) [Link] (9 responses)

HDD manufacturers have been using mixed units for sizes since the beginning. It's mostly about making the numbers look bigger.

RAM manufacturers, on the other hand, stuck mostly to binary sizes since RAM modules scale based on the number of address and data lines. The exception would be the modules with an odd number of data lines intended for parity or ECC bits... but the usable space after ECC is still generally a power of two.

The SI unit for information is the *bit*. Insisting on the SI definition of "kilo" with *bytes* as the base unit makes no sense; in pure SI terms you're measuring in multiples of 8,000 (or 8,000,000 etc.) SI base units, not powers of 1,000. The prefixes used for SI units can have other meanings in different contexts; no one insists that a microservice must be exactly one-millionth of a service, for example.

Unfortunately this has been muddied to the point that a simple "KB" or "kilobyte" can never again be considered unambiguous, so when precision matters I suggest using "KiB" for "binary kilobyte" or "KeB" for "decimal kilobyte". Forget about "kibibyte"; that just sounds stupid. (But if you insist-- the decimal equivalent can only be "kedebyte".) Or you can measure the data in bits rather than bytes, with an unambiguous SI decimal prefix.

Contexts

Posted Jul 20, 2022 7:00 UTC (Wed) by geert (subscriber, #98403) [Link] (8 responses)

The decimal prefix for a multiple of 1000 is not "Ke" or "kede", but "k" or "kilo". Hence "kB".

Contexts

Posted Jul 22, 2022 1:38 UTC (Fri) by nybble41 (subscriber, #55106) [Link] (7 responses)

> The decimal prefix for a multiple of 1000 is not "Ke" or "kede", but "k" or "kilo".

Within the SI system, sure. But as I said, bytes are not an SI unit, so SI prefixes do not apply. In another context the "kilo" prefix can easily mean something else entirely—even 1024.

Practically speaking, "kB" or "kilobyte" means either 1024 bytes (the traditional version dating back to the early days of binary computers, and an integer power of two in bits) or 1000 bytes (the new version coerced into the ill-fitting SI system, mixing base-2 and base-10 to arrive at 8,000 bits). A reader can't tell which you meant, so if the difference between 1024 and 1000 matters at all then you should avoid the term altogether. I only offered an unambiguous alternative modeled on the KiB / "kibibyte" nomenclature. It's not SI but it does a far better job of communicating the intent.

If you want to stick with SI, don't talk about bytes. The SI base unit for information is the bit.

Contexts

Posted Jul 22, 2022 2:49 UTC (Fri) by rschroev (subscriber, #4164) [Link]

You got it the wrong way around regarding 'new' and 'traditional':

Kilo meaning 1024 is the new version invented in the early days of binary computers, already wrong and in conflict with both existing standards and the Greek word χίλιοι (chilioi) it's derived from literally meaning 1000.

Kilo meaning 1000 is the traditional version, consistent with existing usage dating back to the end of the 18th century, long before the rise of binary computers; and consistent with the Greek word going back a few thousand years further still.

Contexts

Posted Jul 30, 2022 4:16 UTC (Sat) by JanC_ (guest, #34940) [Link] (5 responses)

You can use SI prefixes with non-SI units, and that is very common, and even recommended by the organisation behind the SI system IIRC.

The only problem with using bytes is that the size of a byte is not fixed (it is hardware-dependent), so you have to specify somewhere what size the bytes you are talking about are…

It would have been much better if English language computer engineers had used 'octet' for “modern” 8-bit bytes instead (as the French do).

Contexts

Posted Jul 30, 2022 9:27 UTC (Sat) by mpr22 (subscriber, #60784) [Link] (3 responses)

> so you have to specify somewhere what size the bytes you are talking about are…

Outside of what are now very niche contexts, this is not a serious concern.

Contexts

Posted Jul 30, 2022 11:36 UTC (Sat) by Wol (subscriber, #4433) [Link] (2 responses)

> Outside of what are now very niche contexts, this is not a serious concern.

Niche contexts ... like networking?

I was always under the impression that you can't divide your networking kb by 8 to get your data transfer kB, because an 8-bit data byte is about a 10-bit network byte ...

(or is it because the b in networking stands for baud which is most definitely not a bit ...)

Cheers,
Wol

Contexts

Posted Jul 30, 2022 13:47 UTC (Sat) by pizza (subscriber, #46) [Link]

> I was always under the impression that you can't divide your networking kb by 8 to get your data transfer kB, because an 8-bit data byte is about a 10-bit network byte ...

That's still a good rule of thumb, as when you factor in network/protocol overhead, it works out pretty consistently:

10Mbps =~ 1MB/s, 100Mbps =~ 10MB/s, 1000Mbps =~ 100MB/s

(Over 1Gbps it tends to fall off somewhat; for example the most I recall getting using 10Gbps fiber (and 9K jumbo frames) was about 550MB/s, though that was probably CPU bound as I was using 'scp')

Contexts

Posted Jul 30, 2022 19:29 UTC (Sat) by mpr22 (subscriber, #60784) [Link]

> I was always under the impression that you can't divide your networking kb by 8 to get your data transfer kB

It depends on where the network technology's speed rating is measured.

For example, 100BASE-TX's rated speed of 100 Mbit is measured in terms of the 25MHz 4-bit parallel data stream fed to the MII, not the 125 MHz run-length-limited serial data stream the 4b5b encoder behind the MII feeds to the MLT-3 encoder that generates the three-level waveform seen on the wire.

Of course, 100 Mb/s of packet data transfer doesn't translate into 12.5 MB/s of actual application data transfer, because of the protocol overheads imposed by various layers.
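The arithmetic can be sketched as follows. The line-rate figures come from the description above; the framing numbers (full-size 1500-byte frames carrying TCP over IPv4 with no options) are an assumption chosen for illustration, and real traffic will vary.

```python
# 100BASE-TX rate as seen at the MII.
mii_clock_hz = 25_000_000      # 25 MHz clock
mii_width_bits = 4             # 4-bit parallel data path
rated_bps = mii_clock_hz * mii_width_bits

# The same rate recovered from the serial side: 4 data bits per 5 code bits.
serial_rate = 125_000_000
payload_bps = serial_rate * 4 // 5

# Assumed framing overhead per full-size frame (illustrative only).
mtu = 1500
ethernet_overhead = 8 + 14 + 4 + 12   # preamble+SFD, header, FCS, inter-frame gap
ip_tcp_headers = 20 + 20              # IPv4 and TCP headers without options
efficiency = (mtu - ip_tcp_headers) / (mtu + ethernet_overhead)

app_megabytes_per_s = rated_bps / 8 * efficiency / 1e6
```

Under those assumptions the application sees a bit under 12 MB/s rather than the naive 12.5 MB/s.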

Contexts

Posted Jul 30, 2022 15:42 UTC (Sat) by smurf (subscriber, #17840) [Link]

Two syllables instead of one? no way. ;-)

Native Python support for units?

Posted Jul 13, 2022 7:15 UTC (Wed) by Karellen (subscriber, #67644) [Link] (14 responses)

(They can of course be combined, 10e1j does indeed mean 100*sqrt(-1).)

Huh. I did not read 10e1j as indeed meaning (10e1)j. Rather, I read it as 10e(1j).

Native Python support for units?

Posted Jul 13, 2022 7:50 UTC (Wed) by NYKevin (subscriber, #129325) [Link] (10 responses)

While that interpretation is not nonsensical, it would be pretty weird. It would have the effect of rotating about the origin (in the complex plane) in increments of ln(10) radians. I'm... really not sure why anyone would want to do that, to be honest.

Native Python support for units?

Posted Jul 13, 2022 11:55 UTC (Wed) by willy (subscriber, #9762) [Link]

Would you prefer the example of 10eπj?

Native Python support for units?

Posted Jul 13, 2022 17:05 UTC (Wed) by Karellen (subscriber, #67644) [Link] (8 responses)

In that context, sure, that particular operation might not make much sense. But generally, if you see a "2i" somewhere, in my mind the precedence of "2i" is super high, and my first instinct was just that "i" binds to a number before it tighter than "e" does to the number after it. I've no idea if that's right, it's just how my brain "naturally" grouped it.

I wonder, if it was written "10×10^1j", where would the precedence of the "j" go? Lower than "×"? Or equivalent but still applied last because of its rightmost position?

"Imaginary part" is always forgotten on the operation order-of-precedence lists. I wonder if it feels left out! :-)

Native Python support for units?

Posted Jul 13, 2022 19:38 UTC (Wed) by dtlin (subscriber, #36537) [Link] (7 responses)

When you see "2𝑖" in math, it means "2·𝑖". So it makes sense for "10e1j" to mean "10e1*𝑖", and "𝑖" is spelled "1j" in Python.

I don't really see the need for a postfix unit syntax. Using the usual * and / operators generalizes to stuff like "ft·lb" and "m/s" which would be awkward to express as single tokens. A library (such as Pint in the article or astropy.units mentioned below) with appropriate operator overloads should be enough.

Native Python support for units?

Posted Jul 14, 2022 1:29 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (6 responses)

Strictly speaking, it's slightly more convoluted than that.

In Python (not math), the notation 10e1 is *not* a combined-multiplication-and-exponentiation operator applied to the arguments 10 and 1. Rather, it is another way of writing the float literal 100.0. Similarly, in Python, the j suffix is not a multiply-by-𝑖 postfix operator. Rather, it is the suffix for an imaginary literal. You cannot write some_varj and expect it to multiply some_var by 𝑖, nor can you write some_vare1 and expect it to multiply some_var by 10, because in both cases, the syntax is meant for constructing literals. This also means that both of these syntaxes have the highest possible precedence, since they're not proper operations at all, they're just literal notation. This cannot even be overridden by parentheses. 10e(1j) is a syntax error, not 10 × 10^𝑖. The latter can only be written explicitly using the * and ** operators (or equivalent functions from the cmath module).

To be even more pedantic: The imaginary literal syntax consists of a (real) float literal followed by the j suffix. So you have to parse a real literal before you can even deal with parsing an imaginary literal, and that's why the j has to bind less tightly than the e. If you had instead written 10 * 10 ** 1j, then the opposite would happen, and you really would get 10 × 10^𝑖 (i.e. the j binds more tightly than the exponent operator, contrary to PEMDAS and similar rules).
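These claims are easy to check from Python itself. The sketch below uses the standard ast module to show that 10e1j parses as a single constant, that the parenthesised form is rejected, and what the operator form evaluates to; the variable names are just for illustration.

```python
import ast

# 10e1j is one imaginary literal: the float literal 10e1 with a j suffix.
assert 10e1j == 100j
assert isinstance(ast.parse("10e1j", mode="eval").body, ast.Constant)

# The literal syntax cannot be regrouped with parentheses.
try:
    ast.parse("10e(1j)", mode="eval")
    regrouping_parses = True
except SyntaxError:
    regrouping_parses = False

# Written with operators, 1j is an atom, so ** sees it as a whole operand.
operator_form = 10 * 10 ** 1j
```

The operator form is a complex number of magnitude 10 that is not purely imaginary, so it is clearly a different value from the literal 10e1j.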

Native Python support for units?

Posted Jul 30, 2022 16:53 UTC (Sat) by Wol (subscriber, #4433) [Link] (5 responses)

Don't you mean 10e1 is 10.0 - 10 to the power 1?

Ten SQUARED is a hundred - 10e2.

Cheers,
Wol

Native Python support for units?

Posted Jul 30, 2022 17:57 UTC (Sat) by anselm (subscriber, #2796) [Link] (4 responses)

Don't you mean 10e1 is 10.0 - 10 to the power 1?

$ python3
Python 3.9.2 (default, Feb 28 2021, 17:03:44) 
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> 1e1
10.0
>>> 1e2
100.0
>>> 10e1
100.0
>>>

Native Python support for units?

Posted Jul 30, 2022 20:34 UTC (Sat) by Wol (subscriber, #4433) [Link] (3 responses)

Hmmm my bad ...

But 10e1 is rather odd notation - 10 x 10^1

I'm used to either scientific notation where it's en and n is a multiple of 3, or (dunno what it's called) where 1 <= mantissa < 10.

( I know some people prefer the mantissa between 0 and 1, rather than 1 and 10, but the combination of mantissa not between 0 and ten, and exponent not a multiple of 3, is, well, weird!)

Cheers,
Wol

Native Python support for units?

Posted Jul 30, 2022 23:51 UTC (Sat) by rschroev (subscriber, #4164) [Link] (1 responses)

<mantissa>e<exponent> always means mantissa * 10^exponent, with whatever value for the mantissa and exponent.

When communicating with other people it's probably best to stick with either scientific notation in the strict sense (1 <= mantissa < 10) or engineering notation (exponent is a multiple of 3, 1 <= mantissa < 1000), but other representations are just as valid and programming languages don't care in the least how you scale the exponent.
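For what it's worth, the two conventions differ only in how the exponent is chosen, which a short sketch can show. The helper name normalise and its step parameter are invented for this example, and it only handles positive values.

```python
import math


def normalise(x, step):
    """Return (mantissa, exponent) with the exponent a multiple of `step`.

    step=1 gives scientific notation (1 <= mantissa < 10);
    step=3 gives engineering notation (1 <= mantissa < 1000).
    Only positive x is handled in this sketch.
    """
    exponent = math.floor(math.log10(x))
    exponent -= exponent % step
    return x / 10**exponent, exponent


# All three spellings are the same float as far as Python is concerned.
assert 10e1 == 1e2 == 100e0

scientific = normalise(10e1, 1)
engineering = normalise(10e1, 3)
```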

Native Python support for units?

Posted Jul 31, 2022 7:50 UTC (Sun) by Wol (subscriber, #4433) [Link]

> and programming languages don't care in the least how you scale the exponent.

But brains usually do ... :-)

Cheers,
Wol

Native Python support for units?

Posted Jul 31, 2022 17:21 UTC (Sun) by anselm (subscriber, #2796) [Link]

I'm used to either scientific notation where it's en and n is a multiple of 3, or (dunno what it's called) where 1 ≤ mantissa < 10.

Scientific pocket calculators (are these still a thing?) used to call the “1 ≤ mantissa < 10 and arbitrary exponent” style “scientific” and the “1 ≤ mantissa < 1000, exponent a multiple of 3” style “engineering”.

Native Python support for units?

Posted Jul 22, 2022 9:54 UTC (Fri) by Darkstar (guest, #28767) [Link] (2 responses)

On a related note, I don't understand why they use the character (unit) "j" for the mathematical constant "i"? Wouldn't "10i" be much clearer than "10j"?

Native Python support for units?

Posted Jul 22, 2022 13:58 UTC (Fri) by farnz (subscriber, #17727) [Link]

"j" is used for the imaginary unit instead of "i" in any field where "i" would be potentially confusing. Electrical engineering, for example, uses "I" for currents, and so talking about an AC current of I = (5 + 3i) / (4 + 2i) has room for confusion when spoken, where I = (5 + 3j) / (4 + 2j) does not.

Native Python support for units?

Posted Jul 23, 2022 7:02 UTC (Sat) by smurf (subscriber, #17840) [Link]

Guido's answer is that an i, esp. when uppercased, looks too much like the digit 1, which would be confusing.

Also, well, mathematicians use i while engineers use j because i is either current or an index. Python had to choose one, so Guido picked j.

Units without types?

Posted Jul 13, 2022 14:55 UTC (Wed) by tialaramex (subscriber, #21167) [Link]

Units seem like an odd place to begin without first agreeing on at least the Properties for which these Units will be used, as types.

d = 3.2metre + 6.1inch + 8mile

... implies the existence of a distance type, for which metre, inch and mile are just constant factors, otherwise why shan't I write:

x = 1mile + 1kilogram + 1hour # What does this mean?

In a language like C++ we can assume that 3.2metre has some type my::distance and so if we find a way to write 4mile, either that's also a my::distance (and so this actually does what we wanted) or else it's some::other::property and we can't add them together so we know we need an adaptor. There's no risk that oops, 3.2metre + 4mile = 9.2inch and our clever unit system actually made things more confusing not less.
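Python can get a runtime version of the same guarantee with a small wrapper class. The following is only a sketch under assumed names (Quantity, prop, and the unit constants are all hypothetical); the check happens when the addition runs, not at compile time as in C++.

```python
class Quantity:
    """A value stored in a base unit, tagged with the property it measures."""

    def __init__(self, value, prop):
        self.value = value   # magnitude in the base unit for this property
        self.prop = prop     # e.g. "distance", "mass"

    def __add__(self, other):
        if not isinstance(other, Quantity) or other.prop != self.prop:
            raise TypeError("cannot add quantities of different properties")
        return Quantity(self.value + other.value, self.prop)

    def __rmul__(self, number):
        # Allows `3.2 * metre`.
        return Quantity(number * self.value, self.prop)


# Units are just constant factors relative to the base unit of a property.
metre = Quantity(1.0, "distance")
inch = Quantity(0.0254, "distance")
mile = Quantity(1609.344, "distance")
kilogram = Quantity(1.0, "mass")

d = 3.2 * metre + 6.1 * inch + 8 * mile   # a distance, held in metres

try:
    x = 1 * mile + 1 * kilogram
    mixed_ok = True
except TypeError:
    mixed_ok = False
```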

Native Python support for units?

Posted Jul 13, 2022 17:10 UTC (Wed) by xyz (subscriber, #504) [Link]

There was a talk at CppCon 2020 about all the issues raised here.
What I found interesting was the way the authors discussed the challenges and pitfalls of this topic:

A Physical Units Library For the Next C++ - Mateusz Pusz - CppCon 2020
https://youtu.be/7dExYGSOJzo?list=RDCMUCMlGfpWw-RUdWX_JbL...

Furlongs

Posted Jul 13, 2022 19:57 UTC (Wed) by nickodell (subscriber, #125165) [Link] (2 responses)

>At least on my system, Pint seems to have a slightly inaccurate value for a furlong, which is defined as 1/8 mile, or 660 feet

I was curious why this happens, so I took a look at the Pint source code. It seems that Pint defines the furlong in terms of the survey foot, which is defined as exactly 1200/3937 meters, while the normal foot is defined as exactly 0.3048 meters.

>>> (1 * ureg.furlongs / ureg.survey_foot).to('')
<Quantity(660.0, 'dimensionless')>
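The size of the discrepancy follows directly from the two definitions, and can be worked out exactly with the standard fractions module (no Pint needed for this sketch; the variable names are my own):

```python
from fractions import Fraction

survey_foot = Fraction(1200, 3937)           # metres, exact by definition
international_foot = Fraction(3048, 10000)   # metres, exact by definition

# A furlong of 660 survey feet, expressed in international feet.
furlong_survey = 660 * survey_foot
furlong_in_intl_feet = furlong_survey / international_foot

# Relative difference between the two feet, in parts per million.
ppm = (survey_foot / international_foot - 1) * 1_000_000
```

So a furlong defined via the survey foot is a little over 660.001 international feet, which is a difference of about two parts per million.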

Furlongs

Posted Jul 14, 2022 9:57 UTC (Thu) by nowster (subscriber, #67) [Link] (1 responses)

USA uses the survey mile. The rest of the world uses the statute mile (1760 yards). There's a 2ppm difference between them, as you note. The US standard yard changed in 1959 to match the International Yard, except for survey purposes. In the US references to a furlong (220 yards) are assumed to be based on the Survey Yard, not the International Yard.

And then we come on to volume measures...

Furlongs

Posted Jul 15, 2022 18:46 UTC (Fri) by NYKevin (subscriber, #129325) [Link]

The US only sometimes uses the survey units, most commonly in the context of surveying. As a result, you will sometimes see survey units used to define the borders of real estate parcels and such, but they are not the standard units that most Americans actually use on a day-to-day basis. For example, if you buy a foot-long ruler, it should be an international foot (i.e. 304.8 mm), not a survey foot. NIST and other federal agencies plan to phase out the survey units beginning in 2023, but they do plan to retain some legacy data in that format for backwards compatibility reasons.

Source: https://www.nist.gov/pml/us-surveyfoot

Native Python support for units?

Posted Jul 13, 2022 20:22 UTC (Wed) by mb (subscriber, #50428) [Link]

In PySpice they use the @ operator for units.

8@u_kOhm
22@u_nF

That's kind of useful and easier to read than plain literals. But I'd still prefer something like

8_kOhm
22_nF
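For anyone wondering how the @ form can work at all, it only needs the reflected matrix-multiplication hook. This is a minimal sketch with a made-up Unit class; it is not PySpice's implementation, and the tuple it returns is just a stand-in for a real quantity type.

```python
class Unit:
    """A hypothetical unit object; not PySpice's actual class."""

    def __init__(self, name, scale):
        self.name = name
        self.scale = scale   # factor relative to the unprefixed unit

    def __rmatmul__(self, value):
        # Called for `value @ unit` because int and float do not implement @.
        return (value * self.scale, self.name)


u_kOhm = Unit("Ohm", 1e3)
u_nF = Unit("F", 1e-9)

resistance = 8 @ u_kOhm
capacitance = 22 @ u_nF
```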

Native Python support for units?

Posted Jul 13, 2022 21:43 UTC (Wed) by pj (subscriber, #4506) [Link] (7 responses)

Also note that conversion ratios may not be constant. Consider units of 'dollars' and 'euros'... to a currency trader, the conversion rate may change day to day or even second to second.

Native Python support for units?

Posted Jul 14, 2022 0:56 UTC (Thu) by k8to (guest, #15413) [Link] (4 responses)

It would be, IMO, very weird to try to handle currency the same way as measurement units. They usually have different application domains, conversion behavior, precision concerns, built in expectations around rounding, and so on.

Native Python support for units?

Posted Jul 14, 2022 7:37 UTC (Thu) by taladar (subscriber, #68407) [Link] (3 responses)

That said it would still be valuable to make sure the units in any calculation work out properly even with currencies, e.g. that I multiply my value in Euro with the Dollars/Euro value to get Dollars, not with the Dollars/Yen value.

Native Python support for units?

Posted Jul 14, 2022 8:07 UTC (Thu) by farnz (subscriber, #17727) [Link] (2 responses)

That's an important point. There are two uses of units in a program:

1. To encode the correct scale factors to your numbers so that you can combine arbitrary units and get the right answer (e.g. 3 feet times 10 Newtons times 1,200 rotations per minute is how many metric horsepower?)
2. As a type system to stop you making nonsense combinations of units (e.g. if I know something's price in USD/feet, I can't multiply by the volume of the object I'm building to get a price in JPY, because the units don't work out).

The first is very Pythonic in nature - the second amounts to a static type system for units, and I'm not sure how that fits in with the dynamic typing Python prefers.

Native Python support for units?

Posted Jul 15, 2022 19:09 UTC (Fri) by NYKevin (subscriber, #129325) [Link]

Python already *sort of* supports (2) with NewType, see https://docs.python.org/3/library/typing.html#distinct

The problem with the existing support is that, for example, [length] + [length] should be of type [length], but if you try to do that with NewType, you just get back float or int (types aren't preserved across binary operations). That's fixable for addition and subtraction relatively easily (define and type hint appropriate dunder methods), but it would be more complicated in the case of multiplication and division, because Python's generics are insufficiently advanced to support e.g. [length] / [time] = [speed]. You could hard-code all of those conversions one at a time, and perhaps generate the code somehow, but a lot of fundamental constants have weird units that wouldn't necessarily have a "standard" interpretation, such as the Boltzmann constant ([length]^2 * [mass] * [time]^-2 * [temperature]^-1), and it would not be fun to hard code those as well.
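A short sketch of both halves of that paragraph: first NewType losing the type across addition, then the dunder-method workaround for addition and subtraction. The names Length and Metres are invented for this example.

```python
from typing import NewType

Length = NewType("Length", float)

total = Length(1.5) + Length(2.0)
# At runtime NewType just returns its argument, and a type checker sees
# float + float, so the result is a plain float rather than a Length.
assert type(total) is float


class Metres:
    """A small wrapper whose dunder methods keep the type across + and -."""

    def __init__(self, value: float) -> None:
        self.value = value

    def __add__(self, other: "Metres") -> "Metres":
        return Metres(self.value + other.value)

    def __sub__(self, other: "Metres") -> "Metres":
        return Metres(self.value - other.value)


span = Metres(1.5) + Metres(2.0)
```

Multiplication and division are where this stops scaling, since every pair of input types would need its own hand-written result type.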

Ideally, a unit library should allow you to multiply and divide whatever by whatever and just make sense of it, but that would require the ability to put non-type arguments into Python's generics. Then we could express the Boltzmann constant's type as something like Quantity[2, 1, -2, 0, -1, 0, 0], where each number indicates the exponent of a given unit. Right now, Python does not let you do that (to the best of my understanding), because the arguments have to be types, not integers.

However, this also raises more philosophical problems. There are seven base units in the SI system, so one might assume that Quantity should have a fixed arity of seven exponents. However, the SI system doesn't cover many commonly-used units, such as bytes. There are also logarithmic units such as the bel (or decibel) and the cent (used in music theory, not to be confused with various currencies), which are subject to more complicated coherence rules than typical (linear) units. And the radian is officially a derived unit of the form m/m (so that the equation s=rθ is valid), but that would just simplify to dimensionless unless you special-case it somehow (the steradian has the same problem).

Native Python support for units?

Posted Jul 24, 2022 23:41 UTC (Sun) by pdundas (guest, #15203) [Link]

It seems to me that units are not a property of a variable, but of a number. Perhaps that is more pythonic - or more amenable to duck typing (if a value walks and talks like a length... it's a length). Python already has the ability not to add incompatible types at run time (like a variable representing a string to a variable representing an integer). So if you were using numbers with types, it would be just as pythonic to fail to add 5 metres to 30 seconds.

Interestingly you can *multiply* or *divide* distance and time, giving a quantity with a complex unit - speed might be 5m/30s (or 10m/minute, or some other number in furlongs per microfortnight). All kinds of weird and wonderful composite units are available - as some posters mentioned earlier. Or consider electrical units - Amps are Coulombs per second, Volts are Joules per Coulomb, and Watts are Joules per second - or something like that - it's been a while. Which raises interesting possibilities for *display* of numbers with units attached, when they need to be scaled, or converted between families of units that measure the same thing, or expressed (as for Current) in a particular way.

As for how to do that in Python, I've no idea. But it's a fascinating problem.

Native Python support for units?

Posted Jul 14, 2022 1:10 UTC (Thu) by tialaramex (subscriber, #21167) [Link] (1 responses)

Currency traders are, as their name implies, buying and selling currency, and so necessarily that's not a conversion. If I buy 14 carrots for 85 pence, that's not a conversion between pence and carrots, it's a purchase. Likewise if I buy 85 Euros for 90 Dollars, that was a purchase and not a conversion, even if I attempt it at the same moment I can't also sell 85 Euros for 90 Dollars, I will find that (because people doing this want to make money on their transactions) my 85 Euros sells for rather less than 90 Dollars even though that's what it cost me.

In contrast if I have 0.0508 metres of something I also have 2 inches of it, that's a conversion, I don't have less of it or more of it, I just changed my units of measurement.

The thornier case is situations where arguably there is a conversion, but pragmatically that's never going to be what you meant. If I have a kilogram, and I want joules, I probably need to re-examine my priors rather than use Einstein's famous equation which gives a very large number indeed for this conversion.

Native Python support for units?

Posted Jul 30, 2022 17:00 UTC (Sat) by Wol (subscriber, #4433) [Link]

> Likewise if I buy 85 Euros for 90 Dollars, that was a purchase and not a conversion, even if I attempt it at the same moment I can't also sell 85 Euros for 90 Dollars,

??? Isn't that exactly what the guy on the other side of your trade just DID?

Whether the transaction is reversible has nothing to do with the transaction, and everything to do with whether the parties are trading or buying/selling.

Cheers,
Wol

Native Python support for units?

Posted Jul 14, 2022 7:27 UTC (Thu) by niner (subscriber, #26151) [Link] (4 responses)

The Physics::Measure module provides a comprehensive system of units by exporting generated (i.e. combined SI-prefixes, base and derived units) postfix operators.

use Physics::Measure :ALL;

# Define a distance and a time
my \d = 42m; say d; #42 m (Length)
my \t = 10s; say t; #10 s (Time)

# Calculate speed and acceleration
my \u = d / t; say u; #4.2 m/s (Speed)
my \a = u / t; say a; #0.42 m/s^2 (Acceleration)

As a special treat it allows working with measurement errors:

my \d = 10m ± 1;
my \t = 8s ± 2;
say d / t # 1.25m/s ±0.4

More examples (including how to use non-SI units) are to be found in the docs at https://github.com/p6steve/raku-Physics-Measure

All of this is lexically scoped, like almost everything in Raku. While the module itself does not export postfix operators for non-SI units, it would be easy for another module to provide those (they are really just small wrappers around object constructors). Thanks to lexical scope, different definitions of "mile", etc. wouldn't be a problem.

All of this comes from the power of supporting custom operators in the language. Python already supports overloading operators via magic methods like __add__. All it would need would be the possibility to add new operators. Postfix could give you units, infix operators even those nifty measurement errors. Sadly, I would bet that Python will just not get that support, as it would require making the language grammar extensible and so far ease of parsing has always been a design priority for Python.

Native Python support for units?

Posted Jul 15, 2022 21:00 UTC (Fri) by jrwren (subscriber, #97799) [Link] (2 responses)

> The number "4K" might mean four kelvins, 4000, or 4096, depending on context

Only if it is implemented poorly. SI clearly defines K as Kelvins, period. k is kilo and ki is kibi (although maybe not SI)

https://www.nist.gov/pml/owm/metric-si-prefixes

Native Python support for units?

Posted Jul 18, 2022 14:44 UTC (Mon) by jwilk (subscriber, #63328)

Kibi is "Ki" (with uppercase K) and indeed not SI.
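The case distinction between "K", "k", and "Ki" can be made mechanical. The following Python sketch uses small, hypothetical lookup tables (the names `SI_PREFIXES`, `BINARY_PREFIXES`, `UNITS`, and `parse_unit` are invented here, and the tables are far from complete) to show one way a parser could keep the three readings apart.

```python
# Hypothetical lookup tables for this sketch; not taken from any library.
SI_PREFIXES = {"k": 10**3, "M": 10**6, "G": 10**9}
BINARY_PREFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}  # IEC, not SI
UNITS = {"K": "kelvin", "m": "metre", "B": "byte"}


def parse_unit(symbol: str) -> tuple[int, str]:
    """Split a symbol such as 'km' or 'KiB' into (multiplier, unit name)."""
    if symbol in UNITS:  # a bare unit such as 'K' is never read as a prefix
        return 1, UNITS[symbol]
    # Try the two-letter binary prefixes before the one-letter SI prefixes.
    for prefixes in (BINARY_PREFIXES, SI_PREFIXES):
        for prefix, factor in prefixes.items():
            rest = symbol[len(prefix):]
            if symbol.startswith(prefix) and rest in UNITS:
                return factor, UNITS[rest]
    raise ValueError(f"unknown unit symbol: {symbol!r}")


print(parse_unit("K"))    # bare unit
print(parse_unit("km"))   # SI prefix + unit
print(parse_unit("KiB"))  # binary prefix + unit
```

In this sketch a bare prefix such as "k" or "Ki" with no unit after it is rejected, which is the strict reading; the looser "4K means 4096" usage would need context that a table like this cannot supply.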

Native Python support for units?

Posted Jul 19, 2022 20:54 UTC (Tue) by p6steve (guest, #159775)

Physics::Measure & Physics::Unit follow the SI conventions for 'K' and 'k' (plus many non-SI conversions)
#viz. https://en.wikipedia.org/wiki/International_System_of_Units

say 29K.in('°C'); # -244.15°C
say 29km.in('miles'); # 18.019765mile

Sadly, Kibi (and other computing units) are not yet implemented.

Native Python support for units?

Posted Jul 17, 2022 19:34 UTC (Sun) by NYKevin (subscriber, #129325)

That restriction is not for ease of parsing. It is because Python's core developers believe (perhaps correctly, but it's a matter of opinion) that Perl is inferior to Python precisely because Perl has such elaborate and flexible syntax. They do not wish to recreate Perl in Python, nor to give third-party developers the tools to do so. The "ease of parsing" is simply a bonus for them.

(Yes, Raku is not Perl, but it's close enough for the comparison to still be relevant.)

Native Python support for units?

Posted Jul 22, 2022 8:27 UTC (Fri) by callegar (guest, #16148)

I may be missing something, but isn't this all syntactic sugar to save a "*" that is probably better left in place? The class system and the operator overloading mechanism have already proven to be powerful enough to define a `Quantity` class, so you can write `12*mm`, and that does not seem particularly less readable to me than `12mm` or `12_mm` or `1[mm]`. The use of `*` also seems to be the appropriate thing: when you write `12mm` you really mean `12*mm` (the unit taken twelve times). Why should `12*mm` be worse than `12[mm]` (one character longer to type) or `12_mm`? In fact, the explicit use of the `*` shows its advantages when you start speaking of compound units like `12*(m/s)`, which seems more readable and easier to parse than `12_m/s`. It is also easier to write `a * mm` than `a*1[mm]` if you want to apply the unit to a number stored in a variable. Incidentally, note that some languages support the concept of implicit multiplication (e.g. Mathematica). If you write `1m` there, it actually means `1 * m`. So if anything, shouldn't the discussion be about the cases in which to allow implicit multiplication?
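A minimal sketch of the kind of `Quantity` class described above, written from scratch for illustration rather than taken from any existing package. It tracks only two dimensions (metre and second exponents, an arbitrary simplification) and stores magnitudes in base units; the names `m`, `s`, and `mm` are defined by the sketch itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A magnitude in base units plus exponents for (metre, second)."""
    magnitude: float
    dims: tuple = (0, 0)

    def __mul__(self, other):
        if isinstance(other, Quantity):
            dims = tuple(a + b for a, b in zip(self.dims, other.dims))
            return Quantity(self.magnitude * other.magnitude, dims)
        return Quantity(self.magnitude * other, self.dims)

    __rmul__ = __mul__  # lets a plain number appear on the left: 12 * mm

    def __truediv__(self, other):
        if isinstance(other, Quantity):
            dims = tuple(a - b for a, b in zip(self.dims, other.dims))
            return Quantity(self.magnitude / other.magnitude, dims)
        return Quantity(self.magnitude / other, self.dims)

    def __add__(self, other):
        if not isinstance(other, Quantity):
            return NotImplemented
        if self.dims != other.dims:
            raise TypeError("cannot add quantities with different dimensions")
        return Quantity(self.magnitude + other.magnitude, self.dims)


m = Quantity(1.0, (1, 0))
s = Quantity(1.0, (0, 1))
mm = Quantity(0.001, (1, 0))

length = 12 * mm        # the unit taken twelve times
speed = 12 * (m / s)    # compound unit built with ordinary operators
a = 3
scaled = a * mm         # applying a unit to a number held in a variable
```

With this in place, adding `length` to `speed` raises `TypeError` because the dimension exponents differ, which is the main safety benefit a units system is meant to provide.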