
DeVault: Announcing the Hare programming language

Posted May 3, 2022 11:41 UTC (Tue) by nix (subscriber, #2304)
In reply to: DeVault: Announcing the Hare programming language by NYKevin
Parent article: DeVault: Announcing the Hare programming language

> Then Hyrum's Law suggests that you should have a short planned outage once a year so that everyone knows they can't really rely on it

... which means nothing you support can be relied on, so why would anyone want to use it? Nobody likes their stuff suddenly breaking when they need it. Carry this to its extremes, with everyone doing this sort of thing in an uncoordinated fashion, and you get systems that never work because *some* component is always broken by its dependencies silently breaking. This doesn't exactly seem like a good future to me.



DeVault: Announcing the Hare programming language

Posted May 3, 2022 12:24 UTC (Tue) by mathstuf (subscriber, #69389)

The words just prior to that make it conditional on:

> let's say that service X is unsupported but has been chugging along, best-effort, and somehow manages to provide three nines of uptime on average.

The outages are letting people know that they're on thin ice. This is far better than doing a 3-day notice of "oh, sorry, we were cleaning up and realized no one was working on some project; no one cares internally, so you shouldn't either. good luck, have fun, so long, and thanks for all the fish^W^W^Wyour faithful patronage" and then being immortalized on something like the Google Graveyard[1], its sibling sites, or Our Incredible Journey[2]. Think of it as deprecation warnings for network services.

Oh, you did remember to check your error codes on external service communications, right?

[1] https://killedbygoogle.nl/
[2] https://ourincrediblejourney.tumblr.com/

Hyrum's Law for Downtime

Posted May 3, 2022 13:48 UTC (Tue) by atnot (guest, #124910)

The point of the planned outages and similar measures is to ensure that "never goes down" does not accidentally become a property of the system that code develops such a hard reliance on that it causes additional downtime.
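One way to picture such a measure is a thin wrapper that makes a small fraction of calls fail on purpose. The following is only a sketch under stated assumptions: `InjectedOutage` and `with_injected_failures` are hypothetical names invented for illustration, not part of any real service or library.

```python
import random


class InjectedOutage(Exception):
    """Hypothetical error raised in place of a real call during a drill."""


def with_injected_failures(operation, failure_rate, rng=random.random):
    """Wrap operation so that roughly failure_rate of calls fail on purpose.

    rng is injectable so the behaviour can be made deterministic in tests.
    """
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")

    def wrapped(*args, **kwargs):
        # rng() is expected to return a float in [0, 1), so a rate of 0
        # never fails and a rate of 1 always fails.
        if rng() < failure_rate:
            raise InjectedOutage("deliberate failure for resilience testing")
        return operation(*args, **kwargs)

    return wrapped
```

Callers that only ever see the wrapped function have to exercise their error paths regularly, which is where broken retry or failover logic would show up early.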

Let's say, for example (inspired by real events), that you have some super reliable database cluster. It's so simple and reliable that it doesn't fail a single request for two years. Over that time, a lot of applications get written that consume that service. Because it is so reliable, the developers never notice that they have introduced bugs in their timeout, retry, backoff or failover logic.

Then one day, there's a hiccup on one instance of the cluster and it hangs on some requests for a few seconds. A small fraction of the application processes hang, or crash and get restarted. Because the database is incredibly reliable, the application has started to depend on the database being available at startup, and starts crashing in a loop. The database instance gets overwhelmed and goes unresponsive for a few seconds. This repeats the process, causing more and more application services to crash, until eventually none are left in a running state. All of them are constantly hammering the database trying to start up, taking it down completely. This outage cascades through all of the downstream dependents, taking days to fully resolve.

When people accidentally rely too heavily on things being available, even the smallest, transient failures start having serious consequences. Those consequences often cause far more damage and user-visible downtime than simply causing a few seconds of deliberate downtime a month would have.
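For illustration, the retry and backoff logic that goes untested in a scenario like this might look roughly as follows. This is a minimal sketch, not anyone's actual implementation: `TransientError` and `call_with_backoff` are hypothetical names, and the default delays are arbitrary assumptions.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.1,
                      max_delay=5.0, sleep=time.sleep, rng=random.random):
    """Call operation(), retrying transient failures.

    Uses capped exponential backoff with full jitter, and re-raises the
    last error once max_attempts calls have failed.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # give up instead of retrying forever
            # Cap the exponential delay, then pick a random point below the
            # cap so that many clients restarting together do not all retry
            # at the same instant.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            sleep(ceiling * rng())
```

The jitter is the part that matters for the crash loop described above: without it, every restarted process retries on the same schedule and they all hit the database at once.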

DeVault: Announcing the Hare programming language

Posted May 3, 2022 13:50 UTC (Tue) by farnz (subscriber, #17727)

Getting you to stop depending on unsupported services is the goal. The service is unsupported, and its uptime is a fluke - by taking it down frequently, you cause people who need it to be up to switch to something that's supported in order to retain their uptime, or they do whatever's needed to get the service they depend upon back into support.

Basically, it's better to have the service become deliberately unreliable and trigger people into worrying about it, than to let it float along seemingly working just fine and then go offline permanently when it breaks and no-one knows how to fix it. A planned outage means that anyone who doesn't know that they depend on your unsupported service learns about their dependency at a point where fixing it is trivial, rather than finding out that they depend on it when the service breaks in a way that cannot be fixed.

I've had experience of working somewhere that refused to have planned outages for unsupported services - it was not pretty when the hardware failed, and it turned out to be impossible to get replacement parts, and non-trivial to port the software onto a machine we could get. And it turned out that something critical depended on the service on hardware that was now dead and not replaceable. Oops.


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds