Ksplice: kernel patches without reboots

Ksplice: kernel patches without reboots

Posted Apr 29, 2008 22:16 UTC (Tue) by gravious (guest, #7662)
In reply to: Ksplice: kernel patches without reboots by danielpf
Parent article: Ksplice: kernel patches without reboots

You are quite right.

As has been pointed out here recently, the system as a whole may achieve 100% availability by
using pervasive redundancy but the individual machines in the system won't. The associated PDF
suggests that there may be long-lived processes that would be intolerant of any outage because
they cannot save state properly, or that would be unhappy about network connections dying. I
would suggest that a system administrator designing and implementing an HA environment would
make sure that each and every piece of software on each machine could handle a reboot if
necessary.

While I love the idea of hot updating, and while I think this implementation is a fine, fine
hack worthy of much admiration, I would welcome a concrete use case other than "look, I have 5
years, 2 months, 4 hours and 16 minutes uptime".

Ksplice: kernel patches without reboots

Posted Apr 30, 2008 2:52 UTC (Wed) by gdt (subscriber, #6284) [Link]

There are many cases where the redundancy path to high availability is too expensive. Linux running on a closet Ethernet switch would be one example. Are you really going to provision a second set of switches and cabling for the average corporate desktop computer?

Ksplice: kernel patches without reboots

Posted Apr 30, 2008 2:57 UTC (Wed) by dlang (subscriber, #313) [Link]

there is not much overlap between the set of uses where cost prohibits having redundant
hardware and the set of uses where a system cannot be down for a reboot.

Many upgrades to Cisco equipment require a reboot, and that equipment is used in many places
that are extremely sensitive to outages.

Ksplice: kernel patches without reboots

Posted May 1, 2008 2:33 UTC (Thu) by a9db0 (subscriber, #2181) [Link]

Here's one: telecom

Big phone switches aren't usually redundant, and are frequently utilized 24x7x365(6).  The
rise in VoIP has brought this into higher relief, and I'd expect to see some of the telecom
folks looking very hard at this.

My uptime is never that good - the power around here is way too flaky.  Even for my oversized

Ksplice: kernel patches without reboots

Posted May 1, 2008 16:42 UTC (Thu) by piggy (subscriber, #18693) [Link]

The reason big phone switches have traditionally been non-redundant is that they were
staggeringly expensive when first created.

The telecom industry is still absorbing the consequences of a 1000X improvement in both price
and performance. Reliability of individual components has also dropped by a couple orders of
magnitude, so redundancy is becoming the solution of choice.

I agree with earlier assertions that the disjunction between businesses who want long uptimes
and those willing to put in redundant equipment is vanishingly small.

Perhaps individuals after long uptime for geek-cred are a large enough population to sustain

Ksplice: kernel patches without reboots

Posted Apr 30, 2008 18:14 UTC (Wed) by droundy (subscriber, #4559) [Link]

Indeed.  Another example would be that of scientific computing.  If I've got a job that has
been running for a couple of weeks, and will finish in just a couple weeks more, I'd rather
not reboot the system.  Redundancy gains me nothing (since I'm utilizing all my computing
resources already).  The code could be trained to checkpoint (and some of my code is so
enabled), but that generally has a high cost (in terms of bandwidth and disk use), so you
don't want to checkpoint very often, if at all.

Ksplice: kernel patches without reboots

Posted Apr 30, 2008 20:53 UTC (Wed) by dlang (subscriber, #313) [Link]

any system like this should be isolated anyway, so delaying the security update for a week or
a month to let your job finish should not be a big problem.

remember that if a box is not exposed it doesn't need a security update.

Ksplice: kernel patches without reboots

Posted May 1, 2008 0:22 UTC (Thu) by gdt (subscriber, #6284) [Link]

"any system like this should be isolated anyway"

In practice that's increasingly difficult. Datasets are growing so large that the last thing you want is two copies of them, so you end up with the input data being remotely hosted and pulled across the Internet on demand. It's this sort of use that the academic community created the Internet for.

The other problem with scientific computing is simply that I might not want to reboot the system at this moment. Imagine that I've concurrently booked four radiotelescopes, which is about a six-month wait. I've got them streaming into my processing cluster. A security patch arrives. If I apply the patch and reboot then I lose resolution, and thus my experiment may be inconclusive. If I don't apply the patch and the machine is subverted then there are data integrity issues and again the experiment is inconclusive. In both cases I wait another six months and try again. My favoured choice would be to apply the patch whilst still running the telescope correlation.

I'm not saying that ksplice is the best thing since sliced bread. But it does have some use, particularly outside of the typical server application that Linux is generally used for.

Ksplice: kernel patches without reboots

Posted May 1, 2008 0:36 UTC (Thu) by dlang (subscriber, #313) [Link]

if you don't have more than one copy of your data you run the serious risk of losing it.

even for huge datasets, it's cheaper to keep an extra copy than to recreate the data.

I'm not saying that ksplice is worthless, I'm disagreeing with the idea that was posted that
it's required for these situations.

Ksplice: kernel patches without reboots

Posted May 1, 2008 12:57 UTC (Thu) by nix (subscriber, #2304) [Link]

Yeah, but using an extra copy for failover requires that it be online 
*now*. Using an extra copy for redundancy only does not require that (and 
is much cheaper: how will you keep an extra online copy of the ATLAS 
detector's collected data? It's far too large to keep even *one* copy at 
any one site: keeping an extra online copy means doubling the size of an 
already large collaboration...)

Ksplice: kernel patches without reboots

Posted May 8, 2008 11:40 UTC (Thu) by anandsr21 (guest, #28562) [Link]

Do you know how much data Google keeps? And they keep three copies, not two, and in
geographically separated locations. So the solution is essentially to make multiple copies.
Actually, as Google has shown, even two copies are not enough.

Ksplice: kernel patches without reboots

Posted May 1, 2008 13:04 UTC (Thu) by richardr (guest, #14799) [Link]

But the point about academic workloads is that often we use every desktop in the department as
a distributed supercomputer, so the nodes are both exposed to every possible attack because
people want their desktops accessible from outside (at least via ssh) and want to be able to
surf the web, and may be running background jobs for weeks at a time belonging to other people
who don't want them to be restarted. The conflict between these two factors is where this kind
of technology becomes important.

Ksplice: kernel patches without reboots

Posted May 1, 2008 21:20 UTC (Thu) by dlang (subscriber, #313) [Link]

if you are running on random desktops that are used for other things, your software had better
be able to handle reboots/crashes/power outages anyway as those events will happen.

while I see some use for live patching, I really don't see where it becomes a killer feature
