
Reliability: Unix and Linux beat Windows (heise online)

Posted Apr 18, 2008 0:20 UTC (Fri) by dlang (subscriber, #313)
In reply to: Reliability: Unix and Linux beat Windows (heise online) by alvieboy
Parent article: Reliability: Unix and Linux beat Windows (heise online)

Over the last few months I've had a rash of hiccups (failover, then failback) on boxes that
appear to be related to 447 days of uptime. About a hundred of my systems hit this.

you know, I really should upgrade more frequently ;-)



Reliability: Unix and Linux beat Windows (heise online)

Posted Apr 18, 2008 2:47 UTC (Fri) by yarikoptic (subscriber, #36795)

What 447 days of uptime issue? I couldn't find anything about it on Google.
My file server (quite a busy one) celebrates a year of uptime today, so I started to worry
that I will have to reboot it soon, and since it is running Debian, I am worried it will be
such a long downtime (probably not even long enough for me to go grab a cup of
coffee)!!! ;-)

Reliability: Unix and Linux beat Windows (heise online)

Posted Apr 18, 2008 7:02 UTC (Fri) by dlang (subscriber, #313)

I don't know the specific bug, but it was pretty consistent.

Running 2.6.9 and an older heartbeat (1.2.x), right around 447 days of uptime heartbeat would
report a large delay in receiving a packet, long enough that it would declare the other system
dead (and take over). Then the flow would start again and the systems would realize they were
both active for a few seconds. In my case I don't have shared drives, so the only harm was the
failover/failback flop (~15 seconds of outage).

everything seemed to continue to work after that.

I figured that this was close enough to the 497-day mark, when 32-bit counters wrap, that I wrote
it off to some interaction with that and moved on.
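
For anyone wondering where the 497-day figure comes from, here is a rough sketch of the arithmetic. It assumes a 32-bit counter ticking 100 times per second; the tick rate is my assumption and is not stated above, and the function names are made up for illustration. It also shows how a naive "end minus start" goes badly wrong across the wrap, while a modular difference does not. It says nothing about what actually happened at 447 days.

```python
# Sketch only: TICKS_PER_SECOND = 100 is an assumed tick rate, not something
# taken from the thread. Function names are hypothetical.
TICKS_PER_SECOND = 100
SECONDS_PER_DAY = 86400


def wrap_days(bits, ticks_per_second):
    """Days until an unsigned counter of the given width wraps back to zero."""
    return (2 ** bits) / ticks_per_second / SECONDS_PER_DAY


def naive_elapsed(start, end):
    """Plain subtraction; goes hugely negative if the counter wrapped."""
    return end - start


def wrapped_elapsed(start, end, bits=32):
    """Elapsed ticks computed modulo the counter width, so one wrap is harmless."""
    return (end - start) % (2 ** bits)


if __name__ == "__main__":
    print(round(wrap_days(32, TICKS_PER_SECOND), 1))  # 497.1

    # A reading taken 5 ticks before the wrap, and another 10 ticks after it.
    before = 2 ** 32 - 5
    after = 10
    print(naive_elapsed(before, after))    # large negative number
    print(wrapped_elapsed(before, after))  # 15
```

Code that compares timestamps with the naive form can see a bogus, enormous interval at the wrap, which is the kind of thing that could look like a long packet delay.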

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds