Posted May 7, 2011 3:09 UTC (Sat) by mlawren
Parent article: Scale Fail (part 1)
Great article! Sadly everything you wrote rings very true. I've seen the No Metrics conversation go the other way as well:
"app: The network is broken."
"net: Hmmm... let me check. Nope, my measurements indicate everything is ok"
"app: Are you sure?"
"net: Yes, here is the data, see for yourself.
"app: But must be the network!!!"
"net: We haven't made any changes. Look at these historical graphs of usage. That spike from last week was the file-sharer who had that unfortunate accident in the carpark on Monday. Nothing difference since then."
"net: Have you checked your application?"
"app: Of course, it's running fine."
"net: How do you know?"
"app: I just know."
"net: When was the last time it ran fine?"
"net: What have you changed since then?"
"app: Only re-worked the dispatcher, and migrated the cache location to the other campus. But we checked, the code runs fine!"
"net: Say, you drive that green Ford Focus don't you?"
"app: Yeah, why?."
"net: No reason. Your problem will be solved by tomorrow."