
Domesticating applications, OpenBSD style

Posted Jul 22, 2015 22:18 UTC (Wed) by dlang (guest, #313)
In reply to: Domesticating applications, OpenBSD style by flussence
Parent article: Domesticating applications, OpenBSD style

For fast lookups of log data, what you want is a database, not a filesystem or a file.

And if you are going to make it 'the' way of dealing with logs, you need to make it scale well to handle people's logging needs.

If you are just offering it as an optional thing that can be turned off, the requirements are not as high.



Domesticating applications, OpenBSD style

Posted Jul 23, 2015 11:35 UTC (Thu) by dgm (subscriber, #49227) [Link] (10 responses)

It all depends on what you mean by "lookup". Logs are essentially sequential, both in how you write and how you use them. I have dealt with such problems and often found that simply sequentially scanning the data is faster than jumping from here to there following pointers through the index and then the data itself, especially when you keep related data together. Additionally, sequential data structures are more compact and less prone to corruption if you take the precaution of adding synchronization points.

Domesticating applications, OpenBSD style

Posted Jul 23, 2015 12:42 UTC (Thu) by dlang (guest, #313) [Link] (9 responses)

This thread on logging is off-topic for the article, but I'm willing to continue discussing things.

When I say that logs need to be in a database for searchability, I don't mean a traditional SQL database, but something optimized for time-series data like Elasticsearch or Splunk: something that can deal not just with the raw text of the logs, but also with parsing the logs and creating indexes to make searching them more efficient than grep.
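
To make that concrete, here is a minimal sketch of the kind of indexed, ad-hoc query such a store gives you, using Elasticsearch's REST search API. The "logs-*" index pattern, the field names, and the local URL are assumptions for illustration, not a description of any particular setup.

    # Minimal sketch: an indexed search against Elasticsearch's REST API,
    # the kind of query that replaces "grep across every server".
    # The index pattern, field names, and URL below are assumptions.
    import json
    import requests

    ES_URL = "http://localhost:9200"   # assumed local instance

    def find_logs_with_ip(ip, window="now-24h"):
        """Return log events mentioning an IP address within a time window."""
        query = {
            "query": {
                "bool": {
                    "must": [{"match": {"message": ip}}],
                    "filter": [{"range": {"@timestamp": {"gte": window}}}],
                }
            },
            "size": 100,
        }
        resp = requests.post(
            f"{ES_URL}/logs-*/_search",
            data=json.dumps(query),
            headers={"Content-Type": "application/json"},
            timeout=30,
        )
        resp.raise_for_status()
        return [hit["_source"] for hit in resp.json()["hits"]["hits"]]

    for event in find_logs_with_ip("10.1.2.3"):
        print(event)

The point is not this particular query, but that the store has already parsed and indexed the fields, so the search doesn't have to read every log line.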

Grep is a great tool if you have small logs (or know where to search in your large logs and have them split into small enough chunks). But when you start combining the logs from even dozens, let alone hundreds or thousands of servers, the sort of splitting options that the journal provides don't help you.

You don't want to do reports from the database, as that's inefficient and doesn't scale well. This doesn't matter for a single laptop or a 5-server company, but as you get up into tens or hundreds of gigabytes/day of logs it matters more. I've been in environments where we seriously had to deal with Gig-E not being fast enough to handle all the logging traffic if it was done the 'obvious' way.

And splitting the logs but then searching across all of them is counterproductive.

> I have dealt with such problems and often found that simply sequentially scanning the data is faster than jumping from here to there following pointers through the index and then the data itself, especially when you keep related data together. Additionally, sequential data structures are more compact and less prone to corruption if you take the precaution of adding synchronization points.

I strongly agree, and this is part of why I dislike the journald implementation. Reading the logs out of journald isn't a sequential read through the data; it's following pointers from one message to the next (and these pointers can get corrupted, which has led to cases where following them gives you a loop).

I like to keep my logs organized in multiple ways:

1. I keep an authoritative copy for audit/legal reasons that's a simple sequential text file, gzipped and chunked into 'reasonable' sizes (typically per-minute files that then get signed/archived at larger intervals); a rough sketch of this kind of archiving follows the list.

2. I parse the messages extensively and store them in something that makes it possible to do fast ad-hoc searches of the data (Splunk or Elasticsearch). Being able to do a query like 'show me every log containing this IP address' across hundreds of TB of data in just a couple of minutes is a great security tool.

3. I categorize the logs and write them out in per-category files, sometimes in different formats (and one log can be written to multiple places), so that the reporting tools for each category can efficiently get at what they need.

4. Some of the destinations that are written to are event correlation engines that do different things with the logs. Some generate summary data (which then feeds back into the logging system so that report generators and dashboards can use it, usually from ES/Splunk). Some generate alerts based on the absence of logs. And some generate alerts based on spotting particular logs or combinations of log messages.
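
As a rough illustration of the archival copy in item 1, here is a minimal sketch that chunks a stream of log lines into per-minute gzipped files and writes an HMAC signature next to each finished chunk. It assumes lines arrive on stdin with a leading ISO-8601 timestamp; the directory, key handling, and signing scheme are placeholders, not the setup described above.

    # Minimal sketch of item 1: chunk a log stream into per-minute
    # gzipped files and sign each finished chunk. Assumes each line
    # starts with an ISO-8601 timestamp ("2015-07-23T12:42:00 ...");
    # the archive directory and signing key are placeholders.
    import gzip
    import hashlib
    import hmac
    import os
    import sys

    ARCHIVE_DIR = "/var/log/archive"          # assumed location
    SIGNING_KEY = b"replace-with-a-real-key"  # placeholder

    def sign_chunk(path):
        """Write an HMAC-SHA256 digest of the finished chunk alongside it."""
        with open(path, "rb") as f:
            digest = hmac.new(SIGNING_KEY, f.read(), hashlib.sha256).hexdigest()
        with open(path + ".sig", "w") as f:
            f.write(digest + "\n")

    def archive(stream):
        current_minute, out, path = None, None, None
        for line in stream:
            minute = line[:16]                # "2015-07-23T12:42"
            if minute != current_minute:
                if out is not None:
                    out.close()
                    sign_chunk(path)
                current_minute = minute
                path = os.path.join(ARCHIVE_DIR,
                                    minute.replace(":", "-") + ".log.gz")
                out = gzip.open(path, "at")
            out.write(line)
        if out is not None:
            out.close()
            sign_chunk(path)

    archive(sys.stdin)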

I've done a bit more writing on the topic:
https://www.usenix.org/publications/login/david-lang-series
https://www.usenix.org/publications/login/feb14/logging-r...
https://www.usenix.org/conference/lisa12/technical-sessio...

Domesticating applications, OpenBSD style

Posted Jul 23, 2015 15:16 UTC (Thu) by dgm (subscriber, #49227) [Link] (8 responses)

As you pointed out this is clearly off topic, but I want to thank you for sharing your articles. Very interesting reading.

Domesticating applications, OpenBSD style

Posted Jul 23, 2015 15:59 UTC (Thu) by anselm (subscriber, #2796) [Link] (7 responses)

This is great, but at the same time we should keep in mind that, as far as the wide world of logging applications is concerned, dlang is something of an outlier.

Domesticating applications, OpenBSD style

Posted Jul 23, 2015 19:15 UTC (Thu) by dlang (guest, #313) [Link] (5 responses)

> This is great, but at the same time we should keep in mind that, as far as the wide world of logging applications is concerned, dlang is something of an outlier.

While I'm an outlier in the total volume of logs I've had to deal with, it's not by as much as you think.

The 100K logs/sec traffic was the three-year projection at an 800-person SaaS company in 2006. When you take 100K logs/sec at ~250 bytes/log (our measured average), delivering the logs to four destinations exceeds 1Gb/s. The company did not continue expanding at the predicted rate after it was purchased by a much larger company in 2007, but the log volume did continue to grow.

I spent some time at Google, and while I wasn't in their logging division, there's a lot in common between their logging architecture and what I advocate (although they do everything through their own APIs; a lot of NIH, and some 'they were at a large scale before the tools got good enough for that scale').

At my new job, we 'only' have ~500 systems right now, and I find that these approaches work much better than many others that are talked about and tried. There are a LOT of companies that are at this scale and larger.

Domesticating applications, OpenBSD style

Posted Jul 23, 2015 21:16 UTC (Thu) by mathstuf (subscriber, #69389) [Link] (4 responses)

Maybe I'm just unfamiliar with this level of sysadmining, but what do you *do* with all these logs? Dump them to disk and rotate out disks for archaeological use later (breach, debugging, etc.)? Scan them for "interesting" bits and toss out anything outside the context of those bits? It seems to me that these log databases are larger than the actual meat of the data being manipulated in many cases (the LHC and scientific simulations being the ones that come to mind where the data would still outsize a log flow like that).

Domesticating applications, OpenBSD style

Posted Jul 23, 2015 21:59 UTC (Thu) by dlang (guest, #313) [Link] (3 responses)

> Maybe I'm just unfamiliar with this level of sysadmining, but what do you *do* with all these logs?

Fair question.

Different things have different uses.

The archive is to recreate anything else as needed and to provide an "authoritative source" in case of lawsuits. How long you keep the logs depends on your company policies, but 3-7 years are common numbers (contracts with your customers when doing SaaS may drive this)

Being able to investigate breaches, or even just fraud, is a reason for the security folks to care.

For outage investigations (root cause analysis), you want the logs from the systems for the timeframe of the outage (and not just the logs from the systems that were down; you want the logs from all the other systems in case there are dependencies you need to track down). For this you don't need a huge timeframe, but being able to look at the logs during a time of similar load (which may be a week/month/year ago depending on your business) to see what's different may help.

By generating rates of logs in the different categories you can spot trends in usage/load/etc.

By categorizing the logs and storing them by category you can notice "hey, normally these logs are this size, but they were much larger during the time we had problems" and by doing it per type in addition to per server you can easily see if different servers are logging significantly differently when one is having problems.

Part of categorizing the logs can be normalizing them. If you parse the logs you can identify all 'login' messages from your different apps and extract the useful info from them and output a message that's the same format for all logins, no matter what the source. This makes it much easier to spot issues and alert on problems.
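
A minimal sketch of that kind of normalization, with made-up patterns for two hypothetical sources; a real deployment would carry a pattern per application.

    # Minimal sketch: normalize 'login' messages from different apps into
    # one common record. The regexes and field names are illustrative.
    import re

    PATTERNS = [
        # sshd-style: "Accepted password for alice from 10.1.2.3 port 22 ssh2"
        re.compile(r"Accepted \S+ for (?P<user>\S+) from (?P<ip>\S+)"),
        # hypothetical web app: "user=alice src=10.1.2.3 action=login"
        re.compile(r"user=(?P<user>\S+) src=(?P<ip>\S+) action=login"),
    ]

    def normalize_login(raw_line):
        """Return a normalized login record, or None if this isn't a login."""
        for pattern in PATTERNS:
            m = pattern.search(raw_line)
            if m:
                return {"event": "login", "user": m.group("user"),
                        "src_ip": m.group("ip"), "raw": raw_line.rstrip()}
        return None

Once every source emits the same record shape, the downstream reporting and alerting only has to understand one format.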

A good approach is what Marcus Ranum dubbed "Artificial Ignorance":

Start with your full feed of logs and sort it to find the most common log messages. If they are significant, categorize those logs and push them off to something that knows that category so it can report on them.

Remember that the number of times that an insignificant thing happens can be significant, so generate a rate of insignificant events and push that off to be monitored.

Repeat for the next most common log messages.

As you progress through this, you will very quickly get to the point where you start spotting log messages that indicate problems. Pass those logs to an event correlation engine to alert on them (and rate-limit your alerts so you don't get 5000 pages).

Much faster than you imagine, you will get to the point where the remaining uncategorized logs are not that significant, and there aren't very many of them, so you can do something like generate a daily/weekly report of the uncategorized messages and have someone eyeball them for oddities (and keep an eye out for new message types you should categorize).
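
A minimal sketch of that loop, assuming logs arrive on stdin; the "templating" regexes and the KNOWN_TEMPLATES set stand in for whatever categorization you have already done on earlier passes.

    # Minimal sketch of the "Artificial Ignorance" loop: collapse variable
    # parts of each message, count the resulting templates, and report
    # what is still uncategorized.
    import re
    import sys
    from collections import Counter

    KNOWN_TEMPLATES = set()   # filled in as you categorize messages

    def template(line):
        """Collapse numbers and hex strings so similar messages group together."""
        line = re.sub(r"\b\d+\b", "N", line)
        line = re.sub(r"\b[0-9a-f]{8,}\b", "H", line)
        return line.strip()

    def main(stream):
        counts = Counter(template(line) for line in stream)
        unknown = {t: c for t, c in counts.items() if t not in KNOWN_TEMPLATES}
        for tmpl, count in Counter(unknown).most_common(20):
            print(f"{count:8d}  {tmpl}")

    main(sys.stdin)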

This seems like a gigantic amount of work, but it actually scales well. The bigger your organization the more logs you have, but the number of different _types_ of logs that you have grows much slower than the total log volume.

> It seems to me that these log databases are larger than the actual meat of the data being manipulated in many cases.

That's very common, but it doesn't mean the log data isn't valuable. Remember that I'm talking about a SaaS-type environment, not HPC, even if the service is only being provided to your employees. HPC and scientific simulations use a lot of CPU and run through a lot of data, but they don't generate much in the way of log info.

For example, your bank records are actually very small (what's your balance, what transactions took place), but the log records of your bank's systems are much larger, because they need to record every time you accessed the system and what you did (or what someone did with your userid). When you then add the need to keep track of what your admins are doing (to be able to show that they are NOT accessing your accounts, and to catch any who try), you end up with a large number of log messages for just routine housekeeping.

But text logs are small, and they compress well (xz compression is running ~100:1 for my logfiles), so it ends up being a lot easier to store the data than you might initially think. If you work at doing this efficiently, you can also use cheap storage and end up finding that the amount of money you spend on the logs is a trivial part of your budget.
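
Your ratio will depend heavily on how repetitive your logs are; here is a trivial sketch for measuring it on one of your own files (it reads the whole file into memory, so only for modest samples).

    # Trivial sketch: measure the xz (LZMA) compression ratio of a log file.
    import lzma
    import os
    import sys

    def ratio(path):
        raw = os.path.getsize(path)
        with open(path, "rb") as f:
            compressed = len(lzma.compress(f.read()))
        return raw / compressed

    print(f"{ratio(sys.argv[1]):.1f}:1")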

It doesn't take many problems solved, or frauds tracked down, to pay for it (completely ignoring the value of logs in the case of lawsuits).

Domesticating applications, OpenBSD style

Posted Jul 24, 2015 1:12 UTC (Fri) by pizza (subscriber, #46) [Link] (2 responses)

> The archive is to recreate anything else as needed and to provide an "authoritative source" in case of lawsuits. How long you keep the logs depends on your company policies, but 3-7 years are common numbers (contracts with your customers when doing SaaS may drive this)

You aren't using "logs" in the same sense that most sysadmins mean "logs" -- your definition is more akin to what journalling filesystems (or databases) refer to as logs, i.e. a serial sequence of all transactions or application state changes.

I think that's why so many folks (myself included) express incredulity at your "logging" volume.

Domesticating applications, OpenBSD style

Posted Jul 24, 2015 1:49 UTC (Fri) by dlang (guest, #313) [Link] (1 responses)

When I say logs, I'm talking about the stuff generated by operating systems and appliances into syslog + the logs that the applications write (sometimes to syslog, more frequently to local log files that then have to be scraped to be gathered). This includes things like webserver logs (which I find to be a significant percentage of the overall logs, but only ~1/3)

I do add some additional data to the log stream, but it's low volume compared to the basic logs I refer to above (A few log messages/min per server)

Also, keep in mind that when I talk about sizing a logging system, most of the time I'm talking about the peak data rate: what it takes to keep up with the logs at the busiest part of the busiest day.

I want a logging system that can process all logs within about two minutes of when they are generated. This is about the limit, as far as I've found, for having the system react to log entries or having changes start showing up in graphs.

There is also the average volume of logs per day. This comes into play when you are sizing your storage.

So when I talk about 100K logs/sec or 1Gb/s of logs being delivered, this is at the peak time.

100K logs/sec @ 256 bytes/log = 25MB/sec (1.5GB/min, 90GB/hour). If you send this logging traffic to four destinations (archive, search, alerting, reporting), you are at ~100MB/sec of the theoretical 125MB/sec signalling rate that Gig-E gives you. In practice this is right about at, or just above, the limit of what you can do with default network settings (without playing games like jumbo frames, compressing the network stream, etc.). The talk I posted the link to earlier goes into the tricks for supporting this sort of thing.
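
For what it's worth, the back-of-the-envelope arithmetic behind those numbers (using the 256 bytes/log figure above):

    # Back-of-the-envelope check of the figures above.
    logs_per_sec = 100_000
    bytes_per_log = 256
    destinations = 4
    gige_bytes_per_sec = 125_000_000          # 1Gb/s theoretical signalling rate

    per_dest = logs_per_sec * bytes_per_log   # ~25.6 MB/sec per destination
    total = per_dest * destinations           # ~102 MB/sec across all four
    print(f"per destination: {per_dest / 1e6:.1f} MB/s")
    print(f"all destinations: {total / 1e6:.1f} MB/s "
          f"({100 * total / gige_bytes_per_sec:.0f}% of Gig-E signalling rate)")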

But it's important to realize that this data rate may only be sustained for a few minutes per day on a peak day, so the daily volume of logs can be <1TB/day on a peak day (which compresses to ~10GB), and considerably less on off-peak days. Over a year, this may average out to 500GB/day (since I'm no longer there I can't look up the numbers, but these are in the ballpark).

This was at a company providing banking services for >10M users.

Now, the company that I recently started at is generating 50GB of Windows event log data per day most weekdays (not counting application logs, firewall logs, IDS logs, etc.) from a couple hundred production systems. I don't yet have a feed of those logs, so I can't break it down any more than that yet, but the type of business we do is very cyclical, so I would expect that on the peak days of the month/year the Windows event log volume will easily be double or triple that.

If you parse the log files early and send both the parsed data and the raw data, the combined parsed data can easily be 2x-3x the size of the original raw data (metadata that you gather just adds to this)

As a result, Elasticsearch needs to be sized to handle somewhere around 500GB/day (3 x 3 x 50GB/day) for its retention period, in order to handle the peak period with a little headroom.

Domesticating applications, OpenBSD style

Posted Jul 24, 2015 1:55 UTC (Fri) by dlang (guest, #313) [Link]

As always, your log sizes and patterns may vary; measure them yourself. But I'll bet you'll be surprised at how much log data you actually have around.

Domesticating applications, OpenBSD style

Posted Jul 23, 2015 22:30 UTC (Thu) by job (guest, #670) [Link]

Anyone who works under regulations such as SOX needs to keep an archive of logs in their sequential form. By far the easiest way to do that is to chunk them up, gzip them, sign them if necessary, and file them away. That should be quite a common use case in larger organizations, if perhaps not at the same volume. It's also a good idea to keep a (pruned) copy in Elasticsearch for your daily bug hunting activities...

