January 26, 2011
This article was contributed by Robert Fekete
Correlating log messages to get a deeper insight into the actual events
happening on a network or server is an important element of IT security.
Being able to do so is mandated by several security compliance standards, best
practices, and also common sense. However, many common log analysis and
correlation engines cannot handle high message rates in real time,
requiring administrators to filter the input of the analysis
engine. Proprietary solutions are often licensed based on the number of
processed messages, which limits their usefulness. The syslog-ng project aims to provide a flexible, real-time correlation solution that scales well even to extreme performance requirements.
Syslog-ng is
an advanced system logging tool, which can be a replacement for the
standard syslogd and rsyslog daemons. The syslog-ng pattern database,
introduced almost two years ago, allows for real-time message
identification and classification by comparing the incoming log messages to
a set of message patterns. The classification engine of syslog-ng is much
faster and more scalable than using regular expressions to identify messages,
and also permits the administrator to extract relevant information from the
message body or to add custom metadata (for example, tags) to log
messages. We looked at
message classification in syslog-ng just over a year ago.
The new message correlation feature extends the syslog-ng pattern database to make it possible to associate related log messages, and to treat the information from those messages as if they were a single event.
Message correlation is one of the foundations of log analysis and
reporting, because log output tends to be fragmented: important
information about a single event is often spread across several log messages. For
example, the Postfix e-mail server logs the sender and recipient addresses
into separate log messages. For OpenSSH, if there is an unsuccessful login
attempt, the server sends a log message about the authentication failure
with the reason for the failure in the next message. What matters, however,
is the event and its exact details, not necessarily the individual log
messages, so being able to collect information as events rather than
messages can be a boon for every system administrator.
How correlation works in syslog-ng
Message correlation in syslog-ng operates on the log messages
successfully identified by syslog-ng's pattern database: you can extend the rules describing message patterns with instructions on how to correlate the matching messages.
Correlating log messages involves collecting the messages into message
groups called contexts. A context consists of a series of log messages that
are related to each other in some way, for example, the log messages of an
SSH session can belong to the same context. Messages may be added to a
context as they are processed. The context of a log message can be specified using simple static
strings or with macros and dynamic values. For example, you can group
messages received from the same host ($HOST), application ($HOST$PROGRAM), or process ($HOST$PROGRAM$PID).
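The grouping described above can be sketched in a few lines of Python. This is only an illustration of the idea, not syslog-ng's implementation: the scope names, the dictionary-based message format, and the sample messages are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical scope names; they mirror the $HOST, $HOST$PROGRAM and
# $HOST$PROGRAM$PID groupings mentioned in the text.
SCOPE_FIELDS = {
    "host": ("host",),
    "program": ("host", "program"),
    "process": ("host", "program", "pid"),
}

def context_key(msg, scope):
    """Build the context identifier for a parsed message (a plain dict)."""
    return tuple(msg[field] for field in SCOPE_FIELDS[scope])

def group_by_context(messages, scope):
    """Collect messages into contexts keyed by the chosen scope."""
    contexts = defaultdict(list)
    for msg in messages:
        contexts[context_key(msg, scope)].append(msg)
    return dict(contexts)

# Made-up sample messages from two sshd processes on one host.
messages = [
    {"host": "web1", "program": "sshd", "pid": 101, "text": "Accepted password"},
    {"host": "web1", "program": "sshd", "pid": 102, "text": "Accepted publickey"},
    {"host": "web1", "program": "sshd", "pid": 101, "text": "session closed"},
]
by_process = group_by_context(messages, "process")
by_host = group_by_context(messages, "host")
```

With the process scope, the two messages from PID 101 end up in one context and the message from PID 102 in another; with the host scope, all three share a single context.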
Messages belonging to the same context are correlated, and can be processed in a number of ways. It is possible to include the information contained in an earlier message of the context in messages that are added later. For example, if a mail server application sends separate log messages about every recipient of an e-mail (like Postfix), you can merge the recipient addresses into the previous log message. Another option is to generate a completely new log message that contains all the important information that was stored previously in the context, for example, the login and logout (or timeout) times of an authenticated session (like SSH or telnet), and so on.
To ensure that a context handles only log messages of related events, a timeout value can be assigned to a context, which determines how long the context accepts related messages. If the timeout expires, the context is closed.
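A toy model may help to picture how a timeout bounds a context. The Python sketch below is not how syslog-ng is implemented; the class name, the choice to measure the timeout from the first message of the context, and the sample data are assumptions made for illustration. Timestamps are passed in explicitly as seconds.

```python
class ContextStore:
    """Toy context table: a context accepts messages until its timeout expires."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds a context stays open
        self.open = {}          # key -> (opened_at, [messages])
        self.closed = []        # contexts whose timeout has expired

    def expire(self, now):
        """Close every context that has been open for longer than the timeout."""
        for key in list(self.open):
            opened_at, msgs = self.open[key]
            if now - opened_at > self.timeout:
                self.closed.append((key, msgs))
                del self.open[key]

    def add(self, key, timestamp, text):
        """Expire stale contexts, then file the message under its context."""
        self.expire(timestamp)
        # Assumption: the timeout is measured from the first message.
        _, msgs = self.open.setdefault(key, (timestamp, []))
        msgs.append(text)

store = ContextStore(timeout=60)
store.add("web1/sshd/101", 0, "Accepted password for alice")
store.add("web1/sshd/101", 30, "session closed for user alice")
store.add("web1/sshd/101", 120, "Accepted password for bob")
```

The third message arrives after the 60-second window has passed, so the first two messages are closed off as one context and the third opens a fresh one under the same key.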
Triggering new messages and external actions
In syslog-ng Open Source Edition (OSE) 3.2, you can automatically generate new messages when a particular message is recognized, or when the correlation timeout of a context expires. The generated messages can be configured within the pattern database rules, meaning that if needed, a new message can be generated for every incoming log message. Obviously this is not necessary, unless you take log normalization really seriously.
When used together with message correlation, you can also refer to
fields and values of earlier messages of the context. For example, the
patterns:
<pattern>
Accepted @QSTRING:SSH.AUTH_METHOD: @ for@QSTRING:SSH_USERNAME: \
@from @QSTRING:SSH_CLIENT_ADDRESS: @port @NUMBER:SSH_PORT_NUMBER:@ ssh2
</pattern>
<pattern>
pam_unix(sshd:session): session closed for user @ESTRING:SSH_USERNAME: @
</pattern>
could be used to match OpenSSH's log messages. Then the action:
<value name="MESSAGE">
An SSH session for $SSH_USERNAME from ${SSH_CLIENT_ADDRESS}@1 \
closed. Session lasted from ${DATE}@1 to $DATE.
</value>
would put out a correlated message that included information from both log
messages. The above is just a snippet; consult the
full XML rules for all the gory details.
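To make the effect of that action concrete, here is a small Python sketch that builds the same kind of summary line from two already-parsed messages of one context. It only mimics the result of the template; the function name and the sample field values are invented for the example, and it says nothing about how syslog-ng resolves references to earlier messages internally.

```python
def session_summary(login, logout):
    """Build one event line from the two parsed messages of a context."""
    return (
        f"An SSH session for {logout['SSH_USERNAME']} "
        f"from {login['SSH_CLIENT_ADDRESS']} closed. "
        f"Session lasted from {login['DATE']} to {logout['DATE']}."
    )

# Made-up values standing in for the fields extracted by the two patterns.
login = {
    "SSH_USERNAME": "alice",
    "SSH_CLIENT_ADDRESS": "192.0.2.10",
    "DATE": "Jan 26 10:00:00",
}
logout = {"SSH_USERNAME": "alice", "DATE": "Jan 26 10:42:17"}

summary = session_summary(login, logout)
```

The client address and the start time come from the earlier (login) message, while the user name and the end time come from the message that closes the session.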
Sending alerts directly from syslog-ng is currently not supported, but would be a welcome addition in future versions. However, it is reasonably simple to pass the selected messages to an external script that sends out alerts by e-mail or SNMP. And since completely new messages can be created from the information extracted from the correlated messages, all the script has to do is send out the alerts, for example using sendmail or snmptrap.
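Such a script can be very short. The Python sketch below assumes that the script receives one generated message per line (for example on standard input) and that a sendmail binary is available on the host; the recipient address, the subject line, and the function names are placeholders invented for this example.

```python
import subprocess
from email.message import EmailMessage

def build_alert(line, recipient="root@localhost"):
    """Wrap one generated log message in an e-mail (recipient is a placeholder)."""
    msg = EmailMessage()
    msg["To"] = recipient
    msg["Subject"] = "syslog-ng alert"
    msg.set_content(line.rstrip("\n"))
    return msg

def sendmail_transport(msg):
    """Hand the e-mail to the local sendmail binary; -t reads recipients from the headers."""
    subprocess.run(["sendmail", "-t"], input=msg.as_bytes(), check=True)

def send_alerts(lines, transport):
    """Send one alert per non-empty line; return the number of alerts sent."""
    count = 0
    for line in lines:
        if line.strip():
            transport(build_alert(line))
            count += 1
    return count

# Dry run with a fake transport that only records the messages.
outbox = []
sent = send_alerts(["An SSH session for alice closed.\n", "\n"], outbox.append)
```

Keeping the transport as a parameter makes the script easy to try out without sending mail. A real deployment might call send_alerts(sys.stdin, sendmail_transport) instead, after importing sys.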
To process already collected log messages, syslog-ng also allows for
correlating log messages from log files. For this reason, the time elapsed
between two log messages is calculated from the actual timestamps of the
log messages instead of using the system time.
Beyond syslog-ng 3.2
Work on syslog-ng OSE 3.3 has already started, and focuses on improving the support for multicore and multithreaded operations to increase the performance of syslog-ng and make it even more suitable for high-message-rate environments. Transforming the internal representation of log messages to other, non-syslog outputs like JSON or WELF is also on the roadmap.
As correlating log messages becomes increasingly important for companies
and organizations, it is good to see that open source tools are also
focusing on solving this problem. Although the syslog-ng project has had a
sometimes rocky relationship with the open source community in the past,
its OSE is under active development. In fact, the message
correlation feature, among others, is currently available only in the OSE.
[ The author is a technical writer for BalaBit, which developed syslog-ng. ]