|
|
Log in / Subscribe / Register

System monitoring with osquery

May 24, 2017

This article was contributed by Nur Hussein

Your operating system generates a lot of run-time data and statistics that are useful for monitoring system security and performance. How you get this information depends on the operating system you're running. It could be a from report in a fancy GUI, or obtained via a specialized API, or simply text values read from the filesystem in the case of Linux and /proc. However, imagine if you could get this data via an SQL query, and obtain the output as a database table or JSON object. This is exactly what osquery lets you do on Linux, macOS, and Windows.

Osquery is an open-source project created by Facebook and hosted at Github, which the company released to the world under a 3-clause BSD license in 2014. It initially supported only Linux and macOS, but a Windows port (with somewhat lesser capabilities) was released in 2016, allowing osquery to have a unified SQL-based query interface for quite different operating systems. A data center running multiple operating systems would be able to query the state of the infrastructure using a single querying interface, simplifying the data collection implementation for DevOps teams. The osquery development team wanted to created a fast, reliable, and easy instrumentation tool that did not require a lot of low-level programming to retrieve information from.

System information is presented in the form of database tables; it is dynamically generated at query time rather than retrieved from storage. Naturally, different operating systems have slightly different information to present to the user. However, the osquery developers have strived for "feature parity" between systems; there is a set of common tables available for every supported platform, along with a set of operating-system-specific tables. Querying the tables is done using a subset of SQL that follows the SQLite syntax.

Using osquery

Packages for osquery are available for CentOS and Ubuntu. I installed the provided package for Ubuntu 16.04 and it required no tweaking to get up and running immediately using the default configuration. There are two ways to use osquery: interactively or as an operating system service. The interactive program, osqueryi, can be used as a command interpreter that is similar to interactive prompt of SQLite. The interactive program is useful for exploring the running system. The available tables can be found in the documentation.

I ran a few sample queries to try out osquery in interactive mode. For example, getting a list of logged-in users, with their tty, login time, process id of the login, and host they are logging in from:

    osquery> SELECT * from logged_in_users;
    +-----------+----------+-------+------------------+------------+------+
    | type      | user     | tty   | host             | time       | pid  |
    +-----------+----------+-------+------------------+------------+------+
    | boot_time | reboot   | ~     | 4.8.0-52-generic | 1495211035 | 0    |
    | runlevel  | runlevel | ~     | 4.8.0-52-generic | 1495211045 | 53   |
    | login     | LOGIN    | tty1  |                  | 1495211045 | 792  |
    | user      | hussein  | pts/8 | 10.0.2.2         | 1495211067 | 1134 |
    +-----------+----------+-------+------------------+------------+------+

The table lists my username as well as several pseudo-users, such as boot_time and runlevel.

Starting osqueryi with the --json flag gives JSON output:

    osquery> SELECT version from os_version;
    [
      {"version":"16.04.2 LTS (Xenial Xerus)"}
    ]

Querying basic system information is fun, but the true usefulness of this tool shines when pulling data from several different tables to make inferences about the system. For example, if I were a system administrator I might want to check for processes with uid 0 (privileged processes) opening network connections to the outside world. The appearance of such connections is symptomatic of fishy behaviour that might indicate unauthorized system use. To do this, we can do an SQL join of some tables to search for any process with uid 0 with an open socket. To test this, I first needed a root-owned process to create an open socket, so I opened a network connection with sudo using telnet to google.com's port 80. Then I ran this query:

    osquery> SELECT DISTINCT processes.uid, process_open_sockets.pid, 
                       process_open_sockets.remote_address,
	               process_open_sockets.local_port, 
                       process_open_sockets.remote_port
		    FROM process_open_sockets INNER JOIN processes 
                    WHERE processes.pid=process_open_sockets.pid
		       AND processes.uid=0 
                       AND process_open_sockets.remote_address <> "" 
                       AND process_open_sockets.remote_address <> "0.0.0.0" 
                       AND process_open_sockets.remote_address <> "10.0.2.2" 
                       AND process_open_sockets.remote_address <> "::";

    +-----+------+----------------+------------+-------------+
    | uid | pid  | remote_address | local_port | remote_port |
    +-----+------+----------------+------------+-------------+
    | 0   | 2641 | 64.233.188.99  | 36600      | 80          |
    +-----+------+----------------+------------+-------------+

We do an inner join of the processes and process_open_sockets tables, using the uid information from the processes table to filter the results, and disregarding local addresses. As revealed from the table, there is a process with pid 2641 connected to 64.233.188.99 (one of Google's public-facing IPs) on port 80. I needed to run osqueryi using sudo, because otherwise it cannot display information that a regular user does not have the permissions to view. While it is possible to run other SQL commands beside SELECT, commands like UPDATE, INSERT, and DELETE don't do anything on the standard tables, as they are all understandably read-only. It is also possible to pipe queries into osqueryi and get the results back via stdout, which makes it possible to use as part of shell scripts.

Osquery can run as a service called osqueryd, which is a monitoring daemon that allows scheduling system queries or generating queries based on events. Query results are logged, either to the filesystem or via a plugin to a service such as Amazon's AWS. Configuring the daemon involves writing a script with a JSON-formatted list of SQL queries and the intervals at which they should run, much like a cron script. A scheduled query produces a log of the chosen system information at discrete intervals, so it is useful for seeing how the system state changes over time.

Logging, security, and monitoring

Osquery is useful for cross-platform monitoring of system infrastructure. The above example where the detection of root-owned processes that open network connections could be configured as a periodic query using osqueryd; any suspicious activity would be monitored and logged.

Additionally, a feature of osquery called file integrity monitoring on Linux and macOS systems tracks filesystem changes. This is accomplished via inotify on Linux and on macOS using FSEvents. Any files that a user wants to track need to be specified in the configuration file, and the status of the files can be read in a special table called file_events. Logs can be generated when a file is either accessed or changed. When a file change event happens, the MD5, SHA-1, and SHA-256 hashes of the affected file will be recalculated and logged.

Finally, basic auditing is also available on Linux and on macOS (via a kernel extension); the table process_events records process creation details. A table called socket_events is available on Linux that stores reports from the bind() and connect() system calls, as well.

There are two types of logs generated by osquery: status logs and results logs. Status logs contain the execution information of osquery itself; they are created using the Glog framework. Results logs are the output of queries, and come in two varieties: snapshot and differential. A snapshot logs the output of the entire query; that can often result in large logs as entire tables are written to disk at the specified intervals. A more space-economical format is the differential log, which is a format that shows the user which records were either added, removed, or changed.

By default osquery will log to the filesystem, but additional log aggregation options are available via plugins. Osquery version 1.7.4 and later can log directly to Amazon Web Service's Kinesis Streams and Kinesis Firehose, which are cloud-based data buckets that can capture data streams for storage and analysis.

Custom tables, extensions, and plugins

Custom tables can be added to the osquery source code; those tables can be filled with data from any data source. There are two parts to this: defining the table structures themselves, and then creating custom C++ classes to obtain the data to populate them. The table structure definition is done via defining a schema written in Python that will auto-generate the necessary C/C++ implementation of the table. Then a class is created for populating the data, which can be done via system calls, reading in values from /proc, or be dynamically generated. To get an idea of what this involves, there is example code given in the documentation that creates a custom table that holds the current time derived from the time() system call.

Osquery's functionality can also be built upon with plugins and extensions. Extensions are separate processes built and linked together with the osquery core code; they communicate with the main osquery process (either osqueryd or osqueryi) via a Thrift API. Extensions are useful for creating plugins for things such as logging and configuration.

Conclusion

Osquery is a unique tool for system monitoring. While it is useful on its own, one can imagine it as a basis for creating more complex monitoring systems with sophisticated malware and anomaly detection, where rules can be easily created by simply crafting an appropriate query, or even dynamic monitoring tools that create conditional monitoring rules based on real-time events. SQL is a powerful and expressive query language, and its use for log analysis gives a flexible tool to system administrators and DevOps teams.


Index entries for this article
GuestArticlesHussein, Nur


to post comments

System monitoring with osquery

Posted May 26, 2017 9:37 UTC (Fri) by amarao (guest, #87073) [Link] (5 responses)

I had really excited to read about it. I even has played with it in a sandbox machine for awhile.

There was one thing which I assumed to be supported: remote queries. I assumed that osqueryi (command line utility) can connect to remove osqueryd. It was so obvious to have and to do.

As far as I could see, there is not a single option which allows osqueryd to listen on TCP socket.

When I realized that this is not a 'remote SQL interface to your servers', most of my enthusiasm has faded. Yes, this thing is interesting, but it forces too much policy onto users, and it provides too few mechanisms.

System monitoring with osquery

Posted May 26, 2017 10:25 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (3 responses)

Remote communication can require an authentication scheme first which may not have been implemented yet. If you want, couldn't you use socat and/or SSH port forwarding to access the domain socket (I assume) it is using?

System monitoring with osquery

Posted May 26, 2017 10:31 UTC (Fri) by amarao (guest, #87073) [Link] (2 responses)

Yes, we could use ssh for this. Actually, in production, we're already using ssh, with old-school nagios checks under shinken. It works, but constant ssh sessions cause big strain on system. PAM, auth logging, key negotiations (including host key changes) - it's all too bulky and asks for refactoring.

I thought about trying osquery for this, but, as it seems, it wouldn't solve any of those problems. Too sad.

System monitoring with osquery

Posted May 29, 2017 11:34 UTC (Mon) by robbe (guest, #16131) [Link] (1 responses)

Take a look at ControlMaster and friends in ssh_config(5).

System monitoring with osquery

Posted May 29, 2017 12:05 UTC (Mon) by amarao (guest, #87073) [Link]

> Take a look at ControlMaster and friends in ssh_config(5).

As administrator I could say one thing: Never ever use ControlMaster for monitoring purposes. ControlMaster works well if servers and networks work well. But monitoring should continue to work even there are some nasty things happen with network and hosts. If previous TCP connection stuck in oblivion due to stalled contrack, or RST which was lost in turbulence, ControlMaster will cause massive false positives, distracting people from actual issues, or even masking actual problems on hosts by mundane 'network glitch' issue.

System monitoring with osquery

Posted May 27, 2017 6:59 UTC (Sat) by antifuchs (subscriber, #34569) [Link]

At work I believe we use doorman (https://github.com/mwielgoszewski/doorman) for this, and it works reasonably well.

System monitoring with osquery

Posted May 30, 2017 7:11 UTC (Tue) by teddy (guest, #116392) [Link]

Hi all, osquery developer here, always glad to hear positive feedback about the project and that people are deriving value.

From the Linux-based OS perspective we receive a lot of feedback about the build and dependency setup. Our cmake build does not play nice with operating system package managers as it tries to produce binaries that are portable: linking against old ABIs, statically building and compiling most C++ dependencies, setting up a brew-based local copy of modern tools, and holding GTest+SQLite in a submodule. If any package maintainer (who also dabbles in C++ development) wants to bring osquery into your OS's manager please be patient with us, we have the best intentions and would be more than happy for the help!

If anyone is interested in new features or expanding the set of virtual tables please say hi in the project's Slack or send us a Github issue: https://github.com/facebook/osquery


System monitoring with osquery

Posted Jun 4, 2017 18:57 UTC (Sun) by yodermk (subscriber, #3803) [Link]

This would be nice as a PostgreSQL foreign data wrapper. Then you could do even more arbitrary queries and data manipulation and query many of them together with other things from a single connection.


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds