LWN: Comments on "Malcolm: SQL for the command line: "show"" https://lwn.net/Articles/324908/ This is a special feed containing comments posted to the individual LWN article titled "Malcolm: SQL for the command line: "show"". en-us Sun, 05 Oct 2025 05:09:11 +0000 Sun, 05 Oct 2025 05:09:11 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net Author responds https://lwn.net/Articles/326797/ https://lwn.net/Articles/326797/ nix <div class="FormattedComment"> What I've wondered about doing is hacking syslog() in libc and the syslog <br> protocol to pass the format string and arguments separately (as well as as <br> a formatted whole), so that syslog-ng can use its existing facilities to <br> dump the lot in a database. Then we can *really* do log analysis, with <br> variable and fixed parts spliced out. (The problem is the break of the <br> syslog protocol, though. I considered analyzing log messages to attempt to <br> retrospectively determine which parts are format string and which are <br> arguments, but that rapidly gets into a pattern-matching tarpit.)<br> </div> Wed, 01 Apr 2009 22:38:59 +0000 Author responds https://lwn.net/Articles/326751/ https://lwn.net/Articles/326751/ ortalo <div class="FormattedComment"> idea/need: ntsyslog backend (for parsing Windows event logs archived via NTsyslog to a Unix machine).<br> <p> But the actual reason for my comment was another suggestion. Have you considered implementing the same kind of backends inside a full fledged database? It seems to me at least PostgreSQL should offer enough extensibility to allow this. It could free you from dealing with the intricacies/limitations of an "SQL-like" parser and may open the door to more complex treatments (dunno if writing would be feasible).<br> Or maybe you would find this too overweight for your intended usage? (I routinely have to consider &gt;30Go of compressed log files so, even a full-fledged database engine does not seem overkill sometimes.)<br> Gonna look at your tool anyway. Thanks for the contribution.<br> </div> Wed, 01 Apr 2009 18:53:32 +0000 Special filenames https://lwn.net/Articles/325980/ https://lwn.net/Articles/325980/ jengelh <div class="FormattedComment"> Here you have your Useless Use Of Cat Award.<br> </div> Sat, 28 Mar 2009 11:22:08 +0000 Malcolm: SQL for the command line: "show" https://lwn.net/Articles/325945/ https://lwn.net/Articles/325945/ nix <div class="FormattedComment"> SQL extensions are an even more classic example. Look at MODEL, for <br> instance. In case the relational model is 'too hard', now you can turn <br> your DB into a tiny spreadsheet and bash at it in the query. How <br> relational...<br> <p> (actually it *is* useful, but that doesn't mean it's not totally bizarre <br> and screwy. The real problem here is SQL's halfassed incapable <br> implementation of half the relational calculus in a non-Turing-complete <br> fashion. But it paid for my house so I can't complain *too* terribly <br> hard.)<br> </div> Sat, 28 Mar 2009 00:55:23 +0000 Malcolm: SQL for the command line: "show" https://lwn.net/Articles/325905/ https://lwn.net/Articles/325905/ marcH Hey, how consistent is this? <pre> - delete from TABLE where ROWfilter - select COLnames from TABLE where ROWfilter - insert into TABLE values COLvalues </pre> Every time I go back to SQL I have to google again for examples... <p> Higher level concerns: "Why SQL Sucks": <a href="http://perlmonks.org/?node_id=515776"> http://perlmonks.org/?node_id=515776</a> Fri, 27 Mar 2009 20:09:24 +0000 Malcolm: SQL for the command line: "show" https://lwn.net/Articles/325766/ https://lwn.net/Articles/325766/ zlynx <div class="FormattedComment"> Insert's syntax may seem redundant for *simple* use. But if you do complicated things, it isn't.<br> <p> For example, the list of values isn't limited to one thing. Values can just go on and on, inserting as many records as you like.<br> <p> Or, you can put a SELECT statement there instead of VALUES.<br> <p> Now what seemed redundant is necessary.<br> <p> I won't argue about DELETE and UPDATE assuming all records without a WHERE being dangerous and stupid though. :-)<br> </div> Thu, 26 Mar 2009 23:56:25 +0000 Malcolm: SQL for the command line: "show" https://lwn.net/Articles/325611/ https://lwn.net/Articles/325611/ Wol <div class="FormattedComment"> You'd be surprised ...<br> <p> IBM are on record as saying that their version (the U2 databases) is the fastest growing product in their database VAR sales.<br> <p> And International Spectrum is holding their conference right now - against a background of collapsing conference attendances (typically down 25 - 50 %), they're holding their own - I think they were down 7% (or was it up?)<br> <p> imho relational *theory* is great. Unfortunately, relational practice falls foul of Einstein's corollary to Occam - practice is TOO simple, therefore system complexity (as in all the stuff *round* the database) rises sharply as a result. SQL queries are a classic example :-)<br> <p> Cheers,<br> Wol <br> </div> Thu, 26 Mar 2009 14:06:22 +0000 Author responds https://lwn.net/Articles/325471/ https://lwn.net/Articles/325471/ dave_malcolm <div class="FormattedComment"> Fame at last!<br> <p> Follow-up posting here: <a href="http://dmalcolm.livejournal.com/2009/03/24/">http://dmalcolm.livejournal.com/2009/03/24/</a> with info on git repo etc.<br> <p> Thanks for the feedback. As some have noted, this isn't intended just for log files. Note that in the original post it already had "rpm" and "proc" backends, for querying the local rpm database and /proc respectively.<br> <p> It can now parse many of the files in /etc, using the Augeas library (<a href="http://augeas.net">http://augeas.net</a>)<br> <p> I also just added tcpdump support, so you can look at e.g. a wireshark dump and run something like this:<br> $ show "count(*)", "total(length)", src_mac, dst_mac from test.pcap group by src_mac, dst_mac<br> (though this is merely a messy proof-of-concept hack at this stage)<br> <p> Ideas for other backends most welcome.<br> </div> Thu, 26 Mar 2009 00:08:37 +0000 Malcolm: SQL for the command line: "show" https://lwn.net/Articles/325237/ https://lwn.net/Articles/325237/ flewellyn <div class="FormattedComment"> Well, SQL is not, perhaps, the best thing every when it comes to query languages. I mean, it works okay, and it's "good enough", and I'll defend it on those grounds. But it's hardly the epitome of query languages.<br> <p> Truth be told, my ideal query language would be something Lisplike, with query operations specified by functions and special operators (or macros), and "views" just being newly defined querying functions or macros.<br> <p> Hmmm...now there's an idea...<br> </div> Tue, 24 Mar 2009 21:50:22 +0000 Special filenames https://lwn.net/Articles/325235/ https://lwn.net/Articles/325235/ skx <p>Or just use <a href="http://www.steve.org.uk/Software/asql">asql</a> which has a nice built-in shell for live queries, or the ability to run queries from the command line.</p> Tue, 24 Mar 2009 21:36:51 +0000 Malcolm: SQL for the command line: "show" https://lwn.net/Articles/325234/ https://lwn.net/Articles/325234/ skx <p>It is funny how people get hooked on using SQL for querying logfiles. Last year or so I wrote <a href="http://www.steve.org.uk/Software/asql">asql</a> which I use for producing adhoc statistics form Apache logfiles.</P> <p>Simple usage is:</p> <pre> $ asql asql v1.2 - type 'help' for help. asql&gt; load /home/www/www.steve.org.uk/logs/access.log Loading: /home/www/www.steve.org.uk/logs/access.log asql&gt; SELECT source,SUM(size) AS Number FROM logs GROUP BY source ORDER BY Number DESC, source LIMIT 0,10; 67.195.37.112 4681922 74.6.17.185 2353628 87.120.8.52 2066975 77.36.6.72 1859180 ... </pre> <p>Finding the top ten referers becomes:</p> <pre> asql&gt; SELECT referer,COUNT(referer) AS number from logs WHERE referer NOT LIKE '%steve.org.uk%' GROUP BY referer ORDER BY number DESC,referer LIMIT 0,10; - 1888 http://www.gnu.org/software/gnump3d/download.html 12 http://community.livejournal.com/lotr_tattoos/?skip=20 5 http://lua-users.org/wiki/LibrariesAndBindings 4 </pre> <p>Although SQL is often not the most natural way to query things I found it very useful and natural in this context.</p> Tue, 24 Mar 2009 21:35:39 +0000 Malcolm: SQL for the command line: "show" https://lwn.net/Articles/325137/ https://lwn.net/Articles/325137/ flewellyn <div class="FormattedComment"> That's fine, but who typically knows ENGLISH these days? SQL is far more common.<br> </div> Tue, 24 Mar 2009 16:24:41 +0000 Malcolm: SQL for the command line: "show" https://lwn.net/Articles/325125/ https://lwn.net/Articles/325125/ Wol <div class="FormattedComment"> Log files are typically two-dimensional :-)<br> <p> So yes, SQL is probably a good language for querying them. But then, so is ENGLISH, because it's n-dimensional (actually, it doesn't work that well if n hits 4 or more :-( so horses for courses, I'd use ENGLISH because that's what I'm comfortable with.<br> <p> Cheers,<br> Wol<br> </div> Tue, 24 Mar 2009 15:51:54 +0000 Special filenames https://lwn.net/Articles/325118/ https://lwn.net/Articles/325118/ efexis <div class="FormattedComment"> Standard unixy shell methodology, 'show' reads from standard input, so becomes:<br> <p> show host, "count(*)", "total(size)" &lt; /var/log/httpd/access_log<br> <p> or <br> <p> cat /var/log/httpd/*access_log* | show host, "count(*)", "total(size)"<br> <p> would solve that<br> </div> Tue, 24 Mar 2009 15:26:00 +0000 Malcolm: SQL for the command line: "show" https://lwn.net/Articles/325100/ https://lwn.net/Articles/325100/ pierre <div class="FormattedComment"> <a rel="nofollow" href="http://www.logparser.com/">http://www.logparser.com/</a> is quite powerful.<br> </div> Tue, 24 Mar 2009 13:32:28 +0000 Malcolm: SQL for the command line: "show" https://lwn.net/Articles/325045/ https://lwn.net/Articles/325045/ jordanb <div class="FormattedComment"> I find SQL to be a horrible interactive query language. So many things that should be implicit aren't. You can't say <br> <p> # select gender, race, sum(income)/count(income);"<br> <p> for instance. You have to say:<br> <p> # select gender, race, industry sum(income)/count(id) group by gender, race, industry;<br> <p> even though there's only one rational way to group the data. In fact. If you leave off any term in the group by statement, most servers will raise an error.<br> <p> While there's silly redundancy in the select statement, the others have horrible and dangerous defaults, for instance this:<br> <p> # delete from records;<br> <p> doesn't raise an error or do nothing, which would be sensible (you didn't specify what to delete!) instead, it deletes *everything* in the table. It'd be like if 'rm's behavior without arguments was to delete every file in the current directory.<br> <p> Then there's insert's annoying positional syntax, which is both tediously redundant *and* error prone. Plus the 'shortcut' way to do an insert ignores the fact that nearly every table has as its first column an auto-incrementing ID, forcing you to either go to the horrible long-form or use a non-standard workaround if you're lucky enough to be using a server that has one.<br> <p> Anyway, while his tool is fairly neat I think it might have been more useful if he'd made something to coerce the data into sqlite. It'd have been less work and you don't get stuck with the shell argument escaping horror he demonstrates there.<br> <p> </div> Tue, 24 Mar 2009 02:48:29 +0000 Malcolm: SQL for the command line: "show" https://lwn.net/Articles/325038/ https://lwn.net/Articles/325038/ flewellyn <div class="FormattedComment"> Well, is it incorrect to interpret log files relationally?<br> </div> Tue, 24 Mar 2009 00:44:54 +0000 Malcolm: SQL for the command line: "show" https://lwn.net/Articles/325034/ https://lwn.net/Articles/325034/ Wol <div class="FormattedComment"> :-)<br> <p> Sorry, but I did *not* say "use English". I said "use ENGLISH" (ENGLISH being a dedicated data access language).<br> <p> ENGLISH is the original Pick data query language, and is a very good NFNF query tool (It's also called ENGLISH because it is, actually, very similar to English!) For example<br> <p> SELECT INVOICE WITH INVOICE.TOTAL EQ 1600 AND WHERE LINE.ITEM EQ 215<br> <p> will select all invoices where the invoice value is 1600 and any individual line is 215.<br> <p> SQL is *not* a "moderately straightforward mapping of the concepts of database retrieval to English" - no way would I describe it as "moderately straightforward", and it is very relational-oriented. Using it to query a non-relational database is *horrid*.<br> <p> Cheers,<br> Wol<br> </div> Tue, 24 Mar 2009 00:21:28 +0000 Malcolm: SQL for the command line: "show" https://lwn.net/Articles/325009/ https://lwn.net/Articles/325009/ flewellyn <div class="FormattedComment"> It depends. SQL is really a domain-specific language, and it does queries very well. But it doesn't do other things that we might want. Having a query language which was fully general, and able to do any sort of general-purpose programming, would be nice. But then, what language to use?<br> </div> Mon, 23 Mar 2009 20:42:00 +0000 Malcolm: SQL for the command line: "show" https://lwn.net/Articles/325005/ https://lwn.net/Articles/325005/ clugstj <div class="FormattedComment"> OK, I was a little flippant before. A useful tool makes simple things simple and complex things possible. I see SQL in this case as only the first. Once you try to do something complex, you will have to throw it out and go back to a more general programming language.<br> </div> Mon, 23 Mar 2009 20:30:19 +0000 Special filenames https://lwn.net/Articles/324997/ https://lwn.net/Articles/324997/ epa <pre>show host, "count(*)", "total(size)" from /var/log/httpd/*access_log*</pre> And what happens when there is a file called 'where'? <p> OK, in this example it would be /var/log/httpd/where which presumably wouldn't trip up the parser. But using this thing in your current directory could be flaky. Mon, 23 Mar 2009 20:00:15 +0000 Malcolm: SQL for the command line: "show" https://lwn.net/Articles/324981/ https://lwn.net/Articles/324981/ KGranade <div class="FormattedComment"> First, as has been mentioned already, the data he's proposing accessing with SQL IS mostly (or totally, not sure) flat tables ( log files, /proc, tables of installed packages, sounds like flat tables to me ).<br> <p> Secondly, you recommend using something like English, but I have no idea how the examples you present would be used, which means learning a specialized syntax, which means I might as well use a generalized syntax that is a moderately straightforward mapping of the concepts of database retrieval to English... which is a decent description of SQL.<br> <p> &lt;Insert overly-verbose rant about the pitfalls of "natural language programming here&gt;<br> </div> Mon, 23 Mar 2009 18:51:24 +0000 Malcolm: SQL for the command line: "show" https://lwn.net/Articles/324978/ https://lwn.net/Articles/324978/ mrshiny <div class="FormattedComment"> Yeah but for analyzing structured log files such as the apache logs, this is the most awesome thing ever. I want this for my servers at work.<br> </div> Mon, 23 Mar 2009 18:09:28 +0000 Malcolm: SQL for the command line: "show" https://lwn.net/Articles/324921/ https://lwn.net/Articles/324921/ Wol <div class="FormattedComment"> Use something like ENGLISH :-) (otherwise known as ACCESS, INFORM, RETRIEVE etc :-)<br> <p> But seriously, if your data naturally fits a flat table, SQL is a good fit. Most data, however, doesn't.<br> <p> Cheers,<br> Wol<br> </div> Mon, 23 Mar 2009 16:53:00 +0000 Malcolm: SQL for the command line: "show" https://lwn.net/Articles/324913/ https://lwn.net/Articles/324913/ flewellyn <div class="FormattedComment"> Well, of course it does. It ISN'T a general programming language.<br> <p> But, that's not what he wants to use it for. He only wants to use it for its problem domain: querying a data store of some kind. And in that domain, it's perhaps not the best thing ever, but it's quite suitable and certainly well-understood.<br> <p> I just wonder how he'll handle issues like complex queries, with things like subqueries and joins and the like.<br> </div> Mon, 23 Mar 2009 16:33:52 +0000 Malcolm: SQL for the command line: "show" https://lwn.net/Articles/324912/ https://lwn.net/Articles/324912/ clugstj <div class="FormattedComment"> "Why can't I just use SQL"<br> <p> Because SQL sucks as a general programming language.<br> </div> Mon, 23 Mar 2009 16:29:46 +0000 Malcolm: SQL for the command line: "show" https://lwn.net/Articles/324909/ https://lwn.net/Articles/324909/ sjj <div class="FormattedComment"> In Microsoft world, Log Parser does exactly this. It is extremely useful. And free (beer-like).<br> <p> <a rel="nofollow" href="http://forums.iis.net/default.aspx?GroupID=51">http://forums.iis.net/default.aspx?GroupID=51</a><br> </div> Mon, 23 Mar 2009 16:27:14 +0000