Visualization of Ruby's Grammar

[Posted October 27, 2006 by ris]

Nick Sieger looks at Ruby's grammar on his blog. "As part of the momentum surrounding the Ruby implementer's summit, I have decided to take on a pet project to understand Ruby's grammar better, with the goal of contributing to an implementation-independent specification of the grammar. Matz mentioned during his keynote how parse.y was one of the uglier parts of Ruby, but just how ugly?" (Found on Linux Journal)

Visualization of Ruby's Grammar

Posted Oct 28, 2006 17:16 UTC (Sat) by tjc (guest, #137) [Link] (10 responses)

Don't miss the C and Python graphs posted further down in comment #8:

http://flickr.com/photos/nicksieger/281055530/
http://flickr.com/photos/nicksieger/281055485/

It's interesting how clean the C and Phython graphs are compared to the other three.

Visualization of Ruby's Grammar

Posted Oct 28, 2006 18:47 UTC (Sat) by gmaxwell (guest, #30048) [Link]

I wonder if such a graph for perl could be generated in finite time... :)

Visualization of Ruby's Grammar

Posted Oct 28, 2006 20:29 UTC (Sat) by hilmi (guest, #41368) [Link]

I wonder if the reason for Ruby's graph looking different is the
fundamental difference of Ruby from others that everything is an
expression and returns a value (like Lisp). In Ruby, you can embed any
language structure into another by using parenthesis.

Visualization of Ruby's Grammar

Posted Oct 29, 2006 3:03 UTC (Sun) by mikov (guest, #33179) [Link] (6 responses)

These graphs are completely meaningless, as far as I can tell. They may be
pretty to look at, but that's all. So, the relative "cleanliness" of one
graphs vs the other tells as nothing. I suspect anybody with artistic
sense could make any graph to look better.

Further on, I should point that the original blog post appears mostly
meaningless too. They converted Ruby's grammar from BISON (LRAR(1) parser
generator) to ANTLR (LL(k) parser generator) using an automatic converter.

General automatic conversion between LR and LL is, of course, completely
impossible (since LR(1) languages are a superset of LL(k) languages). I
suspect that the converter they used simply translates some common idioms,
but _does not produce a valid LL grammar_.

So, this is all totally uninteresting, and actually fairly misleading.

Visualization of Ruby's Grammar

Posted Oct 30, 2006 15:30 UTC (Mon) by tjc (guest, #137) [Link] (1 responses)

These graphs are completely meaningless, as far as I can tell.

Most non-programmers would probably say the same thing about source code. ;-)

Visualization of Ruby's Grammar

Posted Oct 30, 2006 18:38 UTC (Mon) by mikov (guest, #33179) [Link]

I can't really understand the meaning of your post. I will take it as a not very successful joke :-) I think I explained why the graphs are meaningless, and more importantly that Ruby's graph is incorrect (LR(1) vs LL(k), etc)

This story should not really have been posted here - unlike most other stories on LWN, it has no useful content and is misleading. I suspect that most people do not know enough about parsing to be able to judge that, so they might take those graphs as a "valuable insight". They aren't. Perhaps I sound harsher than I intend to - I am just trying to be helpful to the other LWN readers.

Visualization of Ruby's Grammar

Posted Oct 31, 2006 16:21 UTC (Tue) by tjc (guest, #137) [Link] (1 responses)

They converted Ruby's grammar from BISON (LRAR(1) parser generator) to ANTLR (LL(k) parser generator) using an automatic converter.

Perhaps you mean LALR(1)? I do not know what LRAR(1) is.

General automatic conversion between LR and LL is, of course, completely impossible (since LR(1) languages are a superset of LL(k) languages). I suspect that the converter they used simply translates some common idioms, but _does not produce a valid LL grammar_.

Are you saying that Ruby can not be parsed with an ANTLR grammar, or that it's just the automatic conversion from LALR(1) that is problematic? Even if the generated grammar is incorrect, I suspect that a corrected version would look substantially the same, assuming that it can be generated at all.

So, this is all totally uninteresting, and actually fairly misleading.

On the contrary, I find this very interesting! See Bruce Person's comment about the primary node in Ruby, for example.

Visualization of Ruby's Grammar

Posted Oct 31, 2006 16:47 UTC (Tue) by mikov (guest, #33179) [Link]

Are you saying that Ruby can not be parsed with an ANTLR grammar, or that it's just the automatic conversion from LALR(1) that is problematic? Even if the generated grammar is incorrect, I suspect that a corrected version would look substantially the same, assuming that it can be generated at all.

I have a problem with the automatic conversion. I am sure pretty much anything (well, within limits) can eventually be parsed with ANTLR using backtracking, with varying amount of effort. In this case, since the generated grammar is almost certainly not correct, and since the person who did it even neglected to mention that, it cannot be taken seriously. I have no idea whether the diagram looks similar to one for the correct grammar. Do you ? Oh, it will probably have similar node names, but that doesn't really tell us much, which is my next point.

I have no idea what the diagram really means. Where is it defined ? Who uses it ? Subjective observations that one "looks cleaner" than the other are meaningless. I could rearrange any diagram to look terrible. Perhaps one could derive some value from it, of one already had intimate knowledge about the grammar, but not the other way around.

On the contrary, I find this very interesting! See Bruce Person's comment about the primary node in Ruby, for example.

I agree that his comment is interesting. Then I am pretty sure it is because he probably knows Ruby, not because he actually used the diagram to derive insight. His comments about the size of the parser make sense in the same context. The observation about the lexical analyzer doing a lot of work especially doesn't stand on its own - it is impossible to deduce that from the diagram.

The problem is that the entire thing it is based on a wrong premise. Even if it can generate interesting comments and useful tidbits of information, ultimately it is misleading. A casual reader could be lead to believe that:

Conversion between LR and LL is trivial.
This diagram is the true diagram of Ruby's syntax. (whatever that means)
Diagrams are often used as tools to study and evaluate grammars
Subjective evaluation of the artistic properties of the diagram corresponds to properties of the language.

All these things are false.

Visualization of Ruby's Grammar

Posted Nov 1, 2006 9:18 UTC (Wed) by rafdz (guest, #41427) [Link] (1 responses)

LR(1) grammares are not supersets of LL(k) grammars.only LR(k) are supersets of LL(k) grammars.

Visualization of Ruby's Grammar

Posted Nov 1, 2006 16:44 UTC (Wed) by mikov (guest, #33179) [Link]

While I am a little rusty on the theory, I seem to remember that any LR(k)
grammar can be converted to an equivalent LR(1) grammar, and since LR(k)
is superset of LL(k), it follows that LR(1) is superset of LL(k).

Visualization of Ruby's Grammar

Posted Nov 2, 2006 17:07 UTC (Thu) by kbob (guest, #1770) [Link]

I'd like to see the graph for Lisp... (-: