Ryan Davis
ruby_parser version 2.0.0 has been released!
http://rubyforge.org/projects/parsetree/
ruby_parser (RP) is a ruby parser written in pure ruby (utilizing
racc--which does by default use a C extension). RP's output is
the same as ParseTree's output: s-expressions using ruby's arrays and
base types.
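
As a rough illustration of what "s-expressions using ruby's arrays and base types" can look like, the sketch below builds a tree by hand from plain arrays and symbols and walks it. The node layout shown for `1 + 2` is an assumption made for illustration, not captured RP output, and `node_types` is a hypothetical helper rather than part of ruby_parser.

```ruby
# Hand-written sexp for `1 + 2`, built only from arrays, symbols and integers.
# The exact node layout is an assumption, not output captured from RP.
sexp = [:call, [:lit, 1], :+, [:arglist, [:lit, 2]]]

# Hypothetical helper: collect the type tag (first element) of every node,
# depth first. Non-array values (symbols, numbers) are leaves and are skipped.
def node_types(node)
  return [] unless node.is_a?(Array)
  [node.first] + node.drop(1).flat_map { |child| node_types(child) }
end

p node_types(sexp)
```

Because every node is an ordinary array whose first element names its type, generic tools can traverse the tree without any parser-specific classes.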
Changes:
=== 2.0.0 / 2008-10-22
* 1 major enhancement
* Brought on the AWESOME! 4x faster! no known lexing/parsing bugs!
* 71 minor enhancements
* 1.9: Added Fixnum#ord.
* 1.9: Added missing Regexp constants and did it so it'd work on 1.9.
* Added #store_comment and #comments
* Added StringScanner #begin_of_line?
* Added a bunch of tests for regexp escape chars, #parse_string,
#read_escape, ? numbers, ? whitespace.
* Added a hack for rubinius' r2l eval bug.
* Added a new token type tSTRING that bypasses tSTRING_BEG/END
entirely. Only does non-interpolated strings and then falls back to
the old way. MUCH cleaner tho.
* Added bin/ruby_parse
* Added compare rule to Rakefile.
* Added coverage files/dirs to clean rule.
* Added file and line numbers to all sexp nodes. Column/ranges to
come.
* Added lex_state change for lvars at the end of yylex.
* Added lexed comments to defn/defs/class/module nodes.
* Added stats gathering for yylex. Reordered yylex for avg data.
* Added tSYMBOL token type and parser rule to speed up symbol lexing.
* Added tally output for getch, unread, and unread_many.
* Added tests for ambiguous uminus/uplus, backtick in cmdarg, square
and curly brackets, numeric gvars, eos edge cases, string quoting %<>
and %%%.
* All cases throughout yylex now return directly if they match, no
passthroughs.
* All lexer cases now slurp entire token in one swoop.
* All zarrays are now just empty arrays.
* Changed s(:block_arg, :blah) to :"&blah" in args sexp.
* Cleaned up lexer error handling. Now just raises all over.
* Cleaned up read_escape and regx_options
* Cleaned up tokadd_string (for some definition of cleaned).
* Converted single quoted strings to new tSTRING token type.
* Coverage is currently 94.4% on lexer.
* Done what I can to clean up heredoc lexing... still sucks.
* Flattened resbodies in rescue node. Fixed .autotest file.
* Folded lex_keywords back in now that it screams.
* Found very last instanceof ILiteralNode in the code. haha!
* Got the tests subclassing PTTC and cleaned up a lot. YAY
* Handle yield(*ary) properly
* MASSIVELY cleaned out =begin/=end comment processor.
* Massive overhaul on Keyword class. All hail the mighty Hash!
* Massively cleaned up ident= edge cases and fixed a stupid bug
from jruby.
* Merged @/@@ scanner together, going to try to do the same
everywhere.
* Refactored fix_arg_lex_state, common across the lexer.
* Refactored new_fcall into new_call.
* Refactored some code to get better profile numbers.
* Refactored some more #fix_arg_lex_state.
* Refactored tail of yylex into its own method.
* Removed Module#kill
* Removed Token, replaced with Sexp.
* Removed all parse_number and parse_quote tests.
* Removed argspush, argscat. YAY!
* Removed as many token_buffer.split(//)'s as possible. 1 to go.
* Removed begins from compstmts
* Removed buffer arg for tokadd_string.
* Removed crufty (?) solo '@' token... wtf was that anyhow?
* Removed most jruby/stringio cruft from StringScanner.
* Removed one unread_many... 2 to go. They're harder.
* Removed store_comment, now done directly.
* Removed token_buffer. Now I just use token ivar.
* Removed use of s() from lexer. Changed the way line numbers are
gathered.
* Renamed *qwords to *awords.
* Renamed StringScanner to RPStringScanner (a subclass) to fix
namespace trashing.
* Renamed parse to process and aliased to parse.
* Renamed token_buffer to string_buffer since that arcane shit
still needs it.
* Resolved the rest of the lexing issues I brought up w/ ruby-core.
* Revamped tokadd_escape.
* Rewrote Keyword and KWtable.
* Rewrote RubyLexer using StringScanner.
* Rewrote tokadd_escape. 79 lines down to 21.
* Split out lib/ruby_parser_extras.rb so lexer is standalone.
* Started to clean up the parser and make it as skinny as possible
* Stripped out as much code as possible.
* Stripped yylex of some dead code.
* Switched from StringIO to StringScanner.
* Updated rakefile for new hoe.
* Uses pure ruby racc if ENV['PURE_RUBY'], otherwise uses the C extension.
* Wrote a ton of lexer tests. Coverage is as close to 100% as
possible.
* Wrote args to clean up the big nasty args processing grammar
section.
* lex_strterm is now a plain array, removed RubyLexer#s(...).
* yield and super now flatten args.
* 21+ bug fixes:
* I'm sure this list is missing a lot:
* Fixed 2 bugs both involving attrasgn (and ilk) esp when lhs is an
array.
* Fixed a bug in the lexer for strings with single digit hex escapes.
* Fixed a bug parsing: a (args) { expr }... the space caused a
different route to be followed and all hell broke loose.
* Fixed a bug with x\n=beginvar not putting begin back.
* Fixed attrasgn to have arglists, not arrays.
* Fixed bug in defn/defs with block fixing.
* Fixed class/module's name slot if colon2/3.
* Fixed dstr with empty interpolation body.
* Fixed for 1.9 string/char changes.
* Fixed lexer BS wrt determining token type of words.
* Fixed lexer BS wrt pass through values and lexing words. SO STUPID.
* Fixed lexing of floats.
* Fixed lexing of identifiers followed by equals. I hope.
* Fixed masgn with splat on lhs
* Fixed new_super to deal with block_pass correctly.
* Fixed parser's treatment of :colon2 and :colon3.
* Fixed regexp scanning of escaped numbers, ANY number is valid,
not just octs.
* Fixed string scanning of escaped octs, allowing 1-3 chars.
* Fixed unescape for \n
* Fixed: omg this is stupid. '()' was returning bare nil
* Fixed: remove_begin now goes to the end, not sure why it didn't
before.
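
Several of the lexer changes listed above (the StringScanner rewrite, slurping each token in one swoop, returning directly on a match, merging the @/@@ scanner, and the tSTRING and tSYMBOL token types) can be pictured with a small sketch. This is not RP's lexer: `next_token` and `lex` are hypothetical names, the patterns are heavily simplified, and the token names other than tSTRING and tSYMBOL are chosen for illustration only.

```ruby
require 'strscan'

# Hypothetical mini-lexer. Each branch consumes a whole token with a single
# StringScanner#scan call and yields its result directly, with no fall-through.
def next_token(ss)
  ss.skip(/\s+/)
  return nil if ss.eos?

  if ss.scan(/'((?:[^'\\]|\\.)*)'/)
    # A non-interpolated string is lexed as one token.
    [:tSTRING, ss[1]]
  elsif ss.scan(/:([a-zA-Z_]\w*)/)
    [:tSYMBOL, ss[1]]
  elsif ss.scan(/@@?[a-zA-Z_]\w*/)
    # One pattern covers both @ivar and @@cvar.
    [ss.matched.start_with?('@@') ? :tCVAR : :tIVAR, ss.matched]
  elsif ss.scan(/\d+\.\d+/)
    [:tFLOAT, ss.matched.to_f]
  elsif ss.scan(/\d+/)
    [:tINTEGER, ss.matched.to_i]
  elsif ss.scan(/[a-zA-Z_]\w*/)
    [:tIDENTIFIER, ss.matched]
  else
    # Errors are simply raised where they are detected.
    raise "unexpected character #{ss.getch.inspect}"
  end
end

# Drive the scanner until the input is exhausted.
def lex(source)
  ss = StringScanner.new(source)
  tokens = []
  while (tok = next_token(ss))
    tokens << tok
  end
  tokens
end

p lex("@a @@b :sym 'str' 1.5 42 foo")
```

Matching a full token per `scan` call avoids the character-at-a-time getch/unread bookkeeping that the changelog describes removing.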