Re: [RFC] Method of feeding input to regexp matching

From: Eric Mahurin <eric.mahurin@...>
Date: 2005-11-04 14:22:50 UTC
List: ruby-core #6569
On 11/3/05, Nikolai Weibull <mailing-lists.ruby-core@rawuncut.elitemail.org> wrote:
>
> I would very much like to be able to provide a Regexp object input from
> some input source other than a "fixed" string. Examples of such sources
> would be a file bufferer that reads chunks from a file and can then make
> them accessible to the Regexp object as it needs them. At the moment
> there's no way to do this. The exact semantics of such a method might
> be quite complicated, as it can be hard to maintain the MatchData. How
> this is to be done is beyond this initial query, but I would like to
> know if anyone besides me sees any merit in this.
>

Since I've been doing parsers, I've thought about this a bit too. Because of
the problems you've mentioned, I've decided not to incorporate regexes in
my parser stuff. The only reason I would want to would be for performance
reasons in the lexers (my competing method is more consistent, flexible, and
more readable - for complex stuff). Regexes being confined to a String is
likely the reason I can't find any racc examples where the lexer reads in an
IO/File rather than a String.

I have put some hacky solutions in my Cursor classes (external
iterator/stream stuff), but I'm not using them in Grammar (parser/lexer
stuff). Here are the methods I came up with (after some discussion with
Caleb Clausen):

# scan for a pattern (\A anchored) with a finite length (specify max length
# of the pattern)
scan_pattern(pattern,len=1,hold=false,buffer=nil)

# scan until a finite pattern is found (specify max length of the pattern)
# - kind of works like IO#gets(aString)
scan_pattern_until(pattern,len=1,hold=false,buffer=nil,init=16)

# scan while a loop pattern matches (specify max length of an iteration -
# finite)
scan_pattern_while(pattern,len=1,hold=false,buffer=nil,init=16)

With these, you might match a multi-line comment (could be any length) like
this:

cursor.scan_pattern(/\A\/\*/,2,false,buf="") &&
  cursor.scan_pattern_until(/\*\//,2,false,buf)

buf should contain the multi-line comment if this was successful.
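
To make the idea concrete, here is a rough sketch of how two of these methods could be built on top of a plain IO. This is not the actual Cursor code: the class name IOCursor, the private helpers, and the reading of `hold` as "match without consuming" are all assumptions made for illustration, and it only handles ASCII input.

```ruby
require "stringio"

# Hypothetical stand-in for the Cursor classes discussed above (not their real code).
class IOCursor
  def initialize(io)
    @io = io
    @pending = +""  # data read from the IO but not yet consumed
  end

  # Match a pattern (the caller supplies the \A anchor) against at most
  # `len` upcoming characters. Assumption: `hold` means "do not consume".
  def scan_pattern(pattern, len = 1, hold = false, buffer = nil)
    fill(len)
    m = pattern.match(@pending[0, len])
    return nil unless m
    accept(m.end(0), hold, buffer)
  end

  # Read ahead in growing steps (starting at `init`) until the pattern is
  # found or the IO is exhausted. `len` is the maximum match length, so only
  # the last len-1 already-searched characters need to be searched again.
  def scan_pattern_until(pattern, len = 1, hold = false, buffer = nil, init = 16)
    size = init
    scanned = 0
    loop do
      fill(size)
      start = [scanned - (len - 1), 0].max
      m = pattern.match(@pending, start)
      return accept(m.end(0), hold, buffer) if m
      return nil if @pending.length < size  # end of input, no match
      scanned = @pending.length
      size *= 2
    end
  end

  private

  # Ensure at least n characters are pending, unless the IO runs out first.
  def fill(n)
    while @pending.length < n
      chunk = @io.read(n - @pending.length)
      break if chunk.nil? || chunk.empty?
      @pending << chunk
    end
  end

  # Copy the first `count` pending characters to the buffer and consume them.
  def accept(count, hold, buffer)
    text = @pending[0, count]
    buffer << text if buffer
    @pending.slice!(0, count) unless hold
    text
  end
end

cursor = IOCursor.new(StringIO.new("/* a\n multi-line\n comment */ x = 1"))
buf = +""
cursor.scan_pattern(/\A\/\*/, 2, false, buf) &&
  cursor.scan_pattern_until(/\*\//, 2, false, buf)
```

The point of the `len` argument in this sketch is that the regexp never has to see the whole input: a bounded match length tells the cursor how much lookahead is enough, and how far back a match could straddle two reads.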


I see several solutions to the problems being discussed:

1. live with it - do something like the above

2. you should be able to specify a get-more-data method when a regexp hits
the end of a string

3. you should have access to whether the regexp hit the end of string (pass
or fail)

4. Regexp should be duck-typed (with a String optimization and possibly
String buffering for other types) such that it can operate on anything that
responds to a subset of String methods (minimal: #[]). It wouldn't be too
difficult to make an IO (responding to #pos) handle a string-like #[]
method.
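
As a rough illustration of the last part of option 4, a seekable IO can be given a string-like #[] with a small adapter. The class name IOStringView is made up for this sketch, it only covers the `[start, length]` form with non-negative byte offsets, and today's Regexp would still not accept such an object as input.

```ruby
require "stringio"

# Hypothetical adapter (not part of Ruby): gives a seekable IO a String-like #[].
class IOStringView
  def initialize(io)
    @io = io
  end

  # Roughly like String#[](start, length) for non-negative byte offsets.
  # Returns nil when start is at or past the end of the data.
  # The IO's own position is restored afterwards.
  def [](start, length = 1)
    saved = @io.pos
    @io.pos = start
    @io.read(length)
  ensure
    @io.pos = saved if saved
  end
end

view = IOStringView.new(StringIO.new("hello world"))
view[6, 5]  # reads 5 bytes starting at offset 6
```

A duck-typed Regexp engine would call something like this whenever it needed characters it had not seen yet, ideally with some buffering in between so that each lookup does not turn into a seek and a read.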
