Re: [RFC] Method of feeding input to regexp matching

From: Nikolai Weibull <mailing-lists.ruby-core@...>
Date: 2005-11-04 11:46:18 UTC
List: ruby-core #6567

David A. Black wrote:

> On Fri, 4 Nov 2005, Nikolai Weibull wrote:

> > David A. Black wrote:

> > > I'm thinking of cases like this:
> > >
> > >   re = /abc.*def/
> > >
> > > The first chunk out of the file might match this -- but then you'd
> > > have to keep going, really until EOF, to get the greedy match if
> > > it's there.  Then you'd have to go back.

> > Well, think of it like this instead.  The Regexp simply reads from
> > the input source when it needs more data.  The Regexp will
> > concatenate the new data with the old and continue on its matching
> > routine.  We build the input as we go along, i.e., we’re in a sense
> > dealing with implementing lazy Strings.  This won’t cause any issues
> > with backtracking, as the data will still be there.

> In the /abc.*def/ case, though, you'd always have to take all the
> input (at least up to the third-to-last character in the file), even
> if you had an intermediate match.  So "needs more data" would not be
> something the regex could tell you.  It would say, "Yes, there's a
> match", but you would have to know that the "yes" didn't mean you
> could stop.

.* needs more data until there is no more data (#read returns nil), then
it fails as it hasn’t been able to match 'def' and backtracks until that
part of the regex does.  Then it has a match.  (This ignores newline
conventions, but let’s ignore them for now.)  You have the same problem
when doing this on a regular string.
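
A rough sketch of the difference, assuming a hypothetical helper called
match_incrementally with a made-up need_all flag (neither exists in Ruby).
It only approximates a Regexp that pulls input on demand: it re-runs the
whole match on a growing buffer instead of resuming inside the engine, so
the caller has to say up front whether the pattern can stop early.

```ruby
require 'stringio'

# Hypothetical helper, not part of Ruby: reads `io` in chunks and re-runs
# `re` on everything read so far. Returns [MatchData or nil, bytes read].
# need_all: true  -> only try the match once the source is exhausted
#                    (what a greedy .* effectively forces)
# need_all: false -> try after every chunk and stop at the first match
#                    (safe for a simple lazy pattern like /abc.*?def/)
def match_incrementally(re, io, chunk_size: 4096, need_all: false)
  buffer = String.new
  loop do
    chunk = io.read(chunk_size)   # IO#read(n) returns nil at end of input
    at_eof = chunk.nil?
    buffer << chunk unless at_eof
    next if need_all && !at_eof   # greedy case: keep reading first
    m = re.match(buffer)
    return [m, buffer.size] if m || at_eof
  end
end

data = "xxabc12def34def56"

# Lazy: stops after the third 4-byte chunk, leaving the rest unread.
lazy_match, lazy_read =
  match_incrementally(/abc.*?def/m, StringIO.new(data), chunk_size: 4)

# Greedy: has to drain the source, then backtracks to the last "def".
greedy_match, greedy_read =
  match_incrementally(/abc.*def/m, StringIO.new(data), chunk_size: 4,
                      need_all: true)
```

The /m flag is only there to set newline handling aside, as above.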

> But if the regex were /abc.*?def/, then as soon as there was a "yes",
> you could stop.

> There's also a question of: if the first 4096 bytes started with "abc"
> and ended with "de", then you'd add the next 4096 -- but you'd have to
> perform the match again.  Or else you'd have to know to rewind by
> exactly two characters.  But if you're changing where you start the
> match, that could affect how anchors worked.

Why?  If the first character in the next 4096 bytes is an "f" we’d have a
match.  If we’re using .*? we’re done.  If we’re using .* we
wouldn’t have begun matching the "de" against the /de/.  What would be
an issue would be how to treat MatchData#post_match.  It’d have to be
the remaining data that wasn’t matched at the time of a match, not all
the possible data that may come from the source.
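
To make the chunk-boundary and post_match points concrete, here is a small
sketch using StringIO as a stand-in source. The input string and chunk
sizes are made up for illustration, and the match is simply re-run on the
concatenated buffer, which is an assumption of this sketch rather than how
an in-engine implementation would resume.

```ruby
require 'stringio'

# Made-up input: the first 7 bytes end in "de", the next chunk starts with "f".
source = StringIO.new("abc--de" + "f tail")
re = /\Aabc.*?def/            # \A still means "start of the buffer"

buffer = String.new
buffer << source.read(7)      # buffer is now "abc--de"
first_try = re.match(buffer)  # no match yet: the "f" has not been read

buffer << source.read(3)      # old data is kept, new data is appended
m = re.match(buffer)          # matches without any manual rewinding

matched = m[0]                # text matched across the chunk boundary
post = m.post_match           # only what had been read when we matched
unread = source.read          # the rest is still sitting in the source
```

Because the old data stays in the buffer, the anchor is unaffected by where
the chunk boundary fell, and post_match covers only the data read so far.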

        nikolai

-- 
Nikolai Weibull: now available free of charge at http://bitwi.se/!
Born in Chicago, IL USA; currently residing in Gothenburg, Sweden.
main(){printf(&linux["\021%six\012\0"],(linux)["have"]+"fun"-97);}
