[ruby-talk:02277] Re: Multiple assignment of pattern match results.

From: mrilu <mrilu@...>
Date: 2000-03-30 21:05:42 UTC
List: ruby-talk #2277
> In message "[ruby-talk:02256] Multiple assignment of pattern match results."
>     on 00/03/30, schneik@us.ibm.com <schneik@us.ibm.com> writes:
> |The only problem I have with this example is that scan is going through the
> |work of producing an array of arrays, even though I only want the first
> |match:

On Thu, 30 Mar 2000, Yukihiro Matsumoto wrote:
> a)
>   x="aabbbccccdddeeeefffabcdeabcdeabcde"
>   (dummy, t, u, v) = /(a+)[^ace]*(c+)[^e]*(e+)/.match(x).to_a
> b)
>   t = u = v = nil
>   x.scan(/(a+)[^ace]*(c+)[^e]*(e+)/) do |t,u,v|
>     break
>   end

On a) I agree. It's nice, even though x.match(re) might look better to
some eyes.
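
For comparison, the two spellings can be put side by side. This is only a
sketch: it assumes a Ruby version that provides String#match and
MatchData#captures, which is not claimed here for the Ruby discussed in this
thread. The names t2, u2 and v2 are made up for the illustration.

```ruby
x  = "aabbbccccdddeeeefffabcdeabcdeabcde"
re = /(a+)[^ace]*(c+)[^e]*(e+)/

# Form a): Regexp#match, with a throwaway slot for the whole match.
# If nothing matches, match returns nil and nil.to_a is [],
# so t, u and v simply end up nil.
_dummy, t, u, v = re.match(x).to_a

# String#match form (assumed to be available); captures leaves out
# the whole match, so no dummy variable is needed.
m = x.match(re)
t2, u2, v2 = m ? m.captures : []
```

Both forms stop at the first match, so no array of arrays is built.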

On b) I have to think and learn a little.

mrilu> |I think it's time for String.scan(pattern[, limit]), like split.
matz> Interesting.

Is it interesting that scan(pattern, limit) might be useful, or that I, poor
stupid loser, tend to think erratically yet consistently :) ?

Well, when I proposed this in the morning I was thinking of making a patch
at once. But after half an hour of reading I decided to leave it up to you.

The reason is this code in the scan_once routine of string.c:

    if (rb_reg_search(pat, str, *start, 0) >= 0) {
        match = rb_backref_get();
        regs = RMATCH(match)->regs;
        if (BEG(0) == END(0)) {
            /* Always consume at least one character of the input string*/
            *start = END(0)+mbclen2(RSTRING(str)->ptr[END(0)],pat);
        }
        else {
            *start = END(0);
        }
        if (regs->num_regs == 1) {
            return rb_reg_nth_match(0, match);
        }
!       result = rb_ary_new2(regs->num_regs);
!       for (i=1; i < regs->num_regs; i++) {
!           rb_ary_push(result, rb_reg_nth_match(i, match));
!       }
!       return result;

So I thought scan_once could return an array to rb_str_scan, and there
the iterator, if one is given, would be called for every match.
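
Seen from the Ruby side, the branches in that C excerpt correspond to
behavior one can observe directly. The following is a small sketch with
made-up input strings; it shows what String#scan returns, not how string.c
is organized internally.

```ruby
# No groups in the pattern: each element is the matched string itself.
plain = "a1b22".scan(/\d+/)

# Groups in the pattern: each element is an array holding only the groups
# (the whole match, group 0, is left out).
grouped = "a1b22".scan(/([a-z])(\d+)/)

# An empty match still moves forward by one character, so this terminates.
empties = "abc".scan(//)

# With a block, the same values are handed over one match at a time.
seen = []
"a1b22".scan(/([a-z])(\d+)/) { |letter, digits| seen << [letter, digits] }
```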

Now, back to case b). I agree that the functionality will be the same, but
I have some doubts about performance. rb_reg_search will be called, and it
couldn't return just a limited number of matches unless it's recoded, could
it? So while

>   t = u = v = nil
>   x.scan(/(a+)[^ace]*(c+)[^e]*(e+)/) do |t,u,v|
      # do something with t,u and v
>     break
>   end

will use only the first matched data, there might be more than one
group of matched data. And I think in the original mail Conrad
expressed his wish for performance:

> The only problem I have with this example is that scan is going through
> the work of producing an array of arrays, even though I only want the
> first match:
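
One way to look at the concern is to count how often the block actually
runs. This is a sketch under the assumption of a current Ruby, where block
parameters are local to the block, so the first match is stored in an
outer variable (first, a made-up name) instead of relying on |t,u,v|
leaking out as in the quoted code.

```ruby
x  = "aabbbccccdddeeeefffabcdeabcdeabcde"
re = /(a+)[^ace]*(c+)[^e]*(e+)/

# Without a block, scan collects every match into an array of arrays.
all = x.scan(re)

# With a block, count how often the block runs before break leaves scan.
calls = 0
first = nil
x.scan(re) do |groups|
  calls += 1
  first = groups   # array of the three groups for this match
  break
end
```

Here the block runs once, while the blockless call collects all four
matches in the string.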

I don't know anything about the RE library's internals and I don't want to
mess it up (especially if rb_reg_search with a limit is not needed). So I
recoded neither rb_str_scan nor scan_once with a limit.
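
Without touching the C code, the proposed limit can be approximated in Ruby
itself. The helper below is hypothetical: scan_with_limit is a made-up name,
and String#scan does not take a limit argument. It is a minimal sketch that
relies on the block form of scan plus break.

```ruby
# Hypothetical helper, not part of Ruby: collect at most `limit` matches.
def scan_with_limit(str, pattern, limit)
  results = []
  return results if limit <= 0
  str.scan(pattern) do |match|
    results << match                   # a string, or an array of groups
    break if results.size >= limit     # stop scanning once enough are found
  end
  results
end

x  = "aabbbccccdddeeeefffabcdeabcdeabcde"
re = /(a+)[^ace]*(c+)[^e]*(e+)/
first_two = scan_with_limit(x, re, 2)
```

Unlike the limit of split, which leaves the unsplit rest in the last
element, this sketch just drops whatever follows the last collected match.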

Or, as I now noted, rb_reg_nth_match will give only the nth data stored
when matched with parentheses like /a(.)b(d*)/. Is that so? If so, I think
you can all forget my mail. :)
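
At the Ruby level the same idea shows up as indexing into the match object.
A small sketch follows; that MatchData#[] is backed by rb_reg_nth_match is
my assumption and not something stated in this thread, and the input string
is made up.

```ruby
m = /a(.)b(d*)/.match("xaYbddz")

whole  = m[0]        # the entire matched text
first  = m[1]        # what the first pair of parentheses matched
second = m[2]        # what the second pair of parentheses matched
groups = m.captures  # only the parenthesized groups, without m[0]
```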

But I'm really interested to learn and hear how these things work!

One more thing. I don't know how things are done in other languages, but
to my eyes the C files that I got with Ruby seem to be crystal clear and
simple (umm, maybe I haven't looked hard enough, and don't look at re.c :).
I don't know if this is a special feature of Ruby? Or are other scripting
languages as crisp?
