[#113756] [Ruby master Bug#19711] NoMethodError "private method `new' called for class" since bebd05fb51ea65bc57344b67100748200f8311eb — "yahonda (Yasuo Honda) via ruby-core" <ruby-core@...>

Issue #19711 has been reported by yahonda (Yasuo Honda).

7 messages 2023/06/05

[#113771] [Ruby master Feature#19712] IO#reopen removes singleton class — "itarato (Peter Arato) via ruby-core" <ruby-core@...>

Issue #19712 has been reported by itarato (Peter Arato).

11 messages 2023/06/05

[#113782] [Ruby master Bug#19716] SystemStackError occurs too easily on Alpine Linux (due to small stack size reported by pthread_attr_getstacksize on musl libc) — "alexdowad (Alex Dowad) via ruby-core" <ruby-core@...>

Issue #19716 has been reported by alexdowad (Alex Dowad).

6 messages 2023/06/07

[#113788] [Ruby master Bug#19717] `ConditionVariable#signal` is not fair when the wakeup is consistently spurious. — "ioquatix (Samuel Williams) via ruby-core" <ruby-core@...>

Issue #19717 has been reported by ioquatix (Samuel Williams).

13 messages 2023/06/07

[#113819] [Ruby master Feature#19720] Warning for non-linear Regexps — "Eregon (Benoit Daloze) via ruby-core" <ruby-core@...>

Issue #19720 has been reported by Eregon (Benoit Daloze).

11 messages 2023/06/08

[#113835] [Ruby master Misc#19722] DevMeeting-2023-07-13 — "mame (Yusuke Endoh) via ruby-core" <ruby-core@...>

Issue #19722 has been reported by mame (Yusuke Endoh).

9 messages 2023/06/09

[#113944] [Ruby master Feature#19737] Add `IO::Buffer#cat` for concat `IO::Buffer` instances — "unasuke (Yusuke Nakamura) via ruby-core" <ruby-core@...>

Issue #19737 has been reported by unasuke (Yusuke Nakamura).

7 messages 2023/06/19

[#113953] [Ruby master Bug#19739] Key cannot be found in a Hash when slice! method is applied to the key — "ilya.andreyuk (Ilya Andreyuk) via ruby-core" <ruby-core@...>

Issue #19739 has been reported by ilya.andreyuk (Ilya Andreyuk).

9 messages 2023/06/20

[#113966] [Ruby master Bug#19742] Introduce `Module#anonymous?` — "ioquatix (Samuel Williams) via ruby-core" <ruby-core@...>

Issue #19742 has been reported by ioquatix (Samuel Williams).

47 messages 2023/06/21

[#114025] [Ruby master Feature#19744] Namespace on read — "tagomoris (Satoshi TAGOMORI) via ruby-core" <ruby-core@...>

Issue #19744 has been reported by tagomoris (Satoshi TAGOMORI).

71 messages 2023/06/27

[#114032] [Ruby master Misc#19747] Propose Kevin Newton and Jemma Issroff as core committers — "k0kubun (Takashi Kokubun) via ruby-core" <ruby-core@...>

Issue #19747 has been reported by k0kubun (Takashi Kokubun).

8 messages 2023/06/28

[#114038] [Ruby master Bug#19749] Confirm correct behaviour when attaching private method with `#define_method` — "itarato (Peter Arato) via ruby-core" <ruby-core@...>

Issue #19749 has been reported by itarato (Peter Arato).

15 messages 2023/06/28

[ruby-core:113882] [Ruby master Feature#19725] Improve the match cache optimization to support look-ahead and atomic groups

From: "make_now_just (Hiroya Fujinami) via ruby-core" <ruby-core@...>
Date: 2023-06-12 06:07:11 UTC
List: ruby-core #113882
Issue #19725 has been reported by make_now_just (Hiroya Fujinami).

----------------------------------------
Feature #19725: Improve the match cache optimization to support look-ahead and atomic groups
https://bugs.ruby-lang.org/issues/19725

* Author: make_now_just (Hiroya Fujinami)
* Status: Open
* Priority: Normal
----------------------------------------
Currently, the Regexp match cache optimization [Feature #19104] memoizes match cache points that once arrived; that is, it memoizes them **before backtracking**. This kind of implementation is simple and works fine in most cases. However, in some cases, it does not work well. For example,

1. it is hard to preserve the strange null-loop behavior, and
2. it is hard to support look-around operators and atomic groups correctly.

The first problem is simple. Memoization may change the matching behavior if a cache point in a null loop is memoized because Onigmo allows a null loop (a loop without consuming a character) at the first time to keep captures. We avoid this problem by carefully resetting the memoization table on null loops currently. But, nested null loops are not supported.

The second problem is a bit complex. For example, `/\Aa*(?=a*)a\z/` does not match against `"a"` on the memozation before backtracking. The reason is that when the first matching of the look-ahead is succeeded, the loop branch in the look-ahead is memoized, and it is impossible to succeed the second matching of the look-ahead. Furthermore, how do we prevent increasing the matching time of `/\Aa*?(?=a*)\z/` against `"a" * N + "z"` quadratically?

I suggest two improvements to resolve these issues:

1. Memoizing **after backtracking**, and
2. Memoizing **partial match results** for match cache points in look-ahead and atomic groups.

The first improvement is, I think, the canonical way to implement the match cache optimization. This avoids the first "null loop" problem entirely. This also enables to support look-ahead and atomic groups easily. This can be implemented by introducing an extra item to the stack and doing memoization on popping it.

The second improvement is to simulate on the memoization the atomic behavior of look-ahead and atomic (Of course!) groups on the memoization. The current memoization table only records the arrived points; that is, it does not record the result of matching and only records failures of matching. Actually, it is not problematic in the current implementation because "matching is succeeded" means the matching is done. However, when a part of look-around and atomic groups is matched, the stack is hoisted to behave atomically. To simulate this behavior, successes in matching these parts are also memoized. This can be implemented by introducing an extra bit for the result of matching to the memoization table.

These improvements will enable

- to avoid the nested null loop restriction, and
- to support positive/negative look-ahead and atomic groups correctly.

However, look-ahead and atomic groups that include captures cannot be supported. This is because captures may be changed on successful matches.



-- 
https://bugs.ruby-lang.org/
 ______________________________________________
 ruby-core mailing list -- ruby-core@ml.ruby-lang.org
 To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
 ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/

In This Thread

Prev Next