[ruby-core:88133] [Ruby trunk Bug#14934] Unicode: Hangul normalize bug

From: duerst@...
Date: 2018-07-27 05:10:26 UTC
List: ruby-core #88133
Issue #14934 has been updated by duerst (Martin Dürst).


I think I have figured things out:

The patch is technically correct. While LBASE and VBASE are the values of the first actual leading and vowel jamos, the value of TBASE is one smaller than the first actual trailing jamo at 0x11A8. This is to account for the fact that the lowest value of the "trailing digit" of the Hangul syllable representation indicates the absence of a trailing jamo. So in contrast to the <= tests related to LBASE and VBASE, it is indeed technically correct to have a < comparison operator in the comparison related to TBASE.
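To make the role of TBASE concrete, here is a minimal sketch of Hangul syllable composition using the constant values from the Unicode Standard. The helper name compose_hangul is hypothetical and is not the code in lib/unicode_normalize/normalize.rb; it only illustrates why the lower bound on the trailing index is strict while the bounds on the leading and vowel indices are not.

```ruby
# Constants from the Unicode Standard's Hangul composition algorithm.
SBASE  = 0xAC00 # first precomposed Hangul syllable
LBASE  = 0x1100 # first leading jamo
VBASE  = 0x1161 # first vowel jamo
TBASE  = 0x11A7 # one below the first trailing jamo (0x11A8)
LCOUNT = 19
VCOUNT = 21
TCOUNT = 28     # 27 trailing jamos plus the "no trailing jamo" slot

# Hypothetical helper (illustration only): compose jamo code points into
# a syllable code point, or return nil if they do not form a syllable.
def compose_hangul(lead, vowel, trail = nil)
  l = lead - LBASE
  v = vowel - VBASE
  # Leading and vowel indices start at 0, so the lower bound is inclusive.
  return nil unless (0...LCOUNT).cover?(l) && (0...VCOUNT).cover?(v)

  syllable = SBASE + (l * VCOUNT + v) * TCOUNT
  return syllable if trail.nil?

  t = trail - TBASE
  # Index 0 means "no trailing jamo", so U+11A7 itself must be rejected:
  # the lower bound is strict here.
  return nil unless 0 < t && t < TCOUNT

  syllable + t
end
```

With these definitions, compose_hangul(0x1100, 0x1161, 0x11A8) gives 0xAC01, while passing 0x11A7 as the third argument gives nil.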

However, I have also figured out why this apparent bug doesn't actually affect Ruby. The reason is that we use regular expressions to extract "normalization runs" from the string to be normalized. We know that a U+11A7 character can never participate in a normalization operation because it is a classical Hangul Jamo not used in modern Hangul. So U+11A7 never appears in a normalization run, and there's thus no error.
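The observable behavior can be checked with String#unicode_normalize. The snippet below is only a sanity check under the assumption that the installed Ruby behaves as described above; it does not reproduce the regular expressions used internally.

```ruby
# U+AC00 is a precomposed syllable without a trailing jamo.
base = "\uAC00"

# U+11A7 is not a modern trailing jamo, so it should be left alone.
with_11a7 = base + "\u11A7"
# U+11A8 is the first real trailing jamo, so it should be composed.
with_11a8 = base + "\u11A8"

unchanged = with_11a7.unicode_normalize(:nfc)
composed  = with_11a8.unicode_normalize(:nfc)

puts unchanged == with_11a7 # the U+11A7 sequence is not composed
puts composed == "\uAC01"   # the U+11A8 sequence becomes one syllable
```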

----------------------------------------
Bug #14934: Unicode: Hangul normalize bug
https://bugs.ruby-lang.org/issues/14934#change-73156

* Author: MaLin (Lin Ma)
* Status: Open
* Priority: Normal
* Assignee: duerst (Martin Dürst)
* Target version: 
* ruby -v: 
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
I was involved in fixing a similar bug in Python, and I found that Ruby has the same buggy code.

We should fix this line[1] like this:
[1] https://github.com/ruby/ruby/blob/96db72ce38b27799dd8e80ca00696e41234db6ba/lib/unicode_normalize/normalize.rb#L73

-if length>2 and 0 <= (trail=string[2].ord-TBASE) and trail < TCOUNT
+if length>2 and 0 < (trail=string[2].ord-TBASE) and trail < TCOUNT

-------
The Unicode Standard's demonstration code was changed.

Before Unicode 4.1.0 (draft), the test was: TBase <= code <= TBase+TCount
see: http://www.unicode.org/reports/tr15/tr15-24.html#hangul_composition

Since Unicode 4.1.0, the test is: TBase < code < TBase+TCount, which is in line with Unicode 10.0
see: http://www.unicode.org/reports/tr15/tr15-25.html#hangul_composition

This change happened in 2005.

Please note: the normalization algorithm did not change; only the demonstration code changed. See this discussion[2] on that point.
[2] https://bugs.python.org/issue29456

-------
Here is some test code[3] for Python, which may be useful for this fix.
[3] https://github.com/python/cpython/commit/d134809cd3764c6a634eab7bb8995e3e2eff14d5



-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>