From: duerst@...
Date: 2018-07-24T05:58:24+00:00
Subject: [ruby-core:88074] [Ruby trunk Bug#14934] Unicode: Hangul normalize bug

Issue #14934 has been updated by duerst (Martin Dürst).

Assignee set to duerst (Martin Dürst)

----------------------------------------
Bug #14934: Unicode: Hangul normalize bug
https://bugs.ruby-lang.org/issues/14934#change-73094

* Author: MaLin (Ma Lin Ma)
* Status: Open
* Priority: Normal
* Assignee: duerst (Martin Dürst)
* Target version:
* ruby -v:
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
I was involved in fixing a similar bug in Python, and I found that Ruby has the same bug.

We should fix this line[1] like this:

[1] https://github.com/ruby/ruby/blob/96db72ce38b27799dd8e80ca00696e41234db6ba/lib/unicode_normalize/normalize.rb#L73

-if length>2 and 0 <= (trail=string[2].ord-TBASE) and trail < TCOUNT
+if length>2 and 0 < (trail=string[2].ord-TBASE) and trail < TCOUNT

-------
The demonstration code in the Unicode Standard was changed.

Before Unicode 4.1.0 (draft), the check was:
TBase <= code <= TBase+TCount
see: http://www.unicode.org/reports/tr15/tr15-24.html#hangul_composition

Since Unicode 4.1.0, the check is:
TBase < code < TBase+TCount
which is in line with Unicode 10.0.
see: http://www.unicode.org/reports/tr15/tr15-25.html#hangul_composition

This change happened in 2005.

Please note: the normalization algorithm did not change, only the demonstration code changed; see this discussion[2] on that point.

[2] https://bugs.python.org/issue29456

-------
Here is some test code[3] for Python, which may be useful for this fix.

[3] https://github.com/python/cpython/commit/d134809cd3764c6a634eab7bb8995e3e2eff14d5

-- 
https://bugs.ruby-lang.org/
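
To make the effect of the one-character change concrete, here is a minimal sketch of Hangul L+V(+T) composition. It is not the code in lib/unicode_normalize/normalize.rb. The helper name compose_hangul and its strict: keyword are hypothetical, introduced only for illustration. The constants are the usual Hangul composition values from the Unicode Standard. The point it tries to show is that U+11A7 equals TBASE and is not itself a trailing consonant, so a `0 <=` check would absorb it into the syllable.

```ruby
# Hangul composition constants as given in the Unicode Standard.
SBASE  = 0xAC00  # first precomposed syllable
LBASE  = 0x1100  # first leading consonant
VBASE  = 0x1161  # first vowel
TBASE  = 0x11A7  # one BEFORE the first trailing consonant (U+11A8)
LCOUNT = 19
VCOUNT = 21
TCOUNT = 28      # 27 trailing consonants plus "no trailing consonant"

# Hypothetical helper (not part of Ruby): composes a leading L + V (+ T)
# jamo sequence at the start of `string`.
# strict: true  uses the proposed check  0 <  trail < TCOUNT
# strict: false uses the old check       0 <= trail < TCOUNT
def compose_hangul(string, strict: true)
  return string if string.length < 2

  lead  = string[0].ord - LBASE
  vowel = string[1].ord - VBASE
  return string unless (0...LCOUNT).cover?(lead) && (0...VCOUNT).cover?(vowel)

  syllable = SBASE + (lead * VCOUNT + vowel) * TCOUNT
  consumed = 2

  if string.length > 2
    trail = string[2].ord - TBASE
    lower_ok = strict ? trail > 0 : trail >= 0
    if lower_ok && trail < TCOUNT
      syllable += trail
      consumed = 3
    end
  end

  [syllable].pack('U') + string[consumed..-1]
end

show = ->(s) { s.codepoints.map { |c| format('U+%04X', c) } }

input = "\u1100\u1161\u11A7"
p show.call(compose_hangul(input))                 # ["U+AC00", "U+11A7"]
p show.call(compose_hangul(input, strict: false))  # ["U+AC00"]
```

With the old check, the sketch silently drops U+11A7 from the output, because trail is 0 and adding it leaves the syllable unchanged. With the proposed check, U+11A7 is left in place after the composed syllable. A real trailing consonant such as U+11A8 composes the same way under both checks.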