From: malincns@163.com
Date: 2018-07-24T12:15:53+00:00
Subject: [ruby-core:88078] [Ruby trunk Bug#14934] Unicode: Hangul normalize bug

Issue #14934 has been updated by MaLin (Ma Lin Ma).

> Can you provide some test case(s)?

That is what frustrated me. I simply translated Python's test cases for this issue[1] to Ruby.
[1] https://github.com/python/cpython/commit/d134809cd3764c6a634eab7bb8995e3e2eff14d5

But they passed without raising an exception. The affected Ruby code seems to be related to the `\u11a7` character.

> I won't have much time to look at this issue this week. I'll get around to it next week (maybe even this Friday).

No need to hurry. It's a very old bug, and the test cases mysteriously pass.

----------------------------------------
Bug #14934: Unicode: Hangul normalize bug
https://bugs.ruby-lang.org/issues/14934#change-73102

* Author: MaLin (Ma Lin Ma)
* Status: Open
* Priority: Normal
* Assignee: duerst (Martin Dürst)
* Target version:
* ruby -v:
* Backport: 2.3: UNKNOWN, 2.4: UNKNOWN, 2.5: UNKNOWN
----------------------------------------
I was involved in fixing a similar bug in Python, and I found that Ruby has the same buggy code.

We should fix this line[1] like this:
[1] https://github.com/ruby/ruby/blob/96db72ce38b27799dd8e80ca00696e41234db6ba/lib/unicode_normalize/normalize.rb#L73

-if length>2 and 0 <= (trail=string[2].ord-TBASE) and trail < TCOUNT
+if length>2 and 0 < (trail=string[2].ord-TBASE) and trail < TCOUNT

-------

The demonstration code in the Unicode Standard changed.

Before Unicode 4.1.0 (draft), it was: TBase <= code <= TBase+TCount
see: http://www.unicode.org/reports/tr15/tr15-24.html#hangul_composition

Since Unicode 4.1.0, it is: TBase < code < TBase+TCount, which is in line with Unicode 10.0
see: http://www.unicode.org/reports/tr15/tr15-25.html#hangul_composition

This change happened in 2005.

Please note: the normalization algorithm itself didn't change; only the demonstration code changed. See this discussion[2] about this point.
[2] https://bugs.python.org/issue29456

-------

Here is some test code[3] for Python, which may be useful for this fix.
[3] https://github.com/python/cpython/commit/d134809cd3764c6a634eab7bb8995e3e2eff14d5
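
To make the boundary condition concrete, here is a minimal standalone sketch of the trailing-consonant check. It is not the code in lib/unicode_normalize/normalize.rb: the module name `HangulSketch`, the method `compose_prefix`, and the `inclusive:` flag are made up for illustration, and the method assumes its input starts with a valid leading consonant and vowel. The constants are the ones from the Hangul composition section of UAX #15.

```ruby
# Illustrative only: HangulSketch and compose_prefix are hypothetical names,
# not part of Ruby's unicode_normalize library.
module HangulSketch
  # Constants from the Hangul composition section of UAX #15.
  SBASE  = 0xAC00
  LBASE  = 0x1100
  VBASE  = 0x1161
  TBASE  = 0x11A7 # one below the first trailing consonant, U+11A8
  VCOUNT = 21
  TCOUNT = 28

  # Composes a leading L + V (+ optional T) at the start of `string`.
  # Assumes string[0] is a leading consonant and string[1] is a vowel.
  # inclusive: true uses the old `0 <= trail` test, false uses `0 < trail`.
  def self.compose_prefix(string, inclusive: false)
    l = string[0].ord - LBASE
    v = string[1].ord - VBASE
    lv = SBASE + (l * VCOUNT + v) * TCOUNT

    if string.length > 2
      trail = string[2].ord - TBASE
      lower_ok = inclusive ? trail >= 0 : trail > 0
      if lower_ok && trail < TCOUNT
        # Third character is accepted as a trailing consonant and consumed.
        return [lv + trail].pack('U') + string[3..-1]
      end
    end

    # No trailing consonant: emit the LV syllable and keep the rest as is.
    [lv].pack('U') + string[2..-1]
  end
end

input = "\u1100\u1161\u11A7"
strict = HangulSketch.compose_prefix(input)
loose  = HangulSketch.compose_prefix(input, inclusive: true)
puts strict.codepoints.map { |c| format('U+%04X', c) }.join(' ')
puts loose.codepoints.map { |c| format('U+%04X', c) }.join(' ')
```

With the strict test, `\u11a7` is left in place after the composed syllable. With the inclusive test, `trail` is 0, so `\u11a7` is treated as a trailing consonant and silently dropped from the output. Whether Ruby's real code path can ever reach this check with `\u11a7` at that position is a separate question, which may be why the translated test cases pass.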