[ruby-core:78412] [Ruby trunk Bug#12990][Open] unicode_case_mapping_tests

From: duerst@...
Date: 2016-11-29 08:57:48 UTC
List: ruby-core #78412
Issue #12990 has been updated by Martin Dürst.

Status changed from Closed to Open

Nobuyoshi Nakada wrote:
> I found that the tests generated by `TestComprehensiveCaseFold.unicode_case_mapping_tests` compare `target` with the same `target`, which will always be true.

Nice catch, thanks! This is indeed a serious error. I fixed it.
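
To illustrate why the original assertion could never fail, here is a minimal sketch. The helper names `buggy_check` and `fixed_check` are hypothetical and are not part of the actual test file; they only mirror the two forms of the comparison.

```ruby
# Hypothetical helpers for illustration; not part of the actual test suite.

# Mirrors the buggy assertion: the computed result is never consulted.
def buggy_check(target, _result)
  target == target
end

# Mirrors the corrected assertion: expected value versus computed value.
def fixed_check(target, result)
  target == result
end

expected     = "ABC"
wrong_result = "abc" # stands in for an incorrect case mapping

puts buggy_check(expected, wrong_result) # the wrong result goes unnoticed
puts fixed_check(expected, wrong_result) # the wrong result is detected
```

Because the first form ignores `result` entirely, any defect in the case mapping data stays hidden until the comparison is corrected.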

> I suspect this is unintentional, so I tried comparing `target` with `result`,
> and got 30 failures.

I confirmed this. The failures are all related to one very rare character, U+A64B CYRILLIC SMALL LETTER MONOGRAPH UK. I have temporarily excluded this character from the tests. All tests pass for all other characters.

> Is this intentional?

No. The upper-case equivalent of U+A64B should clearly be U+A64A CYRILLIC CAPITAL LETTER MONOGRAPH UK, but somehow the data doesn't reflect this and maps U+A64B to itself. This produces 30 failures because of all the combinations of encodings, methods, and options.

I have a hunch about the reason for the error. It may be related to the fact that there is a third character, U+1C88 CYRILLIC SMALL LETTER UNBLENDED UK, which also has U+A64A as its upper-case equivalent. U+1C88 was newly added in Unicode 9.0, see http://www.unicode.org/versions/Unicode9.0.0/ (search for "Casing-related Issues"). There are several similar cases, but there may be an ordering issue here (the 'special' U+1C88 comes before the 'regular' U+A64B, whereas it's the other way round for all the other, similar cases). I suspected there might be an ordering issue, but was assuming that the tests had me covered :-(. I'll investigate further to check whether my hunch is right or not.
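
The mappings discussed above can be inspected with a short script. This is only a sketch: the constant names and the `code_points` helper are made up for illustration, and it assumes a Ruby whose case mapping data is Unicode 9.0 or later and no longer has the defect described here.

```ruby
# Code points named in the message above.
CAPITAL_MONOGRAPH_UK = "\u{A64A}"
SMALL_MONOGRAPH_UK   = "\u{A64B}"
SMALL_UNBLENDED_UK   = "\u{1C88}" # added in Unicode 9.0

# Hypothetical helper: render a string as U+XXXX code points.
def code_points(str)
  str.codepoints.map { |cp| format("U+%04X", cp) }.join(" ")
end

# Both small letters are expected to share the same upper-case equivalent.
[SMALL_MONOGRAPH_UK, SMALL_UNBLENDED_UK].each do |ch|
  puts "#{code_points(ch)}.upcase => #{code_points(ch.upcase)}"
end

# The capital letter has only one lower-case equivalent.
capital = CAPITAL_MONOGRAPH_UK
puts "#{code_points(capital)}.downcase => #{code_points(capital.downcase)}"
```

With correct data, both small letters should report U+A64A as their upper-case form, while U+A64A should report U+A64B as its lower-case form. A line showing U+A64B mapping to itself would correspond to the failure described above.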

----------------------------------------
Bug #12990: unicode_case_mapping_tests
https://bugs.ruby-lang.org/issues/12990#change-61785

* Author: Nobuyoshi Nakada
* Status: Open
* Priority: Normal
* Assignee: Martin Dürst
* ruby -v: 56907
* Backport: 2.1: DONTNEED, 2.2: DONTNEED, 2.3: DONTNEED
----------------------------------------
I found that the tests generated by `TestComprehensiveCaseFold.unicode_case_mapping_tests` compare `target` with the same `target`, which will always be true.
I suspect this is unintentional, so I tried comparing `target` with `result`,

```diff
diff --git a/test/ruby/enc/test_case_comprehensive.rb b/test/ruby/enc/test_case_comprehensive.rb
index 13639f3..cfff9b8 100644
--- a/test/ruby/enc/test_case_comprehensive.rb
+++ b/test/ruby/enc/test_case_comprehensive.rb
@@ -149,7 +149,7 @@
           source = code.encode(encoding) * 5
           target = "#{test.first_data[code]}#{test.follow_data[code]*4}".encode(encoding)
           result = source.__send__(test.method_name, *test.attributes)
-          assert_equal target, target,
+          assert_equal target, result,
             proc{"from #{code*5} (#{source.dump}) expected #{target.dump} but was #{result.dump}"}
         end
       end
```

and got 30 failures.

Is this intentional?



-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>
