From: duerst@...
Date: 2018-11-20T10:22:40+00:00
Subject: [ruby-core:89905] [Ruby trunk Feature#15317] How to deal with obsolete property values in Unicode 11.0.0

Issue #15317 has been updated by duerst (Martin Dürst).


Some pointers obtained from a Unicode-internal discussion:

- All (including past) property values are available from the Relax NG schema for UCD in XML at http://www.unicode.org/reports/tr42/tr42-23.rnc, linked off https://www.unicode.org/reports/tr42/.

- PropertyAliases.txt lists all the properties, and PropertyValueAliases.txt lists the values of the enumerated properties. We already download these files as part of the Ruby make process.

- Hiragana_or_Katakana is an old, obsolete script property, which currently leads to an error with `'abc' =~ /\p{hiragana_or_katakana}/`.
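
A quick way to check how a given Ruby build treats such a name is to compile the pattern at run time and rescue the error. This is only a minimal sketch: `property_supported?` is a hypothetical helper, not part of Ruby, and it relies on `Regexp.new` raising `RegexpError` for property names the regexp engine does not know.

```ruby
# Hypothetical helper (not part of Ruby): returns true if this Ruby's
# regexp engine accepts \p{name}, and false if it rejects the name.
def property_supported?(name)
  Regexp.new("\\p{#{name}}")   # compiled at run time, so failure is a RegexpError
  true
rescue RegexpError
  false
end

property_supported?("Hiragana")              # a script name Ruby knows
property_supported?("Hiragana_or_Katakana")  # result depends on the Ruby version
```

Writing the pattern as a regexp literal instead would report the unknown name when the file is parsed, which is why the sketch builds the pattern from a string.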

----------------------------------------
Feature #15317: How to deal with obsolete property values in Unicode 11.0.0
https://bugs.ruby-lang.org/issues/15317#change-74984

* Author: duerst (Martin Dürst)
* Status: Open
* Priority: Normal
* Assignee: 
* Target version: 2.6
----------------------------------------
http://www.unicode.org/versions/Unicode11.0.0/#Migration contains the following:

> Four Grapheme_Cluster_Break and Word_Break classes have become obsolete and are no longer used: E_Base, E_Modifier, Glue_After_Zwj, and E_Base_GAZ. Those values are still part of the enumeration of the property values, because stability constraints prevent removal of enumerated property values, even if obsolete; however, these are no longer assigned to any characters, and are no longer referred to explicitly by any rules in the algorithms.

For Ruby, we have to decide how to support (or not) these property values. The main choices are to raise an error or to just not match anything. The latter seems preferable for backwards compatibility, but the relevant file (https://www.unicode.org/Public/UCD/latest/ucd/auxiliary/GraphemeBreakProperty.txt) does not mention these property values anymore.
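
One possible way to find such values mechanically is to compare the values that PropertyValueAliases.txt declares for a property with the values that the data file actually assigns. The following is a rough sketch, not what the Ruby build currently does: the helper names are hypothetical, and the inline sample lines only imitate the UCD file formats (`#` starts a comment, fields are separated by `;`).

```ruby
require "set"

# Illustrative excerpts only; real code would read the downloaded files.
ALIASES_SAMPLE = <<~DATA
  # PropertyValueAliases.txt (excerpt)
  GCB; CN ; Control
  GCB; EB ; E_Base
  GCB; EX ; Extend
  WB ; EB ; E_Base
DATA

BREAK_SAMPLE = <<~DATA
  # GraphemeBreakProperty.txt (excerpt)
  0000..0009    ; Control # Cc  [10] <control-0000>..<control-0009>
  0300..036F    ; Extend # Mn [112] COMBINING GRAVE ACCENT..COMBINING LATIN SMALL LETTER X BELOW
DATA

# Strip comments, split each line on ";", and drop lines with no data.
def ucd_fields(text)
  text.each_line
      .map { |line| line.sub(/#.*/, "").split(";").map(&:strip) }
      .reject { |fields| fields.all?(&:empty?) }
end

# Long value names declared for one property (by its short name, e.g. "GCB").
def declared_values(aliases_text, property)
  ucd_fields(aliases_text)
    .select { |prop, *| prop == property }
    .map { |_prop, _short, long, *| long }
end

# Value names assigned to at least one code point or range.
def assigned_values(break_text)
  ucd_fields(break_text).map { |_range, value| value }.to_set
end

# Declared values that no code point carries in the data file.
def unassigned_values(aliases_text, break_text, property)
  assigned = assigned_values(break_text)
  declared_values(aliases_text, property).reject { |v| assigned.include?(v) }
end

unassigned_values(ALIASES_SAMPLE, BREAK_SAMPLE, "GCB") # => ["E_Base"]
```

With the real files, the default value (Other) would also show up as unassigned, because it is only given in a comment line and not listed for explicit ranges, so it would need to be excluded separately.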

I'm currently contacting other Unicode experts to find out whether there's any machine-readable data for obsolete property values.

Your input is appreciated.



-- 
https://bugs.ruby-lang.org/
