From: duerst@...
Date: 2017-05-29T07:51:40+00:00
Subject: [ruby-core:81446] [Ruby trunk Feature#13588] Add Encoding#min_char_size, #max_char_size, #minmax_char_size

Issue #13588 has been updated by duerst (Martin Dürst).

haines (Andrew Haines) wrote:
> phluid61 (Matthew Kerwin) wrote:
> > I hope there are no encodings where valid characters might not be a multiple of the minimum size.
>
> Me too :) it works for now... the only encodings on Ruby 2.4.1 with `min_enc_len` > 1 are UTF-16 and UTF-32; UTF-16 is variable-length with either 1 or 2 16-bit code units, and UTF-32 is fixed-length.

Not true. There are quite a few East Asian encodings with max length of 2, 3, or 4. E.g. Shift_JIS, EUC_JP, GB18030,... But it's still true that the maximum size is a multiple of the minimum size.

----------------------------------------
Feature #13588: Add Encoding#min_char_size, #max_char_size, #minmax_char_size
https://bugs.ruby-lang.org/issues/13588#change-65151

* Author: haines (Andrew Haines)
* Status: Feedback
* Priority: Normal
* Assignee:
* Target version:
----------------------------------------
When implementing an IO-like object, I'd like to handle encoding correctly. To do so, I need to know the minimum and maximum character sizes for the encoding of the stream I'm reading. However, I can't find a way to access this information from Ruby (I ended up writing a gem with a native extension [1] to do so).

I'd like to propose adding instance methods `min_char_size`, `max_char_size`, and `minmax_char_size` to the `Encoding` class to expose the information stored in the `OnigEncodingType` struct's `min_enc_len` and `max_enc_len` fields.

~~~ ruby
Encoding::UTF_8.min_char_size # => 1
Encoding::UTF_8.max_char_size # => 6
Encoding::UTF_8.minmax_char_size # => [1, 6]
~~~

[1] https://github.com/haines/char_size

--
https://bugs.ruby-lang.org/
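
To illustrate the use case in the issue description, here is a minimal sketch of how an IO-like reader might use the two sizes to pull one character at a time from a byte stream. The helper `read_char` and its `min_size:`/`max_size:` keywords are hypothetical and not part of Ruby or of the proposal; the sizes are passed in by hand because `Encoding` does not expose them. The sketch assumes that no proper prefix of a character is itself a valid string in the encoding, and it relies on the point made above that every character length is a multiple of the minimum size.

```ruby
require "stringio"

# Hypothetical helper (not a Ruby API): read one character from +io+.
# min_size and max_size are supplied by the caller, since Encoding does
# not expose them. Returns nil at end of stream.
def read_char(io, encoding, min_size:, max_size:)
  first = io.read(min_size)
  return nil if first.nil?

  buffer = first.b # accumulate raw bytes in a binary string
  # Grow the buffer one minimum-size unit at a time until the bytes form
  # a valid string in the target encoding, or the maximum size is reached.
  until buffer.dup.force_encoding(encoding).valid_encoding? ||
        buffer.bytesize >= max_size
    chunk = io.read(min_size)
    break if chunk.nil? # truncated stream: return what we have
    buffer << chunk.b
  end
  buffer.force_encoding(encoding)
end

# Usage: "a", HIRAGANA LETTER A and an emoji, encoded as UTF-16LE.
io = StringIO.new("a\u3042\u{1F600}".encode("UTF-16LE").b)
sizes = []
while (char = read_char(io, Encoding::UTF_16LE, min_size: 2, max_size: 4))
  sizes << char.bytesize
end
p sizes # => [2, 2, 4]
```

The same helper should work for Shift_JIS with `min_size: 1, max_size: 2`. The caller still has to know those numbers in advance, which is the gap the proposed methods would fill.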