[ruby-core:93803] [Ruby master Feature#16006] String count and alignment that consider multibyte characters
From: sawadatsuyoshi@...
Date: 2019-07-16 03:35:12 UTC
List: ruby-core #93803
Issue #16006 has been reported by sawa (Tsuyoshi Sawada).
----------------------------------------
Feature #16006: String count and alignment that consider multibyte characters
https://bugs.ruby-lang.org/issues/16006
* Author: sawa (Tsuyoshi Sawada)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
----------------------------------------
In a non-proportional (monospaced) font, multibyte characters such as Japanese kana and kanji typically occupy twice the width of ASCII characters. Since `String#length`, `String#ljust`, `String#rjust`, and `String#center` do not take this into consideration, applying these methods does not give the desired output.
```ruby
array = ["aaあああ", "bいいいいいいいい", "cc"]
col_width = array.map(&:length).max
array.each{|w| puts w.ljust(col_width, "*")}
# >> aaあああ****
# >> bいいいいいいいい
# >> cc*******
```
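The misalignment arises because `String#length` counts characters, not display columns. A minimal sketch of the discrepancy, assuming (as this proposal does) that every non-ASCII character takes two columns; the variable names are only illustrative:

```ruby
array = ["aaあああ", "bいいいいいいいい", "cc"]

# String#length counts characters, regardless of how wide each one is drawn.
char_counts = array.map(&:length)

# Assumed column count: 1 per ASCII character, 2 per non-ASCII character.
column_counts = array.map { |w| w.each_char.sum { |c| c.ascii_only? ? 1 : 2 } }

p char_counts   # => [5, 9, 2]
p column_counts # => [8, 17, 2]
```

The padding in the output above is computed from the first list, while the terminal lays the text out according to something closer to the second.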
To justify strings that contain multibyte characters, we have to do something much more complicated, such as the following:
```ruby
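# Map each string to its display width: 1 column per ASCII character,
# 2 columns per non-ASCII character. Uses `array` from the previous example.
col_widths =
  array.to_h{|w| [
    w,
    w
    .chars
    .partition(&:ascii_only?)
    .then{|ascii, non| ascii.length + (non.length * 2)}
  ]}
col_width = col_widths.values.max
# Pad each string manually up to the widest display width.
array.each{|w| puts w + "*" * (col_width - col_widths[w])}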
# Note that the following gives the desired alignment in non-proportional font, but may not appear so in this issue tracker.
# >> aaあああ*********
# >> bいいいいいいいい
# >> cc***************
```
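The workaround above could also be packaged as a pair of reusable helpers. The following is only a sketch: `display_width` and `ljust_display` are hypothetical names, not part of Ruby, and the code assumes that every non-ASCII character takes a fixed number of columns and that the padding string is a single one-column character.

```ruby
# Hypothetical helper: 1 column per ASCII character,
# `non_ascii` columns per non-ASCII character.
def display_width(str, non_ascii: 2)
  str.each_char.sum { |c| c.ascii_only? ? 1 : non_ascii }
end

# Hypothetical helper: left-justify by display width rather than character count.
# Assumes `pad` is a single character that occupies one column.
def ljust_display(str, width, pad = " ", non_ascii: 2)
  missing = width - display_width(str, non_ascii: non_ascii)
  return str.dup if missing <= 0
  str + pad * missing
end

array = ["aaあああ", "bいいいいいいいい", "cc"]
col_width = array.map { |w| display_width(w) }.max
lines = array.map { |w| ljust_display(w, col_width, "*") }
puts lines
```

Because the width is recomputed per string instead of being looked up in a hash, this variant does not depend on the strings being distinct.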
This issue seems to be common, as several webpages can be found that attempt to do something similar.
I propose giving the relevant methods an option that takes multibyte characters into account. Perhaps something like the `non_ascii` keyword in the following would work:
```ruby
"aaあああ".length(non_ascii: 2) # => 8
"aaあああ".ljust(17, "*", non_ascii: 2) # => "aaあああ*********"
```
Then, the desired output would be given by this code:
```ruby
col_width = array.map{|w| w.length(non_ascii: 2)}.max
array.each{|w| puts w.ljust(col_width, "*", non_ascii: 2)}
# >> aaあああ*********
# >> bいいいいいいいい
# >> cc***************
```
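The same width calculation would presumably carry over to `String#rjust` and `String#center`. As a rough sketch of what that behavior might look like today, with hypothetical helper names (`display_width`, `rjust_display`, `center_display`) that are not part of Ruby, and again assuming a one-column padding character:

```ruby
# Hypothetical helper: 1 column per ASCII character,
# `non_ascii` columns per non-ASCII character.
def display_width(str, non_ascii: 2)
  str.each_char.sum { |c| c.ascii_only? ? 1 : non_ascii }
end

# Hypothetical counterpart of String#rjust based on display width.
def rjust_display(str, width, pad = " ", non_ascii: 2)
  missing = [width - display_width(str, non_ascii: non_ascii), 0].max
  pad * missing + str
end

# Hypothetical counterpart of String#center based on display width.
def center_display(str, width, pad = " ", non_ascii: 2)
  missing = [width - display_width(str, non_ascii: non_ascii), 0].max
  left = missing / 2 # like String#center, any odd extra column goes on the right
  pad * left + str + pad * (missing - left)
end

p rjust_display("aaあああ", 17, "*")  # => "*********aaあああ"
p center_display("aaあああ", 17, "*") # => "****aaあああ*****"
```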