[ruby-core:77292] [Ruby trunk Feature#12744] Add str.reverse_each_char and str.reverse_chars
From: duerst@...
Date: 2016-09-16 10:41:28 UTC
List: ruby-core #77292
Issue #12744 has been updated by Martin Dürst.
Bouke van der Bijl wrote:
> I don't really have a use case for reverse_chars, but I added it for symmetry with the other methods.
Other languages may do that, but Ruby doesn't add something just for symmetry.
> I meant str.reverse_each_char, I typo'd it in the issue but it's correct in the patch. The equivalent with doing allocation would be str.chars.reverse.each. I could use `reverse_each_char` in Sprockets, where we need to iterate over the string backwards to check that it ends with certain characters (and know what it ends with).
Wouldn't this usually be done with a Regexp? If using a Regexp directly isn't efficient, what about just applying the reverse of the Regexp to the reverse of the string (so that it gets applied from the start)?
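As a rough sketch of both variants (the suffix list and the method names here are made up for illustration and are not taken from Sprockets): the first anchors a Regexp at the end of the string with `\z`, the second reverses the string and anchors the reversed pattern at the start with `\A`.

```ruby
# Hypothetical suffixes, chosen only for illustration.
SUFFIX_RE          = /(?:\.min\.js|\.js|\.css)\z/
# The same alternatives written backwards, anchored at the start.
REVERSED_SUFFIX_RE = /\A(?:sj\.nim\.|sj\.|ssc\.)/

# Variant 1: match directly against the end of the string.
def trailing_suffix(str)
  match = SUFFIX_RE.match(str)
  match && match[0]
end

# Variant 2: reverse the string so the match starts at index 0,
# then reverse the matched text back into reading order.
def trailing_suffix_reversed(str)
  match = REVERSED_SUFFIX_RE.match(str.reverse)
  match && match[0].reverse
end
```

Both return the matched suffix, or `nil` when the string ends with none of the alternatives. The second variant allocates a reversed copy of the string, so it trades memory for a match that is anchored at the start.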
> Not sure why you think we can't make it faster than `reverse.each_char`, I've already implemented it and attached the patch. It uses `rb_enc_left_char_head`, which is implemented by all the encodings to scan a string backwards.
Some of these implementations are not exactly trivial. Please look at enc/shift_jis.c or enc/gb18030.c. Please try your code on something like
```ruby
"\x95\x95".force_encoding('Shift_JIS') * x
```
where you increase x and see whether the time increases linearly or not.
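Something along these lines could be used for the measurement. It is only a sketch: `time_reverse_iteration` is a made-up helper name, and `reverse_each_char` exists only on a Ruby built with the attached patch, so the code falls back to `reverse.each_char` on an unpatched build.

```ruby
require 'benchmark'

# Times one backwards pass over x copies of the two-byte test string.
def time_reverse_iteration(x)
  # Unary plus gives an unfrozen copy so force_encoding can modify it.
  str = (+"\x95\x95").force_encoding('Shift_JIS') * x
  Benchmark.realtime do
    if str.respond_to?(:reverse_each_char)
      # Only available with the proposed patch applied.
      str.reverse_each_char { |c| c }
    else
      # Allocating fallback for an unpatched Ruby.
      str.reverse.each_char { |c| c }
    end
  end
end

# Doubling x each step: roughly doubling times suggest linear behavior,
# roughly quadrupling times suggest O(N^2).
[1_000, 2_000, 4_000, 8_000].each do |x|
  printf("x=%6d  %.6fs\n", x, time_reverse_iteration(x))
end
```

Single runs are noisy, so it is probably worth repeating each size a few times and going to larger x before drawing conclusions.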
> For the most common encoding (UTF8) it is always possible to scan a string backwards from any point, and looking at the other encodings implemented in Ruby it seems only gb18030 has a stateful way to back up to previous characters, so iterating backwards over that one could end up being O(N^2).
Yes indeed.
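To illustrate why UTF-8 is the easy case, here is a pure-Ruby sketch of backwards iteration. It is not the patch's C code (which uses `rb_enc_left_char_head`); `utf8_reverse_each_char` is a made-up name, and the sketch assumes the string is valid UTF-8. Every continuation byte has the bit pattern `10xxxxxx`, so the start of a character can be found by stepping back at most three bytes, without looking at the rest of the string.

```ruby
# Yields the characters of a UTF-8 string from last to first.
# Assumes valid UTF-8; malformed input is not handled.
def utf8_reverse_each_char(str)
  raise ArgumentError, "expected UTF-8" unless str.encoding == Encoding::UTF_8
  return enum_for(__method__, str) unless block_given?

  stop = str.bytesize
  while stop > 0
    start = stop - 1
    # Step back over continuation bytes (0b10xxxxxx) to the lead byte.
    start -= 1 while start > 0 && (str.getbyte(start) & 0xC0) == 0x80
    yield str.byteslice(start, stop - start)
    stop = start
  end
end
```

Each character still gets its own small String here, but no reversed copy of the whole string is built, and the work per character is bounded, so the whole pass is linear in the byte length.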
----------------------------------------
Feature #12744: Add str.reverse_each_char and str.reverse_chars
https://bugs.ruby-lang.org/issues/12744#change-60525
* Author: Bouke van der Bijl
* Status: Feedback
* Priority: Normal
* Assignee:
----------------------------------------
This patch adds `str.reverse_each` and `str.reverse_chars`. It's currently not really possible to iterate a Ruby string in reverse while guaranteeing that you're not accidentally introducing an O(N^2) bug, without encoding to a fixed-length encoding like UTF-32. This is because variable-length encodings like UTF-8 require iterating over the whole string if you want to address characters by index.
The patch uses `rb_enc_left_char_head` to iterate over the string in reverse, so you can do so without allocating more memory.
---Files--------------------------------
add-reverse-string-iteration.patch (5.91 KB)
--
https://bugs.ruby-lang.org/