From: nagachika00@...
Date: 2016-08-15T19:02:14+00:00
Subject: [ruby-core:76885] [Ruby trunk Bug#12431] Strange behavior of String#encode('UTF-8', 'UTF-8', ...) when the encoding of the source string is not UTF-8

Issue #12431 has been updated by Tomoyuki Chikanaga.

Backport changed from 2.1: WONTFIX, 2.2: REQUIRED, 2.3: REQUIRED to 2.1: WONTFIX, 2.2: REQUIRED, 2.3: DONE

ruby_2_3 r55905 merged revision(s) 55181.

----------------------------------------
Bug #12431: Strange behavior of String#encode('UTF-8', 'UTF-8', ...) when the encoding of the source string is not UTF-8
https://bugs.ruby-lang.org/issues/12431#change-60107

* Author: Paul Grayson
* Status: Closed
* Priority: Normal
* Assignee:
* ruby -v: ruby 2.3.1p112 (2016-04-26 revision 54768) [x86_64-linux]
* Backport: 2.1: WONTFIX, 2.2: REQUIRED, 2.3: DONE
----------------------------------------
When the dst_encoding and src_encoding options of String#encode are the same, it appears to ignore the encoding given and instead operate on the actual encoding of the string. Examples:

~~~
"abcd��".force_encoding('ASCII').encode('UTF-8', 'UTF-8', invalid: :replace, undef: :replace)
=> "abcd??"

"abcd��".force_encoding('ASCII').encode('UTF-8', 'UTF-8', invalid: :replace, undef: :replace, replace: '���')
Encoding::CompatibilityError: incompatible character encodings: US-ASCII and UTF-8

"abcd��\xff".encode('ASCII', 'ASCII', invalid: :replace, undef: :replace).force_encoding('UTF-8')
=> "abcd�����"
~~~

Also, without the "replace" options, exceptions are not raised as they should be:

~~~
"\xff".force_encoding('ASCII').encode('UTF-8', 'UTF-8')
=> "\xFF"
~~~

I looked a little at the code, and I think the problem might be in [this block](https://github.com/ruby/ruby/blob/v2_3_1/transcode.c#L2697-L2709) where the given string is passed to `rb_str_scrub` without any other encoding information.
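
For anyone wanting to check a particular Ruby build, here is a minimal sketch that runs the one-step call `s.encode(dst, src, opts)` next to the explicit retag-then-encode spelling and reports what each produces. `outcome` and `compare_encode` are hypothetical helper names chosen for illustration; they are not part of Ruby or of this report.

~~~ruby
# Hypothetical helper: run the block and describe the result as
# [encoding name, bytes], or return the exception class if encoding fails.
def outcome
  result = yield
  [result.encoding.name, result.bytes]
rescue EncodingError => e
  e.class
end

# Hypothetical helper: run both spellings on the same bytes so the
# results can be compared side by side.
def compare_encode(s, src, dst, **opts)
  {
    # Retag a copy as src first, then convert to dst.
    retagged: outcome { s.dup.force_encoding(src).encode(dst, **opts) },
    # Name src in the call and leave the string's own tag untouched.
    two_arg: outcome { s.encode(dst, src, **opts) }
  }
end

# Different source and destination encodings: Latin-1 bytes to UTF-8.
latin1 = "caf\xE9".b
p compare_encode(latin1, 'ISO-8859-1', 'UTF-8')

# Same source and destination encoding, on a string tagged as US-ASCII.
# This is the shape of call the report is about.
bad = "abcd\xFF".b.force_encoding('US-ASCII')
p compare_encode(bad, 'UTF-8', 'UTF-8', invalid: :replace, undef: :replace)
~~~

Going by the examples above, an affected build would be expected to print two different entries for the second call. The first call converts between two different encodings, a case the report does not describe as affected.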

What I would expect is for `s.dup.force_encoding('X').encode('Y', opts)` to behave identically to `s.encode('Y', 'X', opts)`, but that is clearly not the case.

Verified on Ruby 2.1.5, 2.3.0, and 2.3.1.

-- 
https://bugs.ruby-lang.org/