[ruby-core:26450] [Bug #2313] Incomplete encoding conversion?

From: Yui NARUSE <redmine@...>
Date: 2009-10-31 16:59:31 UTC
List: ruby-core #26450
Issue #2313 has been updated by Yui NARUSE.


> >> "元気".encode('UTF-8').force_encoding('ASCII-8BIT').encode('UTF-8')
> Encoding::UndefinedConversionError: "\xE5" from ASCII-8BIT to UTF-8
> 	from (irb):24:in `encode'
> 	from (irb):24
> 	from /opt/local/bin/irb:12:in `<main>'
> 
> Is that a bug in the UTF-8 encoding parser? Or is it related to this problem?

OK, I'll explain step by step:

str = "元気"
# You make a String that contains "元気", encoded in some encoding
# str's byte data is some byte string that means "元気"
# str's encoding is the source encoding
str = str.encode('UTF-8')
# str is encoded to UTF-8, so
# str's byte data is "\xE5\x85\x83\xE6\xB0\x97"
# str's encoding is UTF-8
str.force_encoding('ASCII-8BIT')
# change str's encoding to ASCII-8BIT, so
# str's byte data is "\xE5\x85\x83\xE6\xB0\x97"
# str's encoding is now ASCII-8BIT

Then you try str.encode('UTF-8'), and String#encode converts the byte data:
String#encode tries to convert "\xE5" from ASCII-8BIT to UTF-8, but there is no mapping.
What you want is not a conversion; you want to set the encoding.

str.force_encoding('UTF-8')
# change str's encoding to UTF-8, so
# str's byte data is "\xE5\x85\x83\xE6\xB0\x97"
# str's encoding is now UTF-8
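
The steps above can be checked with a short script. This is only a sketch: the variable names (binary, relabeled, and so on) are made up for illustration, and it assumes the source file itself is saved as UTF-8.

```ruby
# encoding: utf-8
# Sketch only: variable names are illustrative, not from the original report.

str = "元気".encode('UTF-8')
bytes = str.bytes.to_a            # [0xE5, 0x85, 0x83, 0xE6, 0xB0, 0x97]

# force_encoding only changes the encoding label; the bytes stay the same.
binary = str.dup.force_encoding('ASCII-8BIT')
same_bytes = (binary.bytes.to_a == bytes)

# encode tries to convert the bytes themselves. There is no mapping for
# "\xE5" from ASCII-8BIT to UTF-8, so the conversion raises.
begin
  binary.encode('UTF-8')
  conversion_failed = false
rescue Encoding::UndefinedConversionError
  conversion_failed = true
end

# Setting the label back to UTF-8 recovers the original string.
relabeled = binary.dup.force_encoding('UTF-8')
round_trip_ok = relabeled.valid_encoding? && relabeled == str
```

With this script, same_bytes, conversion_failed and round_trip_ok should all end up true: the byte data never changes, only String#encode fails, and force_encoding is enough to get the UTF-8 string back.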
----------------------------------------
http://redmine.ruby-lang.org/issues/show/2313

----------------------------------------
http://redmine.ruby-lang.org
