From: "mame (Yusuke Endoh)"
Date: 2012-11-06T20:43:26+09:00
Subject: [ruby-core:48972] [ruby-trunk - Bug #7156][Feedback] Invalid byte sequence in US-ASCII when using URI from std lib

Issue #7156 has been updated by mame (Yusuke Endoh).

File bulgarian.rb added
Status changed from Open to Feedback
Target version set to 2.0.0

I'm not sure what you want. I cannot reproduce this issue with the following code.

  $ cat bulgarian.rb
  # coding: UTF-8
  require "uri"
  p URI.escape("История")

  $ ruby bulgarian.rb
  "%D0%98%D1%81%D1%82%D0%BE%D1%80%D0%B8%D1%8F"

Could you please give us example code, the expected result, and the actual result?

--
Yusuke Endoh
----------------------------------------
Bug #7156: Invalid byte sequence in US-ASCII when using URI from std lib
https://bugs.ruby-lang.org/issues/7156#change-32489

Author: t0d0r (Todor Dragnev)
Status: Feedback
Priority: Normal
Assignee:
Category: lib
Target version: 2.0.0
ruby -v: 1.9.3

Invalid byte sequence in US-ASCII on ruby 1.9.3

I receive that error when trying to open a URL with Bulgarian text (utf-8: "История"). It seems that the problem is in uri/common.rb from the Ruby standard library. Adding str.force_encoding(Encoding::BINARY) to the following method fixes the problem:

  class URI::Parser
    def escape(str, unsafe = @regexp[:UNSAFE])
      unless unsafe.kind_of?(Regexp)
        # perhaps unsafe is String object
        unsafe = Regexp.new("[#{Regexp.quote(unsafe)}]", false)
      end
      str.force_encoding(Encoding::BINARY) # FIX
      str.gsub(unsafe) do
        us = $&
        tmp = ''
        us.each_byte do |uc|
          tmp << sprintf('%%%02X', uc)
        end
        tmp
      end.force_encoding(Encoding::US_ASCII)
    end
  end

One more suggestion: maybe US_ASCII should be replaced with Encoding::BINARY too?

--
http://bugs.ruby-lang.org/
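
The report does not include code that reproduces the error, so the following is only a sketch of one plausible way to hit it: the same UTF-8 bytes tagged as US-ASCII (for example, after being read under a C locale). The names `UNSAFE_SKETCH` and `escape_bytes` are hypothetical, and the pattern is a simplified stand-in rather than the regexp uri/common.rb builds. Unlike the patch quoted above, this version works on a binary copy (`String#b`) so the caller's string is not mutated.

```ruby
# Simplified stand-in for the unsafe-character pattern; an assumption for
# illustration, not the regexp URI::Parser actually uses.
UNSAFE_SKETCH = /[^A-Za-z0-9\-_.~]/

# Percent-encode every byte matched by `unsafe`. Works on a binary copy of
# the input, so the caller's string keeps whatever encoding tag it had.
def escape_bytes(str, unsafe = UNSAFE_SKETCH)
  escaped = str.b.gsub(unsafe) do |bytes|
    bytes.each_byte.map { |byte| format('%%%02X', byte) }.join
  end
  escaped.force_encoding(Encoding::US_ASCII)
end

# The seven Cyrillic letters from the report, written as escapes so the
# example does not depend on the source file's encoding.
utf8 = "\u0418\u0441\u0442\u043E\u0440\u0438\u044F"

# Same bytes, but tagged US-ASCII: the string is now invalid in its encoding.
mislabeled = utf8.dup.force_encoding(Encoding::US_ASCII)

begin
  # Regexp matching against an invalid string raises ArgumentError.
  mislabeled.gsub(UNSAFE_SKETCH) { |match| match }
rescue ArgumentError => e
  puts e.message
end

# Treating the input as raw bytes sidesteps the encoding check.
puts escape_bytes(mislabeled)
puts escape_bytes(utf8)
```

Both calls to `escape_bytes` print the percent-encoded form shown in the reply above, because the underlying bytes are identical and only the encoding tag differs.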