[#105544] [Ruby master Feature#18239] Variable Width Allocation: Strings — "peterzhu2118 (Peter Zhu)" <noreply@...>

Issue #18239 has been reported by peterzhu2118 (Peter Zhu).

18 messages 2021/10/04

[#105566] [Ruby master Bug#18242] Parser makes multiple assignment sad in confusing way — "danh337 (Dan Higgins)" <noreply@...>

Issue #18242 has been reported by danh337 (Dan Higgins).

9 messages 2021/10/06

[#105573] [Ruby master Bug#18243] Ractor.make_shareable does not freeze the receiver of a Proc but allows accessing ivars of it — "Eregon (Benoit Daloze)" <noreply@...>

Issue #18243 has been reported by Eregon (Benoit Daloze).

11 messages 2021/10/06

[#105618] [Ruby master Bug#18249] The ABI version of dev builds of CRuby does not correspond to the ABI — "Eregon (Benoit Daloze)" <noreply@...>

Issue #18249 has been reported by Eregon (Benoit Daloze).

23 messages 2021/10/11

[#105626] [Ruby master Bug#18250] Anonymous variables seem to break `Ractor.make_shareable` — "tenderlovemaking (Aaron Patterson)" <noreply@...>

Issue #18250 has been reported by tenderlovemaking (Aaron Patterson).

14 messages 2021/10/12

[#105660] [Ruby master Feature#18254] Add an `offset` parameter to String#unpack and String#unpack1 — "byroot (Jean Boussier)" <noreply@...>

Issue #18254 has been reported by byroot (Jean Boussier).

13 messages 2021/10/18

[#105672] [Ruby master Feature#18256] Change the canonical name of Thread::Mutex, Thread::Queue, Thread::SizedQueue and Thread::ConditionVariable to just Mutex, Queue, SizedQueue and ConditionVariable — "Eregon (Benoit Daloze)" <noreply@...>

Issue #18256 has been reported by Eregon (Benoit Daloze).

6 messages 2021/10/19

[#105692] [Ruby master Bug#18257] SystemTap/DTrace coredump on ppc64le/s390x — "vo.x (Vit Ondruch)" <noreply@...>

Issue #18257 has been reported by vo.x (Vit Ondruch).

22 messages 2021/10/20

[#105781] [Ruby master Misc#18266] DevelopersMeeting20211118Japan — "mame (Yusuke Endoh)" <noreply@...>

Issue #18266 has been reported by mame (Yusuke Endoh).

13 messages 2021/10/25

[#105805] [Ruby master Bug#18270] Refinement#{extend_object, append_features, prepend_features} should be removed — "shugo (Shugo Maeda)" <noreply@...>

Issue #18270 has been reported by shugo (Shugo Maeda).

8 messages 2021/10/26

[#105826] [Ruby master Feature#18273] Class.subclasses — "byroot (Jean Boussier)" <noreply@...>

Issue #18273 has been reported by byroot (Jean Boussier).

35 messages 2021/10/27

[#105833] [Ruby master Feature#18275] Add an option to define_method to not capture the surrounding environment — "vinistock (Vinicius Stock)" <noreply@...>

Issue #18275 has been reported by vinistock (Vinicius Stock).

11 messages 2021/10/27

[#105853] [Ruby master Feature#18276] `Proc#bind_call(obj)` same as `obj.instance_exec(..., &proc_obj)` — "ko1 (Koichi Sasada)" <noreply@...>

Issue #18276 has been reported by ko1 (Koichi Sasada).

15 messages 2021/10/28

[ruby-core:105534] [Ruby master Feature#14975] String#append without changing receiver's encoding

From: "nobu (Nobuyoshi Nakada)" <noreply@...>
Date: 2021-10-04 00:38:07 UTC
List: ruby-core #105534
Issue #14975 has been updated by nobu (Nobuyoshi Nakada).

Status changed from Open to Rejected

Closing since this seems not acceptable.
Feel free to reopen if any progress.

----------------------------------------
Feature #14975: String#append without changing receiver's encoding
https://bugs.ruby-lang.org/issues/14975#change-93991

* Author: ioquatix (Samuel Williams)
* Status: Rejected
* Priority: Normal
* Assignee: ioquatix (Samuel Williams)
----------------------------------------
I'm not sure where this fits in, but in order to avoid garbage and superfluous function calls, is it possible that `String#<<`, `String#concat` or the (proposed) `String#append` can avoid changing the encoding of the receiver?

Right now it's very tricky to do this in a way that doesn't require extra allocations. Here is what I do:

```ruby
class Buffer < String
	BINARY = Encoding::BINARY
	
	def initialize
		super
		
		force_encoding(BINARY)
	end
	
	def << string
		if string.encoding == BINARY
			super(string)
		else
			super(string.b) # Requires extra allocation.
		end
		
		return self
	end
	
	alias concat <<
end
```

When the receiver is binary, but contains byte sequences, appending UTF_8 can fail:

```
"Foobar".b << "F淡淡bar"
=> "FoobarF淡淡bar"

> "F淡淡bar".b << "F淡淡bar"
Encoding::CompatibilityError: incompatible character encodings: ASCII-8BIT and UTF-8
```

So, it's not possible to append data, generally, and then call `force_encoding(Encoding::BINARY)`. One must ensure the string is binary before appending it.

It would be nice if there was a solution which didn't require additional allocations/copies/linear scans for what should basically be a `memcpy`.

See also: https://bugs.ruby-lang.org/issues/14033 and https://bugs.ruby-lang.org/issues/13626#note-3

There are two options to fix this:

1/ Don't change receiver encoding in any case.
2/ Apply 1, but only when receiver is using `Encoding::BINARY`




-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>

In This Thread

Prev Next