[ruby-core:109642] [Ruby master Bug#18972] String#byteslice should return BINARY (aka ASCII-8BIT) Strings

From: "Eregon (Benoit Daloze)" <noreply@...>
Date: 2022-08-23 13:44:19 UTC
List: ruby-core #109642
Issue #18972 has been updated by Eregon (Benoit Daloze).


I think the current behavior is better: `String#byteslice` is not only used for BINARY strings.
In fact, for binary strings (and other fixed-width encodings), there is no point in using `byteslice` over `slice`/`[]`.

For instance, one might work with UTF-8 and get a byte index (instead of a character index) from, e.g., `String#byteindex` or `MatchData#byteoffset`, and then use `byteslice` to avoid two extra conversions between byte offsets and character offsets, which are expensive for non-7-bit UTF-8 strings.
What I just described is close to the motivation for #13110, which added `String#byteindex`.
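
A minimal sketch of that use case, under stated assumptions: the sample string and variable names are made up for illustration, and the byte offset is derived from `pre_match.bytesize` so that the snippet also runs on Ruby versions that lack `String#byteindex` and `MatchData#byteoffset`.

```ruby
text = "café au lait"                 # UTF-8; "é" occupies two bytes

m = text.match(/au/)
char_start = text.index("au")         # character offset: 5
byte_start = m.pre_match.bytesize     # byte offset: 6 ("café " is 6 bytes)
byte_len   = m[0].bytesize

# Slice by bytes directly, with no byte<->character offset conversion.
word = text.byteslice(byte_start, byte_len)

puts word                  # the slice falls on character boundaries
puts word.encoding         # the receiver's encoding is kept
puts word.valid_encoding?
```

Because the offsets fall on character boundaries, the slice is a valid UTF-8 string and can be used like any other, which is the case that keeping the receiver's encoding serves.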

So I think we cannot change this, for compatibility reasons, and the current behavior is intended AFAIK.

----------------------------------------
Bug #18972: String#byteslice should return BINARY (aka ASCII-8BIT) Strings
https://bugs.ruby-lang.org/issues/18972#change-98866

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
While working on implementing https://bugs.ruby-lang.org/issues/13626, I noticed that `byteslice` assigns the receiver's encoding to the returned String.

I believe this is incorrect: since you are doing a byte-based operation, you expect a binary string in return. Otherwise, if you call it on a UTF-8 string, you are likely to get a string with an invalid encoding.

I read the original feature request and there's no mention of what the returned encoding should be: https://bugs.ruby-lang.org/issues/4447


### Current behavior

```ruby
>> "fée".byteslice(1).valid_encoding?
=> false
>> "fée".byteslice(1).encoding
=> #<Encoding:UTF-8>
```

### Expected behavior

```ruby
>> "fée".byteslice(1).valid_encoding?
=> true
>> "fée".byteslice(1).encoding
=> #<Encoding:ASCII-8BIT>
```

### Backward compatibility concerns

I'm honestly not quite sure what the backward incompatibility impact may be.

From my point of view, if you are calling `byteslice`, it is to use the result with other binary strings, but it is indeed possible that there is existing code mixing UTF-8 and BINARY that somewhat works and would be broken by this change.

Especially since binary strings can silently be promoted from BINARY to UTF-8:

```ruby
buffer = "".b
buffer << "fée" # buffer was promoted to Encoding::UTF-8 silently
buffer << "fée".byteslice(1)
```

The above currently "works", but would raise `Encoding::CompatibilityError` with this change.
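
As a rough illustration of that failure mode, the proposed return value can be approximated today by calling `String#b` on the slice. This is only a sketch: the `.b` call stands in for the proposed behavior, and the variable names are chosen for the example.

```ruby
slice  = "fée".byteslice(1)   # single byte "\xC3", still tagged UTF-8
binary = slice.b              # same byte, tagged ASCII-8BIT (BINARY)

buffer = "".b
buffer << "fée"               # the empty binary buffer takes on UTF-8

error = nil
begin
  buffer << binary            # non-ASCII BINARY appended to non-ASCII UTF-8
rescue Encoding::CompatibilityError => e
  error = e
end

puts slice.valid_encoding?    # the UTF-8-tagged slice is a broken string
puts binary.valid_encoding?   # any byte sequence is valid as BINARY
puts error.class
```

Appending `slice` instead of `binary` does not raise, since both sides are tagged UTF-8, which is why existing code of this shape currently "works".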





-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>