From: dylan.smith@...
Date: 2021-04-14T18:28:28+00:00
Subject: [ruby-core:103451] [Ruby master Feature#17790] Have a way to clear a String without resetting its capacity

Issue #17790 has been updated by dylants (Dylan Thacker-Smith).

> Maybe String#capacity and String#capacity= would make sense?

Using `capacity=` for the method name would create the expectation that the capacity is exactly that value after the call. However, with embedded strings, the capacity is fixed until the string grows larger than what can be embedded in the object struct. That's why I suggested `shrink` as the name for shrinking the capacity.

> But then there's the question of the behavior if you set the capacity to lower than the size. Should it truncate? (this could corrupt UTF-8 for instance) or should it raise?

I think that should raise, since it seems too implicit for a call that sets the capacity to also truncate the contents.

I do think it would be useful to be able to efficiently truncate a string, but that could be done with a separate method. For example, `String#size=` could be provided; it could efficiently truncate a binary string and would avoid corrupting UTF-8 strings.

There are few String methods for working with byte offsets in variable-width encoded strings like UTF-8, so I'm actually surprised that there is already a `String#byteslice` method. Nothing prevents it from creating an invalid UTF-8 string. However, I don't see the use case for using it with non-binary strings. I think a way to truncate using a byte offset would be more useful as part of the C API for now.

> My feeling is handling the capacity in Ruby code feels wrong and like C++ code.

Performance-sensitive code will naturally be written based on what is more efficient for the machine (the primary concern of C++), such as preferring mutations to avoid object allocations.
Providing primitive low-level methods for performance-sensitive Ruby code will allow more pleasant optimization than forcing the code to be rewritten in a native extension to do the same optimization.

> String#resize

`size` refers to the size of the contents, so `resize` seems like it would affect that `size` (e.g. truncating or padding) instead of just the capacity.

> What about buffer.clear(capacity: 1024)
> Or maybe even buffer.clear(capacity: 1024..8192)
> I think that's more straightforward than separate clear and resize operations.

Coupling capacity control with clearing the buffer makes the capacity control less general. For instance, it doesn't support shrinking the buffer to fit the contents or growing the buffer once before multiple appends.

----------------------------------------
Feature #17790: Have a way to clear a String without resetting its capacity
https://bugs.ruby-lang.org/issues/17790#change-91544

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------
In some tight loops it can be useful to re-use a buffer string. For instance:

```ruby
buffer = String.new(encoding: Encoding::BINARY, capacity: 1024)
10.times do
  build_next_packet(buffer)
  udp_socket.send(buffer, 0) # the flags argument is required by UDPSocket#send
  buffer.clear
end
```

Currently `Array#clear` preserves the Array capacity, but `String#clear` doesn't:

```ruby
>> puts ObjectSpace.dump(Array.new(20).clear)
{"address":"0x7fd3260a1558", "type":"ARRAY", "class":"0x7fd3230972e0", "length":0, "memsize":200, "flags":{"wb_protected":true}}
>> puts ObjectSpace.dump(String.new(encoding: Encoding::BINARY, capacity: 1024).clear)
{"address":"0x7fd322a8a320", "type":"STRING", "class":"0x7fd3230b75b8", "embedded":true, "bytesize":0, "value":"", "memsize":40, "flags":{"wb_protected":true}}
```

It would be useful if `String#clear` didn't free the allocated memory, but if changing it is a backward-compatibility concern, then maybe another method could make sense?
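The difference described above can also be observed without reading the raw `ObjectSpace.dump` output. The following is only a minimal sketch, assuming CRuby and the `objspace` standard library; `report` is a hypothetical helper introduced for illustration, and the numbers it prints vary with the Ruby version and platform.

```ruby
require "objspace"

# Hypothetical helper (not part of any API): prints and returns the content
# size and the memory CRuby attributes to the object.
def report(label, obj)
  info = { label: label, size: obj.size, memsize: ObjectSpace.memsize_of(obj) }
  puts info
  info
end

# String: allocate with an explicit capacity, fill it, then clear it.
buffer = String.new(encoding: Encoding::BINARY, capacity: 1024)
before = report("string before clear", buffer)
buffer << "x" * 512
filled = report("string filled", buffer)
buffer.clear
after = report("string after clear", buffer)

# Array: same experiment, for comparison with the String results.
array = Array.new(20)
array_before = report("array before clear", array)
array.clear
array_after = report("array after clear", array)
```

Comparing the `memsize` values before and after `clear` shows whether the backing memory was kept or released on the Ruby build in use. No expected output is given here because the values are implementation details.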