[#105882] [Ruby master Bug#18280] Segmentation Fault in rb_utf8_str_new_cstr(NULL) — "ukolovda (Dmitry Ukolov)" <noreply@...>

Issue #18280 has been reported by ukolovda (Dmitry Ukolov).

13 messages 2021/11/01

[#105897] [Ruby master Bug#18282] Rails CI raises Segmentation fault with ruby 3.1.0dev supporting `Class#descendants` — "yahonda (Yasuo Honda)" <noreply@...>

Issue #18282 has been reported by yahonda (Yasuo Honda).

12 messages 2021/11/02

[#105909] [Ruby master Misc#18285] NoMethodError#message uses a lot of CPU/is really expensive to call — "ivoanjo (Ivo Anjo)" <noreply@...>

Issue #18285 has been reported by ivoanjo (Ivo Anjo).

37 messages 2021/11/02

[#105920] [Ruby master Bug#18286] Universal arm64/x86_84 binary built on an x86_64 machine segfaults/is killed on arm64 — "ccaviness (Clay Caviness)" <noreply@...>

Issue #18286 has been reported by ccaviness (Clay Caviness).

16 messages 2021/11/03

[#105928] [Ruby master Feature#18287] Support nil value for sort in Dir.glob — "Strech (Sergey Fedorov)" <noreply@...>

Issue #18287 has been reported by Strech (Sergey Fedorov).

16 messages 2021/11/04

[#105944] [Ruby master Bug#18289] Enumerable#to_a should delegate keyword arguments to #each — "Ethan (Ethan -)" <noreply@...>

Issue #18289 has been reported by Ethan (Ethan -).

8 messages 2021/11/05

[#105967] [Ruby master Bug#18293] Time.at in master branch was 25% slower then Ruby 3.0 — "watson1978 (Shizuo Fujita)" <noreply@...>

Issue #18293 has been reported by watson1978 (Shizuo Fujita).

17 messages 2021/11/08

[#106008] [Ruby master Bug#18296] Custom exception formatting should override `Exception#full_message`. — "ioquatix (Samuel Williams)" <noreply@...>

Issue #18296 has been reported by ioquatix (Samuel Williams).

14 messages 2021/11/10

[#106033] [Ruby master Bug#18330] Make failure on 32-bit Linux (Android) with Clang due to implicit 64-to-32-bit integer truncation — "xtkoba (Tee KOBAYASHI)" <noreply@...>

Issue #18330 has been reported by xtkoba (Tee KOBAYASHI).

10 messages 2021/11/11

[#106053] [Ruby master Misc#18335] openindiana ruby 3.1 preview needs --disable-dtrace — "stes (David Stes)" <noreply@...>

Issue #18335 has been reported by stes (David Stes).

14 messages 2021/11/14

[#106069] [Ruby master Feature#18339] GVL instrumentation API — "byroot (Jean Boussier)" <noreply@...>

Issue #18339 has been reported by byroot (Jean Boussier).

13 messages 2021/11/15

[#106145] [Ruby master Misc#18346] DevelopersMeeting20211209Japan — "mame (Yusuke Endoh)" <noreply@...>

Issue #18346 has been reported by mame (Yusuke Endoh).

11 messages 2021/11/18

[#106173] [Ruby master Feature#18349] Let --jit enable YJIT — "k0kubun (Takashi Kokubun)" <noreply@...>

Issue #18349 has been reported by k0kubun (Takashi Kokubun).

8 messages 2021/11/19

[#106175] [Ruby master Feature#18351] Support anonymous rest and keyword rest argument forwarding — "jeremyevans0 (Jeremy Evans)" <noreply@...>

Issue #18351 has been reported by jeremyevans0 (Jeremy Evans).

10 messages 2021/11/19

[#106279] [Ruby master Feature#18364] Add GC.stat_size_pool for Variable Width Allocation — "peterzhu2118 (Peter Zhu)" <noreply@...>

Issue #18364 has been reported by peterzhu2118 (Peter Zhu).

14 messages 2021/11/25

[#106308] [Ruby master Feature#18367] Stop the interpreter from escaping error messages — "mame (Yusuke Endoh)" <noreply@...>

Issue #18367 has been reported by mame (Yusuke Endoh).

13 messages 2021/11/29

[#106314] [Ruby master Feature#18368] Range#step semantics for non-Numeric ranges — "zverok (Victor Shepelev)" <noreply@...>

Issue #18368 has been reported by zverok (Victor Shepelev).

39 messages 2021/11/29

[#106341] [Ruby master Bug#18369] users.detect(:name, "Dorian") as shorthand for users.detect { |user| user.name == "Dorian" } — dorianmariefr <noreply@...>

Issue #18369 has been reported by dorianmariefr (Dorian Mari辿).

14 messages 2021/11/30

[#106347] [Ruby master Feature#18370] Call Exception#full_message to print exceptions reaching the top-level — "Eregon (Benoit Daloze)" <noreply@...>

Issue #18370 has been reported by Eregon (Benoit Daloze).

10 messages 2021/11/30

[ruby-core:106189] [Ruby master Feature#18262] Enumerator::Lazy#partition

From: "zverok (Victor Shepelev)" <noreply@...>
Date: 2021-11-20 10:17:24 UTC
List: ruby-core #106189
Issue #18262 has been updated by zverok (Victor Shepelev).


@Eregon your version is not the same for effectfull enumerators (where `.lazy` is extremely useful):

```
require 'stringio'

str = StringIO.new(<<~ROWS)
1: OK
2: Err
3: OK
4: Err
5: OK
6: Err
ROWS


err, ok = str.each_line(chomp: true).lazy.partition { _1.include?('Err') }

p [err.first(2), ok.first(2)]
# mine: [["2: Err", "4: Err"], ["1: OK", "3: OK"]]
# yours: [["2: Err", "4: Err"], ["5: OK"]]
```
...because yours is consuming both kinds of rows while producing `err`s.

I agree that if not a "whole point", the "it consumes much less memory" is an implicit expectation of a lazy enumerator. But I believe having (well-documented) quirk in `partition` is better than not having lazy `partition` at all.

----------------------------------------
Feature #18262: Enumerator::Lazy#partition
https://bugs.ruby-lang.org/issues/18262#change-94796

* Author: zverok (Victor Shepelev)
* Status: Open
* Priority: Normal
----------------------------------------
(Part of my set of proposals about making `.lazy` more useful/popular.)

Currently:
```ruby
file = File.open('very-large-file.txt')
lines_with_errors, lines_without_errors = file.lazy.partition { _1.start_with?('E:') }
lines_with_errors.class
# => Array, all file is read by this moment
```
This might be not very practical performance-wise and memory-wise.

I am thinking that maybe returning a pair of lazy enumerators might be a good addition to `Enumerator::Lazy`

Naive prototype:

```ruby
class Enumerator::Lazy
  def partition(&block)
    buffer1 = []
    buffer2 = []
    source = self

    [
      Enumerator.new { |y|
        loop do
          if buffer1.empty?
            begin
              item = source.next
              if block.call(item)
                y.yield(item)
              else
                buffer2.push(item)
              end
            rescue StopIteration
              break
            end
          else
            y.yield buffer1.shift
          end
        end
      }.lazy,
      Enumerator.new { |y|
        loop do
          if buffer2.empty?
            begin
              item = source.next
              if !block.call(item)
                y.yield(item)
              else
                buffer1.push(item)
              end
            rescue StopIteration
              break
            end
          else
            y.yield buffer2.shift
          end
        end
      }.lazy
    ]
  end
end
```
Testing it:
```ruby
Enumerator.produce(1) { |i| puts "processing #{i}"; i + 1 }.lazy
  .take(30)
  .partition(&:odd?)
  .then { |odd, even|
    p odd.first(3), even.first(3)
  }
# Prints:
# processing 1
# processing 2
# processing 3
# processing 4
# processing 5
# [1, 3, 5]
# [2, 4, 6]
```
As you might notice by the "processing" log, it only fetched the amount of entries that was required by produced enumerators.

The **drawback** would be—as my prototype implementation shows—the need of internal "buffering" (I don't think it is possible to implement lazy partition without it), but it still might be worth a shot?



-- 
https://bugs.ruby-lang.org/

Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>

In This Thread

Prev Next