[#84867] [Ruby trunk Bug#14357] thread_safe tests suite segfaults — v.ondruch@...
Issue #14357 has been reported by vo.x (Vit Ondruch).
11 messages
2018/01/15
[#85364] Re: [Ruby trunk Bug#14357] thread_safe tests suite segfaults
— Eric Wong <normalperson@...>
2018/02/03
v.ondruch@tiscali.cz wrote:
[#85368] Re: [Ruby trunk Bug#14357] thread_safe tests suite segfaults
— Eric Wong <normalperson@...>
2018/02/03
Eric Wong wrote:
[#85442] Re: [Ruby trunk Bug#14357] thread_safe tests suite segfaults
— Eric Wong <normalperson@...>
2018/02/06
Eric Wong <normalperson@yhbt.net> wrote:
[#85451] Re: [Ruby trunk Bug#14357] thread_safe tests suite segfaults
— Vladimir Makarov <vmakarov@...>
2018/02/06
On 02/06/2018 05:00 AM, Eric Wong wrote:
[#85455] Re: [Ruby trunk Bug#14357] thread_safe tests suite segfaults
— Eric Wong <normalperson@...>
2018/02/06
Vladimir Makarov <vmakarov@redhat.com> wrote:
[#84874] [Ruby trunk Bug#14360] Regression CSV#open method for writing from Ruby 2.4.3 to 2.5.0 — shevegen@...
Issue #14360 has been updated by shevegen (Robert A. Heiler).
3 messages
2018/01/15
[#84980] [Ruby trunk Feature#13618][Assigned] [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid — hsbt@...
Issue #13618 has been updated by hsbt (Hiroshi SHIBATA).
10 messages
2018/01/23
[#85012] Re: [Ruby trunk Feature#13618][Assigned] [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid
— Eric Wong <normalperson@...>
2018/01/23
hsbt@ruby-lang.org wrote:
[#85081] Re: [Ruby trunk Feature#13618][Assigned] [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid
— Eric Wong <normalperson@...>
2018/01/24
Eric Wong <normalperson@yhbt.net> wrote:
[#85082] Re: [Ruby trunk Feature#13618][Assigned] [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid
— Eric Wong <normalperson@...>
2018/01/24
> Thinking about this even more; I don't think it's possible to
[#85088] [Ruby trunk Feature#13618] [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid — danieldasilvaferreira@...
Issue #13618 has been updated by dsferreira (Daniel Ferreira).
3 messages
2018/01/25
[#85107] [Ruby trunk Misc#14222] Mutex.lock is not safe inside signal handler: what is? — eregontp@...
Issue #14222 has been updated by Eregon (Benoit Daloze).
3 messages
2018/01/25
[#85136] Re: [Ruby trunk Feature#13618] [PATCH] auto fiber schedule for rb_wait_for_single_fd and rb_waitpid — Eric Wong <normalperson@...>
samuel@oriontransfer.org wrote:
3 messages
2018/01/26
[ruby-core:85019] [Ruby trunk Feature#14383] Making prime_division in prime.rb Ruby 3 ready.
From:
wolf@...
Date:
2018-01-23 23:45:40 UTC
List:
ruby-core #85019
Issue #14383 has been updated by graywolf (Gray Wolf).
Unless I copy-pasted wrong your `prime_division2` is /significantly/ slower for small numbers:
```
$ ruby bm.rb
prime_division - current in prime.rb
prime_division_2 - proposed version in #14383 by jzakiya
`factor` - spawning coreutils' factor command
`factor` is left out from the first benchmark, spawning 1_000_000
shells proves nothing.
1_000_000 times of 10
user system total real
prime_division 1.496456 0.000059 1.496515 ( 1.496532)
prime_division_2 104.094586 6.469827 110.564413 (110.565201)
1_000 times of 2**256
user system total real
prime_division 0.069389 0.000000 0.069389 ( 0.069391)
prime_division_2 0.325075 0.000000 0.325075 ( 0.325088)
`factor` 0.163589 0.073234 0.992784 ( 1.021447)
10 times of 500_000_000_000_000_000_008_244_213
user system total real
prime_division 328.625069 0.033017 328.658086 (328.801649)
prime_division_2 118.690491 0.000119 118.690610 (118.691372)
`factor` 0.002487 0.000030 0.024145 ( 0.025040)
```
script:
```ruby
require 'prime'
require 'openssl'
require 'benchmark'
class Integer
def prime_division_2(generator = Prime::Generator23.new)
return [] if (self.abs | 1) == 1
pv = self < 0 ? [-1] : []
value = self.abs
prime = generator.next
until value.to_bn.prime? or value == 1
while prime
(pv << prime; value /= prime; break) if value % prime == 0
prime = generator.next
end
end
pv << value if value > 1
pv.group_by {|prm| prm }.map{|prm, exp| [prm, exp.size] }
end
end
def factor(num)
return [] if num | 1 == 1
factors = num < 0 ? [-1] : []
factors += `factor #{num.abs}`.split(' ')[1..-1].map(&:to_i)
factors.group_by {|prm| prm }.map {|prm, exp| [prm, exp.size] }
end
puts <<~EOF
prime_division - current in prime.rb
prime_division_2 - proposed version in #14383 by jzakiya
`factor` - spawning coreutils' factor command
`factor` is left out from the first benchmark, spawning 1_000_000
shells proves nothing.
EOF
puts
puts "1_000_000 times of 10"
puts
Benchmark.bm(16) do |x|
x.report('prime_division') { 1_000_000.times { 10.prime_division } }
x.report('prime_division_2') { 1_000_000.times { 10.prime_division_2 } }
end
puts
puts "1_000 times of 2**256"
puts
Benchmark.bm(16) do |x|
num = 2**256
x.report('prime_division') { 1_000.times { num.prime_division } }
x.report('prime_division_2') { 1_000.times { num.prime_division_2 } }
x.report('`factor`') { 1_000.times { factor(num) } }
end
puts
puts "10 times of 500_000_000_000_000_000_008_244_213"
puts
Benchmark.bm(16) do |x|
num = 500_000_000_000_000_000_008_244_213
x.report('prime_division') { 10.times { num.prime_division } }
x.report('prime_division_2') { 10.times { num.prime_division_2 } }
x.report('`factor`') { 10.times { factor(num) } }
end
```
----------------------------------------
Feature #14383: Making prime_division in prime.rb Ruby 3 ready.
https://bugs.ruby-lang.org/issues/14383#change-69726
* Author: jzakiya (Jabari Zakiya)
* Status: Open
* Priority: Normal
* Assignee:
* Target version:
----------------------------------------
I have been running old code in Ruby 2.5.0 (released 2017.12.25) to check for
speed and compatibility. I still see the codebase in `prime.rb` hardly has
changed at all (except for replacing `Math.sqrt` with `Integer.sqrt`).
To achieve the Ruby 3 goal to make it at least three times faster than Ruby 2
there are three general areas where Ruby improvements can occur.
* increase the speed of its implementation at the machine level
* rewrite its existing codebase in a more efficient|faster manner
* use faster algorithms to implement routines and functions
I want to suggest how to address the later two ways to improve performance of
specifically the `prime_division` method in the `prime.rb` library.
I've raised and made suggestions to some of these issues here
[ruby-issues forum](https://bugs.ruby-lang.org/issues/12676) and now hope to invigorate additional discussion.
Hopefully with the release of 2.5.0, and Ruby 3 conceptually closer to reality,
more consideration will be given to coding and algorithmic improvements to
increase its performance too.
**Mathematical correctness**
First I'd like to raise what I consider *math bugs* in `prime_division`, in how
it handles `0` and `-1` inputs.
```
> -1.prime_division
=> [[-1,1]]
> 0.prime_division
Traceback (most recent call last):
4: from /home/jzakiya/.rvm/rubies/ruby-2.5.0/bin/irb:11:in `<main>'
3: from (irb):85
2: from /home/jzakiya/.rvm/rubies/ruby-2.5.0/lib/ruby/2.5.0/prime.rb:30:in `prime_division'
1: from /home/jzakiya/.rvm/rubies/ruby-2.5.0/lib/ruby/2.5.0/prime.rb:203:in `prime_division'
ZeroDivisionError (ZeroDivisionError)
```
First, `0` is a perfectly respectable integer, and is non-prime, so its output should be `[]`,
an empty array to denote it has no prime factors. The existing behavior is solely a matter of
`prime_division`'s' implementation, and does not take this mathematical reality into account.
The output for `-1` is also mathematically wrong because `1` is also non-prime (and correctly
returns `[]`), well then mathematically so should `-1`. Thus, `prime_division` treats `-1` as
a new prime number, and factorization, that has no mathematical basis. Thus, for mathematical
correctness and consistency `-1` and `0` should both return `[]`, as none have prime factors.
```
> -1.prime_division
=> []
> 0.prime_division
=> []
> 1.prime_division
=> []
```
There's a very simple one-line fix to `prime_division` to do this:
```
# prime.rb
class Prime
def prime_division(value, generator = Prime::Generator23.new)
-- raise ZeroDivisionError if value == 0
++ return [] if (value.abs | 1) == 1
```
**Simple Code and Algorithmic Improvements**
As stated above, besides the machine implementation improvements, the other
areas of performance improvements will come from coding rewrites and better
algorithms. Below is the coding of `prime_division`. This coding has existed at
least since Ruby 2.0 (the farthest I've gone back).
```
# prime.rb
class Integer
# Returns the factorization of +self+.
#
# See Prime#prime_division for more details.
def prime_division(generator = Prime::Generator23.new)
Prime.prime_division(self, generator)
end
end
class Prime
def prime_division(value, generator = Prime::Generator23.new)
raise ZeroDivisionError if value == 0
if value < 0
value = -value
pv = [[-1, 1]]
else
pv = []
end
generator.each do |prime|
count = 0
while (value1, mod = value.divmod(prime)
mod) == 0
value = value1
count += 1
end
if count != 0
pv.push [prime, count]
end
break if value1 <= prime
end
if value > 1
pv.push [value, 1]
end
pv
end
end
```
This can be rewritten in more modern and idiomatic Ruby, to become much shorter
and easier to understand.
```
require 'prime.rb'
class Integer
def prime_division1(generator = Prime::Generator23.new)
Prime.prime_division1(self, generator)
end
end
class Prime
def prime_division1(value, generator = Prime::Generator23.new)
# raise ZeroDivisionError if value == 0
return [] if (value.abs | 1) == 1
pv = value < 0 ? [[-1, 1]] : []
value = value.abs
generator.each do |prime|
count = 0
while (value1, mod = value.divmod(prime); mod) == 0
value = value1
count += 1
end
pv.push [prime, count] unless count == 0
break if prime > value1
end
pv.push [value, 1] if value > 1
pv
end
end
```
By merely rewriting it we get smaller|concise code, that's easier to understand,
which is slightly faster. A *triple win!* Just paste the above code into a 2.5.0
terminal session, and run the benchmarks below.
```
def tm; s=Time.now; yield; Time.now-s end
n = 500_000_000_000_000_000_008_244_213; tm{ pp n.prime_division }
[[3623, 1], [61283, 1], [352117631, 1], [6395490847, 1]]
=> 27.02951016
n = 500_000_000_000_000_000_008_244_213; tm{ pp n.prime_division1 }
[[3623, 1], [61283, 1], [352117631, 1], [6395490847, 1]]
=> 25.959149721
```
Again, we get a *triple win* to this old codebase by merely rewriting it. It can
be made 3x faster by leveraging the `prime?` method from the `OpenSSL` library to
perform a more efficient|faster factoring algorithm, and implementation.
```
require 'prime.rb'
require 'openssl'
class Integer
def prime_division2(generator = Prime::Generator23.new)
return [] if (self.abs | 1) == 1
pv = self < 0 ? [-1] : []
value = self.abs
prime = generator.next
until value.to_bn.prime? or value == 1
while prime
(pv << prime; value /= prime; break) if value % prime == 0
prime = generator.next
end
end
pv << value if value > 1
pv.group_by {|prm| prm }.map{|prm, exp| [prm, exp.size] }
end
end
```
Here we're making much better use of Ruby idioms and libraries (`enumerable` and
`openssl`), leading to a much greater performance increase. A bigger *triple win*.
Pasting this code into a 2.5.0 terminal session gives the following results.
```
# Hardware: System76 laptop; I7 cpu @ 3.5GHz, 64-bit Linux
def tm; s=Time.now; yield; Time.now-s end
n = 500_000_000_000_000_000_008_244_213; tm{ pp n.prime_division }
[[3623, 1], [61283, 1], [352117631, 1], [6395490847, 1]]
=> 27.02951016
n = 500_000_000_000_000_000_008_244_213; tm{ pp n.prime_division1 }
[[3623, 1], [61283, 1], [352117631, 1], [6395490847, 1]]
=> 25.959149721
n = 500_000_000_000_000_000_008_244_213; tm{ pp n.prime_division2 }
[[3623, 1], [61283, 1], [352117631, 1], [6395490847, 1]]
=> 9.39650374
```
`prime_division2` is much more usable for significantly larger numbers and use
cases than `prime_division`. I can even do multiple times better than this, if
you review the above cited forum thread.
My emphasis here is to show there are a lot of possible *low hanging fruit*
performance gains ripe for the picking to achieve Ruby 3 performance goals, if we
look (at minimum) for simpler|better code rewrites, and then algorithmic upgrades.
So the question is, are the devs willing to upgrade the codebase to provide the
demonstrated performance increases shown here for `prime_division`?
--
https://bugs.ruby-lang.org/
Unsubscribe: <mailto:ruby-core-request@ruby-lang.org?subject=unsubscribe>
<http://lists.ruby-lang.org/cgi-bin/mailman/options/ruby-core>