Re: uniq with count; better way?

From: Peter Vandenabeele <peter@...>
Date: 2012-01-17 11:08:28 UTC
List: ruby-talk #392303
On Tue, Jan 17, 2012 at 2:44 AM, Abinoam Jr. <abinoam@gmail.com> wrote:

> On Mon, Jan 16, 2012 at 10:05 PM, Abinoam Jr. <abinoam@gmail.com> wrote:
> >                     user     system      total        real
> > Ralph Shneiver:   0.290000   0.000000   0.290000 (  0.259640)
> > Sigurd:          0.320000   0.000000   0.320000 (  0.289873)
> > Keinich #1       0.560000   0.000000   0.560000 (  0.497736)
> > Keinich #2       0.280000   0.000000   0.280000 (  0.250843)
> > Magnus Holm:     0.310000   0.000000   0.310000 (  0.283344)
> > Abinoam #1:      1.140000   0.000000   1.140000 (  1.042744)
> >
> > Abinoam Jr.
> >
>
> Sorry for the mess... (some error in the bench code + horrible text
> wrapping)
>
> Apologizing with the gist...
> https://gist.github.com/1624016
>


Thanks. Very interesting.

I assumed (incorrectly, as it turns out for Ruby 1.9.3) that there would be a
significant performance difference for large data sets, because determining
the "bins" of uniq values is essentially a form of "sorting", which can be
O(n^2) if one is not careful.

Turns out I was wrong for ruby 1.9.3 (but right for some other rubies).

I rewrote the code to create large datasets (arrays of up to 10_000_000
elements), but with the values restricted to the range 0...100.

require 'benchmark'

n = 1
#ar = [4,5,6,4,5,6,6,7]
ar = [].tap { |a| 10_000_000.times { a << rand(100) } } # also run with 1_000_000, 2_000_000, ...
puts "SAMPLE of ar"
puts ar[0...20]
puts "SIZE"
puts ar.size
Benchmark.bm(15) do |b|
  b.report("Ralph Shneiver:") { n.times { result = Hash.new(0); ar.each { |x| result[x] += 1 }; result } }
  b.report("Sigurd:") { n.times { ar.inject(Hash.new(0)) { |res, x| res[x] += 1; res } } }
  b.report("Keinich #1") { n.times { Hash[ar.group_by { |n| n }.map { |k, v| [k, v.size] }] } }
  b.report("Keinich #2") { n.times { Hash.new(0).tap { |h| ar.each { |n| h[n] += 1 } } } }
  b.report("Magnus Holm:") { n.times { ar.each_with_object(Hash.new(0)) { |x, res| res[x] += 1 } } }
  b.report("Abinoam #1:") { n.times { Hash[ar.sort.chunk { |n| n }.map { |ix, els| [ix, els.size] }] } }
end



RUBY 1.9.3:
==========

For ruby 1.9.3-p0 all solutions perform approximately the same (at least the
same order of magnitude) and seem to stay quasi-linear.

SIZE
1000000  #[1_000_000 that is ; n = 1]
                      user     system      total        real
Ralph Shneiver:   0.180000   0.000000   0.180000 (  0.174283)
Sigurd:           0.200000   0.000000   0.200000 (  0.203652)
Keinich #1        0.140000   0.000000   0.140000 (  0.142833)
Keinich #2        0.180000   0.000000   0.180000 (  0.177456)
Magnus Holm:      0.200000   0.000000   0.200000 (  0.205895)
Abinoam #1:       0.260000   0.000000   0.260000 (  0.254554)

SIZE
2000000  #[2_000_000 that is ; n = 1]
                      user     system      total        real
Ralph Shneiver:   0.340000   0.010000   0.350000 (  0.350032)
Sigurd:           0.410000   0.000000   0.410000 (  0.406483)
Keinich #1        0.280000   0.000000   0.280000 (  0.285213)
Keinich #2        0.350000   0.010000   0.360000 (  0.354640)
Magnus Holm:      0.410000   0.000000   0.410000 (  0.411010)
Abinoam #1:       0.470000   0.030000   0.500000 (  0.498782)


SIZE
10000000  #[10_000_000 that is; n = 1 here]

                      user     system      total        real
Ralph Shneiver:   1.710000   0.040000   1.750000 (  1.748137)
Sigurd:           2.000000   0.010000   2.010000 (  2.012496)
Keinich #1        1.380000   0.030000   1.410000 (  1.409462)
Keinich #2        1.750000   0.010000   1.760000 (  1.760997)
Magnus Holm:      1.990000   0.020000   2.010000 (  2.014282)
Abinoam #1:       2.500000   0.060000   2.560000 (  2.562646)

All solutions in Ruby 1.9.3 are relatively "linear".

Keinich #1 is the fastest.

I believe this is because I limited the dataset to 100 different values: the
number of bins is fixed, so O(n) can be achieved (not the O(n.log(n)) that I
had incorrectly expected).
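
To illustrate the fixed-bins argument, here is a minimal sketch (not one of the benchmarked solutions; the method name count_with_fixed_bins is made up for this example). It assumes every value is an Integer in 0...bins, as in the generated test data, and counts in a single pass with no sorting or comparisons:

```ruby
# Count occurrences when every value is known to lie in 0...bins.
# One pass over the data, one array slot per bin: O(n) time, O(bins) space.
# count_with_fixed_bins is a hypothetical helper, not part of the benchmark.
def count_with_fixed_bins(values, bins)
  counts = Array.new(bins, 0)
  values.each { |v| counts[v] += 1 }
  counts
end

sample = [4, 5, 6, 4, 5, 6, 6, 7]
counts = count_with_fixed_bins(sample, 100)

# Convert to the Hash shape used in the benchmark, skipping empty bins.
result = {}
counts.each_with_index { |c, value| result[value] = c if c > 0 }
p result
```

The Hash-based solutions do the same amount of work per element as long as the hash lookup stays (amortized) constant time, which is why a fixed, small number of distinct values keeps the total cost proportional to the array size.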

RUBY 1.8.7:
==========

For ruby 1.8.7-p I could not test the last 2 solutions (methods not implemented).
For the first 3, _significant_ differences arise.

SIZE
1000000 #[1_000_000 that is ; n = 1]

                     user     system      total        real
Ralph Shneiver:  0.370000   0.010000   0.380000 (  0.369785)
Sigurd:          2.320000   0.030000   2.350000 (  2.360403)
Keinich #1       1.520000   0.040000   1.560000 (  1.562627)
Keinich #2      16.520000   0.100000  16.620000 ( 16.623032)

SIZE
2000000 #[2_000_000 that is ; n = 1]
                     user     system      total        real
Ralph Shneiver:  0.720000   0.010000   0.730000 (  0.737673)
Sigurd:          8.040000   0.110000   8.150000 (  8.142827)
Keinich #1       5.670000   0.040000   5.710000 (  5.716364)
Keinich #2      52.600000   0.480000  53.080000 ( 53.100823)

SIZE
10000000  #[10_000_000 that is ; n = 1]

                     user     system      total        real
Ralph Shneiver:  3.680000   0.000000   3.680000 (  3.680230)

Striking: only the solution of Ralph Shneiver remains linear in MRI 1.8.7.


JRUBY (1.6.5.1)
=============

$ ruby -v
jruby 1.6.5.1 (ruby-1.8.7-p330) (2011-12-27 1bf37c2) (OpenJDK Server VM 1.6.0_23) [linux-i386-java]

I also tested JRuby out of curiosity.

SIZE
1000000  #[1_000_000 that is ; n = 1]
                     user     system      total        real
Ralph Shneiver:  0.244000   0.000000   0.244000 (  0.220000)
Sigurd:          0.264000   0.000000   0.264000 (  0.264000)
Keinich #1       0.112000   0.000000   0.112000 (  0.112000)
Keinich #2      11.256000   0.000000  11.256000 ( 11.256000)

SIZE
2000000  #[2_000_000 that is ; n = 1]
                     user     system      total        real
Ralph Shneiver:  0.392000   0.000000   0.392000 (  0.368000)
Sigurd:          0.439000   0.000000   0.439000 (  0.438000)
Keinich #1       0.196000   0.000000   0.196000 (  0.196000)
Keinich #2      14.462000   0.000000  14.462000 ( 14.462000)

SIZE
10000000 #[10_000_000 that is ; n = 1]

                     user     system      total        real
Ralph Shneiver:  1.604000   0.000000   1.604000 (  1.581000)
Sigurd:          1.769000   0.000000   1.769000 (  1.769000)
Keinich #1       0.967000   0.000000   0.967000 (  0.967000)
Keinich #2       8.978000   0.000000   8.978000 (  8.978000)


Interesting again.

Those 2 solutions were not yet available on JRuby 1.6.5.1:

Magnus Holm:   NoMethodError: undefined method `each_with_object' for #<Array:0x18eb7b8>
Abinoam #1:    NoMethodError: undefined method `chunk' for #<Array:0x1719f30>

Ralph Shneiver remains linear again.

But then again ... Keinich #1 is significantly faster on 10_000_000 than the
others.
Keinich #2 does not seem very predictable?
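
One possible factor worth ruling out (a hypothesis only, not verified against the runs above): under Ruby 1.8 semantics, which JRuby 1.6 also follows in 1.8 mode, a block parameter that has the same name as an existing local variable assigns to that variable. The |n| blocks in Keinich #1 could therefore overwrite the outer repetition count n before Keinich #2 calls n.times. A sketch that sidesteps the question by using distinct names (reps and x are names chosen for this example):

```ruby
reps = 1 # repetition count, deliberately not named n
ar = [4, 5, 6, 4, 5, 6, 6, 7]

keinich2 = nil
reps.times do
  # Block parameters h and x do not collide with any outer local variable.
  keinich2 = Hash.new(0).tap { |h| ar.each { |x| h[x] += 1 } }
end

p reps     # no block parameter reuses this name, so it is never reassigned
p keinich2
```

If the timings for Keinich #2 become stable with names like these, the variation came from the repetition count and not from the counting itself.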

JRUBY HEAD (in rvm):
==================

$ ruby -v
jruby 1.7.0.dev (ruby-1.8.7-p357) (2012-01-17 7de254f) (Java HotSpot(TM) Server VM 1.6.0_26) [linux-i386-java]

SIZE
1000000  #[1_000_000 that is ; n = 1]

                     user     system      total        real
Ralph Shneiver:  0.238000   0.000000   0.238000 (  0.223000)
Sigurd:          0.286000   0.000000   0.286000 (  0.286000)
Keinich #1       0.120000   0.000000   0.120000 (  0.120000)
Keinich #2       7.841000   0.000000   7.841000 (  7.841000)

SIZE
2000000  #[2_000_000 that is ; n = 1]

                     user     system      total        real
Ralph Shneiver:  0.426000   0.000000   0.426000 (  0.410000)
Sigurd:          0.478000   0.000000   0.478000 (  0.478000)
Keinich #1       0.213000   0.000000   0.213000 (  0.213000)
Keinich #2      21.928000   0.000000  21.928000 ( 21.928000)

SIZE
10000000 #[10_000_000 that is ; n = 1]

                     user     system      total        real
Ralph Shneiver:  1.550000   0.000000   1.550000 (  1.535000)
Sigurd:          1.795000   0.000000   1.795000 (  1.795000)
Keinich #1       1.060000   0.000000   1.060000 (  1.060000)
Keinich #2     116.826000   0.000000 116.826000 (116.826000)

Similar results to JRuby 1.6.5.1

Caveat:
=======

I did not check the correctness of the result, only the timing.
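
A correctness check could look roughly like the sketch below. It runs each solution on the small sample array from the original question and compares against a hand-counted result. The lambda wrappers and the solutions table are scaffolding added for this example, and it needs Ruby 1.9 or later because of each_with_object and chunk:

```ruby
ar = [4, 5, 6, 4, 5, 6, 6, 7]

# Each solution wrapped in a lambda so it can be run on any input array.
solutions = {
  "Ralph Shneiver" => lambda { |a| result = Hash.new(0); a.each { |x| result[x] += 1 }; result },
  "Sigurd"         => lambda { |a| a.inject(Hash.new(0)) { |res, x| res[x] += 1; res } },
  "Keinich #1"     => lambda { |a| Hash[a.group_by { |x| x }.map { |k, v| [k, v.size] }] },
  "Keinich #2"     => lambda { |a| Hash.new(0).tap { |h| a.each { |x| h[x] += 1 } } },
  "Magnus Holm"    => lambda { |a| a.each_with_object(Hash.new(0)) { |x, res| res[x] += 1 } },
  "Abinoam #1"     => lambda { |a| Hash[a.sort.chunk { |x| x }.map { |ix, els| [ix, els.size] }] }
}

# Counted by hand from the sample array.
expected = { 4 => 2, 5 => 2, 6 => 3, 7 => 1 }

# Hash#== compares keys and values only, so key order and default values do not matter.
mismatches = solutions.reject { |_name, fn| fn.call(ar) == expected }.keys
puts mismatches.empty? ? "all solutions agree" : "mismatch: #{mismatches.join(', ')}"
```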

Questions:
========

Why would Ruby 1.9.3 be so much better at this than Ruby 1.8.7?
Could it be because the Hash is now "ordered", so it can use an efficient
algorithm when adding an entry to a bin? Or is it object creation?

Why are certain methods leading to non-linear behavior?

My conclusions (?):
===============

* be careful about performance with large data sets

* Fastest overall: Keinich #1 on JRuby 1.6.5.1

* CRuby 1.9.3 seems to be the only one that remains linear for all
  solutions on large data sets.

* "Ralph Shneiver" is the only solution that remains linear for
  all tested rubies (MRI 1.9.3, MRI 1.8.7, JRuby 1.6.5.1, JRuby head)
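
For anyone who wants to check linearity on their own interpreter, a rough sketch follows. It times the plain Hash.new(0) loop at a few sizes and prints seconds per million elements; a roughly constant figure suggests linear scaling. The method name scaling_report and the sizes are choices made for this example (kept small so it runs quickly), and single runs are noisy, so treat the numbers as indicative only:

```ruby
require 'benchmark'

# Time the counting loop at several input sizes.
# Returns one row per size with the elapsed time and time per million elements.
def scaling_report(sizes)
  sizes.map do |size|
    ar = Array.new(size) { rand(100) }
    seconds = Benchmark.realtime do
      result = Hash.new(0)
      ar.each { |x| result[x] += 1 }
    end
    { :size => size, :seconds => seconds, :per_million => seconds * 1_000_000 / size }
  end
end

rows = scaling_report([100_000, 200_000, 400_000])
rows.each do |row|
  printf("%10d  %.4fs  %.4fs per million\n", row[:size], row[:seconds], row[:per_million])
end
```

Swapping the body of the Benchmark.realtime block for any of the other solutions gives the same comparison for that solution.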

HTH,

Peter
