Re: What DB to use for lots of categorized urls and domains? noSQL? SQL?

From: Eliezer Croitoru <eliezer@...>
Date: 2013-01-15 15:14:09 UTC
List: ruby-talk #403169
Thanks Bob,

To start from zero again:
the DB holds URL/domain filtering blacklists and whitelists, which
means it is queried frequently, mostly for a full match of a domain
or a URL/path.

The DB is pretty static right now, and I am adding a feature which will
make it more dynamic than it is now.
Currently I use a basic TC (TokyoCabinet) DB, with Redis for caching
results.
(The reason I'm not using only Redis is size.)

In the current state the DB is static for at least 12 hours which means 
I can trigger an update based on a simple "flag" for that.

I have one storage location which holds the whole base DB, and the
nodes copy the complete DB from there as files.

The DB will be separated into different parts:
- queue DB based on url requests live from the proxy.
- master DB which will get updates based on the queue DB.
- nodes DB which will be static and will get updates from the master DB 
based on changes.
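
A minimal Ruby sketch of that three-part flow, using in-memory stand-ins
rather than a real database. The class names (MasterDB, NodeDB), the
change log, and the idea that queued entries already carry a rating are
all assumptions made for illustration; none of them come from the actual
system.

```ruby
require 'thread' # Queue; a no-op on recent Rubies where it is built in

# Hypothetical in-memory stand-in for the master DB.
class MasterDB
  attr_reader :ratings, :changes

  def initialize
    @ratings = {}  # url or domain => rating
    @changes = []  # ordered change log that the nodes replay
  end

  # Apply one update and record it so nodes can replay it in order.
  def apply(key, rating)
    return if @ratings[key] == rating # nothing changed, nothing to replicate
    @ratings[key] = rating
    @changes << [key, rating]
  end
end

# Hypothetical in-memory stand-in for one node DB.
class NodeDB
  attr_reader :ratings

  def initialize
    @ratings = {}
    @applied = 0 # index of the next change-log entry to replay
  end

  # Pull only the changes this node has not seen yet.
  def sync_from(master)
    master.changes[@applied..-1].each { |key, rating| @ratings[key] = rating }
    @applied = master.changes.size
  end
end

queue  = Queue.new # queue DB: entries arriving live from the proxy
master = MasterDB.new
node   = NodeDB.new

# Assumed entry shape: [url_or_domain, rating].
queue << ["example.com", "blocked"]
queue << ["example.org/path", "allowed"]

# Drain the queue into the master, then let the node catch up.
master.apply(*queue.pop) until queue.empty?
node.sync_from(master)
```

Replaying an ordered change log is only one way to keep the nodes in
step with the master; it is shown here because it sends changes instead
of whole files.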

The queue DB will be under very heavy load (900 hits per second or more).

The master DB will be updated by a couple of sources at a rate of about
10k updates during rush hour.

The nodes must be updated from the master with a maximum latency of 2
seconds, since filtering is a big issue.

The nodes DB is queried at the rate of the incoming requests, which is
more than 1k requests per second.

The idea is to make updates possible in a more dynamic form than static
files.

On the master I must enforce some consistency, which will also apply to
the node updates, since content filtering is a pretty important issue.

Scan the whole DB?
It's a search DB: you look for a matching URL, a matching domain, etc.
From the MySQL point of view, the table should have an index and two
columns, one for the URL/domain and the second for the rating.
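
A minimal Ruby sketch of that kind of lookup, with a Hash standing in
for the indexed two-column table. The table contents, the rating values,
the `rating_for` name, and the "try the full URL/path first, then the
bare domain" order are assumptions for illustration, not the real schema
or matching rules.

```ruby
require 'uri'

# Hypothetical ratings table: key is a domain or a domain plus path,
# value is the rating. The Hash plays the role of the index.
RATINGS = {
  "example.com"            => "blocked",
  "example.org/ads/banner" => "blocked"
}

# Return the rating for a URL, or nil when nothing matches.
# Tries an exact domain+path match first, then falls back to the domain.
def rating_for(url, table = RATINGS)
  uri  = URI.parse(url)
  host = uri.host
  return nil if host.nil?
  table[host + uri.path] || table[host]
end

puts rating_for("http://example.org/ads/banner").inspect
puts rating_for("http://example.com/anything").inspect
puts rating_for("http://example.net/").inspect
```

Each request costs at most two exact-match key lookups, so no scan over
the whole data set is involved.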

Just as a note:
I have used MySQL for the static DB, but it was slower (1000%+ slower)
than TC, and the size of the MySQL DB was also +100%.
I tried tweaking MySQL in a couple of ways, which wasn't enough.

I may have done something wrong there, and I am open-minded about any
possible solution which can work well in this setup.

One of the problems is that the DB size will grow pretty fast when 
starting the system.

I can start simulating to get more specific numbers.
Just imagine more than 1k requests per second on an HTTP proxy for a
lot of users; double that in about 10 and you will get a basic view of
what to expect.

Take into account that I will release the code of most of the systems I
have been working on, such as the ICAP server and some other related code.

Eliezer

On 1/15/2013 4:21 PM, Bob Hutchison wrote:
> Hi,
>
> Here are a bunch of questions to ask yourself that cover off things that I've found helpful to know.
>
> You might think about why you need to move from your combination of TokyoCabinet and Redis. In particular, why not Redis. It's a little more clear why you'd move from TokyoCabinet, I've used it happily for years myself, so I can imagine a bunch of reasons.
>
>
>> The scale is in couple directions:
>> - multiple physical nodes the main DB stored on.
> Is this for reliability or performance? This sounds like a solution not a requirement.
>
>> - Database size
> How big do you think it'll be?
>
>> - requests per second
> What kind of request rate are you thinking?
>
> Do you care about latency? (you should) Throughput and latency are pretty much independent variables when it comes to databases.
>
>> - master and secondary updates\replication
> Again, this sounds like a solution not a requirement. What's the issue that makes you say this?
>
> What is your read/write ratio? What is your write rate? Are you updating or writing new data, and what's the ratio of update to write? Do you need secondary indexes? How many, what kind?
>
> If you write to the master then replicate there'll be a time period where the various nodes will provide different results. Can your application tolerate this? or do you need some kind of stronger consistency constraint?
>
> Are your updates/writes exposing you to consistency issues? (i.e. do you need transactions?) If you update (or even write) multiple records, it's possible that the updates arrive in an essentially random order to the replicas, and possibly in a different order to the different replicas.
>
>>
>> For now one machine will host the DB while it gets updates from couple sources such as human and other auto-testing tools.
>> This will be a dedicated DB machine while there are others servers which gets updates from the master DB when needed.
>> The problem is that the updates are live and should be replicated with the smallest delay possible.
> What does "when needed" mean given that the updates should be "as soon as possible"? I'm thinking that this master/slave setup you're thinking of is lifted from how you'd do it with TokyoCabinet or Redis. Things like Cassandra or Riak or HBase don't do it that way.
>
> Are you ever going to have to scan your whole database? How often?
>
> Cheers,
> Bob
>
