Adding a 90 MB CSV file to vicidial_dnc

Posted: Thu Oct 22, 2009 9:33 am
by Op3r
Has anyone ever tried loading a 90 MB DNC file into that table?

Or is anyone using a lead scrubber? What do you guys recommend?

Thanks

Posted: Thu Oct 22, 2009 10:47 am
by brett05
Wow, great, a 90 MB CSV file?
I think you must have collected every number in the world.
Please don't put my number in it :D

Posted: Thu Oct 22, 2009 4:21 pm
by okli
This would be maybe millions of records. Matt has mentioned several times that the DNC table is not intended for so many records and that leads should be scrubbed in advance, rather than relying on the DNC table.

As for scrubbing, it's quite easy and fast from the command line. In Windows, for example:

Code:
rem /v keeps only non-matching lines; /l treats each line of dnc.csv as a literal string
findstr /v /l /g:dnc.csv input.csv > output.csv


In Linux:
Code:
# -v keeps only non-matching lines; -F treats each line of dnc.csv as a fixed string, not a regex
grep -v -F -f dnc.csv input.csv > output.csv
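
One caveat with the pattern-file approach above: it matches the DNC digits anywhere on a line, so a lead can be dropped because some other field happens to contain a listed number, and a number formatted differently (dashes, parentheses) is missed. A minimal sketch of an exact-match scrub in Python follows. It assumes dnc.csv holds one number per line and that the phone number is the first column of input.csv; both layouts are assumptions, and `normalize` and `scrub_leads` are hypothetical helpers, not part of ViciDial.

```python
import csv


def normalize(number):
    """Keep digits only so '555-123-4567' and '5551234567' compare equal."""
    return "".join(ch for ch in number if ch.isdigit())


def scrub_leads(dnc_path, leads_path, out_path, phone_col=0):
    """Copy leads to out_path, skipping rows whose phone number is on the DNC list.

    phone_col is the assumed position of the phone number in each lead row.
    Returns a (kept, dropped) tuple of row counts.
    """
    # Load the DNC numbers into a set for constant-time exact lookups.
    with open(dnc_path, newline="") as f:
        dnc = {normalize(row[0]) for row in csv.reader(f) if row}
    dnc.discard("")  # ignore lines that contained no digits at all

    kept = dropped = 0
    with open(leads_path, newline="") as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            if len(row) <= phone_col:
                continue  # blank or short line, nothing to check
            if normalize(row[phone_col]) in dnc:
                dropped += 1
            else:
                writer.writerow(row)
                kept += 1
    return kept, dropped
```

Calling `scrub_leads("dnc.csv", "input.csv", "output.csv")` would mirror the one-liners above. The sketch holds the whole DNC list in memory, which is a trade-off to weigh against the size of the list.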

Posted: Thu Oct 22, 2009 7:26 pm
by tweidner6471
Yeah, wow, I wouldn't try loading that into the DNC table unless you never plan on using it. I can only imagine that this would really hurt performance when leads are loaded into the hopper.

Like everyone else said, scrub those leads first, either by some awk/sed or in a database.

Posted: Fri Oct 23, 2009 9:29 am
by ykhan
I have a client using the ViciDial DNC table with 17 seats at a 3:1 dial ratio and more than 2 million records, without any issues so far. I just use MySQL's LOAD DATA INFILE command from the command line to upload the numbers, though.

Posted: Fri Oct 23, 2009 3:30 pm
by Op3r
Yeah,

I use LOAD DATA INFILE in MySQL too.

The reason why I want to do this is because I want to.

I have loaded the 7 million DNC records and have yet to encounter any problems. :)

Posted: Fri Oct 23, 2009 3:49 pm
by tweidner6471
That's kind of surprising. If you don't mind my asking, what is your hopper level set to on the campaigns that check against the DNC table?

I'd expect the hopper.pl script to choke a bit on the first pass through. Looping through a few thousand new leads and doing a count against the DNC table for each one sounds a little slow to me. I just ran a check of 1.5 million leads against a DNC table of 500 thousand records, the same way the hopper.pl file does, and it took around 1 second to execute. Maybe I'm just a performance monger and have issues with queries running longer than a few seconds, heh.
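
The per-lead check described above amounts to one COUNT query for every lead. A rough sketch of that pattern next to a single set-based query follows. It uses SQLite in place of MySQL purely so it runs anywhere, the `dnc` and `leads` tables and their columns are illustrative assumptions, and none of this is the actual hopper code.

```python
import sqlite3


def build_demo_db(dnc_numbers, lead_numbers):
    """Create an in-memory database with hypothetical dnc and leads tables."""
    conn = sqlite3.connect(":memory:")
    # The primary key on phone_number gives the DNC lookup an index.
    conn.execute("CREATE TABLE dnc (phone_number TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE leads (lead_id INTEGER PRIMARY KEY, phone_number TEXT)"
    )
    conn.executemany("INSERT INTO dnc VALUES (?)", [(n,) for n in dnc_numbers])
    conn.executemany(
        "INSERT INTO leads (phone_number) VALUES (?)", [(n,) for n in lead_numbers]
    )
    return conn


def dnc_hits_per_lead(conn):
    """One COUNT query per lead: the looping style described in the post."""
    leads = conn.execute(
        "SELECT lead_id, phone_number FROM leads ORDER BY lead_id"
    ).fetchall()
    hits = []
    for lead_id, phone in leads:
        (count,) = conn.execute(
            "SELECT COUNT(*) FROM dnc WHERE phone_number = ?", (phone,)
        ).fetchone()
        if count > 0:
            hits.append(lead_id)
    return hits


def dnc_hits_set_based(conn):
    """The same answer from a single join, letting the database do the loop."""
    rows = conn.execute(
        "SELECT l.lead_id FROM leads l "
        "JOIN dnc d ON d.phone_number = l.phone_number "
        "ORDER BY l.lead_id"
    )
    return [row[0] for row in rows]
```

Both functions return the same lead IDs. The difference is the number of round trips to the database, which is where a per-lead loop would be expected to cost time as the batch of new leads grows.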

At any rate, it just proves how awesome ViciDial really is. I have a TouchStar server that, while nice, doesn't have the flexibility and power of this system, and it literally runs over $1,000 an agent for licensing alone.

Posted: Fri Oct 23, 2009 5:44 pm
by okli
This would greatly depend on the MySQL server's muscle and how loaded it is.

Op3r might be able to afford putting 7 million records in the DNC table on a powerful, unstressed machine, but I wouldn't on our weak, tiny, modest DB server :)

Posted: Fri Oct 23, 2009 6:11 pm
by mflorell
We have a client that loaded over 3 million records into the DNC table on a single server system and they didn't really have any issues either.

The problems start happening at higher levels really. If you tried to put the 170,000,000 record USA FTC DNC list in your ViciDial server it would not work very well no matter what kind of hardware you were using.

Posted: Sat Oct 24, 2009 11:46 am
by Op3r
Also, only 10-15 agents are using the server, so I don't see any harm in it.

I am currently thinking about having a DNC scrubber created and having it automatically upload the leads to ViciDial. This might be a good idea after all.

Now to get the FTC DNC list. That's another question.

Posted: Mon Oct 26, 2009 10:01 pm
by konextu
Op3r, what file format are your leads in before loading into ViciDial?

Posted: Mon Oct 26, 2009 10:02 pm
by williamconley
mflorell wrote: We have a client that loaded over 3 million records into the DNC table on a single server system and they didn't really have any issues either.

The problems start happening at higher levels really. If you tried to put the 170,000,000 record USA FTC DNC list in your ViciDial server it would not work very well no matter what kind of hardware you were using.
On the other hand, if your SQL server were a MONSTER (and a separate one at that, of course), you could probably get away with it. But the question would be ... why? It's so much easier to pre-wash the leads before bringing them in (and then re-wash them every night after shift ...).

I even have a few boxes with a "disabler" that refuses to use new leads until they've been washed (keeps the managers honest). The wash runs on a timed basis, for new leads only, so the hopper script doesn't have to do it. It's handled by MySQL directly, which is MUCH freakin' faster, and it can be set to run only when CPU is under 50% total usage. So no problems across the board, except that a new list can't be used until server load drops enough to wash and enable it. (If the list is washed FIRST, the list washer "pre-enables" the list so it can be used immediately, and that runs on another server entirely, so there's no hit at all.)
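
The wash-then-enable idea described above could look roughly like the sketch below. It is only a guess at the general shape, not the poster's actual setup: SQLite stands in for MySQL so the example runs anywhere, the `washed` flag, the 'NEW' and 'DNC' status values, and all table and function names are hypothetical, and the CPU-load gate is left out.

```python
import sqlite3


def create_schema(conn):
    """Hypothetical tables: a DNC list and leads that carry a 'washed' flag."""
    conn.execute("CREATE TABLE dnc (phone_number TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE leads ("
        " lead_id INTEGER PRIMARY KEY,"
        " phone_number TEXT,"
        " status TEXT DEFAULT 'NEW',"
        " washed INTEGER DEFAULT 0)"
    )


def wash_new_leads(conn):
    """Flag unwashed leads that are on the DNC list, then mark the batch washed.

    Runs as one transaction and returns the number of leads flagged as DNC.
    """
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE leads SET status = 'DNC' "
            "WHERE washed = 0 "
            "AND phone_number IN (SELECT phone_number FROM dnc)"
        )
        flagged = cur.rowcount
        conn.execute("UPDATE leads SET washed = 1 WHERE washed = 0")
    return flagged


def dialable_leads(conn):
    """Only washed leads that were not flagged are offered for dialing."""
    rows = conn.execute(
        "SELECT lead_id FROM leads "
        "WHERE washed = 1 AND status = 'NEW' ORDER BY lead_id"
    )
    return [row[0] for row in rows]
```

Because `dialable_leads` ignores anything with `washed = 0`, a freshly loaded list stays unusable until `wash_new_leads` has run, which is the "disabler" behaviour in miniature.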