vicidial.org

by **sbenson** » Wed Aug 15, 2007 6:09 pm

I imported 3 different lists and I am wondering what constitutes "Bad" and "Duplicate" entries.

First we imported the 3 lists separately.

Code:
    List 1  GOOD: 3547  BAD: 149   TOTAL: 3696  DUPLICATE: 39  POSTAL MATCH: 0
    List 2  GOOD: 444   BAD: 56    TOTAL: 500   DUPLICATE: 0   POSTAL MATCH: 0
    List 3  GOOD: 1827  BAD: 1173  TOTAL: 3000  DUPLICATE: 1   POSTAL MATCH: 0

Then we made one big file and imported them all at once.

Code:
    Total   GOOD: 5818  BAD: 1378  TOTAL: 7196  DUPLICATE: 41  POSTAL MATCH: 0

We figured that "Bad" = dupes against the existing list and "Duplicate" = dupes within the current file. I know Bad = something + Duplicate, but what is that something?

Scott

by **mflorell** » Thu Aug 16, 2007 12:36 am

Bad should be that the phone number is less than 6 digits. This also could mean a formatting error for the record.

by **tbenson** » Fri Aug 17, 2007 1:57 am

Matt,

I ran the following select statement against the table the leads are coming from to create the lists.

Code:
    select * from $table where char_length(phone_number) != 10;

Unless I am wrong, this should show any records that are bad due to phone number length. However, it returns empty sets for all 3 tables we were using for the leads.
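The same kind of length check can also be run against the phone column before loading. The sketch below is only an illustration: the 6-digit threshold comes from the reply above, while stripping non-digit characters before counting is an assumption, and `find_short_phones` is a hypothetical helper, not part of the lead loader.

```python
import re

def find_short_phones(phone_numbers, min_digits=6):
    """Return (row_number, value) pairs whose digit count is below min_digits."""
    flagged = []
    for row_number, value in enumerate(phone_numbers, start=1):
        # Assumption: only digits count toward the length; punctuation is ignored.
        digits = re.sub(r"\D", "", value)
        if len(digits) < min_digits:
            flagged.append((row_number, value))
    return flagged

# Made-up sample values, not taken from the lists discussed in this thread.
sample = ["7275551212", "555-12", "", "(727) 555-1313"]
for row_number, value in find_short_phones(sample):
    print(f"row {row_number}: {value!r} has fewer than 6 digits")
```

If a check like this flags nothing, as the SQL query did, phone length is unlikely to explain the BAD count.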

Thanks

by **mflorell** » Sat Aug 18, 2007 9:40 am

Did you have any commas in any of the fields of the records you were importing?

by **tbenson** » Mon Aug 20, 2007 2:18 pm

Possibly in the address field. However, we changed the delimiter to | instead of a comma for the entire import file. Shouldn't the loader then ignore all commas, since we are inserting with pipes?

Or should we still surround all fields with quotes, just in case?

Thanks,
Trevor
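For comparison, here is how a general-purpose parser treats such a row. This sketch uses Python's standard `csv` module on a made-up record; it shows the behaviour one would expect from a delimiter-aware parser and says nothing about how the basic lead loader itself parses files.

```python
import csv
import io

# Made-up pipe-delimited record with a comma inside the address field.
sample = "Smith|John|123 Main St, Apt 4|Tampa|7275551212\n"

# Parsed with "|" as the delimiter, the comma is ordinary field content.
pipe_fields = next(csv.reader(io.StringIO(sample), delimiter="|"))

# Parsed with the default "," delimiter, the same line splits in the wrong place.
comma_fields = next(csv.reader(io.StringIO(sample)))

print(len(pipe_fields), pipe_fields[2])
print(len(comma_fields), comma_fields)
```

A parser that honours the pipe delimiter yields five fields with the address intact, while one that still splits on commas yields two mangled fields, which is the kind of formatting error that could get a record rejected.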

by **mflorell** » Wed Aug 22, 2007 9:22 am

The basic lead loader does not have as much file-parsing logic in it. Could you try the Super lead loader with a custom format and see if you get the same results on this file?

by **tbenson** » Wed Aug 22, 2007 1:58 pm

I just tried. When I load an xls file with header information and select the custom layout, I get the field chooser screen, but every prompt offers only 2 values to select from: (none) and "". It does not seem to open the xls file and read the header fields.

I tested this with XLS and TXT files. The XLS is a standard XLS provided to me by a client, which opens fine in OpenOffice. The TXT is pipe-delimited. Both come up with no fields available when in custom. I also edited the TXT file and added quotes around the header fields just in case. This TXT file also shows up with no fields in the chooser.

Thanks

by **mflorell** » Wed Aug 22, 2007 2:59 pm

What is the file extension on these files?

What build version of the super lead loader?

by **tbenson** » Wed Aug 22, 2007 3:32 pm

Each file has a file extension matching the criteria laid out in the Manager Manual. The pipe-delimited file has a txt extension and the Excel file has xls. These are lower case (the manual shows all-caps TXT and XLS; I assume this is not a requirement).

Thanks

by **mflorell** » Fri Aug 24, 2007 8:14 am

I have not had a problem like yours before. Could you send me a sample of one of your lead files off-list?

by **tbenson** » Fri Aug 24, 2007 12:23 pm

Just sent it over to mflorrel@. Not sure if that was the right address, but it is a sample of the Excel file I am having issues with.

Thanks

by **mflorell** » Fri Aug 24, 2007 2:35 pm

Thanks, I did receive it and will try to take a look at it in the next couple days.

by **tbenson** » Wed Nov 28, 2007 12:35 pm

mflorell wrote: Bad should be that the phone number is less than 6 digits. This also could mean a formatting error for the record.

Matt,

Found this to be incorrect. The logic in the lead loader does not increment correctly for duplicates, and duplicates get marked as bad. The duplicate count appears to be off as well. I took a sample file and ran some bash commands against it to see what was going on while Kevan looked at the lead loading code. Here is what we found:

Kevan noticed that the duplicate incrementing in the listloader is part of a loop. So if you increase the printout of bad errors from 10 to all, you will never see the duplicate value go over 1 in the last column, even when a number is duplicated 20 times in a file. Instead you get 20 individual lines, each with the correct row and phone number, and each showing duplicate as 1.

Also, when this file finished loading we would get 117 duplicates and 1847 bad (when inserted into a new list). When we inserted into an existing list we would get 117 duplicates and 2903 bad.

I then ran

Code:
    cat TEST.TXT | cut -d'|' -f5 | sort -n | uniq | wc -l

This came out to 8153 unique phone numbers, which matches the 1847 "bad" records we got when inserting into a new list (total rows minus unique numbers). So the lead loader appears to increment the total BAD whenever it finds a duplicate record. Running the command above without the word count, I find that this file has about 275-325 duplicated leads, repeated for a total of 1847 extra rows. I assumed 117 duplicates would be the number of leads that had duplicate records 1 or more times, but it's obvious I was incorrect.
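The same breakdown can be produced in one pass instead of re-running the pipeline with and without `wc -l`. This is a rough sketch that assumes, as in the `cut` command, that the phone number is the 5th pipe-delimited field and that every non-blank row has at least 5 fields; `summarize_phone_column` is a hypothetical helper name.

```python
from collections import Counter

def summarize_phone_column(lines, field_index=4, delimiter="|"):
    """Count total rows, unique numbers, repeated numbers, and surplus rows."""
    # Assumption: every non-blank line has at least field_index + 1 fields.
    phones = [
        line.rstrip("\n").split(delimiter)[field_index]
        for line in lines
        if line.strip()
    ]
    counts = Counter(phones)
    return {
        "total_rows": len(phones),
        "unique_numbers": len(counts),  # what sort | uniq | wc -l reports
        "numbers_repeated": sum(1 for n in counts.values() if n > 1),
        "surplus_rows": len(phones) - len(counts),  # rows beyond the first copy
    }

# Made-up rows for illustration only.
sample_lines = [
    "a|b|c|d|7275550001",
    "a|b|c|d|7275550002",
    "a|b|c|d|7275550001",
    "a|b|c|d|7275550001",
    "a|b|c|d|7275550003",
]
print(summarize_phone_column(sample_lines))
```

To run it on a real file, pass the open file object, e.g. `summarize_phone_column(open("TEST.TXT"))`. Here `surplus_rows` corresponds to the figure that matched the BAD count, and `numbers_repeated` to the count of distinct leads that appear more than once.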

Finally, when inserting into an existing list with prior leads we got 2903 bad records. A further look showed that an additional 1056 were being marked bad because we already had them in the list from other vendors. However, the loader still showed 117 duplicates, even though another 1056 unique records had been found as duplicates.
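One way to keep the two kinds of duplicate apart when checking a file by hand is to count them separately. The sketch below is a hypothetical reconciliation helper, not the lead loader's logic: `classify_leads` and its category names are made up for illustration, and the sample numbers are invented.

```python
def classify_leads(new_phones, existing_phones):
    """Split incoming numbers into good, in-file duplicates, and list duplicates."""
    existing = set(existing_phones)
    seen = set()
    result = {"good": 0, "dup_in_file": 0, "dup_in_list": 0}
    for phone in new_phones:
        if phone in seen:
            # Already encountered earlier in this same file.
            result["dup_in_file"] += 1
        elif phone in existing:
            # First occurrence in the file, but already present in the list.
            result["dup_in_list"] += 1
            seen.add(phone)
        else:
            result["good"] += 1
            seen.add(phone)
    return result

# Made-up numbers for illustration only.
new_file = ["7275550001", "7275550002", "7275550002",
            "7275550003", "7275550003", "7275550004"]
already_loaded = ["7275550003", "7275550009"]
print(classify_leads(new_file, already_loaded))
```

Under this split, the figures reported above would read as 1847 in-file duplicates plus 1056 duplicates against the existing list, totalling the 2903 that the loader reported as bad.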

by **mflorell** » Wed Nov 28, 2007 8:11 pm

Thanks for the in-depth analysis.

What version of astguiclient did you do this analysis on?

The lead loading code was mostly written by someone else, and the same logic is duplicated in several places in the code, so it is not the easiest thing to debug.

by **tbenson** » Wed Nov 28, 2007 8:24 pm

Vicidial 2.0.3. Still having issues with the super lead loader mapping fields, so sticking with the original loader until 2.0.4.

Trevor

by **mflorell** » Wed Nov 28, 2007 9:35 pm

There have been several changes in the lead loader in 2.0.4. You may want to try it on a test box to see if any of your issues have been fixed.

Dupe Vs Bad?