Home | Contact Us | FAQ | Search & Site Map | Link to Us
Sign In | Join | Other 45 Sites in Network
Home
Discussion Groups
General
GeneralPortable MacsHardwareNetworking
Applications
Mac ApplicationsEudoraFirefox / MozillaInternet ExplorerOutlook ExpressMS OfficeEntourageExcelPowerPointWordVirtual PCMedia PlayerOther MS Products
Programming
Mac ProgrammingCodeWarriorPerl
Country Specific
Australian Mac GroupUK Mac Group

Mac Forum / Applications / Excel / January 2006



Tip: Looking for answers? Try searching our database.

Delete mulitple entries

Thread view: 
Enable EMail Alerts  Start New Thread
Thread rating: 
grantman - 15 Jan 2006 01:12 GMT
I have an address list that has a ton of duplicates.  How do I, with
essentially one command, get rid of all the duplicates?

Thanks

Grantman
Helpful Harry - 15 Jan 2006 03:13 GMT
> I have an address list that has a ton of duplicates.  How do I, with
> essentially one command, get rid of all the duplicates?

You can't.

The best you can do is sort the list by whichever column(s) you're
using to determine "duplicates" and then manually double-check each one
and delete the entries that REALLY are duplicates.

Helpful Harry                  
Hopefully helping harassed humans happily handle handiwork hardships  ;o)
JE McGimpsey - 15 Jan 2006 06:23 GMT
> > I have an address list that has a ton of duplicates.  How do I, with
> > essentially one command, get rid of all the duplicates?
>
> You can't.

You're making blanket statement based on a large assumption about the
OP's data and intent. You may in fact be right, but if duplicate records
can be identified by a single field, it's quite easy to get rid of
duplicates by performing an advanced filter, extracting unique records
only...

That requires standard forms for that field, at least, but that's hardly
rare.

> The best you can do is sort the list by whichever column(s) you're
> using to determine "duplicates" and then manually double-check each one
> and delete the entries that REALLY are duplicates.

Again, that is a rather blanket statement based on an assumption that
may or may not be valid - the OP didn't give enough info to say.

And there are certainly techniques in addition to sorting and doing a
manual search that would make eliminating duplicates easier.
Helpful Harry - 15 Jan 2006 23:57 GMT
> > > I have an address list that has a ton of duplicates.  How do I, with
> > > essentially one command, get rid of all the duplicates?
[quoted text clipped - 19 lines]
> And there are certainly techniques in addition to sorting and doing a
> manual search that would make eliminating duplicates easier.

There may or may not be other techniques, but the problem is that many
so-called "duplicates" can only be assessed manually to make sure they
are in fact duplicates - this is especially true in an address list
where there can easily be two people with the same name, they may even
live at the same address (father and son, mother and daughter), but
aren't in fact true duplicates. Trying to automate such a process is
asking for some records that are wanted being incorrectly deleted.

Helpful Harry                  
Hopefully helping harassed humans happily handle handiwork hardships  ;o)
JE McGimpsey - 16 Jan 2006 03:14 GMT
> There may or may not be other techniques, but the problem is that many
> so-called "duplicates" can only be assessed manually to make sure they
[quoted text clipped - 3 lines]
> aren't in fact true duplicates. Trying to automate such a process is
> asking for some records that are wanted being incorrectly deleted.

Again, you're making assumptions and blanket statements that may not be
warranted, and your solution may not be any better than an automated one.

The OP's data may in fact be of the nature you cite, and it's worth
alerting the OP of the potential problem. But the data may *not* have
that problem, too, in which case there's no reason for the OP not to
avail himself of an automated solution.

But your example belies your argument - father/son or mother/daughter
with the same name and address are no more tractable using a manual
assessment than using an automatic technique. Unless the assessor has
personal knowledge of the individual or individuals, it's just a swag.
Helpful Harry - 16 Jan 2006 06:07 GMT
> > There may or may not be other techniques, but the problem is that many
> > so-called "duplicates" can only be assessed manually to make sure they
[quoted text clipped - 16 lines]
> assessment than using an automatic technique. Unless the assessor has
> personal knowledge of the individual or individuals, it's just a swag.

See, now you're making assumptions that there is no other data (eg.
birthdate) that makes it obvious that it is or isn't a duplicate ... in
fact if there is no other data you're best to leave it there in case it
isn't a true duplicate, rather than deleting someone that may be
important data.

We always have to make some assumptions because nobody ever explains
100% of the details. It's best to err on the side of caution than
simply saying "here's the automated process", which they then use and
possibly screw up their hard work of entering the data. The other way
is to ask lots of questions until you finally do know 100% of the
details.

Whether you like my answer or not, it was an appropriate answer.
Obviously any one answer may not be the only answer, but that's what
the newsgroup is for - to get multiple replies from people all over the
planet, some of whom may have better or different ideas. That's what
makes the newsgroups a great source of information ... at least the few
newgroups that aren't full of morons, spam and scams.

Helpful Harry                  
Hopefully helping harassed humans happily handle handiwork hardships  ;o)
CyberTaz - 16 Jan 2006 12:04 GMT
Hi Harry -

I'm not taking sides as both you & JE make quite valid points. But just for
clarification, as far as the Adv Filter;

1- If _any_ field of data varies (such as birth date) the record will be
considered Unique & copied to the specified location, even if all other
field content is identical, &

2- The word "extract" is misleading as the matching records are _copied_ to
the specified range and the original list is left completely intact.

Regards |:>)

On 1/16/06 1:07 AM, in article
160120061907134651%helpful_harry@nom.de.plume.com, "Helpful Harry"
<helpful_harry@nom.de.plume.com> wrote:

>>> There may or may not be other techniques, but the problem is that many
>>> so-called "duplicates" can only be assessed manually to make sure they
[quoted text clipped - 39 lines]
> Helpful Harry    
> Hopefully helping harassed humans happily handle handiwork hardships  ;o)
JE McGimpsey - 16 Jan 2006 13:41 GMT
> See, now you're making assumptions that there is no other data (eg.
> birthdate) that makes it obvious that it is or isn't a duplicate ...

Given the context of the discussion, in which I was pointing out that
you were assuming information not given, I did assume that you were
being explicit in your further example.

However, your added info again undermines your argument. If in fact
there's data which differentiates between records, such as a birth date,
then there's no need to manually assess the records to avoid deleting
non-duplicates, since Advanced Filter will consider them unique.

Obviously, that doesn't guard against keeping records that aren't
duplicates but refer to the same address. But since the OP said there
*were* duplicates, those weren't really under discussion.

> in fact if there is no other data you're best to leave it there in
> case it isn't a true duplicate, rather than deleting someone that may
> be important data.

May be true, may not be. There's no way to determine that without a more
explicit problem statement. However, it's hard to see how the duplicate
data could be truly important if there's no distinction between the
duplicate records.

> Whether you like my answer or not, it was an appropriate answer.
> Obviously any one answer may not be the only answer,

What I initially objected to was your statement:

> > You can't.
> >
> > The best you can do...

which, as I read it, makes a blanket statement about the "only answer".
That, as you correctly point out, is not usually the case.

Your points about the difficulty of scrubbing address lists are well
taken. But to revert solely to manually checking large lists with known
duplicates is almost never cost-effective, especially in a commercial
environment.
Helpful Harry - 16 Jan 2006 19:59 GMT
> > See, now you're making assumptions that there is no other data (eg.
> > birthdate) that makes it obvious that it is or isn't a duplicate ...
[quoted text clipped - 37 lines]
> duplicates is almost never cost-effective, especially in a commercial
> environment.

Whatever you say - I give up. I haven't got the time to waste
continuing on with this silliness.   :o\

Helpful Harry                  
Hopefully helping harassed humans happily handle handiwork hardships  ;o)
Paul Berkowitz - 30 Jan 2006 06:07 GMT
On 1/16/06 11:59 AM, in article
170120060859509557%helpful_harry@nom.de.plume.com, "Helpful Harry"
<helpful_harry@nom.de.plume.com> wrote:

> Whatever you say - I give up. I haven't got the time to waste
> continuing on with this silliness.   :o\

It would have been better to say nothing. This is not helpful - JE was. Not
silliness at all.

Signature

Paul Berkowitz
MVP MacOffice
Entourage FAQ Page: <http://www.entourage.mvps.org/faq/index.html>
AppleScripts for Entourage: <http://macscripter.net/scriptbuilders/>

Please "Reply To Newsgroup" to reply to this message. Emails will be
ignored.

PLEASE always state which version of Microsoft Office you are using -
**2004**, X  or 2001. It's often impossible to answer your questions
otherwise.

Christopher MacLeod - 15 Jan 2006 05:24 GMT
No way that I know of with a single command, but you can simplify the chore
of identifying the duplicates.

If you can use two or three fields two judge whether or not entries are the
same, then do something like the following:

Sort by the fields -- e.g., last name, first name, address (columns A,  B, &
C let's say) so that duplicates will appear in contiguous rows

In a new column, enter the formula

=AND(A2=A1,B2=B1,C2=C1)

Copy/paste the formula down through your entire list.

If the entries in the three fields of a given row do not match the entries
above them,  the formula column will read FALSE; if they are duplicates, the
column will read TRUE

Now copy the entire column containing the formulas, then paste special
(values only) into the same location. This is an essential step.

Next, resort your address list by the column containing the TRUE or FALSE
entries.

All the TRUE entries will be clustered. You can delete all the rows labeled
TRUE. They are duplicates.

Finally, delete the column that now contains just FALSE entries.

On 1/14/06 8:12 PM, in article
1137287526.997749.104810@o13g2000cwo.googlegroups.com, "grantman"
<allangrant@bellsouth.net> wrote:

> I have an address list that has a ton of duplicates.  How do I, with
> essentially one command, get rid of all the duplicates?

JE McGimpsey - 15 Jan 2006 06:14 GMT
> I have an address list that has a ton of duplicates.  How do I, with
> essentially one command, get rid of all the duplicates?

*IF* by duplicates you mean that the duplicates can be identified by a
single field (i.e., you're not trying to differentiate "John Q. Public"
from "John Public" and "John Queue Public"), you can extract the unique
entries to another worksheet using Advanced Filter (Unique entries
only). See the technique at Debra Dalgleish's Contextures site:

   http://www.contextures.com/xladvfilter01.html#ExtractWs
CyberTaz - 15 Jan 2006 15:59 GMT
Another tool that often goes unnoticed is the Date Form... Especially useful
if the issue is not as 'cut & dried' as the other techniques require. Still
time consuming, but particularly handy.

With your active cell in the Data Range, go to Data>Form, then click the
Criteria button. Type the criteria into the appropriate fields (note that
wildcards & comparison operators _can_ be used), then click Find Next. Use
the Delete button to delete the found record or click Find Next again to
skip over it. Of course this would have to be done for each set of criteria
& only deletes one record at a time, not the found set.

Two *cautions* - First, the Form will display the next record in the list
(regardless of whether it meets the criteria) after deleting the last record
that _does_ match the criteria, so don't just keep clicking Delete... Check
each record. Secondly, deleting records using the Form is _not_ an undoable
action, so be careful re point #1, otherwise the only way to recover a
deleted record is to close without saving, which will recover all deleted
records.

If you're looking for a Delete Duplicates button, however, there isn't one
in XL or most other 'responsible' software.

HTH |:>)

On 1/14/06 8:12 PM, in article
1137287526.997749.104810@o13g2000cwo.googlegroups.com, "grantman"
<allangrant@bellsouth.net> wrote:

> I have an address list that has a ton of duplicates.  How do I, with
> essentially one command, get rid of all the duplicates?
>
> Thanks
>
> Grantman
 
Sign In
Join
My Latest Posts
My Monitored Threads
My Blog
My Photo Gallery
My Profile
My Homepage

Start New Thread
Enable EMail Alerts
Rate this Thread



©2008 Advenet LLC   Privacy Policy - Terms of Use
This website includes both content owned or controlled by Advenet as well as content owned or controlled by third parties.