How To Find Duplicates In Numbers

Learn how to find duplicate rows in Numbers using various functions and formulas. You can also deal with the duplicates by deleting them or combining the data in them.
You can also watch this video at YouTube.
Watch more videos about related subjects: Numbers (200 videos).

Video Transcript

Hi, this is Gary with MacMost.com. Today let me show you how to deal with duplicates in Numbers.
MacMost is brought to you thanks to a great group of more than 800 supporters. Go to MacMost.com/patreon. There you can read more about the Patreon Campaign. Join us and get exclusive content and course discounts.
So let's say you have a list of names in Numbers but some of the names are duplicates and you want to get rid of the duplicates or maybe do something else with them. So let's start off with this list here. You could see it's a list of random names but right away you can spot that there are some duplicates in there. So let's start by identifying them. One way to do that is to use the Match function. So we'll start off here in cell B2, hit the equals key to start a formula, and then we'll use match and then we can look up Match over here to see the parameters. You could see the first is Search For and then Search Where, and then Matching Method. Matching Method is actually going to be import here. 
So let's start with what we're searching for. We're searching for the name in that row. Then where are we searching. Well we're going to search column A for it. Then let's just use the default Matching Method for now. So I'll close that up and you could see that this name is found in Row 2 which is correct. Let's Copy that and then I'll double click on column B here just to select the cells, not the Header cell, and do Command V to Paste. So you could see here the second one, which is a duplicate, gives us an 8 instead of 3, because it found a duplicate here in row 8. Now what would be better is if it actually put a 3 here saying that this one was fine but here in row 8 it said, oh this is a duplicate there's one in row 3. So we can use the Matching Order for that. So instead of the default Matching Method being find largest value, we'll go the opposite and find the smallest value. Now I'll Copy and Paste this throughout and now you could see in row 3 we have a 3. But in row 8 we also have a 3. So it recognizes that the first copy of this name is actually in another row not in the current row.
So let's make this more useful by putting this in an IF statement. So we'll say IF and then we'll take that matching row number there and let's compare it to the current row which is simply just the function row without anything in it. Now if these match, in other words if we search for this name and we find it's in row 2 and this is also row 2, then this is what happens when it's true. So in this case let's just put two double quotes to actually have nothing in the cell. But if it's false then we could say Duplicate. Now if I copy this and paste it throughout we could see that row 8 says Duplicate. We can look further down and there are others here. This is a duplicate as well. This is a duplicate as well. We could see this name is actually here three times so there's two duplicate copies of it.
Now what would be even better is if we could see where the actual original was. We can do that by altering the results here. So we can say Duplicate of Row. Then I'm going to use the ampersand to append more text to it. Then use the same match function here, paste that in, and now if I Copy that and Paste that throughout you can see this now reads Duplicate of row 3. We could see this is a duplicate of row 14. So we can really easily find out there's the original. We could see these two are both duplicates of row 23. So now it's very useful in trying to figure out where exactly the duplicate is. 
Now let's say there's more than one cell that you need to compare for a duplicate. So let's insert a new column and we'll call this one Amount, just some sort of other number something else that goes along with the name here. Let me put a bunch of random numbers in here. Now let's say we only want to find duplicates where the name and this number match another row. So the way to do that is to add another column. We'll just call this Combo. In here we'll put a formula where we just take the value of this column, then use an ampersand here and maybe a comma as a separator like that, and then the value of this column. So it kind of combines those. I'll Copy and Paste throughout. Now in here instead of comparing column A we'll compare column C. So I'll go here and I'll change this value to that and I'll select this value and change it to that and I'll do the same thing here changing these. Just so they're in Column C now rather than Column A. Now if I Copy and Paste you could see it no longer sees this as a duplicate of this because they use different numbers. But if I scroll down I could see it does find this one here. You could see this matches what's in row 23. Now all you need to do is simply Hide this column, so you don't even notice it's there. Now you can get your duplicates listed here. 
Let's look at another way to do this that could be more useful in some situation. Instead of Match we're going to use COUNTIF. This will give us the number of rows that contain this name. So here we'll use COUNTIF and we could look that up over here. We could see it takes a Test Array and Condition. So the Test Array would be all of column A. The Condition is this value here. Then we'll see it only found one of this name. If I Copy and Paste throughout we could see in this case it found two of these and also shows a 2 here. Then you could see here, in this case, if found 3 of these and it shows that here as well. Now this doesn't play favorites as to which one is first and which one is second. It just shows you a count. What you could do with this then is to Filter. So with this table selected I could go to Organize, Filter, add a Filter on this second column here and say IF the number is greater than 1. Now when I implement that filter you could see it only shows me the rows where the number is greater than 1. So I could see all my duplicates. The great thing is that it hasn't resorted everything. Everything is still in the same order. As a matter of fact you could still see the row numbers here to the left. So now I could manually deal with each one of these and say which one do I want to keep. Assuming that there are more columns with other data as well you could make decisions one-by-one to try to figure out what to do with these duplicates. 
Notice if I get rid of a duplicate, like I say I want to get rid of this second copy here, I could go in and I can Delete this row and notice that the first one goes away because the count of that is now a 1 and that doesn't fit the filter anymore.
So one final example. Let's say that you want to sum up the amount that each person has but you want to take into account duplicates. So you could do that with SUMIF. So I could do equals SUMIF and let's look it up over here. We could see its Test Value is Condition and then Sum of Values. So similar to COUNTIF but there's an extra parameter. So now we'll do, for Test Values, we'll test column A. The Condition is does it match the actual cell from Column A for this row. The Sum values are going to be the amounts here. So when I use this you could see this first one the sum is just the same  as this one row. But when I paste it throughout you could see here, for instance, this one gives me a 97 which is the sum of 71 and 26. So both of these show a 97. So what will be useful then is to also see the duplicates here so I can eliminate them. So I can either use my Match formula here or COUNTIF. So let's use COUNTIF here so I can easily see the groups of ones that are duplicates. I'll Copy that throughout and now if I want I could Filter these like I did before and now see which ones are duplicates. You can notice that the sums are always the same for these duplicates. If my goal is to actually to get rid of the duplicates here what I could do is select this column here and do Edit, Copy, and then Edit, Paste Formula Results. So now instead of the actual formula being in here you have the real number. The formula is now gone. So if I were to actually get rid of the duplicates, like say let's get rid of row 8 here, you could see instead of having 71 I've got 97 here. So that number of the sum here. So I could eventually switch this to the Amount column and get rid of these. 
So there are various different techniques for dealing with duplicates inside of Numbers. 

Comments: One Comment

    Eric
    5 years ago

    Great stuff! Those techniques should allow me to tidy up a database that I have and know to contain duplicates.

Comments are closed for this post.