Eliminating Or Merging Duplicate Rows In Numbers

If you have a large table with some duplicate rows in a spreadsheet, you can find them by sorting and using a formula to identify duplicate rows. If you need to merge the data in duplicate rows, you can use a formula for that too. You can then sort and delete those rows easily, saving hours of manual work.
Video Transcript / Captions
Closed captioning for this video is available on YouTube: Eliminating Or Merging Duplicate Rows In Numbers.

Something you may need to do in Numbers from time to time is to find duplicates. So say you have a list. I've got a list here with fruits in one column and then the number of them in the other. But there are some duplicates. If you look through the list there is apple several times, cherries several times, and a few other duplicates. So how can you find them?

Well, first let's sort by the column that has fruit in it. Now you may have other columns you want to sort, say the date or something like that, but for now let's sort everything by the column that has duplicates. Sure enough I can see that apples are there three times, cherries are there three times, and if I look enough I'll find some other duplicates here as well. There's orange twice.

Now if the list is huge you're not going to want to look through it manually. So how can you get it to be done automatically or semi-automatically so you can check and see where the duplicates are and get rid of them easily. So we're going to use a bit of logic and a simple function to figure this out.

If this here, A3, is the same as A2 then we know that A3 is a duplicate. So we'll start here in the second cell, because it doesn't make sense to do it in the first cell because there is nothing above it to compare it to. We'll start here in the second row, row three in this case, and we'll say, start a formula and say if this equals this. We're not using the if statement. We're just actually using the equals sign between the two. We hit return and we see it says true. If we copy and paste that here we'll see it says true for the next one because the third on is also a duplicate. We'll go here where it starts with apricot and that's false. Then true because it's the second apricot. False because avocado is by itself. Banana is by itself. Blackberry is by itself. Etc.

I can select all of the cells here, I just did that by double clicking on C column header, paste it in, and then we'll get trues and falses depending upon what's there. Of course it's not going to apply to the first one but we can leave it there because it says false at least. So everywhere there's a true we know we have a duplicate column.

Now we could make it a little nicer. Instead of just using this formula here and having it be a true or false we could use the if statement there, make the comparison the first part of the if statement and say if it is true put the word duplicate maybe with an exclamation point. If it's not just a blank. So two quotes there, so a blank piece of text. Now when we copy and paste that throughout we could see it's easier to tell where the duplicates are. We can see all of the duplicate rows very easily and we can go ahead and check them out and eliminate them.

Say you want to take it a step further. Say you don't just want to eliminate the extra rows, you want to merge them. So you want to have the total count of all the apples, 202+19+121. Now you can do that manually every time you see the word duplicate you can add them together. But it would be nice to have a formula to do it automatically. So let's increase by one more column here so we can work with this column. Let's put a formula here that's basically going to be the total. We're going put equals and then we're going to say the count, so the same number, then we're going to use plus then use an if statement to say if the value in the first column is equal to the value of the second column; in other words the next one is a duplicate. Then take the value of that. Now we don't want to take it from here. We want to actually take it from the same column that will add up successive rows because if you took it from here it would only add nineteen and it would never add the 121. We want it to add the nineteen but we want to have this one also have the 121. We'll see how it works in a second.

So we say D3, the row below. Then the other option is zero. The zero is when the next one is not the same. When we go from apple to apricot or cranberry to date then it's going to add a zero to it so it will not add the next one. We close the parentheses there. So now we go and you can see it says 202 still. You think it would add the nineteen. But it's trying to add this one. So we want to copy this and paste it everywhere. So now we see 342 which is the correct count. What is it doing? Well this first one here, the last row of apples, says 121 and the 192 does not match because apricot is different than apple. But here it's saying okay take the 19 and then apple and apple are the same so add that 121 and 140. Now here take the 202, then since apple and apple are the same add 140 to that you get 342 for the total.

So now what I want to do is I want to, instead of these beings the results of formulas, I want them to be actual real numbers that are in there. The same thing here. I don't want these to be results of formulas either. I going to move things around and sort them. So I'm going to select both of these columns, I'm going to do Edit, Copy, and then Edit, Paste Formula Results. So now this is really the word duplicate. This is really the number 342. They're not results of formulas.

Now the cool thing I can do is I can sort by the row C here in ascending order and at the bottom I'll get all the duplicate rows. I can select all of those, Delete them, and I'm left with all of the rows that have the correct count here in this column D here. I could then simply Copy and Paste it over here so I have updated count numbers or I could delete this column and let this on replace it. Delete these right here and now I've got everything there once with the complete total merged counts.

Comments: 5 Responses to “Eliminating Or Merging Duplicate Rows In Numbers”

    2 years ago

    This is brilliant!

    Chuck Cumiskey
    2 years ago

    Hmmm… I could follow you but I definitely couldn’t repeat you. Thanks for showing me how much more I need to learn about Numbers basics.

    2 years ago

    Gary, you are terrific. If I had to figure this out from the manual…it would never happen. Thank you soooooo much. PS: Can you add PayPal acct to your donation options please.

    2 years ago

    I’m wondering why you wouldn’t just use ‘sumif’ as in this example: https://discussions.apple.com/thread/7814195?start=0&tstart=0

    It seems simpler and would automatically update with revised underlying data (the only exception would be if a new type of item is added to the count list then it needs to be added to the summary list.

    2 years ago

    Pat: Not sure how this applies. People need to remove duplicate rows in Numbers and sometimes combine the information in them. SUMIF is great for calculating sums if you want to keep multiple items in their own rows, but only in that case.

Comments Closed.