AudienceBloom

CALL US:  1-877-545-GROW

The Blog

Ready to get serious about growing your website? Get a free proposal

How to Match Domain Names in Excel to Help with Creating a Disavow List

Published by | 1 Comment

Excel can be an extremely powerful tool for SEO professionals. Recently, I got a phone call from a friend with a perplexing problem. He had a huge list of backlinks in Excel and he didn’t know which ones to disavow. This is a common problem these days. Many of us have seen our rankings tank on Google and want to disavow the bad backlinks to get our rankings back.

With a 300 domain-name disavow list, my friend wanted to highlight the cells that should be disavowed. With over 10,000 backlink URLs to review, this was far too big a job to do by hand and he thought Excel would be able to handle it. He also wanted a method that could be updated in the months to come as more disavow lists are discovered.

I’ve had experience with Excel and it seemed like the request was definitely doable. This post covers how we solved the issue.

This process can be used any time you have two lists that you need to match. It could be a list of domain names as in this situation or it could be matching a keyword within a list of domains.

Keeping it Simple

One of the first things I did when I got the spreadsheet was review both of the lists. The disavow list was pretty simple and contained a list of URLs (see image).

domain

I then compared this to the list of 10,000 backlinks.

source url

One of the first things I noticed was that I was looking at apples and oranges. To make it easier to compare the two lists, I needed to find the root domain for each cell in both lists. It is much easier to compare “askdives.com” to “askdives.com” than to “http://askdives.com/what-is-this.html”.

Part 1: Cleaning up the Disavow List

First, I started working on the disavow list. Here’s a walkthrough on how to acquire your disavow list. I used the following formula (assuming the first item in the list was in field A2) to remove everything but the domain.

=IF(NOT(ISERROR(FIND(“www.”,A2))),RIGHT(A2,(LEN(A2)-11)),RIGHT(A2,(LEN(A2)-7)))

I placed this formula in field B2. It stripped out all the information I didn’t need and left me with a clean domain name.

clean domain

Figure 1 – The formula stripped the URL in column A to the domain in column B.

Once I verified this formula worked, I then copied it to the rest of the cells in the column.

Part 2: Cleaning up the Backlink List

To save time, I found the following formula on the internet to remove all of the irrelevant information and leave me with the domain name.

=IF(ISNUMBER(FIND(“www.”,D2)),MID(D2,FIND(“www.”,D2)+4,FIND(“/”,D2,9)-FIND(“www.”,D2)-4),MID(D2,FIND(“//”,D2)+2,FIND(“/”,D2,9)-FIND(“//”,D2)-2))

clean domain 2

Figure 2 – The formula stripped the URL in column D to the domain in column E.

This formula left me with a clean domain list that I can easily compare to the backlink list (see column E). Note that this formula assumes that the list of URLs is in column D. If your list of URLs is in column A, change every instance of D2 (within the formula) to A2.

Part 3: Comparing the lists

Now that I have used the two formulas and created two columns of “clean” domain names, I have something I can easily use for comparison purposes. Since my friend needs matching domains to be highlighted, I used the conditional formatting function, which is on the home tab for Microsoft Excel 2010.

To start the comparison process, I selected the cells in the backlinks column that I wanted to have highlighted. In my example, this happens to be column D.

comparing-lists

Select the conditional formatting button and click “Manage Rules…” from the list.

conditional-formatting

Once the next dialog box appears, make sure “Show formatting rules for” is set to “Current Selection” (first red box in diagram below).

Select the “new rule…” button (second red box in diagram below).

new-rule

At this point, you are ready to set up your new rule.

Select the rule type in the list, “Use a formula to determine which cells to format”.

Enter the following formula into the box under “Format values where this formula is true:”

=ISERROR(VLOOKUP(E2,$B2:$B5,1,FALSE)) = FALSE

formatting-rule

The formula is one of the most important parts of the process. Let’s break this down by command.

=ISERROR(VLOOKUP(E2,$B2:$B5,1,FALSE)) = FALSE

VLOOKUP, also known as vertical lookup, is a Microsoft Excel command that “looks for a value in the leftmost column of the table, and then returns a value in the same row from a column you specify. By default the table must be sorted in ascending order.”

The format for vlookup is:

=VLOOKUP (lookup_value, table_array, col_index_num,[range_lookup])

Lookup Value

=VLOOKUP (lookup_value, table_array, col_index_num,[range_lookup])

In the image below, the lookup value is in cell E2 (see red box) which is the first clean domain name from the backlink list.

lookup-value

Table Array

=VLOOKUP (lookup_value, table_array, col_index_num,[range_lookup])

The table array is the list of domains that you wish to disavow (see disavow column). In this case, the range goes from B2 to B5 so in Excel it is listed as $B2:$B5.

Column Index

=VLOOKUP (lookup_value, table_array, col_index_num,[range_lookup])

The column index is “1” because the first (and only) column contains the answer.

Range Lookup

=VLOOKUP (lookup_value, table_array, col_index_num,[range_lookup])

The last option, “range lookup” is important. If you select “true” for the range lookup, the formula will do an approximate match. This can give you unintended results. As a general rule, I like to use false for range lookup which gives you an exact match in the formula.

ISERROR

=ISERROR(VLOOKUP(E2,$B2:$B5,1,FALSE)) = FALSE

The ISERROR command checks to see if an error occurs within the formula and returns TRUE or FALSE. In this case, we set the variable to FALSE because we want the result to end up being FALSE = FALSE which makes a TRUE result.

We do this because conditional formatting in Excel requires a TRUE formula in order to highlight the given cell.

Picking Your Favorite Color

At the bottom of the form, select the “Format…” button and pick the color that you wish to use for your highlights.

fav-color

Select OK when done.

Select OK to close the Conditional Formatting box.

Once you click OK, Excel should highlight every cell in column D that contains a match in column B.

highlighted

Note: In this given example, the match happens to be on the exact same line. In reality, this will rarely be the case.

Conclusion

This is a simple way to identify backlinks to disavow based on the domain. This report can be great for clients or to use on your own sites. These formulas, when used together, will allow you to match two columns of domains to create a nice presentation with highlighted duplicates.

avatar

Jayson DeMers

Jayson DeMers is the Founder & CEO of AudienceBloom. You can contact him on LinkedIn, Google+, or Twitter.

Looking to grow your traffic?

Our managed SEO and social campaigns and high domain authority link building will increase your presence and organic search engine traffic.

Request a rate card

Want more great resources?

Check out our new Resource Library, with over 100 expert articles spanning all aspects of online marketing, divided into 16 chapters.

See our Resource Library
  • Willy G,

    Hi Jayson.

    Great work! I still have a question!

    If i have a list A (with domains : domain.com) and a list on B with links (links who contains domains from A, but not all links from B contains domains from A), how can i select all links from B which contains domains from A?

    Thank you so much!

Success! We've just sent an email containing a download link for your selected resource. Please check your spam folder if you don't receive it within 5 minutes. Enjoy!

Love,

-The AudienceBloom Team