It’s been a while since I published anything on SEO, but in the last couple of months, I made some discoveries on this blog and implemented several changes that actually increased my traffic. As a matter of fact, everything we’ll be discussing in the next paragraphs is a practical tip I personally applied.
Sometime around March or thereabouts, I experienced a serious drop in traffic, and ever since, I had been struggling to get things back in shape to no avail. Sometimes I would blame the drop on my changed domain name, and at other times I felt it was a result of the unending Google algorithm updates that have been nothing but bad news lately. I never knew the problem was actually poor on-page search engine optimization.
On-Page SEO Issues
Well, I’ve always tried to implement the best on-page SEO practices, but there are times you have to go beyond what’s written in the book. I was using Genesis’ default SEO feature, but as time went on, I realized it just wasn’t enough. I discovered problems it couldn’t fix, so I went ahead and created a workaround.
Pagination on the homepage led to duplicate meta tags, and attachment pages were being indexed in search. As explained in this post, those issues were fixed. However, there were still loads of unidentified problems at the time.
What is Internal Duplicate Content?
When we talk of duplicate content, the first thing that comes to mind is probably a fellow blogger stealing your articles. That’s external duplicate content, and unless you’ve got a stronger SEO standing, the plagiarist may outrank you on the SERPs. In such cases, contacting the blogger or reporting to Google is always the best approach.
Internal duplicate content happens when your own content is unknowingly duplicated on other pages of your site.
Causes of Internal Duplicate Content
There are several causes of internal duplicate content issues on WordPress, and they include:
1. Plugins generating URL parameters: One such plugin is MobilePress, which I had been using for a very long time. It appends parameters such as ?mobile, ?nomobile and ?comment=true to URLs, and unknown to quite a lot of users, this can be a major SEO problem.
Another such plugin is DW Question & Answer. This adds a parameter like ?dwqa-embed=true to every question page. There are others I may not be able to identify right now.
All these contribute to on-page SEO problems that may eventually get you penalized for duplicate content.
2. Comment pagination: If you’re using comment pagination, you should probably disable it in your WordPress settings under Discussion. For sites that generate lots of comments, paginating the comments may seem like a good idea, but each comment page simply duplicates the original post.
3. www and non-www website address: If your website is available both with www and without it, you should choose the one you prefer in your WordPress settings and redirect the unused version to the other. Making your website available on both versions without a redirect is a big problem if not taken care of.
WordPress automatically adds rel="canonical" to fix this, but if you’re using a mobile plugin, the tag may be absent, leading to duplicate content.
4. Author archives, date archives, tag & category pages: All of these should already be de-indexed using your SEO plugin. Having them on Google may lead to duplicate content issues. If you have lots of tags and categories, you’ll find you’ve got a whole lot of useless pages all over Google, and these may be prioritized over your actual post pages.
There may be more causes than those mentioned, but in my case, these were the ones I was able to identify.
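All the causes above share one pattern: many different URLs serving the same page. As a minimal sketch (using hypothetical example.com URLs and the parameters named above), stripping the query string shows how the variants all collapse back to a single canonical address:

```python
from urllib.parse import urlsplit, urlunsplit

def canonical_url(url: str) -> str:
    """Drop the query string and fragment so that parameterized
    variants collapse to one canonical address."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

# Hypothetical variants created by the plugins discussed above
variants = [
    "https://example.com/my-post/?mobile",
    "https://example.com/my-post/?nomobile",
    "https://example.com/my-post/?comment=true",
]

# All three variants resolve to the same canonical URL
canonicals = {canonical_url(u) for u in variants}
print(canonicals)  # {'https://example.com/my-post/'}
```

To a search engine, though, each variant is a separate page with identical content, which is exactly the problem.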
How to Find Internal Duplicate Content
I used two easy methods to detect internal duplicate content on my blog:
1. A simple site search on Google: An example is simply searching for “site:geek.ng” (without the quotes) on Google.
This should reveal the pages on your website, and from there, you can determine if there are duplicate or low-quality pages that should be removed.
2. Checking HTML Improvements in Google Webmaster Tools: This section in Google Webmaster Tools tells you about duplicate meta tags, and from there, you should be able to see pages with duplicate content.
As seen below, I got quite a number of duplicate pages as a result of URL parameters:
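The duplicate-title report is essentially a grouping of pages by identical meta titles. As a rough illustration of that grouping (sketched in Python with hypothetical (url, title) pairs, not real crawl data):

```python
from collections import defaultdict

def find_duplicate_titles(pages):
    """Group URLs by their title; any title shared by more than
    one URL points to a likely internal duplicate."""
    by_title = defaultdict(list)
    for url, title in pages:
        by_title[title].append(url)
    return {t: urls for t, urls in by_title.items() if len(urls) > 1}

# Hypothetical crawl output: (url, title) pairs
pages = [
    ("https://example.com/my-post/", "My Post"),
    ("https://example.com/my-post/?mobile", "My Post"),
    ("https://example.com/about/", "About"),
]
dupes = find_duplicate_titles(pages)
print(dupes)  # {'My Post': ['https://example.com/my-post/', 'https://example.com/my-post/?mobile']}
```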
Effects of Internal Duplicate Content
As mentioned earlier, the whole problem started earlier this year. I kept applying every fix I could think of without any major success, and it got even worse around August. The image below shows how badly internal duplicate content can affect your blog. A few days after fixing the underlying issues, this is what I got from Google Analytics:
How to Fix Internal Duplicate Content
1. Install a good SEO plugin & configure correctly: I recommend WordPress SEO by Yoast since it covers almost every aspect of on-page SEO that should be taken care of by a plugin.
Be sure tag & category pages plus all archives are de-indexed. This includes author archives and date archives.
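When an archive is set to noindex this way, the plugin adds a robots meta tag to the page head, along these lines (a simplified illustration of the output, not the plugin's exact markup):

```html
<!-- In the <head> of a de-indexed archive page: tells search engines
     not to index the page, but still to follow its links -->
<meta name="robots" content="noindex,follow"/>
```

You can confirm it's working by viewing the source of a tag or category page after saving the settings.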
2. Use URL removal tool in Google Webmaster Tools (carefully): If your category and tag pages are already on search, perform a directory removal for these:
http://www.yoursite.com/category/
http://www.yoursite.com/tag/
http://www.yoursite.com/author/
Also, remove all posts duplicated through URL parameters found under HTML improvements.
To do this, go to Google Index > Remove URLs. Using the URL removal tool can be risky, and it’s advisable not to use it unless necessary. Doing it the wrong way may end up wiping major sections of your website from search. For categories, tags and author archives, performing a directory removal as shown below removes the whole archive from search:
For other pages duplicated through URL parameters:
You should note that this is a bit of hard work. I ended up removing over 1,000 URLs one after the other. I collated every post URL on the blog, added the URL parameters and fed them into the URL removal tool one by one.
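If you'd rather script that collation step, a small sketch like this (with hypothetical example.com URLs, and the parameters mentioned earlier) builds the full removal list:

```python
# Parameters known to generate duplicates (from the plugins discussed above)
PARAMS = ["?mobile", "?nomobile", "?comment=true", "?dwqa-embed=true"]

def removal_list(post_urls, params=PARAMS):
    """Append every known duplicate-generating parameter to every
    post URL, yielding the URLs to submit to the removal tool."""
    return [url + p for url in post_urls for p in params]

posts = ["https://example.com/post-one/", "https://example.com/post-two/"]
urls = removal_list(posts)
print(len(urls))  # 8 URLs to paste into the removal tool, one at a time
```

The removal tool itself still has to be fed one URL at a time, but at least the list is generated for you.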
3. Update your robots.txt file: WordPress ships with a set of robots.txt rules by default, but they are way too basic. Improving the file and blocking some pages and directories from search engines should help. Here is what I presently use:
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Disallow: /cgi-bin/
Disallow: /*?*
Sitemap: https://www.doncaprio.com/post-sitemap.xml
Sitemap: https://www.doncaprio.com/page-sitemap.xml
Sitemap: https://www.doncaprio.com/dwqa-question-sitemap.xml
You need to change the sitemap links to yours, and you may delete the lines you don’t need. However, the line that says Disallow: /*?* is important, as it stops search engines from accessing pages containing URL parameters. Please note that you should only use this if you’re not using the default WordPress permalink structure, which works based on URL parameters (e.g. ?p=123).
If you’ve installed WordPress SEO by Yoast, you should be able to edit your robots.txt file from SEO > Edit Files.
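If you want to sanity-check what Disallow: /*?* actually matches, here's a deliberately simplified matcher: it converts a robots-style wildcard pattern into a regex anchored at the start of the path. Real crawlers also handle $ anchors, Allow rules and rule precedence, so treat this as a sketch only:

```python
import re

def rule_blocks(pattern: str, path: str) -> bool:
    """Translate a robots.txt Disallow pattern with '*' wildcards
    into a regex anchored at the start of the path, then match."""
    regex = "^" + ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.match(regex, path) is not None

# The Disallow: /*?* line blocks any path carrying a query string:
print(rule_blocks("/*?*", "/my-post/?mobile"))   # True
print(rule_blocks("/*?*", "/my-post/"))          # False
print(rule_blocks("/wp-admin/", "/wp-admin/x"))  # True (plain prefix match)
```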
4. Monitor URL parameters in Google Webmaster Tools: Google automatically detects URL parameters and decides whether to index them or not, but the decisions made by Googlebot are not always the right ones.
In Google Webmaster Tools, go to Crawl > URL Parameters and tell Googlebot which URL parameters not to crawl.
Other Low Quality Pages
Another problem you probably haven’t noticed is low-quality pages appearing in search. A lot of WordPress users still have the default Sample Page indexed. There’s absolutely no reason why your privacy policy should be on search. Likewise, pages such as Terms and Conditions, Disclaimer, login, registration and forgot-password pages do not have to be indexed.
If you’re using a membership plugin too, Member Profiles may contribute to low quality pages on search.
How to fix:
1. Be sure to set the pages to noindex.
2. Remove the pages using Google Webmaster Tools.
You may also need to find other pages that should be removed. These include old posts that offer no value anymore and those you know are poorly written or contain dead links.
Doing all this improved my search engine traffic considerably within a very short time. I hope it helps you too.