Tyssen Design — Brisbane Freelance Web Developer

Catching content copiers with WordPress

By John Faulds

I recently had one of the articles from this site copied and posted on someone else's site (a practice sometimes known as site scraping). But thanks to a couple of WordPress's built-in features and a handy plugin, I soon found out about it.

I'd read in a couple of different places recently that it's a good idea to use absolute, rather than relative, links for your internal linking structure. If people copy the content from your website to post on their own, chances are they won't edit the copy, so you'll end up with links pointing back to your site. A relative link, by contrast, would simply resolve against the copier's domain.
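
If your content isn't generated with absolute links already, the conversion is easy to script. The following is a minimal Python sketch, not anything WordPress does internally: the `absolutize_links` helper and the `example.com` URLs are made up for illustration, and the regular expression only handles simple quoted `href` attributes.

```python
import re
from urllib.parse import urljoin

# Matches href="..." or href='...' (simple quoted attributes only).
HREF_PATTERN = re.compile(r'''href=(["'])(.*?)\1''')


def absolutize_links(html: str, base_url: str) -> str:
    """Rewrite every href in `html` so relative targets become absolute.

    `base_url` is the address of the page the HTML belongs to; links that
    are already absolute pass through urljoin unchanged.
    """
    def fix(match: re.Match) -> str:
        quote, target = match.group(1), match.group(2)
        return f"href={quote}{urljoin(base_url, target)}{quote}"

    return HREF_PATTERN.sub(fix, html)


if __name__ == "__main__":
    # Hypothetical post URL and markup, for illustration only.
    post_url = "https://www.example.com/blog/post-a/"
    snippet = '<a href="/blog/post-b/">Related entry</a>'
    print(absolutize_links(snippet, post_url))
```

Once the links are absolute, a verbatim copy of the markup keeps pointing at the original site wherever it ends up.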

This is what happened in my case. WordPress outputs the full absolute URL for any internal link generated with the get_permalink function. The post in question was copied verbatim, including the 'related entries' at the bottom, which linked back to five other posts on my site (the permalink in the article heading was also left intact).

Another piece of advice from SEOMoz was to "ping the major blogging/tracking services (like Google, Technorati, Yahoo!, etc.)" whenever you publish new content. WordPress does this automatically too, with a selection of update services added by default (located in wp-admin | Options | Writing | Update Services).
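
For the curious, these pings are conventionally small XML-RPC calls to a method named `weblogUpdates.ping`, which takes the blog's name and URL. Below is a rough Python sketch of what such a request looks like, using the standard `xmlrpc.client` module. The blog name, the URLs and the helper names are placeholders, the Ping-O-Matic endpoint mentioned in the comment is an assumption about which service you'd use, and the sketch is not taken from WordPress's own code.

```python
import xmlrpc.client


def build_ping_request(blog_name: str, blog_url: str) -> str:
    """Return the XML-RPC request body for a weblogUpdates.ping call."""
    return xmlrpc.client.dumps((blog_name, blog_url),
                               methodname="weblogUpdates.ping")


def send_ping(service_url: str, blog_name: str, blog_url: str):
    """Send the ping to an update service (this makes a network request)."""
    proxy = xmlrpc.client.ServerProxy(service_url)
    return proxy.weblogUpdates.ping(blog_name, blog_url)


if __name__ == "__main__":
    # Placeholder values; only the request body is built here, nothing is sent.
    print(build_ping_request("Example Blog", "https://www.example.com/"))
    # To actually ping a service you would call something like the line
    # below (the endpoint is an assumption, check your own Update Services
    # list for the real one):
    # send_ping("http://rpc.pingomatic.com/", "Example Blog", "https://www.example.com/")
```

The point of the ping is timing: the sooner tracking services know the content appeared on your site, the sooner a later copy shows up as a link back to you.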

The third thing that helped me find out about the copying so quickly was a plugin called Kramer, which "will show every post linking to your posts, in the form of comments or pingbacks. The blog post linking to yours does not need to ping your post for the comment to be shown in your weblog."

The copied post was published some time on a Friday night. By the time I'd checked my email the next morning, I'd received a notification of the new backlink, because the links to my site had been left intact in the copy. Most of the notifications I get from Kramer are for links to articles on my site posted on forums, and I normally just delete them without checking. But I usually check out links from other blogs to see the context in which the post has been mentioned.
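
The check behind a notification like this boils down to spotting links to your own domain in someone else's page. Here is a small Python sketch of that idea using only the standard library. It is not how Kramer itself works (I don't know its internals); the `links_back_to` helper, the sample markup and the `example.com` domain are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def links_back_to(html: str, my_domain: str) -> list[str]:
    """Return the links in `html` whose host is `my_domain` or a subdomain."""
    collector = LinkCollector()
    collector.feed(html)
    found = []
    for href in collector.links:
        host = urlparse(href).hostname or ""  # relative links have no host
        if host == my_domain or host.endswith("." + my_domain):
            found.append(href)
    return found


if __name__ == "__main__":
    # Made-up markup standing in for a page found on another site.
    page = ('<h2><a href="https://www.example.com/post-a/">Post A</a></h2>'
            '<a href="/about/">About</a>')
    print(links_back_to(page, "example.com"))
```

Note that the relative `/about/` link is ignored: it would point at the copier's own site, which is exactly why absolute internal links are the ones that give a copy away. A single page carrying half a dozen links to your posts is a reasonable hint that it's worth a closer look.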

A quick whois of the site showed that it was registered with GoDaddy, so I sent them an email asking what the procedure was for cases like this. It seems that GoDaddy take a pretty dim view of this sort of thing: although I never heard back from them, the next day the offending site was no longer accessible (401 Authorization Required) and has been that way ever since. It seems at least one other person has experienced a similarly quick turn-around time from GoDaddy in response to DMCA infringement notices.
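
If you don't have a whois tool handy, the lookup is simple enough to do by hand: WHOIS is a plain-text protocol on TCP port 43. The Python sketch below shows the general shape under a few assumptions: the default server is the registry WHOIS host for .com and .net domains, the `whois_query` and `extract_registrar` names are my own, and the response layout (a `Registrar:` line) varies between registries, so treat the parsing as illustrative.

```python
import socket
from typing import Optional


def whois_query(domain: str, server: str = "whois.verisign-grs.com",
                port: int = 43) -> str:
    """Send a WHOIS query and return the raw text response.

    The default server is an assumption that suits .com/.net domains;
    other TLDs use different WHOIS servers.
    """
    with socket.create_connection((server, port), timeout=10) as sock:
        sock.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def extract_registrar(whois_text: str) -> Optional[str]:
    """Return the value of the first 'Registrar:' line, if there is one."""
    for line in whois_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "registrar":
            return value.strip()
    return None


if __name__ == "__main__":
    # Requires network access; example.com is a placeholder domain.
    print(extract_registrar(whois_query("example.com")))
```

The registrar's name is the useful part here, since that's who you contact with a complaint about a site on a domain they manage.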