Duplicate Web Content and How Search Engines View It
Thanks to the new filters that search engines have implemented, duplicate content is a growing concern for anyone who cares about their copyrights and their search engine rankings. We hope to help you understand how you might be caught in a content filter and how to avoid it. We'll also show you how to determine whether your pages have duplicate content, and what to do to fix it.
Search engine spam is any deceitful attempt to deliberately trick the search engine into returning poor-quality search results. This behavior is often seen in pages that are exact replicas of other pages, whether from copyright violations or from sites that sell ready-made websites. Creating multiple or similar copies of the same page will not increase your chances of getting great listings in search engines.
Search engines remove duplicate content so that their visitors will receive more relevant search results. Unfortunately, many webmasters have been caught by the filters that remove duplicate content. There are some things that webmasters can do to avoid being filtered out.
How Does a Content Filter Work?
The term "duplicate content penalty" is actually a misnomer. In search engine rankings, the term "penalty" refers to points that are deducted from a page when its overall relevancy score is calculated. Duplicate content pages are not penalized; they are filtered, the way you would use a filter to remove particles from a liquid. Unfortunately, good content sometimes gets filtered out as well.
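The difference between a penalty and a filter can be sketched in a few lines. This is only an illustration of the distinction drawn above, not how any search engine is known to work: the function names, the example URLs, and the 10-point deduction are all hypothetical.

```python
def apply_penalty(results, penalized_urls, points=10):
    """Penalty: the page stays in the results but loses relevancy points.

    `results` is a list of (url, score) pairs; `points` is an assumed value.
    """
    adjusted = [
        (url, score - points if url in penalized_urls else score)
        for url, score in results
    ]
    # Re-rank by the adjusted score, highest first.
    return sorted(adjusted, key=lambda item: item[1], reverse=True)


def apply_filter(results, duplicate_urls):
    """Filter: the page is removed from the results; scores are untouched."""
    return [(url, score) for url, score in results if url not in duplicate_urls]


results = [
    ("a.example/page", 90),
    ("b.example/copy", 85),
    ("c.example/other", 80),
]
penalized = apply_penalty(results, {"b.example/copy"})
filtered = apply_filter(results, {"b.example/copy"})
```

With a penalty, the copied page drops in rank but can still be found. With a filter, it does not appear at all, no matter how high it scored.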
There are basically four types of duplicate content that are filtered out:
- Duplicate Pages - Pages that are identical to other pages are considered duplicates. Purchased AdSense websites are especially vulnerable to being filtered because they are identical and will be considered spam by the search engines. Affiliate sites with the same look, feel, and identical content are also especially vulnerable to a duplicate content filter. Another example would be a website with doorway pages. Often these doorways are skewed versions of landing pages, and those landing pages are in turn identical to other landing pages. Generally, doorway pages are intended to spam the search engines in order to manipulate search engine results.
- Product Descriptions - Many eCommerce sites use the manufacturer's descriptions for their products. While harder to spot, this is still considered spam.
- Repackaged Content - Taking content from a web site and repackaging it to make it look different is the same as a duplicate page.
- Article Distribution - If you share articles for re-publishing, is that good for you? Not necessarily. Even though Yahoo and MSN determine the source of the original article and deem it most relevant in search results, other search engines like Google may not.
How Does the Duplicate Filter Work?
When a search engine robot crawls a website, it reads the pages and stores the information in its database. It then compares its findings to other information it has in its database. Depending upon a few factors, such as the overall relevancy score of a website, it then determines which pages are duplicate content, and may go on to decide that they are spam. Even if your pages are not spam, they may still be regarded as spam if the content is duplicated.
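Search engines do not publish the exact comparison they use, so the following is only a sketch of one common textbook approach to spotting near-duplicate text: break each page into overlapping runs of words (often called shingles) and measure how much the two sets overlap (Jaccard similarity). The function names, the shingle size of three words, and the 0.8 threshold are assumptions made for illustration.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Too short to form a full shingle: treat the whole text as one.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of overlap / size of union."""
    if not a and not b:
        return 1.0  # Two empty pages are treated as identical.
    return len(a & b) / len(a | b)


def is_near_duplicate(text_a, text_b, threshold=0.8, k=3):
    """True if the shingle overlap meets the (assumed) threshold."""
    return jaccard(shingles(text_a, k), shingles(text_b, k)) >= threshold


original = "The quick brown fox jumps over the lazy dog"
copy = "The quick brown fox jumps over the lazy dog"
unrelated = "Original product descriptions help pages stand out in search"

same = is_near_duplicate(original, copy)
different = is_near_duplicate(original, unrelated)
```

Because the comparison works on runs of words and not on whole pages, changing a headline or swapping a few words still leaves most shingles shared. That is consistent with the warning above that repackaged content can be treated the same as a duplicate page.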
Avoiding the Duplicate Filter
Check your pages for duplicate content using WebConf's Similar Page Checker tool. When you enter the URLs of two pages, the tool compares those pages and points out how they are similar so that you can make them unique.
To determine whether your pages have been copied, you will need some help. Here is a tool that searches for copies of your page on the Internet: www.copyscape.com. Enter your web page URL and it will search for copies of your page. This can help you address the issue of someone "borrowing" your content without permission.
Some search engines, like Google, use link popularity to determine the most relevant results. Continue to build your link popularity, and use tools like www.copyscape.com to find out how many other sites carry the same article. If the author allows it, you may be able to alter the article so as to make the content unique.
When you use distributed articles for your content, pick articles that are relevant to your web site. Sometimes simply adding your own commentary to the articles can avoid the duplicate content filter; WebConf's Similar Page Checker tool might help you make your content unique. The more relevant articles you can add to complement the original article, the better. Search engines will look at the entire page and how it relates to the entire site, so as long as you aren't exactly copying someone's pages, you should be fine.
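If you want a rough local check before turning to an online tool, a comparison along the same lines can be sketched with Python's standard library. This is not how the Similar Page Checker works internally (that is not documented here); it is a minimal sketch that strips the markup from two pages you have already saved as HTML strings and reports how much of their visible wording matches. The helper names and sample pages are hypothetical.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def visible_text(html):
    """Return the page's visible text with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def similarity_percent(html_a, html_b):
    """Word-level similarity of two pages' visible text, from 0 to 100."""
    words_a = visible_text(html_a).lower().split()
    words_b = visible_text(html_b).lower().split()
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)
    return 100.0 * matcher.ratio()


page_a = "<html><body><h1>Blue Widget</h1><p>A sturdy blue widget.</p></body></html>"
page_b = "<html><body><h1>Blue Widget</h1><p>Hand-built and tested in our shop.</p></body></html>"

identical_score = similarity_percent(page_a, page_a)
rewritten_score = similarity_percent(page_a, page_b)
```

A score near 100 suggests the two pages share almost all of their wording and are worth rewriting. What counts as "too similar" is a judgment call, since search engines do not publish a cutoff.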
If you have an eCommerce site, you should write original descriptions for your products. This can be a lot of work, but it is worth the effort to avoid duplicate content.
Do not rely on an affiliate site that is identical to other sites, and do not create identical doorway pages. Not only are these behaviors filtered out immediately as spam, but when another site or page is found to be a duplicate there is generally no comparison of the page to the site as a whole, which can get your entire site in trouble.
The duplicate content filter is sometimes hard on sites that don't intend to spam the search engines. By being vigilant and selecting your content carefully, you will improve your chances of not being filtered.