Would PDF files show as duplicate content?

Posts: 21
Joined: 22 Nov 09

Hi all,

If I were to put a PDF file in a zipped folder (for a later bonus email) in the folder with all my website files on HostGator, would robots crawl it, and would its content show up as a duplicate of other content on my site?

A hyperlink to the file can be added to emails. I tested it in my usual "learn by doing" way, and a right-click, "save target as" type of thing works well.

I'm very interested to know, as I was hoping the file would just sit there given that it is a totally different format from the other files. But images are a different format too, and I am unsure about them as well.

Last edited by michellerana on 16 Apr 10 6:53 am, edited 1 time in total.
Reason: improve title to describe the post better

Posts: 869
Joined: 14 Mar 08
Hi Kerry,
Google does crawl PDF docs.
That being said, I would not worry too much about the duplicate content side of things, as Google will simply not index the PDF.
If you have an HTML page with the content on it, get that indexed first. If you then put a PDF on the site for download, it will not be indexed because it MAY be viewed as duplicate, but you will not get penalized for it.
If you are really concerned about it, just add a nofollow to the link to the PDF.

Google cannot read text in images or see whether images are duplicates. The only way it can tell is if you give the images the same name as each other in the ALT text.

Enjoy the little things, for one day you may look back and realize
they were the big things.

-- Robert Brault
Posts: 216
Joined: 26 Jun 09
That's good to hear. I wanted to use the exact same text at the very bottom of all my pages but was worried about duplicate content, so I made it into a JPEG image.
Never give up
Paul J. Burkhardt
Posts: 28
Joined: 07 Mar 11
If you link to your PDF from your site, Google will find it.

I don't think it will count as duplicate content, but you can use the canonical tag just to be on the safe side.
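
A PDF has no HTML head to hold a canonical tag, so the equivalent is usually sent as an HTTP `Link` header with `rel="canonical"`. Here is a minimal Python sketch of what that header looks like. The URL `https://example.com/guide.html` and the function name are made up for illustration, and how the header gets attached on your particular host is not shown:

```python
from typing import Tuple


def canonical_link_header(canonical_url: str) -> Tuple[str, str]:
    """Return the (name, value) pair for a rel="canonical" HTTP Link header."""
    # The preferred URL goes in angle brackets, followed by the rel parameter.
    return ("Link", f'<{canonical_url}>; rel="canonical"')


# Hypothetical URL of the HTML page that should count as the original.
name, value = canonical_link_header("https://example.com/guide.html")
print(f"{name}: {value}")
```

The idea is that the PDF response points at the HTML page as the preferred version, rather than the PDF competing with it.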
Posts: 20
Joined: 22 Apr 11
There's a way to use robots.txt or a robots meta tag to tell crawlers not to crawl that particular page, if you have those files, that is. I usually use WordPress, so I can decide what the spiders will crawl.
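
As a rough illustration of the robots.txt route, here is a small Python sketch that checks a rule with the standard library's `urllib.robotparser` before you upload it. It assumes the bonus files live in a hypothetical `/bonus/` folder on a placeholder `example.com` site; neither comes from this thread:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical layout: bonus downloads are kept under /bonus/ on the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /bonus/
"""


def is_crawl_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Check whether the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The zipped bonus is blocked for crawlers; the normal pages are not.
print(is_crawl_allowed(ROBOTS_TXT, "https://example.com/bonus/guide.zip"))
print(is_crawl_allowed(ROBOTS_TXT, "https://example.com/index.html"))
```

Keeping the bonus files in their own folder means one `Disallow` line covers all of them, without relying on wildcard patterns that not every crawler supports.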
Posts: 6369
Joined: 25 Feb 11
Hi Kerry,

If you have the same content on an HTML page and PDF file and both are up on your site, Google will count it as duplicate content but will choose the HTML over the PDF. You need to tweak the robots.txt so that the PDF files do not get indexed. You can find additional information at http://www.seroundtable.com/archives/021584.html
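
One caveat worth checking against Google's own documentation: a `Disallow` rule in robots.txt stops crawling, while keeping a file out of the index is normally done with a `noindex` signal. For a PDF that means an `X-Robots-Tag` HTTP header rather than a meta tag. This is a different technique from the robots.txt tweak, sketched here in Python only to show the idea. The function name and file paths are made up, and it is not a HostGator-specific recipe:

```python
from pathlib import PurePosixPath
from typing import Dict


def extra_headers_for(path: str) -> Dict[str, str]:
    """Return extra response headers for a file that is about to be served."""
    # PDFs get a noindex header; every other file type is left alone.
    if PurePosixPath(path).suffix.lower() == ".pdf":
        return {"X-Robots-Tag": "noindex"}
    return {}


print(extra_headers_for("/bonus/guide.pdf"))
print(extra_headers_for("/index.html"))
```

Note that a crawler can only see this header if it is allowed to fetch the file, so it does not combine well with blocking the same file in robots.txt.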

Hope that helps. Have a good day!

Posts: 24
Joined: 04 May 11
I agree with cecille.

This topic was started on Apr 10, 2010 and has been closed due to inactivity. If you want to discuss this topic further, please create a new forum topic.