Looking for some creative inspiration for your promotions? Find and suggest ideas for marketing and building your affiliate campaigns.

Would PDF files show as duplicate content?

PremiumMember
Richie32
 
Posts: 37
Joined: 22 Nov 09
Location: New Zealand


Hi all,

If I put a PDF file in a zipped folder (for a later bonus email) alongside all my website files on HostGator, would robots crawl it, and would its content show up as a duplicate of other content on my site?

A hyperlink to the file can be added to emails. I have tested it in my usual "learn by doing" way, and a right-click, "Save target as" works well.

I'm very interested to know, as I was hoping the file would just sit there, given that it is a totally different format from the other files. But images are a different format too, and I am unsure about them as well.

Kerry
 

Last edited by michellerana on 16 Apr 10 6:53 am, edited 1 time in total.
Reason: improve title to describe the post better
 

Moderator
wollowra
 
Posts: 1268
Joined: 14 Mar 08
Location: Australia

Hi Kerry,
Google does crawl PDF docs.
That being said, I would not worry too much about the duplicate content side of things, as Google will just not index the PDF.
If you have an HTML page with the content on it, get that indexed first. If you then put a PDF on the site for download, you can, and it will not be indexed because it MAY be viewed as duplicate, but you will not get penalized for it.
If you are really concerned about it, then just put a nofollow on the link to the PDF.

Google cannot read text in images or see if images are duplicates. The only way they can tell is if you give the images the same ALT text as each other.

Regards
Troy
 


Enjoy the little things, for one day you may look back and realize
they were the big things.

-- Robert Brault
 
PremiumMember
burkhardt5
 
Posts: 100
Joined: 26 Jun 09
Location: United States

That's good to hear. I wanted to use the exact same text at the very bottom of all my pages but was worried about duplicate content, so I made it into a JPEG image.
Never give up
 

Paul J. Burkhardt
http://superiorwowguide.com
 
luckylook3
 
Posts: 28
Joined: 07 Mar 11
Location: United States

If you link to your PDF on your site, Google will find it.

I don't think it will count as duplicate content, but you can use the canonical tag just to be on the safe side.
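
Since a PDF has no HTML head to hold a canonical tag, the hint would normally be sent as an HTTP Link response header instead. Here is a rough Python sketch of building that header value. The function name and the example.com URL are made up for illustration, and how you actually attach the header depends on your host or server setup.

```python
def canonical_link_header(html_url: str) -> tuple[str, str]:
    """Return the (name, value) pair for a rel="canonical" Link header."""
    # Hypothetical helper: points a PDF response at the HTML page
    # that should be treated as the preferred version.
    return ("Link", f'<{html_url}>; rel="canonical"')


# Example URL is an assumption, not a real page.
name, value = canonical_link_header("https://example.com/bonus-guide.html")
print(f"{name}: {value}")
```

The idea is that the server sends this header along with the PDF, so the HTML page is named as the preferred copy.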
 

fastflipwebservices
 
Posts: 22
Joined: 22 Apr 11
Location: Germany

There's a way to tell your robots.txt or robots meta tag not to crawl that particular page, if you have those files, that is. I usually use WordPress, so I can decide what the spiders will crawl.
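
To make the robots.txt idea concrete, here is a small sketch that checks a rule with Python's standard urllib.robotparser before you upload anything. The /bonus/ folder, the file names, and the is_crawlable helper are assumptions for the example. Note that this parser treats paths as simple prefixes, so the sketch blocks a whole folder rather than a file type.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: keep all crawlers out of a hypothetical /bonus/ folder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /bonus/
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


print(is_crawlable(ROBOTS_TXT, "https://example.com/bonus/guide.pdf"))
print(is_crawlable(ROBOTS_TXT, "https://example.com/index.html"))
```

Putting the zipped PDF in its own folder keeps the rule to one line and leaves the rest of the site crawlable.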
 

Site Admin
Cecille L
 
Posts: 1473
Joined: 25 Feb 11
Location: Philippines

Hi Kerry,

If you have the same content on an HTML page and PDF file and both are up on your site, Google will count it as duplicate content but will choose the HTML over the PDF. You need to tweak the robots.txt so that the PDF files do not get indexed. You can find additional information at http://www.seroundtable.com/archives/021584.html
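
As an illustration of the robots.txt tweak: Google documents support for the * and $ wildcards in robots.txt paths, so a rule like "Disallow: /*.pdf$" is one way to cover every PDF. The Python sketch below only imitates that matching so a rule can be sanity-checked before uploading. rule_matches is a hypothetical helper, not Google's actual matcher, and blocking crawling is not quite the same as guaranteeing a file stays out of the index.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Approximate wildcard matching for a robots.txt path rule.

    '*' stands for any run of characters; a trailing '$' anchors the
    rule to the end of the path. Without '$' the rule is a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with '.*' where '*' appeared.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start of the path.
    return re.match(regex, path) is not None


# Paths below are made-up examples.
for path in ["/bonus/guide.pdf", "/bonus/guide.pdf?x=1", "/index.html"]:
    print(path, rule_matches("/*.pdf$", path))
```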

Hope that helps. Have a good day!
 

Cecille


http://www.affilorama.com/affiloblueprint
Build a Successful Website in 12 Weeks

 
rankwarrior
 
Posts: 32
Joined: 04 May 11
Location: Great Britain

Agree with Cecille.