Article compare tool concern

vsolomon
Posts: 1
Joined: 01 Dec 09

Hi,

I was using the article compare tool to compare a rewrite with the original PLR article that I have. When I checked, it said that my rewrite was 86.35% different. I was curious, so I did a Google search to see if I could find another article compare tool. I found one at affiliate article writers, and it said my article was 90.21% the same (same, not different). Given that, I am worried that my rewrite may be considered duplicate content. Should I be concerned? Should I just stick with what the Affilorama article compare tool tells me? I don't want to get into trouble with my blog.

thanks,
Virginia

Site Admin
markling
Posts: 2503
Joined: 13 Jun 06
86.35% different according to our article compare tool sounds like a really good rewording to me. The other thing our tool does is it highlights in yellow the areas that have phrases that are still the same. If you don't have too many of those in too close a proximity to each other, then it will be even better.

marklongy
Posts: 11
Joined: 29 Apr 09
Hi Mark,

I have a similar question as well.

I have reworded an article and checked it in the Affilorama article compare tool, and it says it is 89.79% different, which I was well chuffed with.

I wanted to double-check this, and found that somebody on the forum had mentioned Dupefree Pro, which is free. I downloaded it, copied and pasted the articles into it, and it states that 79.42% duplicate content was found.

Any ideas?

Mark

plutonium
Posts: 46
Joined: 24 Nov 09
Hi,

I too have concerns about the article compare tool. It says my modified article is 90% different from the original, but most of my article is highlighted yellow as unchanged. When I look at the marked phrases, it seems that 90% of the article is the same as the original. When I compare the articles with Dupefree Pro, the result is 80% duplicate content, and with another compare tool I get 80% to 90% duplicate content too.

When I paste the same article into both boxes of the compare tool, the result is correct: 0% different. When I change one sentence, it reports a 5% difference. That seems OK to me.

But when I paste in the full modified article again, it reports a 90% difference, with most of the article highlighted as unchanged. The articles have 10 sentences, and only 3 sentences differ from the original, yet the tool says 90% different. How can this be? Can I still rely on the article compare tool, or is there a bug?

Thanks for your help.

Waldemar
Last edited by plutonium on 22 Mar 10 10:08 am, edited 1 time in total.

slayer2kuk
Posts: 26
Joined: 19 Nov 09
Is there any answer to this? Mark?

Site Admin
aletta
Posts: 1780
Joined: 09 Jul 06
Hey guys,

I've forwarded this on to the programmers concerned to see what they say. I think someone was looking into this issue a while ago, but I'm not sure how far this "looking into" progressed.

Will keep you posted :)

PremiumMember
nick
Posts: 307
Joined: 17 May 06
There are many ways to compare text: word by word, character by character, sentence by sentence, and so on.
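
To illustrate why the unit of comparison matters, here is a minimal Python sketch that scores the same pair of texts at three levels. It is not the code behind the Affilorama tool or any other tool mentioned in this thread; the function names are made up for the example, and the scoring relies on the standard library's difflib.SequenceMatcher, which is only one of many possible measures.

```python
import difflib
import re


def granular_percent_different(a_units, b_units):
    """Return 100 * (1 - similarity ratio) for two sequences of units."""
    matcher = difflib.SequenceMatcher(None, a_units, b_units, autojunk=False)
    return round(100 * (1 - matcher.ratio()), 2)


def split_sentences(text):
    """Very rough sentence splitter: break on '.', '!' and '?'."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def compare_at_all_levels(original, rewrite):
    """Score the same two texts character by character, word by word
    and sentence by sentence."""
    return {
        "character": granular_percent_different(original, rewrite),
        "word": granular_percent_different(original.split(), rewrite.split()),
        "sentence": granular_percent_different(
            split_sentences(original), split_sentences(rewrite)
        ),
    }


if __name__ == "__main__":
    original = "The cat sat on the mat. It was warm."
    rewrite = "The cat sat on the rug. It was warm."
    for level, score in compare_at_all_levels(original, rewrite).items():
        print(f"{level}: {score}% different")
```

Changing a single word moves the character-level score only slightly, the word-level score a little more, and the sentence-level score a lot, because the whole sentence then counts as changed. Two tools that pick different units can therefore report very different percentages for the same rewrite.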

There are also string-distance algorithms such as Levenshtein distance (http://en.wikipedia.org/wiki/Levenshtein_distance), which is what we used to use. We have since changed to something else (too much to go into in detail here).
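
For readers who have not met Levenshtein distance, a minimal Python sketch follows. It counts the fewest insertions, deletions and substitutions needed to turn one sequence into another. Applying it to words rather than characters, and dividing by the longer text's word count to get a percentage, are assumptions made for this example; they are not taken from the Affilorama tool, and the function names are hypothetical.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions
    needed to turn sequence a into sequence b."""
    previous = list(range(len(b) + 1))  # distances from an empty prefix of a
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete item_a
                    current[j - 1] + 1,      # insert item_b
                    previous[j - 1] + cost,  # substitute (or keep if equal)
                )
            )
        previous = current
    return previous[-1]


def levenshtein_percent_different(original, rewrite):
    """Word-level edit distance as a percentage of the longer text
    (an assumed normalisation, chosen for illustration)."""
    a = original.lower().split()
    b = rewrite.lower().split()
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return round(100 * levenshtein(a, b) / longest, 2)


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(levenshtein_percent_different(
        "the cat sat on the mat", "the dog sat on the rug"))
```

Because the function accepts any sequence, the same code works on characters (pass strings) or on words (pass lists), which is one reason two "Levenshtein-based" tools can still disagree.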

However, it's worth noting that the percentage and the yellow highlighted text are calculated separately. The yellow highlighting is just a couple of lines of code that break the text into groups of 3 words and compare those groups, so really it's only useful for seeing how well you have changed sentences and paragraphs.
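
The 3-word grouping described above might look roughly like the Python sketch below. This is a guess at the general idea, not the tool's actual code: the function names, the lowercasing, the punctuation stripping and the use of overlapping groups are all assumptions.

```python
import re


def word_groups(text, size=3):
    """Split text into overlapping groups of `size` consecutive words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [tuple(words[i:i + size]) for i in range(len(words) - size + 1)]


def unchanged_groups(original, rewrite, size=3):
    """Groups in the rewrite that also appear somewhere in the original.
    These are the candidates for yellow highlighting."""
    seen = set(word_groups(original, size))
    return [group for group in word_groups(rewrite, size) if group in seen]


def percent_unchanged(original, rewrite, size=3):
    """Share of the rewrite's word groups that were carried over unchanged."""
    groups = word_groups(rewrite, size)
    if not groups:
        return 0.0
    return round(100 * len(unchanged_groups(original, rewrite, size)) / len(groups), 2)


if __name__ == "__main__":
    original = "The quick brown fox jumps over the lazy dog"
    rewrite = "A quick brown fox leaps over the lazy dog"
    for group in unchanged_groups(original, rewrite):
        print(" ".join(group))
    print(percent_unchanged(original, rewrite), "% of groups unchanged")
```

A check like this only asks whether each short phrase survives somewhere in the original, so it can flag most of an article while a separately calculated percentage says something quite different.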

Hope that makes some sort of sense :)

PremiumMember
nick
Posts: 307
Joined: 17 May 06
Aletta has pointed out something (she points out a lot of things), I will be back with more info.

Site Admin
aletta
Posts: 1780
Joined: 09 Jul 06
You can always count on me, Nick.

PremiumMember
nick
Posts: 307
Joined: 17 May 06
Indeed there is something freaky going on.
Will update with more info tomorrow.

PremiumMember
nick
Posts: 307
Joined: 17 May 06
I've made some tweaks and am getting Aletta to test them :)

centered
Posts: 97
Joined: 30 Nov 09
What is the result now?
Can the staff please post an update on this issue? It's been a while since the last post.
Thank you.
I'm happy to be an Affilorama member.

Site Admin
michellerana
Posts: 2373
Joined: 05 May 09
The difference in the results all comes down to the algorithm used to compare the text. Ours uses a custom solution quite different from most other compare tools. At the time of development this was thought to be more accurate, but it is under review, and it is likely that at some stage we will migrate to a method based on Levenshtein distance, which as far as I understand is the most common method of comparison.

http://en.wikipedia.org/wiki/Levenshtein_distance

So in short, we compare the text differently from most tools. Whether this is more accurate is still up for debate, so I can't say definitively which is right or wrong (or really, which is more accurate).
Michelle
Customer Support

centered
Posts: 97
Joined: 30 Nov 09
Thanks Michelle,

Based on your post above, can we conclude that the Google and Yahoo search engines must now base their algorithms on Levenshtein distance? If so, what about articles that were posted some time ago and whose uniqueness was checked with Affilorama's compare tool, whose custom solution is quite different from most other compare tools? Are they now in danger of a duplicate content penalty from the search engines? Please shed some light on this. Thank you very much.

Site Admin
mikeantiga
Posts: 715
Joined: 28 Mar 10
Centered wrote: Based on your post above, we can conclude that Google and Yahoo search engines must now base their algorithm on Levenshtein distance, mustn't they?


You won't find any official statement from Google naming the exact algorithm they use, simply because revealing it would undermine their process for detecting duplicate and plagiarized content.

Google probably uses a custom or proprietary algorithm for comparing articles, judging by Google's patent on near-duplicate content detection (granted December 2009). Source: Google Patent Granted on Duplicate Content Detection in a Web Crawler System.

Centered wrote: ...what about the articles that have been posted for some time and that based their uniqueness on Affilorama's compare tool, whose custom solution is quite different from most other compare tools? Aren't they now endangered by a duplicate content penalty from the search engines?


First, the duplicate content penalty is still a subject of debate. The majority of webmasters (and even Google itself) say that it is a myth. Google certainly does not want identical content filling up the first 10 or 15 of its search results. Hence, a website with very similar content will likely be placed lower in the results pages, but this does not necessarily mean the site is being 'penalized' for having similar or duplicate content.

Second, we have not yet encountered any website that suffered from low search rankings just from using our Article Compare tool, although we get inquiries from time to time about the differing results between our tool and other commercially available comparison tools. Again, our algorithm is still under review at this time.

danchan
Posts: 25
Joined: 17 Oct 09
Google is on the move again with its algorithm. It seems to harp even more on original content.