Saturday, December 01, 2007

Google's Search Results - Should Factors Other Than User Value be Used

Comment on: Selling links that pass PageRank:

I agree I wouldn't give that post much weight. But what if it were a pre-eminent brain tumor researcher who was posting while being paid by the Mayo Clinic (because they decided that was a good way to encourage publicizing content that would reach the public) and he suggested further reading at the Mayo Clinic? Then I would have no problem reading it, giving that content a high value, following his advice and reading more at the Mayo Clinic. The issue is the value of the content. Knowing the ways the author might be biased (who is paying them) is one valid thing to consider.

The idea for PageRank came from the citation of academic papers. Google's current position would imply that citations from papers funded by someone other than the author should be ignored. That is not how citation value is calculated in the academic world. And I do not believe it would be an improvement to do so, though I can believe there is a minority that believes exactly that.

Google should be allowed to calculate PageRank however it likes. It could be that ignoring all sponsored, paid and otherwise influenced links (wherever exactly the current guidelines draw that line today) is, on balance, the best method. I doubt it. But it might be that until organizations (Google, Yahoo...) figure out how to better determine value in such cases, this is the best strategy. Google certainly knows better than I do the limitations of their ability to judge such potentially biased content.

Your example in this post doesn't make a good case though, I don't think. The posts of a knowledgeable, well meaning doctor would be valuable regardless of whether they were paid or not. Disclosing that they have financial interests is valuable. Then people can weigh that, as one factor. If I were making decisions for Google I would want to mimic that human ability to weigh that factor as one measure in determining how much weight to give to the doctor's links on the page. I can't believe that this isn't exactly what Google would like to be able to do.

Perhaps that ability to manage the "grey area" is beyond the ability of Google's models today. I can't believe it is, but the current explanations seem to indicate a desire by Google to eliminate this "grey area" (there are plenty of other "grey areas" being left "grey"). So it appears that either Google decided it was unable to effectively produce results while managing this "grey area," or it decided that it could, but that doing so was a bad idea. If Google can do so and chooses not to, I would say that is a bad decision. But it is one Google can make, and frankly I don't know why Google would care what I think - they have plenty of smart people that I am sure have thought of whatever I can on this topic.

It would seem to me that, rather than the explanation in your post, the real issue is whether or not Google can try to mimic what a smart human would do when evaluating content. Saying that bad content that is paid for should not be given a high value by readers is true, but does nothing to explain why paid content is automatically untrustworthy. I would think most people would say it is not automatically untrustworthy: it might be, it might not - I will use the fact it was paid as one factor in valuing that content.

I think it is great that Google is at least somewhat open about discussing these issues even as Google is criticized.

To me the bottom line is that Google needs to provide users the best results. My guess is that if Google were to say that, because the Mayo Clinic (just as an example) engaged in some practice Google did not like, it had heavily penalized their search results, users of Google would be dissatisfied. Google has done a good job in the past of making the decisions about what shows up in search results. I can understand a desire to take action against those Google thinks are not doing things the way it likes. However, I don't think Google can put too much weight on site owners following exactly the practices it wants versus the, shall we say, un-penalized PageRank of a page. As long as it works, and the high value sites searchers want to see modify their sites to comply with what Google wants, there is no problem (because then Google can provide users what they want and get sites to do what Google wants).

But if sites did not, and Google chose not to display those sites to users, obviously that just makes Google's results less valuable. Here, for the sake of argument (and clarity), I am separating out the issues of the value of the content to the searcher and compliance with Google's desires. Those 2 factors obviously can be separate (they could also be interrelated, but for the sake of clarity let's say in this example that they are not). Then Google's options are to 1) return the best results it can to users or 2) punish a site for some factor users don't care about but Google does. Google has a lead in providing good results. So Google can afford to degrade the results provided to users in order to penalize sites not adopting practices Google wishes them to. That is the real bottom line.

The anger of people that their site rises and falls based on certain things matters to them - but the real issue for Google is how good the results are for users. If any factor other than providing the best results to users is used in producing the search results then the value to users is degraded. I am not ignoring that factors which may not seem to be relevant may in fact be. So, if in fact it were true that sponsored sites that didn't use nofollow when linking to the sponsor's web site were worse results than sites that did not adopt that practice, then the factor is being used in order to provide the best results to users. But any factor that exists merely to make things easier for Google, and that results in degraded search results, I just can't see Google adopting over the long term. It provides competitors a weakness to exploit.

And the same goes for ignoring the links recommended by, let's say, high authority sites that don't exactly follow Google's desires. Google could just choose to ignore the "votes" of those sites, but if those "votes" are not of 0 value (0 value being, let's say, 100% corrupt) then Google would be throwing away valuable insight (by ignoring the votes of that site). Obviously that is Google's choice, but it seems pretty obvious that doing so is far from an ideal engineering solution. There is value (votes of an authority) being ignored. Too much ignoring of worthwhile information (even if that information is tainted by payment) provides an opening for competitors to make better use of that information to provide results. I just can't see that as in Google's interest.
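To make the difference between ignoring a paid "vote" and merely discounting it concrete, here is a rough sketch in Python. It is not Google's algorithm, just a small textbook-style PageRank iteration with one assumption of mine bolted on: each paid link passes along only a fraction (`paid_trust`, a made-up parameter) of the rank it would otherwise pass, and the rest is spread evenly over all pages. The page names and the tiny link graph are invented for illustration.

```python
from typing import Dict, List, Tuple

# (source page, target page, is_paid) - a hypothetical toy link graph
Link = Tuple[str, str, bool]

EXAMPLE_LINKS: List[Link] = [
    ("a", "doctor_blog", False),
    ("b", "doctor_blog", False),
    ("a", "other", False),
    ("b", "other", False),
    ("doctor_blog", "clinic", True),  # the sponsored link
]


def discounted_pagerank(
    links: List[Link],
    paid_trust: float,
    damping: float = 0.85,
    iterations: int = 100,
) -> Dict[str, float]:
    """PageRank-style iteration where a paid link passes only `paid_trust`
    (0.0 = ignored, 1.0 = treated like any other link) of its usual share.
    The withheld share is redistributed evenly, so ranks still sum to 1."""
    pages = sorted({p for s, t, _ in links for p in (s, t)})
    n = len(pages)
    outgoing: Dict[str, List[Tuple[str, float]]] = {p: [] for p in pages}
    for source, target, is_paid in links:
        outgoing[source].append((target, paid_trust if is_paid else 1.0))

    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        withheld = 0.0  # rank not passed along a link this round
        for source in pages:
            out = outgoing[source]
            if not out:
                withheld += rank[source]  # page with no links out
                continue
            share = rank[source] / len(out)
            for target, trust in out:
                new_rank[target] += damping * share * trust
                withheld += share * (1.0 - trust)
        for p in pages:
            new_rank[p] += damping * withheld / n
        rank = new_rank
    return rank


if __name__ == "__main__":
    for trust in (0.0, 0.5, 1.0):
        ranks = discounted_pagerank(EXAMPLE_LINKS, paid_trust=trust)
        print(f"paid_trust={trust}: clinic rank = {ranks['clinic']:.3f}")
```

With `paid_trust=0.0` the doctor's recommendation contributes nothing to the clinic's rank, which is the "ignore the vote" approach. Any value between 0 and 1 keeps part of that information, which is closer to how a person would weigh a disclosed financial interest. What the right discount would be, or whether a single number is even the right model, I have no idea.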

Anyway those are my rather long thoughts on this topic.

Related: Google's Displayed PageRank

One more point that Google likes to avoid. They say they wish to limit the penalties to sites that buy or sell links to manipulate PageRank. But by not saying that they oppose them, they have in effect accepted paid-for links when the payments are in the style that large companies make. Say when Google partners with CNN. Exposure and links are part of the bargain they strike with one another. Links from those partnerships are paid, as any human would see the issue. But Google's statements make it clear they do not see such links as untrustworthy. I would agree those links are trustworthy. Large corporations that agree to cooperate invest a great deal of money in that venture, and the "vote" that says this organization we agreed to partner with is worthwhile is worth valuing. It is, however, nonetheless paid.

So, just remember it is the links bought by small organizations Google is basically targeting. That is obviously their right. But it doesn't seem to follow that certain paid links are 100% trustworthy and certain paid links are 100% untrustworthy. This is not to say that Google doesn't have the right to decide to act as though this is the case. They do.

This is mainly just me thinking out loud to further the understanding of the scope of the issue. Which I think is an interesting engineering challenge: how to pick the best results to display. And how to do so when actors are consciously trying to manipulate the results and the actions of other actors. And how Google is acting in the process not just as an evaluator but also to persuade content owners to behave in ways Google would prefer.

It makes perfect sense for Google to do this. They can make their results better if they can get content owners to follow practices that help them better evaluate pages. I just think that, in this case, the implications of Google's words are not the best practice.

Since I think Google has proven to be very smart, my guess is that you can't assume the implications will be followed through (another alternative is that my understanding is faulty in some way, which is certainly very possible). It is in Google's interest to get as many sites to comply as it can (it should be easier to evaluate if everyone follows your guidelines).

But it is not in Google's interest (I don't believe) to punish otherwise good sites that do not comply (by lowering their rankings in search results). This is not in Google's interest because then worse results are shown to users. In addition, it is not in Google's interest to ignore valuable information ("votes by authoritative sites") even if those sites don't play exactly by all Google's rules. However, in order to convince people that they have to follow Google's guidelines, I can certainly see people at Google making a judgment that, while some people might get mad at us, it is worth it if we can get more compliance to make our job of picking the best results easier. As long as those degraded results were still the best results available (and useful) it actually wouldn't have a negative impact. But I don't believe their lead is so great that they can degrade the results much without losing market share.

Anyway this is an interesting topic to think about.
