Google Hiding Most Relevant Results in Site-Specific Index
I recently wrote a post entitled “Best PayPal Alternatives” in which I asked for advice on choosing a competitor to PayPal. Not only was I bummed not to get any suggestions from my readers, but I also just discovered that Google is “gaming” search results on my own site (see also: this happening on Cryptogon).
From the search page on my domain, the query “best paypal alternative” (typed without quotation marks, and with either the singular or plural form of the last word) yields no result pointing directly to that page. The page itself was written over six days ago, which in Google’s search-indexing terms is like a few centuries’ worth of data collection. And I’ve seen Google add results with direct links to individual pages on my site in less than 24 hours many, many times over, so I know they can do it.
So what conclusions could we draw from these simple observations? The most obvious hypothesis would be that Google is actually receiving payouts from companies to intentionally bury relevant results. Without access to large non-public sets of Google’s data, you, the information consumer, would never be able to know. The information simply wouldn’t be available, ever. And you’d just be expected to trust a company whose kitschy slogan is “Don’t be evil.” (Not surprisingly, my post entitled “Don’t be evil” was also well buried in the site-specific index.) Oh wait, that’s exactly the position we find ourselves in now. At least we don’t have these computers hooked right into our central nervous system… yet.


March 29th, 2008 at 8:10 pm
[…] Getting your content scraped online, I guess, gives you some kind of weird, convoluted street cred. All it really means, though, is that your website is on some list in some database as a producer of original or semi-original content. That may not even be how “they” define you as an appropriate data source, but it seems like a fairly logical reason. So you have a list of sites producing content and blasting it out in all directions, over RSS and on the Google-controlled adwebs, and then you have some spam algorithms which pluck out sections of your content around highlighted keyword clusters (which somebody somewhere is paying money through some middle man to promote, pushing and pulling the public meaning of words) and then push those out through a vast network of spam-ridden URLs and strange foreign-esque blogs ridden with trashy Google advertisements. It’s the seedy underbelly of the internet. […]
April 2nd, 2008 at 8:42 pm
I wasn’t able to reproduce your results. When I searched for “best paypal alternatives” the first result was the page you claimed was missing from the index.
I also wasn’t able to reproduce the results on cryptogon.com.
Care to revise your hypothesis?
April 2nd, 2008 at 8:48 pm
I may have bumped the indexing up a notch in accuracy by specifically and directly linking certain terms to those pages again.
I also know that Google searches are weighted for individual users based on their history, as the algorithm develops a relationship with your search patterns.
Those are the two most ready guesses I’ve got at the moment. I’m open to other explanations on the matter and recognize my thinking could be quite flawed…
April 2nd, 2008 at 9:55 pm
It’s probably fruitless to discuss the finer details of Google’s search ranking algorithm, simply because that information isn’t available to us. It does seem clear, however, that they draw upon many sources of data when determining how to rank search results. I’d imagine the logic behind it is complex enough to be effectively inscrutable to an outsider. As you pointed out, Google could be performing any amount of trickery behind the scenes without our awareness. How would we know?
Still, claiming that something could be the case and claiming that it actually is the case are two different matters. Given the evidence you’ve collected, do you still think “the most obvious hypothesis would be that Google is actually receiving payouts from companies to intentionally bury relevant results”? Or would a more parsimonious explanation be that Google hadn’t indexed the “missing” page yet when you made this post?
April 2nd, 2008 at 11:19 pm
No, there’s no way it wasn’t indexed before this. That’s just factually inaccurate, because the phrase “best paypal alternatives” did appear on the first results page (under the URL http://www.timboucher.com/, even though it had been many days since the post lived there), but a link directly to that page could not be found within easy reach.
I guess that cuts to the heart of what I’m really trying to say here. And hopefully I can be more precise about this: if you have a reference to that phrase, but not a direct link the user can click to reach the page titled with that phrase, then you have what I think is a bad search result, no matter how you slice it. The user shouldn’t have to take multiple click actions with the same meaning-value equivalent (terminology?) to get to the desired result.
Put in shorter terms: two clicks to get to my content means nobody will read it. One click means somebody might read it. As a content provider using an integrated search solution, that difference is not only important, it’s crucial. It’s the difference between the consumer connecting to my information when it’s relevant or not.
If Google can’t provide that, no matter what the reason, then, well, I don’t know what. I think I made my point?