When people talk about why Google is so successful and so powerful, they often talk about Google’s employees making products people love. They say that Google hires all the smartest people and gives them free rein to focus on the world’s hardest problems with near-infinite resources at their disposal. The theory is that Google employees aren’t just the best, they are almost a different breed of people, who just can’t help but come up with ideas that make the company more and more money. And with competition just a click away, consumers constantly have a choice about whether to use Google. Isn’t it nice, then, that they always seem to come back to Google?
During the Big Tech antitrust congressional hearing, we saw what some Google employees actually do to get ahead and keep consumers coming back to Google. The Representatives tore into the anticompetitive practices that Google and the rest of the tech companies used to increase their market share and push out their competitors. We all heard stories about how Google stole content from smaller websites, how employees at Google worked to stop traffic from flowing out to competing websites, and how practices in Google’s advertising business looked very much like insider trading. The Representatives did a fantastic job of showing how the culture within Google drives employees toward anticompetitive actions in seemingly all of the markets that Google operates in, and how Google has worked to keep consumers from being presented with any choice besides Google.
The investigations and conversations around Google’s business practices we are having right now are incredibly important. But we find ourselves thinking that there is something missing in these discussions about whether and how to regulate Google. The focus has been on the brilliant innovations and the anticompetitive actions of Google employees, as well as on consumers’ persistent knack for using Google products no matter what happens. We keep asking ourselves, though, why there aren’t more search engines seriously competing with Google. Why can’t the other search engines catch up? Why don’t other search engines start nipping away at Google’s extravagant profit margins? Why aren’t there more Googles?
To better introduce ourselves, we’re a group of people, some of us working in tech and others in politics, who started asking ourselves these questions and more about two years ago. We have been on the hunt ever since for more satisfying reasons for why Google is such a dominant force on the internet. And we think we found something really interesting.
What we found, and what this website details, is that Google has some pretty significant advantages when it crawls the web. Crawling the web is the process where a search engine goes out to collect the documents it will later show to people when they query the search engine. Search engines do this ahead of time because collecting all that information is a long and time-consuming process, and they cannot do it on the fly when people send them queries.
What we knew from some of our experiences as website operators was that it’s expensive for a website to get crawled too much. There’s no dollar figure we can put on the cost of being crawled, but every website operator has had to block an overeager crawler that was putting too much load on the server. So there is a natural incentive for website operators to limit the amount that they are getting crawled.
How this plays out in practice is that, in aggregate, many website operators end up only allowing the web crawlers from major search engines to crawl their websites. This saves on server and bandwidth costs, and the other search engines don’t send the websites enough traffic to make them much money anyway. So there is a winnowing effect on how many search engines can even begin to enter the market and try to compete with the major search engines.
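One common way an operator expresses this kind of allow-list is a robots.txt file. The sketch below is only an illustration under stated assumptions: the robots.txt contents, the example.com URL, and the crawler name NewSearchBot are all hypothetical, and real operators may instead block crawlers at the firewall or server level. It uses Python's standard urllib.robotparser module to show how a policy that admits only the major crawlers looks from the point of view of a newcomer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the two major crawlers may fetch everything,
# every other crawler is turned away from the whole site.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

User-agent: *
Disallow: /
"""


def crawl_permissions(robots_txt, user_agents, url):
    """Return {user_agent: allowed} for one URL under the given robots.txt."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines; no network access is needed.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in user_agents}


if __name__ == "__main__":
    # "NewSearchBot" stands in for any new or small search engine's crawler.
    agents = ["Googlebot", "Bingbot", "NewSearchBot"]
    result = crawl_permissions(ROBOTS_TXT, agents, "https://example.com/articles/1")
    for agent, allowed in result.items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Under this hypothetical policy the newcomer is blocked before it has fetched a single page, which is the winnowing effect in miniature: the rule says nothing about the quality of the new search engine, only that it is not one of the majors.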
We found a quote from a webmaster that was left on a blog post over a decade ago that sums up this whole dynamic perfectly:
As a webmaster I get a bit tired of constantly having to deal with the startup crawler du jour.
From law firms looking for DMCA violations to verticals search engines, to image aggregators, to company intelligence resellers… It feels to me that everybody and their brother has gotten into spidering sites.
With 10,000s of pages that have content that is only relevant to a targeted audience who is perfectly able to find us on the majors, I do not hesitate to block (and possibly ban) when I see an aggressive crawler that does not provide me or my customers with direct benefits.
– A comment by “Cuili banning webmaster” on “Cuill is banned on 10,000 sites” from Skrentablog
That the crawlers from the major search engines, and Google in particular, have access that other smaller and newer search engines do not is common knowledge in the software industry. It’s not a secret; people have been talking about it for a long time. What people have not done is talk about these advantages in terms that politicians, regulators and economists understand. Regulators don’t realize the political and economic impacts these advantages have on the search engine marketplace and the internet as a whole.
For the past two years we have been researching this winnowing effect, trying to find a way to put our professional experiences with website operations into political and economic language. And now, after some studying, we can state simply that crawling the web is a natural monopoly. Natural monopolies have long been objects of study for economists and objects of regulation for politicians, and we think web crawling should be treated the same way.
What we want to add to the discussion about how and whether to regulate Google is an argument that focuses on the impact of the choices made by website operators. In addition to focusing on the choices made by the thousands of Google employees and billions of consumers to explain Google’s success, we want to also examine the choices made by the millions of website operators that Google interacts with through web crawling. We think that these millions of tiny decisions that the website operators make about who gets access to what on their websites have aggregated into an avalanche of advantages for Google that is insurmountable for others who are trying to build competing search engines.
We’ve published on this website what we have figured out so far and we’re going to keep publishing as we figure out more. We’ve found some pretty good evidence that crawling the web is a natural monopoly that redounds to Google’s benefit and we’ve got some big plans about where to look next to prove that out further. We’re launching this club so that anyone who wants to can contribute to the fight against Google’s power over the internet.
We can challenge Google’s power and we can win. Nobody is coming to do this for us and we have to make the fight ourselves. We’re just a bunch of random, stubborn people on the internet trying to make something happen together. We want to see the fruits of the information revolution truly benefit everyone, not just the people who got there first. So, to all of you reading this, we invite you to come on and catch the faith, join the club and join the fight.