Exactly ten years ago, Google expanded its transparency report with a new section dedicated to DMCA takedown requests.
For the first time, outsiders were able to see which URLs copyright holders targeted and in what quantity.
The decision to make this information public was in part triggered by a rapid increase in removal requests. This was having an impact on the “free flow of information”, according to the search engine.
“We believe that openness is crucial for the future of the Internet. When something gets in the way of the free flow of information, we believe there should be transparency around what that block might be.”
According to Fred von Lohmann, Google’s Senior Copyright Counsel at the time, DMCA notices were skyrocketing.
“These days it’s not unusual for us to receive more than 250,000 requests each week, which is more than what copyright owners asked us to remove in all of 2009,” von Lohmann wrote.
From 250,000 Takedowns Per Week to 1,000,000,000 Per Year
In hindsight, this was just the start of a takedown explosion. A few years later Google processed more than 20 million DMCA notices per week, which translates to more than a billion per year.
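To put the weekly figures on the same footing, a quick back-of-the-envelope conversion helps. The sketch below assumes a 52-week year and treats the rounded weekly numbers quoted above as exact; the function and variable names are ours, purely for illustration.

```python
# Rough conversion of weekly takedown volumes to yearly totals.
# Assumption: 52 weeks per year, and the rounded weekly figures are exact.
WEEKS_PER_YEAR = 52


def weekly_to_yearly(per_week: int) -> int:
    """Scale a weekly volume to an approximate yearly volume."""
    return per_week * WEEKS_PER_YEAR


early_yearly = weekly_to_yearly(250_000)     # volume when the report launched
peak_yearly = weekly_to_yearly(20_000_000)   # volume a few years later

print(f"{early_yearly:,}")  # 13,000,000
print(f"{peak_yearly:,}")   # 1,040,000,000
```

In other words, 250,000 requests per week works out to roughly 13 million per year, while 20 million per week is indeed just over a billion.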
This growth curve eventually flattened and in recent years the takedown volume has started to decline. This is in part due to the various anti-piracy algorithms that push pirated content down in the search results.
Because Google downranks pirate site results, infringing content has become harder to find in the search engine. As a result, Google now processes ‘just’ a few hundred million DMCA requests per year.
After ten years of takedown transparency, we take a look at the totals thus far, which are quite impressive. Over the past decade rightsholders asked Google to remove 5.75 billion URLs that allegedly link to copyright-infringing content.
These takedown requests come from just over 300,000 different copyright holders. UK music group BPI is the most prolific sender. With 570 million reported links, it accounts for nearly 10% of all takedown requests.
Looking at the targeted domains, we see that 4shared.com is in the lead with 68 million reported URLs. Most of these were flagged several years ago. In recent years, the site has been flagged ‘only’ a few thousand times per week, with fewer than a million reported links per year.
The top five most targeted domain names are rounded out by the defunct site mp3toys.xyz, hosting platforms rapidgator.net, chomikuj.pl, and uploaded.net, as well as the unblocking proxy portal unblocksites.co.
Not All Reported URLs are Removed
The figures refer to the number of URLs that are reported, but not all of these are actually removed from the search engine. The stats also count duplicate reports, bogus claims, and URLs that are not indexed by Google.
For example, if we look at the reports from MindGeek’s “MG Premium”, we see that the company reported over 494 million URLs over the years. A little over half of these were actually removed by Google.
Of the remaining URLs, 128 million were not in Google’s index. These have been placed on a preemptive blocklist, to prevent them from appearing in search results later on. Another 70 million links were classified as duplicates, while nearly 7 million were rejected for other reasons.
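The MG Premium breakdown can be sanity-checked with some simple arithmetic. The sketch below treats the rounded figures quoted above as exact, which is an assumption ("over 494 million" and "nearly 7 million" are approximations), so the result is only indicative. The variable names are ours.

```python
# Approximate breakdown of MG Premium's reported URLs.
# Assumption: the rounded figures from the article are used as exact values.
reported = 494_000_000     # total URLs reported
not_indexed = 128_000_000  # not in Google's index (preemptively blocklisted)
duplicates = 70_000_000    # classified as duplicate reports
rejected = 7_000_000       # rejected for other reasons

# Whatever is left over is what was actually removed from search results.
removed = reported - not_indexed - duplicates - rejected
removed_share = removed / reported

print(f"{removed:,}")          # 289,000,000
print(f"{removed_share:.1%}")  # 58.5%
```

Under these assumptions, roughly 289 million URLs were removed, which matches the "a little over half" figure.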
Mistakes and Abuse
While these numbers are interesting by themselves, the biggest contribution of the transparency report is the ability for outsiders to spot faulty and abusive notices. This is possible because Google shares all reported links with the Lumen Database, which is managed by Harvard’s Berkman Klein Center.
Over the years this database has allowed us to spot thousands of problematic takedowns, ranging from honest mistakes, through automated takedown errors, to plain abuse.
There are numerous examples of mistakes we can mention. Microsoft once targeted the BBC, Wikipedia, and the US Government; movie studios asked Google to remove their own films; a French movie and TV show database targeted Netflix and Rotten Tomatoes, and so on.
With billions of reported URLs, it is no surprise that these errors happen, but by pointing them out in public, those responsible can be held to account. That has likely contributed to a higher takedown accuracy rate over time.
It will be interesting to see how takedown trends develop over the coming years. As long as Google continues its transparency report, we will surely keep an eye on it.