I got an interesting press release today. It wasn’t explicitly related to web hosting, but it got me thinking. And, time being money, I try to spend my thinking time blogging.
The press release was from the University of Toronto, the venerable Canadian institution located just a few blocks from my house. It describes the development, by computer engineering graduate student Alex Karpenko, of a system for searching and cataloguing video data.
The system is called Tiny Videos, says the announcement. And it could be used to ensure that content on a site like YouTube is properly labeled and therefore easier to find. And here’s the particularly interesting part:
“Since the Tiny Videos system can quickly search for specific content with large video collections, it can quickly identify videos that violate copyright infringement and alert copyright holders.”
It’s interesting when something comes along that forces an imperfect system, in this case YouTube, to function more efficiently. Now, we can probably all agree that YouTube doesn’t live or die by copyrighted content. But ask any copyright-concerned organization (say, the MPAA), and they’ll more than likely tell you that YouTube has, at the very least, a “problem” with copyright infringement.
This, I suppose, is mostly because, like any social media site, YouTube has a “post it now; sort it out later” policy when it comes to user-generated content. The result is lots of users posting lots of copyrighted material to YouTube, all the time.
The way I understand it, YouTube’s policy is basically to pull down any clip about which it receives a relatively believable copyright infringement claim. The reason is that, with so many users posting so much material all the time, it’s the only logistically reasonable way to police content. It’s not perfect (from the copyright holder’s point of view), but it seems to be enough to have kept YouTube safe DMCA-wise.
What I find so interesting is the possible elimination of the “this is the best we can do” argument.
The cynical side of me recognizes the benefit (in terms of pageviews and advertising revenues) of having the copyrighted material online and generating traffic while the takedown notice is still being drafted.
(Admittedly, I can’t think of a single copyright-kosher clip on YouTube I have any interest in watching. My tolerance for crying teenagers and guys getting hit in the genitals is pretty limited.)
That not-quite-a-conspiracy-theorist side perks up whenever I hear about a circumstance in which commerce comes into conflict with what you might call “principles.” Certain systems and products have “flaws” that are actually sources of revenue for somebody, if not for the company that produced the system or product.
I thought of this a few weeks ago, just before buying a $35 case for my iPhone, as I stood in the Apple store staring at the huge wall of products designed just to protect the various Apple devices from scratching. “If Apple ever did manage to make iPods scratch-proof,” I thought, “they’d be putting a lot of people out of business.”
A couple of more relevant examples:
- Back in 2005, web host AIT sued Google, claiming that by not doing enough to prevent click fraud on its advertising network, the search engine was complicit in the business of generating fake traffic. In this case, the benefit to Google would be its cut of every click.
- Last month, a few activists managed to get web host McColo’s backbone providers to cut off the company’s connection, effectively shutting down a company that was notorious for hosting equipment involved in some of the most prolific spam-generating botnets. The result of the shutdown was a temporary dip in spam volume that was reported by some filterers to be as much as 75 percent. One of the interesting implications of this story is that while cutting spam off at the source appears to be a very effective way to curb it, filtering it is where the money (and the resultant very large industry) is.
So is there a web hosting connection? I’d like to think so. I’d imagine that while McColo is an extreme example, it wasn’t the only web hosting company out there making money from customers who were using its servers to do possibly illegal things.
The fact of the matter is there’s only so much you can do to police content on your servers, particularly when we’re talking about thousands of customers. The sort of nebulous ethical question is: how many customers (and how much money) would it cost you if you were somehow able to eliminate every bit of copyrighted or illegal material from your network? And does the potential that it’s “a whole lot” have any effect on the quality of your policing?
Well, you’re not running YouTube (unless you are running YouTube, in which case, treat this paragraph as hypothetical), so that’s not a line of questioning you realistically have to answer to any time soon.
But the advancing technology evidenced by Mr. Karpenko’s press release suggests that eventually, technology is going to eliminate the ethical question from the process.
That is, one day somebody’s going to hand you a piece of software that will find everything on your network that shouldn’t be there. Or they’ll hand it to the copyright holders. Either way, you won’t have to fret about starting up that “customer elimination” division after all.











