The views expressed by contributors are their own and not the view of The Hill

Copyright bots powered by a 1998 law threaten the public’s right to know

by Jonathan Hofer, opinion contributor 03/30/21 01:01 PM ET

A clever Beverly Hills police officer allegedly tried to thwart a police-reform activist’s attempt to post an unflattering video of the officer’s conduct by blasting Sublime’s song “Santeria” from his phone. Why would he do that? The officer apparently realized that since “Santeria” is copyrighted, Instagram’s automatic content filter would take down the video.

If true, this is troubling. Cases like this show that copyright systems in the digital age pose a unique challenge to civil liberties and the public’s right to know what their government is up to. This is not just an issue with Instagram and police: bots on many platforms can take down information that should be freely available to the public. Bad public policy is to blame.

Copyright bots are automated programs that search digital content to identify copyright infringements. Google’s Content ID for YouTube is a prominent example. According to a Google publication, 98 percent of YouTube’s copyright issues were handled through the automated Content ID system in 2018.

When a user uploads a video to YouTube, Content ID scans the contents against a database of files submitted by digital content owners. If the newly uploaded video matches a copyrighted file, the copyright holder can choose to make money from the video, track its viewing statistics, or have it taken down.
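The matching step can be pictured with a toy sketch. Content ID itself is proprietary, so everything here is an assumption made for illustration: the names `Policy`, `Reference`, `fingerprint` and `scan` are hypothetical, and hashing short overlapping windows of raw bytes stands in for whatever audio and video fingerprinting the real system uses.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    """The three options a rights holder can pick (names are hypothetical)."""
    MONETIZE = "monetize"
    TRACK = "track"
    BLOCK = "block"


@dataclass(frozen=True)
class Reference:
    """One file submitted by a content owner, reduced to its fingerprints."""
    owner: str
    policy: Policy
    fingerprints: frozenset


def fingerprint(content: bytes, window: int = 4) -> frozenset:
    """Hash every overlapping window of the content (a toy stand-in)."""
    return frozenset(
        hashlib.sha256(content[i:i + window]).hexdigest()
        for i in range(len(content) - window + 1)
    )


def scan(upload: bytes, references: list, threshold: float = 0.5):
    """Return (owner, policy) for the first reference whose fingerprints
    appear in the upload at or above the threshold, else None."""
    upload_prints = fingerprint(upload)
    for ref in references:
        if not ref.fingerprints:
            continue
        share = len(upload_prints & ref.fingerprints) / len(ref.fingerprints)
        if share >= threshold:
            return ref.owner, ref.policy
    return None


# Illustrative data only: a "song" playing in the background of other footage.
song = b"SANTERIA-CHORUS-RIFF"
database = [Reference("record label", Policy.BLOCK, fingerprint(song))]
flagged = scan(b"police stop footage " + song + b" more footage", database)
clean = scan(b"police stop footage only", database)
```

In this sketch `flagged` comes back as the label's block policy even though most of the upload is original footage, while `clean` is `None`. The filter only asks whether the reference appears in the upload, not why it is there.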

However, this process is open to exploitation. A post from NYU’s Journal of Intellectual Property and Entertainment Law has called copyright actions like YouTube’s “a tool for censorship, bullying, extortion.” As the Beverly Hills police officer showed, an alleged bad actor who wishes to cover up his misconduct faces a low bar. Other examples include the Azerbaijani government allegedly censoring journalists and a former candidate for Colorado Assembly filing multiple claims against a critic’s YouTube channel, resulting in the termination of the critic’s account twice.

Besides ill intent, automated copyright checks are prone to false positives. Some of Content ID’s lowlights include a 10-hour video of TV static, which received five copyright notices; a music teacher’s educational video of Beethoven’s and Wagner’s public-domain works; and a microphone check and the sound of a bird in the background of a man’s outdoor video.

How these bots treat public records is of particular concern. While official federal documents are properly exempt from copyright, public documents can still be censored by these programs. In a high-profile case, BookID, Scribd’s copyright filter, flagged the Mueller report on presidential election interference as infringing a copyright. In an article published by Quartz, Scribd acknowledged that BookID will sometimes incorrectly identify legitimate content as an infringement and will disable access even if no actual infringement occurred. The company website states:

“BookID contains fingerprints of educational textbooks and other works that contain long excerpts of classic literature, religious texts, legal documents, and government publications in the public domain. This occasionally results in the temporary removal of non-copyrighted, authorized, or public domain material from Scribd.com and the mobile app.”

Even live streams on YouTube are scanned for copyrighted content, and by Google’s own admission, “Your live broadcast can be interrupted even if you licensed the third-party content in question, or even if you restricted your broadcast to a territory in which you own all the necessary rights.”

Why do internet platforms create these bots if they don’t work well? The answer lies with the Digital Millennium Copyright Act. Passed in 1998, the DMCA shields platforms from liability for their users’ copyright infringement only if they promptly remove allegedly infringing material. This encourages preemptive takedowns of content by digital platforms. Human review of every upload is not viable, making trigger-happy bots the next-best option. Without oversight, bots cannot discern whether a copyright claim is made in good faith.

Wendy Seltzer, an attorney and a fellow at Harvard’s Berkman Center for Internet & Society, emphasized this point:

“Under the ‘safe harbors’ of the [DMCA], internet service providers are encouraged to respond to copyright complaints with content takedowns, assuring their immunity from liability while diminishing the rights of their subscribers and users … the law’s shield for service providers becomes a sword against the public who depend upon these providers as platforms for speech. [And the DMCA copyright] process for an accused infringer is limited.”

As the Beverly Hills cop incident and other cases show, the potential for using bots to reduce government transparency is real. Reform is warranted. Archaic intellectual-property laws like the DMCA must be amended to ensure that online copyright enforcement is never used to withhold vital government information from the public.

Jonathan Hofer is a policy research associate in the Center for Entrepreneurial Innovation at the Independent Institute in Oakland, Calif.
