How much content is organic?
YouTube content creators sometimes break their own fourth wall by admitting (often for comic relief) that the algorithm made them do it. YouTube’s algorithm is poorly understood, perhaps even by the Google engineers who develop it; the average YouTube viewer has no clue how the algorithm works, or why they see recommendations for videos that seemingly have nothing to do with their watch histories.
To the outside observer, the algorithm is mostly a black box. Content creators, however, have at least some insight into what it is doing: by analyzing the performance metrics of their past uploads, they can start to infer which kinds of content resonate with the algorithm. One might say, “the algorithm isn’t to blame; viewers decide what they like and don’t like”, but this misses much of what the algorithm itself is doing behind the scenes.
In a 2018 interview, software engineer Guillaume Chaslot spoke about his time at YouTube, telling The Guardian, “YouTube is something that looks like reality, but it is distorted to make you spend more time online. The recommendation algorithm is not optimizing for what is truthful, or balanced, or healthy for democracy.” Chaslot explained that YouTube’s algorithm is always in flux, with different inputs given different weights over time. “Watch time was the priority. Everything else was considered a distraction”, he told The Guardian.
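To make the idea of shifting weights concrete, here is a minimal, purely illustrative sketch. The signals, weights, and videos below are invented for this post; YouTube’s real ranking model is far more complex and not public.

```python
# Purely hypothetical illustration of a weighted ranking score.
# The signals, weights, and videos are invented; they do not reflect
# YouTube's actual, non-public recommendation model.

def rank_score(video, weights):
    """Weighted sum of normalized engagement signals; higher ranks higher."""
    return sum(w * video[signal] for signal, w in weights.items())

clickbait = {"click_through_rate": 0.9, "fraction_watched": 0.2}
slow_burn = {"click_through_rate": 0.3, "fraction_watched": 0.8}

clicks_first = {"click_through_rate": 0.8, "fraction_watched": 0.2}
watch_time_first = {"click_through_rate": 0.2, "fraction_watched": 0.8}

for label, weights in (("clicks weighted", clicks_first), ("watch time weighted", watch_time_first)):
    ranked = sorted(
        [("clickbait", clickbait), ("slow_burn", slow_burn)],
        key=lambda item: rank_score(item[1], weights),
        reverse=True,
    )
    print(label, "->", [name for name, _ in ranked])
```

The only point of the toy example is that re-weighting the same signals flips which video gets surfaced, without anything about the videos themselves changing.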
YouTube’s algorithm behaves in ways that appear outwardly bizarre: recommending videos from channels with 4 subscribers, videos with zero views, or videos from spam and affiliate-link clickfarm channels. To an uninitiated third party, the algorithm seems obtuse, unpredictable, mercurial; then again, perhaps it is working as intended. Increasingly, it is the algorithm, not the audience, that directs the kinds of content uploaders make. This assumes, of course, that the uploader wants to monetize their content and collect ad revenue from high-performing videos.
A channel you subscribe to that normally uploads home repair videos may decide to try a new kind of video, in which the creator travels across the country to interview a master carpenter about his trade. The video, while well received by the channel’s audience, earns a fraction of the usual watch hours and far less “engagement”. That means lower ad revenue for the creator, and it also means the video is shown to fewer people outside the channel’s existing audience.
The content was good, but the reception was poor. Was the reception poor because people decided the video was a bit too long? Not relevant enough to home repair? Was Mercury perhaps in retrograde? No one really knows, not even the channel owner. The algorithm has decided, based on its unknowable metrics, that this is a bad video, and it won’t be going out of its way to promote it on the front page of YouTube.
As a result, that content creator, despite his love of the video he made and its subject matter, will mothball his plans for an “Interviews with a Master Carpenter” series because the money just isn’t there. Conceivably, the first video performed poorly simply by virtue of its newness, not because of any intrinsic flaw in the content. Maybe subsequent videos in the series would have done better. It doesn’t matter, though, because the algorithm has all but whispered to the channel owner, “don’t make this kind of video again”. So the creator returns to his normal fare, and watch hours, views, likes, and “engagement” go back up.
On a broad scale, this behavior would seem to have a chilling effect on speech itself. Algorithms and machine learning models play an outsize role in what kinds of content people make and what kinds of things we see. Is important work being taken out back and shot because the algorithm has concluded, based on historical data, that it won’t perform well? Or is the algorithm, even by chance, ensuring that videos it deems overly critical of a brand or company, or just generally problematic, won’t be seen again? Artificial intelligence can’t act out of conscience; it lacks self-awareness. All the same, human provocateurs can and do build their desires and agendas into these algorithms, giving AI the power to selectively dismiss and bury content it has been instructed not to like.
It’s important not to conflate this weaponization of AI with simple clickbait. While clickbait is problematic in its own right, it works as an incentive to publish certain kinds of content, rather than as a disincentive that keeps creators from making others.
And make no mistake: algorithmic activism is the weaponization of artificial intelligence. As we fall further down the trash-compactor chute of tech dystopia, we need to remember that the companies now wielding AI as a cudgel against certain content are the same companies that heralded AI as a net positive for humanity. Google’s former motto was “don’t be evil”; they have since removed that phrase from their corporate code of conduct, as if Google needed to make it any more obvious that they are evil. Regardless, there is no “if AI becomes a weapon”, because it already is one. The question is how bad we, and our elected representatives, will allow things to get before we place hard limits on AI’s scope in the public domain.
Print media hasn’t escaped the problem, either. Print, if anything, is a vestigial organ: it may not surprise you to learn that it follows whatever is in fashion in digital media. So, when you read your local newspaper and ask, “why is every story about a shooting or a new tech company promising to hire a whole bunch of people?”, remember the algorithm. The same algorithm that tells YouTube content creators to be very careful about publishing unpopular material. The algorithm itself defines popularity, so you’ll never really know whether the videos it deep-sixes are resonating with people; there are no guardrails, no checks and balances, and you can’t interrogate the black box.
When we factor in the use of AI to write articles, product reviews, and other content, we find we really do exist in an algorithmic hall of mirrors. How can we discern which videos or articles to trust if the entire game is rigged? The quest tech companies have embarked on with AI is fairly straightforward: do whatever is necessary to beat the opponent’s AI model into submission. Then the AI that emerges victorious gets to decide reality for the rest of us. Consider deepfakes: what if, in the future, an anti-establishment YouTube creator is framed for murder using an AI-generated confession the man himself never uttered? This isn’t a cheesy Blade Runner-style screenplay pitch; it’s already happening. Deepfakes have already necessitated using AI to sniff out phony and doctored content.
The Massachusetts Institute of Technology is currently running a deepfake detection experiment, work that has already been peer-reviewed and published in the Proceedings of the National Academy of Sciences of the United States of America. The new space race between large language models will only intensify, and our reality will be dragged along with it. Truepic has also developed a system to authenticate videos and images, aiming to guarantee the authenticity of digital master copies and thereby prevent the fraud and abuse that deepfakes engender.
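As a rough illustration of the general idea behind authenticating a master copy (this is not Truepic’s actual system, whose internals aren’t described here), content can be fingerprinted when it is captured and re-verified later; any alteration changes the fingerprint.

```python
# Generic "fingerprint at capture, verify later" sketch.
# This is NOT Truepic's implementation; it only illustrates the general
# concept of detecting alteration by comparing cryptographic hashes.

import hashlib
from pathlib import Path

def fingerprint(path: Path) -> str:
    """SHA-256 digest of the file, recorded when the master copy is captured."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def is_unaltered(path: Path, recorded_digest: str) -> bool:
    """True only if the file's current contents match the digest recorded at capture."""
    return fingerprint(path) == recorded_digest

# Hypothetical usage (filenames are placeholders):
# master_digest = fingerprint(Path("interview_master.mp4"))   # stored somewhere trusted
# is_unaltered(Path("interview_copy.mp4"), master_digest)     # False if the copy was doctored
```

In practice, a system like this also has to bind the fingerprint to a trusted record of when and where the content was captured, which is where the hard problems live.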
Deepfakes also promise to make corporate espionage easier and cyberattacks harder to prevent, thanks to the sophisticated phishing attacks the technology facilitates. Because of AI-generated deepfakes, the requirements companies must meet to obtain cybersecurity insurance will no doubt become more stringent. Cybersecurity insurers already require out-of-band authentication (OOBA) as a defense against deepfake and impersonation-based attacks on clients; such authentication strategies are only one piece of the puzzle in mitigating emerging deepfake threats, however. Additional software tools, authentication factors, user training, and advanced AI technologies will all become necessary components of protecting employees and clients from malevolent AI attacks.
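To sketch what out-of-band authentication looks like in practice (the names, threshold, and workflow below are invented for illustration; real deployments use dedicated identity and 2FA tooling), a high-risk request that arrives over one channel is confirmed over a second, pre-registered channel before anyone acts on it.

```python
# Illustrative sketch of out-of-band authentication (OOBA): a request that
# arrives over one channel (email, video call) is confirmed over a second,
# pre-registered channel before it is acted on. The names and threshold
# are hypothetical.

HIGH_RISK_THRESHOLD_USD = 10_000
REGISTERED_PHONE = {"ceo@example.com": "+1-555-0100"}  # contacts enrolled out-of-band, in advance

def confirm_via_phone(phone_number: str, request_summary: str) -> bool:
    """Placeholder for a callback to a known-good number (or a push to an authenticator app)."""
    print(f"Calling {phone_number} to confirm: {request_summary}")
    return False  # deny by default until a human confirms on the second channel

def approve_wire_transfer(requester_email: str, amount_usd: int) -> bool:
    if amount_usd < HIGH_RISK_THRESHOLD_USD:
        return True  # low-risk requests follow the normal workflow
    phone = REGISTERED_PHONE.get(requester_email)
    if phone is None:
        return False  # no enrolled second channel, so the request cannot be verified
    return confirm_via_phone(phone, f"wire transfer of ${amount_usd:,} requested by {requester_email}")

print(approve_wire_transfer("ceo@example.com", 250_000))
```

The design choice that matters is the second, independently enrolled channel: a convincing deepfake of a voice or face on the first channel does nothing to satisfy a callback placed to a number recorded before the attack began.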
The aim of this post isn’t so much to make you doubt your experiences as to encourage you to ask some pointed questions about the things you read and watch online. The algorithms that run social media will only grow more sophisticated, and so, too, must our responses. If we don’t treat artificial intelligence with due caution and subject it to much-needed scientific and legislative rigor, we will find ourselves in a frightening new reality that we won’t be able to escape.