
Research finds an increasing amount of older content is being removed from websites

The internet has changed dramatically in the last decade. While that is largely down to how people use it, the content they create and the rise of artificial intelligence, it is also due to the sheer number of web pages that are disappearing.
A new study by the Pew Research Center found that 38% of web pages that existed in 2013 “are no longer accessible a decade later”. This demonstrates “just how fleeting online content” has become in an era of “digital decay”.
Most people would think of the internet as a “place where content lasts forever”, but the reality is that “vast amounts of news and important reference content are disappearing”, said The Independent.
‘Algorithms are deciding’
These lost pages are “deleted or removed on an otherwise functional website”, said the study, and there are numerous reasons why sites do this.
Many will remove old content, “sacrificing it at the altar of Google” to appease its algorithm, said Simon Brew on Why Now. Google’s algorithm favours fast-loading websites, so stripping out “thin content” makes pages quicker to load. Some of these pages will be “duplicated material”, but some will be older, genuine news articles that “algorithms are deciding” are not worth keeping.
Other reasons websites shed pages include “sizeable restructures and redesigns” where archive material is deemed “not compatible” with new technologies and is subsequently removed.
It’s not just web pages either. According to the Pew study, one in five posts on X (formerly Twitter) is deleted, with 60% of these lost because the account that posted them has been suspended or deleted.
‘A digital space that feels abandoned and crowded at once’
It would be easy to think it doesn’t matter when pages “of little immediate value to anyone” are taken down, said Brew. But it means that vast swathes of news, government and Wikipedia pages now contain broken links to “important reference content”.
That matters for research and historical reference. It is also symptomatic of an internet where disinformation is becoming more prolific and it has become “harder to surface and verify information” from sources that once existed, said Wired.
That is also true across social media sites like X, where there is a “real” sense of “platform decay” and fleeting bot-generated content is creating a “digital space that feels abandoned and crowded at once”.
‘A reminder to be sceptical’
There is “some fightback” against content disappearing from the internet, said Brew, but it is coming from non-profit archive sites that will struggle to compete with big corporations whose decisions to remove content are “determined primarily by the pursuit of the pound or dollar”.
There is a continuing sense that as the “world wide web grows, it’s narrowing”, he added, and its “own active history is being removed, with not an eyebrow being batted”.
The removal of that content also makes the internet a less recognisable space, one “no longer for humans, by humans”, said Jake Renzella and Vlada Rozova on The Conversation. Online interactions are becoming ever more “synthetic” as AI and algorithm-generated content takes a greater hold.
The freedom for people to create and share on the internet is “what made it so powerful” in the first place, but the rapid removal of human-generated content is a “reminder to be sceptical” and to navigate the internet with a “critical mind”.