Is This the Real Life? Or Just a Semantically-Filtered Search?
We’ve come so far with technology that we now trust search algorithms to tell us about the world more than we trust our own senses. Instead of searching for information ourselves, we just sit back and let the algorithms show us their version of the "truth." Gone are the days of typing out clear questions and sorting through endless pages of useless info. What a relief.
The Grand Illusion of Personalization
Let’s say you search for “best startup ideas” (I promise I don't). What you’ll see isn't necessarily the objectively best ideas, but rather the ones that align with your search history, interests, and even your geographical location (thanks to that handy geotag). So, if you're an electrical engineering student at NUS, you’ll likely get suggestions that lean toward solar, materials, maybe even a social entrepreneurship idea or two—because, clearly, that’s all you’re interested in. Or at least that’s what the algorithm has "learned."
From Keywords to Context: How Semantic Search Works
Natural language processing (NLP) helps the algorithm understand the nuances of language, like synonyms, word relationships, and sentence structure. Machine learning enables it to refine results based on user behavior over time—learning from millions of search queries to continuously improve accuracy. Knowledge graphs, which store interconnected facts about people, places, and things, help link related concepts, so when you search for "Python," it can tell whether you mean the programming language or the snake.
By analyzing not just the text, but the meaning and relationships behind it, semantic search aims to return results that are contextually relevant.
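The "Python: language or snake?" disambiguation above usually comes down to comparing word vectors. Here's a minimal sketch of the idea, using made-up toy vectors (real systems learn embeddings from billions of documents; these numbers are purely illustrative):

```python
import math

# Toy word vectors -- hypothetical, hand-picked numbers, not a real embedding model.
# In this toy space, "python" has been placed nearer the programming sense.
EMBEDDINGS = {
    "python":      [0.9, 0.8, 0.1],
    "programming": [1.0, 0.9, 0.0],
    "snake":       [0.1, 0.0, 1.0],
    "reptile":     [0.0, 0.1, 0.9],
}

def cosine(a, b):
    """Cosine similarity: how closely two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def disambiguate(query_word, context_words):
    """Pick the context sense whose vector sits closest to the query's."""
    q = EMBEDDINGS[query_word]
    return max(context_words, key=lambda w: cosine(q, EMBEDDINGS[w]))

print(disambiguate("python", ["programming", "reptile"]))  # → programming
```

Swap the toy vectors for ones learned from your search history, and you can see how "what Python means" quietly becomes "what Python has meant to you."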
Confirmation Bias as a Service (CaaS™)
We’re all familiar with confirmation bias—the very human tendency to seek out information that validates what we already believe. But we’ve outsourced that task now, haven’t we? Semantic search ensures that, whether you realize it or not, you’re consistently being fed a version of reality that aligns with what you’ve previously searched. The more you look up things like “deployable robust communications network for UGVs,” the more convinced you become that the world of autonomous amphibious unmanned ground vehicles is the next big thing (not that it isn’t, of course).
The Algorithm’s "Best Guess" at Reality
The truly fascinating (and perhaps troubling) thing is that this isn’t just happening at the surface level. As semantic algorithms get better at understanding language, meaning, and context, they start making more complex associations, identifying patterns that the average human wouldn’t even consider. They attempt to link the dots between concepts, creating a web of meaning that is, in essence, the algorithm’s "best guess" at how reality is structured.
The semantic search algorithm is essentially calculating the probability that a certain search result is what you're looking for. And more often than not, it’s right. But being "right" isn’t the same as being correct.
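That "probability that a result is what you're looking for" can be sketched with a softmax over relevance scores. The titles and scores below are invented for illustration; real ranking models are vastly more elaborate, but the shape is the same: raw scores in, a tidy probability distribution out.

```python
import math

def softmax(scores):
    # Turn raw relevance scores into probabilities that sum to 1.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidate results with hypothetical relevance scores.
results = ["autonomous UGV survey", "amphibious robot demo", "garden hose review"]
scores = [2.1, 1.8, -0.5]

probs = softmax(scores)
ranked = sorted(zip(results, probs), key=lambda rp: rp[1], reverse=True)
for title, p in ranked:
    print(f"{p:.2f}  {title}")
```

Note what the probabilities are: confidence that *you* will be satisfied, not confidence that the result is true. The two quantities are optimized quite differently.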
The algorithm isn’t giving you the definitive truth; it’s giving you the most satisfactory truth. And in doing so, it quietly shapes your understanding of the world. It’s like Schrödinger’s cat, but instead of a feline caught in a quantum paradox, it’s your perception of reality that exists in a superposition of meanings, filtered by algorithms until observed by you in the search results.
Living in the Matrix (But It’s Really Well-Indexed)
If we only ever see the search results tailored to our preferences, how much of the world are we missing out on? More importantly, how much of what we perceive is just the result of an algorithm’s attempt to make our lives easier?
Should our lives be easier? Reminds me of the "hard times create strong men" analogy.
It's an engineering problem at its core, really. Given a vast and complex dataset (reality), how do you efficiently filter out the noise and deliver only the most relevant signals? That’s what semantic search algorithms do—they process the noise, isolate the signal, and present it to you as if it were objective truth. But noise isn’t just irrelevant data—it’s an integral part of the system. Strip away enough noise, and you risk losing sight of the system’s full complexity.
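The filtering step is brutally simple when you write it down. A minimal sketch (the threshold, titles, and relevance numbers are all assumptions for illustration):

```python
def filter_signal(results, threshold=0.5):
    """Keep only results whose relevance clears the threshold.

    Everything below it -- the 'noise' -- is silently discarded,
    which is exactly where the loss of complexity happens: the
    user never learns what was filtered out.
    """
    kept = [r for r in results if r["relevance"] >= threshold]
    dropped = [r for r in results if r["relevance"] < threshold]
    return kept, dropped

# Hypothetical result set with hypothetical relevance scores.
results = [
    {"title": "Mainstream take",  "relevance": 0.92},
    {"title": "Adjacent field",   "relevance": 0.55},
    {"title": "Contrarian view",  "relevance": 0.30},  # never shown to the user
]

kept, dropped = filter_signal(results)
```

One scalar comparison decides what counts as reality; the `dropped` list exists here only because we kept it for inspection. A production ranker doesn't.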
And yet, that’s exactly what we’ve done. In our quest for efficiency, we’ve optimized the world right down to bite-sized snippets of relevance, delivered at lightning speed. But at what cost? How much of the vastness and complexity of reality have we sacrificed for the sake of convenience? In trying to make the world more accessible, we’ve possibly created a machine that’s quietly filtering it out.
But at least it’s fast.