July 13, 2018  |  Friday words #129

It's Friday the 13th! But it's Friday! I'm conflicted.

Friend Nancy alerted me this week to an interesting term: search void, also known as data void. This describes a peculiar weakness, you might call it, of how web search results are ranked.

It might help to know that search ranking (or PageRank, as Google calls it[1]) works by counting how many pages link to a specific page. The more pages link to a specific page, and the more "authoritative" those linking pages are, the higher the page appears in the search results. An "authoritative" page here is one that itself ranks high. If a well-known, high-traffic blogger links to one of your blog posts, your post will get a big rankings boost.[2] Something similar happens on Twitter: if someone with tons of followers retweets one of your tweets, many people will see and possibly retweet your original.
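For the curious, here is a minimal sketch of that link-counting idea in Python. It is a toy version of the textbook PageRank iteration, not Google's actual system; the page names, the damping value of 0.85, and the number of iterations are all assumptions picked for illustration.

```python
def page_rank(links, damping=0.85, iterations=50):
    """Toy PageRank. `links` maps each page to the list of pages it links to."""
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with every page equally ranked.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical mini-web: both posts have exactly one inbound link,
# but "your_post" is linked from a blog that is itself well linked.
links = {
    "big_blog": ["your_post"],
    "small_blog": ["big_blog"],
    "fan_1": ["big_blog"],
    "fan_2": ["big_blog"],
    "tiny_blog": ["other_post"],
    "your_post": [],
    "other_post": [],
}

ranks = page_rank(links)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(f"{page:12s} {score:.3f}")
```

In this made-up web, your_post and other_post each have a single inbound link, yet your_post comes out on top because its one link comes from a page that many others point to. That is the "authoritative" effect in miniature.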

The idea is a kind of digital crowdsourcing: the internet at large decides which pages are the best, and those rise to the top of the search results. A flaw can result, however, if a lot of content is produced and cross-linked about a topic, but that information is one-sided or niche. An article in Wired that describes this uses the example of vitamin K shots for newborns. A passionate anti-shot community has produced a lot of content warning of the dangers of these shots. There is not (or was not) a corresponding community of passionate pro-shotters, so for a period, if you searched for info about vitamin K for newborns, you hit a data void: the top-ranked search results represented a kind of skewed data sampling. The anti-shot content showed up at the top of the search listings, and people presumably assumed it was the "best" information, even though it didn't represent the majority view about the subject.

As our information sources become more siloed, we're all going to become more subject to search/data voids. I suppose the first defense is to know that there's a word for the phenomenon.

For origins, a fun one that I learned from Jonathon Owen. In English, we got the word lettuce from Old French, and there are cognates like lechuga in Spanish. (Hold that thought.) It gets more interesting when we go further back. In Latin, the name was lactuca. The lac- part means "milk", because wilder members of the lettuce family have milky juice. That lac particle is what you see in lactate and lactose, and its descendants turn up in caffè latte and café au lait. (In Spanish, milk is leche, which hey look, is right there in lechuga.) A Greek cousin of lac also shows up in the word galaxy/galactic, which comes from the Greek name for the Milky Way, itself built on gala, the Greek word for milk. Got milk? Yes you do.

[1] PageRank is a satisfying lexical intersection of the term web page and the name Larry Page, one of Google's founders.

[2] This statement is only mostly true.

