If you look up when the last five movies from Jim Carrey were released, and were able to sneak a peek at Google’s query logs, you’d see that searches for Jim Carrey spiked on those dates.
The same holds for Ben Stiller, Edward Norton, Leonardo DiCaprio, and Tom Hanks.
We know this from a footnote in a recently published paper from researchers at Google.
The authors of “Gazpacho and summer rash: lexical relationships from temporal patterns of web search queries” checked whether there was a time-based relationship between searches for those movies’ names (and their release dates) and searches for the names of those actors.
It sounds obvious that there would be, but it’s interesting to see actual data from Google that explores relationships like that.
Relationships between Queries Based upon Time
The researchers from Google’s Zurich office also looked for other kinds of relationships based upon time for other queries, and came up with a number of different types of relationships.
For example, searches for “gazpacho” and “summertime” tend to rise and fall in Google’s query logs at around the same time, since both are warm-weather phenomena.
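One simple way to picture this kind of time-based relationship is to correlate how often two queries are searched over the same periods. The sketch below is only an illustration and not necessarily the measure used in the paper: the monthly counts are made up, the "eggnog" series is a hypothetical contrast case that is not from the paper, and Pearson correlation is just one reasonable choice of similarity score.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long series of query counts."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no temporal signal
    return cov / sqrt(var_x * var_y)


# Made-up monthly query counts (January to December), for illustration only.
gazpacho = [5, 6, 9, 14, 22, 35, 48, 45, 28, 14, 7, 5]
summertime = [8, 9, 12, 20, 30, 44, 55, 52, 33, 18, 10, 8]
eggnog = [20, 4, 2, 1, 1, 1, 1, 1, 2, 6, 25, 60]  # hypothetical winter query

print(round(pearson(gazpacho, summertime), 3))  # close to 1: rise and fall together
print(round(pearson(gazpacho, eggnog), 3))      # negative: opposite seasons
```

A score near 1 suggests that two queries peak and fade together, which is the sort of signal that could feed a query suggestion. A score near zero or below suggests no shared seasonal pattern.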
Might Google be able to use this kind of information to help searchers in the form of query suggestions? That’s one of the questions that the researchers pose in the paper.
Part of their research also involved tracking patterns and trends during real-time search.
While reading this document, I asked myself whether an understanding of these kinds of relationships might also help people who create content for websites.
Semantic Relationships between Temporally Similar Searches
For this study, the researchers limited the phrases they looked at to terms found in Princeton’s WordNet 3.0, so their results aren’t quite a reflection of what they might have found had they used query terms commonly searched for on the Web. But the study did yield some interesting results and ideas, and is worth spending some time with.
What I found most interesting was their descriptions of the kinds of relationships they uncovered.
Here’s a quick rundown:
True synonyms – words that mean exactly the same thing, such as november and nov, or car and automobile.
Variations of people’s names – Where a person is known by their first or last name or by a title, such as john lennon and lennon, or Barack Obama and President Obama.
Geographically-related terms – Locations that are close to each other, such as Manhattan, Brooklyn, Bronx…
Synonyms of location names like New Jersey and Jersey.
Derived words like New York and New Yorker.
Generic word optionalizations – Where a shortened version of a word or phrase most commonly means the same thing as the longer version, such as Spanish inquisition and inquisition.
Word reordering – Where related phrases use the same words in a different order, such as oil palm and palm oil.
Morphological variants – Where a phrase may vary slightly but be very related, such as station of the cross and stations of the cross.
Acronyms – Such as National Aeronautics and Space Administration and NASA.
Hyperonym-hyponym – Pairs of words that are related in the way that scarlet, crimson, or rust are related to the word red.
Sibling terms in a taxonomy – Terms in a classification that are on the same level. For example, sibling terms in a classification of citrus fruits might include oranges, grapefruit, lemons, and limes.
Co-occurring events in time – Examples can include movies that may have been released at the same time, or words that appear in the same movie title, such as quantum and solace, which show up in the James Bond film Quantum of Solace.
Topically-related terms – Take a specific topic, and find terms or phrases that might be closely related, such as teammates on the New York Yankees – Alex Rodriguez, Derek Jeter, Mark Teixeira, etc. Or Boston Tea Party might be seen as topically related to American Revolution, Samuel Adams, and British East India Company.
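A few of the relationship types in this rundown can be spotted from the words alone, without any time data. The sketch below is my own rough illustration and not the paper's classification method: the function name, labels, and tiny stopword list are all assumptions, and relationships that need outside knowledge (true synonyms, hyperonyms, sibling terms, topical relations) or stemming (morphological variants) simply fall through to "other".

```python
# Small, assumed stopword list used only when building acronym initials.
STOPWORDS = {"and", "of", "the"}


def initials(words):
    """First letters of the words, skipping stopwords."""
    return "".join(w[0] for w in words if w not in STOPWORDS)


def surface_relation(a, b):
    """Return a rough label for how two query phrases relate on the surface."""
    wa, wb = a.lower().split(), b.lower().split()
    if wa == wb:
        return "identical"
    if sorted(wa) == sorted(wb):
        return "word reordering"
    # Acronym: one side is a single token equal to the other side's initials.
    for short, long_ in ((wa, wb), (wb, wa)):
        if len(short) == 1 and len(long_) > 1 and short[0] == initials(long_):
            return "acronym"
    # One phrase drops words from the other (optionalization, name variants).
    if set(wa) < set(wb) or set(wb) < set(wa):
        return "shortened form"
    return "other"


print(surface_relation("oil palm", "palm oil"))
print(surface_relation("National Aeronautics and Space Administration", "NASA"))
print(surface_relation("Spanish inquisition", "inquisition"))
print(surface_relation("gazpacho", "summertime"))
```

Pairs like gazpacho and summertime share no words at all, which is exactly why a time-based signal is interesting: it can surface relationships that surface-level checks like these would miss.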
The paper goes on to explore how useful these types of relationships might be for creating query suggestions for searchers, as well as for classifying queries in ways that a search engine might be able to use. Briefly, queries that tend to appear around the same time in search engine logs may be useful for creating query suggestions. They may be less useful for categorizing queries.
The results are worth looking at, and the ideas behind these relationships are also worth keeping in mind during keyword research, when creating content for websites, and when searching for information on the Web.