Voice Queries and Visual Queries and Automated Assistants
I’m on the second day of a trip to New York City, giving presentations at SMX East on both the potential impact of mobile devices on the future of search, and on how reputation and authority signals might impact the rankings and visibility of authors and publishers and commentators on the Web.
My first presentation was in the “local and mobile” track of the conference as part of a session titled “Meet Siri: Apple’s Google Killer?” where I joined Bryson Meunier, Will Scott, Andrew Shotland, and moderator Greg Sterling in discussing the potential impact of Apple’s Siri and voice search on SEO and search.
When I read the title for this proposed session a couple of months back, I couldn’t help but start to draft a pitch to join in on the conversation. I’ve been carefully watching patents and papers from Google and Apple and others about inventions and interfaces that might transform the way we search in the future to one focusing more upon voice queries and visual queries and also transforming the way that people might share information and market businesses online.
Is Siri Smarter Than We Perceive it to Be?
The Siri we see now on the iPhone 5 and the iPhone 4S only hints at some of the potential capabilities of voice queries via this intelligent automated assistant. Apple’s name is on 2 patents at this point that describe what other features we might see in the future. Both patents are immense, at around 150 printed pages or so each.
I wrote about Apple’s Siri patent this January. A continuation patent sharing the same name and substantially the same description was published by the US Patent and Trademark Office (USPTO) last week – which sent many tech blogs into a frenzy of speculation about what it might mean. None of them actually looked at the claims section of the newer version, which was where the two differed, with the more recent providing more detail on how Siri might better take advantage of the context of searches.
At the heart of Siri is an Intelligent Ontology that attempts to understand the intent behind searches by understanding the domain or topic of a search, tasks that might be associated with that topic, phrases and terms that might help Siri act upon those tasks, information about entities that may also be related, vocabulary that can expand upon the topic further, and information and service providers who may be of assistance.
For example, someone may ask Siri about places to get Italian food nearby for dinner. Siri can search and provide some suggestions and may be able to directly book reservations if so instructed. Siri can find the names of possible restaurants, knows the difference between Italian food and Korean BBQ, and could work with services like OpenTable to make reservations. It takes a wide range of vocabulary, task management, and ability to interact to make those types of decisions.
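To make those pieces a little more concrete, here is a minimal Python sketch of how a domain, its vocabulary, its tasks, and a service provider might fit together. It is only an illustration of the idea, not Apple's design: the `Domain` class, the `interpret` function, the word lists, and the simple word-overlap matching are all assumptions of mine.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Domain:
    """One topic area in a toy ontology (every name here is hypothetical)."""
    name: str
    vocabulary: Set[str]                 # terms that signal this topic
    tasks: Dict[str, Set[str]]           # task name -> phrases that trigger it
    providers: Dict[str, str] = field(default_factory=dict)  # task -> service


RESTAURANTS = Domain(
    name="restaurants",
    vocabulary={"italian", "korean", "bbq", "dinner", "food", "restaurant"},
    tasks={
        "find": {"places", "nearby", "where"},
        "reserve": {"book", "reserve", "reservation", "table"},
    },
    providers={"reserve": "OpenTable"},
)


def interpret(query: str, domains: List[Domain]) -> Optional[dict]:
    """Pick the domain with the most vocabulary overlap, then its best task."""
    words = set(query.lower().replace("?", "").split())
    best = max(domains, key=lambda d: len(words & d.vocabulary), default=None)
    if best is None or not (words & best.vocabulary):
        return None  # no domain recognizes any word in the query
    # If no trigger phrase matches, max() falls back to the first task listed.
    task = max(best.tasks, key=lambda t: len(words & best.tasks[t]))
    return {"domain": best.name, "task": task, "provider": best.providers.get(task)}
```

Asked “Where can I get Italian food nearby for dinner?”, this sketch would settle on the `find` task in the `restaurants` domain. A request to book a table would map to `reserve` and hand off to the provider listed for that task.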
My presentation wasn’t intended to say bad things about Siri, but rather to look at the evolution of new approaches from Google that seem to be a response. I’m excited to see how Siri continues to develop in the future, and I’m hoping that its growth inspires new innovation from places like Google.
Google Glass, Google Now, and Google Field Trip
Google’s Project Glass hasn’t brought us products that have been released to the public yet, but it is potentially very transformative in that it supports not only voice queries but also visual queries of the type seen in Google Goggles.
I pulled an example from Google’s patent “Providing digital content based on expected user behavior” to describe the difference that I see between Google Now and Siri, based on a reading of the patents from both.
Google Now and Siri may both recognize that you like to go to the local baseball stadium every so often. Siri may determine that you like baseball, and may start showing you the scores for the local team. Google Now may also determine that you like baseball, but notice that you only go to games against a certain competitor of the local team, and start showing you scores for that competitor.
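A rough way to picture that difference is the Python sketch below. It is purely illustrative and is not taken from either company's patents: the function names, the `min_share` threshold, and the team labels are assumptions. The first function treats any stadium visit as interest in the home team, while the second looks at which opponent the visits have in common.

```python
from collections import Counter
from typing import List, Optional


def coarse_interest(opponents_seen: List[str], home_team: str) -> Optional[str]:
    """Any stadium visit at all is read as interest in the home team."""
    return home_team if opponents_seen else None


def pattern_interest(opponents_seen: List[str],
                     min_share: float = 0.75) -> Optional[str]:
    """Return the opponent that accounts for most visits, if one dominates.

    `min_share` is an assumed cutoff: the fraction of visits that must
    involve the same opponent before we treat it as the real interest.
    """
    if not opponents_seen:
        return None
    opponent, count = Counter(opponents_seen).most_common(1)[0]
    return opponent if count / len(opponents_seen) >= min_share else None


# Hypothetical history: the opponent at each game this person attended.
visits = ["Rivals", "Rivals", "Rivals", "Rivals"]
```

With that history, `coarse_interest(visits, "Home Team")` points at the home team, while `pattern_interest(visits)` points at the visiting side that shows up in every trip.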
Google Now and Google Field Trip both show how advertisements and offers might make their way onto the mobile screen. Google Now does this by learning your commute to work, so that it can warn you about traffic congestion while you still have a chance to plot an alternative route, and by learning when you detour to a local coffee house on certain days, so that it might show you coupons for that coffee house, or others near it. Google Field Trip allows you to decide what kind of content you might see, including historical marker data, locations where movies were filmed, and even advertisements for local businesses as you get near to them.
Google Acquisitions and Alternative Interfaces
I take a detour in the presentation to show off some of the patent acquisitions that Google has made in the past couple of years that would support the use of Google’s Project Glass. One set includes Terahop patents that would make it easier for Google to know the location of people indoors, including large indoor spaces like shopping malls and airports.
Google also recently acquired a number of patents from deCarta, which deCarta (then using the name Telcontar) had acquired in the early 2000s when it bought a company named Gravitate.
The company appears to have been ahead of its time, focusing upon the use of social, local, and mobile technologies to connect people near each other, to connect people nearby who might have common interests, and to set off alerts based upon a location for advertisements and for location-based events. These are all capabilities that would appear to be ideal for Android and Google Glass.
A number of patents from Google that deal directly with the kind of heads up displays at the heart of Project Glass describe interfaces such as a simple touch screen on the side of the glasses and a user interface that changes based upon your activities. That interface would be simpler and sparser while you were running, jogging, or driving, more detailed while you were walking, and even more so while you were standing still or seated.
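The activity-based interface can be pictured with a small Python sketch. It assumes the device already knows which activity you are engaged in. The `Detail` levels, the activity labels, and the item limits are all hypothetical choices of mine, not details from the patents.

```python
from enum import IntEnum
from typing import List


class Detail(IntEnum):
    """How much the display shows; higher means more on screen."""
    MINIMAL = 0
    STANDARD = 1
    FULL = 2


# Assumed mapping from a detected activity to a level of detail.
DETAIL_BY_ACTIVITY = {
    "running": Detail.MINIMAL,
    "jogging": Detail.MINIMAL,
    "driving": Detail.MINIMAL,
    "walking": Detail.STANDARD,
    "standing": Detail.FULL,
    "seated": Detail.FULL,
}

# Assumed number of items shown at each level (None means show everything).
ITEM_LIMIT = {Detail.MINIMAL: 1, Detail.STANDARD: 3, Detail.FULL: None}


def detail_for(activity: str) -> Detail:
    """Unknown activities fall back to the sparsest display, to be safe."""
    return DETAIL_BY_ACTIVITY.get(activity, Detail.MINIMAL)


def visible_items(items: List[str], activity: str) -> List[str]:
    """Trim a prioritized list of items to what the current activity allows."""
    return items[:ITEM_LIMIT[detail_for(activity)]]
```

Given a prioritized list of five notifications, this would show one while you drive, three while you walk, and all five once you sit down.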
Another shows markers that you can wear on your fingers, such as rings, or gloves that have reflective tape on them, that you could use instead of a mouse or keyboard to enter information into a computer system or onto the Web. This goes beyond voice queries, to ones supported by gestures.
Another patent from Google describes a predictive application selector that might track your iris movements and remember your usage of applications in the past to determine which ones to launch from a heads up display.
Google has also acquired some patents that might not be used with Google Glass, but point out the company’s ability to take risks and explore other options. For example, Google acquired a number of patents from Louis Rosenberg that include a Harry Potter-styled wand game controller that works with voice commands and the pointing of a wand, and media devices that you can turn on and off with just a glance at them.
Google may also view inventor Ralph Osterhout as a possible competitor. I’ve noticed a large number of patent applications being published at the USPTO recently with his name on them that involve augmented reality glasses that can run applications like Gmail and Google Earth, show virtual signs and advertisements on the sides of buildings, see through fog and smoke, and much more.
Siri isn’t a Google Killer, as the title of the SMX session suggests, but it’s definitely something that is competing with Google in interesting ways.
If you spend some time working upon how voice queries and visual queries might impact search and SEO, and how the different parts of an active ontology from Apple or predictive models from Google might transform marketing on the Web, you might be a step or two ahead of others.
I present this afternoon on how signals of authority might be used by search engines to rank content in search results. It’s another technology from the search engines that may be future-focused but has the potential to have a substantial impact on what we see from the search engines.