Locally Prominent Semantic Features

Finding Locally Prominent Semantic Features Using a Computer

This patent relates to determining locally prominent semantic features using a computer.

Operations associated with the state of a geographic area can be implemented on a variety of computers. These operations include processing data associated with the geographic area for later access and use by a user or computer, and they can also include exchanging data with remote computers.

However, how the operations are performed can vary over time, as can the underlying hardware that implements them. As a result, there are different ways to leverage computing resources associated with the state of a geographic area.

Advantages of the patent are outlined in the following description, or can be learned from the description or through practice of the embodiments.

Providing Navigational Instructions That Reference Landmarks

One example aspect of the patent is directed to a computer-implemented method of providing navigational instructions that reference locally prominent landmarks. The method can include accessing, by a computer that includes processors, a plurality of semantic tags associated with a plurality of images. Each of the semantic tags can be associated with features depicted by one of the plurality of images.

Further, each of the features can be associated with a geographic location. The method can include identifying, by the computer, based at least in part on the plurality of semantic tags, locally prominent landmarks that include the features that meet entropic criteria that measure a localized prominence of each of the features.

Associating Locally Prominent Features with Content

The computer-implemented method can also include selecting, by the computer, based at least in part on context data associated with a location on a path that includes a plurality of locations, at least one landmark for use in navigation at the location.

Furthermore, the method can include generating, by the computer, at least one navigational instruction that references the at least one landmark.

Navigational Data Associated With this Locally Prominent Semantic Features Patent

Another example aspect of the patent is directed to tangible non-transitory computer-readable media storing computer-readable instructions that, when executed by processors, cause the processors to perform operations. The operations can include accessing image data, including a plurality of images associated with semantic tags.

Each of the semantic tags can be associated with features of the plurality of images.

Further, each of the features can be associated with a geographic location.

The operations can also include determining locally prominent landmarks, including the features that meet entropic criteria associated with a localized prominence of each of the features.

The operations can include determining, based at least in part on context data associated with a location of a vantage point on a path, including a plurality of locations, the landmarks associated with the vantage point.

Furthermore, the operations can include generating navigational data, including indications associated with the landmarks.

Semantic Tags Associated with Images and With Locations

Another example aspect of the disclosure is directed to a computer that includes processors and non-transitory computer-readable media storing instructions that, when executed, cause the processors to perform operations.

The operations can include accessing image data, including a plurality of images associated with semantic tags:

  1. Each of the semantic tags can be associated with features of the plurality of images
  2. Further, each of the features can be associated with a geographic location
  3. The operations can include determining locally prominent landmarks, including the features that meet entropic criteria associated with a localized prominence of each of the features
  4. The operations can also include determining, based at least in part on context data associated with a location of a vantage point on a path, including a plurality of locations, the landmarks associated with the vantage point
  5. Furthermore, the operations can include generating navigational data, including indications associated with the landmarks

Other aspects of the patent are directed to various systems, apparatuses, non-transitory computer-readable media, user interfaces, and electronic devices. These and other features, aspects, and advantages of various embodiments of the patent will become better understood with reference to the following description and appended claims.

This Finding Locally Prominent Semantic Features patent can be found at:

Finding Locally Prominent Semantic Features for Navigation and Geocoding
Inventors Yan Mayster, Brian Brewington, and Matthew Strosnick
Patent Number US20210240762
Filed Date October 22, 2018
Publication Number 20210240762
Publication Date August 5, 2021
Applicants Google LLC

Abstract

Methods, systems, devices, apparatuses, and tangible non-transitory computer-readable media for navigation and geocoding are provided.

The disclosed technology can perform operations, including accessing semantic tags associated with images. Each of the semantic tags can be associated with features depicted by one of the images.

Further, each of the features can be associated with a geographic location.

Based on the semantic tags, locally prominent landmarks that include the features that meet entropic criteria can be identified.

The entropic criteria can measure the localized prominence of each of the features.

A navigation landmark can be selected based on context data associated with a location on a path that includes a plurality of locations.

Furthermore, at least one navigational instruction that references the landmark can be generated.

Determining Locally Prominent Landmarks for Navigation and Geocoding

This disclosure is directed to determining locally prominent landmarks for navigation and geocoding. The disclosure describes navigational instructions based on the landmarks. The technology can include a computer that receives data, including semantic tags associated with features of images. Example images could include photographs of an area captured from various vantage points.

These semantic tags can be based on images processed by a machine-learned model or other image content analysis system that detected and identified the features of the images.

Associating Features Of An Image With A Geographic Location

This process of finding locally prominent semantic features is interesting. The features of the images can be associated with:

  • Geographic location (such as altitude, latitude, and longitude)
  • Other information (such as the time at which each image was captured)

The computer can determine landmarks in the images that include features that meet entropic criteria associated with the localized prominence of the features (these can include the visibility, rarity, or familiarity of the feature within the area).

The computer can also determine, based on context data associated with a vantage point (such as the current location of the computer or of a user of the computer), the landmarks associated with that vantage point. These are the landmarks that are visible from the vantage point.

We are told that:

The computer can generate indications (e.g., navigational instructions that reference the landmark(s)) that can be used to facilitate navigation. The disclosed technology can more effectively determine landmarks in an environment and select a portion that will facilitate more effective navigation based on a given context and vantage point. The technology described herein may thus enable users to navigate more effectively and efficiently to a specified location.

Landmarks May Be Referenced in a Series of Turn-By-Turn Navigational Instructions

The landmarks determined may be referenced in a series of turn-by-turn navigational instructions provided to a vehicle driver. The described technology may thus assist the driver in performing the technical task of driving the vehicle to a specified location by means of a continued and guided human-vehicle interaction process.

Accessing Map Data Associated With a Geographic Area

By way of further example, a computer (e.g., a computer in a vehicle) can access map data (e.g., locally stored or remotely accessible map data) associated with a geographic area. The map data can include images and semantic tags that indicate features of the images that have been previously identified using image recognition (e.g., a machine-learned model trained to detect features of images).

Determining A Set of Locally Prominent Landmarks in a Geographic Area

The computer can then determine a set of landmarks in the area that include features that meet entropic criteria associated with the localized prominence of the landmarks. For example, a thirty-meter tall red-granite obelisk can meet entropic criteria (e.g., large size, unusual shape, and unusual color) that a parking meter (e.g., relatively small size and very common) or bush would not.

Additionally, the computer can determine a context that includes the time of day, location, and direction of travel (e.g., the vehicle’s direction of travel) to select the most effective landmark for the given vantage point and conditions in the area. The computer can then use the locally prominent landmarks within the area to provide a set of instructions (e.g., audible instructions generated through a vehicle loudspeaker and textual or graphical instructions provided on a display) to a user to assist in navigation. As such, the disclosed technology provides improvements in navigation and geocoding.

The disclosed technology can include a computer (e.g., a navigation computer) made up of one or more computing devices (e.g., devices with computer processors and a memory that can store instructions). These devices can send, receive, process, generate, and change data (e.g., data including semantic tags associated with images), including information patterns or structures that can be stored on memory devices (e.g., random access memory devices) and storage devices (e.g., hard disk drives and solid-state memory drives), as well as signals (e.g., electronic signals).

Sending Semantic Tags Associated with Images

The data and signals can be exchanged by the computer with various other systems and devices, including a plurality of service systems (e.g., remote computers and software applications operating on computers) that can send and receive data, including semantic tags associated with images (e.g., digital images associated with data including geographic location, time of image capture, and descriptions of other features of the images).

The computer (e.g., the navigation computer) can include features of the devices and computers depicted in the patent’s figures. Further, the computer can be associated with machine-learned models that include features of the machine-learned models described in the patent.

Furthermore, the computer can include specialized hardware (e.g., an application-specific integrated circuit) and software that enables the computer to perform operations specific to the disclosed technology, including accessing semantic tags (e.g., accessing locally stored semantic tags or accessing semantic tags by receiving the semantic tags from a remote computer) associated with a plurality of images, determining landmarks that include features that satisfy entropic criteria, using context data to determine locally prominent landmarks associated with a vantage point, and generating navigational data.

Each Feature of the Semantic Features Can Be Associated With a Geographic Location

More particularly, a navigation computer can receive a plurality of semantic tags associated with a plurality of images.

Each semantic tag of the plurality of semantic tags can be associated with features depicted by one of the images. For example, each semantic tag can provide a semantic description of an object included within a scene depicted by one of the plurality of images. Further, each of the features can be associated with a geographic location. The plurality of images can include digital images (e.g., a two-dimensional image) of a part of an environment (e.g., an image of a location in an environment).

The plurality of images can be encoded in any type of image format, including a combination of raster images (e.g., bitmaps comprising a grid of pixels) and vector images (e.g., polygonal representations of images based on positions of coordinates including x and y axes of a two-dimensional plane).

Images Associated With Locally Prominent Semantic Features

The images can include still images, image frames from a movie, and other types of imagery, including LIDAR imagery and RADAR imagery.

Examples of digital image formats used by the plurality of images can include JPEG (Joint Photographic Experts Group), BMP (Bitmap), TIFF (Tagged Image File Format), PNG (Portable Network Graphics), and GIF (Graphics Interchange Format).

The images can be collected from various sources such as user-submitted imagery, imagery in the public domain (e.g., obtained via web crawl and properly aggregated and anonymized), street-level panoramic imagery, and other sources of images.

Images Associated with Physical Dimensions

The plurality of semantic tags associated with the images can be associated with features including physical dimensions (e.g., physical dimensions of objects in an image) and object identities (e.g., the identity of objects depicted in the images). More information can be associated with the images and semantic tags, such as a location (e.g., a street address and an altitude, latitude, and longitude associated with an image), a time of day (e.g., the time of day when an image was captured), and a date (e.g., the date when an image was captured).

By way of example, the navigation computer can receive data including information associated with the plurality of semantic tags and the plurality of images via a communication network (e.g., a wireless and wired network including a LAN, WAN, or the Internet) through which signals (e.g., electronic signals) and data can be sent and received.
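As a rough illustration of the kind of record this describes, a tagged feature might be modeled as below. The class name and every field name are hypothetical, chosen only for this sketch; the patent does not specify a schema.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class SemanticTag:
    """One tag attached to a feature seen in one image (hypothetical schema)."""
    label: str                              # object identity, e.g. "obelisk"
    latitude: float
    longitude: float
    altitude_m: Optional[float] = None
    height_m: Optional[float] = None        # a physical dimension, if known
    captured_at: Optional[datetime] = None  # date and time of image capture
    image_id: Optional[str] = None


# Example record with made-up coordinates and capture time.
tag = SemanticTag("obelisk", 40.0, -75.0, height_m=30.0,
                  captured_at=datetime(2018, 10, 22, 14, 30))
```

Keeping the record immutable (frozen) is a design choice for the sketch: tags can then be counted and grouped safely without being changed along the way.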

The Navigation Computer Identifying Locally Prominent Landmarks Using Semantic Tags

The navigation computer can identify, based at least in part on the plurality of semantic tags, landmarks that include the features that meet entropic criteria that measure a localized prominence of each of the features. For example, the navigation computer can access data associated with the plurality of semantic tags that indicate an area’s landmarks. The landmarks can be associated with features (e.g., physical dimensions or shape) that can be compared against the entropic criteria used to identify the locally prominent landmarks.

The entropic criteria can be associated with the frequency with which each of the features occurs in the area or the distinctiveness of each feature from other common features in the area.

How A Histogram and Clustering Can Find Geolocations Based on Semantic Tags

Satisfaction of the entropic criteria can be based, for example, on a feature being infrequent (e.g., the only tree in an area or the only high-rise building in an area).

Thus, in one example, clustering or other algorithmic techniques can be used to determine a rarity or frequency associated with each feature, which can then guide the selection of features for use as locally prominent landmarks.

As one example, the area around each location can be analyzed to identify which features associated with that location are rarest. For instance, a histogram of semantic tags in an area around a location might reveal that an obelisk occurs only once while a mailbox occurs sixty times, indicating that the obelisk would work better as a landmark.
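The histogram idea can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: the function name, the `max_count` cutoff, and the sample tag list (including the parking meter count) are assumptions made for the example.

```python
from collections import Counter


def rarest_tags(labels, max_count=1):
    """Return (label, count) pairs occurring at most max_count times, rarest first."""
    counts = Counter(labels)  # histogram of tag labels in the area
    rare = [(label, n) for label, n in counts.items() if n <= max_count]
    return sorted(rare, key=lambda item: (item[1], item[0]))


# Hypothetical tags collected around one location.
nearby = ["mailbox"] * 60 + ["parking meter"] * 25 + ["obelisk"]
print(rarest_tags(nearby))  # prints [('obelisk', 1)]
```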

By way of further example, the satisfaction of the entropic criteria can include the distinctiveness of various characteristics of a feature relative to other similar features in the area. For instance, although they may all be “buildings,” a small house on one corner of a four-sided intersection will contrast with high-rise buildings on the other three corners.

Landmarks can thus be determined from the semantic tag statistics aggregated geographically and over time for each location by focusing on “tags” of high entropy. These are tags that appear to persist in time and exhibit highly localized prominence.

This means that the system can identify high-confidence tags that are comparatively unusual or rare in the surrounding area.
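The patent does not give a formula for "high entropy" tags. One common information-theoretic reading, used here purely as an assumption, is the surprisal of a tag: the negative base-2 logarithm of its share of all tags in the area. The function name is hypothetical.

```python
import math
from collections import Counter


def surprisal_bits(labels):
    """Map each label to -log2(share of all tags); rarer labels score higher."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: -math.log2(n / total) for label, n in counts.items()}


# Hypothetical area: sixty mailboxes and a single obelisk.
scores = surprisal_bits(["mailbox"] * 60 + ["obelisk"])
best = max(scores, key=scores.get)  # the label that is most surprising locally
```

Under this reading the obelisk scores far higher than the mailbox, which matches the intuition in the text that the rarer feature makes the better landmark.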

Highly Visible Nighttime Landmarks and Audible Landmarks

Another approach to finding locally prominent semantic features relies on the senses to help locally prominent landmarks stand out.

The navigation computer can select, based at least in part on context data associated with a location on a path that includes a plurality of locations, at least one landmark for use in navigation at the location. For example, the navigation computer can determine a context, including the time of day, season, and amount of traffic proximate to a vantage point. Based on, for example, context indicating that night has fallen, the navigation computer can select a landmark that is illuminated and not cloaked in darkness.
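A minimal sketch of context-based selection follows. It assumes each candidate landmark carries a precomputed prominence value and an illumination flag; both fields, and the function name, are hypothetical.

```python
def select_landmark(candidates, is_night):
    """Pick the most prominent candidate that is usable in the current context."""
    # At night, keep only landmarks that are lit; by day, keep everything.
    usable = [c for c in candidates if c["illuminated"] or not is_night]
    return max(usable, key=lambda c: c["prominence"], default=None)


# Hypothetical candidates near one location.
candidates = [
    {"name": "stone obelisk", "illuminated": False, "prominence": 0.9},
    {"name": "neon diner sign", "illuminated": True, "prominence": 0.6},
]
by_day = select_landmark(candidates, is_night=False)
by_night = select_landmark(candidates, is_night=True)
```

Here the obelisk wins by day, while the lit sign is chosen after dark even though it is less prominent overall.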

The navigation computer can generate at least one navigational instruction that references at least one landmark. The navigational instruction can include visual instructions, such as text shown on a display device, and audible instructions, such as instructions emitted from an audio output device. For example, the navigation computer can generate audible instructions describing the appearance of a landmark (e.g., “a statue of a horse with a rider”) and the location of the landmark (e.g., “on your right at the next intersection”).

Determining a Rate at Which Each of the Features Occurs Within a Predetermined Area

Determining, based at least in part on the plurality of semantic tags, a rate at which each of the features occurs within a predetermined area can be used in identifying the landmarks that include the features that meet the entropic criteria. For example, the navigation computer can determine the rate at which each feature (e.g., the height of high-rise buildings) occurs within a two-hundred-square-meter area.

Determining that the landmarks include the features that occur the least frequently, or that occur at a rate below a threshold rate, can likewise be used in identifying the locally prominent landmarks.

For example, in choosing a building to use as a landmark among a group of buildings, the navigation computer can select the building whose height occurs less frequently than average (e.g., a very tall high-rise building with a height in a range that occurs once in every thousand high-rise buildings).
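One way to picture the rate test is to count each tag inside a circle around a point and compare its density with a threshold. The sketch below is an assumption-laden illustration: the circular area, the haversine distance, the function names, and the threshold value are choices made for the example rather than details from the patent.

```python
import math
from collections import Counter

EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def rare_labels_near(features, center, radius_m, max_per_km2):
    """features: iterable of (label, lat, lon) tuples.

    Returns the labels whose density inside the circle is at or below max_per_km2.
    """
    counts = Counter(
        label for label, lat, lon in features
        if haversine_m(center[0], center[1], lat, lon) <= radius_m
    )
    area_km2 = math.pi * (radius_m / 1000.0) ** 2
    return sorted(label for label, n in counts.items() if n / area_km2 <= max_per_km2)


# Hypothetical data: ten mailboxes and one obelisk nearby, one tower far away.
features = [("mailbox", 0.0, 0.0005 * i) for i in range(10)]
features += [("obelisk", 0.001, 0.0), ("tower", 1.0, 1.0)]
rare = rare_labels_near(features, center=(0.0, 0.0), radius_m=1000.0, max_per_km2=1.0)
```

With these numbers only the obelisk passes: the mailboxes are too dense, and the tower lies outside the circle and is never counted.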

Determining a Confidence Score For Each of the Semantic Features

Determining a confidence score for each of the features, based at least in part on the number of times that each feature has been associated with a semantic tag, can be used in identifying the locally prominent landmarks that meet the entropic criteria. For example, the navigation computer can access data on the number of times each feature has been tagged with the same tag. A feature that has been tagged with the same tag a greater number of times can be associated with a higher confidence score.

Identifying as a landmark the features with a confidence score that meets confidence score criteria can also be used in identifying the landmarks. For example, the navigation computer can identify as a landmark the features with a confidence score that exceeds a threshold confidence score.

What the Confidence Score Is Based On

The confidence score can be based at least in part on the number of different perspectives from which each of the features associated with a semantic tag has been viewed (e.g., semantic tags associated with images of the same object viewed from different angles), and on the recency with which the features have been associated with a semantic tag.

Determining clusters of the features that meet the entropic criteria can be used in identifying the landmarks that include the features that meet the entropic criteria. Further, each of the clusters can include features that have a common semantic type. For example, the navigation computer can determine clusters of trees that can meet entropic criteria associated with a tree density for the cluster of trees (e.g., the number of trees within a predetermined area).
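The patent names the inputs to the confidence score (tag count, number of perspectives, recency) but not how they combine. The sketch below shows one possible combination, entirely an assumption: a saturating support term multiplied by an exponential recency decay. The constant 10.0 and the one-year half-life are arbitrary illustrative values.

```python
from datetime import datetime, timedelta


def confidence_score(tag_count, distinct_views, last_tagged, now,
                     half_life_days=365.0):
    """Hypothetical score in [0, 1): rises with tag count and viewpoints,
    and decays with the time since the most recent tag."""
    support = tag_count + distinct_views
    saturation = support / (support + 10.0)        # diminishing returns
    age_days = max((now - last_tagged).total_seconds() / 86400.0, 0.0)
    recency = 0.5 ** (age_days / half_life_days)   # halves every half_life_days
    return saturation * recency


now = datetime(2021, 8, 5)
well_supported = confidence_score(50, 8, now - timedelta(days=30), now)
weakly_supported = confidence_score(3, 1, now - timedelta(days=30), now)
stale = confidence_score(50, 8, now - timedelta(days=3650), now)
```

A feature would then count as a landmark candidate when its score exceeds a chosen threshold, in line with the threshold test described earlier.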

Determining Visibility of Each of The Landmarks From the Vantage Point

Determining the visibility of each of the landmarks from the vantage point associated with the location can be used in selecting at least one landmark for use in navigation at the location.

Further, the context data can include a vantage point (e.g., a point within the location from which parts of the surrounding environment can be viewed). For example, the navigation computer can determine the landmarks visible from the vantage point based in part on how far away each landmark is and whether the landmark is obstructed by other objects.

The visibility can get based at least in part on a distance from which each of the landmarks is visible from the vantage point, an amount of light that is cast on each of the landmarks, any obstructions between the vantage point and the landmarks, and physical dimensions of each of the landmarks.

Determining a direction of travel along the path at the vantage point can be used in determining the visibility of each of the locally prominent landmarks from the vantage point associated with the location. For example, the visibility of each of the landmarks can be associated with a field of view from the vantage point that is associated with the direction of travel, such as a field of view equal to a predetermined angle relative to a line corresponding to the direction of travel.
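The field-of-view test can be sketched as a bearing comparison. This is an illustrative sketch only: the function names and the 60-degree half-angle are assumptions, and the bearing calculation is the standard initial great-circle bearing formula.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees from north."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dl = math.radians(lon2 - lon1)
    y = math.sin(dl) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dl)
    return math.degrees(math.atan2(y, x)) % 360.0


def in_field_of_view(heading_deg, target_bearing_deg, half_angle_deg=60.0):
    """True if the target lies within half_angle_deg of the direction of travel."""
    # Signed angular difference folded into the range [-180, 180).
    diff = (target_bearing_deg - heading_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= half_angle_deg


# Hypothetical case: traveling due north with a landmark directly east.
heading = 0.0
to_landmark = bearing_deg(0.0, 0.0, 0.0, 1.0)
visible = in_field_of_view(heading, to_landmark)
```

Folding the difference into a signed range handles headings near north correctly, so a heading of 350 degrees and a bearing of 20 degrees are treated as 30 degrees apart.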

Determining The Landmarks That Face The Direction of Travel

Determining which landmarks face the direction of travel can be used in determining the visibility of each landmark from the vantage point associated with the location. For example, the navigation computer can determine that the locally prominent landmarks that face the direction of travel (e.g., the landmarks that are ahead of the vantage point) are more visible than the landmarks that do not face the direction of travel (e.g., the landmarks that are behind the vantage point).

The mode of transportation associated with the vantage point can also be used in determining the visibility of each of the landmarks from the vantage point associated with the location. For example, the navigation computer can determine that under certain circumstances (e.g., on a street with heavy traffic), the visibility from an automobile is less than the visibility from a bicycle and determine the visibility accordingly.

Determining a Reaction Time Based at Least in Part On A Velocity At The Location And A Distance To The Closest Landmark Of The Landmarks

Determining a reaction time based at least in part on a velocity at the location and a distance to the closest of the locally prominent landmarks can be used in selecting at least one landmark for use in navigation at the location. For example, the navigation computer can determine the reaction time based on the velocity at the location (e.g., a velocity in meters per second) and the distance to the closest landmark (e.g., a distance in meters).

Determining the landmarks that meet reaction time criteria associated with a minimum reaction time can also be used in selecting at least one landmark for use in navigation at the location. For example, the navigation computer can determine that the minimum reaction time is two seconds and select from the landmarks that will become visible after two seconds.
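With velocity in meters per second and distance in meters, the reaction time is simply distance divided by velocity. The sketch below applies the two-second minimum from the example; the function names and sample landmarks are hypothetical.

```python
def reaction_time_s(distance_m, velocity_mps):
    """Seconds until the landmark is reached at the current velocity."""
    if velocity_mps <= 0:
        return float("inf")  # not moving, so there is unlimited time to react
    return distance_m / velocity_mps


def landmarks_with_enough_time(landmarks, velocity_mps, min_reaction_s=2.0):
    """landmarks: iterable of (name, distance_m). Keep those reachable no sooner
    than min_reaction_s from now."""
    return [name for name, distance_m in landmarks
            if reaction_time_s(distance_m, velocity_mps) >= min_reaction_s]


# Hypothetical landmarks ahead of a vehicle moving at 15 m/s.
ahead = [("fountain", 20.0), ("clock tower", 90.0)]
usable = landmarks_with_enough_time(ahead, velocity_mps=15.0)
```

At 15 m/s the fountain is about 1.3 seconds away and is dropped, while the clock tower at 6 seconds is kept.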

More on Reaction Time

The reaction time can be based on a mode of transportation associated with the vantage point. Furthermore, the mode of transportation can include a motor vehicle, a bicycle, and foot travel. For example, the reaction time for a slower mode of transportation (e.g., foot travel) can be shorter than the reaction time for a faster mode of transportation (e.g., a motor vehicle).

Selecting a landmark based at least in part on a level of familiarity with the landmarks can be used in selecting at least one landmark for use in navigation at the location. Further, the level of familiarity can be associated with the number of times and frequency that a user (e.g., a user associated with the navigation computer) has previously been at the location (e.g., within a threshold distance of the location).

For example, the context data can include a record of how frequently a user travels past landmarks. The navigation computer can determine when the frequency of a user traveling past a landmark satisfies familiarity criteria (e.g., a threshold minimum frequency) and select a landmark that satisfies the familiarity criteria. By way of further example, the level of familiarity can be associated with a part of the features (e.g., a number of visual characteristics) that each of the locally prominent landmarks has in common with another landmark the user has previously viewed.

Adjusting at least one navigational instruction based at least in part on the level of familiarity can be used in generating at least one navigational instruction that references at least one landmark. For example, the navigation computer can access information associated with at least one landmark (e.g., the name of a landmark used by the area’s residents) and use the information to change the navigational instruction that is used. By way of further example, the navigation computer can determine that when the level of familiarity exceeds a threshold level of familiarity, the navigational instruction will be selected from data that includes locally used terms for a landmark.
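A small sketch of the familiarity adjustment: above a familiarity threshold, the instruction uses the local name for a landmark, and otherwise it falls back to a plain description. The dictionary keys, the 0-to-1 familiarity scale, the threshold, and the sample names are all assumptions for illustration.

```python
def landmark_phrase(landmark, familiarity, threshold=0.5):
    """Choose how to refer to a landmark in an instruction."""
    local_name = landmark.get("local_name")
    if familiarity > threshold and local_name:
        return local_name           # a familiar user gets the locally used term
    return landmark["description"]  # otherwise describe what the user will see


# Hypothetical landmark with a made-up local nickname.
statue = {"description": "a statue of a horse with a rider",
          "local_name": "the Old General"}
for_visitor = f"Turn right at {landmark_phrase(statue, familiarity=0.1)}."
for_resident = f"Turn right at {landmark_phrase(statue, familiarity=0.9)}."
```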

The context data can include information associated with a time of day, a season, a language (e.g., French, Russian, and Chinese), the features visible from the location, and a mode of transportation (e.g., personal automobile, bus, bicycle, foot travel).

The Navigation Computer Can Receive Image Data

The navigation computer can receive image data, including a plurality of images associated with semantic tags. Each of the semantic tags is associated with features of the plurality of images. Further, each of the features can be associated with a geographic location. For example, the navigation computer can receive image data (e.g., tagged digital photographs), including information associated with the plurality of semantic tags and the plurality of images, via a communication network through which signals (e.g., electronic signals) and/or data can be sent and received.

The navigation computer can determine landmarks, including the features that meet entropic criteria associated with a localized prominence of each feature. For example, the navigation computer can access the plurality of semantic tags to determine the landmarks in an area that include the features (e.g., size, distance from a vantage point) that meet entropic criteria (e.g., size greater than a threshold size or distance within a threshold distance).

The navigation computer can determine, based at least in part on context data associated with a location of a vantage point on a path that includes a plurality of locations, the landmarks associated with the vantage point. For example, the navigation computer can access context data, including the time of day, season, and the location of objects between the vantage point and the landmarks. Based on the context, the navigation computer can determine the visible landmarks (based on the time of day and season) that are not obstructed by other objects between the vantage point and each of the landmarks.

The Navigation Computer Can Help Find Locally Prominent Semantic Features to Generate Navigational Data

The navigation computer can generate navigational data, including indications associated with the locally prominent landmarks. For example, the navigation computer can generate visual instructions on a display device, including a description of the appearance of a landmark (e.g., “a tall glass high-rise”) and the location of the landmark (e.g., “one hundred meters ahead and on the left”).

Each of the features can be associated with a time of day (e.g., an hour and minute of the day), a season (e.g., winter, summer, spring, or autumn), a visual constancy (e.g., the extent to which the features appear the same over time), and locations from which each of the features is visible (e.g., geographic locations including latitude, longitude, and altitude from which each feature is visible).

The entropic criteria can include a frequency of occurrence of each of the features within a predetermined area not exceeding a predetermined threshold frequency (e.g., how often a feature occurs within the predetermined area), a temporal persistence (e.g., how long a feature has been present) of each of the features at a location exceeding a predetermined threshold duration, and a size (e.g., physical dimensions) of each of the features exceeding a threshold size.
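The three criteria listed above can be read as a single predicate over per-feature statistics. In this sketch the class, the field names, the use of height as the size measure, and the threshold values are all hypothetical; only the direction of each comparison follows the text.

```python
from dataclasses import dataclass


@dataclass
class FeatureStats:
    """Hypothetical per-feature statistics for one area."""
    occurrences_per_km2: float
    days_present: float
    height_m: float


def meets_entropic_criteria(f, max_per_km2=1.0, min_days=180.0, min_height_m=5.0):
    """Frequency must not exceed its threshold; persistence and size must exceed theirs."""
    return (f.occurrences_per_km2 <= max_per_km2
            and f.days_present > min_days
            and f.height_m > min_height_m)


# Illustrative values for the obelisk and parking meter from the earlier example.
obelisk = FeatureStats(occurrences_per_km2=0.3, days_present=3650.0, height_m=30.0)
parking_meter = FeatureStats(occurrences_per_km2=40.0, days_present=2000.0, height_m=1.2)
```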

The indications can include visual indications and audible indications (e.g., audio produced by a loudspeaker) associated with the relative location of the landmarks on the path with respect to the vantage point.

Determining a mode of travel associated with the vantage point can be used in determining landmarks, including the features that meet entropic criteria associated with a localized prominence of each of the features. For example, the navigation computer can access data associated with the vantage point (e.g., the middle of a highway, sidewalk, or lake) that can be used to determine an associated mode of transportation (e.g., an automobile, foot travel, or a boat).

The System Can Determine Visible Landmarks From Direction of Travel and Velocity Along the Path

The navigation computer can determine, based at least in part on a direction of travel and velocity along the path, the landmarks that will become visible from the vantage point within a predetermined time associated with the mode of travel. For example, the navigation computer can determine a velocity associated with the mode of transportation and determine the landmarks visible within the next ten seconds.
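The ten-second example amounts to a look-ahead distance of velocity times time. The sketch below assumes each landmark is described by its distance ahead along the path; the function name and sample values are hypothetical, and visibility is simplified to "within reach".

```python
def landmarks_within_horizon(landmarks, velocity_mps, horizon_s=10.0):
    """landmarks: iterable of (name, distance_ahead_m).

    Returns the landmarks that will be reached within horizon_s seconds.
    """
    reach_m = velocity_mps * horizon_s
    return [name for name, distance_m in landmarks if 0.0 <= distance_m <= reach_m]


# Hypothetical landmarks ahead on the path.
path_ahead = [("kiosk", 10.0), ("statue", 100.0), ("tower", 500.0)]
on_foot = landmarks_within_horizon(path_ahead, velocity_mps=1.4)
by_car = landmarks_within_horizon(path_ahead, velocity_mps=14.0)
```

The same ten-second window covers about 14 meters on foot and 140 meters by car, so the mode of travel changes which landmarks qualify.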

The navigation computer can generate a navigational instruction query about the utility of at least one navigational instruction. The navigational instruction query can include visual indications and audible indications. For example, when the navigation computer has determined that a predetermined amount of time has elapsed since generating at least one navigational instruction, the navigation computer can generate the navigational instruction query via an output device (e.g., a display device or an audio output device).

The Navigation Computer May Receive Responses to the Navigational Instruction Query

The navigation computer can receive responses to the navigational instruction query. The responses to the navigational instruction query can include signals or data received from input devices that can receive the responses from a user of the navigation computer (e.g., a keyboard, touch screen display, and microphone).

The navigation computer can adjust the entropic criteria based at least in part on the responses to the navigational instruction query. When the responses indicate how useful at least one navigational instruction was, the entropic criteria can be modified accordingly. For example, when the entropic criteria include a minimum rate of occurrence of a feature (e.g., a certain type of restaurant occurs once per two and a half square kilometers), and the responses state that the landmark was confused with another landmark with the same feature, the minimum rate of occurrence of the feature can be decreased (e.g., the certain type of restaurant occurs once every three square kilometers).
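The restaurant example amounts to lowering a density threshold when users report confusion. Here is a minimal sketch of that feedback step. The function name `adjust_rate_threshold` and the `step` factor of 1.2 are assumptions chosen only so the numbers match the example of 2.5 and 3 square kilometers.

```python
def adjust_rate_threshold(rate_per_km2: float, confused: bool, step: float = 1.2) -> float:
    """Decrease the occurrence-rate threshold when a landmark was confused with a lookalike."""
    if confused:
        return rate_per_km2 / step
    return rate_per_km2


before = 1 / 2.5  # one occurrence per 2.5 square kilometers
after = adjust_rate_threshold(before, confused=True)

# Area per occurrence implied by the new threshold, in square kilometers.
print(round(1 / after, 2))
```

A response that reports no confusion leaves the threshold unchanged in this sketch.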

Improving Landmark Determination For Navigation And Geocoding

The systems, methods, devices, apparatuses, and tangible non-transitory computer-readable media in the disclosed technology can provide various technical effects and benefits, including improved landmark determination for navigation and geocoding. In particular, the disclosed technology may assist the user (e.g., a vehicle driver) in performing a technical task (e.g., driving a vehicle to a specified location) through a continued and guided human-machine interaction process. It may also provide benefits, including improvements in the performance of communications networks, better resource usage efficiency, and improved safety.

The disclosed technology can improve the performance of communications network operation by more effectively determining a set of landmarks to ease navigation. The set of landmarks can then be provided to route traffic more efficiently through a transportation network and to avoid situations in which communications network bandwidth is wasted due to ineffective navigational directions (e.g., sending more sets of instructions across the network when the first set of instructions is not properly followed). More effective provision of navigational instructions can reduce the number of navigational instructions sent through a communications network, with a corresponding reduction in bandwidth use.

Furthermore, the disclosed technology can improve the efficiency of resource consumption (e.g., fuel and electrical energy) by providing more effective navigational instructions that leverage the use of locally prominent landmarks. For example, navigational instructions that include landmarks can result in fewer missed turns and backtracking, thereby reducing the associated excess usage of fuel or electrical energy that results.

Additionally, the use of landmarks for navigational purposes can improve driver safety when traveling in a vehicle. For example, navigational instructions that include more readily observed, locally prominent landmarks can reduce the driver distraction that results from navigational instructions that do not use prominent landmarks (e.g., street names, which may appear on awkwardly located or poorly illuminated street signs).

Accordingly, the disclosed technology may assist the driver of a vehicle in performing, more efficiently and effectively, the technical task of driving the vehicle to a specified location through a continued and guided human-machine interaction process. In addition, the disclosed technology may provide a computer that facilitates more effective landmark identification for use in navigation and geocoding.

The disclosed technology provides the specific benefits of reduced network bandwidth use, more efficient fuel and energy usage, and greater safety, any of which can be used to improve the effectiveness of a wide variety of services, including navigation services and geocoding services.

The Machine Learning System Behind Finding Locally Prominent Semantic Features

The system includes a computer, a server computer, a training computer, and remote computers that are communicatively connected and coupled over a network.

The computer can include any type of computer, including, for example, a personal computer (e.g., laptop computer or desktop computer), a mobile computer (e.g., smartphone or tablet), a gaming console, a controller, a wearable computer (e.g., a smartwatch), an embedded computer, and any other type of computer.

The computer includes processors and memory. The processors can be any suitable processing device (e.g., a processor core, a microprocessor, an ASIC, an FPGA, a controller, a microcontroller, etc.) and can be one processor or a plurality of processors that are operatively connected. The memory can include non-transitory computer-readable storage media, including RAM, ROM, EEPROM, EPROM, flash memory devices, magnetic disks, etc., and combinations thereof. The memory can store data and instructions that are executed by the processors to cause the computer to perform operations.

The computer can perform operations, including accessing image data, including a plurality of images associated with semantic tags. Each of the semantic tags accessed by the computer can be associated with features of the plurality of images. Further, each of the features can be associated with a geographic location. The operations performed by the computer can include determining landmarks, including the features that meet entropic criteria associated with a localized prominence of each of the features.

The operations performed by the computer can also include determining, based at least in part on context data associated with a location of a vantage point on a path, including a plurality of locations, the landmarks associated with the vantage point. Furthermore, the operations performed by the computer can include generating navigational data, including indications associated with the landmarks.

The computer can store or include machine-learned models. For example, the machine-learned models can include neural networks (e.g., deep neural networks) or other machine-learned models, including non-linear and linear models. Neural networks can include feed-forward neural networks, recurrent neural networks (e.g., long short-term memory), convolutional neural networks, or other forms of neural networks.

The machine-learned models can be received from the server computer over the network, stored in the computer memory, and then used or otherwise implemented by the processors. In some implementations, the computer can implement multiple parallel instances of a single machine-learned model (e.g., to perform parallel landmark determination across multiple instances of the machine-learned model). More particularly, the machine-learned models can determine and identify landmarks based on various inputs, including semantic tags (e.g., semantic tags associated with features of a landmark). Further, the machine-learned models can determine navigational instructions to provide in association with the landmarks that are identified.

The machine-learned models can also be included in or otherwise stored and implemented by the server computer, which communicates with the computer according to a client-server relationship. For example, the server computer can implement the machine-learned models as part of a web service (e.g., a landmark determination service). Thus, machine-learned models can be stored and implemented at the computer, at the server computer, or at both.

The computer can also include a user input component that receives user input. For example, the user input component can be a touch-sensitive component (e.g., a touch-sensitive display screen or a touchpad) that is sensitive to the touch of a user input object (e.g., a finger or a stylus). The touch-sensitive component can serve to implement a virtual keyboard. Other user input components include a microphone, a traditional keyboard, or other means by which a user can provide user input.

The server computer can perform operations, including accessing image data, including a plurality of images associated with semantic tags.

Each of the semantic tags can be associated with features of the plurality of images. Further, each of the features can be associated with a geographic location.

The operations performed by the server computer can include determining landmarks, including the features that meet entropic criteria associated with a localized prominence of each of the features.

The server computer can store or otherwise include machine-learned models. Example machine-learned models include neural networks or other multi-layer non-linear models. Example neural networks include feed-forward neural networks, deep neural networks, recurrent neural networks, and convolutional neural networks.

The computer and the server computer can train the machine-learned models via interaction with the training computer, which is communicatively coupled over the network. The training computer can be separate from the server computer or can be a part of the server computer.

More On Machine Learning Behind Finding Locally Prominent Semantic Features

The training computer includes processors and memory. The processors can be any suitable processing device (e.g., a processor core, a microprocessor, an ASIC, an FPGA, a controller, a microcontroller, etc.) and can be one processor or a plurality of processors that are operatively connected. The memory can include non-transitory computer-readable storage media, including RAM, ROM, EEPROM, EPROM, flash memory devices, magnetic disks, etc., and combinations thereof. The memory can store data and instructions that are executed by the processors to cause the training computer to perform operations. In some implementations, the training computer includes or is otherwise implemented by server computers.

The training computer can include a model trainer that trains the machine-learned models stored at the computer and at the server computer using various training or learning techniques, including, for example, backward propagation of errors. In some implementations, performing backward propagation of errors can include performing truncated backpropagation through time. The model trainer can apply several generalization techniques (e.g., weight decay, dropout, etc.) to improve the generalization capability of the trained models.
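To make the weight decay idea concrete, here is a small sketch of a gradient-descent update with weight decay on a one-weight linear model. It is a stand-in for illustration only: the patent does not describe the model trainer at this level, and the function names, learning rate, decay value, and toy data are all assumptions.

```python
def mean_squared_error(w, b, xs, ys):
    """Average squared error of the prediction w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train_linear(xs, ys, lr=0.1, weight_decay=0.01, epochs=200):
    """Fit y = w * x + b by gradient descent, with weight decay applied to w."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Weight decay adds a small pull toward zero on the weight.
        w -= lr * (grad_w + weight_decay * w)
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = train_linear(xs, ys)
print(round(w, 2), round(b, 2))
```

In a deep network, the same update would be applied to every layer's weights, with the gradients supplied by backpropagation.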

In particular, the model trainer can train the machine-learned models on the computer and on the server computer based on a set of training data. The training data can include, for example, semantic data (e.g., semantic tags) describing the location and features of landmarks in an area. For example, the training data can include physical dimensions associated with a landmark, the proximity of a landmark to points of reference (e.g., a vantage point), the location of a landmark (e.g., latitude, longitude, and altitude of a landmark), and various metadata associated with the landmark (e.g., a nickname for a landmark and a former name of a landmark).

If the user has provided consent, the training examples can be provided by the computer. In that case, the machine-learned models provided to the computer can be trained by the training computer on user-specific data received from the computer. In some instances, this process is referred to as personalizing the model.

The model trainer can include computer logic utilized to provide desired functionality. The model trainer can be implemented in hardware, firmware, or software controlling a general-purpose processor. For example, in some implementations, the model trainer includes program files that are stored on a storage device, loaded into memory, and executed by processors. In other implementations, the model trainer includes sets of computer-executable instructions that are stored in a tangible computer-readable storage medium, including RAM, a hard disk, or optical or magnetic media.

The training computer can perform operations, including accessing image data, including a plurality of images associated with semantic tags. Each of the semantic tags can be associated with features of the plurality of images. Further, each of the features can be associated with a geographic location. The operations performed by the training computer can include determining landmarks, including the features that meet entropic criteria associated with a localized prominence of each of the features.

The operations performed by the training computer can also include determining, based at least in part on context data associated with a location of a vantage point on a path, including a plurality of locations, the landmarks associated with the vantage point. Furthermore, the operations performed by the training computer can include generating navigational data, including indications associated with the landmarks.

An Example Device According To Example Embodiments

The example device is a computer that can include features of the computer, the server computer, and the training computer described above. It can also perform the actions and operations performed by those systems.

The computer can include memory devices, landmark data, interconnects, processors, a network interface, mass storage devices, output devices, a sensor array, and input devices.

The memory devices can store information and data (e.g., the landmark data), including information associated with the processing of instructions that are used to perform actions and operations, including accessing semantic tags, identifying landmarks, selecting a landmark for use in navigation, and generating a navigational instruction that references a landmark.

The landmark data can include portions of the data stored by the computer, the server computer, and the training computer. Furthermore, the landmark data can include maps, semantic tags, sensor outputs, and machine-learned models.

The interconnects can include interconnects or buses that can be used to send and receive signals (e.g., electronic signals) and data (e.g., the landmark data) between components of the computer, including the memory devices, the processors, the network interface, the mass storage devices, the output devices, the sensor array, and the input devices.

The interconnects can be arranged or configured in different ways, including as parallel or serial connections. Further, the interconnects can include internal buses that connect the computer’s internal components and external buses that connect the computer’s internal components to external devices. By way of example, the interconnects can include different interfaces, including Industry Standard Architecture (ISA), Extended ISA, Peripheral Component Interconnect (PCI), PCI Express, Serial AT Attachment (SATA), HyperTransport (HT), Universal Serial Bus (USB), Thunderbolt, and the IEEE 1394 interface (FireWire).

The processors can include computer processors that are configured to execute the instructions stored in the memory devices. For example, the processors can include general-purpose central processing units (CPUs), application-specific integrated circuits (ASICs), and graphics processing units (GPUs). Further, the processors can perform actions and operations, including actions and operations associated with the landmark data. For example, the processors can include single-core or multi-core devices, including a microprocessor, microcontroller, integrated circuit, and logic device.

The network interface can support network communications. For example, the network interface can support communication via a local area network and a wide area network (e.g., the Internet). The mass storage devices (e.g., a hard disk drive and a solid-state drive) can be used to store data, including landmark data. The output devices can include display devices (e.g., LCD, OLED, and CRT displays), light sources (e.g., LEDs), loudspeakers, and haptic output devices.

The input devices can include keyboards, touch-sensitive devices (e.g., a touch screen display), buttons (e.g., ON/OFF buttons, YES/NO buttons), microphones, and cameras (e.g., cameras that can be used to detect gestures and facial expressions).

The memory devices and the mass storage devices are described separately. However, they can be regions within the same memory module. The computer can include additional processors, memory devices, and network interfaces, which may be provided separately or on the same chip or board. The memory devices and the mass storage devices can include computer-readable media, including, but not limited to, non-transitory computer-readable media, RAM, ROM, hard drives, flash drives, and other memory devices.

The memory devices can store instructions for applications, including an operating system associated with various software applications or data. The memory devices can be used to operate various applications, including a mobile operating system developed specifically for mobile devices. As such, the memory devices can store instructions that allow the software applications to access data, including wireless network parameters (e.g., the identity of the wireless network, quality of service), and invoke various services, including telephony, location determination (e.g., via the Global Positioning System (GPS) or WLAN), and wireless network data call origination services. In other embodiments, the memory devices can be used to operate or execute a general-purpose operating system that runs on both mobile and stationary devices, such as smartphones and desktop computers.

The software applications that the computer can operate or execute can include applications associated with the system. Further, these software applications can include native applications or web-based applications.

The computer can be associated with or include a positioning system (not shown). The positioning system can include devices or circuitry for determining the position of the computer. For example, the positioning system can determine actual or relative position by using a satellite navigation positioning system (e.g., GPS, the Galileo positioning system, the Global Navigation Satellite System (GLONASS), or the BeiDou Satellite Navigation and Positioning System), an inertial navigation system, a dead reckoning system, an IP address, triangulation or proximity to cellular towers, Wi-Fi hotspots, or beacons, or other suitable techniques for determining position.

Landmark Detection According To Example Embodiments

The output can be generated and determined by a computer that includes features of the computer, the server computer, and the training computer. The image includes a pole object, a fountain object, a bus stop object, a pedestrian object, and a vehicle object.

The image depicts a scene with features associated with various objects that have been identified by a content analysis system (e.g., a content analysis system including machine-learned models trained to detect features of input content, which can include one or more images). For example, the content analysis system can include features of the computer, the server computer, and the training computer.

Further, the features of the image, including the pole object, the fountain object, the bus stop object, the pedestrian object, and the vehicle object, can be associated with various semantic tags (e.g., semantic tags that include features of the plurality of semantic tags) that can include descriptions of various aspects of the features. For example, the image features can be associated with a plurality of semantic tags based on image content analysis performed by machine-learned models.

A semantic tag associated with the pole object can be used to indicate that the pole object is a type of object that occurs frequently (e.g., there are many poles throughout the city in which the image was captured, including poles for street signs, telephone cables, electrical cables, and various utility poles) and that the pole object has low distinctiveness (e.g., the various poles found throughout the city in which the image was captured can have a similar size, shape, and appearance). Further, with respect to prominence, the pole object is of significant size (e.g., significant relative to a predetermined significance threshold, which can be associated with physical dimensions, including the height of an object) and can be seen from various angles and distances.

The pole object is visually constant, with minimal changes in appearance over time. Changes in the appearance of the pole object can include minor changes due to climatic conditions (e.g., snow cover) and human interventions (e.g., graffiti and posters) that do not render the pole object significantly less recognizable. The pole object is attached to the ground and has a constant location over time, making the pole object more apt to be selected as a landmark than objects that are likely to move away from a location (e.g., a pedestrian).

Furthermore, with respect to the context associated with the pole object (e.g., the context associated with the information included in the context data), the pole object is in the foreground and closer to the vantage point from which the image was captured, allowing an unobstructed view of the pole object. The image of the pole object was captured during daylight, which allows the pole object to be clearly visible without additional light (e.g., a street lamp).

The pole object is also close to a street lamp (not shown) that can illuminate the pole object when darkness falls. Even so, when landmarks are selected (e.g., selected by the computer) at the vantage point at which the image was captured, the pole object is less apt to be included than the other potential landmarks in the image that occur less frequently, are more distinctive, are more prominent, and are more visible from the vantage point.

A semantic tag associated with the fountain object can be used to indicate that the fountain object is a type of object that occurs infrequently (e.g., there are a small number of fountains throughout the city in which the image was captured) and that the fountain object is highly distinctive (e.g., the fountain object has a unique sculpture that sets it apart from other fountains). Further, with respect to prominence, the fountain object is large, and the water spraying into the air from the fountain object increases the distance from which the fountain object is visible. The fountain object can also be seen from different angles (e.g., from all sides other than from beneath the fountain object) and distances.

However, the fountain object has lower visual constancy, since the appearance of the fountain changes substantially depending on the season (e.g., the water in the fountain is frozen or drained in cold seasons) and on whether the fountain is operational (e.g., the fountain object looks different when it is spouting water than when it is not). Furthermore, the fountain object is firmly fixed in the ground and has a constant location over time, making the fountain object more apt to be selected as a landmark than mobile objects (e.g., a bus or a train).

Furthermore, with respect to the context associated with the fountain object (e.g., the context associated with information that can be included in the context data), the fountain object is in the background and further away from the vantage point from which the image was captured, so the view of the fountain object can sometimes be obstructed, depending on the vantage point from which it is viewed.

The image of the fountain object was captured during daylight, which allows the fountain object to be clearly visible without additional light (e.g., a street lamp). The fountain object also includes its own set of lamps that can illuminate the fountain object at night. As such, when landmarks are selected (e.g., selected by the computer) at the vantage point at which the image was captured, the fountain object is more apt to be included than the other potential landmarks in the image that occur more frequently, are less prominent, and are less distinctive.

A semantic tag associated with the bus stop object can be used to indicate that the bus stop object is a type of object that occurs frequently (e.g., there are many bus stops throughout the city in which the image was captured) and that the bus stop object has low distinctiveness (e.g., the bus stop object lacks distinctive features that set it apart from other bus stops). Further, with respect to prominence, the bus stop object is of significant size and can be seen from various angles and distances.

Additionally, the bus stop object is visually constant, with minimal changes in appearance over time. Changes in the appearance of the bus stop object can include minor changes due to climatic conditions (e.g., snow cover) and human interventions (e.g., graffiti and posters) that do not render the bus stop object significantly less recognizable. Further, the bus stop object is a structure that is attached to the ground and has a constant location over time, making the bus stop object more apt to be selected as a landmark.

Furthermore, with respect to the context associated with the bus stop object (e.g., the context associated with information that can be included in the context data described in the method), the bus stop object is in the background and further away from the vantage point from which the image was captured, which can obstruct the view of the bus stop object. For example, from various vantage points, the pole object can obstruct the bus stop object. The image of the bus stop object was captured during daylight, which allows the bus stop object to be clearly visible without additional light (e.g., a street lamp). Further, the bus stop object is also located close to a street lamp (not shown) and has its own light source that illuminates when night falls (e.g., a lamp inside the bus stop object).

As such, when landmarks are selected (e.g., selected by the computer) at the vantage point at which the image was captured, the bus stop object is less apt to be included than the other potential landmarks in the image that occur less frequently, are more distinctive, and are more visible from the vantage point.

A semantic tag associated with the pedestrian object can be used to indicate that the pedestrian object is a type of object that occurs frequently (e.g., there are many pedestrians present at various locations throughout the city in which the image was captured) and that the pedestrian object has low distinctiveness (e.g., at a distance pedestrians tend to look alike, and up close many pedestrians are not very distinctive in appearance). Further, with respect to prominence, the pedestrian object is not especially large. It is often obstructed by other objects, including, for example, the pole object and the bus stop object, depending on the vantage point from which the pedestrian object is viewed.

Additionally, the pedestrian object is not visually constant since changes in clothing and other aspects of a pedestrian’s physical appearance are relatively frequent. For example, changes in the appearance of the pedestrian object can include changes in the clothing, eyewear, and hats worn by the pedestrian object. Further, the pedestrian object is mobile and has a location that is highly variable over time. For example, the pedestrian object can move from a home location to the bus stop object to a bus that transports the pedestrian object away from the location in the image.

Furthermore, with respect to the context associated with the pedestrian object (e.g., the context associated with information that can be included in the context data), the pedestrian object is in the background and at an intermediate range from the vantage point from which the image was captured. The image of the pedestrian object was captured during daylight, which allows the pedestrian object to be clearly visible without additional light (e.g., a street lamp).

The pedestrian object is also close to a street lamp (not shown) that can illuminate the pedestrian object when darkness falls. Even so, when landmarks are selected (e.g., selected by the computer) at the vantage point at which the image was captured, the pedestrian object is less apt to be included than the other potential landmarks in the image that occur less frequently, are more distinctive, are more prominent, and are more visible from the vantage point. Further, the high mobility of the pedestrian object also makes it less apt to be selected as a landmark.

A semantic tag associated with the vehicle object can be used to indicate that the vehicle object is a type of object that occurs frequently (e.g., there are many vehicles present at various locations throughout the city in which the image was captured) and that the vehicle object has low distinctiveness (e.g., there are many vehicles that appear similar, including vehicles of the same make, model, and color).

Further, with respect to prominence, the vehicle object is of significant size and can be seen from various angles and distances. However, because the vehicle object is mobile, it is often obstructed by other objects. Additionally, the vehicle object is visually constant, since the shape and color of the vehicle object remain constant over time. Further, the vehicle object is highly mobile and has a location that can change rapidly in a short period of time. For example, except when parked, the vehicle object tends to change location frequently as it is used to transport people and goods throughout the city in which the image was captured.

Furthermore, with respect to the context associated with the vehicle object (e.g., a context associated with information that can be included in the context data), the vehicle object is far in the background relative to the vantage point from which the image was captured. The image of the vehicle object was captured during daylight, which allows the vehicle object to be clearly visible without additional light (e.g., a street lamp).

The vehicle object also has its own light source (e.g., headlights) that can illuminate the area around the vehicle object but does not provide much illumination of the vehicle object itself (e.g., the headlights and tail lights of the vehicle object are clearly visible but give a different impression of the vehicle object's appearance once darkness has fallen).

As such, when landmarks are selected (e.g., selected by the computer) at the vantage point at which the image was captured, the vehicle object is less apt to be included than the other potential landmarks in the image that occur less frequently, are more distinctive, are more prominent, and are more visible from the vantage point. Further, the high mobility of the vehicle object also makes it less apt to be selected as a landmark.
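The walkthrough of the five objects can be condensed into a toy scoring function. This is a sketch under loud assumptions: the `TaggedObject` fields are a simplified, yes-or-no reading of the semantic tags described above, and the weights in `landmark_score` are invented for illustration, since the patent does not specify how the attributes are combined.

```python
from dataclasses import dataclass


@dataclass
class TaggedObject:
    name: str
    frequent: bool           # type of object occurs often in the city
    distinctive: bool        # stands apart from others of its type
    prominent: bool          # large and visible from many angles
    visually_constant: bool  # looks the same over time
    mobile: bool             # location changes over time


def landmark_score(obj: TaggedObject) -> int:
    """Higher scores suggest a better landmark. The weights are hypothetical."""
    score = 0
    score += 0 if obj.frequent else 2
    score += 2 if obj.distinctive else 0
    score += 1 if obj.prominent else 0
    score += 1 if obj.visually_constant else 0
    score -= 3 if obj.mobile else 0
    return score


objects = [
    TaggedObject("pole", True, False, True, True, False),
    TaggedObject("fountain", False, True, True, False, False),
    TaggedObject("bus stop", True, False, True, True, False),
    TaggedObject("pedestrian", True, False, False, False, True),
    TaggedObject("vehicle", True, False, True, True, True),
]

ranked = sorted(objects, key=landmark_score, reverse=True)
best = ranked[0].name
print(best)
```

With these weights the fountain ranks first and the mobile objects rank last, which matches the outcome described for this image.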

Landmark Detection Depicts A Scene With Features Associated With Various Objects

The output can get generated and determined by a computer that includes features of the computer, the server computer, and the training computer. The image includes a building object, two sculpture objects, a signage object, and an address object.

The image depicts a scene with features associated with various objects identified by a content analysis system (e.g., a content analysis system including machine-learned models trained to detect features of input content, which can include images). For example, the content analysis system can include features of the computer, the server computer, and the training computer.

Further, the features of the image, including the building object, the two sculpture objects, the signage object, and the address object, can become associated with various semantic tags (e.g., semantic tags that include features of the plurality of semantic tags) that can include descriptions of various aspects of the features. For example, the image features can become associated with a plurality of semantic tags based on image content analysis performed by machine-learned models.

A semantic tag associated with the two sculpture objects can get used to state that the sculpture objects become associated with a type of object that occurs infrequently (e.g., there are a small number of matching pairs of relief sculptures on buildings in the city in which the image got captured) and that the sculpture objects are highly distinctive (e.g., the sculpture objects have unique sculptural features that set them apart from other sculptures).

Further, about prominence, the two sculpture objects are a significant size (e.g., significant relative to a predetermined significance threshold which can become associated with physical dimensions including the height of an object) and protrude from the surface of the building object, which allows the sculpture objects to be seen from various angles and distances.

The location of one sculpture object next to the other further increases the distinctiveness of the pair, as the distinctiveness of either object can get based not only on the individual distinctiveness of each object but also on the distinctiveness of both objects as part of a pair of objects. Accordingly, combinations of features can get associated with a semantic tag (e.g., a semantic tag describing “a pair of relief sculptures”).

Additionally, the two sculpture objects are visually constant since the sculptures got carved in granite (a hard-wearing stone) and change little over time. Further, the sculpture objects are part of the building object, firmly attached to the building object, and remain at the same location over time, making the sculpture objects more apt to get selected as a landmark.

For the context associated with the two sculpture objects (e.g., a context associated with information that can get included in the context data described in the method), the sculpture objects get located at a readily visible height from the ground level. The image of the sculpture objects got captured when daylight allows the sculpture objects to be clearly visible without more light. Neither sculpture object includes a light source, and neither is positioned where an external light source can illuminate it at night. As such, when landmarks get selected (e.g., selected by the computer) at the vantage point at which the image got captured, the sculpture objects are more apt to get included than the other potential landmarks in the image that occur more frequently, are less prominent, and are less distinctive.

A semantic tag associated with the signage object can get used to state that the signage object becomes associated with a type of object that frequently occurs (e.g., there are many signs of various sizes and shapes on buildings throughout the city in which the image got captured) and that the signage object has low distinctiveness (e.g., from a distance the signage object looks like many other different signs and there are other signs that appear the same as the signage object).

Further, about prominence, the signage object is not very large, does not occupy a prominent position, and may be difficult to discern from a long distance or from an angle that is not close to being directly in front of the signage object. The signage object is itself visually constant, with minimal changes in appearance over time. But, exposure to the elements (e.g., snow cover) can significantly alter the visibility of the signage object. Further, the signage object is at a constant location over time, making the signage object more apt to get selected as a landmark.

As to the context associated with the signage object (e.g., the context associated with the information included in the context data), the image of the signage object got captured when daylight allows the signage object to be clearly visible without additional light (e.g., a street lamp). The signage object includes a light source that can illuminate the signage object when darkness falls. As such, when landmarks get selected (e.g., selected by the computer) at the vantage point at which the image got captured, the signage object is less apt to get included than the other potential landmarks in the image that occur less frequently, are more prominent, and are more distinctive.

A semantic tag associated with the address object can get used to state that the address object gets associated with a type of object that frequently occurs (e.g., there are many street address signs of various sizes and shapes on buildings, poles, and other structures throughout the city in which the image got captured) and that the address object has low distinctiveness (e.g., from a distance the address object looks like many other address signs and there are other signs inscribed with the number “107” that appear the same as the address object).

Further, about prominence, the address object is small in size and can be difficult to discern from a distance. The address object is itself visually constant, with minimal changes in appearance over time. Further, a heating element in the address object prevents the address object from becoming obscured due to environmental conditions, including snow or frost. Further, the address object has a constant location over time which makes the address object more apt to become selected as a landmark.

Concerning the context associated with the address object (e.g., the context associated with information that can get included in the context data described), the image of the address object got captured during a time at which daylight allows the address object to be clearly visible without more light (e.g., a street lamp). But, the address object does not include its own light source, nor is it illuminated by an external light source (e.g., a street lamp) at night.

As such, when landmarks get selected (e.g., selected by the computer) at the vantage point at which the image got captured, the address object is less apt to become included than the other potential landmarks in the image that occur less frequently, are more distinctive, and are more prominent. By way of example, the computer can select at least one of the objects as a landmark for use as a navigational instruction (e.g., the navigational instruction described in the method depicted in FIG. 7).

Landmark identification according to example embodiments

The image is a map of an area (Paris and the environs surrounding Paris), including representations of locations associated with semantic tags (e.g., semantic tags that include features of the plurality of semantic tags described in the method) that denote locally prominent landmarks.

In this example, the objects mark the locations of fountains (e.g., fountains with features of the fountain object) that are locally prominent.

One region of the map, which includes one of the objects, includes fewer fountains than the other region, which has plenty of fountains, including two of the other objects. In determining a locally prominent landmark to select for use in navigation, the frequency at which an object associated with a semantic tag occurs can influence the selection of the object for use as a landmark, with less frequently occurring objects more apt to become selected as a landmark than more commonly occurring objects.

Accordingly, all other things being equal, the object in the region with fewer fountains is more likely to become selected as a landmark than the objects in the region with many fountains. For example, the computer can select one of the objects as a landmark for use as a navigational instruction (e.g., the navigational instruction described in the method).

Landmark identification in an environment

Operations in the environment can get performed by a computer that includes features of the computer, the server computer, and the training computer. The environment includes a computer, a road, two field of view thresholds, a vantage point, a direction of travel, a tree object, a monument object, a lamppost object, a street sign object, and a park bench object.

The computer includes features of the computer. It gets located inside the vantage point, a vehicle traveling on the road along a travel path corresponding to the direction of travel. The vantage point includes sensors, including cameras, that provide a view of the environment that captures the portion of the environment in front of the two field of view thresholds. Further, the field of view thresholds delineate a boundary that divides the environment into a first region that includes the street sign object, in which objects can get selected for use as a landmark; and a second region that includes the park bench object, in which objects cannot become selected for use as a landmark.

From the vantage point, the objects associated with semantic tags include the park bench object, which is a park bench that is outside the field of view of the sensors of the computer and will not get selected as a potential landmark. The vantage point is moving away from the park bench object. An object behind a viewer moving away from the object may be of lesser value for use as a navigational instruction.

Within the field of view from the vantage point are the tree object, which is a maple tree; the monument object, which is a monument on a high pedestal including a large bronze statue of a rider garbed in Napoleonic era attire and mounted on a horse; the lamppost object, which is a small lamppost; and the street sign object, which is a stop sign.

In this example, the monument object gets selected for use as a landmark due to its rarity (e.g., the monument is one of a kind and the occurrence of similar monuments is low), prominence (e.g., the monument is large and mounted on a high pedestal), distinctiveness (e.g., the monument has a uniquely distinctive appearance), and its location which is within the field of view from the vantage point.
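
The field of view test in this example can be sketched in a few lines of Python. This is only an illustration, not the method from the patent: it assumes flat local coordinates in meters and a compass heading for the direction of travel, and the `half_angle_deg` parameter is a hypothetical stand-in for the two field of view thresholds.

```python
import math

def in_field_of_view(vantage, heading_deg, obj, half_angle_deg=60.0):
    """Return True when obj lies within half_angle_deg of the direction of travel.

    vantage and obj are (x, y) positions in meters; heading_deg is a compass
    heading (0 = +y, 90 = +x). half_angle_deg is a hypothetical threshold.
    """
    dx, dy = obj[0] - vantage[0], obj[1] - vantage[1]
    bearing = math.degrees(math.atan2(dx, dy))
    # Wrap the difference into [-180, 180) before comparing.
    offset = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    return abs(offset) <= half_angle_deg

# Hypothetical positions: the vehicle heads along +y from the origin.
objects = {"monument": (10.0, 50.0), "park bench": (5.0, -30.0)}
candidates = [name for name, pos in objects.items()
              if in_field_of_view((0.0, 0.0), 0.0, pos)]
```

With these made-up positions, the monument ahead of the vehicle stays a candidate, while the park bench behind it gets filtered out.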

Accessing a Plurality of Locally Prominent Semantic Tags Associated With a Plurality of Images

Parts of the method can get executed or implemented on computers, including, for example, the computer, the server computer, and the training computer. Further, parts of the method can get executed or implemented as an algorithm on the hardware devices or systems disclosed herein. The steps get depicted in a particular order for purposes of illustration and discussion. Using the disclosures provided herein, those of ordinary skill in the art will understand that various steps of any of the methods disclosed herein can become adapted, modified, rearranged, omitted, and expanded without deviating from the scope of the patent.

The method can include accessing a plurality of semantic tags associated with a plurality of images. Each semantic tag of the plurality of semantic tags can become associated with features depicted by one of the plurality of images. For example, each semantic tag can include a semantic description of an object in a scene depicted by one of the plurality of images. Further, each feature of the features can get associated with a geographic location. The plurality of images can include digital images (e.g., a two-dimensional image) of a part of an environment (e.g., an image of a set of objects at a particular location in an environment).

The plurality of images can get encoded in any type of image format, including a combination of raster images (e.g., bitmaps comprising a grid of pixels) and vector images (e.g., polygonal representations of images based on positions of coordinates including x and y axes of a two-dimensional plane). The images can include still images, image frames from a movie, and other types of imagery, including LIDAR imagery, RADAR imagery, and other types of imagery.

Examples of digital image formats used by the plurality of images can include JPEG (Joint Photographic Experts Group), BMP (Bitmap), TIFF (Tagged Image File Format), PNG (Portable Network Graphics), and GIF (Graphics Interchange Format). The images can get collected from various sources such as user-submitted imagery, imagery in the public domain (e.g., obtained via web crawl and properly aggregated and anonymized), street-level panoramic imagery collected from various sources, including user devices (e.g., smartphones with cameras), and other sources of images.

The plurality of semantic tags associated with the images can become associated with features including physical dimensions (e.g., physical dimensions of objects in an image including the height, length, and width of an object); descriptions (e.g., descriptions including manually created descriptions and descriptions generated by a content analysis system which can include features of the computer, the server computer, and the training computer) and object identifiers (e.g., the identity of objects depicted in the images).

For example, an identifier for the Eiffel tower in Paris, France, can include an identifier associated with the name of the Eiffel tower in English, “Eiffel tower,” as well as the name of the Eiffel tower in French, “La tour Eiffel.” Furthermore, semantic tags associated with the Eiffel tower can include information associated with the Eiffel tower including physical dimensions (e.g., the height of the Eiffel tower), color (e.g., brown), and semantic tags associated with the type of structure (e.g., building, tower, tourist attraction, and monument). More information associated with any of the plurality of images and the plurality of semantic tags can include: a location (e.g., a street address and an altitude, latitude, and longitude associated with an image); a time of day (e.g., a time of day when an image got captured); and a date (e.g., a date when an image got captured).

The computer can access locally stored data (e.g., stored in a computer’s storage device), including information associated with the plurality of semantic tags and images. Further, the computer can access (e.g., access via the network) data (e.g., the plurality of images and the plurality of semantic tags) stored on a remote storage device (e.g., a storage device of the server computer and the training computer).

The computer can receive data including information associated with the plurality of semantic tags and the plurality of images via a communication network (e.g., a wireless and wired network including a LAN, WAN, or the Internet) through which signals (e.g., electronic signals) and data can get sent and received. The computer can, for instance, receive the plurality of images and the plurality of semantic tags from the server computer and the training computer via the network.

Each of the features can get associated with a time of day (e.g., an hour, minute, and second of the day), a season (e.g., winter, summer, spring, or autumn), a visual constancy (e.g., the extent to which the features appear the same over time), and locations from which each of the features is visible (e.g., geographic locations including latitude, longitude, and altitude from which each feature is visible).

For example, the visual constancy of an existing building can get associated with the size of the building remaining the same over time (e.g., from month to month or year to year). As such, a building under construction with only a building foundation at start, a bare steel frame structure after two months, and a completed facade after three months would have a lower visual constancy than a fully constructed building with minimal structural changes over time.

The method can include identifying, based at least in part on the plurality of semantic tags, landmarks that include the features that meet entropic criteria that measure characteristics of the features, including a localized prominence of each of the features. For example, the computer can access data associated with the plurality of semantic tags that state an area’s landmarks. Further, the landmarks can get associated with features (e.g., physical dimensions or shape) compared to entropic criteria used to identify the landmarks.

The entropic criteria can become associated with the frequency with which each of the features occurs in the area (e.g., the rate of occurrence of the features), the distinctiveness of each feature (e.g., the extent to which each feature is different from other features in the area), the prominence of each feature (e.g., how prominent a feature is in terms of physical dimensions and visibility), the visual constancy of a feature (e.g., how much the feature changes visually over time), and a locational persistence of the feature (e.g., the extent to which the feature will remain at the same location over time). Satisfaction of the entropic criteria can get based, for example, on a feature being infrequent (e.g., the only flag post in an area or one of two tall buildings in an area).

For example, the entropic criteria can get applied to the semantic tags associated with an image to determine that the fountain object satisfies the entropic criteria based partly on its lower frequency, greater distinctiveness, and greater prominence than the other objects in the image.

By way of further example, the entropic criteria can get applied to the semantic tags associated with the image to determine that the sculpture object and the sculpture object meet the entropic criteria based in part on their lower frequency, greater distinctiveness, and greater prominence in comparison to the other objects in the image.

Clustering or other algorithmic techniques can become used to determine a rarity or infrequency associated with each feature, which can then guide the selection of features for use as landmarks. As one example, for each location, an area around the location can get analyzed to identify which features associated with the location are most rare (e.g., a histogram of semantic tags in an area around a vantage point location might determine that a monument with a rider mounted on a horse occurs once while a set of traffic lights occurs twenty times, thereby indicating that the monument has higher entropy value and is a better choice for use as a landmark).
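
The histogram idea can be sketched as follows. The patent does not give a formula here, so scoring each tag by its surprisal (the negative base-2 logarithm of its relative frequency) is an assumption made for illustration, and the tag names are hypothetical.

```python
import math
from collections import Counter

def tag_surprisal(tags):
    """Return -log2(relative frequency) for each distinct tag in the area."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: -math.log2(n / total) for tag, n in counts.items()}

# Hypothetical tags observed around one vantage point:
# one equestrian monument and twenty sets of traffic lights.
nearby_tags = ["equestrian monument"] + ["traffic light"] * 20
scores = tag_surprisal(nearby_tags)
rarest = max(scores, key=scores.get)
```

Under this scoring, the single monument gets a much higher value than the traffic lights, which matches the intuition that rarer tags make better landmark candidates.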

By way of further example, the satisfaction of the entropic criteria can include the distinctiveness of various characteristics of a feature about other similar features in the area (e.g., a small house located on one corner of an intersection with four corners will contrast with, and be more distinctive than, high-rise buildings located on the other three corners).

Landmarks can get determined from the semantic tag statistics aggregated geographically and over time for each location by focusing on “tags” of high entropy, e.g., those tags associated with the features that appear to persist in time and may exhibit highly localized prominence. Thus, the system can identify tags that are high confidence but comparatively unusual or rare in the surrounding area.

The entropic criteria can include a frequency of occurrence of each of the features within a predetermined area not exceeding a predetermined threshold frequency (e.g., the number of times the features occur per unit of area), a temporal persistence (e.g., how long a feature has been present in a particular area) of each of the features at a location exceeding a predetermined threshold duration, and size (e.g., physical dimensions) of each of the features exceeding a threshold size.

For example, a feature (e.g., a statue) occurring more than once in a twenty-by-twenty-meter area could exceed a predetermined threshold frequency of a statue occurring no more than once in a fifty-by-fifty-meter area.
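
The three threshold tests and the statue arithmetic can be sketched together. The field names, threshold values, and the `FeatureStats` type below are all hypothetical; the patent only states that such predetermined thresholds exist.

```python
from dataclasses import dataclass

@dataclass
class FeatureStats:
    occurrences: int      # times the feature type occurs in the surveyed area
    area_m2: float        # size of the surveyed area in square meters
    years_present: float  # how long the feature has been at its location
    height_m: float       # stand-in for the physical size of the feature

def meets_entropic_criteria(f, max_per_m2, min_years, min_height_m):
    """Frequency must not exceed its threshold; persistence and size must exceed theirs."""
    density = f.occurrences / f.area_m2
    return (density <= max_per_m2
            and f.years_present > min_years
            and f.height_m > min_height_m)

# Threshold from the example: no more than one statue per 50 m x 50 m area.
threshold = 1 / (50 * 50)

# Two statues in a 20 m x 20 m area: 2 / 400 = 0.005 per square meter.
crowded = FeatureStats(occurrences=2, area_m2=20 * 20, years_present=30, height_m=4)
# One statue in a 50 m x 50 m area: exactly at the threshold.
sparse = FeatureStats(occurrences=1, area_m2=50 * 50, years_present=30, height_m=4)

crowded_ok = meets_entropic_criteria(crowded, threshold, min_years=1, min_height_m=2)
sparse_ok = meets_entropic_criteria(sparse, threshold, min_years=1, min_height_m=2)
```

Here the crowded case fails because 0.005 statues per square meter is well above the threshold of 0.0004, while the sparse case passes.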

The method can include selecting, based at least in part on context data associated with a location on a path, including a plurality of locations, at least one landmark for use in navigation at the location. For example, the computer can determine a context, including the time of day, season, and amount of traffic proximate to a vantage point. Based on, for example, context indicating that a heavy fog has enveloped the area around the location, the computer can select a landmark that is brightly illuminated.

The computer can determine that the context data indicates that the season is winter and that the area is covered in snow. Based on some part of the landmarks getting covered in snow, the computer can select at least one landmark that remains locally prominent when covered in snow or that gets situated such that it is not covered in snow. For example, a very tall building (the tallest building in the area) may not get covered in snow and thus could remain locally prominent. In contrast, a distinctive mural on a wall may not be visible due to snow cover.
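
A context filter along these lines might look like the sketch below. The dictionary keys (`illuminated`, `obscured_by_snow`) and the context flags (`fog`, `snow`) are hypothetical names chosen for this example, not terms from the patent.

```python
def select_for_context(landmarks, context):
    """Return names of landmarks that remain usable under the given context flags."""
    usable = []
    for lm in landmarks:
        # In fog, keep only landmarks that are brightly illuminated.
        if context.get("fog") and not lm["illuminated"]:
            continue
        # In snow, drop landmarks that snow cover would hide.
        if context.get("snow") and lm["obscured_by_snow"]:
            continue
        usable.append(lm["name"])
    return usable

# Hypothetical landmark records.
landmarks = [
    {"name": "tall building", "illuminated": True, "obscured_by_snow": False},
    {"name": "mural", "illuminated": False, "obscured_by_snow": True},
    {"name": "statue", "illuminated": False, "obscured_by_snow": False},
]
in_snow = select_for_context(landmarks, {"snow": True})
in_fog = select_for_context(landmarks, {"fog": True})
```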

The amount of traffic proximate to a vantage point can become determined by the computer based on outputs from the sensor array, which can include cameras that can detect the amount of traffic (e.g., foot traffic and vehicular traffic) and other activity (e.g., construction) proximate to the vantage point. The computer can select at least one landmark based on the amount of traffic detected by the sensor array.

The method can include generating at least one navigational instruction that references at least one landmark. At least one navigational instruction can include visual instructions (e.g., textual instructions displayed on a display device) and audible instructions (e.g., instructions emitted from an audio output device).

The computer can generate visual instructions, including textual instructions displayed on a display device (e.g., a flat panel display). The textual instructions can, for example, describe the appearance of a landmark (e.g., “a large white building with a row of Ionic columns”) and the location of the landmark (e.g., “straight-ahead”). The visual instructions can include images associated with the landmark. For example, the textual instructions of the “large white building with a row of Ionic columns” can get accompanied by an image of the building.

Furthermore, the image of the large white building with the row of Ionic columns included in the visual instructions may get generated based at least in part on the location of the vantage point relative to the landmark. For example, when a landmark is a four-sided building, the image associated with the visual instructions can be an image of the side of the building that is visible from the vantage point.

By way of further example, the computer can generate audible instructions describing the appearance of a landmark (e.g., “an ornate gate emblazoned with a lion and a dragon”) and the location of the landmark (e.g., “on your left at the next intersection”). Further, the audible instructions can become the same as the visual instructions (e.g., when the visual instructions visually state “a red building on the right,” the audio instructions can state “a red building on the right”).

The audible instructions can become different from the visual instructions (e.g., when the visual instructions visually state “a red building on the right,” the audio instructions can indicate “a large red building next to a small white building”).

At least one audible instruction can get generated at a volume level associated with the proximity of at least one landmark to the vantage point. For example, the audible instructions can get generated at a volume inversely proportional to the distance between the vantage point and at least one landmark (e.g., the volume of the audible instructions increases as the distance to at least one landmark decreases).
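
As a rough sketch of the inverse relationship, the function below scales volume by a reference distance divided by the actual distance. The reference distance, the cap at a maximum volume, and the function name are assumptions added so the example stays well behaved at short range.

```python
def instruction_volume(distance_m, reference_distance_m=50.0, max_volume=1.0):
    """Volume in (0, max_volume], inversely proportional to distance.

    At or inside reference_distance_m (a hypothetical tuning value) the
    volume is capped at max_volume.
    """
    if distance_m <= 0:
        return max_volume
    return min(max_volume, max_volume * reference_distance_m / distance_m)

far = instruction_volume(200.0)
near = instruction_volume(100.0)
```

Halving the distance from two hundred meters to one hundred meters doubles the volume in this sketch.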

Determining the Rate at Which Each Feature Occurs Within A Predetermined Area

Parts of the method can get executed or implemented on computers, including, for example, the computer, the server computer, and the training computer. Further, parts of the method can get executed or implemented as an algorithm on the hardware devices or systems disclosed herein. The steps get depicted in a particular order for purposes of illustration and discussion. Using the disclosures provided herein, those of ordinary skill in the art will understand that various steps of any of the methods disclosed herein can get adapted, modified, rearranged, omitted, and expanded without deviating from the scope of the patent.

The method can include determining, based at least in part on the plurality of semantic tags, the rate at which each feature occurs within a predetermined area. Further, the features can include individual features, including the size of an object or the brightness of illuminated objects, and combinations of the features, including combinations of colors and shapes.

For example, the computer can determine a rate at which a combination of features (e.g., a bright yellow colored stylized letter of the alphabet) occurs within a one square kilometer area. Further, the features can get associated with semantic tags for an alphabetical letter (e.g., the letter “N”), a particular color (e.g., bright yellow), and an establishment associated with the features (e.g., a bank, restaurant, or other business). The computer can determine that the combination of features occurs once within the one square kilometer area.
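
Counting how often a combination of tags occurs in an area can be sketched like this. Representing each observed feature as a set of tag strings is an assumption, as are the tag names.

```python
def combination_rate(features, combination, area_km2):
    """Occurrences per square kilometer of features carrying every tag in `combination`."""
    wanted = set(combination)
    matches = sum(1 for tags in features if wanted <= set(tags))
    return matches / area_km2

# Hypothetical features tagged within a one square kilometer area.
features = [
    {"letter N", "bright yellow", "bank"},
    {"letter N", "red", "restaurant"},
    {"bright yellow", "restaurant"},
]
rate = combination_rate(features, {"letter N", "bright yellow"}, area_km2=1.0)
```

Only the first feature carries both tags, so the combination occurs once per square kilometer even though each tag on its own occurs twice.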

Determining, based at least in part on the plurality of semantic tags, a rate at which each of the features occurs within a predetermined area can get used in identifying, based at least in part on the plurality of semantic tags, landmarks that include the features that meet entropic criteria that measure a localized prominence of each of the features.

The method can include determining that the landmarks include the features that occur the least frequently or that occur at a rate below a threshold rate. For example, in determining a restaurant to use as a landmark among a group of restaurants, the computer can determine that the restaurant that occurs the least among other restaurants in the area will get included in the landmarks. By way of further example, in determining an object to use as a landmark, the computer can determine that a prominent sculpture that occurs below a threshold rate (e.g., a threshold rate of one occurrence per square kilometer) will get included in the landmarks.

Determining that the landmarks include the features that occur the least or that occur at a rate below a threshold rate can get used in identifying, based at least in part on the plurality of semantic tags, locally prominent landmarks that include the features that satisfy entropic criteria that measure a localized prominence of each of the features.

The method can include determining a confidence score for each feature based at least in part on the number of times that each feature of the features has become associated with a semantic tag of the plurality of semantic tags. For example, the computer can access data associated with the number of times that each feature has gotten tagged with the same tag or a related set of tags (e.g., a set of tags including a window tag can also include a porthole tag). A feature tagged with the same tag a greater number of times can get associated with a higher confidence score. A feature tagged with a tag a lower number of times can get associated with a lower confidence score.

Determining a confidence score for each of the features based at least in part on the number of times that each respective feature of the features has gotten associated with a semantic tag of the plurality of semantic tags can get used in identifying, based at least in part on the plurality of semantic tags, landmarks that include the features that meet entropic criteria that measure a localized prominence of each of the features.

The method can include identifying the features with a confidence score that satisfies confidence score criteria as a landmark. For example, the computer can identify the features with a confidence score that equals or exceeds a threshold confidence score as a landmark.

The confidence score can get based at least in part on the number of different perspectives from which each of the features associated with a semantic tag has become viewed (e.g., semantic tags associated with images of the same object viewed from various angles, perspectives, and distances), and on the recency (e.g., the amount of time that has elapsed since the features got associated with the semantic tag) with which the features became associated with a semantic tag.
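
One way to combine tag count, number of perspectives, and recency into a single score is sketched below. The patent names these inputs but not how to combine them, so the saturating ratios, the half-life decay, and every constant here are assumptions for illustration only.

```python
def confidence_score(tag_count, perspectives, days_since_last_tag,
                     half_life_days=365.0):
    """Score in [0, 1) that rises with tag count and perspectives and decays with age."""
    support = tag_count / (tag_count + 5)         # more tags, more confidence
    diversity = perspectives / (perspectives + 2)  # more viewpoints, more confidence
    recency = 0.5 ** (days_since_last_tag / half_life_days)  # older tags count less
    return support * diversity * recency

THRESHOLD = 0.5  # hypothetical threshold confidence score

well_tagged = confidence_score(tag_count=50, perspectives=6, days_since_last_tag=30)
rarely_tagged = confidence_score(tag_count=3, perspectives=1, days_since_last_tag=30)
is_landmark = well_tagged >= THRESHOLD
```

A feature tagged many times from several perspectives clears the threshold in this sketch, while a feature tagged a few times from one perspective does not.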

Identifying, as a landmark, the features with a confidence score that meets confidence score criteria can get used in identifying, based at least in part on the plurality of semantic tags, landmarks that include the features that meet entropic criteria that measure a localized prominence of each of the features.

The method can include determining clusters of the features that meet the entropic criteria. Further, each of the clusters can include features that have a common semantic type. For example, the computer can determine clusters of buildings that can meet entropic criteria associated with a building density for the cluster of buildings (e.g., the number of buildings within a predetermined area, a predetermined radius from the vantage point, or within a predetermined number of blocks).
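
Grouping nearby features by semantic type can be sketched as below. The tuple layout, the flat coordinates in meters, and the sample features are assumptions; a real system could use a proper clustering algorithm instead of this simple radius test.

```python
import math
from collections import defaultdict

def clusters_within_radius(features, vantage, radius_m):
    """Group features by semantic type, keeping those within radius_m of the vantage point."""
    clusters = defaultdict(list)
    for name, semantic_type, x, y in features:
        if math.hypot(x - vantage[0], y - vantage[1]) <= radius_m:
            clusters[semantic_type].append(name)
    return dict(clusters)

# Hypothetical features as (name, semantic type, x in meters, y in meters).
features = [
    ("bank", "building", 10.0, 0.0),
    ("hotel", "building", 0.0, 40.0),
    ("oak", "tree", 5.0, 5.0),
    ("mall", "building", 300.0, 0.0),
]
clusters = clusters_within_radius(features, vantage=(0.0, 0.0), radius_m=100.0)
building_count = len(clusters["building"])
```

The count of buildings inside the radius can then get compared against whatever building density the entropic criteria call for.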

Determining clusters of the features that meet the entropic criteria can get used in identifying, based at least in part on the plurality of semantic tags, locally prominent landmarks that include the features that satisfy entropic criteria that measure a localized prominence of each of the features.

Determining Each of The Landmarks from the Vantage Point Associated With the Location

Parts of the method can get executed or implemented on computers, including, for example, the computer, the server computer, and the training computer. Further, parts of the method can get executed or implemented as an algorithm on the hardware devices or systems disclosed herein.

This depicts steps performed in a particular order for purposes of illustration and discussion. Using the disclosures provided herein, those of ordinary skill in the art will understand that various steps of any of the methods disclosed herein can get adapted, modified, rearranged, omitted, and expanded without deviating from the scope of the patent.

The method can include determining the visibility of each of the landmarks from the vantage point associated with the location. Further, the context data can include a vantage point (e.g., a point within the location from which parts of the surrounding environment can be viewed). For example, the computer can determine the landmarks that are visible from the vantage point based in part on a distance to each of the landmarks (e.g., visibility can decrease with greater distance to a landmark), whether each of the landmarks is obstructed by other objects (e.g., trees obstructing a landmark), and a height of each of the landmarks (e.g., a very tall building that is four hundred meters tall can be more visible than a low-rise building that is twenty meters tall).

The visibility can be based at least in part on a distance from which each of the landmarks is visible from the vantage point, an amount of light that is cast on each of the landmarks (e.g., an amount of light emitted by street lamps proximate to the landmarks, and an amount emitted by a light source included in each of the landmarks), any obstructions between the vantage point and the landmarks, and physical dimensions of each of the landmarks (e.g., the height of a building).
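One way to picture how distance, light, obstruction, and height might combine is a simple multiplicative score. The patent does not give a formula, so everything here is an assumption made for illustration: the `Landmark` fields, the `visibility_score` function, the linear distance falloff, and the reference values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Landmark:
    name: str
    distance_m: float           # distance from the vantage point
    height_m: float             # physical height of the landmark
    illumination: float         # 0.0 (dark) to 1.0 (well lit), assumed scale
    obstructed_fraction: float  # 0.0 (clear) to 1.0 (fully hidden), assumed scale


def visibility_score(lm, max_distance_m=1000.0, reference_height_m=100.0):
    """Return a score in [0, 1]; higher means more visible (illustrative only)."""
    # Visibility falls off linearly with distance and reaches zero at max_distance_m.
    distance_term = max(0.0, 1.0 - lm.distance_m / max_distance_m)
    # Taller landmarks score higher, capped at the reference height.
    height_term = min(1.0, lm.height_m / reference_height_m)
    # Obstructions reduce the score in proportion to how much is hidden.
    clear_term = 1.0 - lm.obstructed_fraction
    return distance_term * height_term * lm.illumination * clear_term


tower = Landmark("tower", 200.0, 400.0, 1.0, 0.0)
low_rise = Landmark("low-rise", 200.0, 20.0, 1.0, 0.0)
print(visibility_score(tower) > visibility_score(low_rise))
```

With these assumed inputs the four-hundred-meter tower scores higher than the twenty-meter low-rise at the same distance, which matches the patent's height example.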

Determining the visibility of each of the landmarks from the vantage point associated with the location can be used in selecting, by the computer, based at least in part on context data associated with a location on a path including a plurality of locations, at least one landmark for use in navigation at the location, as described in 706 of the method.

The method can include determining a reaction time based at least in part on a velocity at the location and a distance to the closest landmark of the landmarks. For example, the computer can determine the reaction time as the time it will take to reach a landmark at the current velocity. When the vantage point is stationary (e.g., standing in one place or sitting in a stationary automobile), the reaction time will be greater than when the vantage point is in an automobile traveling at one hundred kilometers per hour.
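Read as "time to reach the landmark at the current velocity," the reaction time is distance divided by speed. The sketch below assumes that reading; the function name `reaction_time_s` and the choice to return infinity for a stationary vantage point are illustrative assumptions, not details from the patent.

```python
import math

KMH_TO_MS = 1000.0 / 3600.0  # kilometers per hour to meters per second


def reaction_time_s(distance_m, velocity_m_per_s):
    """Seconds until the closest landmark is reached at the current velocity."""
    if velocity_m_per_s <= 0:
        # A stationary vantage point never reaches the landmark, so the
        # available reaction time is treated as unbounded (assumption).
        return math.inf
    return distance_m / velocity_m_per_s


stationary = reaction_time_s(500.0, 0.0)
highway = reaction_time_s(500.0, 100.0 * KMH_TO_MS)
print(stationary > highway)
```

At one hundred kilometers per hour, a landmark five hundred meters away leaves roughly eighteen seconds, while a stationary observer has no time limit.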

The reaction time can be based at least in part on a mode of transportation associated with the vantage point. Furthermore, the mode of transportation can include a motor vehicle, a bicycle, or foot travel. For example, the reaction time for a slower mode of transportation (e.g., cycling) can be longer than the reaction time for a faster mode of transportation (e.g., riding on a bus).

The reaction time can vary based on whether the vantage point is associated with a vehicle driver or a passenger of a vehicle. When the vantage point is associated with the vehicle’s driver, the reaction time can be less than when the vantage point is associated with a passenger of the vehicle.

Determining a reaction time based at least in part on a velocity at the location and a distance to the closest landmark of the landmarks can be used in selecting, based at least in part on context data associated with the location on the path, including the plurality of locations, at least one landmark for use in navigation at the location, as described in 706 of the method.

The method can include determining the locally prominent landmarks that meet reaction time criteria associated with a minimum reaction time. For example, the computer can determine that the minimum reaction time is five seconds and that the landmarks are selected from the landmarks that will be visible for more than five seconds after the current time.
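The five-second example amounts to a filter over how long each landmark stays visible. A minimal sketch, assuming the visible durations have already been estimated elsewhere; the function name and the dictionary input format are hypothetical.

```python
def landmarks_meeting_reaction_time(visible_seconds, min_reaction_time_s=5.0):
    """Keep landmarks that stay visible for more than the minimum reaction time.

    visible_seconds: dict mapping a landmark name to the number of seconds it
    is expected to remain visible after the current time (assumed input).
    """
    return [
        name
        for name, seconds in visible_seconds.items()
        if seconds > min_reaction_time_s
    ]


candidates = {"obelisk": 12.0, "mural": 3.0, "fountain": 5.0}
print(landmarks_meeting_reaction_time(candidates))
```

The comparison is strict, following the "more than five seconds" wording, so a landmark visible for exactly five seconds is excluded.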

Determining the landmarks that meet reaction time criteria associated with a minimum reaction time can be used in selecting, based at least in part on context data associated with the location on the path, including the plurality of locations, at least one landmark for use in navigation at the location.

Determining a Direction of Travel Along the Path At The Vantage Point

Parts of the method can be executed or implemented on one or more computers, including, for example, the computer, the server computer, and the training computer. Further, parts of the method can be executed or implemented as an algorithm on the hardware devices or systems disclosed herein. Parts of the method can also be performed as part of the other methods described in the patent. The patent depicts the steps in a particular order for illustration and discussion purposes. Using the disclosures provided herein, those of ordinary skill in the art will understand that various steps of any of the methods disclosed herein can be adapted, modified, rearranged, omitted, and expanded without deviating from the scope of the patent.

The method can include determining a direction of travel along the path at the vantage point. Further, the visibility of each of the landmarks can be associated with a field of view (e.g., a field of view associated with the area in which the landmarks are visible) from the vantage point that is associated with the direction of travel (e.g., a field of view equal to a predetermined angle relative to a line corresponding to the direction of travel). The field of view thresholds can be associated with the visibility from the vantage point when traveling along the direction of travel 612.
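A field of view defined as "a predetermined angle relative to a line corresponding to the direction of travel" can be checked with basic trigonometry. The sketch below assumes a flat x/y plane with headings measured clockwise from the +y axis; the function names and the half-angle parameter are hypothetical.

```python
import math


def bearing_deg(origin, target):
    """Angle of target from origin in degrees, clockwise from the +y axis."""
    dx = target[0] - origin[0]
    dy = target[1] - origin[1]
    return math.degrees(math.atan2(dx, dy)) % 360.0


def in_field_of_view(vantage, heading_deg, landmark, half_angle_deg):
    """True when the landmark lies within half_angle_deg of the heading."""
    # Normalize the angular difference into the range [-180, 180).
    offset = (bearing_deg(vantage, landmark) - heading_deg + 180.0) % 360.0 - 180.0
    return abs(offset) <= half_angle_deg


vantage = (0.0, 0.0)
print(in_field_of_view(vantage, 0.0, (10.0, 100.0), half_angle_deg=60.0))  # ahead
print(in_field_of_view(vantage, 0.0, (0.0, -50.0), half_angle_deg=60.0))   # behind
```

Normalizing the offset keeps the check correct when the heading and the bearing straddle the 0/360 degree boundary.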

Determining a direction of travel along the path at the vantage point can be used in determining the visibility of each of the landmarks from the vantage point associated with the location.

The method can include determining the landmarks that face the direction of travel. For example, the computer can determine that the landmarks that face the direction of travel (e.g., the landmarks that are visible from in front of the vantage point) are more visible than the landmarks that do not face the direction of travel (e.g., the landmarks that are not visible from in front of the vantage point). In the patent's illustration, a tree object that faces the direction of travel is more visible than a park bench object that does not face the direction of travel of the vantage point.

Determining the locally prominent landmarks that face the direction of travel can be used in determining the visibility of each of the landmarks from the vantage point associated with the location.

The method can include determining the visibility based at least in part on a mode of transportation associated with the vantage point. For example, the computer can determine that the visibility from an open-top vehicle (e.g., a convertible automobile or the top of an open-air tour bus) is greater than the visibility from inside an enclosed vehicle (e.g., an automobile with a permanent roof).

Determining the visibility based at least in part on a mode of transportation associated with the vantage point can be used in determining the visibility of each of the landmarks from the vantage point associated with the location.

Selecting a Landmark Based on a Level of Familiarity With the Landmark

Parts of this method can be executed or implemented on one or more computers, including, for example, the computer, the server computer, and the training computer. Further, parts of the method can be executed or implemented as an algorithm on the hardware devices or systems. Parts of the method can also be performed as part of the other methods described in the patent. The patent depicts the steps in a particular order for purposes of illustration and discussion. Using the disclosures provided herein, those of ordinary skill in the art will understand that various steps of any of the methods disclosed herein can be adapted, modified, rearranged, omitted, and expanded without deviating from the scope of the patent.

The method can include selecting at least one landmark based on a level of familiarity with the landmarks. Further, the level of familiarity can be based at least in part on the user’s previous association with the location or with the features of the landmarks.

For example, the level of familiarity can be greater when a user has previously visited the location (e.g., the user has been to the same neighborhood in the past) or previously viewed a different landmark (e.g., a department store with a distinctively colored logo and physical exterior) that has a similar set of features as a landmark of the landmarks (e.g., a department store with the same distinctively colored logo and physical exterior).

The level of familiarity can be associated with the number of times and the frequency with which a user (e.g., a user associated with the navigation computer) has previously been at the location (e.g., within a threshold distance of the location). For example, the context data can include a record of the number of times a user has traveled past each of the landmarks. The computer can then determine when the number of times a user has traveled past a landmark satisfies familiarity criteria (e.g., a threshold minimum number of times passing the landmark) and select a landmark that satisfies the familiarity criteria.
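The pass-count criterion could look like the following sketch. It assumes the context data is available as a simple mapping from landmark name to the number of times the user has passed it; the function name `familiar_landmarks` and the threshold value are hypothetical.

```python
def familiar_landmarks(pass_counts, min_passes):
    """Return landmarks the user has passed at least min_passes times.

    pass_counts: dict mapping a landmark name to a pass count, standing in
    for the record kept in the context data (assumed format).
    """
    return [name for name, count in pass_counts.items() if count >= min_passes]


pass_counts = {"clock tower": 12, "new cafe": 1, "library": 3}
print(familiar_landmarks(pass_counts, min_passes=3))
```

A real system would also have to respect the privacy handling the patent describes for user history, which this sketch leaves out.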

The level of familiarity can be associated with the portion of the features (e.g., the number of visual characteristics) that each of the landmarks has in common with another landmark the user has previously viewed (e.g., viewed at a time preceding the time the landmarks were viewed from the vantage point). Further, the user’s level of familiarity with a landmark the user has viewed from the vantage point can be greater when the landmark shares a predetermined portion of its features (e.g., ninety percent of the features) with a different landmark the user has previously viewed.

The level of familiarity for a currently viewed landmark that is a restaurant associated with a chain of restaurants that share features (e.g., similar design features including the same or similar logo, signage, exterior color scheme, and exterior masonry) will be greater when the user has previously observed another landmark (e.g., another restaurant) that belongs to the same chain of restaurants and shares the same features.
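The "predetermined portion of shared features" idea can be sketched as a set overlap. This is an illustrative reading only: representing features as strings, measuring overlap as a fraction of the current landmark's features, and the names `shared_fraction` and `is_familiar` are all assumptions.

```python
def shared_fraction(current_features, seen_features):
    """Fraction of the current landmark's features also found on a seen landmark."""
    current = set(current_features)
    if not current:
        return 0.0
    return len(current & set(seen_features)) / len(current)


def is_familiar(current_features, previously_seen, threshold=0.9):
    """True when any previously seen landmark shares enough features."""
    return any(
        shared_fraction(current_features, seen) >= threshold
        for seen in previously_seen
    )


chain_look = {"red logo", "neon signage", "yellow awning", "brick exterior"}
seen_before = [
    {"red logo", "neon signage", "yellow awning", "brick exterior"},  # same chain
    {"blue logo", "glass exterior"},                                  # unrelated store
]
print(is_familiar(chain_look, seen_before))
```

A restaurant from the same chain shares every listed feature, so it clears the ninety percent threshold, while the unrelated store shares none.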

Furthermore, the determination of the landmarks that the user has previously viewed can be based on:

  • User information (e.g., information the user has agreed to provide that is maintained in a privacy-enhancing and secure manner)
  • User travel history (e.g., locations the user has previously visited)
  • Direct user input about landmarks (e.g., the user provides information on the landmarks that the user is familiar with)
  • User affiliations based at least in part on past user interactions and associations with a landmark (e.g., the user has provided information indicating the user’s membership in an organization associated with the landmark)

The context data can include:

  • Information associated with a time of day
  • A season
  • A language, such as English, Spanish, and Japanese
  • Features visible from the location
  • A mode of transportation, such as personal automobile, bus, bicycle, or foot travel

The context data can include information about whether an occupant of a vehicle is a driver or a passenger. For example, when the context data indicates that an occupant of a vehicle is a driver, the landmarks can include landmarks more visible from the driver’s side. By way of further example, when the context data indicates that an occupant of a vehicle is a passenger, the landmarks can include landmarks more visible from various passenger seating locations.

Selecting at least one locally prominent landmark based at least in part on a level of familiarity with the landmarks can be used in selecting, based at least in part on context data associated with the location on the path, including the plurality of locations, at least one landmark for use in navigation at the location.

The method can include adjusting at least one navigational instruction based at least in part on the level of familiarity. For example, the computer can access information associated with at least one landmark (e.g., a nickname for a landmark or a former name) and use the information to change the navigational instruction.

By way of further example, the computer can determine that when the level of familiarity exceeds a threshold level of familiarity associated with a user having passed a landmark more than a threshold number of times, the navigational instruction is selected from data (e.g., a semantic tag) that includes a nickname for the landmark.
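A minimal sketch of the nickname substitution, assuming the nickname and pass count are already available. The function names, the landmark names, and the threshold are made up for illustration and do not appear in the patent.

```python
def landmark_reference(official_name, nickname, times_passed, familiarity_threshold):
    """Pick the name to use for a landmark in a navigational instruction."""
    # Use the nickname only when one exists and the user has passed the
    # landmark more than the threshold number of times.
    if nickname and times_passed > familiarity_threshold:
        return nickname
    return official_name


def turn_instruction(direction, reference):
    """Format a simple turn instruction that references a landmark."""
    return f"Turn {direction} at {reference}."


name = landmark_reference(
    "the Harbor Street Clock Tower", "Old Tick", times_passed=12, familiarity_threshold=5
)
print(turn_instruction("right", name))
```

A first-time visitor would get the official name instead, since the pass count would not exceed the threshold.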

Adjusting at least one navigational instruction based at least in part on the level of familiarity can be used in generating at least one navigational instruction that references at least one landmark.

Determining a Mode of Travel Associated With the Vantage Point

Parts of the method can be executed or implemented on one or more computers, including, for example, the computer, the server computer, and the training computer. Further, parts of the method can be executed or implemented as an algorithm on the hardware devices or systems disclosed herein.

Parts of the method can also be performed as part of the other methods described in the patent. The patent depicts the steps in a particular order for purposes of illustration and discussion. Using the disclosures provided herein, those of ordinary skill in the art will understand that various steps of any of the methods disclosed herein can be adapted, modified, rearranged, omitted, and expanded without deviating from the scope of the patent.

The method can include determining a mode of travel associated with the vantage point.

For example, the computer can access data associated with the mode of transportation associated with the vantage point (e.g., an automobile traveling on a highway), which can be used to select landmarks that have greater localized prominence from the highway (e.g., large billboards and advertisements).

Determining a mode of travel associated with the vantage point can be used in determining landmarks, including the features that meet entropic criteria associated with a localized prominence of each of the features, as described in 706 of the method.

The method can include determining, based at least in part on a direction of travel and velocity along the path, the landmarks visible from the vantage point within a predetermined time associated with the mode of travel. For example, the computer can determine a velocity associated with the mode of transportation (e.g., a velocity in kilometers per hour when cycling) and determine the landmarks visible from the cyclist’s vantage point within the next thirty seconds.
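The thirty-second cycling example can be approximated by asking how long it takes to close the gap between the traveler and the point where each landmark comes into view. The sketch assumes each landmark has a known distance ahead along the path and a known visible range; those inputs and the function name `visible_within` are hypothetical.

```python
def visible_within(landmarks, velocity_m_per_s, window_s):
    """Return names of landmarks already visible or visible within window_s.

    landmarks: iterable of (name, distance_ahead_m, visible_range_m) tuples,
    where visible_range_m is the distance from which the landmark can be
    seen (assumed inputs).
    """
    result = []
    for name, distance_ahead_m, visible_range_m in landmarks:
        # Distance still to travel before the landmark comes into view.
        gap_m = max(0.0, distance_ahead_m - visible_range_m)
        if gap_m == 0.0:
            result.append(name)  # already visible
        elif velocity_m_per_s > 0 and gap_m / velocity_m_per_s <= window_s:
            result.append(name)  # comes into view inside the time window
    return result


ahead = [("church", 400.0, 300.0), ("silo", 600.0, 100.0), ("bridge", 50.0, 200.0)]
cycling_speed = 5.0  # meters per second, about 18 km/h
print(visible_within(ahead, cycling_speed, window_s=30.0))
```

At cycling speed the church comes into view after twenty seconds and the bridge is already visible, while the silo is still too far away.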

Determining, based at least in part on a direction of travel and velocity along the path, the landmarks that will be visible from the vantage point within a predetermined time associated with the mode of travel can be used in determining landmarks, including the features that meet entropic criteria associated with a localized prominence of each of the features, as described in 706 of the method.

The System Behind Finding Locally Prominent Semantic Features

Parts of the method can be executed or implemented on one or more computers, including, for example, the computer, the server computer, and the training computer. Further, parts of the method can be executed or implemented as an algorithm on the hardware devices or systems disclosed herein. Parts of the method can also be performed as part of the other methods described in the patent. The patent depicts the steps in a particular order for purposes of illustration and discussion.

Using the disclosures provided herein, those of ordinary skill in the art will understand that various steps of any of the methods disclosed herein can be adapted, modified, rearranged, omitted, and expanded without deviating from the scope of the patent.

The method can include accessing image data, including a plurality of images associated with a plurality of semantic tags. Each of the semantic tags can be associated with features of the plurality of images. Further, each of the features can be associated with a geographic location. For example, the computer can receive image data (e.g., tagged photographs from an image repository), including information associated with the plurality of semantic tags and the plurality of images, via a communication network (e.g., the Internet) through which signals (e.g., electronic signals) and data can be sent and received.

The method can include determining landmarks, including the features that meet entropic criteria associated with a localized prominence of each of the features. For example, the computer can access the plurality of semantic tags to determine the landmarks (e.g., objects with features that are distinctive, less common, and locally prominent) in an area that include the features (e.g., physical dimensions, color, brightness, and shape) that meet entropic criteria (e.g., physical dimensions exceeding a physical dimension threshold).
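The patent's own example of an entropic criterion is a physical dimension threshold. Since it also describes landmarks as "distinctive, less common, and locally prominent," another plausible reading is to score each semantic tag by how rare it is in the surrounding area. The sketch below uses surprisal (negative log probability) for that purpose; this is an assumption about what "entropic" could mean, not the patent's stated formula, and the function names and the two-bit threshold are hypothetical.

```python
import math
from collections import Counter


def tag_surprisal(tags_in_area):
    """Map each semantic tag to its surprisal in bits within the area.

    Rare tags get high values; common tags get values near zero.
    """
    counts = Counter(tags_in_area)
    total = sum(counts.values())
    return {tag: -math.log2(n / total) for tag, n in counts.items()}


def locally_prominent(tags_in_area, min_bits):
    """Return, in sorted order, the tags whose surprisal is at least min_bits."""
    scores = tag_surprisal(tags_in_area)
    return sorted(tag for tag, bits in scores.items() if bits >= min_bits)


area_tags = ["house"] * 7 + ["obelisk"]
print(locally_prominent(area_tags, min_bits=2.0))
```

On a street of seven houses and one obelisk, the obelisk is the rare tag, so it is the one flagged as locally prominent.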

The method can include determining, based at least in part on context data associated with a location of a vantage point on a path, including a plurality of locations, the landmarks associated with the vantage point. For example, the computer can access context data, including the time of day, environmental conditions (e.g., rain, snow, fog, hail, and level of sunshine), seasons (e.g., summer, spring, autumn, and winter), and the location of objects (e.g., latitude, longitude, and altitude) between the vantage point and the landmarks.

Based on the context data, the computer can determine the landmarks that are visible (e.g., based on the time of day, pollution level, and amount of precipitation) and that are not partly or wholly obstructed by other objects between the vantage point and each of the landmarks.

The method can include generating navigational data, including indications associated with the landmarks. For example, the computer can generate visual instructions on a heads up display device in a vehicle, including a description of the appearance of a landmark (e.g., “a restaurant shaped like a castle”) and the location of the landmark (e.g., “fifty meters ahead on the right”).

The indications can include information associated with the context data, including whether a landmark is obstructed (e.g., “the landmark is behind the small tree on the corner”), whether the landmark is illuminated, and the position of the landmark relative to other objects (e.g., “the landmark by the river”).

The indications can include visual indications (e.g., a photograph of the landmarks displayed on a display device) associated with a relative location of the landmarks on the path with respect to the vantage point, and audible indications (e.g., an audible message from a loudspeaker based on the semantic tags associated with the landmarks) associated with the relative location of the landmarks on the path with respect to the vantage point.

Providing Navigational Instructions For Finding Locally Prominent Semantic Features

The method can include generating a navigational instruction query associated with the utility of at least one navigational instruction. The navigational instruction query can include visual indications (e.g., visual indications generated on a display device associated with the computer) or audible indications (e.g., audible indications generated via an audio output device associated with the computer).

For example, some time after generating at least one navigational instruction (e.g., five seconds after “Turn right at the large white obelisk”), the computer can generate the navigational instruction query via a loudspeaker device associated with the computer. The navigational instruction query can state, “Did you see the landmark?” or “Did you see the large white obelisk?”

Further, for example, the navigational instruction query can ask whether at least one navigational instruction was helpful in navigation (e.g., “Was the navigational instruction helpful?”).

The navigational instruction query can be generated based at least in part on a distance to at least one locally prominent landmark referenced in at least one navigational instruction. For example, the computer can be associated with sensors that can determine the distance to at least one landmark and generate the navigational instruction query when the vantage point is within a predetermined distance of at least one landmark.

By way of further example, the computer can use the distance to at least one landmark and the current velocity at the vantage point to determine the amount of time before the landmark is within the predetermined distance. The computer can then generate the navigational instruction query within a predetermined time of reaching the location of at least one landmark.
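The timing of the query follows from the same distance and velocity arithmetic. A minimal sketch, assuming a fixed trigger distance; the function name `seconds_until_within` and the choice to return `None` when the vantage point is not moving are illustrative assumptions.

```python
def seconds_until_within(distance_m, trigger_distance_m, velocity_m_per_s):
    """Seconds until the landmark is within the trigger distance.

    Returns 0.0 if the vantage point is already close enough, and None if
    it is not moving and will therefore never get there.
    """
    remaining_m = distance_m - trigger_distance_m
    if remaining_m <= 0:
        return 0.0
    if velocity_m_per_s <= 0:
        return None
    return remaining_m / velocity_m_per_s


# Landmark 250 m away, query triggered at 50 m, traveling at 10 m/s.
print(seconds_until_within(250.0, 50.0, 10.0))
```

A system could schedule the "Did you see the landmark?" query for that moment, or shortly after it.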

The method can include receiving responses to the navigational instruction query. The responses to the navigational instruction query can include signals or data received from devices, including a tactile input device (e.g., a keyboard, a touch screen, a pressure-sensitive surface, or a button), a microphone, and a camera.

For example, the computer can use a microphone that receives verbal responses from a passenger (e.g., a passenger in an automobile) who has been provided with the navigational instruction query “Did you see the landmark?” The response from the passenger can include, for example, a verbal response of “Yes” or “No” to denote whether the passenger saw the landmark.

The responses can be stored in a privacy-enhancing way (e.g., encrypted and anonymized) so that, for example, aggregations of the responses can be used to improve the efficacy of selecting landmarks and providing navigational instructions.

The method can include adjusting the entropic criteria based at least in part on the responses to the navigational instruction query. For example, when the responses do not affirm the utility of at least one navigational instruction, the entropic criteria can be changed. If the entropic criteria include a minimum height of ten meters, the minimum height can increase to fifteen meters when the responses indicate that a navigational instruction referencing a landmark eleven meters tall was not useful (e.g., the landmark did not assist a user in navigating to their intended destination).
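The minimum-height example (ten meters raised to fifteen after an eleven-meter landmark proved unhelpful) can be sketched as a simple threshold update. The five-meter step, the response format, and the function name `adjust_min_height` are assumptions chosen to reproduce that example; the patent does not specify how the new value is computed.

```python
def adjust_min_height(min_height_m, responses, step_m=5.0):
    """Raise the minimum height when a barely qualifying landmark was unhelpful.

    responses: iterable of (landmark_height_m, was_useful) pairs (assumed format).
    The threshold goes up by step_m only if an unhelpful landmark would be
    excluded by the raised threshold.
    """
    raised = min_height_m + step_m
    for height_m, was_useful in responses:
        if not was_useful and min_height_m <= height_m < raised:
            return raised
    return min_height_m


print(adjust_min_height(10.0, [(11.0, False)]))
```

An unhelpful thirty-meter landmark would leave the threshold unchanged in this sketch, because raising the minimum to fifteen meters would not exclude it anyway.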

By way of further example, the extent to which the responses indicate the effectiveness of a navigational instruction from a particular location (e.g., the location associated with the vantage point) can be used to adjust the entropic criteria, so that certain entropic criteria (e.g., the brightness of a landmark) are adjusted based on the user responses (e.g., increasing the minimum brightness at night when the landmark was not seen by a user who received a navigational instruction referencing the landmark).

The technology discussed in finding locally prominent semantic features refers to servers, databases, software applications, and other computer-based systems and actions taken and information sent to and from such systems. The inherent flexibility of computer-based systems allows for various possible configurations, combinations, and divisions of tasks and functionality between and among components.

For instance, processes discussed herein can use a single device or component or multiple devices or components working in combination. Databases and applications can get implemented on a single system or distributed across multiple systems. Distributed components can operate sequentially or in parallel.

13 thoughts on “Locally Prominent Semantic Features”

  1. Hello Mr. Slawski,

    I came across your article from 12 years ago, ‘Did MetaRAM Play a Role in Google’s Infrastructure Update to Caffeine’ while researching Netlist vs Google patent infringement case.

    This case is finally in its 8th inning and starting to gain more attention day by day thanks for social media.

    In case if you are still interested, there is a community for NLST investors on reddit.com/r/nlst sharing lots of legal, technical information.

    Have a great day.

  2. Thank you for that long content. I really enjoyed reading it. That is a very technical explanation why local SEO should use navigational directions

  3. Hi Niels,

    Thanks. I was fascinated at how local and Semantic was being combined in this patent. I have had clients ask about how a search engine might better understand locations, and this is definitely one.
