Is Google Measuring Our Reading Speed of Web Documents?

Imagine that a search engine might insert place markers into a web page, perhaps with something like the new Google Tag Manager. These markers could enable the search engine to calculate how long it might take someone to read that page. A newly granted patent from Google describes why it might insert such markers (without really telling us how it might insert them) to determine the reading speed of a page.

Patent drawing showing the placement of electronic markers on a page, with an eye viewing those, to calculate reading speed.

The process described in the patent tries to understand how different features of a page might make it take more or less time for a visitor to read. It would then use that understanding to predict how those features might influence the reading of other pages that don't have markers inserted into them. These features could include the language, layout, topic, and length of text of a document, all things that could affect traffic across the web or at specific websites.

As an example, the patent tells us that pages in different languages could require different amounts of space to say the same thing, and therefore would require different amounts of time for someone to read those pages.

The patent is:

Detection and utilization of document reading speed
Invented by Victor Bennett
Assigned to Google
US Patent 8,271,865
Granted September 18, 2012
Filed: September 19, 2005

Abstract

A system stores an electronic document that has markers inserted within the electronic document. The system visually renders the electronic document to a user and uses the inserted markers to determine a speed at which a reader reads the electronic document.

A determination of the speeds at which readers travel through the text in a document enables a search engine to predict user interactions with a specific website, and use that information to predict web traffic generally.

The patent defines documents pretty broadly to include e-mails, websites, business listings, files including files with embedded links to other files, news group postings, blogs, web advertisements, digital maps, and others.

The amount of time it takes someone to read a page could be calculated from the time it takes to scroll from one marker to the next, or from how long a mouse cursor hovers over a portion of the page.
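The marker-to-marker timing described above can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the event format, marker IDs, and word counts are all assumptions made for the example.

```python
# Hypothetical sketch: estimate reading speed from the times a reader
# passed each inserted marker. All names and values are illustrative.

def reading_speed_wpm(marker_events, words_between):
    """Estimate words-per-minute from marker passage times.

    marker_events: list of (marker_id, timestamp_seconds) in reading order.
    words_between: words_between[i] = word count between marker i and i+1.
    """
    speeds = []
    for i in range(len(marker_events) - 1):
        _, t0 = marker_events[i]
        _, t1 = marker_events[i + 1]
        elapsed = t1 - t0
        if elapsed > 0:
            # words read in this segment, scaled to a per-minute rate
            speeds.append(words_between[i] / (elapsed / 60.0))
    return sum(speeds) / len(speeds) if speeds else None

# A reader who covers 150 words in 45 seconds, then 200 words in 60 seconds:
events = [("m1", 0.0), ("m2", 45.0), ("m3", 105.0)]
print(reading_speed_wpm(events, [150, 200]))  # 200.0 words per minute
```

Averaging per-segment speeds like this would also let a system see where on a page readers slow down or speed up, not just an overall figure.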

Aggregated reading speeds from multiple viewers might be used to create reading statistics for pages, and predictions of reading speeds for other pages. I can’t help but be reminded of Google’s layout algorithm, and how it seems to know so much about the layouts of pages. I don’t know if reading speed predictions are part of that algorithm, but it’s interesting that it focuses upon one way of indicating how fast someone might travel from one marker on a page to others.

The patent tells us that for the language detection feature, the text in a document might be compared to a dictionary of words in different languages to identify the language of a document. Google has stated in the past that they ignore meta language elements in the HTML code on a page:

We don’t use any code-level language information such as lang attributes.
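The dictionary-comparison approach the patent mentions can be sketched in a few lines. The word lists here are tiny stand-ins for real dictionaries, and the scoring is deliberately simplistic; it only illustrates the idea of counting dictionary hits per language.

```python
# Minimal sketch of dictionary-based language identification, as the
# patent describes it. The word sets are illustrative assumptions.

ENGLISH = {"the", "and", "of", "reading", "speed", "page"}
SPANISH = {"el", "la", "y", "de", "velocidad", "lectura"}

def guess_language(text):
    """Return 'en' or 'es' by counting hits against each dictionary."""
    words = text.lower().split()
    en_hits = sum(w in ENGLISH for w in words)
    es_hits = sum(w in SPANISH for w in words)
    return "en" if en_hits >= es_hits else "es"

print(guess_language("the reading speed of the page"))  # en
print(guess_language("la velocidad de lectura"))        # es
```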

Why does Google want to know how quickly people are reading different pages?

Here are a few reasons cited in the patent. Determined reading speeds for pages might be used to:

1. Determine an expected time after loading of a document by a browser that a user will reach a specific portion of the document.

2. Differentiate users who speak the language they are reading from those that don’t.

3. Detect automated surfing systems from actual users that are reading a document, which can be helpful to detect and avoid “bots” and automated surfing on “pay-per-click” pages.
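The third use, separating bots from readers, comes down to comparing a visit's apparent reading speed against a human baseline. Here is a hedged sketch of that idea; the baseline of 250 words per minute and the 4x cutoff are illustrative assumptions, not values from the patent.

```python
# Hypothetical sketch: flag a visit as possibly automated when its
# apparent reading speed is implausibly far above a human baseline.

HUMAN_BASELINE_WPM = 250.0   # assumed average human reading speed
MAX_PLAUSIBLE_RATIO = 4.0    # assumed cutoff: 4x baseline looks automated

def looks_automated(words_on_page, seconds_on_page):
    if seconds_on_page <= 0:
        return True  # reached the bottom of the page instantly
    wpm = words_on_page / (seconds_on_page / 60.0)
    return wpm > HUMAN_BASELINE_WPM * MAX_PLAUSIBLE_RATIO

print(looks_automated(800, 3))    # scrolled 800 words in 3 seconds: True
print(looks_automated(800, 180))  # ~267 wpm, plausible reading: False
```

A real system would presumably use per-page baselines built from the aggregated marker data rather than a single global constant.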

Takeaways

A number of patents and papers from the search engines indicate that Google might be using user behavior signals in ranking, but none of them really tell us what kinds of baselines or information Google might use to verify how valid those signals are, or whether there are things about them that might skew those signals in odd ways.

Being able to measure and predict things such as how long it takes someone to read a page, or whether people even scroll down most of a page, can provide a better sense of how realistic the thresholds set to measure those user behavior signals might be. And being able to tell when an automated system is visiting pages is useful when fighting things like click fraud.

Is this something that Google is doing now, or might do in the future?

Good question.


51 thoughts on “Is Google Measuring Our Reading Speed of Web Documents?”

  1. Thanks Bill,

    Definitely nice as always, and yes, there are some user behavior tracking activities going on, like the “bonus links” that come from the Authorship markup of a document.

    When you click on a search result with authorship markup and you return to the search results after reading the article, Google will display three additional links.

    Here’s an example. I conducted a search for the term “inbound marketing” and clicked on the article that was written by Lauren Drell. I read the article, then a couple of minutes later I pressed the back button to return to the search results. This time there were three bonus links from other articles she has written.

    What do you think about this one?

  2. Bill,

    How is this different from the tracking that sites like Crazyegg do? They can already tell how much of the page was scrolled. I guess I am not understanding what makes this patent-worthy. Potentially, instead of having 1000 data points about scroll location they might have 3 (the full scrolling data would be hard to store across every site): top, middle, bottom, and calculate the speed between the rendering of these? Sounds like it could be most useful for click fraud and authorship. Based on the Authorship signals mentioned by Rajesh, they could be wanting to add additional metrics showing that the page was actually read, and not just that the user was interrupted while visiting a certain page.

    Great article as always.

    JR

  3. Hi JR,

    It does show that Google is actively measuring baselines for user behavior for a site when it comes to things like the language used on a page, the layout of the page, and other features. This isn’t analytics in that Google isn’t collecting this information so that changes can be made to those pages. It can be used for a collection of a seed set of features and statistics related to them to enable Google to predict user actions on other sites as well.

    If Google is going to use data about actual visits from real people to web pages, it needs to be able to set baselines about that kind of behavior. If someone sends 10,000 people to a page from mechanical turk to click on a link to another page, and the link is at the bottom of the page, and those visitors all scroll down to that link without reading anything, that’s a good sign that something odd is going on. Being able to better understand user data means better control over how it might be used.

  4. Well, tag manager is one of the most useful tools provided by Google. We’re running several campaigns for our company website and this tool will be definitely helpful to simplify the tags. Thanks a lot for the update.

  5. How will this work with people who read a page a little slower? Some of us are always searching Google but it takes a while to read a page to see if we’re really finding what we are looking for. I definitely understand the reasoning behind this patent, I’m just not sure how accurate it will be.

  6. Some people will read those pages a little faster, too. That’s why they will probably collect information from multiple people whenever possible to collect information about people reading those pages. Again, this isn’t a ranking signal, but rather a way for Google to collect a baseline for pages, and how different features on those page might influence reading speeds.

    It’s something we can use as guidance, but from my point of view, if a user is really interested in an article, the time spent will be higher. In my case, I read and reread several times if I really care about what I’m reading, so if it takes me five minutes to read something that really interests me, I could invest another five minutes to read it again. Either way, how long it takes to read a page is very useful for knowing whether we are catching our visitors’ attention, or whether they leave quickly.

    Thanks for sharing this fascinating article,

    Best wishes,

    Omar

  8. I do not think that fast page reading will necessarily hurt your page, I have some pages with very little info, that do very well.

    These average times are 00:15 up to 1:00 you know?

    The pages I am talking about are also at the top of Google for the keywords I want.

    I am not saying you are wrong, but the effect if there is one, is nominal.

  9. So, on the assumption that a ‘document’ could be a web page or a book… I wonder how Google’s method differs from Amazon’s:
    “New Time to Read feature uses your reading speed to let you know when you’ll finish your chapter”
    enough to get a patent?

  10. It’s not surprising Google is able to measure reading time on its own network of sites. Scraping of Google search results might be stopped using this method. I think Google already has good statistics on reading time from Google Reader users, and maybe from other products.

  11. Isn’t knowledge of reading time interesting for learning how much text certain people can handle? Text could then be made a bit smaller, so that people read more of the text, and as much of the article itself as possible?

  12. Ranking is not the reason why Google may be tracking reading times for pages, but rather to understand the user experience on those different pages.

    I didn’t say that your rankings would be influenced by the speed at which someone reads your pages.

    Maybe you should go over and read the patent for yourself. :)

  13. Interesting post. Google is suspected to currently use bounce rate, etc., so it stands to reason that they want to be able to create a baseline of “how long should the average person take to read this page,” and if people are falling well under it, then it could be a signal of low quality.

  14. This is supremely interesting, and I don’t doubt for a minute that capabilities like this are either in place or coming. So scary that we are all being tagged and followed as a species by ourselves.

  15. I do somehow believe that Google is measuring reading speed and using it for SERP. Also I believe they might use Analytics just for this, to closer measure the site reading speed time.

    If an article is interesting, sure, everyone will read it and spend some quality time on it. For an unattractive article, users tend to roll out in less than ten seconds, and personally I’d punish those kinds of articles/sites too.

  16. Hi Ribice

    This particular patent doesn’t really tell us or even infer that measuring the reading speed of visitors will be a part of a ranking signal. Maybe it will, and maybe it won’t, but I think we need to look elsewhere to get a more definitive statement from Google on the subject.

  17. Hi Jon,

    I’ve seen a lot of denials about Google using a bounce rate as a ranking signal, and I agree that it tends to be a pretty noisy signal that might not be as helpful for rankings as it might seem to be. It is interesting though that Google is working to get a better sense of how people might actually spend time on a page.

  18. User experience and engagement are the key here rather than the impact on rankings. I agree with the first comment from Rajesh, this does make me think automatically of the “bonus links”, authorship markup + time spent on page. This patent comes as one more way to measure and enhance user experience.

  19. Hi Eliseo,

    I agree that it’s not so much about rankings, but it does give Google an idea about user experiences on different pages. As the patent says, one of the potential uses is to be able to understand when a visitor might not be human or might not have come to the page to read it, but rather in both cases if they had come to the page only to click on an advertisement. Getting a sense of how long someone spends reading different parts of a page also gives Google a good idea of how different features might impact visits to pages that have similar features. It’s possible that this might be used in other ways by Google too, but I’m happy to see that Google is experimenting and collecting data about pages like this.

  20. It seems rather obvious that different languages will read in different times, that they might require more or less space on the page depending on the language. This comes down to DTP and web localization.
    It would be a great tool for multi-language firms.
    Thank you for the article!

  21. Reading speed is important to the reader. Thus, it should be important to Google. I get annoyed when a page loads slowly and when my document does so. I think we will find that as the internet gets older and wiser, there will be a greater emphasis on speed.

  22. I’ve always been under the impression that all existing content online is analyzed by Google, but I hadn’t really realized how important reading habits are until I saw information in Google Analytics hinting that Google does check reading time (bounce rates, page views, time per page, etc.). I’d be even more amazed if I found out they can check me out on Google Earth!

  23. I think your assumption about detecting click fraud is spot on. I’m sure everyone with Google Analytics does their own manual click fraud detection when looking at time spent on a page.

  24. I like the “detect automated surfing bots on pay per click” sites. That one could potentially level the playing field for the rest of us “white hat” people. However, it’s not going to be an easy task. One site differs considerably from the next. One of our sites has a telephone number and is a service site. People ring the number to “book our services,” and the conversion rate is about 30%. Another site is an information portal and gets far more pageviews, with time per page being much shorter, but it has much more information on each page.

    If, however, Google limited the demographic to sites that use advertising, they may see some significant results with the idea. But then what about sites that use pcm and host videos (a user could be on a page for 2 hours!)?

    If Google applies this patent eventually, they’re either more intelligent than Oppenheimer, or they’ve missed some key factors…

  25. Hi Bill,

    I put this new ‘innovation’ right up there with the patent or rumour that Google was penalising pages that just had text on them; that you now had to magically have images, tables, and a YouTube clip before you were considered Google-ranking worthy.

    Some people still think that just having good unique content is enough. I am a little more concerned with the buzz around the disavow story. Do you have a take on that? And how it will affect seo? Cheers Bruce

  26. Hi Bruce,

    Not sure what you’re talking about in terms of a “patent or rumor” about Google penalizing pages that just had text on them. Where did you hear that?

    I have no take on the new disavow feature from Google. It looks like nothing more than Google bowing to the pressure of people demanding the feature.

  27. Hi Lisa,

    Maybe that’s why Google is actually trying to measure the amount of time people spend on pages, because it varies from page to page.

  28. You always hear about the 200+ things that Google looks at in their algorithm. It seems like this could certainly already be one but it would seem to have a minimal effect in my experience. Of course, with the rise of infographics, which are often viewed far longer than normal text and have very little text based content, rank well. Hard to say if it is because they are so share-able/link-able or if this patent has something to do with that.

  29. Thanks for the post! Another factor in Google’s 200+ algorithm measurement tools to gauge whether or not the site gives a good experience for the user. If this tool decides your site doesn’t make the grade then I guess a penalty is on the cards.

  30. Hi Cody,

    Don’t think this is a ranking factor on its own, but rather a way for Google to learn more about the pages it indexes, and the user behavior associated with them. That might lead to better use of human behavior as a ranking signal in the future, as well as a way of possibly ignoring signals that are automated.

  31. Google has always wanted to provide better user interactivity in its search engine. Sometimes, I admit, I hate that Google adjusts so many things every now and then, but let’s accept it: we have to follow their rules. They have always wanted to measure how much a certain site helps visitors, so they see to it that people come to the site and read the content. As web owners, we should then provide interesting, quality information for our visitors to read. I hope the patent you mention here will help web owners and visitors alike.

  32. Hi Bill, ok as you say it’s not particularly to do with rankings, and maybe not penalization, but don’t you think Google should be concentrating on other things like giving the small or newbie sites a bit of slack?
    I come across many sites that have zero content/reading matter at all, just links to advertisers, with PR4/5. I honestly don’t know what’s going on here (maybe it’s something I’m missing, being a newbie).
    Perhaps Google should spend more time looking into paid backlinks and other black hat tricks and merit the honest guy on his/her interesting content alone, and not how long it takes to read?
    Thanks,,, Jimi.

  33. Hi Jimi,

    The patent filing doesn’t indicate in any way that this approach is directly related to ranking, or even to penalizing the site owners.

    Google publishes a lot of patent filings and has a lot of different things going on. Not all of their patents are going to focus upon rankings, and they don’t have to. Google is most likely working on the issues that you describe.

  34. I think they are already trying this – when I go to a website and click back button too quickly, they show “Block results from this website”.

    This is a way for them to learn from user behavior and keep search results very relevant.

  35. Hi Bill. Thanks for the post. Read the comment by Rajat Garg and this is something I have noticed as well. I knew that bounce rate and read time had become a conjoined factor. This seems to be a further encouragement for great content writing and another well earned slap for the spammers out there. Sorry to sound harsh but when researching, so many search results do not deliver as they promise. However, Google is getting increasingly fired up about content and it’s delivery. The “block results from this website” is very dangerous though as we are moving into subjective search results. What happens if the first landing page was under par but the rest of the website was highly useful? Just my thoughts, but what can we do? Thanks again. Sara x.

  36. Hi Bill,

    This is an interesting one, and potentially a step towards the age-old question of determining a ‘true bounce’ (i.e. the difference between someone who has found the information they wanted and left the site, and someone who has hopped off after being put off by poor design, speed, or content). If this patent made its way into an analytics feature (and I’m guessing that’s a big ‘if’ at the moment, considering they would need to figure out how to tag pages first), it would be a really useful tool to determine the true quality of your content.

    Here’s to hoping this could potentially be an Analytics tool rather than a secret ranking signal.

    David

  37. I definitely think Google is recording, analyzing, and trying to incorporate as many different things as they can. I think it’s pretty well assumed that they already measure the time it takes for a user who clicked on a link to return to the search results. E.g., if somebody searches for something and never searches for it again, there’s a pretty good chance it was a successful search.

    As far as on-page analytics, I think that’s a little murkier. With AdSense, Google Analytics, etc., they definitely have the capabilities to track page movements, too. However, I’ve heard Matt Cutts (who works on Google’s search quality team) say they won’t incorporate GA data, and this makes sense. It feels like it could be a privacy intrusion.

  38. Hi Rajat and Sarah,

    The display of a “do you want to block this site” prompt isn’t necessarily an indication that Google is measuring the reading speed of a page. Google also isn’t necessarily using bounce rate as a ranking signal, either.

  39. Hi David,

    This might be an analytics signal, but not necessarily one for your website; rather, one to tell the search engine about the sites it is sending people (and non-people) to. If someone (or a robot) does a search at Google to find pages on specific topics, scrolls down to an ad much more quickly than a person might, and clicks on the ad, that could easily be construed as click fraud. Seems like a reasonable thing for a search engine to want to track.

  40. Hi Chris,

    Good points. Google really doesn’t need to look at Google Analytics data to collect information about searches, and what people do when they leave a search engine after a search.

  41. Hi Bill,
    I truly agree with you that Google doesn’t really need to look at Google Analytics data to collect information about searches. This is a very strong point which you have made, and it is so very true. I realized this when I actually started learning about Google searches. Google is doing a great job by measuring readers’ speed, which will help them to optimize web pages.

  42. Hi Bill,

    You’re correct that they don’t need to look at Google Analytics. However, if they did use Google Analytics, their information would be ultra-accurate. Measuring returns to search results is pretty accurate, but Google Analytics can definitively say whether a site is engaging. However, it would be a huge breach of privacy by Google.

  43. As with all Google updates, I have no doubt that this will enhance user experience IN GENERAL. But at the same time this will inevitably hurt websites that produce content that does not lend itself to a typical pattern of reading time. I am afraid that this addition will take some time to get an accurate read (excuse the pun) on how long people are actually reading the page. They can track what the website is displaying, but they can’t track eye motion…yet.

  44. I have no doubt that Google is doing this now, or at least trying to develop the ability to. Eye-scanning technology already exists, as do heatmaps that show where most people are looking on a web page.

    I have heard that Google uses bounce rates as part of their search algorithm, so hopefully this would not cause our bounce rates to increase, thus making it harder to rank. While some sites have great, useful content, they may be able to say in 100 words what might take another site 700, thus not taking as long to read.

    More and more people are inserting WAY too much content on their sites, because supposedly Google likes this, and the thinking is: the more content I write, the longer the user stays on my website, the lower my bounce rate.

    Hopefully, if this is ever created and implemented, it won’t cause website owners to write too much unnecessary content.

  45. Hi Nathan,

    Google doesn’t use straight-up bounce rates to rank pages, and the expressed intent behind this patent isn’t to calculate bounce rates, but rather to do things like get a sense of how long it takes people to read a page, so that when automated spam programs come to a page to engage in click fraud, it can ignore those clicks upon advertisements. You may want to listen to someone else on topics like bounce rates and rankings.

  46. Very interesting points, I definitely think Google uses methods like this to help them bring back the most relevant results. Writing content that is engaging rather than for SEO purposes will have a more positive effect on rankings I am sure.
