Sura gave up on her debugging for the moment. ‘The word for all this is “mature programming environment.” Basically, when hardware performance has been pushed to its final limit, and programmers have had several centuries to code, you reach a point where there is far more significant code than can be rationalized. The best you can do is understand the overall layering, and know how to search for the oddball tool that may come in handy. Take the situation I have here.’ She waved at the dependency chart she had been working on. ‘We are low on working fluid for the coffins. Like a million other things, there was none for sale on dear old Canberra. Well, the obvious thing is to move the coffins near the aft hull, and cool by direct radiation. We don’t have the proper equipment to support this, so lately I’ve been doing my share of archeology. It seems that five hundred years ago, a similar thing happened after an in-system war at Torma. They hacked together a temperature maintenance package that is precisely what we need.’
Identifying Primary Versions of Duplicate Pages
We know that Google doesn’t penalize duplicate pages on the Web, but it may try to identify which version of a page it prefers over the other versions.
I came across this statement about duplicate pages from Dejan SEO earlier this week, wondered about it, and decided to investigate further:
If there are multiple instances of the same document on the web, the highest authority URL becomes the canonical version. The rest are considered duplicates.
The above quote is from the post Link inversion, the least known major ranking factor; it is Dejan’s claim, not mine. I wanted to see whether a patent said something similar. As I looked through Google patents that include the word “authority,” I found one that doesn’t say quite the same thing Dejan does, but it is interesting because it describes ways to distinguish between duplicate pages on different domains based upon priority rules, which could help determine which of the duplicates is the highest-authority URL for a document.
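The idea of collapsing duplicates and picking one canonical URL by priority rules can be sketched roughly as follows. The fingerprinting step and the particular tie-breakers (an authority score, then HTTPS, then the shorter URL) are my own illustrative assumptions, not rules taken from the patent:

```python
from hashlib import sha256

def content_fingerprint(text: str) -> str:
    """Fingerprint the page body so exact duplicates collapse to one key."""
    return sha256(" ".join(text.split()).lower().encode()).hexdigest()

def pick_canonical(duplicates):
    """Choose one canonical URL from a set of duplicate pages.

    Each page is a (url, authority_score) pair; the priority rules here
    (highest authority, then HTTPS, then shorter URL) are guesses for
    illustration, not the patent's actual rules.
    """
    return max(
        duplicates,
        key=lambda p: (p[1], p[0].startswith("https://"), -len(p[0])),
    )[0]

# A tiny corpus: two copies of the same article on different domains.
docs = [
    ("https://example.com/article", 0.9, "Some article text here."),
    ("http://mirror.example.org/article", 0.4, "Some  article text here."),
]

pages = {}
for url, authority, body in docs:
    pages.setdefault(content_fingerprint(body), []).append((url, authority))

for dupes in pages.values():
    canonical = pick_canonical(dupes)
```

Because the fingerprint normalizes whitespace and case, both copies land in the same group, and the higher-authority HTTPS URL wins the canonical slot.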
1. Domain Age and Rate of Linking
2. Use of Keywords
3. Related Phrases
4. Keywords in Main Headings, Lists, and Titles
5. Page Speed
6. Watch Times for a Page
7. Context Terms on a Page
8. Language Models Using Ngrams
9. Gibberish Content
10. Authoritative Results
11. How Well Database Answers Match Queries
12. Suspicious Activity to Increase Rankings
13. Popularity Scores for Events
14. The Amount of Weight from a Link Is Based upon the Probability That Someone Might Click on It
15. Biometric Parameters while Viewing Results
16. Site Quality Scores
17. Disambiguating People
18. Effectiveness and Affinity
19. Category Duration Visits
20. Repeat Clicks and Visit Durations
21. Environmental Information
22. Traffic Producing Links
23. Media Consumption History
24. Geographic Coordinates
25. Low Quality
26. Television Viewing
27. Quality Rankings
Google Introduces Combined Content Results
This new patent is about “combined content.” What does that mean, exactly? When Google patents discuss paid search, they refer to those paid results as “content” rather than as advertisements. This patent describes how Google might combine paid search results with organic results in certain instances.
The recent patent from Google (Combining Content with Search Results) tells us how Google might identify when organic search results are about specific entities, such as brands, and when paid results are about those same brands or about products from those brands.
When a set of search results contains a high-ranking organic result from a specific brand and a paid search result from that same brand, the process described in the patent might allow the two to be merged into a single combined content result.
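The merging step described above can be sketched as below. The `brand` field, the top-3 position threshold, and the result shapes are all assumptions for illustration; the patent describes the idea, not this exact structure:

```python
def combine_results(organic_results, paid_results):
    """Merge a high-ranking organic result with a paid result
    from the same brand into one combined content result.

    Results are plain dicts with a 'brand' key; the top-3 cutoff
    is an assumed threshold, not one stated in the patent.
    """
    paid_by_brand = {p["brand"]: p for p in paid_results}
    combined, used = [], set()
    for rank, org in enumerate(organic_results):
        paid = paid_by_brand.get(org["brand"])
        # Only merge when the organic result ranks highly (assumed top 3)
        # and that brand hasn't already produced a combined result.
        if paid and rank < 3 and org["brand"] not in used:
            combined.append({"brand": org["brand"],
                             "organic": org,
                             "paid": paid,
                             "combined": True})
            used.add(org["brand"])
        else:
            combined.append(org)
    return combined

serp = combine_results(
    [{"url": "https://brand.example/home", "brand": "AcmeCo"},
     {"url": "https://other.example", "brand": "OtherCo"}],
    [{"url": "https://brand.example/ad", "brand": "AcmeCo"}],
)
```

Here the AcmeCo organic result and AcmeCo ad collapse into one combined unit, while the OtherCo result passes through unchanged.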
PageRank Update by Google
The original PageRank patent, assigned to Stanford University, has expired; Google held an exclusive license to it. Google later filed its own PageRank patent, with a different algorithm behind it, and that patent has now been updated. It still covers PageRank, as the description of the patent tells us:
A popular search engine developed by Google Inc. of Mountain View, Calif. uses PageRank® as a page-quality metric for efficiently guiding the processes of web crawling, index selection, and web page ranking. Generally, the PageRank technique computes and assigns a PageRank score to each web page it encounters on the web, wherein the PageRank score serves as a measure of the relative quality of a given web page with respect to other web pages. PageRank generally ensures that important and high-quality web pages receive high PageRank scores, which enables a search engine to efficiently rank the search results based on their associated PageRank scores.
A continuation patent showing a PageRank update was granted today. The original version of this patent was filed in 2006 and reminded me a lot of Yahoo’s TrustRank, which the patent’s applicants cite as one of a large number of documents that this new version builds upon.
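The computation the patent’s description refers to can be sketched with the textbook power-iteration form of PageRank. The graph, damping factor, and iteration count below are illustrative; Google’s production algorithm is certainly more involved:

```python
def pagerank(links, damping=0.85, iterations=50):
    """Minimal PageRank by power iteration over a link graph.

    `links` maps each page to the list of pages it links to.
    This is the classic published formulation, not Google's
    current (undisclosed) production algorithm.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    ranks = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a base share from the random-jump term.
        new = {p: (1 - damping) / n for p in pages}
        for page, targets in links.items():
            if targets:
                share = damping * ranks[page] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Dangling page: spread its rank across every page.
                for t in pages:
                    new[t] += damping * ranks[page] / n
        ranks = new
    return ranks

# A three-page toy web: a links to b and c, b links to c, c links to a.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
```

In this toy graph, page c ends up with the most rank because it collects link weight from both a and b, while b, fed only half of a’s weight, ranks lowest.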
At one point in time, search engines such as Google learned about topics on the Web from sources such as Yahoo! and the Open Directory Project, which provided categories of sites, within directories that people could skim through to find something that they might be interested in.
Those listings of categories included hierarchical topics and subtopics, but they were managed by human beings, and both directories have since closed down.
In addition to learning about categories and topics from such places, search engines used to use such sources to do focused crawls of the web, to make sure that they were indexing as wide a range of topics as they could.
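A focused crawl of the kind mentioned above can be sketched as a breadth-first traversal that only expands pages passing a topical test. The `fetch` and `on_topic` callables, and the tiny in-memory “web” used to exercise them, are placeholders; real crawlers add politeness, deduplication, and relevance scoring far beyond this:

```python
from collections import deque

def focused_crawl(seeds, fetch, on_topic, limit=100):
    """Breadth-first crawl that only follows links out of on-topic pages.

    `fetch(url)` is assumed to return (text, outlinks); `on_topic(text)`
    can be any topical classifier, here just a keyword check.
    """
    seen, frontier, collected = set(seeds), deque(seeds), []
    while frontier and len(collected) < limit:
        url = frontier.popleft()
        text, outlinks = fetch(url)
        if not on_topic(text):
            continue  # prune off-topic branches of the web graph
        collected.append(url)
        for link in outlinks:
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return collected

# A four-page in-memory web: url -> (page text, outlinks).
web = {
    "seed": ("coffee brewing guide", ["a", "b"]),
    "a": ("coffee roasting tips", []),
    "b": ("celebrity gossip", ["c"]),
    "c": ("coffee history", []),
}
found = focused_crawl(["seed"], lambda u: web[u], lambda t: "coffee" in t)
```

Note that page c, though on topic, is never reached: the crawler refuses to follow links out of the off-topic page b, which is exactly the trade-off a focused crawl makes.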