Outside SEO Resources

Books, Articles, and Papers

Some papers, books, and articles that I found online and wanted to share.

As We May Think
by Vannevar Bush

In July of 1945, Vannevar Bush speculated about what scientists who had worked on the war effort should turn their hands to next to make the world a better place. His article urged scientists to focus on making knowledge more accessible to everyone, and described an idea that in many ways foreshadowed the emergence of the internet.

Citation Indexes for Science: A New Dimension in Documentation through Association of Ideas
by Eugene Garfield

Published in Science on July 15, 1955, this paper by Eugene Garfield proposes a citation index for scientific articles, in many ways like the legal Shepard’s Citations, which helps lawyers and legal scholars in US State and Federal Courts find publications and court cases that refer to other cases. Garfield’s work on citation analysis influenced how links are treated as citations in algorithms such as PageRank.

Improved Text Searching in Hypertext Systems (pdf)
by Lawrence Page

The first PageRank patent filing, submitted by Lawrence Page to the USPTO on January 10, 1997. This provisional filing, which was never actually assigned or published in the patents database, gives a plain language description of PageRank and the Backrub search engine, and compares Backrub with other search engines of the time.

Hypersearching the Web
by Soumen Chakrabarti, Byron Dom, S. Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, Andrew Tomkins, Jon M. Kleinberg, and David Gibson

IBM’s CLEVER Project explored how analyzing links between pages could be useful in indexing the Web, around the same time that Google was developing its PageRank approach. While the team never publicly released a search engine, many of the concepts they developed were used by Teoma/Ask Jeeves. This paper describes the concepts of “Authorities” and “Hubs” within a collection of pages for a query on a specific topic: authorities are pages that are linked to by many other pages, and hubs are pages that link out to many other pages.
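
The hub and authority idea can be sketched in a few lines of Python. This is only an illustration of the general mutual-reinforcement approach, not the CLEVER team's actual implementation. The function name hits, the iterations parameter, and the tiny link graph are all assumptions made up for this example.

```python
def hits(links, iterations=50):
    """Score pages as hubs and authorities (illustrative sketch only).

    links maps each page to the list of pages it links out to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # A page's authority score is the sum of the hub scores
        # of the pages that link to it.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for t in targets:
                auth[t] += hub[src]
        norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        auth = {p: v / norm for p, v in auth.items()}

        # A page's hub score is the sum of the authority scores
        # of the pages it links out to.
        hub = {p: sum(auth[t] for t in links.get(p, ())) for p in pages}
        norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        hub = {p: v / norm for p, v in hub.items()}

    return hub, auth


# Hypothetical graph: pages "a" and "b" both link out to "c" and "d".
example_links = {"a": ["c", "d"], "b": ["c", "d"], "c": [], "d": []}
hub_scores, auth_scores = hits(example_links)
```

In this made-up graph, "a" and "b" come out as the hubs and "c" and "d" as the authorities, because the first pair only link out and the second pair are only linked to.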

The Semantic Web
by Tim Berners-Lee, James Hendler, and Ora Lassila

An effort to help computers better understand content and data on the Web, and enable it to be shared widely and quickly. This is one of the first and one of the best-known papers about the Semantic Web.

Semantic Search
by R. Guha (IBM Research), Rob McCool (Knowledge Systems Lab), and Eric Miller (W3C/MIT)

A look at some of the early challenges of search on the Semantic Web, which collects information from the Web of Data rather than crawling the Web of pages. This includes a look at Documents vs Real World Objects, Human vs Machine Readable Information, and the relation between HTML and the Semantic Web.

Scientific Advertising

Claude Hopkins published this classic book on advertising in 1923, and it’s still very relevant for today’s world of online marketing and advertising.

Introduction to Information Retrieval
by Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze

A thoughtful look at how search works from a computer science perspective. Highly recommended for those who like to delve into the science behind search.

Search User Interfaces
by Marti Hearst

A very readable and very informative book that approaches how search engines work not from the algorithms behind the scenes, but rather from the interfaces that you see when you search. If you want to learn a lot quickly about how search engines work, this is a great place to start.

The Anatomy of a Large-Scale Hypertextual Web Search Engine (pdf)
by Sergey Brin and Lawrence Page

One of the very first white papers to provide a glimpse into how a commercial search engine works. The search engine in question is Google. Even though this paper was written more than a decade ago, it provides some great historical perspective on Google and search, and there are hints in it of things to come from the search engine.

The PageRank Citation Ranking: Bringing Order to the Web
by Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd

If you’ve heard of Google, chances are you’ve also heard about PageRank, which is a method that the search engine used to rank how important pages are on the Web, and which has been combined with other ranking signals to determine the order of pages you see when you search. It’s very likely that the PageRank of 1998, as described in this paper, has evolved over the last decade, but it’s worth reading about how it was intended to work in the early days.
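
For readers who want a feel for the idea, here is a minimal sketch of an iterative PageRank calculation. It uses a commonly taught normalized variant and should not be taken as the exact method in the paper or as what Google runs today. The function name pagerank, the damping value of 0.85, the handling of pages with no outgoing links, and the sample graph are all assumptions for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by repeated redistribution (illustrative sketch).

    links maps each page to the list of pages it links out to.
    """
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # A page passes its rank evenly to the pages it links to.
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Assumption: a page with no outgoing links spreads
                # its rank evenly across all pages.
                for q in pages:
                    new_rank[q] += damping * rank[p] / n
        rank = new_rank

    return rank


# Hypothetical graph: "a" and "b" link to "c", and "c" links back to "a".
example_graph = {"a": ["c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(example_graph)
```

In this small graph, "c" ends up with the highest score because two pages link to it, "a" comes next because it is linked to by the high-scoring "c", and "b" scores lowest because nothing links to it.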

Shaping the Web: Why the politics of search engines matters (pdf)
by Lucas D. Introna and Helen Nissenbaum

Published in 2000, this paper looks at the potential biases of search engines, arising not so much from technical issues as from political ones. Why might some types of sites be excluded from search results while others might be favored? A thoughtful criticism of popularity-based search algorithms and the purchasing of prominence in search results.

The Design of Browsing and Berrypicking Techniques for the Online Search Interface
by Marcia J. Bates

Published in 1989, this paper discusses a different kind of search interface from the one that usually gets discussed in Information Retrieval circles: one in which a single search is often part of an inquiry that spans multiple pages and multiple queries. A thoughtful paper that might have you thinking about designing web pages a little differently.

Research-Based Web Design and Usability Guidelines

This set of usability guidelines from the Department of Health and Human Services is helpful, creative, and smart. If you design web sites, and you haven’t seen them, you should take a look. You might get some ideas on how to make your sites more usable for visitors.

Introduction to Information Retrieval

Published in 2009, this online version of the book provides a great first look at the computer science behind how search engines work.

Design Resources

Helpful Government Sites

Search Engine Resources

Search Data Related Blogs


Books

Web Dragons: Inside the Myths of Search Engine Technology

by Ian H. Witten, Marco Gori, and Teresa Numerico

Search Engines: Information Retrieval in Practice

by Bruce Croft, Donald Metzler, and Trevor Strohman

Mining the Web, Discovering Knowledge from Hypertext Data

by Soumen Chakrabarti

Ambient Findability: What We Find Changes Who We Become

by Peter Morville

Algorithms of the Intelligent Web

by Haralambos Marmanis and Dmitry Babenko

Numerical Algorithms for Personalized Search in Self-organizing Information Networks

by Sep Kamvar

Search Patterns: Design for Discovery

by Peter Morville and Jeffery Callender

Information Retrieval: Implementing and Evaluating Search Engines

by Stefan Buttcher, Charles L.A. Clarke, and Gordon V. Cormack

Letting Go of the Words, Writing Web Content that Works

by Janice (Ginny) Redish

Design Accessible Web Sites: 36 Keys to Creating Content for All Audiences and Platforms

by Jeremy Sydik

Web Analytics 2.0: The Art of Online Accountability and Science of Customer Centricity

by Avinash Kaushik

Don’t Make Me Think

by Steve Krug

Getting Information about Search, SEO, and the Semantic Web Directly from the Search Engines