Special Section: Information Architecture
Optimizing Websites in the Post-Panda World
Webmasters' traditional approach to search engine optimization was long based on matching query terms with word tokens. The result was a machine-oriented reflection of presumed relevance among retrieved items. Yet humans interpret content differently, focusing more on meaning than match, prompting programmers to focus on semantics, applied ontologies and predictive machine learning. Google Panda upended tradition with a focus on internal content quality and user engagement as well as the external influences of social media interaction, comments and reviews. Information architects can support a better search user experience through page layout, a flatter architecture, simpler site navigation and strong semantic connections between content and external authority resources.
Journalist Andrew Rice recently commented in the New York Times on the obsolescence of search engine optimization (SEO) as we know it:
Increasingly the online audience for <sites> is coming in the side doors via links on blogs and social-networking websites like Facebook. Probably the most important tool for reaching large audiences, however, is Google. If you can climb onto the top of the site's search results, you're certain to be rewarded with a huge number of clicks. Most publications these days try to harness the Google algorithm through an arcane process known as search engine optimization. Some are more skilled at this than others. 
In the years since its PageRank algorithm was first victimized by tactics that Google deemed “outside of its Webmaster Guidelines,” Google has steadily marched away from a link-based model of relevance toward one based on content and context. In February 2011, Google launched the first of the Panda updates, which solidified content quality and user experience as the passports to visible placement in search results. Sadly, no one told the content strategists and user experience professionals how their work influences search results, or how the conception of content and the design of user experience influence the performance of search engines.
My intent in this article is to fill in the blanks.
How Search Works
Most search engines use foundational methodologies that go back 60 years to the earliest days of electronic document storage. Document text is broken down into word tokens. These tokens are stored in an index that contains the document properties (date created, author, title, associated metadata) and its location. Retrieval is achieved by pulling all matches between query terms and word tokens and determining display based on term density and location.
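The indexing and retrieval steps described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular engine's implementation: the function names, the sample documents and the choice to require every query term are assumptions made for the example. Token positions are stored in the index, but ranking here uses term density only.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and break it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each token to {doc_id: [positions where the token occurs]}."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for position, token in enumerate(tokenize(text)):
            index[token].setdefault(doc_id, []).append(position)
    return index


def search(index, documents, query):
    """Return ids of documents containing every query term, densest first."""
    terms = tokenize(query)
    if not terms:
        return []
    # Keep only documents that match all query terms (an assumption).
    matches = set.intersection(*(set(index.get(t, {})) for t in terms))
    scored = []
    for doc_id in matches:
        length = len(tokenize(documents[doc_id]))
        hits = sum(len(index[t][doc_id]) for t in terms)
        scored.append((hits / length, doc_id))  # term density
    # Highest density first; ties broken by document id.
    return [doc_id for _, doc_id in sorted(scored, key=lambda p: (-p[0], p[1]))]


# Hypothetical sample documents.
docs = {
    "a": "Panda update and content quality",
    "b": "Content farms and content quality and links",
    "c": "Link building",
}
index = build_index(docs)
results = search(index, docs, "content quality")
```

With these sample documents, "b" ranks above "a" because the query terms make up a larger share of its tokens, and "c" is not retrieved at all. Nothing in the sketch asks whether either page is actually useful to a person.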
The problem with machine relevance based on perceived user experience is that humans and machines think differently about what is relevant. Machines are constrained by the application of a prescribed set of rules and conditions that determine outcome. “Good” means that the criteria have been met. People, however, are contextual. They are influenced by environment and emotion when constructing their queries. “Good” means “pleasing” or it “feels” right.
How Search is Changing
To meet the challenges of mercurial input, search programmers look to semantic and predictive machine learning. Processing power has enabled search technology to determine contextual sameness between linked pages that is then used to validate the link quality.
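As a rough illustration of “contextual sameness,” the sketch below compares the vocabulary of a linking page and a linked page using cosine similarity over word counts. This is a stand-in technique chosen for clarity; the article does not say which measure search engines use, and the function names are assumptions.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Count word tokens in lowercased text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Return a score from 0.0 (no shared terms) to about 1.0 (same term mix)."""
    a, b = term_counts(text_a), term_counts(text_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0
```

Under this simplified view, a link between two pages that share most of their vocabulary would score close to 1.0, while a link between unrelated pages would score near 0.0 and might be treated as lower quality.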
Increasingly, search engines have incorporated an applied ontology to understand how things in the world are divided into categories and how these categories are related. By better understanding real world context, search engines are now able to designate authority resources for certain subject areas and then use links from authority pages to determine relevance.
In 2009, Google extended this contextual relationship modeling by purchasing the Orion algorithm from an Australian university. The algorithm uses predictive modeling to include more distantly related concepts into search results.
The Panda update was originally referred to as the “Farmer Update” because it was erroneously assumed that Panda was going after content farms. It soon became clear that its scope went beyond where the content was located to the quality of the content itself. Quality is determined by professional care (spelling and grammar) and by the depth of user engagement with the content, as determined by the following:
Bounce Rate: Does the user engage with the content, or do they bounce back to the search results or move on to another site? Some pages should have a high bounce rate, e.g. a Contact Us page with location and contact information. Navigation pages, long text pages and, most importantly, the homepage should draw the user in and through to other pages on the site.
Click through: Does the user select the result from those presented? If a result on page 3 is selected more often than a result on page 1, the algorithm's judgment will be overridden and the results will switch places.
Conversion: Does the selected result satisfy the user's information need, or do they select another result from the original set or revise their query?
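The three signals above can be made concrete with a small calculation over a log of search impressions. The sketch below is illustrative only: the `Impression` record, its fields and the exact definitions (a bounce is a visit of one page or fewer; a conversion is a visit that does not return to the results) are assumptions for the example, not Google's definitions.

```python
from dataclasses import dataclass


@dataclass
class Impression:
    """One appearance of a URL in a results page (hypothetical record)."""
    url: str
    clicked: bool
    pages_viewed: int = 0             # pages viewed on the site after the click
    returned_to_results: bool = False  # user went back to pick another result


def click_through_rate(impressions, url):
    """Share of impressions of this URL that were clicked."""
    shown = [i for i in impressions if i.url == url]
    if not shown:
        return 0.0
    return sum(1 for i in shown if i.clicked) / len(shown)


def bounce_rate(impressions, url):
    """Share of visits that viewed one page or fewer before leaving."""
    visits = [i for i in impressions if i.url == url and i.clicked]
    if not visits:
        return 0.0
    return sum(1 for v in visits if v.pages_viewed <= 1) / len(visits)


def conversion_rate(impressions, url):
    """Share of visits that did not return to the original result set."""
    visits = [i for i in impressions if i.url == url and i.clicked]
    if not visits:
        return 0.0
    return sum(1 for v in visits if not v.returned_to_results) / len(visits)


# Hypothetical log: four impressions of the same page.
log = [
    Impression("/guide", clicked=True, pages_viewed=3),
    Impression("/guide", clicked=True, pages_viewed=1, returned_to_results=True),
    Impression("/guide", clicked=False),
    Impression("/guide", clicked=True, pages_viewed=1),
]
```

For this log, the page was clicked in three of four impressions, two of the three visits were bounces and two of the three visits did not return to the results.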
It is not as though search engines see us thought-processing bipeds as mere recipients of their beneficence. There are human-generated social factors that Google uses to determine who gets in what position in search results. The following are among these:
Social media interaction: On-page opportunities to “like,” tweet or recommend the content by interacting with a social media button, logging in to the channel and completing the action.
Article Comments & Reviews: Information consumption is seen as such a passive activity that the simple action of responding is seen as an indicator of significance. Discourse through commenting is a strong signal of engagement.
Why Do We Care?
Computer scientists, developers and mathematicians have finally come around to defining searching as a user experience. They use algorithms and programming to determine whether information is relevant based on its design. We care because we are the ones who create the information structures, page layouts and designs that inform the perception of relevance. The work we do as information architects and user-experience professionals influences how information is retrieved and presented to our users. Our choices when developing information structure help search engines determine relevance and, ultimately, influence whether our users even get to experience our work.
Search User Experience (SUX)
It is not too late for the IA and UX communities to seize the semantic high ground in determining what is relevant to an information need. Here's how:
1. Develop flat information architectures. Search engines see distance from the homepage as an indicator of how important a page is in the greater scheme of things. Get rid of as many folders within folders within folders as you can. Flat is back.
2. Design page layouts that have semantically distinct regions with the most important information in the center section of the page. Google eye-tracking studies supported George Furnas' concept of the Fisheye View or Mark Hurst's concept of the Page Paradigm.
3. See content as part of a strategy before it becomes an artifact. Consider these strategic actions:
   a. Use online resources like Google Insights for Search and Yahoo Clues to find out how users look for your product or service. Bake these references into your content. Develop semantic relationships between content items on your site as well as with authority resources off the site.
   b. Adopt the newspaper model of an opening paragraph summary followed by deep subject-specific content. Ditch the newspaper maxim of “the fold.” Even Jakob Nielsen believes that the fold is dead in the virtual world.
   c. Describe your content in a machine-readable fashion with a <title> of 72 characters (including spaces) and a meta description of 150 characters (including spaces).
   d. Help users to remain oriented on the site. Give them recognizable navigation and an indication of where they are in the site structure. Include inline text links and make them clearly visible to users scanning the page. Encourage users to explore with related link components.
   e. Think beyond the site. Build in program links to contextually relevant information outside of the domain. Be confident that users will return to the site, because they will. They always do, using the back button if necessary. Visibility is about relevance, and relevance is based on relationships.
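Two of the recommendations above lend themselves to a quick automated check: how deep a page sits in the folder structure, and whether its title and meta description fit the stated lengths. The sketch below is one way to do that. It assumes that the 72 and 150 character figures are upper limits and that “distance from the homepage” can be approximated by counting URL path segments; the function names and example URLs are hypothetical.

```python
from urllib.parse import urlsplit

TITLE_LIMIT = 72         # characters, including spaces (figure from the article)
DESCRIPTION_LIMIT = 150  # characters, including spaces (figure from the article)


def folder_depth(url):
    """Count the path segments between the homepage and the page."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return len(segments)


def metadata_problems(title, description):
    """Return a list of messages for metadata that exceeds the limits."""
    problems = []
    if len(title) > TITLE_LIMIT:
        problems.append(f"title is {len(title)} characters (limit {TITLE_LIMIT})")
    if len(description) > DESCRIPTION_LIMIT:
        problems.append(
            f"description is {len(description)} characters "
            f"(limit {DESCRIPTION_LIMIT})"
        )
    return problems


# Hypothetical URLs: a deeply nested page and a flatter alternative.
deep = folder_depth("https://example.com/products/widgets/blue/specs.html")
flat = folder_depth("https://example.com/blue-widget-specs.html")
```

Here the nested page sits four segments below the homepage and the flatter alternative sits one segment below it. A site audit could run both checks over every page and flag the deepest pages and the overlong metadata for review.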
Search engines are not smart. They are programs that follow instructions in an exacting way. They know the user through the query only. We are smart. When we take time to consider how the technology “understands” content, we become more adept at using technology to get information to those who need it. Remember, searching is an experience, not just an activity.