Feature Article (Pub: June 03) Defining a Search strategy by Paul Rudman and Andy Caddy
When people set out to design a web site, be it a knowledge-based intranet or a commerce-based internet site, search strategy isn't always at the top of the priority list. This is understandable: the bigger goal is in sight and the detail of the user experience hasn't yet been explored. Experience dictates, however, that search strategy needs to be established right at the forefront of the project. The Nielsen Norman Group, stalwarts of usability, devote two of their top 10 home page design guidelines to search, and note that as late as 2002 only 50% of web sites passed their usability guidelines for search.
What will search do for you
Before you even start to think about evaluating technology and designing search results pages, the first step is to really understand what function your search tool is performing. It sounds obvious, doesn't it? Well, just like the web site itself, there is a myriad of requirements that search can satisfy. It could be a product locator, a browse facility, an alternative navigation method or a tool to serve up common queries. It is important that you decide now, before you commit your hard-earned budget, as changing tack later is going to be far more difficult.
So let's break down the requirements into the fundamentals of what a search tool is there to do:
Input of the query
How is the user going to key in their search term? Will it be keyword driven or via "Ask Jeeves" style natural language? Perhaps drop-down menus will store fixed queries or previously saved searches. What languages will users be inputting, and will there be a choice?
As the input is processed by the engine, will additional synonyms be added? Will an extensive thesaurus be used for like terms and is fuzzy searching preferable to exact matching?
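The difference between exact and fuzzy matching can be illustrated with a minimal sketch. It uses Python's standard difflib module; the vocabulary, the function names and the 0.8 similarity cutoff are assumptions made for illustration, and a commercial engine will use its own matching algorithm.

```python
from difflib import get_close_matches

# Hypothetical vocabulary of indexed terms (illustrative only).
INDEXED_TERMS = ["taxonomy", "thesaurus", "intranet", "metadata"]


def exact_match(query, terms=INDEXED_TERMS):
    """Return the indexed terms that equal the query, ignoring case."""
    q = query.strip().lower()
    return [t for t in terms if t == q]


def fuzzy_match(query, terms=INDEXED_TERMS, cutoff=0.8):
    """Return indexed terms similar to the query, tolerating small typos."""
    return get_close_matches(query.strip().lower(), terms, n=3, cutoff=cutoff)
```

With this sketch a mistyped query such as "taxonmy" finds nothing under exact matching but still reaches "taxonomy" under fuzzy matching. The price is a greater chance of irrelevant matches as the cutoff is lowered.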
How important is the search response time? Is there a trade-off of accuracy against the time taken to execute? Will the search be required to span multiple indexes or perform federated searches from other internal or external data sources?
How much content are you planning to index? Will it just be plain spidering or is something more 'low level' required such as database connectors to pull out specific metadata? How important is the time taken to index the content and how often will the search index be refreshed and updated?
Will vanilla results suffice or is more user control required? Are there specific sort orders needed or even personalised views of the results? Do the results need to be filtered for security or relevancy using business rules?
How important is the presentation of the results? How configurable do you want this to be? Do you want different views for different users? Are features such as viewing the results in context or similar terms important?
From a detailed analysis of the purpose of your site and the subsequent function expected of your search tool it should be possible to get to an accurate list of business requirements. This could be a simple tick list or a more detailed specification of requirements. Whichever you choose, it will be the yardstick against which contenders are measured.
On top of this you will need to factor in three other major business considerations:
- The limitations of your technical infrastructure; are you constrained in any way by operating system, hardware, interoperability?
- Extensibility; will you be satisfied with just search, or will you want classification, metrics, personalisation at some later point in time?
- Budget; search solutions range from free to multi-million and many will have to be dismissed purely from a budgetary standpoint.
Getting good results
Deciding on the right search tool for your site or sites is only the first step. Once in place you will need to determine how to get the best results from the content that you have.
The most important question that needs addressing when focusing on the issue of search is "How good is my content?". Without a quality assurance process there are potentially thousands of documents containing valuable information that the user will never be able to find using search because the content was not tagged correctly.
Taxonomy is the discipline of creating order from chaos, of creating a logical system by which all information can be found in the shortest possible time. Defining a taxonomy for your business provides a classification hierarchy for your information that can be as simple or as complex as you need. Though it is directly related to search, taxonomy will be a required skill of your CMS authors who will need to know and understand the principles of classifying documents on your site.
Search tools create an index and rank pages according to complex algorithms that determine which results are most relevant to the user's submitted query. The index is created by analysing the text of the page or document being scanned along with the contents of its Meta data: the W3C-compliant tags such as title, description, keywords, date and author, along with any bespoke data that has been defined.
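To make the Meta data side of indexing concrete, here is a minimal sketch of how the tags might be pulled out of a page. It uses Python's standard html.parser module; the class name, the function name and the sample page are assumptions for illustration only, and real search tools do this inside their own spiders or connectors.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect the <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name"):
            # Tag names are stored in lower case so lookups are consistent.
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.meta["title"] = self.meta.get("title", "") + data.strip()


def extract_meta(html):
    """Return a dict of the Meta data found in an HTML page."""
    parser = MetaTagParser()
    parser.feed(html)
    parser.close()
    return parser.meta


# Hypothetical page used only to exercise the sketch.
SAMPLE_PAGE = """<html><head>
<title>Expenses policy</title>
<meta name="author" content="J. Smith">
<meta name="keywords" content="expenses, travel, policy">
<meta name="date" content="2003-06-01">
</head><body><p>Body text.</p></body></html>"""
```

Calling extract_meta(SAMPLE_PAGE) gives a dictionary with the title, author, keywords and date, which is the kind of record an index can then rank and filter on.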
What is quality Meta data?
Mapping how your taxonomy is going to be used is essential during the planning phase of your Intranet; Meta data can be fitted to pages retrospectively but not without substantial human resource. By training authors on taxonomy basics and then making key Meta tags obligatory for authored content, the relevancy of the search results will be increased.
Meta information such as keywords will allow your users to find documents specific to their query, whilst the date tag will allow the search tool to order documents by specific time periods, or by those most recently authored. What if a user wants to find articles by a specific author? By specifying in your corporate taxonomy that no content can be published through your CMS without a completed author tag, employees are given another criterion by which to search.
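The rule that nothing is published without its key tags can be sketched as a simple check. The list of required tags and the function names below are hypothetical; in practice the rule would be enforced through whatever validation your CMS provides.

```python
# Hypothetical list of tags a corporate taxonomy might make obligatory.
REQUIRED_TAGS = ("title", "author", "date", "keywords")


def missing_tags(meta, required=REQUIRED_TAGS):
    """Return the required tags that are absent or blank in the Meta data."""
    return [tag for tag in required if not (meta.get(tag) or "").strip()]


def can_publish(meta):
    """A document may be published only when no required tag is missing."""
    return not missing_tags(meta)
```

A document whose author tag is blank and whose keywords are absent would be reported as missing both, and the author could be prompted to complete them before the page goes live.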
As advanced as today's search tool algorithms are, they still cannot replace the knowledge that an author has about their own published document and so making sure all documents are tagged at the point of creation can only improve the search performance.
Establishing a taxonomy and a process for ensuring it is in place is the cornerstone of turning your unstructured data into a logical asset that employees can search.
The fundamental purpose of using a content management tool is to devolve content creation and reduce administration. Authors are invested with the trust to provide quality information for the web site and the quality of this content must include its taxonomy. Consequently, training authors or content providers should always include a briefing on the taxonomy guidelines. Some companies can afford dedicated taxonomy teams who can pore over the data and classify as required, but for most of us our authors have to be mini-taxonomists, classifying as they create. Bear in mind that although your budget may not stretch to dedicated permanent resource, initially using a taxonomist to provide the training and start-up support is highly recommended.
One way to reinforce the meta-tagging strategy is to ask authors to search for their own content after it has been indexed. This may give content creators insight into how their tagging has affected the ranking of their content in search results. This can lead to a more relevant search tool along with a better-educated author community that in turn requires less quality assurance policing and support.
Customise the Search functionality
Whatever search vendor and tool you select, the product will have functionality beyond providing search results that should be investigated and, where possible, used to extend the search experience.
Common functionality found with enterprise search tools includes:
- "Recommended Link" - as with commercial search sites such as AOL it is possible to place strategic links above the returned search results for a specific keyword. This can be useful for providing links to other enterprise applications that users are looking for and for pointing employees at important news articles for example. It can also provide customers with quicklinks to 'hot' products or features.
- Thesaurus - Search tools have a thesaurus file that allows the definition of synonyms and acronyms. For example users searching for "content management system" will also want to look for "cms". In most large businesses there will probably be colloquialisms and abbreviations for products and applications specific to the company and these can all be added to the thesaurus. Any search strategy needs to take account of these and customise the thesaurus accordingly, as some users may search by abbreviations, where others may not.
- Search tips - Poor search technique is often responsible for users not finding the information they want and need. Rolling search tips within the search section of a site can educate employees to improve their searching methods. Tips that relate directly to the business are more likely to be acted upon, reducing the support requirement while aiding employees' information retrieval.
- Bespoke Meta tags - Enterprise search tools can be 'taught' the meaning of bespoke Meta data. This can provide different ways of isolating groups of data and therefore search results, narrowing the search field if the user knows where they need to find information, or something specific about it.
- Dissect the site index - Intranets have a number of divergences in the nature of content, such as by geography, business unit or language. Internet sites may be divided by product section or subscription type. Users may want to target a specific area for documents, so the search tool should reflect this requirement, and search 'slices' should be created during the planning and implementation phase of search.
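Two of the features above, the thesaurus and recommended links, can be sketched in a few lines. Everything named here (the synonym groups, the link path and both functions) is hypothetical; each vendor has its own thesaurus file format and its own way of configuring recommended links.

```python
# Hypothetical thesaurus: each group lists terms treated as equivalent.
SYNONYM_GROUPS = [
    {"cms", "content management system"},
    {"hr", "human resources"},
]

# Hypothetical recommended links keyed by search term.
RECOMMENDED_LINKS = {
    "cms": "/applications/cms/login",
}


def expand_query(query, groups=SYNONYM_GROUPS):
    """Return the query plus every synonym from the group it belongs to."""
    term = query.strip().lower()
    for group in groups:
        if term in group:
            return sorted(group)
    return [term]


def recommended_link(query, links=RECOMMENDED_LINKS):
    """Return a recommended link for the query or any of its synonyms."""
    for term in expand_query(query):
        if term in links:
            return links[term]
    return None
```

Here a search for "content management system" is expanded to include "cms", and so picks up the recommended link registered against the abbreviation. Users who search by the full name and users who search by the acronym end up in the same place.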
Search Metrics and Usage Analysis
A search tool is not a facility that is installed and left to run. Rather, it is a function that needs continual tuning and development to get the most from it. Some aspects of search metrics can provide insight into the structure of your site and some will give you a flavour of what your users are thinking as they type in their expressions.
Measuring who is using your search tool, what they are looking for and what the outcome of their search is provides the most meaningful way of continually improving the search process. Enterprise search packages will almost always produce raw log files of the activity by the user and within the engine itself. Understanding how this information can benefit your site, and which metrics each tool will provide, can be a defining factor in which tool you select.
It is recommended that you plan for statistics gathering and archiving both in terms of hardware requirements and data interpretation at the start of your project. Most vendors can assist in this planning task in advising what you will need to capture, how much space it will consume and what tools can be used to analyse it.
Useful measurements can include:
- What search terms are most popular? Identifying key search terms lets you prioritise the information retrieval tuning process and create more relevant search tips and recommended links.
- What searches bring back zero results? Too many and your Meta tags are not serving your user base, or perhaps there are search terms that need to go in the thesaurus.
- What searches bring back too many results (1000+ for example)? Though this could be indicative of poor search expressions it could be that key phrases are not being ranked highly enough by your engine. If the problem is poor search terms then perhaps the training and tips for the engine need to be more prominent.
- What links are users visiting from search result pages? Examining the users' interaction with the results will help you tune them. For example, if the document ranked number 35 for a search term is the most visited while the number 1 is unpopular, the Meta data for those documents needs reviewing.
- What percentage of the workforce is using the search tool? Too few and the search box is probably not very accessible. A large percentage is always worth reporting back as evidence of good ROI.
- What percentage of the workforce is using your advanced functionality and which parts of it? In most cases advanced search is used by a fraction of the users and therefore will not merit extensive interface work. However, if usage grows then perhaps it's time to revisit it.
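Several of these measurements can be derived from the raw logs with very little code. The sketch below assumes a simplified log in which each record holds a user id, the search term and the number of results returned; real log formats vary by vendor, and the sample records and function name are invented for illustration.

```python
from collections import Counter

# Hypothetical log records: (user id, search term, result count).
SAMPLE_LOG = [
    ("u1", "expenses", 42),
    ("u2", "Expenses", 42),
    ("u1", "holiday form", 0),
    ("u3", "policy", 1500),
    ("u3", "holiday form", 0),
]


def summarise(log, workforce_size, too_many=1000):
    """Compute a few of the measurements listed above from log records."""
    # Terms are lower-cased so "Expenses" and "expenses" count together.
    terms = Counter(term.lower() for _, term, _ in log)
    zero = sorted({term.lower() for _, term, count in log if count == 0})
    broad = sorted({term.lower() for _, term, count in log if count >= too_many})
    users = {user for user, _, _ in log}
    return {
        "popular_terms": terms.most_common(3),
        "zero_result_terms": zero,
        "too_many_results_terms": broad,
        "percent_of_workforce": 100 * len(users) / workforce_size,
    }
```

Run against the sample log with an assumed workforce of ten, the summary flags "holiday form" as a zero-result term (a candidate for the thesaurus or for better tagging) and "policy" as a term returning too many results.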
This information will help you define your on-going search strategy and improve the search process by highlighting how search is being used, how educated your employees are, and who is using it.
Search will be an important part of your site and deserves a cohesive strategy that defines how it will be implemented, managed and governed. The response and performance of your search is so intertwined with the content and the Meta tags that CMS and Search strategy need to be closely matched and the links between them well understood. Implementing search is not a simple case of picking a product and installing it on a server somewhere, but rather an ongoing process of tuning and evaluation. In this manner search can become one of the strongest assets of your site.
Paul Rudman is the director and head of optimisation at CommerceTuned. He has been involved in developing search strategies and search engine optimisation for 7 years.
Andy Caddy is a published author, frequently presents on web technology and is the founding director of gregor morrison consultants.