Most people are relatively poor at searching. They use terms that are too broad or too narrow.
They overconstrain the search. They don’t consider synonyms. They don’t know how to filter out documents that are irrelevant. As a result, we need to help them as much as we can.
Traditional information retrieval makes use of Boolean queries. Boolean searches rely on logical expressions of what constitutes a good result, where search terms are combined with AND, OR, and NOT. Boolean searches can sometimes be an effective query format for trained experts, but most people are highly unsuccessful with Boolean searches.
One key problem is that natural language usage doesn’t match logical expressions exactly. In English, when I say I want information about “cats and dogs,” I usually mean that I want information about “cats,” “dogs,” or “cats and dogs.”
However, in a Boolean query for “cats and dogs,” I wouldn’t find any information about “cats” alone.
This is an instance of a general problem: people generally assume that if they provide more information, they’ll get more and better results; however, most search engines are designed such that the more information that is provided, the further constrained the search and the fewer the results.
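The mismatch can be made concrete with a small sketch. The four documents and the helper names (`boolean_and`, `boolean_or`) are hypothetical, chosen only for illustration. Under AND semantics, each added term can only shrink the result set, while the everyday meaning of “cats and dogs” is closer to OR.

```python
# Hypothetical mini-collection, used only to illustrate the point.
DOCS = {
    1: "cats make quiet pets",
    2: "dogs need daily walks",
    3: "cats and dogs can share a home",
    4: "parrots can learn to talk",
}

def words(text):
    """Split a document into a set of lowercase words."""
    return set(text.lower().split())

def boolean_and(terms, docs):
    """Return ids of documents that contain every term."""
    return [doc_id for doc_id, text in docs.items()
            if all(term in words(text) for term in terms)]

def boolean_or(terms, docs):
    """Return ids of documents that contain at least one term."""
    return [doc_id for doc_id, text in docs.items()
            if any(term in words(text) for term in terms)]

print(boolean_and(["cats"], DOCS))          # [1, 3]
print(boolean_and(["cats", "dogs"], DOCS))  # [3]
print(boolean_or(["cats", "dogs"], DOCS))   # [1, 2, 3]
```

Adding “dogs” to the AND query drops document 1, even though a person asking about “cats and dogs” would probably want to see it.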
In practice, if someone starts describing a man they’ve met as “tall and dark-haired, with a scar on his cheek,” they usually assume you’ll also think of the not-so-tall person who is dark-haired with a scar.
They also assume that if they give you more information (e.g., that the person has a talking parrot and smells like sea salt), you’ll be more likely to correctly identify the person, even if they got one of the details wrong. However, traditional database searches are usually written such that a single incorrect search term will mean that you get no match.
Most traditional database searches suffer from being too rigid in their results. They include all matches and only exact matches. People are much more successful if they can also see near-matches, ordered by the closeness of the match.
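As a rough sketch of the difference, the following compares an exact match with a ranked near-match, using the description example from earlier. The records, trait labels, and function names are all made up for illustration, and counting shared traits is just one simple way to measure closeness.

```python
# Hypothetical records: each person is described by a set of traits.
PEOPLE = {
    "Ana": {"tall", "dark-haired"},
    "Ben": {"medium-height", "dark-haired", "scar", "parrot"},
    "Cy": {"tall", "blond"},
}

def exact_match(query, records):
    """Rigid search: keep only records that have every queried trait."""
    return [name for name, traits in records.items() if query <= traits]

def ranked_match(query, records):
    """Forgiving search: order records by how many queried traits they share."""
    scored = []
    for name, traits in records.items():
        shared = len(query & traits)
        if shared:  # skip records with nothing in common
            scored.append((shared, name))
    # Most shared traits first; ties broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]

# The searcher remembers the man as tall, which is the one wrong detail.
query = {"tall", "dark-haired", "scar", "parrot"}
print(exact_match(query, PEOPLE))   # []
print(ranked_match(query, PEOPLE))  # ['Ben', 'Ana', 'Cy']
```

The single wrong detail empties the exact-match result, while the ranked version still puts the most likely person first.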
Another example of this problem is with parametric searches. This type of search provides a form with multiple options and is typical for database searches with multiple fields.
While this type of advanced search enables a more powerful search capability for skilled searchers, many people will either assume they must fill out all form options (and thus overconstrain the search) or that filling out an additional field will provide more opportunity for matches. Because of these assumptions, an implementation that ranks the degree of match will be more effective.
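One possible way to rank a parametric search is sketched below. The listings, field names, and `rank_parametric` function are hypothetical. The sketch assumes blank fields should be ignored and that each filled-in field counts equally toward the score, which a real system might well weight differently.

```python
# Hypothetical listings with multiple searchable fields.
LISTINGS = [
    {"title": "Listing A", "make": "honda", "color": "red", "year": "2015"},
    {"title": "Listing B", "make": "honda", "color": "blue", "year": "2015"},
    {"title": "Listing C", "make": "ford", "color": "red", "year": "2010"},
]

def rank_parametric(form, records):
    """Rank records by the fraction of filled-in form fields they match."""
    # Ignore fields the user left blank instead of treating them as constraints.
    filled = {field: value.strip().lower()
              for field, value in form.items() if value and value.strip()}
    if not filled:
        return []
    results = []
    for record in records:
        hits = sum(1 for field, value in filled.items()
                   if str(record.get(field, "")).lower() == value)
        if hits:
            results.append((hits / len(filled), record["title"]))
    # Best score first; ties broken by title.
    results.sort(key=lambda pair: (-pair[0], pair[1]))
    return results

# The user fills in three fields, gets the year wrong, and leaves one blank.
form = {"make": "Honda", "color": "red", "year": "2014", "body_style": ""}
for score, title in rank_parametric(form, LISTINGS):
    print(f"{score:.2f}  {title}")
```

A strict match on every filled-in field would return nothing here because of the wrong year. The ranked version still lists Listing A first, since it matches two of the three fields.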
An advanced search option is a nice feature for experts, but most users will be most successful with a single text box in which to enter their keywords. Typical syntax for web searches is to support a plus sign (+) before required keywords, a minus sign or hyphen (-) before keywords that must not appear, and quotes around multiword phrases that must appear in the order presented, and to require that capitalization match if users enter uppercase or mixed-case search terms (lowercase terms should match any capitalization).
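The syntax described above could be handled along the following lines. This is a minimal sketch, not a reference implementation: the function names are invented, terms are matched on word boundaries, and a document with no required terms is assumed to need at least one of the plain terms. Those choices are assumptions of this example.

```python
import re

# An optional + or - sign, followed by a quoted phrase or a single bare word.
TOKEN = re.compile(r'([+-]?)(?:"([^"]+)"|(\S+))')

def parse_query(query):
    """Split a query into (required, excluded, optional) term lists."""
    required, excluded, optional = [], [], []
    for sign, phrase, word in TOKEN.findall(query):
        term = phrase or word
        if sign == "+":
            required.append(term)
        elif sign == "-":
            excluded.append(term)
        else:
            optional.append(term)
    return required, excluded, optional

def contains(term, text):
    """Lowercase terms match any capitalization; others must match exactly."""
    flags = re.IGNORECASE if term == term.lower() else 0
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, flags) is not None

def matches(query, text):
    """Return True if the text satisfies the query."""
    required, excluded, optional = parse_query(query)
    if any(contains(term, text) for term in excluded):
        return False
    if not all(contains(term, text) for term in required):
        return False
    # With no required terms, at least one plain term must appear.
    return bool(required) or any(contains(term, text) for term in optional)

print(parse_query('+parrot -cage "sea salt" Paris'))
# (['parrot'], ['cage'], ['sea salt', 'Paris'])
print(matches("+parrot -cage", "A talking Parrot on his shoulder"))  # True
print(matches("+Parrot", "a talking parrot"))                        # False
print(matches('"sea salt"', "salt from the sea"))                    # False
```

A fuller implementation would likely rank documents by how many plain terms they contain rather than returning a simple yes or no.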