Information Scent

The following is the wiki entry I produced for Dr. Jamie Blustein’s Human Factors in Computer Systems class last fall:

Overview

Information scent is the perceived relevance of a piece of information to a particular user’s information need. It is a sub-concept of information foraging theory, in which scent motivates navigation via an ongoing, on-the-fly assessment of the value and cost of different information sources.

Motivation

Information scent and information foraging are part of the larger study of how people search for information, and are directly related to the areas of cognitive psychology and human behaviour. Unlike other modes of relevance assessment, the foundation of this theory is the perception of relevance based on context and factors relating to individuals or groups of users.

In terms of its applications to the Web, attempts to track and predict patterns in information scent are part of a larger effort to build a cognitive model of how users find information on the Web. Such a model can offer guidance in the area of Web and computer interface design—or in the words of one of the originators of information foraging theory, Peter Pirolli, information scent could enable a “move beyond design by good intuition” [6]. If design can ensure that the scent in a site is strong, the user should be more likely to make the best possible choices and be led to appropriate results [7].

As well as fostering the original research in this area [6] [7], to this day Palo Alto Research Center (PARC) supports ongoing research in this field. The PARC website lists 58 publications on the topic of information scent, including several recent additions that describe models for assessing scent and predicting user scent-following as well as systems that assist (or enhance) scent-following in virtual environments.

Chronology

1957 Nobel-winning psychologist Herb Simon proposes bounded rationality theory; Chi describes it in 2003 as follows: “an agent behaves in a manner that is nearly optimal with respect to its goals as its resources will allow” [1]
1960s Evolutionary-ecological optimal foraging theory is developed in anthropology
1991 Dennett coins term informavores [7]
1994 Sandstrom makes connection between optimal foraging theory and library science [7]
1995 Pirolli and Card develop information foraging theory at Xerox PARC; the theory uses an ecological metaphor to understand how users make decisions around information
1999 Pivotal article on information foraging [7] is published; includes prominent use of information scent concept
2000 Chi et al [2] [3] apply concept of information scent to the Web; develop information foraging into predictive model for Web surfing [1]
2003 Development at PARC of InfoScent™ Bloodhound Simulator, “a push-button navigation analysis system, which automatically analyzes the information cues on a Web site to determine the probability of user task success” [5]
2005 Development at PARC of ScentHighlights, a system that highlights relevant sentences in a text in response to keywords provided by the user [2]
2006 Development at PARC of ScentIndex, a system that produces an on-the-fly index in response to keywords provided by the user [2]

Details of Theory

Information foraging theory is built on adaptationist evolutionary-ecological explanations of food-foraging from the field of anthropology. Just as the practice of foraging for food is a constantly developing strategy based on an ongoing assessment of value and cost, people’s strategies when seeking out information are based on what they perceive to be information sources’ relative costs and value [7]. Costs of an information source can be resource costs (money, time, energy expended, cognitive use) or opportunity costs (benefits that may not be gained if one selects a particular information source) [7].

Optimal information foraging means “maximizing the rate of valuable information gained per unit cost, given the constraints of the task environment” [7]. This can be done via enrichment (improving your environment to fit your strategy) or by scent-following (using available proximal cues to follow a scent trail to a distal source that will hopefully satisfy your need) [7].
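The definition above can be read as a simple ratio of value gained to total cost. The sketch below is only an illustration of that reading, not the model from [7]: the Source class, the split into a cost of reaching a source and a cost of using it, and all of the numbers are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """A hypothetical information source with made-up value and cost estimates."""
    name: str
    value: float         # perceived value of the information gained (arbitrary units)
    between_cost: float  # assumed cost of getting to the source (e.g. seconds)
    within_cost: float   # assumed cost of extracting the information once there


def rate_of_gain(source: Source) -> float:
    """Value gained per unit cost: value / (between_cost + within_cost)."""
    return source.value / (source.between_cost + source.within_cost)


def best_source(sources):
    """Pick the source with the highest rate of gain."""
    return max(sources, key=rate_of_gain)


sources = [
    Source("search result", value=8.0, between_cost=5.0, within_cost=15.0),
    Source("printed manual", value=10.0, between_cost=20.0, within_cost=30.0),
    Source("forum thread", value=3.0, between_cost=2.0, within_cost=4.0),
]

# List the sources from most to least profitable under this reading.
for s in sorted(sources, key=rate_of_gain, reverse=True):
    print(f"{s.name}: {rate_of_gain(s):.2f} value per unit cost")
```

Note that the source with the highest absolute value (the printed manual) is not the one with the highest rate of gain, which is the point of measuring value per unit cost.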

Although not initially modeled on the Web, a great deal of research at PARC [2] [3] has applied this theory to people’s trajectories within Web sites and from one site to another. With Simon’s bounded rationality theory as a starting point, the researchers have extrapolated that in the realm of information searching humans will not always make the best choices, depending on their individual context. When considering the Web, the overwhelming availability of choices and difficulties linked to allocation of attention mean that choices will often be suboptimal [1]. Indeed, Pirolli and Card (1999) stress from the outset that information scent is necessarily dynamic and imperfect due to constantly changing environmental conditions [7].

Jakob Nielsen’s boiled-down version of information foraging for Web design is often quoted by other sources in the Web design community. His 2003 Alertbox entry, “Information Foraging: Why Google Makes People Leave Your Site Faster”, uses the information foraging concept of between-patch behaviour (propensity and motivation for moving from one information source to another) to describe changes in user patterns on the Web. He posits that Google’s reliability in returning relevant hits is contributing to users’ increased confidence about moving to another high-ranking site. He therefore recommends designing for “information snacking” and implementing means designed to lure users back to the site later on [4].

Method

The basis of Pirolli and Card’s information foraging methodology is an adapted form of the complex mathematical formula initially designed to accurately predict animals’ patterns regarding within-patch and between-patch selection of food sources.

The original study by Pirolli and Card (1999), as well as that discussed in Pirolli (1997), makes use of a system they developed called the Scatter/Gather browser. This tool uses clustering to assist a user in sifting through a large number of documents. In the study, the user is assigned a retrieval task query, then shown a number of thematic clusters on screen, from which he or she gathers as many as are relevant. The system then scatters a new sampling of clusters before the user, who repeats the gathering process. The clusters offered are at first very large, then increasingly narrow, until the number of remaining documents is manageable enough for the user to scan or read [6]. The computation behind the system’s scattering is built on the ACT-R production system (later replaced by the ACT-IF model), a model of human cognition designed to recognize the spreading activation network (word representation and inter-word memory in users’ long-term memory) and inter-word correlation (IWC), “users’ conception of word synonymy” [6].

In later studies by Chi et al. (2000, 2001), where information scent and information foraging are applied to the Web, two algorithms are used. The first, Web User Flow by Information Scent (WUFIS), is a behavioral prediction algorithm, while the second, Inferring User Need by Information Scent (IUNIS), is a need prediction algorithm. WUFIS simulates a large number of agents making their way from link to link throughout the content of a Web site. For each of these agents, the model computes the information scent at each step, using spreading activation and comparing the pages’ content (i.e. words in the link itself, words in the text surrounding the link, graphics on or around the link, position of the link on the page) with the agent’s original information goals. By comparing agents’ treks through a site with the information goal, the highest-scent trail can be uncovered [1]. As for the IUNIS algorithm, it is in effect simply a reversal of the WUFIS scent flow trajectory. Instead of starting from a known goal and predicting an unknown destination, it starts from the end destination, and by applying spreading activation to the path followed by a user, it infers the original information goal [1].
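The WUFIS idea of letting simulated users flow through a site in proportion to link scent can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the algorithm as published: the SITE dictionary is a hypothetical site, scent is approximated by plain word-overlap cosine similarity between link text and goal, and alpha is an assumed fraction of users who keep browsing after each click.

```python
import math
from collections import Counter

# Hypothetical toy site: page -> {link target: words in or around the link}.
SITE = {
    "home": {
        "products": "products laptops phones",
        "support": "support help repair warranty",
        "about": "about company history",
    },
    "products": {"laptops": "laptops notebooks", "home": "home"},
    "support": {"warranty": "warranty repair claim", "home": "home"},
    "about": {"home": "home"},
    "laptops": {},
    "warranty": {},
}


def scent(link_text: str, goal: str) -> float:
    """Assumed scent measure: cosine similarity of two bags of words."""
    a, b = Counter(link_text.split()), Counter(goal.split())
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def transition_probs(site, goal):
    """Normalise link scents on each page into click probabilities."""
    probs = {}
    for page, links in site.items():
        scores = {target: scent(text, goal) for target, text in links.items()}
        total = sum(scores.values())
        # With no scent at all on a page, simulated users stop there.
        probs[page] = (
            {t: s / total for t, s in scores.items() if s > 0} if total else {}
        )
    return probs


def simulate_flow(site, goal, start, steps=3, users=1000.0, alpha=0.9):
    """Push a population of simulated users along links, step by step."""
    probs = transition_probs(site, goal)
    current = {start: users}
    visits = Counter({start: users})
    for _ in range(steps):
        following = Counter()
        for page, count in current.items():
            for target, p in probs[page].items():
                following[target] += alpha * count * p
        for page, count in following.items():
            visits[page] += count
        current = following
    return visits


visits = simulate_flow(SITE, "warranty repair", "home")
for page, count in visits.most_common():
    print(f"{page}: {count:.0f} simulated visits")
```

Pages that attract most of the simulated flow lie on the high-scent trail for the given goal. An IUNIS-style reversal would start from an observed path and look for the words that best explain it, which this sketch does not attempt.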

Applications

Predictive Model of Web Surfing

Chi (2003) describes the development of a predictive model for Web surfing. The InfoScent™ Bloodhound Simulator is a service designed to automatically infer the usability of a Web site by way of an adaptation of the WUFIS algorithm, with simulated users that surf for specific goals. Given a Web site address and a set of user tasks, Bloodhound produces a usability report [1].

Scent-friendly Web Design Solutions

Jakob Nielsen prescribes Web design strategies conducive to increased information scent, such as links and category text that accurately describe the content to be found at the destination, use of plain language rather than slogans, and built-in feedback to confirm to users that they are still on the right path towards their desired information goal [4]. Also useful to Web designers, Withrow (2002) points out user behaviours that indicate poor information scent: indecision or hesitation between two or more links, frustration or confusion during browsing, random clicking and over-use of the ‘Back’ button [8]. Withrow also recommends the construction of broader hierarchies, pointing out that limiting headers to too small a number of terms will result in vaguer terms designed to suit a greater number of sub-headings [8].

Scent-based Information Retrieval Tools

In the last few years, the Palo Alto team has developed a number of practical tools that use information scent as a basis for improving computerized information retrieval. ScentIndex is a system that produces an on-the-fly index in response to keywords provided by the user. Similarly, ScentHighlights highlights relevant sentences in a text in response to keywords provided by the user. In both cases the computation relies on two components: spreading activation (a cognitive model of human memory retrieval used in cognitive psychology research) and word co-occurrence, which models “relatedness of concepts” and is used in statistical language processing [2].
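How spreading activation and word co-occurrence might combine to pick out relevant sentences can be sketched as follows. This is a minimal illustration and not PARC’s implementation: the function names, the same-sentence co-occurrence window, the two rounds of spreading and the decay value are all assumptions chosen for this example.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence(sentences):
    """Count how often two distinct words appear in the same sentence."""
    graph = defaultdict(lambda: defaultdict(float))
    for sentence in sentences:
        words = set(sentence.lower().split())
        for a, b in combinations(sorted(words), 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def spread(graph, keywords, rounds=2, decay=0.5):
    """Spread activation from the keywords to co-occurring words."""
    activation = defaultdict(float)
    for keyword in keywords:
        activation[keyword] = 1.0
    for _ in range(rounds):
        updated = defaultdict(float, activation)
        for word, level in activation.items():
            neighbours = graph.get(word, {})
            total = sum(neighbours.values())
            for other, weight in neighbours.items():
                # Each word passes on a decayed share of its activation,
                # split according to co-occurrence strength.
                updated[other] += decay * level * weight / total
        activation = updated
    return activation


def rank_sentences(sentences, keywords):
    """Order sentences by the average activation of their words."""
    activation = spread(cooccurrence(sentences), keywords)

    def score(sentence):
        words = set(sentence.lower().split())
        return sum(activation.get(w, 0.0) for w in words) / len(words)

    return sorted(sentences, key=score, reverse=True)


sentences = [
    "scent guides foraging on the web",
    "foraging theory comes from ecology",
    "lunch is served at noon",
]
ranked = rank_sentences(sentences, ["scent"])
for sentence in ranked:
    print(sentence)
```

In this toy text the second sentence never mentions the keyword, yet it outranks the unrelated third sentence because “foraging” co-occurs with “scent” elsewhere. That is the kind of conceptual relatedness the two components are meant to capture.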

Concluding Summary

As seen above, the tools that have been devised thus far using the principles of information scent and information foraging offer much promise for providing predictive methods of Web analytics and assistance to users in improving information retrieval.

Further Reading

  • Chi, E. H. (2003). Scent of the Web. In Ratner, J. (Ed.) Human factors and web development, pp. 265-285. Mahwah, NJ: Lawrence Erlbaum.
  • Chi, E. H., Pirolli, P., & Pitkow, J. (2000). The scent of a site: A system for analyzing and predicting information scent, usage, and usability of a Web site. In Proceedings of the ACM Conference on Human Factors in Computing Systems, CHI 2000, pp. 161-168. The Hague, Netherlands: Association for Computing Machinery.
  • Pirolli, P. (2007). Information foraging theory: Adaptive interaction with information. Oxford: Oxford University Press.
  • Pirolli, P., & Card, S. (1999). Information foraging. Psychological review, 106(4), 643-675. Retrieved 10/20/2007 from PsycArticles database.

References

[1] Chi, E. H. (2003). Scent of the Web. In Ratner, J. (Ed.) Human factors and web development, pp. 265-285. Mahwah, NJ: Lawrence Erlbaum.

[2] Chi, E. H., Hong, L., Gumbrecht, M., Card, S. K. (January 2005). ScentHighlights: Highlighting conceptually-related sentences during reading. In Proc. of the 10th International Conference on Intelligent User Interfaces, pp. 272-274. San Diego: ACM Press. Retrieved October 22, 2007, from http://www-users.cs.umn.edu/~echi/papers.html

[3] Chi, E. H., Pirolli, P., Chen, K., & Pitkow, J. (2001). Using information scent to model user information needs and actions on the Web. In Proceedings of the ACM Conference on Human Factors in Computing Systems, CHI 2001, pp. 490-497. Seattle, WA: Association for Computing Machinery.

[4] Nielsen, J. (2003). Alertbox: Information foraging: Why Google makes people leave your site faster. Retrieved October 22, 2007, from http://www.useit.com/alertbox/20030630.html

[5] Palo Alto Research Center. (n. d.). The Bloodhound project: Automating discovery of Web usability issues using the InfoScent™ simulator. Retrieved from http://www.parc.com/research/publications/details.php?id=4747

[6] Pirolli, P. (1997). Computational models of information scent-following in a very large browsable text collection. In Proceedings of the ACM Conference on Human Factors in Computing Systems, CHI ’97, pp. 3-10. Retrieved October 15, 2007, from ACM database.

[7] Pirolli, P., & Card, S. (1999). Information foraging. Psychological review, 106(4), 643-675. Retrieved 10/20/2007 from PsycArticles database.

[8] Withrow, J. (2002). Do your links stink? Techniques for good web information scent. Bulletin of the American Society for Information Science and Technology, 28(5), 7. Retrieved 10/20/2007 from Research Library Complete database.
