## Research

A Primer on Probabilistic Approaches to Visual Perception

### Background

The first and most influential advocate of using past experience as a means of contending with the uncertain provenance of visual stimuli was Hermann von Helmholtz (1866/1924). Helmholtz summarized his conception of this empirical contribution to visual percepts by proposing that the raw "sensations" generated by the physiological infrastructure of the eye and the input stages of the visual brain could be modified by information derived from experience. Helmholtz described this process as making "unconscious inferences" about reality, thus generating perceptions more nearly aligned with stimulus sources when input-level sensations proved inadequate (op. cit., vol. III, p. 10 ff.). Despite these speculations and the ensuing debate during the second half of the 19th C., vision science during most of the 20th C. was understandably dominated by the enormous success of modern neurophysiology and neuroanatomy. A plausible assumption in much contemporary vision research has thus been that understanding visual perception will be best achieved by gleaning increasingly precise information about the receptive field properties of visual neurons and the synaptic connectivity that gives rise to these properties. As a result, the role of past experience in determining what observers see has, until recently, received relatively little attention.

### Bayes' theorem

If the visual system uses empirical information to generate perceptions that reflect the real-world conditions and object relationships that observers have always had to respond to by appropriate visually-guided behavior, then understanding vision inevitably means understanding how, in statistical terms, physical sources are related to retinal images. By far the most popular approach to meeting this challenge has been Bayesian decision theory. Thomas Bayes was an 18th C. minister and amateur mathematician whose paper entitled "An Essay towards Solving a Problem in the Doctrine of Chances" was published posthumously in 1763. The manuscript proved a theorem showing how conditional probabilities are used in making inferences. Although Bayes' purpose in elaborating his eponymous theorem remains obscure, it has been applied to advantage in a number of disciplines as a framework for addressing statistical problems whose solution depends on an assessment of hypotheses that are only more or less likely to be true as a result of complex circumstances. In vision research, Bayes' theorem was initially used to develop pattern recognition strategies for computer vision. More recently, however, the framework provided by the theorem has been advocated as a means of rationalizing visual perception (or at least the judgments associated with visual perception). Bayes' theorem is usually written in the form

$$P(H \mid E) = \frac{P(H)\,P(E \mid H)}{P(E)}$$

where H is a hypothesis, E the evidence pertinent to its validity and P probability. The first term on the right side of Bayes' equation, P(H), is referred to as the prior probability distribution or simply the prior, and is a statistical measure of confidence in the hypothesis, absent any present evidence pertinent to its truth or falsity. With respect to vision, the prior describes the relative probabilities of different physical states of the world pertinent to retinal images, i.e., the relative frequency of occurrence of various illuminants, surface reflectance values, object sizes and so on. The second term, P(E|H), is called the likelihood function. If hypothesis H were true, this term indicates the probability that the evidence E would have been available to support it. In the context of vision, given a particular state of the physical world (i.e., a particular combination of illumination, reflectance properties, object sizes etc.), the likelihood function describes the probability that the state would generate the retinal projection in question. The product of the prior and the likelihood function, divided by a normalization constant, P(E), gives the posterior probability distribution, P(H|E). The posterior distribution defines the probability of hypothesis H being true, given the evidence E. In vision, the posterior probability distribution thus indicates the relative probability of a given retinal image having been generated by one or another of the different physical realities that might be the source of the image.
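As a rough illustration of how the three terms combine, the following minimal sketch applies Bayes' theorem to a discrete set of hypotheses. The function name `posterior`, the two hypothesis labels, and all of the numbers are hypothetical and chosen only for illustration; they are not taken from any measured distribution.

```python
def posterior(prior, likelihood):
    """Return P(H|E) for each hypothesis H, given P(H) and P(E|H)."""
    # Unnormalized posterior: prior times likelihood for each hypothesis.
    joint = {h: prior[h] * likelihood[h] for h in prior}
    # P(E), the normalization constant, is the sum over all hypotheses.
    evidence = sum(joint.values())
    return {h: p / evidence for h, p in joint.items()}


# Hypothetical example: two candidate sources for the same retinal luminance.
prior = {"bright surface, dim light": 0.7, "dark surface, strong light": 0.3}
likelihood = {"bright surface, dim light": 0.2, "dark surface, strong light": 0.6}

post = posterior(prior, likelihood)
for hypothesis, probability in post.items():
    print(f"{hypothesis}: {probability:.4f}")
```

In this made-up case the hypothesis with the smaller prior ends up with the larger posterior, because the evidence is three times more probable under it.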

Figure 1 / A Bayesian approach to characterizing the relationship between a visual target of luminance L and its possible physical sources. A) The prior distribution of illuminations (W) and reflectance values (R) in the physical world. The distribution (which is didactic only) shows illumination varying on an arbitrary scale of 0 to 100, and reflectance varying from 0% to 100%. B) The dashed red line on the surface of the prior distribution indicates the position where the product of illumination and reflectance equals the luminance (L) of the target. If the image formation process is assumed to be free of noise, the posterior distribution, P(W, R | L), obtained by multiplying the prior by a likelihood function is the section of the prior distribution along the dashed line. C) The addition of Gaussian noise to the image formation process makes the posterior distribution 'thicker' but does not alter the fact that the posterior is effectively a section of the prior.
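The relationship described in the caption can be sketched numerically. The code below is an illustrative approximation only: the Gaussian-shaped `prior`, the grid resolution, the target luminance of 20, and the noise levels `sigma` are all assumptions made for the example, and luminance is taken to be the product of illumination and reflectance plus Gaussian noise. It shows that a small noise level concentrates the posterior on the iso-luminant line, whereas a larger one spreads it out.

```python
import math


def prior(w, r):
    """Hypothetical unnormalized prior over illumination w (0-100) and reflectance r (0-1)."""
    # Assumed shape: a Gaussian bump centred on w = 50, r = 0.5 (didactic only).
    return math.exp(-((w - 50.0) / 25.0) ** 2 - ((r - 0.5) / 0.25) ** 2)


def likelihood(lum, w, r, sigma):
    """Unnormalized P(lum | w, r), assuming lum = w * r plus Gaussian noise."""
    return math.exp(-0.5 * ((lum - w * r) / sigma) ** 2)


def posterior_grid(lum, sigma, n=101):
    """Normalized posterior P(w, r | lum) evaluated on an n-by-n grid."""
    cells = {}
    for i in range(n):
        for j in range(n):
            w = 100.0 * i / (n - 1)
            r = j / (n - 1)
            cells[(w, r)] = prior(w, r) * likelihood(lum, w, r, sigma)
    total = sum(cells.values())
    return {cell: value / total for cell, value in cells.items()}


def mass_near_line(post, lum, tol):
    """Posterior mass lying within tol of the iso-luminant line w * r = lum."""
    return sum(p for (w, r), p in post.items() if abs(w * r - lum) <= tol)


target = 20.0
narrow = posterior_grid(target, sigma=0.5)  # nearly noise-free image formation
wide = posterior_grid(target, sigma=5.0)    # noisier image formation

print("mass near line, low noise: ", mass_near_line(narrow, target, tol=1.0))
print("mass near line, high noise:", mass_near_line(wide, target, tol=1.0))
```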

### Bayesian decision theory

Because the posterior distribution indicates only the relative probabilities of a set of possible image sources, a particular source (i.e., a particular combination of illumination and reflectance in the example above) must be selected from this set if the aim is to predict what an observer will actually see. The usual way of addressing this further issue is to assume that the visual system makes this choice according to the behavioral consequences associated with each perceptual "decision". The influence of various consequences is typically expressed in terms of the discrepancy between the decision made and the actual state of the world, which over the full range of the possible choices defines a gain-loss function. Since there is no a priori way to model this function (indeed, given the enormous number of variables involved, a realistic gain-loss function for some aspect of vision would be extraordinarily difficult to determine), the relative cost of different behavioral responses is assumed. For example, a common assumption is that observers will "choose" the percept that corresponds to the maximum value in the posterior probability distribution, since this choice would generally minimize the discrepancy between the percept and the actual state of the world.

### Empirical ranking theory

The application of Bayesian decision theory to vision is clearly an important advance in that it formalizes Helmholtz's general proposal about "visual inferences" as a means of contending with stimulus uncertainty. Nonetheless, its implementation presents both conceptual and practical difficulties. With respect to the conceptual implications of Bayesian theory applied to visual perception, the intuitively appealing idea that percepts correspond to physical characteristics such as surface reflectance is problematic and in many instances false (as we explain in a later section). The practical obstacles are the difficulty of determining the physical parameters relevant to any specific prior, and the need for a decision rule based on an assumed gain-loss function. Is there, then, any other way of conceptualizing how vision utilizes empirical information to deal with the inverse optics problem?

An alternative is to suppose that the percept elicited by any particular stimulus parameter is determined by the *relative frequency of occurrence of that particular stimulus parameter in relation to all other instances of that parameter experienced in the past.* For example, with respect to the perceptual quality of brightness, the brightness perceived in response to the luminance of a region of a visual scene would be determined by how often the specific luminance had occurred relative to the occurrence of all the other luminance values in that context in the past experience of observers. In other words, the brightness elicited by a target is determined by the empirical rank of the relevant luminance value within the full range of experience with similar scenes (see Yang and Purves, 2004; Howe and Purves, 2005 for examples of how this approach has actually been used).

The biological rationale for this approach is that it is obviously desirable to have the full perceptual range for any visual quality (from the brightest percept we can have to the dimmest, for example) aligned with the full range of the relevant stimulus parameters generated by the physical world (from the most intense luminance experienced in visual stimuli to the least intense).
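The idea of an empirical rank can be illustrated with a small sample-based sketch. The helper `empirical_rank` and both lists of luminance values are hypothetical; the lists stand in for luminances previously encountered in two different contexts and are not drawn from any real scene database.

```python
from bisect import bisect_right


def empirical_rank(value, past_values):
    """Fraction of past observations that are less than or equal to value."""
    ordered = sorted(past_values)
    return bisect_right(ordered, value) / len(ordered)


# Hypothetical luminance values previously encountered in two contexts.
bright_context = [5, 12, 12, 20, 33, 41, 47, 58, 76, 90]
dim_context = [2, 4, 7, 9, 15, 18, 22, 25, 30, 35]

# Within one context, a higher luminance holds a higher rank.
print(empirical_rank(20, bright_context))  # 0.4
print(empirical_rank(58, bright_context))  # 0.8

# The same luminance holds a different rank in a different context.
print(empirical_rank(20, dim_context))     # 0.6
```

Under this sketch the same luminance value of 20 ranks higher among the dimmer set of past values than among the brighter set, so the rank depends on the context as well as on the luminance itself.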

Figure 2 / The empirical ranking approach to characterizing the relationship between images and their possible physical sources. A) The prior distribution of illumination and reflectance values shown in Figure 1A can be integrated along the dashed red lines (which represent only a few examples) to produce the marginal distribution in (B). Each dashed line is an iso-luminant line along which the product of illumination and reflectance is a specific luminance value. B) The marginal distribution derived by integrating the distribution in (A) along iso-luminant lines. This distribution describes the relative probability of occurrence of the physical sources of different luminance intensities in human experience. C) The cumulative probability distribution derived from (B). The cumulative probability for any specific luminance value, l, is the summed probability of occurrence of the physical sources that generate luminance values less than or equal to that luminance, derived by calculating the area underneath the curve in (B) and to the left of the position where x = l. Thus, the y-value of each point on the cumulative distribution indicates the percentage of physical sources that generate luminance values less than or equal to a specific luminance, providing a basis for ranking that luminance value in past experience. In the example shown, luminance L' holds a higher rank (r') than luminance L (which holds rank r), and should thus be seen as brighter than L.
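The steps in panels A to C can be approximated on a grid. In the sketch below, the Gaussian-shaped `prior`, the grid size, and the number of luminance bins are assumptions made for illustration. Integration along iso-luminant lines is approximated by summing prior mass into narrow luminance bins, and the cumulative distribution is a running sum of those bins.

```python
import math


def prior(w, r):
    """Hypothetical unnormalized prior over illumination w (0-100) and reflectance r (0-1)."""
    return math.exp(-((w - 50.0) / 25.0) ** 2 - ((r - 0.5) / 0.25) ** 2)


def luminance_marginal(n=201, n_bins=100):
    """Approximate the marginal over luminance by binning prior mass by w * r."""
    bins = [0.0] * n_bins
    for i in range(n):
        for j in range(n):
            w = 100.0 * i / (n - 1)
            r = j / (n - 1)
            lum = w * r  # luminance on a 0-100 scale
            k = min(int(lum * n_bins / 100.0), n_bins - 1)
            bins[k] += prior(w, r)
    total = sum(bins)
    return [b / total for b in bins]


def cumulative(marginal):
    """Running sum of the marginal, i.e. the cumulative distribution."""
    out, acc = [], 0.0
    for p in marginal:
        acc += p
        out.append(acc)
    return out


def rank_of(lum, cdf):
    """Cumulative probability of the luminance bin that contains lum."""
    k = min(int(lum * len(cdf) / 100.0), len(cdf) - 1)
    return cdf[k]


cdf = cumulative(luminance_marginal())
print("rank of L  = 20:", rank_of(20.0, cdf))
print("rank of L' = 40:", rank_of(40.0, cdf))
```

Because the cumulative distribution never decreases, the higher luminance always holds a rank at least as high as the lower one, which is the ordering the caption describes for L' and L.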

### References

Bayes T (1763) An essay towards solving a problem in the doctrine of chances. Philos Trans R Soc 53: 370-418.

Yang Z, Purves D (2004) The statistical structure of natural light patterns determines perceived light intensity. Proc Natl Acad Sci USA 101: 8745-8750.

Howe CQ, Purves D (2005) Perceiving Geometry: Geometrical Illusions Explained in Terms of Natural Scene Statistics. New York: Springer.

Howe CQ, Lotto RB, Purves D (2006) Empirical approaches to understanding visual perception. J Theor Biol 241: 866-875.