week 2


CHAPTER 3

PERCEPTION Graham Pike, Graham Edgar, and Helen Edgar

1 INTRODUCTION

If you have ever searched frantically for an object that turns out to have been right in front of you all along, then this chapter may make you feel better. For, as you will see, perception of even the simplest object is actually a very complex affair. So, next time you turn the house upside down looking for your keys and then find them in the first place you looked, remember that your brain is using extremely sophisticated processes, many of which are beyond even the most advanced computer programs available today (not that computer programs ever lose their keys!).

The sophistication of the cognitive processes that allow us to perceive visually is perhaps, if perversely, revealed best through the errors that our perceptual system can make. Figure 3.1 contains three very simple visual illusions that illustrate this. Image (a) is the Müller–Lyer illusion, in which the vertical line on the left is perceived as being longer even though both lines are of an identical length. Image (b) is a Necker cube, in which it is possible to perceive the cube in either of two perspectives (although you can never see both at the same time so please do not strain your eyes trying). Image (c) is Kanizsa’s (1976) illusory square, in which a square is perceived even though the image does not contain a square but only four three-quarter-complete circles.

If the cognitive processes involved in perception were simple, then it would be hard to see how the effects in Figure 3.1 could occur. After all, they are all based on very straightforward geometric shapes that should be easy to perceive accurately. Activity 3.1 demonstrates that there must be more sophisticated processes that have been developed to perceive the complex visual environment, which get confused or tricked by elements of these images. In fact the three effects above are likely to arise because our visual system has evolved to perceive solid, three-dimensional (3D) objects and attempts to interpret the two-dimensional (2D) shapes as resulting from 3D scenes.

Perceptual errors arising from localized damage to the brain also demonstrate the complexities involved in visual perception. Some of the problems faced by people suffering from specific neuropsychological conditions include: being able to recognize objects but not faces (prosopagnosia); being able to perceive individual parts of the environment but not to integrate these parts into a whole; believing that one’s family has been replaced by robots/aliens or impostors of the same appearance (Capgras syndrome); and only being able to perceive one side of an object (visual, or sensory, neglect – see Chapter 2, Section 5.1).

Copyright © 2012. OUP Oxford. All rights reserved. May not be reproduced in any form without permission from the publisher, except fair uses permitted under U.S. or applicable copyright law.

EBSCO Publishing : eBook Comprehensive Academic Collection (EBSCOhost) - printed on 10/2/2022 5:29 PM via UNIVERSITY OF MARYLAND GLOBAL CAMPUS AN: 678071 ; Nick Braisby, Angus Gellatly.; Cognitive Psychology Account: s4264928

FIGURE 3.1 Three visual phenomena: (a) Müller–Lyer illusion; (b) Necker cube; (c) Kanizsa’s illusory square.

ACTIVITY 3.1

Look at each of the three visual illusions in Figure 3.1 and try to work out why it occurs. If you can’t think of an answer, it may help to look at Figure 3.2.

FIGURE 3.2 Some clues as to why the illusions in Figure 3.1 may occur.


1.1 Perceiving and sensing

The term perception has different meanings, although a common element in most meanings is that perception involves the analysis of sensory information. When cognitive psychologists talk about perception, they are usually referring to the basic cognitive processes that analyse information from the senses. Throughout this chapter we shall be examining research and theories that have attempted to reveal and describe the cognitive processes responsible for analysing sensory information and providing a basic description of our environment; basically, how we make sense of our senses!

There has been considerable debate about the role played by sensory information in our perception of the world, with some philosophers rejecting the idea that it plays any part at all in the perception of objects. Atherton (2002) suggested that this may be because the notion of a sensation is rather problematic: ‘Sensations seem to be annoying, extra little entities . . . that somehow intervene between the round dish and our perception of it as round’ (Atherton, 2002, p.4). We will not delve into this philosophical debate here, other than to note the distinction between sensation and perception. Throughout this chapter we will use the term ‘sensation’ to refer to the ability of our sense organs to detect various forms of energy (such as light or sound waves). However, to sense information does not entail making sense of it. There is a key difference between being able to detect the presence of a certain type of energy and being able to make use of that energy to provide information as to the nature of the environment surrounding us. Thus we use the term ‘sensation’ to refer to that initial detection and the term ‘perception’ to refer to the process of constructing a description of the surrounding world. For example, there is a difference between the cells in a person’s eye reacting to light (sensation) and that person knowing that their course tutor is offering them a cup of tea (perception).

You may have noticed that we have begun to focus on visual perception rather than any of the other senses. Although the other senses, particularly hearing and touch, are undoubtedly important, there has been far more research on vision than on the other modalities. This is because when we interact with the world we rely more on vision than on our other senses. As a result, far more of the primate brain is engaged in processing visual information than in processing information from any of the other senses. We use vision in both quite basic ways, such as avoiding objects, and in more advanced ways, such as in reading or recognizing faces and objects. So, although the previous chapter examined auditory perception and Chapter 4 will explore haptic perception (touch) as well as visual perception, we will devote the present chapter to examining research into, and theories of, visual perception.

COMMENT

One explanation for the Müller–Lyer illusion is that the arrowheads provide clues as to the distance of the upright line. For example, the inward-pointing arrowheads suggest that the vertical line might be the far corner of a room, whilst the outward-pointing arrowheads suggest that the vertical line could be the near corner of a building. We therefore see the first vertical line as longer because we assume it is further away from us than the second vertical line, though it makes the same size image on the retina.

The Necker cube can be seen in two different ways, as there are no clues as to which is the nearest face. Most cube-like objects that we encounter are solid and contain cues from lighting and texture about which is the nearest face. As the Necker cube does not contain these cues, we are unable to say for certain which face is closest.

Kanizsa’s illusory square occurs due to a phenomenon known as perceptual completion. When we see an object partly hidden behind (occluded by) another object, we represent it to ourselves as a whole object rather than as missing its hidden parts. In the same way, we assume that four black circles are being occluded by a white square.

1.2 The eye

The logical place to start any consideration of visual perception is with the eye. A cross-section of the human eye is presented in Figure 3.3. Incoming light passes through the cornea into a small compartment called the anterior chamber (filled with fluid termed aqueous humour) and then through the lens into the major chamber of the eye that is filled with a viscous jelly called vitreous humour. The light is focused by the lens/cornea combination onto the retina on the back surface of the eye. It is the receptor cells in the retina that ‘sense’ the light.

The retina contains two broad classes of receptor cell, rods and cones (so called for their shapes). Both rods and cones are sensitive to light, although the rods respond better than the cones at low light levels and are therefore the cells responsible for maintaining some vision in poor light. The cones are responsible for our ability to detect fine detail and different colours and are the basis of our vision at higher (daylight) light levels. Many animals, such as dogs and cats, have a higher ratio of rods to cones than humans do. This allows them to see better in poor light, but means that they are not so good at seeing either colour or fine detail.

One area of the retina that is of particular interest is the central portion known as the macula lutea (it is yellow in colour and ‘lutea’ derives from a Latin word that means yellow), which contains almost all of the cones within the human retina. Within the macula, there is a small indentation called the fovea. The fovea is the area of the retina that contains the highest density of cones and is responsible for the perception of fine detail.

FIGURE 3.3 The human eye. (Labelled parts: optic nerve, cornea, clear lens, retina, incoming light rays, aqueous humour, vitreous humour, anterior chamber.)

ACTIVITY 3.2

Place your thumbs together and hold them out at arm’s length from your eyes. Now slowly move your left thumb to the left whilst keeping your eyes focused on your right thumb. You will find that after you have moved your left thumb more than about two thumb widths away from where your eyes are focused, it appears to go out of focus. This is because the light being reflected into your eyes from the left thumb is no longer striking the fovea, meaning that you cannot perceive it in fine detail.


1.3 Approaches to perception

Psychologists have taken many different approaches to studying perception. One important distinction between approaches is whether the ‘goal’ of perception is assumed to be action or recognition. It is possible to conceive of recognition and action as being stages in the same perceptual process, so that action would only happen once recognition had taken place. However, our reaction to objects in the environment sometimes has to be very quick indeed, so that first having to work out what an object may be would be inconvenient to say the least. For example, if I see a moving object on a trajectory that means it will hit me in the head, then the most important thing is to move my head out of the way. Working out that the object is the crystal tumbler containing vodka and tonic that was only moments ago in the hand of my somewhat angry-looking partner is, for the moment at least, of secondary importance. I need to act to get out of the way of the object regardless of what the object actually is or who threw it.

As we shall see, there is evidence that perception for action and perception for recognition are quite different processes that may involve different neural mechanisms (Milner and Goodale, 1998). But, although it is important to make the distinction between perception for action and perception for recognition, we should not see them as being entirely independent. Sometimes the object that is about to hit your head could be the football that David Beckham has just crossed from the wing, requiring a very different response from that to the crystal tumbler.

Another way of differentiating approaches to perception is to consider the ‘flow of information’ through the perceptual system. To see what we mean by this phrase, try Activity 3.3.

ACTIVITY 3.3

Consider these two scenarios:

1 A blindfolded student trying to work out what the unknown object they have been handed might be.

2 A blindfolded student searching for their textbook.

Imagine you are the blindfolded student. What strategies do you think you might employ to complete the above two tasks successfully? Can you identify any key differences in these strategies?

COMMENT

A common strategy to employ for the first scenario is to try to build up a ‘picture’ of the object by gradually feeling it. A common strategy to employ for the second scenario is to hold in your mind the likely shape and texture of the book and to search the environment for an object that shares these characteristics. The key difference between these scenarios is the direction in which information about the object is ‘flowing’, demonstrated by how the student’s existing knowledge of what objects look like is being utilized. In the first scenario, information is flowing ‘upward’, starting with an analysis of the information derived from the senses (in this case via touch). In the second scenario, information is flowing ‘downward’, starting with the knowledge of what books tend to feel like.

So, in the case of touch, perception of the environment can involve information ‘flowing’ through the relevant perceptual system in two directions. But what about vision? If we were to remove the blindfold from our student in Activity 3.3, they would instantly be able to tell what the unknown object was or to spot the book in front of them. Does this mean that there is not a similar flow of information when the sense being used is vision?

To answer this question, let’s try to formulate the stages involved in the student perceiving that there is a book in front of them. One approach might be:

• Light reflected from the book strikes the retina and is analysed by the brain.

• This analysis reveals four sudden changes in brightness (caused by the edges of the book against whatever is behind it).

• Two of these are vertical edges and two are horizontal edges (the left/right and top/bottom of the book).

• Each straight edge is joined (by a right angle at each end) to two others (to form the outline of the book).

• Within these edges is an area of gradually changing brightness containing many small, much darker areas (the white pages with a growing shadow toward the spine and the much darker words).

• A comparison of this image with representations of objects seen previously suggests that the object is an open book.
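The early steps in this list can be illustrated with a short sketch. It is not a model proposed in this chapter: it simply assumes that one row of an image can be represented as a list of brightness values, and the function name `find_edges` and its `threshold` parameter are hypothetical. An edge is marked wherever brightness changes suddenly between neighbouring positions, while gradual changes (such as a shadow growing toward the spine) are ignored.

```python
def find_edges(row, threshold=50):
    """Return the positions in a row of brightness values where
    brightness changes by more than `threshold` between neighbours."""
    edges = []
    for i in range(1, len(row)):
        # A large jump between neighbouring values counts as an edge;
        # small, gradual changes fall below the threshold.
        if abs(row[i] - row[i - 1]) > threshold:
            edges.append(i)
    return edges


# One row of a hypothetical image: dark background (20), a light page
# (200) containing a darker printed word (60), then background again.
row = [20, 20, 200, 200, 60, 200, 200, 20, 20]
print(find_edges(row))
```

The outermost positions returned correspond to the left and right edges of the book, and the inner ones to the boundaries of the darker word. Later steps in the list, such as joining edges into an outline and comparing it with stored representations, are beyond this sketch.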

This approach starts with the image formed on the retina by the light entering the eye and proceeds by analysing this pattern to gradually build up a representation of the object in view, so we refer to it as involving bottom-up processing. This means that the flow of information through the perceptual system starts from the bottom – the sensory receptors – and works upward until an internal representation of the object is formed.

There is, however, another way of recognizing the book. It is very likely that the student has seen many books in the past and has a fair idea of what a book should look like. This existing knowledge regarding book appearance could come in very useful in finding the textbook. Instead of building up a picture of the environment by analysing sensory information alone, it could be that the student uses existing knowledge of what books look like to find this particular book. For example, they might progress like this:

• I know that books are rectangular in shape and have light pages with dark words.

• I can see something in front of me that matches this description, so it must be a book.

The flow of information in this latter example has been reversed. The student started with existing knowledge regarding the environment and used this to guide their processing of sensory information. Thus the flow of information progressed from the top down, as it started with existing knowledge stored in the brain, and we refer to it as involving top-down processing.

So both haptic and visual perceptual processes may operate both by building up a picture of the environment from sensory information and by using existing knowledge to make sense of new information. In other words, the flow of information through the perceptual system can be either bottom-up or top-down. These concepts will be explored throughout this chapter and we shall examine theories that concentrate on one or the other of these processes, and also look at how they might interact.

SUMMARY OF SECTION 1

• Even the perception of simple images involves sophisticated cognitive processing, as demonstrated by visual illusions and neuropsychological disorders.

• We use the term sensation to refer to the detection of a particular form of energy by one of the senses and the term perception to refer to the process of making sense of the information sent by the senses.

• In the human eye the lens and cornea focus light onto the retina, which contains receptor cells that are sensitive to light.

• Perception can have different goals. The most common goals are perception for action and perception for recognition.

• The bottom-up approach to perception sees sensory information as the starting point, with perception occurring through the analysis of this information to generate an internal description of the environment.

• The top-down approach to perception involves making greater use of prior knowledge, with this guiding the perceptual process.


2 THE GESTALT APPROACH TO PERCEPTION

As with Chapter 2, we are going to examine the various approaches that have been taken to studying visual perception in a more or less historical order. One of the principal approaches to perception in the first half of the twentieth century was that of the Gestalt movement, which was guided by the premise ‘the whole is greater than the sum of its parts’. In perceptual terms, this meant that an image tended to be perceived according to the organization of the elements within it, rather than according to the nature of the individual elements themselves.

It is easy to see perceptual organization at work as it tends to be a very powerful phenomenon. In fact, it appears as if both visual and auditory stimuli can be grouped according to similar organizing principles (Aksentijevic et al., 2001).

Have a look at Activity 3.4. Most people looking at these images describe a circle and two crossing lines. But the image on the left is not a circle as it contains a gap at the top. This is the Gestalt perceptual organizational phenomenon of closure at work, in which a ‘closed’ figure tends to be perceived rather than an ‘open’ one. Likewise, the image on the right is not necessarily showing crossing lines, as it could be two pen-tips touching (in the middle of the image). The reason you see a cross is due to what the Gestalt researchers called good continuation, by which we tend to interpret (or organize) images to produce smooth continuities rather than abrupt changes.

Try Activity 3.5. You will probably have seen these images as two squares, due to the law of closure. However, you will also probably have seen the square on the left as consisting of columns of dots and the one on the right as consisting of rows of dots. If so, this was an example of the organizational law of proximity, because in the left-hand image the horizontal spacing between the dots is greater than the vertical, and vice versa for the image on the right. Thus, the proximity of the individual elements is being used to group them into columns in the left-hand square and rows in the right-hand one.
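The law of proximity as described here can be written as a simple rule. The sketch below is only an illustration under the assumption of a regular grid of identical dots; the function name `group_by_proximity` and its parameters are hypothetical and do not come from the Gestalt literature. Dots are grouped along whichever direction has the smaller spacing.

```python
def group_by_proximity(horizontal_gap, vertical_gap):
    """Predict whether a regular grid of identical dots tends to be
    seen as rows or as columns, using spacing alone."""
    if vertical_gap < horizontal_gap:
        # Dots are closer to their vertical neighbours: columns.
        return "columns"
    if horizontal_gap < vertical_gap:
        # Dots are closer to their horizontal neighbours: rows.
        return "rows"
    # Equal spacing gives proximity no basis for choosing.
    return "ambiguous"


# Hypothetical spacings for the two squares (arbitrary units).
print(group_by_proximity(horizontal_gap=2.0, vertical_gap=1.0))
print(group_by_proximity(horizontal_gap=1.0, vertical_gap=2.0))
```

A rule like this considers spacing only. It says nothing about what happens when another organizing principle, such as similarity, pulls in the opposite direction.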

Now try Activity 3.6. Again, you probably saw this as a square, due to the law of closure, but perhaps also saw the square as consisting of columns of circles. If so, this was an example of the organizational law of similarity (in this case the similarity of colour). However, the spacing of the circles is such that the law of proximity encourages you to see rows not columns. For many people the law of similarity takes precedence and they see columns, while others may tend to see rows. Most people can readily switch between one organization (or interpretation) and the other because each conforms with a particular Gestalt law.

ACTIVITY 3.4

Look at Figure 3.4 and describe your first impression of what you see.

FIGURE 3.4 Two examples of perceptual organization.

ACTIVITY 3.5

As before, look at Figure 3.5 and describe your first impressions.

FIGURE 3.5 The organizational law of proximity.

The Gestalt researchers (including Koffka, 1935, Köhler, 1947, and Wertheimer, 1923) formulated other organizational laws, but most were deemed to be manifestations of the Law of Prägnanz, described by Koffka as: ‘Of several geometrically possible organizations that one will actually occur which possesses the best, simplest and most stable shape’ (Koffka, 1935, p.138).

So, you can see that a number of organizational laws can be used in order to work out which individual components of an image should be grouped together. Now look around the room in which you are sitting. How many squares composed of dots can you see? How many nearly complete circles and crossing lines are there? Your immediate response was probably to say ‘none’ or ‘only those in this book’. However, if you look carefully you will see that the stimuli used in the Gestalt demonstrations do have counterparts in the real world. For example, when I look out of my window I see a football that is partly hidden by a post, which provides an example of closure, as I perceive a complete sphere rather than an incomplete circle. The figures that you have seen in this section can therefore be seen as simplified 2D versions of real-world objects and scenes. Because they are simplified, some information that would be present in real-world scenes is discarded. This lack of realism is a disadvantage. On the other hand, however, it is possible to control and manipulate features of these figures, such as the proximity or similarity of elements, to see how they may contribute to perception.

As we shall see in the next section, there is considerable tension in the field of visual perception regarding the usefulness of simplified stimuli such as those used by the Gestaltists. Some approaches are based on laboratory experimentation in which simplified scenes or objects are shown to participants, whilst proponents of other approaches claim that perception can only be studied in the real world, by examining how people perceive solid, 3D objects that are part of a complex 3D environment.

SUMMARY OF SECTION 2

• The Gestalt approach to perception involved studying the principles by which individual elements tend to be organized together.

• Organizing principles include closure, good continuation, proximity, and similarity.

• The stimuli used by Gestalt researchers tended to be quite simple, two-dimensional geometric patterns.

ACTIVITY 3.6

Now describe what you see in Figure 3.6.

FIGURE 3.6 The organizational laws of similarity and proximity.


3 GIBSON’S THEORY OF PERCEPTION

In Section 1.3 we stated that one way of classifying different approaches to perception was according to whether they were primarily bottom-up or top-down. If visual perception is based primarily around bottom-up processing, we must be capable of taking the information from the light waves that reach our eyes and refining it into a description of the visual environment. Bottom-up perception requires that the light arriving at the retina is rich in information about the environment. One bottom-up approach to perception, that of J.J. Gibson (1950, 1966), is based on the premise that the information available from the visual environment is so rich that no cognitive processing is required at all. As Gibson himself said:

When the senses are considered as a perceptual system, all theories of perception become at one stroke unnecessary. It is no longer a question of how the mind operates on the deliverances of sense, or how past experience can organize the data, or even how the brain can process the inputs of the nerves, but simply how information is picked up.

(Gibson, 1966, p.319)

If you are thinking to yourself, ‘What does picked up mean?’ or ‘How is this information picked up?’, you are expressing a criticism that is often levelled at Gibson’s theory (e.g. Marr, 1982). The Gibsonian approach concentrates on the information present in the visual environment rather than on how it may be analysed. There is a strong link between perception and action in Gibson’s theory, and action rather than the formation of an internal description of the environment can be seen as the ‘end point’ of perception.

Gibson conceptualized the link between perception and action by suggesting that perception is direct, in that the information present in light is sufficient to allow a person to move through and interact with the environment. One implication of this is that, whereas perception of a real environment is direct, perception of a 2D image in a laboratory experiment (or any 2D image come to that) would be indirect. When confronted with an image, our direct perception is that it is an image; that it is two-dimensional and printed on paper, for example. Our perception of whatever is being depicted by the image is only indirect. For this reason, Gibson thought that perception could never be fully explored using laboratory experiments.

When you look at Figure 3.7, what do you see? Your first reaction is probably to say ‘a pipe’. But if what you are seeing is a pipe, then why can’t you pick it up and smoke it? As Magritte informs us, what you are seeing is not a pipe but a picture of a pipe. Like Gibson, Magritte is drawing a distinction between direct perception (paint on canvas) and indirect perception (that the painting depicts a pipe).

FIGURE 3.7 Ceci n’est pas une pipe, 1928, by René Magritte.


3.1 An ecological approach

At the heart of Gibson’s approach to perception is the idea that the world around us structures the light that reaches the retina. Gibson believed perception should be studied by determining how the real environment structures the light that reaches our retina. From the importance placed on the ‘real world’ it is clear why Gibson called his theory of perception ecological optics. Gibson referred to theories that were based on experiments employing artificial, isolated, flat (or plane) shapes as ‘air’ theories, whilst he referred to his own as a ‘ground’ theory, as it emphasized the role played by the real, textured surface of the ground in providing information about distance. As Gibson stated: ‘A surface is substantial; a plane is not. A surface is textured; a plane is not. A surface is never perfectly transparent; a plane is. A surface can be seen; a plane can only be visualized’ (Gibson, 1979, p.35).

The impetus for Gibson’s theory came from his work training pilots to land and take off during the Second World War. When approaching a runway, it is very important that a pilot is able to judge accurately the distance between the plane and the ground. The perceptual skill involved in this judgement is that of ‘depth perception’, this being the ability to judge how far you are from an object or surface. However, Gibson found that tests based on pictorial stimuli did not distinguish good from bad pilots and that training with pictorial stimuli had little impact on actual landing performance (Gibson, 1947). Extrapolating from this problem, Gibson suggested that psychological experimentation based on the use of pictorial stimuli is not an apt method for studying perception.

His point was that the experience of perception in the real world is very different from the experience of looking at 2D experimental stimuli in a laboratory. In the real world, objects are not set against a blank background but against the ground, which consists of a very large number of surfaces that vary in their distance from and orientation to the observer. In their turn, these surfaces are not perfectly smooth planes but consist of smaller elements, such as sand, earth, and stone, which give them a textured appearance. In addition, the objects themselves will consist of real surfaces that also contain texture. To explain perception, we need to be able to explain how these surfaces and textures provide information about the world around us.

3.2 The optic array and invariant information

The structure that is imposed on light reflected by the textured surfaces in the world around us is what Gibson termed the ambient optic array. The basic structure of the optic array is that the light reflected from surfaces in the environment converges at the point in space occupied by the observer (see Figure 3.8). As you can see from Figure 3.9, as you stand up, the position of your head with respect to the environment is altered and the optic array changes accordingly.

You can see from Figures 3.8 and 3.9 that the primary structure of the optic array is a series of angles that are formed by light reflecting into the eyes from the surfaces within the environment. For example, an angle may be formed between the light that is reflected from the near edge of a table and that from the far edge.
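The table example can be made concrete with a little geometry. The sketch below is an illustrative simplification rather than anything taken from Gibson: it treats the scene as a two-dimensional side view, and the function name `angle_between_edges` and all of the measurements are hypothetical. It computes the angle at the eye between the ray from the near edge of the table and the ray from the far edge, for a seated and a standing observer.

```python
import math


def angle_between_edges(eye_height, table_height, near, far):
    """Angle (in degrees) at the eye between light reflected from the
    near edge and from the far edge of a table, in a 2D side view.

    `near` and `far` are horizontal distances from the eye to each edge.
    """
    drop = eye_height - table_height      # how far the eye is above the table
    angle_near = math.atan2(drop, near)   # ray to near edge, below horizontal
    angle_far = math.atan2(drop, far)     # ray to far edge, below horizontal
    return math.degrees(angle_near - angle_far)


# Hypothetical measurements in metres: a 0.7 m high table whose edges
# are 0.5 m and 1.5 m away, viewed seated (1.2 m) and standing (1.7 m).
seated = angle_between_edges(1.2, 0.7, 0.5, 1.5)
standing = angle_between_edges(1.7, 0.7, 0.5, 1.5)
print(round(seated, 2), round(standing, 2))
```

With these made-up numbers the angle is different for the two eye heights, which is one way of picturing how the optic array changes when the observer stands up.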

In addition to the primary structure of the optic array, Gibson maintained that there were additional, higher-order features that could provide unambiguous information as to the nature of the environment. He referred to these higher-order features as invariants and believed that an observer could perceive the surrounding world by actively sampling the optic array in order to detect invariant information.

One of the most commonly cited forms of invariant information was explored by Sedgwick (1973). Sedgwick demonstrated the ‘horizon ratio relation’, which specifies that the ratio of how much of an object is above the horizon to how much is below it remains constant (or invariant) as the object travels either toward or away from you (see Figure 3.10). This form of invariant information allows you to judge the relative heights of different objects regardless of how far away they are. The proportion of the object that is ‘above’ the horizon increases with the overall height of the object (see Figure 3.11).
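The horizon ratio relation can be checked numerically. The sketch below is only an illustration under stated assumptions: a pinhole eye looking parallel to flat ground, an eye height of 1 m, and a hypothetical function name `horizon_ratio`. None of these details come from Sedgwick (1973).

```python
def horizon_ratio(object_height, distance, eye_height=1.0, focal_length=1.0):
    """Ratio of the image extent above the horizon to the extent below it.

    Assumes a pinhole eye looking parallel to flat ground, so the horizon
    projects to image height 0 (an illustrative set-up, not Sedgwick's own).
    """
    # Image height of a point: focal_length * (height relative to the eye) / distance
    top = focal_length * (object_height - eye_height) / distance
    base = focal_length * (0.0 - eye_height) / distance
    return top / -base  # extent above the horizon / extent below it


# Same object at three distances (as in Figure 3.10): the ratio should not change.
for distance in (1.0, 10.0, 20.0):
    print(distance, round(horizon_ratio(1.8, distance), 3))

# Different heights at the same distance (as in Figure 3.11): the taller object
# should give the larger ratio.
print(round(horizon_ratio(1.8, 10.0), 3), round(horizon_ratio(1.4, 10.0), 3))
```

Under these assumptions the distance cancels out, leaving a ratio that depends only on the object's height relative to the observer's eye height.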

One of the most important forms of invariant information in Gibson’s theory is texture gradient, although he also discusses gradients of colour, intensity, and disparity. There are three main forms of texture gradient relating to the density, perspective, and compression of texture elements. The exact nature of a texture element will change from surface to surface (see Figure 3.12); in a carpet the elements are caused by the individual twists of material, while on a road they are caused by the small stones that make up the surface. In making use of texture gradients, we assume that the texture of the surface is uniform; for example, that the road surface consists of stones of similar size throughout its length. Therefore, any change in the apparent nature of the texture provides us with information regarding the distance, orientation, and curvature of the surface.

FIGURE 3.8 The ambient optic array. Source: Gibson, 1979, Figure 5.3

FIGURE 3.9 Change in the optic array caused by movement of the observer. Source: Gibson, 1979, Figure 5.4

Using texture gradients as a guide, we can tell if a surface is receding because the density of texture elements (the number of elements projected onto a given area of the image) will increase with distance. For example, the surface in Figure 3.13(a) appears to recede as the density of texture elements (the individual squares) increases toward the top of the image.
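A small numerical sketch may help to show why density increases with distance. It assumes a pinhole eye 1.6 m above flat ground and texture elements spaced evenly at 1 m intervals along the line of sight; the function names and all of the numbers are hypothetical and chosen only for illustration.

```python
def image_positions(eye_height=1.6, spacing=1.0, count=200, focal_length=1.0):
    """Image height (horizon = 0, ground below it) of evenly spaced ground elements."""
    return [-focal_length * eye_height / (spacing * k) for k in range(1, count + 1)]


def elements_per_band(positions, band_edges):
    """Count how many elements fall in each image band [low, high)."""
    return [sum(1 for y in positions if low <= y < high)
            for low, high in zip(band_edges, band_edges[1:])]


positions = image_positions()
# Three image bands of equal height, ordered from the bottom of the image
# (near the observer) up toward the horizon (far from the observer).
bands = [-0.65, -0.45, -0.25, -0.05]
counts = elements_per_band(positions, bands)
print(counts)  # counts should rise toward the horizon
```

Although the elements are evenly spaced on the ground, more and more of them fall within each equal-sized strip of the image as the strip gets closer to the horizon.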

In a similar fashion, the perspective gradient (the width of individual elements) and the compression gradient (the height of individual elements) can reveal the shape and orientation of a surface. As you can see from Figure 3.13(b), we do not see this surface as flat because the width and height of the individual texture elements changes, making the surface appear to be slanting and curved.

FIGURE 3.10 The horizon ratio relation: same height objects at different distances (1 m, 10 m, 20 m).

FIGURE 3.11 The horizon ratio relation: different height objects (1.8 m, 1.4 m) at same distance.

FIGURE 3.12 Examples of texture elements. Source: Gibson, 1979, Figure 2.1

FIGURE 3.13 (a) How texture gradient can reveal that a surface is receding; (b) How perspective and compression gradients reveal the shape and orientation of a surface.

Without texture, considerable ambiguity about shape and orientation can be introduced into the stimulus and this poses a problem for experiments that make use of planar geometric shapes (as you saw with the Necker cube in Activity 3.1). So, texture gradient is a powerful source of invariant information provided by the structure of light within the optic array. It furnishes us with a wealth of information regarding the distance, size, and orientation of surfaces in the environment.

3.3 Flow in the ambient optic array

What is clear to me now that was not clear before is that structure as such, frozen structure, is a myth, or at least a limiting case. Invariants of structure do not exist except in relation to variants.

(Gibson, 1979, p.87)

In the above quotation Gibson is highlighting the importance of another intrinsic aspect of perception that is often missing from laboratory stimuli – that of motion. His argument is that invariant information can only be perceived in relation to variant information, so to perceive invariant information we have to see the environment change over time.

There are two basic forms of movement: motion of the observer and motion of objects within the environment. Motion of the observer tends to produce the greatest degree of movement as the entire optic array is transformed (see Figure 3.9). Gibson suggested that this transformation provides valuable information about the position and shape of surfaces and objects. For example, information about shape and particularly position is revealed by a phenomenon known as motion parallax. The principle of motion parallax is that the further an object is from an observer, the less it will appear to move as the observer travels past it. Imagine the driver of a moving inter-city train looking out of their side-window at a herd of cows grazing in a large field next to the line. The cows near the train will appear to move past much faster than the cows at the back of the field. Thus, the degree of apparent motion is inversely related to the distance of the object from the observer.
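The relationship between distance and apparent motion can be illustrated with a short calculation. The sketch assumes the simplest case: a stationary object directly to the side of an observer travelling in a straight line, for which the angular speed is the observer's speed divided by the object's distance. The train speed and the cow distances are made-up values.

```python
import math


def angular_speed_deg(observer_speed, distance):
    """Angular speed (degrees per second) of a stationary object that is
    directly to the side of an observer moving in a straight line."""
    return math.degrees(observer_speed / distance)


train_speed = 40.0  # metres per second (illustrative value)
for cow_distance in (10.0, 50.0, 200.0):
    print(cow_distance, round(angular_speed_deg(train_speed, cow_distance), 1))
```

A cow ten times further away sweeps across the driver's view at one tenth of the angular speed, which is the pattern that motion parallax makes available as a cue to distance.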

A second means by which observer motion can provide information about the shape and position of objects is through occlusion. Imagine the same observer described above travelling past the same field of cows. Their motion will cause the cows nearest to the train to pass in front of, or occlude, the cows grazing further away. This allows the observer to deduce that the occluded cows (i.e. the ones that become hidden by other cows) are further away than those doing the occluding.

Gibson dealt with the motion of the observer through reference to flow patterns in the optic array. As our train driver looks at the grazing cows by the side of the track, the entire optic array will appear to flow past from left to right, assuming that the driver looks out of the right-hand window (see Figure 3.14).

When the train driver becomes bored of cow watching and returns their attention to the track in front of the train, the flow patterns in the optic array will change so that the texture elements appear to be radiating from the direction in which the train is travelling (the apparent origin of this radiating flow pattern is known as the pole). The texture elements that make up the surfaces in the environment will appear to emerge from the pole, stream toward the observer, and then disappear from view (see Figure 3.15).

This pattern would be completely reversed if the guard at the rear of the train were to look back toward the direction from which the train had come (see Figure 3.16).

Gibson proposed a set of rules that linked flow in the optic array to the movement of the observer through the environment (Gibson, 1979):

• If there is flow in the ambient optic array, the observer is in motion; if there is no flow, the observer is not moving.

• Outflow of the optic array from the pole specifies approach by the observer and inflow to the pole specifies retreat.

• The direction of the pole specifies the direction in which the observer is moving.

• A change in the direction of the pole specifies that the observer is moving in a new direction.
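These rules can be expressed as a small sketch. It is not Gibson's own formulation: it assumes a pinhole eye, a stationary environment, and an observer who translates without rotating, with the z axis pointing along the line of sight. The function names `describe_flow` and `flow_at` are hypothetical.

```python
def describe_flow(tx, ty, tz, focal_length=1.0):
    """Classify the optic flow for an observer translating with velocity
    (tx, ty, tz), where z points along the line of sight.

    Returns (description, pole), where pole is an image position or None.
    """
    if tx == 0 and ty == 0 and tz == 0:
        return "no flow: observer not moving", None
    if tz == 0:
        # Motion is parallel to the image plane, so the flow is parallel
        # and there is no pole within the image (the side-window view).
        return "parallel flow: observer moving sideways", None
    pole = (focal_length * tx / tz, focal_length * ty / tz)
    if tz > 0:
        return "outflow from pole: observer approaching", pole
    return "inflow to pole: observer retreating", pole


def flow_at(x, y, depth, tx, ty, tz, focal_length=1.0):
    """Image velocity of a stationary point seen at image position (x, y)."""
    u = (x * tz - focal_length * tx) / depth
    v = (y * tz - focal_length * ty) / depth
    return u, v


print(describe_flow(0.0, 0.0, 0.0))    # stationary observer
print(describe_flow(0.0, 0.0, 30.0))   # driver looking ahead along the track
print(describe_flow(0.0, 0.0, -30.0))  # guard looking back from the rear
print(describe_flow(30.0, 0.0, 0.0))   # looking out of a side window
```

In this sketch the flow is zero at the pole itself and points away from it when the observer approaches; a change in heading (a different ratio of tx and ty to tz) moves the pole to a new image position.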

For Gibson, the movement of the observer was a critical part of perception. In fact, he deemed it of such importance that he saw the perceptual system as not being limited to the eyes and other sense organs, but constituting a hierarchy of organs in which the eyes are linked to a head that can turn, which is linked to a body that can move. As Gibson said: ‘Perceiving is an act, not a response, an act of attention, not a triggered impression, an achievement, not a reflex’ (Gibson, 1979, p.21).

FIGURE 3.14 Flow patterns in the optic array parallel to the direction of the observer’s motion.

FIGURE 3.15 Flow patterns in the optic array in the direction of the observer’s motion.

3.4 Affordances and resonance

We began our discussion of Gibson’s theory by stating that he saw information as being directly perceived or ‘picked up’ from the environment. In his later work Gibson (1979) took this idea of information being ‘picked up’ one step further and suggested that the end point of the perceptual process was not a visual description of the surrounding world, but rather that objects directly ‘afforded’ their use.

At its simplest (and least controversial) level, the concept of affordance builds on earlier research conducted by the Gestalt psychologists, in which the features of objects were seen as providing information as to their use. For instance, the features of a rock would suggest that it could be stood upon, the features of a fallen branch that it could be picked up, and the features of a fruit that it could be eaten.

However, Gibson makes two claims regarding affordances that are rather harder to accept and have proven to be far more controversial. First, he states that affordances act as a bridge between perception and action and do not require the intervention of any cognitive processes. Just as the nature of the environment can be directly ‘picked up’ from the structure of the optic array, the observer can interact with surfaces and objects in the environment directly through affordance.

Second, Gibson saw no role for memory in perception, as the observer does not have to consult their prior experience in order to be able to interact with the world around them. Instead he states that the perceptual system resonates to invariant information in the optic array. Although the definition of ‘resonates’ and the identity of what is doing the resonating are left very vague by Gibson, the point is that ‘global’ information about the optic array (in the form of invariant information) is dealt with by the perceptual system without the need to analyse more ‘local’ information such as lines and edges.

These assertions may seem unreasonable to you, as they have done to other researchers. If we are studying psychology, then surely the cognitive processes that allow us to perceive must be one focus of our attention. In addition, if when perceiving the world we do not make use of our prior experiences, how will we ever learn from our mistakes? In the next two sections we shall turn to theories that attempt to deal with these issues and to explain exactly how the brain makes sense of the world around us.

FIGURE 3.16 Flow patterns in the optic array in the opposite direction to the observer’s motion.

However, even if Gibson’s theory does not enlighten us as to the nature of the cognitive processes that are involved in perception, it has still been extremely influential, and researchers in perception still need to bear in mind his criticisms of the laboratory approach that makes use of artificial stimuli:

Experiments using dynamic naturalistic stimuli can now be conducted, virtual scenes can be constructed, and images of brain activity while viewing these can be captured in a way that would have been difficult to envisage a century ago. However, the simulated lure of the screen (or even a pair of screens) should not blind experimenters and theorists to the differences that exist between the virtual and the real.

(Wade and Bruce, 2001, p.105)

SUMMARY OF SECTION 3

• Gibson developed an ecological approach to perception and placed great emphasis on the way in which real objects and surfaces structure light – he termed this the ambient optic array.

• He suggested that invariant information (such as texture gradient) could be ‘picked up’ from the optic array to provide cues as to the position, orientation, and shape of surfaces.

• Invariant information could also be revealed by motion, which produces variants such as flow patterns in the optic array.

• The importance of real surfaces and of motion led Gibson to suggest that perception could not be studied using artificial stimuli in a laboratory setting.

• Gibson did not see perception as a product of complex cognitive analysis, but suggested that objects could ‘afford’ their use directly.

• Interaction with the environment is at the heart of Gibson’s theory; action is seen as the ‘goal’ of perception.

4 MARR’S THEORY OF PERCEPTION

. . . the detection of physical invariants, like image surfaces, is exactly and precisely an information-processing problem, in modern terminology. And second, he (Gibson) vastly underrated the sheer difficulty of such detection . . . Detecting physical invariants is just as difficult as Gibson feared, but nevertheless we can do it. And the only way to understand how is to treat it as an information-processing problem.

(Marr, 1982, p.30)

As we stated previously, one criticism that has been levelled at Gibson’s approach is that it does not explain in sufficient detail how information is picked up from the environment. To address this problem, a theory was needed that attempted to explain exactly how the brain was able to take the information sensed by the eyes and turn it into an accurate, internal representation of the surrounding world. Such a theory was proposed by David Marr (1982).

Before we look at Marr’s theory, it is worth pointing out some of the similarities and differences between the approaches taken by Marr and Gibson. Like Gibson’s, Marr’s theory suggests that the information from the senses is sufficient to allow perception to occur. However, unlike Gibson, Marr adopted an information-processing approach in which the processes responsible for analysing the retinal image were central. Marr’s theory is therefore strongly ‘bottom-up’, in that it sees the retinal image as the starting point of perception and explores how this image might be analysed in order to produce a description of the environment. This meant that, unlike Gibson who saw action as the end point of perception, Marr concentrated on the perceptual processes involved in object recognition.

Marr saw the analysis of the retinal image as occurring in four distinct stages, with each stage taking the output of the previous one and performing a new set of analyses on it. The four stages were:

1. Grey level description – the intensity of light is measured at each point in the retinal image.

2. Primal sketch – first, in the raw primal sketch, areas that could potentially correspond to the edges and texture of objects are identified. Then, in the full primal sketch, these areas are used to generate a description of the outline of any objects in view.

3. 2½D sketch – at this stage a description is formed of how the surfaces in view relate to one another and to the observer.

4. 3D object-centred description – at this stage object descriptions are produced that allow the object to be recognized from any angle (i.e. independent of the viewpoint of the observer).

More generally, Marr concentrated his work at the computational theory and algorithmic levels of analysis (see Chapter 1) and had little to say about the neural hardware that might be involved. One reason for this is that he developed his theory largely by designing computer-based models and algorithms that could perform the requisite analyses.

4.1 The grey level description

One way of describing the first stage in Marr’s theory is to say that it gets rid of colour information. This is not because Marr thought that colour was unimportant in perception. Rather, he thought that colour information was processed by a distinct module and need not be involved in obtaining descriptions of the shape of objects and the layout of the environment. In fact, the modular nature of perception was a fundamental part of Marr’s theory:

Computer scientists call the separate pieces of a process its modules, and the idea that a large computation can be split up and implemented as a collection of parts that are as nearly independent of one another as the overall task allows, is so important that I was moved to elevate it to a principle; the principle of modular design.

(Marr, 1982, p.102)

This meant that the perception of colour could be handled by one ‘module’ and the perception of shape by another.

The first stage in Marr’s theory acts to produce a description containing the intensity (i.e. the brightness) of light at all points of the retina. A description composed solely of intensity information is referred to as ‘greyscale’, because without the information provided by analysing the wavelength of light, it will consist of nothing but different tones of grey. If you turn down the colour on your TV, the resulting picture will be a greyscale image – although we call it ‘black and white’, it actually consists of many shades of grey.

Without going into too much detail, it is possible to derive the intensity of the light striking each part of the retina because as light strikes a cell in the retina, the voltage across the cell membrane changes and the size of this change (or depolarization) corresponds to the intensity of the light. Therefore, a greyscale (or grey level) description is produced by the pattern of depolarization on the retina. In other words, it is possible to derive the greyscale description simply by analysing the outputs of the receptor cells in the retina.
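For readers who think in terms of digital images, the TV analogy can be sketched in a few lines. This is an analogy only, not a model of the retina: it assumes an image stored as rows of (red, green, blue) values from 0 to 255, and it uses the conventional broadcast-video luma weights (0.299, 0.587, 0.114), which are an assumption of this example and are not part of Marr's theory.

```python
def grey_level(rgb_image):
    """Collapse an RGB image (rows of (r, g, b) tuples, 0-255) to intensity only."""
    # Weighted sum of the three channels; the weights are the conventional
    # video luma weights, used here purely for illustration.
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb_image]


image = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (255, 255, 255)]]
print(grey_level(image))
```

The output keeps one number per point, a tone of grey, and discards the wavelength information that distinguishes red from green or blue.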

4.2 The primal sketch

The next part in Marr’s theory, the generation of the primal sketch, occurs in two stages. The first stage consists of forming a raw primal sketch from the grey level description by identifying patterns of changing intensity. Activity 3.7 may help you understand what this means.

It is possible to group changes in the intensity of the reflected light into three categories:

• Relatively large changes in intensity produced by the edge of an object.

• Smaller changes in intensity caused by the parts and texture of an object.

• Still smaller changes in intensity due to random fluctuations in the light reflected.

Marr and Hildreth (1980) proposed an algorithm that could be used to determine which intensity changes corresponded to the edges of objects, meaning that changes in intensity due to random fluctuations could be discarded. The algorithm made use of a technique called Gaussian blurring, which involves averaging the intensity values in circular regions of the greyscale description. The values at the centre of the circle are weighted more than those at the edges, with the weights following a normal (or Gaussian) distribution.
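The weighted averaging involved in Gaussian blurring can be sketched as follows. To keep the example short it works on a one-dimensional row of intensity values rather than on circular regions of a two-dimensional image, and it is an illustrative simplification rather than Marr and Hildreth's implementation. The names `gaussian_weights` and `blur`, and the choice to cut the weights off at about three standard deviations, are assumptions of the example.

```python
import math


def gaussian_weights(sigma, radius=None):
    """Normalized 1D Gaussian weights: the centre value is weighted most."""
    if radius is None:
        radius = int(3 * sigma) + 1
    raw = [math.exp(-(k * k) / (2 * sigma * sigma))
           for k in range(-radius, radius + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def blur(signal, sigma):
    """Replace each value by a Gaussian-weighted average of its neighbours.
    Positions beyond either end reuse the nearest end value."""
    weights = gaussian_weights(sigma)
    radius = len(weights) // 2
    last = len(signal) - 1
    return [sum(w * signal[min(max(i + k - radius, 0), last)]
                for k, w in enumerate(weights))
            for i in range(len(signal))]


step = [0.0] * 10 + [100.0] * 10      # a sharp change in intensity
narrow = blur(step, sigma=1.0)        # small averaging region
wide = blur(step, sigma=3.0)          # large averaging region
print([round(v) for v in narrow])
print([round(v) for v in wide])
```

Increasing `sigma` plays the role of widening the circle: the same sharp change is spread over more positions, so the blurred version rises more gradually.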

By changing the size of the circle in which intensity values are averaged, it is possible to produce a range of images blurred to different degrees. Figure 3.17 shows images that have been produced in this manner. The original (i.e. unblurred) image is shown in (a). As you can see, using a wider circle (b) produces a more blurred image than using a narrower circle (c).

Marr and Hildreth’s algorithm works by comparing images that have been blurred to different degrees. If an intensity change is visible at two or more adjacent levels of blurring, then it is assumed that it cannot correspond to a random fluctuation and must relate to the edge of an object. Although this algorithm was implemented by Marr and Hildreth on a computer, there is evidence that retinal processing delivers descriptions that have been blurred to different degrees.
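The idea of comparing adjacent levels of blurring can be sketched in one dimension. This is a deliberately simplified stand-in, not Marr and Hildreth's algorithm: instead of their two-dimensional operator it blurs a single row of intensities at two scales, marks the positions where the blurred intensity changes fastest, and keeps only the changes found at both scales. The function names, the two blur widths, the threshold, and the test signal are all hypothetical.

```python
import math


def blur(signal, sigma):
    """Gaussian-weighted average around each position (ends are clamped)."""
    radius = int(3 * sigma) + 1
    raw = [math.exp(-(k * k) / (2 * sigma * sigma))
           for k in range(-radius, radius + 1)]
    total = sum(raw)
    last = len(signal) - 1
    return [sum((w / total) * signal[min(max(i + k - radius, 0), last)]
                for k, w in enumerate(raw))
            for i in range(len(signal))]


def intensity_changes(signal, sigma, threshold):
    """Positions i where the blurred signal changes fastest between i and i + 1."""
    smooth = blur(signal, sigma)
    grad = [abs(b - a) for a, b in zip(smooth, smooth[1:])]
    return {i for i in range(1, len(grad) - 1)
            if grad[i] >= threshold
            and grad[i] >= grad[i - 1] and grad[i] > grad[i + 1]}


def edges(signal, sigmas=(1.0, 2.0), threshold=2.0, tolerance=1):
    """Keep fine-scale changes that reappear (within tolerance) at the coarser scale."""
    fine, coarse = (intensity_changes(signal, s, threshold) for s in sigmas)
    return sorted(i for i in fine
                  if any(abs(i - j) <= tolerance for j in coarse))


# A large step (an 'edge') between positions 19 and 20, plus a small
# one-position fluctuation at position 8.
signal = [50.0] * 20 + [150.0] * 20
signal[8] += 20.0
print(sorted(intensity_changes(signal, 1.0, 2.0)))  # fine scale: fluctuation and step
print(edges(signal))                                # only the step should remain
```

In this example the small fluctuation registers at the finer level of blurring but is smoothed away at the coarser level, whereas the large step is found at both and is therefore treated as an edge.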

FIGURE 3.17 Examples of Gaussian blurred images. Source: Marr and Hildreth, 1980, p.190

ACTIVITY 3.7

Find a wooden table or chair and place it where it is both well illuminated and against a light background. Describe how the intensity of the light reflected from the table/chair changes across its surface and in comparison with the background.

COMMENT

You should be able to see that the edges of the table/chair are marked by a quite large, sharp change in the intensity of the reflected light caused by the object in question being darker than the background. In addition, there are smaller changes in intensity caused by the individual parts of the table/chair and by the texture of the wood. You may also have noticed other changes in the intensity of the reflected light that did not correspond to the edge of the object, its parts, or its texture.

By analysing the changes in intensity values in the blurred images, it is possible to form a symbolic representation consisting of four primitives corresponding to four types of intensity change. Marr referred to these primitives as ‘edge-segments’, ‘bars’, ‘terminations’, and ‘blobs’. An edge-segment represented a sudden change in intensity; a bar represented two parallel edge-segments; a termination represented a sudden discontinuity; and a blob corresponded to a small, enclosed area bounded by changes in intensity. In Figure 3.18 you can see how the image shown in Figure 3.17(a) would be represented using three of these primitives, whilst Figure 3.19 shows how three simple lines would be represented in the raw primal sketch.

As you can see from Figure 3.19, although the raw primal sketch contains a lot of information about details in the image, it does not contain explicit information about the global structure of the objects in view. The next step is therefore to transform the raw primal sketch into a description, known as the full primal sketch, which contains information about how the image is organized, particularly the location, shape, texture, and internal parts of any objects that are in view.

FIGURE 3.18 Primitives used in the raw primal sketch: (a) blobs; (b) edge-segments; and (c) bars. Source: Marr, 1982, Figure 2.21, p.72

FIGURE 3.19 Representation of three simple lines in the raw primal sketch: ‘The raw primal sketch represents a straight line as a termination, several oriented segments, and a second termination (a). If the line is replaced by a smooth curve, the orientations of the inner segments will gradually change (b). If the line changes its orientation suddenly in the middle (c), its representation will include an explicit pointer to this discontinuity. Thus in this representation, smoothness and continuity are assumed to hold unless explicitly negated by an assertion’. Source: Marr, 1982, Figure 2.22, p.74

Basically, the idea is that place tokens are assigned to areas of the raw primal sketch based on the grouping of the edge-segments, bars, terminations, and blobs. If these place tokens then form a group themselves, they can be aggregated together to form a new, higher-order place token.

Imagine looking at a tiger. The raw primal sketch would contain information about the edge of the tiger’s body, but also about the edges and pattern of its stripes and the texture of its hair. In the full primal sketch, place tokens will be produced by the grouping of the individual hairs into each of the stripes. The place tokens for each stripe would then also be grouped (because they run in a consistent vertical pattern along the tiger) into a higher-order place token, meaning that there will be at least two levels of place tokens making up the tiger.

Various mechanisms exist for grouping the raw primal sketch components into place tokens and for grouping place tokens together. These include clustering, in which tokens that are close to one another are grouped in a way very similar to the Gestalt principle of proximity, and curvilinear aggregation, in which tokens with related alignments are grouped in a similar fashion to the Gestalt principle of good continuation.
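Clustering by proximity can be sketched very simply. The example below is not Marr's procedure: it assumes that each token is just a position (x, y), groups tokens that lie within a chosen distance of one another, and then stands in for each group with a single higher-order token at its centre. The names `cluster` and `place_token` and the distance limit are hypothetical.

```python
import math


def cluster(tokens, max_gap):
    """Group 2D positions so that every token lies within max_gap of at
    least one other token in its group (grouping by proximity)."""
    groups = []
    for token in tokens:
        near, far = [], []
        for group in groups:
            close = any(math.dist(token, other) <= max_gap for other in group)
            (near if close else far).append(group)
        # The new token joins, and thereby merges, every group it is close to.
        merged = [token] + [t for group in near for t in group]
        groups = far + [merged]
    return groups


def place_token(group):
    """Stand in for a whole group with one higher-order token at its centre."""
    xs, ys = zip(*group)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


tokens = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
groups = cluster(tokens, max_gap=1.5)
print(groups)
print([place_token(g) for g in groups])
```

Applying `cluster` again to the resulting place tokens would give a second level of grouping, in the same spirit as the hairs, stripes, and whole pattern of the tiger example.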

As we saw in Section 2, perceptual grouping is a robust, long-established, and powerful effect. Marr saw algorithms expressing laws such as those formulated by the Gestalt approach as being responsible for turning the ambiguous raw primal sketch into the full primal sketch in which the organization of objects and surfaces was specified.

4.3 The 2½D sketch

In Marr’s theory, the goal of early visual processing is the production of a description of the environment in which the layout of surfaces and objects is specified in relation to the particular view that the observer has at that time. Up until now we have been looking at how the shape of objects and surfaces can be recovered from the retinal image. However, in order to specify the layout of surfaces, we need to now include other information, specifically cues that tell us how far away each surface is.

Marr’s modular approach to perception means that while the full primal sketch is being produced, other visual information is being analysed simultaneously. Much of this has to do with establishing depth relations, the distance between a surface and the observer, and also how far objects extend. We saw in Section 3 that motion cues and cues from texture can be used to specify the distance to an object, and it is also possible to make use of the disparity in the retinal images of the two eyes (known as stereopsis), and shading cues that are represented in the primal sketch.

Marr proposed that the information from all these ‘modules’ was combined together to produce the 2½D sketch. It is called the 2½D sketch rather than the 3D sketch because the specification of the position and depth of surfaces and objects is done in relation to the observer. Thus, the description of an object will be viewer-centred and will not contain any information about the object that is not present in the retinal image. How the viewer-centred 2½D sketch is turned into a fully 3D, object-centred description is one of the topics dealt with in the next chapter.

Marr saw the 2½D sketch as consisting of a series of primitives that contained vectors (lines depicting both size and direction) showing the orientation of each surface. A vector can be seen as a needle, in which the direction the needle is pointing tells us in which direction the surface is facing, and its length tells us by how much the surface is slanted in relation to the observer. A cube would therefore be represented like the one shown in Figure 3.20. In addition to the information shown in Figure 3.20, Marr suggested that each vector (or needle) would have a number associated with it that indicated the distance from the observer.
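One way to picture such a primitive is as a small record holding an orientation and a distance. The sketch below is an illustration under stated assumptions rather than Marr's own representation: it stores the surface orientation as a unit vector in viewer-centred coordinates (with z pointing toward the observer) and draws the needle as that vector's projection onto the image. The class name `Needle` and its fields are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Needle:
    """One viewer-centred primitive: surface orientation plus distance."""
    normal: tuple     # unit surface normal (x, y, z); z points toward the viewer
    distance: float   # distance from the observer (illustrative units)

    def slant_deg(self):
        """Angle between the surface normal and the line of sight."""
        z = max(-1.0, min(1.0, self.normal[2]))  # guard against rounding error
        return math.degrees(math.acos(z))

    def image_needle(self):
        """The needle as drawn in the image: its direction shows which way the
        surface faces, and its length grows as the surface slants away."""
        x, y, _ = self.normal
        return (x, y), math.hypot(x, y)


# A surface facing the observer head-on, and one slanted 60 degrees to the right.
facing = Needle(normal=(0.0, 0.0, 1.0), distance=2.0)
slanted = Needle(normal=(math.sin(math.radians(60)), 0.0,
                         math.cos(math.radians(60))), distance=2.5)
for needle in (facing, slanted):
    print(round(needle.slant_deg(), 1), needle.image_needle(), needle.distance)
```

Because everything is expressed relative to the observer, the same cube would yield a different set of needles if it were viewed from another position.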

The 2½D sketch therefore provides an unambiguous description of the size, shape, location, orientation, and distance of all the surfaces currently in view, in relation to the observer.

4.4 Evaluating Marr’s approach

Marr’s theory was the catalyst for a great deal of computational and psychological research. Some of this research has reported findings consistent with the mechanisms proposed by Marr, whilst some has found that Marr’s theory does not offer a good explanation for the results obtained. We will not attempt to review every single study here, but instead describe a few studies that have tested elements of Marr’s theory.

Marr and Hildreth (1980) attempted to test their idea that the raw primal sketch was formed by searching for changes in intensity values in adjacent levels of blurring, by implementing this algorithm in a computer program. They found that when applied to images of everyday scenes the algorithm was reasonably successful in locating the edges of objects. However, as with all computer-simulation research, it is important to remember that, just because a specific program yields the expected results, it does not necessarily follow that this is what is happening in the human perceptual system.

It seems as if Marr’s approach to the formation of the full primal sketch was flawed in that it was limited to grouping strategies based on the 2D properties of an image. Enns and Rensink (1990) showed that participants could easily determine which one of a series of figures consisting of blocks was the odd one out, even though the only difference between the figures was their orientation in three dimensions. Thus, some grouping strategies must make use of 3D information.

One area in which Marr’s theory does seem to fit the results of experimentation is in the integration of depth cues in the 2½D sketch, studied in experiments that have attempted to isolate certain forms of depth cue and then determine how they interact. For example, Young et al. (1993) looked at how motion cues interacted with texture cues. They concluded that the perceptual system does process these cues separately, and will also make selective use of them depending on how ‘noisy’ they are. In other words, in forming the 2½D sketch, the perceptual system does seem to integrate different modules of depth information, but will also place more emphasis on those modules that are particularly useful for processing the current image.
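One simple way to picture this kind of selective use is a weighted average in which noisier cues count for less. The sketch below assumes that each cue supplies a depth estimate together with a measure of its noise (a variance), and it weights each cue by the inverse of that variance. This weighting rule is a common modelling assumption used here for illustration; it is not presented as the specific model tested by Young et al. (1993), and the numbers are invented.

```python
def combine_depth_cues(estimates):
    """Combine (depth, variance) pairs, weighting each cue by 1 / variance
    so that noisier cues contribute less to the final estimate."""
    weights = [1.0 / variance for _, variance in estimates]
    total = sum(weights)
    return sum(w * depth for w, (depth, _) in zip(weights, estimates)) / total


motion_cue = (2.0, 0.1)    # depth estimate in metres, low noise
texture_cue = (4.0, 10.0)  # depth estimate in metres, high noise
print(round(combine_depth_cues([motion_cue, texture_cue]), 2))

# With equally reliable cues the result is a plain average.
print(combine_depth_cues([(2.0, 1.0), (4.0, 1.0)]))
```

In the first case the combined estimate stays close to the value given by the less noisy cue, which captures the idea of placing more emphasis on whichever module is most useful for the current image.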

As well as the success of the specific processes suggested by Marr, it is also possible to evaluate his theory according to broader concepts. As we shall see in Section 6, there is evidence that there are two visual pathways in the brain that appear to process separately ‘what’ information and ‘where’ information. It seems that different perceptual processes exist according to whether the goal of perception is action or object recognition. Although Marr’s theory is a modular approach, so that different types of visual information are processed separately, it did not predict the separation of visual pathways into action and object recognition, and indeed it is hard to incorporate this into the theory (Wade and Bruce, 2001). However, although the precise nature of the processes suggested by Marr may not map exactly onto those actually used by the brain to perceive the world, the impact of Marr’s theory should not be underestimated: ‘Thus it is not the details of Marr’s theory which have so far stood the test of time, but the approach itself’ (Wade and Bruce, 2001, p.97).

FIGURE 3.20 A 2½D sketch of a cube. Source: Marr, 1982, Figure 4.2, p.278


Copyright © 2012. OUP Oxford. All rights reserved. May not be reproduced in any form without permission from the publisher, except fair uses permitted under U.S. or applicable copyright law.

SUMMARY OF SECTION 4

• Marr proposed a theory of vision that was based on bottom-up processing of information.

• His approach was to see perception as being composed of a series of stages, with each stage generating an increasingly sophisticated description.

• Marr saw the end point of the perceptual process as object recognition rather than action.

• The first stage involves producing a grey level description based on the activation of retinal cells.

• This description is analysed by blurring it to different degrees. Changes in intensity value that are present in two or more adjacent levels of blurring are assumed to correspond to the ‘edge’ of an object (or part of an object).

• The raw primal sketch is generated by assigning one of four primitives (edge-segment, bar, termination, or blob) to each change in intensity values.

• The full primal sketch is generated by using perceptual organizational principles such as clustering and similarity to group these primitives together and assign each group a place token.

• Information from different modules (such as stereopsis and motion) is combined with the full primal sketch to produce the 2½D sketch. This contains primitives consisting of vectors that reveal the distance and orientation, in relation to the observer, of the visible surfaces.

5 CONSTRUCTIVIST APPROACHES TO PERCEPTION

The previous sections of this chapter should have given you some idea of how we can see and interpret sensory information. The emphasis so far has been on ‘bottom-up’ processes (see also Activity 3.8). As discussed previously, there is also information flowing ‘top-down’ from stored knowledge. This makes intuitive sense. To be able to perceive something as ‘a bus’, you need to access stored knowledge concerning what the features of a bus actually are (big object with wheels, etc.).

Thus, what you see a stimulus as depends on what you know. This notion, that perceiving something involves using stored knowledge as well as information coming in from the senses, is embodied in an approach referred to as the constructivist approach. The approach is described as ‘constructivist’ because it is based on the idea that the sensory information that forms the basis of perception is, as we have already suggested, incomplete. It is necessary to build (or ‘construct’) our perception of the world from incomplete information. To do this we use what we already know about the world to interpret the incomplete sensory information coming in and ‘make sense’ of it. Thus stored knowledge is used to aid in the recognition of objects.

Two of the foremost proponents of the constructivist approach are Irvin Rock (1977, 1983, 1997) and Richard Gregory (1980). Gregory suggested that individuals attempt to recognize objects by generating a series of perceptual hypotheses about what that object might be. Gregory conceptualized this process as being akin to how a scientist might investigate a problem by generating a series of hypotheses and accepting the one that is best supported by the data (in perception, ‘data’ would be the information flowing ‘up’ from the senses).


We are forced to generate hypotheses, according to Gregory’s argument, because the sensory data are incomplete. If we had perfect and comprehensive sensory data we would have no need of hypotheses as we would know what we perceived. Stored knowledge is assumed to be central to the generation of perceptual hypotheses as it allows us to fill in the gaps in our sensory input. The influence of stored knowledge in guiding perceptual hypotheses can be demonstrated by the use of impoverished figures such as the one in Figure 3.22 (Street, 1931).

At first glance this picture may be difficult to perceive as anything other than a series of blobs. So the resulting hypothesis might be that it is just ‘a load of blobs.’ If, however, you are told that it is a picture of an ocean liner (coming towards you, viewed from water level) then the picture may immediately resolve into an image of an ocean liner. The sensory information has not changed, but what you know about it has, allowing you to generate a reasonable hypothesis of what the figure represents. Similarly, in the example used in Activity 3.3 of trying to identify an object by touch alone, if you are given some clues about the function of the object (i.e. your knowledge related to the object is increased), it is likely to be easier to identify it.

FIGURE 3.22 An example of an impoverished figure. Source: Street, 1931

ACTIVITY 3.8

Look back at Activity 3.1. Can you explain any of the visual illusions in terms of what you now know about the bottom-up approach to perception?

COMMENT

Gibson would tell us that the Necker cube is a geometric figure that contains none of the information (particularly texture gradients) that we would usually use when perceiving an object. Marr’s theory can help us to explain Kanizsa’s illusory square, as the four areas of intensity change corresponding to the missing parts of the circles would be grouped together to form a square.

But what about the Müller–Lyer illusion? There are a number of alternative explanations for this illusion, one of which is that we group each vertical line with its set of arrowheads to form a single object. This of course results in the object with the inward-pointing arrowheads being larger than the one with the outward-pointing arrowheads; basically, due to perceptual grouping we cannot separate the vertical line from the overall size of the object. However, as the Müller–Lyer illusion is reduced if the straight arrowheads are replaced with curved lines (see Figure 3.21), it could be that we also need to look at an explanation based on top-down perception.

FIGURE 3.21 The original Müller–Lyer illusion (a), and with curved arrowheads (b).

As we saw in Activity 3.1, another explanation of the Müller–Lyer illusion is that we make use of top-down information and see the outward-pointing arrowheads as an indication that the vertical line is nearer to us than the line with the inward-pointing arrowheads.

The use of knowledge to guide our perceptual hypotheses may not always lead to a ‘correct’ perception. There are some stimuli with which we are so familiar (such as faces) that there can be a strong bias towards accepting a particular perceptual hypothesis, resulting in a ‘false’ perception. For instance, look at the faces in Figure 3.23.

This is the mask of Hor, an Egyptian mummy. The first view is the mask from the front and the second two are of the back of the mask. Although the face viewed from the back is ‘hollow’ it still appears perceptually as a normal face. Our knowledge of how a face is supposed to look is (according to Gregory, 1980) so strong that we cannot accept the hypothesis that a face could be ‘hollow.’ This effect is interesting in that it provides an example of a perceptual hypothesis conflicting with what Gregory terms ‘high-level’ knowledge. You know at a conceptual level that the mask is hollow, yet you still perceive it as a ‘normal’ face. This, as Gregory suggests, represents a tendency to go with the most likely hypothesis. The Penrose triangle (Penrose and Penrose, 1958) in Figure 3.24 demonstrates a similar point.

It would be impossible to construct the object in Figure 3.24 so that the three sides of the triangle were joined. At one level, we ‘know’ that this must be true. Yet whichever corner of the triangle we attend to suggests a particular 3D interpretation. Our interpretation of the figure changes as our eyes (or just our attention) jump from corner to corner. These data-supported interpretations, or hypotheses, tend to overwhelm the conceptual knowledge that we are viewing a flat pattern.
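Gregory’s idea of choosing the best-supported perceptual hypothesis can be made concrete with a small sketch. Gregory described the process informally; the Bayesian scoring used here (stored knowledge as a prior, fit to the sensory data as a likelihood) is an assumption introduced for illustration, and every name and number in it is hypothetical. It shows how the same sensory data can yield a different percept once knowledge changes, as with the ocean liner, and how a very strong expectation can outweigh the data, as with the hollow mask.

```python
def best_hypothesis(priors, likelihoods):
    """Return the hypothesis with the highest prior x likelihood, plus all posteriors."""
    scores = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(scores.values())
    posterior = {h: s / total for h, s in scores.items()}
    return max(posterior, key=posterior.get), posterior

# Impoverished figure: the blobs fit both hypotheses equally well (assumed values).
fit_to_data = {"load of blobs": 0.5, "ocean liner": 0.5}
before_hint, _ = best_hypothesis({"load of blobs": 0.9, "ocean liner": 0.1}, fit_to_data)
after_hint, _ = best_hypothesis({"load of blobs": 0.2, "ocean liner": 0.8}, fit_to_data)

# Hollow mask: the data favour 'hollow', but faces are almost always convex.
mask_percept, mask_posterior = best_hypothesis(
    priors={"normal face": 0.99, "hollow face": 0.01},
    likelihoods={"normal face": 0.2, "hollow face": 0.8},
)
```

Before the hint the winning hypothesis is ‘load of blobs’; after it, ‘ocean liner’, although `fit_to_data` is unchanged. For the mask, ‘normal face’ wins despite the data favouring ‘hollow face’. The sketch does not address the open questions raised in the text, such as where the candidate hypotheses come from.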

Although the constructivist approach in general, and Gregory’s theories in particular, provide an attractive explanatory framework for perception, there are areas of the theory (as there were with Gibson’s approach) that are left rather vague. For instance, how do we actually generate hypotheses and how do we know when to stop and decide which is the ‘right’ one? Why does knowledge sometimes but not always help perception? How can we ‘know’ that a perception is wrong and yet still perceive it wrongly (as with the hollow face)? Although these are difficult questions to answer, progress is being made in explaining how human perception may be based, at least in part, on constructivist principles; some of this work will be discussed below.

FIGURE 3.23 The mask of Hor.

FIGURE 3.24 An impossible triangle. Source: Penrose and Penrose, 1958

Thus, there appears to be evidence that perceptions of the outside world can be ‘constructed’ using information flowing ‘up’ from the senses combined with knowledge flowing ‘down’. However, this seems to be in direct contrast to the theories of Gibson and Marr discussed earlier, which suggest that there is no need to use stored knowledge to interpret the information flowing in from the senses. Indeed, the impossible triangle above shows that we do not always make use of knowledge that may be relevant and available. So, just how important is knowledge to the process of perception, and is there any way in which we can reconcile theories of perception that see knowledge as being essential with those that see it as unnecessary? The following section considers how these different theories may be reconciled through consideration of the way in which the brain processes sensory information.

SUMMARY OF SECTION 5

• What you see a stimulus as depends on what you know. This means that perception must involve top-down processing.

• The constructivist approach to perception is based on the idea that sensory data are often incomplete, so a description can only be constructed by making use of stored knowledge.

• Gregory suggested that sensory data are incomplete and we perceive by generating a series of perceptual hypotheses about what an object might be.

• The use of stored information can lead to perceptual hypotheses that are inaccurate, which is why we may be fooled by some visual illusions.

6 THE PHYSIOLOGY OF THE HUMAN VISUAL SYSTEM

There appear to be at least two (and maybe more) partially distinct streams of information flowing back from the retina (via the optic nerve) into the brain (e.g. Shapley, 1995). The characteristics of these streams and their relation to the theories of perception already described are the topic of this section. It should be emphasized that the distinction between the two streams is fairly loose. There is overlap in the types of information that the streams carry and there are numerous interconnections between them, but they may conveniently be conceptualized as distinct. The following subsections trace these streams of information from the retina to the brain.

6.1 From the eye to the brain

You may remember from Section 1.2 that there are two types of light-sensitive cells in the retina, called rods and cones. Both rods and cones are connected to what are termed retinal ganglion cells that essentially connect the retina to the brain. Ganglion cell axons leave the eye via the ‘blind spot’ (the concentration of blood vessels and nerve axons here means that there is no room for any receptors, hence this region is ‘blind’). These cells then project (send connections) to an area termed the lateral geniculate nucleus (LGN), and from there to the area of the brain known as the ‘primary visual cortex’ (also known as V1). Even at the level of retinal ganglion cells, there is evidence of two distinct streams or ‘pathways’, referred to as the parvocellular pathway and the magnocellular pathway (e.g. Shapley, 1995). These names derive from the relative sizes of the cells in the two pathways, with larger cells in the magnocellular pathway and smaller cells in the parvocellular one. This distinction is maintained up to and within the primary visual cortex, although there are interconnections between the two pathways.


Information travelling onward from the primary visual cortex is still maintained in two distinct streams (see Figure 3.25). One stream, leading to the inferotemporal cortex, is termed the ventral stream, and the other, leading to the parietal cortex, is known as the dorsal stream (these were described briefly in Chapter 2, Section 5.1).

6.2 The dorsal and ventral streams

The ventral stream projects to regions of the brain that appear to be involved in pattern discrimination and object recognition, whilst the dorsal stream projects to areas of the brain that appear to be specialized for the analysis of information about the position and movement of objects. Schneider (1967, 1969) carried out work with hamsters that suggested that there were two distinct parts of the visual system, one system concerned with making pattern discriminations and the other involved with orientation in space. Schneider suggested that one system is concerned with the question, ‘What is it?’, whereas the other system is concerned with the question, ‘Where is it?’. This, and later work (Ungerleider and Mishkin, 1982), led to the ventral pathway being labelled a ‘what’ system, and the dorsal pathway a ‘where’ system.

Although the two streams appear to be specialized for processing different kinds of information, there is ample evidence of a huge degree of interconnection between the systems at all levels. Also, the streams appear to converge in the prefrontal cortex (Rao et al., 1997), although there is still some evidence that the dorsal–ventral distinction is maintained (Courtney et al., 1996). It has been suggested that it is in the prefrontal cortex that meaning is associated with the information carried by the two streams.

Although describing the two streams as ‘what’ and ‘where’ is convenient, there is a large body of work that suggests that the distinction is not quite that straightforward. For instance, Milner and Goodale (1995) report a number of studies with a patient, DF, who suffered severe carbon monoxide poisoning that appeared to prevent her using her ventral system for analysing sensory input. She could not recognize faces or objects, or even make simple visual discriminations such as between a triangle and a circle. She could draw objects from memory but not recognize them once she had drawn them. DF did, however, appear to have an intact dorsal stream. Although unable to tell if two discs were of the same or different widths (or even indicate the widths by adjusting the distance between her fingers), if she was asked to pick the discs up then the distance between her index finger and thumb as she went to pick them up was highly correlated with the width of the discs. In other words, she did not have size information available to conscious perception (via the ventral stream), but it was available to guide action (via the dorsal stream).

Norman (2002), following on from similar suggestions by Bridgeman (1992) and Neisser (1994), has drawn on the ongoing debate concerning the characteristics of the dorsal and ventral streams and suggested a dual-process approach. In this approach, the two streams are seen as acting synergistically so that the dorsal stream is largely concerned with perception for action and the ventral stream essentially concerned with perception for recognition. The dual-process approach is supported by some of the characteristics of the two streams (Norman, 2001, 2002):

1. There appears to be evidence (Goodale and Milner, 1992; Ungerleider and Mishkin, 1982) to suggest that the ventral stream is primarily concerned with recognition whilst the dorsal stream drives visually-guided behaviour (pointing, grasping, etc.).

2. The ventral system is generally better at processing fine detail (Baizer et al., 1991) whereas the dorsal system is better at processing motion (Logothetis, 1994).

3. The studies on patient DF (Milner and Goodale, 1995) suggest that the ventral system is knowledge-based and uses stored representations to recognize objects, whilst the dorsal system appears to have only very short-term storage available (Bridgeman et al., 1997; Creem and Proffitt, 1998).

4. The dorsal system receives information faster than the ventral system (Bullier and Nowak, 1995).

5. A limited amount of psychophysical evidence suggests that we are much more conscious of ventral than of dorsal stream functioning (Ho, 1998).

6. It has been suggested (Goodale and Milner, 1992; Milner and Goodale, 1995) that the ventral system recognizes objects and is thus object-centred. The dorsal system is presumed to be used more in driving some action in relation to an object and thus uses a viewer-centred frame of reference (this distinction arises again in the next chapter).

FIGURE 3.25 The dorsal and ventral streams. (Diagram labels: primary visual cortex, parietal cortex, inferotemporal cortex, dorsal stream, ventral stream.)

6.3 The relationship between visual pathways and theories of perception

We have already seen that Gibson’s approach to perception concentrated more on perception for the purposes of action, whilst Marr’s theory was principally concerned with object recognition. The constructivist approach is also more concerned with perception for recognition than perception for action, as it concentrates on how we may use existing knowledge to work out what an object might be. Although these approaches have their differences, it is undoubtedly the case that we need to both recognize objects and perform actions in order to interact with the environment. It could be, then, that the type of perception discussed by Gibson is principally subserved by the dorsal system, whilst the ventral system is the basis for the recognition approach favoured by Marr and the constructivists.

For example, Gibson’s notion of ‘affordance’ emphasizes that we might need to detect what things are for rather than what they actually are. That is, affordances are linked to actions (‘lifting’ or ‘eating’, for example). The dorsal system appears to be ideally suited to providing the sort of information we need to act in the environment. In addition, if a system is to be used to drive action, it really needs to be fast, as the dorsal stream seems to be.

The earlier discussion of Gibson’s ecological approach also stated that Gibson saw no need for memory in perception. Certainly, one of the characteristics of the dorsal stream is that it appears to have no more than a very short ‘memory’ (at least for representations of objects). Thus, there appear to be some grounds for suggesting that the dorsal stream is Gibsonian in operation.

In contrast, the ventral stream appears to be ideally suited to the role of recognizing objects. It is specialized in analysing the sort of fine detail that Marr saw as essential to discriminating between objects, and it also seems able to draw on our existing knowledge (top-down information) to assist in identifying them. In addition, it is slower than the dorsal stream, but then recognizing what an object may be is not necessarily an immediate priority. For example, knowing that an object is moving towards you quickly is initially more important than knowing what it is.

6.4 A dual-process approach?

Norman’s proposal discussed above does provide an attractive way of reconciling two of the classic approaches to visual perception. There is perhaps a danger, however, in trying to ‘shoehorn’ what is known about the dorsal and ventral streams into the framework provided by previous theories. Given that both the constructivist and Gibsonian theories are rather vague on how the processes they describe could be implemented, it is questionable how useful they are as a theoretical framework in which to interpret the workings of the dorsal and ventral streams. Attempting to explain the streams in the light of the previous theories does tend to emphasize the way in which they work separately rather than the way in which they work together. Undoubtedly, the two streams can operate independently (as demonstrated by the case of DF discussed earlier), but this is rather like saying that you can take the steering column out of a car and both the car and the steering wheel will still function to some degree! In fact, Norman (2002) describes the two streams as synergistic and interconnected, rather than independent.

Binsted and Carlton (2002), in a commentary on the proposal put forward by Norman, provide an illustration of the interaction between the dorsal and ventral streams using the example of skill acquisition. Previous work (Fitts, 1964) suggests that the early stages of learning a skill (such as driving) are characterized by cognitive processes of the sort associated with the ventral stream, whereas once the task is well practised it is characterized by learned motor actions of the sort associated with the dorsal stream.

The question is, if these two streams function in such different ways, how is learning transferred from one to the other? It is possible, of course, that learning occurs in both streams at the same time and that whichever is most effective ‘leads’ in performance of the task, but this still implies a high degree of interaction between them and a blurring of the boundaries between their functions. The issue (which is as yet unresolved) then becomes whether the two streams interact to such an extent that it is meaningless to consider them to be functionally separate and representative of different theoretical approaches to visual processing (as Norman suggests). Thus, rather than questioning whether both Gibsonian and constructivist principles are operating in visual processing, the debate centres on whether it is appropriate to ascribe these types of processing to discrete pathways. Whatever the outcome of the debate, Norman does present a compelling argument that visual processing does not have to be either for action or for recognition; it can be both.

6.5 Applying perceptual research: a case study of an ‘aircraft proximity event’

One of the themes that runs throughout this chapter is the importance of motion in visual perception. In Section 3.3, two basic types of motion are discussed: motion of the observer and motion of objects in the environment. An indication of how we detect change in our environment comes from studies such as that of Beck et al. (2005), where repetitive transcranial magnetic stimulation was used to disrupt activity in the right parietal cortex (a part of the dorsal stream). They found that when the activity of the dorsal stream was disrupted, there was also a disruption in the ability to detect changes in a visual stimulus.

One of the (many) roles the dorsal stream has developed could, therefore, be to detect changes in the environment generated by the motion of approaching objects (such as sabre tooth tigers or oncoming cars). Once an oncoming object is detected, it is then necessary for the observer to assess what it is and whether it may be a threat, and this is where the ventral stream comes into play.

A bit of introspection will also tell us that we are inherently very good at detecting sudden changes in our environment, such as when something suddenly moves (or if there is a sudden new noise). Think of how many times you have suddenly become aware of something moving ‘out of the corner of your eye’ (it’s usually a spider!). One thing that we might deduce from this is that if something could creep up on us without appearing to move, then it should be much more difficult to detect. This sounds impossible, but can happen if the motion of the observer and the motion of the object effectively cancel each other out. This is precisely how some air accidents occur, when pilots are relying on visual detection of approaching aircraft.

If two aircraft are flying on converging flight paths at constant speeds, as shown in Figure 3.26, the bearing of the aircraft relative to each other (i.e. the angle between them) will remain constant. This means that there appears to be no relative motion of one aircraft with respect to the other. If the pilots are relying on visual ‘see and avoid’ principles, then an increase in perceived size of the other aircraft will be the only clue that it is getting closer. This lack of relative motion is recognized as a contributory factor in some air accidents and near misses. An example of this was an ‘aircraft proximity event’ (aka ‘near miss’) at Bankstown Aerodrome in New South Wales (ATSB, 2009). In this example, two aircraft (a Chieftain and a Cherokee) were approaching the airport on converging tracks and neither appeared aware of the other until the Chieftain pilot took ‘severe avoiding action’ at the last moment and narrowly managed to avoid a collision. The pilot of the Cherokee also became aware of the other aircraft, but only 1–2 seconds before the possible collision, and was unable to react to the situation. Although, theoretically, these two aircraft should have been able to see each other, the lack of relative motion (as indicated in Figure 3.26) made it much more difficult for the pilots to perceive the potential collision hazard. The investigation into the incident cited as a key finding that, ‘Both aircraft would have appeared to be stationary objects in each pilot’s visual field, decreasing the likelihood that the respective pilots would sight the other aircraft’ (ATSB, 2009, p.6).
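The constant-bearing geometry can be checked with a short calculation. The sketch below is illustrative only: the positions, speeds, wingspan and function name are hypothetical and are not taken from the ATSB report. It places two aircraft on straight tracks that meet at the same point at the same time, then computes, from the first aircraft’s point of view, the bearing, distance and angular size of the second at several moments before the meeting point.

```python
import math

def view_from_a(a_start, a_vel, b_start, b_vel, t, wingspan_m=12.0):
    """Bearing (deg), distance (m) and angular size (deg) of B as seen from A at time t."""
    ax, ay = a_start[0] + a_vel[0] * t, a_start[1] + a_vel[1] * t
    bx, by = b_start[0] + b_vel[0] * t, b_start[1] + b_vel[1] * t
    dx, dy = bx - ax, by - ay
    heading = math.atan2(a_vel[0], a_vel[1])   # A's direction of travel
    # Bearing relative to A's heading (no wrap-around handling in this sketch).
    bearing = math.degrees(math.atan2(dx, dy) - heading)
    distance = math.hypot(dx, dy)
    angular_size = math.degrees(2.0 * math.atan(wingspan_m / (2.0 * distance)))
    return bearing, distance, angular_size

# Hypothetical geometry: both aircraft would reach the point (0, 3600) after 60 s.
a_start, a_vel = (0.0, 0.0), (0.0, 60.0)        # metres, metres per second
b_start, b_vel = (3000.0, 0.0), (-50.0, 60.0)

views = [view_from_a(a_start, a_vel, b_start, b_vel, t) for t in (0, 15, 30, 45, 55)]
bearings = [v[0] for v in views]
distances = [v[1] for v in views]
sizes = [v[2] for v in views]
```

In this example the bearing stays at 90 degrees throughout while the distance falls from 3000 m to 250 m, so the other aircraft does not drift across the visual field. Only `sizes` changes, and it grows slowly until the aircraft are close, which matches the point that increasing perceived size is the only available clue.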

This aircraft proximity event also illustrates another aspect of perception embodied by the constructivist approach; that registering sensory information does not guarantee that it will make sense. The sensory information may be registered perfectly, but the resulting hypothesis as to what it means may be wrong. In the above incident, air traffic control (ATC) attempted to advise the Cherokee to take avoiding action using the phrase, ‘. . . widen out to the left . . .’. Obviously the air traffic controller knew what was meant by that instruction (the Cherokee was being instructed to turn to the left in a wider circle). This was not, however, understood by the (relatively inexperienced) pilot of the Cherokee, who failed to take the avoiding action indicated by the ATC.

So, in ‘real life’ just as in the lab, perception is not always perfect. Luckily in this case there was no collision, although the pilot of the Chieftain did report that ‘. . . the sudden evasive manoeuvre caused the pilot’s head to hit the roof of the cockpit . . .’!

6.6 Combining bottom-up and top-down processing

As we have shown, approaches to perception can be differentiated according to whether they are primarily concerned with perception for action or recognition, or with bottom-up or top-down processing. It may have occurred to you when reading about these approaches that it is likely that perception must in fact contain elements of both types of processing. A key question, then, is whether there is any evidence that this is in fact the case.

You were introduced to the idea of visual masking in the last chapter, particularly the concept of backward masking, in which the presentation of a second image disrupted the perception of an initial image. In Figure 3.27 you can see sets of stimuli that have been used to demonstrate two different types of visual masking. In each case, the mask is presented after a very brief presentation of the target. The task facing the participant is to report which corner of the diamond target is missing.

FIGURE 3.26 Two aircraft approaching on straight paths, at a constant speed, and on a collision course. Each plane would not appear to be moving relative to the other, making detection less likely. (Diagram labels: collision; relative bearing; relative bearings are the same.)

Standard explanations of why masking occurs with the stimuli in Figure 3.27 require that the mask contains contours that either overlap (Figure 3.27(a)) or exactly coincide with (Figure 3.27(b)) those of the target (Enns and Di Lollo, 2000). But if masking is a product of the close similarity between the contours of target and mask, it is hard to account for the fact that a masking effect is also found for the images in Figure 3.28 (Di Lollo et al., 1993).

Enns and Di Lollo (1997) reported that the four-dot pattern shown in Figure 3.28 appeared to mask the target if target and mask were presented together and the target displayed very briefly, or if the mask was displayed very soon after a brief presentation of the target. Enns and Di Lollo (2000) explained the masking observed using the four-dot pattern by reference to re-entrant processing. We know from neuroscience research that communication between two different regions of the brain is never unidirectional. If one region is sending a signal to another, then the second region also sends a signal back through what are referred to as re-entrant pathways (Felleman and Van Essen, 1991).

Hupe et al. (1998) suggested that re-entrant pathways could be used to allow the brain to check a perceptual hypothesis against the information in an incoming signal. In other words:

• Bottom-up processing produces a low-level description.

• This is used to generate a perceptual hypothesis at a higher level.

• Using re-entrant pathways, the accuracy of the perceptual hypothesis is assessed by comparing it with the (perhaps now changed) low-level description.

Di Lollo et al. (2000) used this idea as the basis for an explanation of visual masking. The idea is that each part of the displayed image(s) is perceived in terms of a combination of high-level descriptions similar to a perceptual hypothesis, and low-level codes produced by bottom-up processes. If the target is only presented very briefly, then masking can occur because by the time the high-level perceptual hypothesis is compared with the low-level bottom-up description, the target will have been replaced by the mask. Thus, the perceptual hypothesis will be rejected because it is based on a pattern (the target) that is different from the pattern currently being subjected to bottom-up processing (the mask) – see Figure 3.29.
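To make the logic of this account concrete, here is a toy sketch in Python. It is not a model taken from Di Lollo et al. (2000); it simply walks through the three steps listed above for a single trial. The names `Frame`, `low_level_description`, and `reentrant_check` are hypothetical, the 80 ms re-entrant delay is an arbitrary illustrative value rather than an empirical estimate, and the sketch assumes that a low-level description persists when nothing replaces it on the display. It covers only the case in which the mask follows the target, not the case in which target and mask appear together.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Frame:
    """One display in a trial (hypothetical structure for this sketch)."""
    pattern: str      # e.g. "diamond" (target) or "four_dots" (mask)
    duration_ms: int  # how long the pattern stays on screen


def low_level_description(display: List[Frame], t_ms: int) -> str:
    """Bottom-up code at time t_ms: the pattern currently on screen.

    Assumption: if nothing replaces the last frame, its description persists.
    """
    if not display:
        raise ValueError("display must contain at least one frame")
    elapsed = 0
    for frame in display:
        elapsed += frame.duration_ms
        if t_ms < elapsed:
            return frame.pattern
    return display[-1].pattern


def reentrant_check(display: List[Frame], delay_ms: int = 80) -> Tuple[str, bool]:
    """Return (hypothesis, confirmed) for one trial.

    delay_ms is an assumed, illustrative delay before the re-entrant comparison.
    """
    # 1. Bottom-up processing produces a low-level description at display onset.
    initial = low_level_description(display, 0)
    # 2. That description is used to generate a higher-level perceptual hypothesis.
    hypothesis = initial
    # 3. The hypothesis is compared with the (perhaps now changed) description.
    current = low_level_description(display, delay_ms)
    return hypothesis, hypothesis == current


if __name__ == "__main__":
    trials = {
        "brief target then mask": [Frame("diamond", 30), Frame("four_dots", 300)],
        "long target then mask": [Frame("diamond", 200), Frame("four_dots", 300)],
        "brief target alone": [Frame("diamond", 30)],
    }
    for name, display in trials.items():
        print(name, reentrant_check(display))
```

In this sketch the hypothesis ‘diamond’ is rejected only when the mask has replaced the target before the comparison takes place, which is the situation the re-entrant account uses to explain masking of a briefly presented target.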

The re-entrant processing explanation of visual masking is based upon the presumed interaction of bottom-up processes with top-down processes. This is consistent with the idea that perception is neither entirely bottom-up nor entirely top-down, but is actually reliant on both forms of processing. However, as the visual system is very complicated, it is likely that as well as re-entrant processing there are other processes involving top-down and bottom-up interactions contributing to four-dot masking (Gellatly et al., 2010; Pilling and Gellatly, 2010).

FIGURE 3.27 Stimuli used to demonstrate backward masking. In each of panels (a) and (b), a brief presentation of the target is replaced by the mask.

FIGURE 3.28 An example of a four-dot mask (target and mask shown).


SUMMARY OF SECTION 6

• There appear to be at least two partially distinct but interconnected streams of information flowing back from the retina to the primary visual cortex.

• From here, a ventral stream leads to the inferotemporal cortex and a dorsal stream to the parietal cortex.

• There is evidence that the ventral stream may be involved in perception for recognition and the dorsal stream in perception for action.

• Thus the dorsal stream would be better at dealing with the type of perception dealt with by Gibson and the ventral stream with the type of perception dealt with by Marr and the constructivist approach.

• Enns and Di Lollo’s (2000) re-entrant processing explanation of backward masking was based on a combination of bottom-up and top-down perception.

FIGURE 3.29 The re-entrant processing explanation of backward masking. The figure shows three steps: (1) Participant forms low-level description of target using bottom-up processing. (2) Participant forms the perceptual hypothesis ‘The image is of a diamond’. (3) Participant checks the perceptual hypothesis against the current low-level description – but this is now of the four circles. The hypothesis is therefore rejected.

7 CONCLUSION

We started this chapter by promising to show you just how complex even the perception of simple objects can be. We hope you now have some idea of these complexities and of the problems that face any potential theory of visual perception. You have also seen how rich the field of perception is. There are many influential theories that have had a profound impact on both our understanding of perception and the way we approach cognitive psychology more generally. For example, Gibson showed us the importance of considering how we interact with the real world and Marr demonstrated the advantages of the modular approach to information processing. We have also seen that although these different theories may seem contradictory at first glance, it could well be that they are all describing vital but different aspects of the perceptual process, which achieve different goals and are dealt with by different parts of the brain. So, next time you are hunting in vain for your keys, do not be too hard on yourself. Remember all the computations, descriptions, and hypotheses that your brain is having to process in order to perceive the environment around you.