This page began as a paper I wrote during the summer 2003 semester at Ivy Tech State College, Terre Haute campus, where I was a student in Cathy Alsman's Psychology class. One of our assignments was to write a paper on a subject of our choosing, and I chose depth perception. This page is an expanded version of that paper. You can access the original paper here.
That optical illusions seem to trick the eye isn't due to any defect in our eye/brain co-ordination but is testament to the processing that our brain does to make sense of the visual world around us. Despite all the research done into Artificial Intelligence and robotic vision, there isn't yet a system that can match us in visual alacrity.
In order to understand how optical illusions work it's best to have some understanding of how we perceive the world around us.
Depth perception is an important advantage for humans and other binocular animals. Not only does it give us an accurate sense of where objects are in relation to one another but also where we stand in relation to those same objects. Although there are monocular clues to depth perception, our binocular vision makes coordination between hand and eye much more accurate. A third of our brain is devoted to vision in one form or another. This page will show how our brain uses the clues given by our eyes to produce an accurate view of the world that surrounds us.
Most mammals and some other groups of animals can be divided into two classes: the browsers and the hunters. Hunters have forward-facing eyes. The extra information relayed to the brain from the two overlapping images from the eyes gives extra clues to how far away prey or any other object is. Browsers need good all-round vision to detect a threat. This is why cows, sheep and horses have eyes situated on the sides of their heads. This does not mean they lack depth perception. There are good monocular clues as to the relative positions of objects. It’s just that binocular vision is better.
At birth, babies have poor eyesight. They are unable to focus properly and the eyes are sometimes completely uncoordinated. This is because, although they have the physical structures needed to see, they haven’t learned to use them yet. At two months they are starting the process of turning the individual images seen by their eyes into a single image. At five months, full binocular vision is achieved. They can see in full color and have learned to like some colors above others. They can now also focus on objects farther away and can pick up objects. But it isn’t until they are around six months old that they begin to have depth perception. It may take a further four years until the brain can process the information it receives from the eyes accurately enough to have full depth perception. (CVIN; Lucile)
The now famous "Visual Cliff" experiments of the 1960s by Eleanor Gibson and Richard Walk showed that babies six months old would not venture over a drop covered by glass. (Gibson & Walk, 1960) More recent studies have shown that this isn’t the whole story. Babies over nine months old placed on the glass-covered drop have an increased heart rate, perhaps showing that they are frightened. Babies less than six months old actually show a decreased heart rate. Other experiments show that the sight of their smiling mother on the other side of the drop will encourage the toddlers to move across it, overriding their fear. (Talaris, 2002)
Risking the Visual Cliff
Image: O’Neill, Catherine, (1987), You Won’t Believe Your Eyes! (page 67) Washington D.C: National Geographic Society
The brain takes the information received from the eyes and combines it into a single image in a process called stereopsis.
Image: Optometrists Network, Stereo Vision. Retrieved June 13, 2003
There are believed to be around thirty areas in the brain used to process visual information, taking up one third of the brain. Researchers using Magnetic Resonance Imaging (MRI) at Massachusetts General Hospital have so far identified fifteen of them. (Mapping Vision, 1995) Information from the eyes is sent along the optic nerves. In an area called the Optic Chiasm, the nerves join and then separate. At this junction, information from the left visual field of both eyes is sent to the right side of the brain and vice versa. The signals then travel to an area in the thalamus called the lateral geniculate nucleus (LGN). This appears to be a relay station that amplifies the visual signal before transmitting it to the temporal lobes. (Montgomery) Research done at the California Institute of Technology shows that the temporal lobes are used for extracting motion clues from what we are seeing, along with some other visual perception. (Anderson, 1998) The primary visual cortex, located in the occipital lobes, is responsible for most of our visual processing.
The Visual Path
Image: Montgomery, G. Howard Hughes Medical Institute, The urgent need to use both eyes.
Retrieved June 13, 2003
There are many monocular clues for depth perception, all of which tell the brain where one object is in relation to another. Linear Perspective is the apparent convergence of parallel lines; for example, a road or railway tracks leading towards the horizon.
Linear Perspective, Size Consistency, Relative Height & Texture Gradient
Image: Rock, Irvin, Perception, (page 19) New York: Scientific American Library
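The convergence of railway tracks described above can be sketched with a simple pinhole-camera projection. This is only an illustration, not a model of the eye: the function names, the eye height of 1.6 m, the focal length of 1.0 and the standard rail gauge of 1.435 m are all assumptions chosen for the example.

```python
def project(x, z, eye_height=1.6, focal_length=1.0):
    """Project a point on the ground onto a flat image plane.

    x is the sideways offset in metres, z the distance ahead in metres.
    eye_height and focal_length are illustrative assumptions.
    """
    u = focal_length * x / z            # horizontal image position
    v = -focal_length * eye_height / z  # vertical position; 0 is the horizon
    return u, v


def rail_separation(z, gauge=1.435):
    """Apparent gap between two parallel rails at distance z (assumed gauge)."""
    left_u, _ = project(-gauge / 2, z)
    right_u, _ = project(gauge / 2, z)
    return right_u - left_u


if __name__ == "__main__":
    for z in (5, 50, 500):
        gap = rail_separation(z)
        _, v = project(0.0, z)
        print(f"{z:>4} m ahead: rail gap {gap:.4f}, height in image {v:.4f}")
```

The rails stay the same distance apart on the ground, but their projected gap shrinks in proportion to 1/z, so they appear to meet at the horizon.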
It was during the fifteenth century that linear perspective came into use in western art. Pictures made before this lacked "depth" and looked flat or distorted. This is exactly how we would perceive the world without linear perspective depth clues. Linear perspective is used in conjunction with other monocular clues such as size consistency and relative height.
Art without perspective – the Bayeux Tapestry (circa 1066)
The figures are all on one plane; the only clue to depth is interposition.
Size Consistency stems from the fact that objects do not usually change size over short periods of time. Thus for an object that we are familiar with, the larger it appears, the closer to us it is. Mistakes can be made. Clouds, for example, can be any shape and size, and so their distance from us is hard to judge. (Mayer, 2003) Other mistakes can happen. Although most people have a passing familiarity with trains, they don’t realize just how big they are. (Kardas, 2003) As a consequence we can’t judge their speed very accurately, and this gives rise to around 3,000 accidents in the United States annually. (Fry, 2000)
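The relationship between apparent size and distance can be illustrated with the standard visual-angle formula, angle = 2 × atan(size / (2 × distance)). The sketch below is a geometric illustration only; the function names and the example sizes are hypothetical, and it does not claim to describe how the brain actually performs the estimate.

```python
import math


def visual_angle_deg(size_m, distance_m):
    """Angle (in degrees) that an object of a given size subtends at the eye."""
    return math.degrees(2 * math.atan(size_m / (2 * distance_m)))


def distance_from_angle(assumed_size_m, angle_deg):
    """Distance implied by a visual angle if the object has the assumed size."""
    return assumed_size_m / (2 * math.tan(math.radians(angle_deg) / 2))


if __name__ == "__main__":
    # A familiar object looks smaller the farther away it is.
    for d in (5, 10, 20):
        print(f"1.8 m object at {d} m: {visual_angle_deg(1.8, d):.2f} degrees")

    # If the assumed size is wrong, the distance estimate is wrong too.
    angle = visual_angle_deg(4.0, 100.0)  # object is really 4 m tall
    print(f"judged distance: {distance_from_angle(2.0, angle):.1f} m")
```

The second part shows why unfamiliar objects are hard to place: the same visual angle fits a small object nearby or a large one far away, so a wrong assumption about size leads directly to a wrong estimate of distance.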
In these two images the sets of lines that seem to converge towards some horizon give them an appearance of depth.
We are so used to using this as a visual clue that in the first image it is hard to realize that the two vertical red bars are actually the same size. In the second, the "people" have simply been pasted into different positions on the image, yet it is hard to realize that they aren't increasing in size the further "into" the image they are.
One of the simplest Linear Perspective illusions, first noticed by Mario Ponzo in 1913
The horizontal bars are actually the same size
Relative Height is the phenomenon where objects furthest away are higher in our visual field. For airborne objects, such as clouds, the reverse is true. Those furthest away are lower in our visual field. In general, it can be said that the closer to the level of the horizon, the farther away an object appears. (Krantz, 2003)
M. C. Escher used relative height to great effect in his 1961 lithograph "The Waterfall". The aqueduct carrying the water follows a zigzag course. Judged by relative height, there is nothing wrong with this; the aqueduct is simply receding into the distance. It is his positioning of the far end of it that confounds us. He produces a waterfall that returns the water to the near end of the aqueduct. This is often the case with optical illusions or "trick" pictures. There is usually nothing wrong with the individual elements of an image; it is the way they are put together that fools us.
The Waterfall, M. C. Escher, 1961
Texture Gradient is another monocular clue. The closer something is to us, the more detail and texture can be seen. As the distance increases the amount of texture lessens until it looks uniform. (Gibson, 1950)
Aerial or Atmospheric Perspective is caused by the scattering of light in the atmosphere by small particles or vapor. Blue light, which has a shorter wavelength than other colors, is scattered more than the other colors. This scattering causes distant objects to appear slightly hazy and bluish in color. This is also why mountains appear much closer on clear, dry days.
Aerial or Atmospheric Perspective
Light is scattered by the atmosphere, so distant objects seem hazy and bluish
Image: Kolb, Helga; Fernandez, Eduardo & Nelson, Ralph (2003), The Perception of Depth,
Webvision, Retrieved July 19, 2003
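As a rough illustration of why blue light is scattered more, the sketch below assumes Rayleigh scattering, in which the scattered intensity is proportional to 1 / wavelength^4. That law applies to particles much smaller than the wavelength of light (mainly air molecules); larger particles such as dust or water droplets scatter differently. The wavelengths used (450 nm for blue, 550 nm for green, 700 nm for red) are representative values chosen for the example, and the function name is hypothetical.

```python
def relative_scattering(wavelength_nm, reference_nm=700.0):
    """Rayleigh scattering strength relative to a reference wavelength.

    Assumes intensity is proportional to 1 / wavelength**4.
    """
    return (reference_nm / wavelength_nm) ** 4


if __name__ == "__main__":
    for name, wavelength in (("blue", 450.0), ("green", 550.0), ("red", 700.0)):
        ratio = relative_scattering(wavelength)
        print(f"{name:>5} ({wavelength:.0f} nm): {ratio:.2f}x the scattering of red")
```

Under these assumptions blue light is scattered several times more strongly than red, which fits the bluish haze seen over distant hills.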