Thursday, December 20, 2012

The vision science of 48fps

Given the recent release of Peter Jackson's The Hobbit in High Frame Rate (HFR) 3D, there has been a lot of discussion about the pros and cons of moving from the entrenched 24fps to 48fps (or even the 60fps proposed by James Cameron). I recently weighed in on the vision science behind the perception of higher frame rates for Tested here:

The article is a nice summary of the topics the journalist and I discussed, but his personal dislike for HFR overshadows several of my points about why I think the move to 48fps or higher is necessary and will become the standard in cinema. To understand the problems with the current 24fps filming and projection process, you need to understand how we are able to see a rapidly presented series of still images as a continuous moving sequence. Here is a passage from an encyclopaedia entry I wrote on film perception a few years ago:

Smith, T.J. (2010) Film (Cinema) Perception. In E.B. Goldstein (Ed.), The Sage Encyclopedia of Perception.

"Movies consist of a series of still images, known as frames projected on to a screen at a rate of 24 frames per second. Even though the frames are stationary on the screen and are momentarily blanked as a new frame replaces the old we experience film as a continuous image containing real motion. The two perceptual phenomena contributing to this experience are persistence of vision and apparent motion. Persistence of vision refers to the continued activation of visual neurons after visual stimulation has been removed. During film projection the light is obscured as the frame is changed. If this only happened 24 times a second (Hz) there would be a noticeable flicker. To avoid this flicker each frame is blanked three times by a shutter. This creates a presentation rate above the critical flicker fusion rate of 60Hz. Above this rate persistence of vision ensures that the blank is masked by continued activation of visual neurons and we perceive the projected image as continuous.

The motion we perceive in film is apparent because it is based on static visual information not real motion. It is commonly believed that the apparent motion perceived in films is beta movement. Beta movement is perceived when a simple object such as a line is alternately presented at two different locations around 10 times a second. The two lines are perceived as a single line moving smoothly between the two locations. Due to the slow rate of presentation and the large distances covered, long-range apparent motions such as beta movement are thought to be processed late in the visual system and require inferences based on knowledge of real motion and the most likely correspondences between objects in the image sequence.

Beta movement, along with other long-range motion phenomena such as apparent rotations and transformations may occur during film perception but they cannot account for the majority of motion perceived in film. The 24Hz presentation rate used in film is too fast for long-range motion and film frames are too complex, making the task of identifying corresponding objects in subsequent frames very difficult. Instead, apparent motion in film is due to the same short-range motion system used to detect real motion. Motion detectors in the early visual system respond in the same way to the retinal stimulation caused by real motion and by rapidly presented (>13Hz) static images that depict only slight differences in object location. This processing occurs very early in our visual system and does not require perceptual inferences. The directness with which film is processed results in an experience of motion that is indiscernible from real-motion."

As you can see, there are two processes involved that allow us to see a series of frames as motion: persistence of vision and apparent motion. Frames need to alternate faster than ~60 times per second (i.e. 60Hz) if we are going to perceive constant luminance, i.e. not perceive a flicker. Old film projectors got close to or above this threshold by using a shutter to present each frame twice (=48Hz) or three times (=72Hz). Modern digital projectors don't have a shutter, as the image is constantly present and there is no need to wait for the next frame to be registered in front of the lens, so instead they present each frame three times ("triple flash" = 72Hz). This is sufficient to remove the flicker, but when we move to stereoscopic 3D digital projection we encounter a problem with the amount of light presented during each frame. Most 3D projectors (such as RealD) alternate the left/right eye images, with each being presented at 24fps (24fps x 2 eyes = 48Hz). Each of these left or right images is subsequently flashed three times, creating a total flicker rate for a stereo 3D movie of 144Hz (72Hz per eye)! This ensures that we don't see the flicker in either eye even though they are alternately blind to the image.

Unfortunately, due to the circular polarisation needed to ensure only the left image is seen by the left eye and the right image by the right eye, the amount of light reaching the viewer's eyes is significantly less than in a traditional 2D presentation. This creates a murkier image and makes it harder to perceive apparent motion, as our eyes cannot create the correspondence between moving objects in each frame. This problem is exacerbated by the film being photographed at 24fps per eye. Moments of high camera or object motion create motion blur in the image as the camera's shutter is open too long. This motion blur makes the edges of objects hard to locate and decreases our perception of apparent motion, making the image appear to jump across the screen instead of flowing smoothly. Given that this motion stuttering is happening alternately between the two eyes, it is difficult for our visual system to fuse the 3D image, resulting in a loss of depth perception and eye strain.
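To make the link between frame rate and motion blur concrete, here is a rough sketch in Python. It assumes a 180-degree shutter (the shutter is open for half of each frame interval) and a simple model where blur length is just object speed multiplied by exposure time. Both are illustrative assumptions of mine, not details of how any particular film was shot, and the function names are made up for the example.

```python
def exposure_time_s(fps: float, shutter_angle_deg: float = 180.0) -> float:
    """Time the shutter is open per frame, in seconds.

    Assumes a rotary-shutter model: the shutter is open for
    (shutter_angle / 360) of each frame interval (1 / fps).
    """
    return (shutter_angle_deg / 360.0) / fps


def blur_length_px(speed_px_per_s: float, fps: float,
                   shutter_angle_deg: float = 180.0) -> float:
    """Approximate smear of a moving edge within one frame, in pixels.

    Simplified model: distance travelled while the shutter is open.
    """
    return speed_px_per_s * exposure_time_s(fps, shutter_angle_deg)


if __name__ == "__main__":
    speed = 960.0  # hypothetical object speed across the frame, pixels per second
    for fps in (24, 48):
        print(f"{fps}fps: exposure {exposure_time_s(fps):.4f}s, "
              f"blur {blur_length_px(speed, fps):.1f}px")
```

Under these assumptions, doubling the frame rate halves the time the shutter is open and therefore halves the smear on a moving edge, which is the sense in which higher frame rates give crisper edges for the motion system to work with.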

The solution to both problems of light loss and motion stutter in a stereo 3D movie is to increase the frame rate. I'm not sure whether the new HFR/48fps projectors use a double or triple flash, but whichever they use the rate of presentation per eye will exceed the critical flicker fusion rate (double flash = 96Hz; triple flash = 144Hz per eye). Because each frame is a sharper image with less motion blur, the left and right images registered by our eyes will be brighter, clearer and easier to fuse in depth to perceive 3D. Camera and object motion will be clearer as we are better able to perceive apparent motion between the crisper edges of objects, and the overall effect should be less cognitive load on the viewer and less eye strain.
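The flash-rate arithmetic in this post is simple enough to check in a few lines of Python. This is only a sketch of the sums: the ~60Hz figure is the approximate critical flicker fusion rate quoted above (the real threshold varies with viewing conditions), and the function and variable names are my own for illustration.

```python
CFF_HZ = 60  # approximate critical flicker fusion rate quoted in this post


def flash_rates(fps_per_eye: int, flashes_per_frame: int, eyes: int = 1):
    """Return (per-eye presentation rate, total presentation rate) in Hz.

    Each frame is flashed `flashes_per_frame` times. In alternating-eye
    stereo projection (eyes=2), left and right images take turns, so the
    projector's total rate is the per-eye rate times the number of eyes.
    """
    per_eye_hz = fps_per_eye * flashes_per_frame
    total_hz = per_eye_hz * eyes
    return per_eye_hz, total_hz


if __name__ == "__main__":
    cases = [
        ("24fps 2D, double flash", 24, 2, 1),
        ("24fps 2D, triple flash", 24, 3, 1),
        ("24fps 3D, triple flash", 24, 3, 2),
        ("48fps 3D, double flash", 48, 2, 2),
        ("48fps 3D, triple flash", 48, 3, 2),
    ]
    for label, fps, flashes, eyes in cases:
        per_eye, total = flash_rates(fps, flashes, eyes)
        status = "above" if per_eye > CFF_HZ else "not above"
        print(f"{label}: {per_eye}Hz per eye, {total}Hz total "
              f"({status} ~{CFF_HZ}Hz)")
```

Note that at 48fps even a double flash puts each eye at 96Hz, comfortably over the ~60Hz figure, which is why the choice between double and triple flash matters less at the higher frame rate.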

The bizarre irony of Peter Jackson's decision to move to 48fps in an attempt to get 3D cinema closer to reality is that it has revealed the artificiality of The Hobbit. As I say in the Tested article, like the move from SD to HD, the increase in information on the screen makes the imperfections of the image easier to see. The move to 48fps may not be increasing the spatial resolution of the image, but by increasing the temporal resolution (i.e. frame rate) it makes each pixel easier to see and each face prosthetic and matte backdrop easier to notice. Suspension of disbelief is harder in the quiet sequences at the beginning of The Hobbit, and it is only when the action picks up in the final act that the higher frame rate and 3D really gel. Many reviewers have reported growing used to the 48fps as the movie progresses and have noted that the chase sequences at the end of the movie are easier to see, more fluid and result in less eyestrain than typically experienced in 3D movies. It is only when Jackson presents a combination of filmed live-action, sets and digital characters or backdrops together on the screen at the same time, and gives the viewer time to interrogate the image, that viewers seem to have an issue with the higher frame rate. We would only really know the impact of 48fps on filmgoer experience by performing a controlled psychological test on audiences. Viewers would have to be naive to which frame rate presentation they were seeing, and various aspects of their experience of the film would have to be monitored. Only then could we see if it actually had an impact on their experience without any pre-existing bias against it or resistance to new technologies getting in the way.

Personally, I believe the creative potential of stereo 3D is massive and only starting to be tapped with movies like Scorsese's Hugo and (apparently, although I'm yet to see it) Ang Lee's Life of Pi. If higher frame rates encourage more filmmakers to experiment with 3D without having to worry about viewer eye strain and discomfort, I think it is a great step forward.

p.s. Merry Christmas :)

Tuesday, December 11, 2012

Sight & Sound video essay

Kevin B. Lee (@alsolifelike) posted a video essay for Sight & Sound on the evolution of Paul Thomas Anderson's steadicam work, which discusses my eyetracking work on There Will Be Blood and the DIEM project here. You can view the video here:

The video essay discusses the careful use of staging and choreography to introduce the viewer to spaces and characters critical to several of Anderson's films, including Hard Eight and Boogie Nights. The use of steadicam in all of these sequences varies from the bravura long and complex sequences of Boogie Nights and Magnolia to subtle uses in There Will Be Blood and Punch-Drunk Love. This analysis reveals the many ways in which steadicam can create affinity or conflict between what the viewer wants to see and how the camera moves relative to the characters in the scene. This affinity was clearly demonstrated in my analysis of the eye movement behaviour during the table-top sequence of There Will Be Blood (as posted on David Bordwell's blog here). By choreographing the camera moves to natural attentional cues such as dialogue switches, character movements and the introduction of characters from the side of the frame, the filmmaker can make a reliable prediction about where most viewers are likely to be attending.

As discussed in the video essay, eyetracking film viewers gives us a direct line to the viewer experience of a film and can be used to validate filmmaker intentions for such sequences. It also provides us with ways to test hypotheses about how production decisions can influence the resulting viewer experience. With the decreasing cost of steadicam equipment and digital production in general, the use of such techniques is becoming more and more common. But what this video essay and my eyetracking research show is that such sequences will result in viewer disorientation and confusion unless they are carefully designed with viewer sequential attention in mind. For example, our recent paper in Journal of Vision shows how viewer attention can be altered for the same sequence of close-up shots just by excluding audio. I have recently reviewed the influence of such factors and compositional decisions in general in a journal article and book chapter:

Smith, T. J. (in press) Watching you watch movies: Using eye tracking to inform cognitive film theory. In A. P. Shimamura (Ed.), Psychocinematics: Exploring Cognition at the Movies. New York: Oxford University Press.

Smith, T. J. (2012) The Attentional Theory of Cinematic Continuity, Projections: The Journal for Movies and the Mind. 6(1), 1-27. (pdf)

There have been very few empirical studies looking specifically at the influence of steadicam shots on gaze behaviour, but one recent study by Wang and colleagues showed how powerful such shots could be for creating similarity in gaze across viewers. Using long steadicam clips from Russian Ark and Children of Men, the authors showed that introducing cuts and scrambling the order of frames within the steadicam sequences disrupted gaze behaviour, but the control of each shot over viewer attention was so strong that viewers were able to very quickly reorient to the disordered sequences and re-attend to the centre of interest. This study shows that by using a carefully choreographed steadicam shot, the director can give the viewer the illusion of freedom to roam a continuous shot whilst actually constraining where they look and when, creating continuity of attention within the frame and across viewers.