Thursday, December 20, 2012

The vision science of 48fps

Given the recent release of Peter Jackson's The Hobbit in High Frame Rate (HFR) 3D there has been a lot of discussion about the pros and cons of moving from the entrenched 24fps to 48fps (or even the 60fps proposed by James Cameron). I recently weighed in on the vision science behind the perception of higher frame rates for Tested here:

The article is a nice summary of the topics the journalist and I discussed, but his personal dislike for HFR overshadows several of my points about why I think the move to 48fps or higher is necessary and will become the standard in cinema. To understand the problems with the current 24fps filming and projection process, you need to understand how we are able to see a rapidly presented series of still images as a continuous moving sequence. Here is a passage from an encyclopaedia entry I wrote on film perception a few years ago:

Smith, T.J. (2010) Film (Cinema) Perception. In E.B. Goldstein (ed.) The Sage Encyclopedia of Perception.

"Movies consist of a series of still images, known as frames, projected onto a screen at a rate of 24 frames per second. Even though the frames are stationary on the screen and are momentarily blanked as a new frame replaces the old, we experience film as a continuous image containing real motion. The two perceptual phenomena contributing to this experience are persistence of vision and apparent motion. Persistence of vision refers to the continued activation of visual neurons after visual stimulation has been removed. During film projection the light is obscured as the frame is changed. If this only happened 24 times a second (Hz) there would be a noticeable flicker. To avoid this flicker each frame is blanked three times by a shutter. This creates a presentation rate above the critical flicker fusion rate of 60Hz. Above this rate persistence of vision ensures that the blank is masked by continued activation of visual neurons and we perceive the projected image as continuous.

The motion we perceive in film is apparent because it is based on static visual information not real motion. It is commonly believed that the apparent motion perceived in films is beta movement. Beta movement is perceived when a simple object such as a line is alternately presented at two different locations around 10 times a second. The two lines are perceived as a single line moving smoothly between the two locations. Due to the slow rate of presentation and the large distances covered, long-range apparent motions such as beta movement are thought to be processed late in the visual system and require inferences based on knowledge of real motion and the most likely correspondences between objects in the image sequence.

Beta movement, along with other long-range motion phenomena such as apparent rotations and transformations, may occur during film perception but cannot account for the majority of motion perceived in film. The 24Hz presentation rate used in film is too fast for long-range motion and film frames are too complex, making the task of identifying corresponding objects in subsequent frames very difficult. Instead, apparent motion in film is due to the same short-range motion system used to detect real motion. Motion detectors in the early visual system respond in the same way to the retinal stimulation caused by real motion and by rapidly presented (>13Hz) static images that depict only slight differences in object location. This processing occurs very early in our visual system and does not require perceptual inferences. The directness with which film is processed results in an experience of motion that is indiscernible from real motion."

As you can see, there are two processes that allow us to see a series of frames as motion: persistence of vision and apparent motion. Frames need to alternate faster than ~60 times per second (i.e. Hz) if we are to perceive constant luminance, i.e. not perceive a flicker. Old film projectors reached this threshold by using a shutter to present each frame twice (=48Hz) or three times (=72Hz). Modern digital projectors don't need a shutter, as the image is constantly present and there is no film frame to pull into place in front of the lens; instead they present each frame three times ("triple flash" = 72Hz). This is sufficient to remove the flicker, but when we move to stereoscopic 3D digital projection we encounter a problem with the amount of light presented during each frame. Most 3D projectors (such as RealD) alternate the left/right eye images, with each eye's stream presented at 24fps (24fps x 2 eyes = 48Hz). Each left or right image is then flashed three times, creating a total flicker rate for a stereo 3D movie of 144Hz (72Hz per eye)! This ensures that we don't see the flicker in either eye, even though each eye is alternately blind to the image.

Unfortunately, due to the circular polarisation needed to ensure that only the left image is seen by the left eye and the right image by the right eye, the amount of light reaching the viewer's eyes is significantly less than in a traditional 2D presentation. This creates a murkier image and makes it harder to perceive apparent motion, as our eyes cannot establish the correspondence between moving objects in each frame. The problem is exacerbated by the film being photographed at 24fps per eye. Moments of high camera or object motion create motion blur in the image because the camera's shutter is open too long. This motion blur makes the edges of objects hard to locate and decreases our perception of apparent motion, making the image appear to jump across the screen instead of flowing smoothly. Given that this motion stuttering happens alternately between the two eyes, it makes it difficult for our visual system to fuse the 3D image, resulting in a loss of depth perception and eye strain.

The solution to both problems, light loss and motion stutter in a stereo 3D movie, is to increase the frame rate. I'm not sure whether the new HFR/48fps projectors use a double or triple flash, but either way the rate of presentation per eye will exceed the critical flicker fusion rate (double flash = 96Hz per eye; triple flash = 144Hz per eye). Because each frame is a sharper image with less motion blur, the left and right images registered by our eyes will be brighter, clearer and easier to fuse in depth to perceive 3D. Camera and object motion will be clearer as we are better able to perceive apparent motion between the crisper edges of objects, and the overall effect should be less cognitive load on the viewer and less eye strain.
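The frame-rate arithmetic above can be sketched in a few lines of Python (a toy calculation only; the function and scheme names are mine, and the ~60Hz threshold is the approximate figure used above, which in reality varies with screen luminance):

```python
CFF_HZ = 60  # approximate critical flicker fusion rate discussed above

def per_eye_flicker(fps, flashes_per_frame):
    """Flash rate reaching each eye (Hz).

    In alternating-eye stereo (e.g. RealD) each eye still receives its own
    fps x flashes; the projector itself runs at twice this rate overall."""
    return fps * flashes_per_frame

schemes = {
    "film, double-bladed shutter": per_eye_flicker(24, 2),  # 48 Hz
    "film, triple-bladed shutter": per_eye_flicker(24, 3),  # 72 Hz
    "3D digital, triple flash":    per_eye_flicker(24, 3),  # 72 Hz/eye (144 Hz at screen)
    "HFR 3D, double flash":        per_eye_flicker(48, 2),  # 96 Hz
    "HFR 3D, triple flash":        per_eye_flicker(48, 3),  # 144 Hz
}

for name, hz in schemes.items():
    # The true fusion threshold depends on luminance, which is why 48 Hz
    # double-shutter projection was acceptable in dim cinemas.
    verdict = "above" if hz >= CFF_HZ else "below"
    print(f"{name}: {hz} Hz per eye ({verdict} ~{CFF_HZ} Hz)")
```

Whatever the flash scheme, the key quantity is the per-eye rate, not the raw capture rate: 48fps capture comfortably clears the fusion threshold even with a double flash.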

The bizarre irony of Peter Jackson's decision to move to 48fps in an attempt to get 3D cinema closer to reality is that it has revealed the artificiality of The Hobbit. As I say in the Tested article, like the move from SD to HD, the increase in information on the screen makes the imperfections of the image easier to see. The move to 48fps may not increase the spatial resolution of the image, but by increasing the temporal resolution (i.e. frame rate) it makes each pixel easier to see and each facial prosthetic and matte backdrop easier to notice. Suspension of disbelief is harder in the quiet sequences at the beginning of The Hobbit and it is only when the action picks up in the final act that the higher frame rate and 3D really gel. Many reviewers have reported growing used to the 48fps as the movie progresses and have noted that the chase sequences at the end of the movie are easier to see, more fluid and result in less eyestrain than is typically experienced in 3D movies. It is only when Jackson presents a combination of filmed live-action, sets and digital characters or backdrops together on screen at the same time, and gives the viewer time to interrogate the image, that viewers seem to take issue with the higher frame rate. We would only really know the impact of 48fps on filmgoer experience by performing a controlled psychological test on audiences. Viewers would have to be naive as to which frame rate they were seeing, and various aspects of their experience of the film would have to be monitored. Only then could we see if it actually had an impact on their experience, without any pre-existing bias against it or resistance to new technologies getting in the way.

Personally, I believe the creative potential of stereo 3D is massive and only starting to be tapped, with movies like Scorsese's Hugo and (apparently, although I'm yet to see it) Ang Lee's Life of Pi. If higher frame rates encourage more filmmakers to experiment with 3D without having to worry about viewer eye strain and discomfort, I think it is a great step forward.

p.s. Merry Christmas :)

Tuesday, December 11, 2012

Sight & Sound video essay

Kevin B. Lee (@alsolifelike) posted a video essay for Sight & Sound on the evolution of Paul Thomas Anderson's steadicam work which discusses my eyetracking work on There Will Be Blood and the DIEM project here. You can view the video here:

The video essay discusses the careful use of staging and choreography to introduce the viewer to spaces and characters critical to several of Anderson's films, including Hard Eight and Boogie Nights. The use of steadicam in these sequences varies from the bravura long and complex takes of Boogie Nights and Magnolia to subtler uses in There Will Be Blood and Punch-Drunk Love. This analysis reveals the many ways in which steadicam can create affinity or conflict between what the viewer wants to see and how the camera moves relative to the characters in the scene. This affinity was clearly demonstrated in my analysis of eye movement behaviour during the table-top sequence of There Will Be Blood (as posted on David Bordwell's blog here). By choreographing camera moves to natural attentional cues such as dialogue switches, character movements and the introduction of characters from the side of the frame, the filmmaker can make a reliable prediction about where most viewers are likely to be attending.

As discussed in the video essay, eyetracking film viewers gives us a direct line to the viewer's experience of a film and can be used to validate filmmaker intentions for such sequences. It also provides us with ways to test hypotheses about how production decisions can influence the resulting viewer experience. With the decreasing cost of steadicam equipment and of digital production in general, the use of such techniques is becoming more and more common. But what this video essay, and my eyetracking research, shows is that such sequences will result in viewer disorientation and confusion unless they are carefully designed with the viewer's sequential attention in mind. For example, our recent paper in the Journal of Vision shows how viewer attention can be altered for the same sequence of close-up shots just by excluding audio. I have recently reviewed the influence of such factors, and of compositional decisions in general, in a journal article and book chapter:

Smith, T. J. (in press) Watching you watch movies: Using eye tracking to inform cognitive film theory. In A. P. Shimamura (Ed.), Psychocinematics: Exploring Cognition at the Movies. New York: Oxford University Press.

Smith, T. J. (2012) The Attentional Theory of Cinematic Continuity. Projections: The Journal for Movies and the Mind, 6(1), 1-27. (pdf)

There have been very few empirical studies looking specifically at the influence of steadicam shots on gaze behaviour, but one recent study by Wang and colleagues showed how powerful such shots can be for creating similarity in gaze across viewers. Using long steadicam clips from Russian Ark and Children of Men, the authors showed that introducing cuts and scrambling the order of frames within the steadicam sequences disrupted gaze behaviour, but the control of each shot over viewer attention was so strong that viewers were able to very quickly reorient to the disordered sequences and re-attend to the centre of interest. This study shows that by using a carefully choreographed steadicam shot, the director can give the viewer the illusion of freedom to roam a continuous shot whilst actually constraining where they look and when, creating continuity of attention within the frame and across viewers.

Tuesday, October 16, 2012

Guest post: Camera Views of Candidates’ Debates Could Play Key Role in Winning Style

This week our blog hosts a guest post from Lester Loschky (Cognitive Psychologist) on the recent US presidential and VP debates and how subtle directorial decisions may impact our impressions of the candidates. (Tim J. Smith)

Camera Views of Candidates’ Debates Could Play Key Role in Winning Style
by Lester Loschky

I will make a claim that many people may find counter-intuitive: The camera views of the US Presidential candidates in their debates could prove important in determining who “wins” those debates.  But before you close your browser window on this seemingly crazy idea, read on, and see if you don’t find it more persuasive.  There is a lot of research, and a lot of punditry that backs it up.

Those following the current US Presidential election campaign know that the impact of the Presidential debates has assumed greater importance than any in recent memory. President Obama's poor performance relative to Governor Mitt Romney in their first debate apparently led to his losing a commanding 5-point lead in the general election polls in the period of a week.

In addition, most of the commentary on that debate has suggested that it was the "style" of each candidate that was particularly important. The importance of style is consistent with what has been said about other important US Presidential debates of the past. For example, Richard Nixon's sweating and five o'clock shadow compared to JFK's cool demeanor in the 1960 debates, and Al Gore's superior-seeming sighs compared to W's folksy manner, have both been credited with influencing the outcomes of their respective elections.

I would like to point to one particular element of style that was very apparent in the first Obama/Romney debate, namely eye contact with the camera. Howard Kurtz noted: "stylistically, Romney came on strong, showing a confident command of facts and figures even as he tried to moderate or distance himself from some of his proposals. He also made direct eye contact with the camera while Obama often seemed to be looking down [emphasis added], never adjusting his intensity and acting like he was at a garden-variety news conference" (Howard Kurtz, Oct 3, 2012 10:35 PM EDT). Thus, Obama's lack of eye contact with the camera during the debate may have been a factor in his losing it.

Importantly, this issue also came up in the Vice Presidential debate between Vice President Biden and Congressman Ryan. However, in that debate it can be argued that it was due to the ABC Debate Director's decision as to which views of each candidate to show to the TV audience. Specifically, the camera views in the Biden versus Ryan Closing Statements favored Ryan. Biden was shown looking at the wrong camera for the entire 1:19 of his final remarks, but Ryan was shown looking at the right one (see the YouTube embedded video clip below and a couple of screen captures from the clip).

This is odd, because there is a red light on the camera you are supposed to look at, and Biden must know this very well. So why was Biden looking at the wrong camera? It seems implausible that Biden could not see the red light on the camera he was supposed to look at, or that he intentionally looked at the wrong camera, or that he chose to address his comments to the chair of the debate. Most importantly, the ABC Debate Director in the control room was the person who ultimately chose how Biden was presented to the national TV audience. If this were due to a lapse of attention by the Director, or by the person below the Director in charge of pressing the button that selects the camera view shown to the TV audience, then it was an extremely long lapse of attention, since the camera shot on Biden lasted for 79 seconds (i.e., 1:19) at the single most important (final) portion of the debate. However, we can assume that the Director of the debate in the control room was a consummate professional, since s/he was chosen as Director for this very high-stakes debate. Thus, we can also assume that it was not a simple mistake due to a lapse of attention. This means that it had to have been a conscious decision. If so, it is a big problem.

Specifically, research has shown that failure to make eye contact reduces the likeability of a person (Mason, Tatkow et al., 2005) and makes a speaker less persuasive (Yokoyama & Daibo, 2012). Thus, the Debate Director's choice of camera view for Biden's closing statement made him less likeable and persuasive (he wouldn't look you in the eyes), and made Ryan more likeable and persuasive (he looked you in the eyes). Again, assuming this was not a simple mistake, for the reasons given above, it falls into the realm of a "plausibly deniable" political "dirty trick" of the sort that Richard Nixon's staff was famous for in the Watergate scandal.

Of course, one could argue that the camera view choice was a small thing, affecting only 1:19 of the Vice Presidential debate, and common wisdom says such things will not change the course of an election. The counter-argument is that the V.P. debate was argued to be critical in determining the momentum of the Presidential election campaign, and that the Closing Statement is the last thing viewers see in the debate and should therefore be the most memorable. This is based on the extremely well-known "recency effect", which research has shown also affects long-term memory for things such as memory for US presidents (e.g., name all the US presidents you can remember in reverse chronological order; most people's memory is best for the most recent Presidents) (Roediger & Crowder, 1976).

More importantly, what if the same "mistake" happens tonight in President Obama's or Governor Romney's closing statement? These simple directorial decisions may impact our perception of each candidate in subtle ways that cumulatively affect our overall confidence in them and their politics.

Lester Loschky
Cognitive Psychologist

Friday, April 13, 2012

UCLA Visual Narrative workshop 20-22nd June 2012

I'll be presenting my research on film cognition and eye movements as part of UCLA's workshop on Visual Narrative, June 20-22nd 2012. The workshop will present a wonderful array of approaches to understanding the nature of narrative in visual media including film, TV, comic books and on-line visual media. Other presenters include Elisabeth Camp (Philosophy, U. of Pennsylvania), Dorit Abusch (Linguistics, Cornell), Elsi Kaiser (Linguistics, USC), Matthew Stone (Computer Science, Rutgers), and George Wilson (Philosophy, USC).

Register now at the workshop website:

Tuesday, April 10, 2012

Real|Reel article

Chloe Penman (@ideaswithlegs) has written a summary of a presentation I gave at Bristol Vision Institute back in January and posted it on the on-line journal, Real|Reel here.

Chloe does a better job at succinctly summarising some of the key aspects of my Attentional Theory of Cinematic Continuity (AToCC) than I think I could. She also uses some great video demonstrations of some of the key editing techniques (Match-Action, Jump Cuts, 180 Degree Rule) to elegantly expand her points.

If you are interested in reading about AToCC in more detail or related areas of film cognition please check out my recent publications:

  • Smith, T. J., Levin, D. T. & Cutting, J. (2012) A Window on Reality: Perceiving Edited Moving Images. Current Directions in Psychological Science, 21, 101-106. doi:10.1177/0963721412436809 (print version) (preprint)
  • Smith, T. J. (2012) The Attentional Theory of Continuity Editing. Projections: The Journal for Movies and the Mind, 6(1)
  • Smith, T. J. (2012) Extending AToCC: a reply. Projections: The Journal for Movies and the Mind, 6(1)

Tuesday, April 03, 2012

Rear Window Timelapse

An absolutely brilliant reworking of Hitchcock's Rear Window (1954) by Jeff Desom, in which all the sequences viewed out of the window by Jimmy Stewart's character are morphed together into one continuous time-lapse viewpoint using Adobe After Effects. It also highlights an interesting mismatch between how we think we perceive the locations depicted in a scene and how they actually appear when spatial relationships are reconstructed. I've studied Rear Window in detail several times and I had no idea that the conservatory on the right of the scene was so close to Jimmy Stewart's apartment.

Friday, March 23, 2012

A Window on Reality: Perceiving Edited Moving Images

A little review article Dan Levin, James Cutting and I put together is now published in Current Directions in Psychological Science.

Have you ever wondered how we watch films? How viewing highly artificial edited sequences that jump about in space and time can be effortless? Why films seem to be getting faster, darker and more agitated? Or why we fail to notice massive continuity errors? Check out the article:

Smith, T. J., Levin, D. T. & Cutting, J. (2012) A Window on Reality: Perceiving Edited Moving Images. Current Directions in Psychological Science, 21, 101-106. doi:10.1177/0963721412436809 (print version) (preprint)

Edited moving images entertain, inform, and coerce us throughout our daily lives, yet until recently, the way people perceive movies has received little psychological attention. We review the history of empirical investigations into movie perception and the recent explosion of new research on the subject using methods such as behavioral experiments, functional magnetic resonance imaging (fMRI), eye tracking, and statistical corpus analysis. The Hollywood style of moviemaking, which permeates a wide range of visual media, has evolved formal conventions that are compatible with the natural dynamics of attention and humans’ assumptions about continuity of space, time, and action. Identifying how people overcome the sensory differences between movies and reality provides an insight into how the same cognitive processes are used to perceive continuity in the real world.

Tuesday, February 14, 2012

Cut detection experiment

If you have a spare 45 minutes and fancy reflecting on how you watch movies please take part in my student, Yvonne's on-line experiment:

You'll be shown a series of film clips from movies and asked to detect cuts. Just press the spacebar every time you see a cut. It's that simple!

Or is it?

Sunday, February 12, 2012

Cognitive Film Theory bibliography

I just stumbled across Nick Redfern's wonderful bibliography of Cognitive Film theory on-line and thought I had to share:

This is a great starting point for anybody trying to get a feel for the research area.

Thursday, February 09, 2012

UCL Festival of the Moving Image

This Sunday (12th February 2012) I will be giving an intro to my work on film cognition and a live eyetracking demonstration as part of the UCL Festival of the Moving Image 2012:

The evening will begin with a screening of Tarsem Singh's The Fall (2006), a beautiful digital fairytale that exemplifies Tarsem's use of digital compositions and mise en scene.

This will be followed by Richard Linklater's rotoscoped philosophical dream journey, Waking Life (2001).

Both films explore issues related to the fantasy of reality (and vice versa) and it will be my task to bridge the two with some demonstrations of exactly how illusory our experience of the real-world is.

The event is free but space is limited so please come along early if you are interested in attending.

Friday, February 03, 2012

Gazing at Blade Runner on BBC Film 2012

The legendary BBC film review program, Film 2012 honoured me with a visit a couple of weeks ago and the piece they filmed aired last Wednesday (1st Feb 2012, 23:30). If you're in the UK you can view it on the BBC iPlayer for seven days:

The piece represented a personal journey for the presenter, Danny Leigh, who wanted to understand the psychology of film viewing. Using Blade Runner (1982) as our sample film, Danny and I discussed how filmmakers capitalise on our natural interest in simple visual features such as motion, and in more complex details of scenes such as social cues and faces, to guide our attention within the frame and across cuts. By eyetracking Danny with a Tobii TX300 I was able to compare his viewing behaviour to that of other viewers and show how similar they were for the majority of the clip.

To get an idea of the attentional synchrony (i.e. clustering of gaze) between all viewers you can take a look at the video below. This represents the gaze location and resulting heatmap of seven people watching the clip at different times.

I'd like to thank Danny Leigh, Suniti Somaiya and the BBC Film team for making the filming so enjoyable. I look forward to working with them again.