Digging Deeper into The Dolly Zoom

The dolly zoom is an unsettling in-camera effect that appears to undermine normal visual perception.

The effect is achieved by using a zoom lens to adjust the angle of view (often referred to as field of view or FOV) while the camera dollies (or moves) towards or away from the subject in such a way as to keep the subject the same size in the frame throughout. In its classic form, the camera is pulled away from a subject while the lens zooms in, or vice versa. Thus, during the zoom, there is a continuous perspective distortion, the most directly noticeable feature being that the background appears to change size relative to the subject.
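The constant-size constraint can be written down directly. As a minimal sketch, assuming an idealized rectilinear (pinhole-like) lens and a subject centered in the frame, the width of scene covered at the subject's distance d is 2 · d · tan(FOV / 2), and holding that width fixed gives the angle of view needed at each camera position. The function names and the example distances below are illustrative, not taken from any particular tool.

```python
import math

def dolly_zoom_fov(start_distance, start_fov_deg, new_distance):
    """Angle of view (degrees) that keeps the subject the same size in frame.

    Assumes an ideal rectilinear lens; distances are in any consistent unit.
    """
    # Width of the scene covered at the subject's distance. Holding this
    # constant is what keeps the subject the same size in the frame.
    frame_width = 2 * start_distance * math.tan(math.radians(start_fov_deg) / 2)
    return math.degrees(2 * math.atan(frame_width / (2 * new_distance)))

def background_fraction(subject_distance, background_offset):
    """Scale of an object behind the subject, relative to the subject's scale.

    With the subject's on-screen size held fixed, an object lying
    `background_offset` behind it is drawn at
    subject_distance / (subject_distance + background_offset) of that scale.
    """
    return subject_distance / (subject_distance + background_offset)

if __name__ == "__main__":
    # Hypothetical move: camera pulls back from 2 m to 10 m, starting at 60 degrees.
    for d in (2, 4, 6, 8, 10):
        fov = dolly_zoom_fov(2, 60, d)
        scale = background_fraction(d, 8)  # background 8 m behind the subject
        print(f"distance {d:>2} m  fov {fov:5.1f} deg  background scale {scale:.2f}")
```

As the camera pulls back, the required angle of view narrows (the lens zooms in) and the background scale rises towards 1, which corresponds to the background appearing to grow behind an unchanged subject.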

As the human visual system uses both size and perspective cues to judge the relative sizes of objects, seeing a perspective change without a size change is a highly unsettling effect, and the emotional impact of this effect is greater than the description above can suggest. The visual appearance for the viewer is that either the background suddenly grows in size and detail and overwhelms the foreground, or the foreground becomes immense and dominates its previous setting, depending on which way the dolly zoom is executed.

Understanding Perspective Distortion

In photography and cinematography, perspective distortion is a warping or transformation of an object and its surrounding area that differs significantly from what the object would look like with a normal focal length, due to the relative scale of nearby and distant features. Perspective distortion is determined by the relative distances at which the image is captured and viewed, and is due to the angle of view of the image (as captured) being either wider or narrower than the angle of view at which the image is viewed, hence the apparent relative distances differing from what is expected. Related to this concept is axial magnification — the perceived depth of objects at a given magnification.

Perspective distortion takes two forms: extension distortion and compression distortion, also called wide-angle distortion and long-lens or telephoto distortion, when talking about images with the same field size. Extension or wide-angle distortion can be seen in images shot from close up using a wide-angle lens (with an angle of view wider than a normal lens). Objects close to the lens appear abnormally large relative to more distant objects, and distant objects appear abnormally small and hence more distant: distances are extended. Compression, long-lens, or telephoto distortion can be seen in images shot from a distance using a long-focus lens or the more common telephoto sub-type (with an angle of view narrower than a normal lens). Distant objects look approximately the same size: closer objects are abnormally small, and more distant objects are abnormally large, so the viewer cannot discern relative distances between distant objects. Distances are compressed.

Note that perspective distortion is caused by distance, not by the lens per se – two shots of the same scene from the same distance will exhibit identical perspective distortion, regardless of lens used. However, since wide-angle lenses have a wider field of view, they are generally used from closer, while telephoto lenses have a narrower field of view and are generally used from farther away. For example, if standing at a distance so that a normal lens captures someone’s face, a shot with a wide-angle lens or telephoto lens from the same distance will have exactly the same perspective on the face, though the wide-angle lens may fit the entire body into the shot, while the telephoto lens captures only the nose. However, crops of these three images with the same coverage will yield the same perspective distortion – the nose will look the same in all three. Conversely, if all three lenses are used from distances such that the face fills the field, the wide-angle will be used from closer, making the nose larger, and the telephoto will be used from farther, making the nose smaller.
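The claim that distance, not the lens, sets the perspective can be checked with a simple pinhole projection, in which image size is proportional to focal length divided by distance. The sketch below is only an idealized model; the 0.1 m nose depth and the function names are assumptions chosen for illustration.

```python
import math

def projected_size(real_size, distance, focal_length):
    """Image size of an object under an ideal pinhole projection."""
    return focal_length * real_size / distance

def nose_to_ear_ratio(camera_to_ears, nose_depth, focal_length):
    """How much larger the nose is drawn than an equally sized feature at the ears."""
    nose = projected_size(1.0, camera_to_ears - nose_depth, focal_length)
    ears = projected_size(1.0, camera_to_ears, focal_length)
    return nose / ears  # the focal length cancels out of this ratio

if __name__ == "__main__":
    # Same distance, different lenses: the ratio does not change.
    print(nose_to_ear_ratio(1.0, 0.1, 24), nose_to_ear_ratio(1.0, 0.1, 200))
    # Same lens, different distances: the ratio changes.
    print(nose_to_ear_ratio(0.3, 0.1, 50), nose_to_ear_ratio(3.0, 0.1, 50))
```

The ratio depends only on the camera-to-face distance, so moving closer enlarges the nose relative to the rest of the face whichever lens is mounted.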

Outside of photography, extension distortion is familiar to many through side-view mirrors (see “objects in mirror are closer than they appear”) and peepholes, though these often use a fisheye lens, exhibiting different distortion. Compression distortion is most familiar in looking through binoculars or telescopes, as in telescopic sights, while a similar effect is seen in fixed-slit strip photography, notably a photo finish, where every part of the image is captured through the same slit along parallel lines of sight, completely eliminating perspective (a side view).

Influencing factors

Perspective distortion is influenced by the relationship between two factors: the angle of view at which the image is captured by the camera and the angle of view at which the photograph of the subject is presented or viewed.

Angle of view of the capture

When photographs are viewed at the ordinary viewing distance, the angle of view at which the image is captured accounts completely for the appearance of perspective distortion. The general assumption that “undoctored” photos cannot distort a scene is incorrect. Perspective distortion is particularly noticeable in portraits taken with wide-angle lenses at short camera-to-subject distances. They generally give an unpleasant impression, making the nose appear too large with respect to the rest of the face, and distorting the facial expression. Framing the same subject identically while using a moderate telephoto or long focus lens (with a narrow angle of view) flattens the image to a more flattering perspective. It is for this reason that, for a 35 mm camera, lenses with focal lengths from about 85 through 135 mm are generally considered to be good portrait lenses. Conversely, using lenses with much longer focal lengths for portraits results in more extreme flattening of facial features, which also may be objectionable to the viewer.

Photograph viewing distance

Photographs are ordinarily viewed at a distance approximately equal to their diagonal. When viewed at this distance, the distortion effects created by the angle of view of the capture are apparent. However, theoretically, if one views pictures exhibiting extension (wide angle) distortion at a closer distance, thus widening the angle of view of the presentation, then the phenomenon abates. Similarly, viewing pictures exhibiting compression (telephoto) distortion from a greater distance, thus narrowing the angle of view of the presentation, reduces the effect. In both cases, at some critical distance, the apparent distortion disappears completely.
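For an idealized rectilinear image, the critical distance is the one at which the print subtends the same angle at the eye as the scene did at the lens: the focal length multiplied by the enlargement from sensor to print. The following sketch assumes an uncropped print and uses hypothetical names and sizes.

```python
def critical_viewing_distance(focal_length_mm, sensor_width_mm, print_width_mm):
    """Viewing distance (mm) at which the print reproduces the capture angle of view.

    Assumes an ideal rectilinear lens and a print made from the full,
    uncropped frame.
    """
    enlargement = print_width_mm / sensor_width_mm
    return focal_length_mm * enlargement

if __name__ == "__main__":
    # A 36 mm wide frame enlarged to a 360 mm wide print (10x enlargement).
    for focal_length in (24, 50, 200):
        distance = critical_viewing_distance(focal_length, 36, 360)
        print(f"{focal_length} mm lens: view from about {distance:.0f} mm")
```

Under these assumptions a wide-angle shot calls for a viewing distance shorter than the print's diagonal and a telephoto shot for a much longer one, which matches the direction of the corrections described above.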

Artistic uses

Although perspective distortion is often annoying when unintended, it is also intentionally used for artistic purposes. Extension (wide angle) distortion is often implemented to emphasize some element of the scene by making it appear larger and spatially removed from the other elements. Compression (telephoto) distortion is often used to give the appearance of compressed distance between distant objects, such as buildings or automobiles in order to convey a feeling of congestion. Longer lenses magnify the subject more, apparently compressing distance and (when focused on the foreground) blurring the background because of their shallower depth of field. Wider lenses tend to magnify distance between objects while allowing greater depth of field.

Another result of using a wide-angle lens is a greater apparent perspective distortion when the camera is not aligned perpendicularly to the subject: parallel lines converge at the same rate as with a normal lens, but converge more due to the wider total field. For example, buildings appear to be falling backwards much more severely when the camera is pointed upward from ground level than they would if photographed with a normal lens at the same distance from the subject, because more of the subject building is visible in the wide-angle shot.

Because different lenses generally require a different camera–subject distance to preserve the size of a subject, changing the angle of view can indirectly distort perspective, changing the apparent relative size of the subject and background. If identical field size is maintained, wide-angle lenses make subjects appear larger by introducing size differences along with the converging lines mentioned above, and they make rooms and spaces around the subject appear more vast by increasing the distance between subject and background (expanded perspective).

Mood effect and famous uses

The mood effect of perspective distortion achieved by rectilinear extreme wide-angle lenses is that the resulting image looks grotesque and unsettling, while not looking as unrealistic as images from curvilinear fisheye lenses, which display barrel distortion. The effect becomes more noticeable the closer the camera is to the subject, because at the same field size a shorter focal length requires a shorter camera-to-subject distance.

One notable director who frequently employs rectilinear ultra-wide-angle lenses to achieve a distinctive signature style defined by extreme perspective distortion is Terry Gilliam. Stanley Kubrick (in Paths of Glory and Dr. Strangelove, among others), Orson Welles (in The Trial and partly in Orson Welles’ London, segment Four Clubmen), Sam Peckinpah (in Straw Dogs), and Sidney Lumet (in The Offence) have occasionally done the same, though mostly in moderation, for single shots or sequences only. Gilliam, by contrast, hardly ever uses any lens longer than 14mm, which has earned lenses of that focal length the informal nickname “The Gilliam” among film-makers. Jean-Pierre Jeunet and Marc Caro, two French filmmakers influenced by Gilliam, adopted his typical wide-angle photography in their two most “Gilliamesque” features, Delicatessen and The City of Lost Children. Orson Welles’s The Trial is notable for heavily influencing Gilliam’s signature style years before the American expatriate joined the Monty Python comedy troupe, although for Welles it remained a one-feature style.

Due to the grotesque, unsettling mood effect peculiar to wide-angle lenses, films making use of such perspective distortion can often be placed in one of two categories. The first is grotesque and surreal satire and fantasy, and to some extent black comedy (Gilliam, Jeunet & Caro, Orson Welles, Dr. Strangelove). The second is serious, more realistic films with a particular edge for social criticism, in which social conventions, collective society, and/or the motives and acts of leaders are portrayed as grotesque and absurd, and which often feature tyrannical characters with conformist values who act in an extremely hostile and prejudiced way towards individualism and outsiders (Paths of Glory, Straw Dogs, The Offence).

At the other end of the focal length spectrum, Leni Riefenstahl used extreme telephoto lenses to compress large crowds in Triumph of the Will, while their almighty Führer Adolf Hitler is seen through normal lenses and often from a low angle to appear tall in comparison.

Understanding Zoom Lenses

A zoom lens is a mechanical assembly of lens elements for which the focal length (and thus angle of view) can be varied, as opposed to a fixed focal length (FFL) lens (see prime lens).
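The link between focal length and angle of view can be made concrete with the usual rectilinear approximation, angle = 2 · atan(sensor dimension / (2 · focal length)). This is a sketch that assumes a lens focused at infinity and a 36 mm wide frame; the function name is illustrative.

```python
import math

def angle_of_view_deg(focal_length_mm, sensor_dimension_mm=36.0):
    """Angle of view in degrees for an ideal rectilinear lens focused at infinity.

    The 36 mm default is the width of a 35 mm still frame (an assumption here).
    """
    return math.degrees(2 * math.atan(sensor_dimension_mm / (2 * focal_length_mm)))

if __name__ == "__main__":
    # Sweep a hypothetical zoom range and print the horizontal angle of view.
    for focal_length in (18, 24, 50, 85, 135, 400):
        print(f"{focal_length:>3} mm -> {angle_of_view_deg(focal_length):5.1f} degrees")
```

In this model an 18 mm focal length on a 36 mm wide frame gives a 90 degree horizontal angle of view, and the angle narrows steadily as the focal length increases.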

A true zoom lens, also called a parfocal lens, is one that maintains focus when its focal length changes. A lens that loses focus during zooming is more properly called a varifocal lens.

Zoom lenses are often described by the ratio of their longest to shortest focal lengths. For example, a zoom lens with focal lengths ranging from 100 mm to 400 mm may be described as a 4:1 or “4×” zoom. The term superzoom or hyperzoom is used to describe photographic zoom lenses with very large focal length factors, typically more than 5× and ranging up to 18× in SLR camera lenses and 50× in amateur digital cameras. This ratio can be as high as 300× in professional television cameras. As of 2009, photographic zoom lenses beyond about 3× cannot generally produce imaging quality on par with prime lenses. Constant fast aperture zooms (usually f/2.8 or f/2.0) are typically restricted to this zoom range. Quality degradation is less perceptible when recording moving images at low resolution, which is why professional video and TV lenses are able to feature high zoom ratios. Digital photography can also accommodate algorithms that compensate for optical flaws, both within in-camera processors and post-production software.
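The ratio described above is a single division. A small sketch, using the 5× figure from the text as the superzoom threshold (the function names are illustrative):

```python
def zoom_ratio(shortest_mm, longest_mm):
    """Ratio of longest to shortest focal length, e.g. 4.0 for a 100-400 mm lens."""
    if shortest_mm <= 0 or longest_mm < shortest_mm:
        raise ValueError("expected 0 < shortest_mm <= longest_mm")
    return longest_mm / shortest_mm

def is_superzoom(shortest_mm, longest_mm, threshold=5.0):
    """True when the ratio exceeds the 'typically more than 5x' figure quoted above."""
    return zoom_ratio(shortest_mm, longest_mm) > threshold

if __name__ == "__main__":
    print(zoom_ratio(100, 400))    # the 4x example from the text
    print(is_superzoom(100, 400))  # 4x does not exceed the 5x threshold
    print(is_superzoom(18, 200))   # a hypothetical 18-200 mm lens, about 11x
```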

Some photographic zoom lenses are long-focus lenses, with focal lengths longer than a normal lens, some are wide-angle lenses (wider than normal), and others cover a range from wide-angle to long-focus. Lenses in the latter group of zoom lenses, sometimes referred to as “normal” zooms, have displaced the fixed focal length lens as the popular one-lens selection on many contemporary cameras. The markings on these lenses usually say W and T for “Wide” and “Telephoto”. Telephoto is designated because the longer focal length supplied by the negative diverging lens is longer than the overall lens assembly (the negative diverging lens acting as the “telephoto group”).

Some digital cameras allow cropping and enlarging of a captured image, in order to emulate the effect of a longer focal length zoom lens (narrower angle of view). This is commonly known as digital zoom and produces an image of lower optical resolution than optical zoom. Exactly the same effect can be obtained by using digital image processing software on a computer to crop the digital image and enlarge the cropped area. Many digital cameras have both, combining them by first using the optical, then the digital zoom.
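The resolution cost of digital zoom follows from the cropping itself. As a rough sketch, assuming a central crop and ignoring the interpolation used to enlarge it, a zoom factor of k keeps 1/k of the width and height and therefore 1/k² of the pixels; when optical and digital zoom are chained, the factors multiply. The names and image sizes below are illustrative.

```python
def digital_zoom_crop(width_px, height_px, factor):
    """Size in pixels of the central crop used for a given digital zoom factor."""
    if factor < 1:
        raise ValueError("digital zoom factor must be at least 1")
    return round(width_px / factor), round(height_px / factor)

def pixel_fraction_kept(factor):
    """Fraction of the original pixels that survive the crop."""
    return 1 / factor ** 2

def combined_zoom(optical_factor, digital_factor):
    """Overall magnification when optical zoom is followed by digital zoom."""
    return optical_factor * digital_factor

if __name__ == "__main__":
    # Hypothetical 6000 x 4000 pixel sensor with 2x digital zoom.
    print(digital_zoom_crop(6000, 4000, 2))
    print(pixel_fraction_kept(2))
    print(combined_zoom(3, 4))
```

With a 2× digital zoom only a quarter of the captured pixels remain, which is why the result has lower optical resolution than the same framing achieved optically.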


Early forms of zoom lenses were used in optical telescopes to provide continuous variation of the magnification of the image, and this was first reported in the Proceedings of the Royal Society in 1834. Early patents for telephoto lenses also included movable lens elements which could be adjusted to change the overall focal length of the lens. Lenses of this kind are now called varifocal lenses, since when the focal length is changed, the position of the focal plane also moves, requiring refocusing of the lens after each change.

The first true zoom lens, which retained near-sharp focus while the effective focal length of the lens assembly was changed, was patented in 1902 by Clile C. Allen (U.S. Patent 696,788). The first industrial production was the Bell and Howell Cooke “Varo” 40–120 mm lens for 35mm movie cameras, introduced in 1932. The most impressive early TV zoom lens was the VAROTAL III, built in 1953 by Rank Taylor Hobson of the UK. The Kilfitt 36–82 mm/2.8 Zoomar, introduced in 1959, was the first varifocal lens in regular production for still 35mm photography. The first modern film zoom lens was designed around 1950 by Roger Cuvillier, a French engineer working for SOM-Berthiot. It had an optical compensation zoom system. In 1956, Pierre Angénieux introduced the mechanical compensation system, enabling precise focus while zooming, in his 10× lens released in 1958. Angénieux received a 1964 technical award from the Academy of Motion Picture Arts and Sciences for the design of that 12–120 mm zoom lens.

Since then, advances in optical design, particularly the use of computers for optical ray tracing, have made the design and construction of zoom lenses much easier, and they are now used widely in professional and amateur photography.


There are many possible designs for zoom lenses, the most complex ones having upwards of thirty individual lens elements and multiple moving parts. Most, however, follow the same basic design. Generally they consist of a number of individual lenses that may be either fixed, or slide axially along the body of the lens. While the magnification of a zoom lens changes, it is necessary to compensate for any movement of the focal plane to keep the focused image sharp. This compensation may be done by mechanical means (moving the complete lens assembly while the magnification of the lens changes), or optically (arranging the position of the focal plane to vary as little as possible while the lens is zoomed).

A simple scheme for a zoom lens divides the assembly into two parts: a focusing lens similar to a standard, fixed-focal-length photographic lens, preceded by an afocal zoom system, an arrangement of fixed and movable lens elements that does not focus the light, but alters the size of a beam of light travelling through it, and thus the overall magnification of the lens system.

In this simple optically compensated zoom lens, the afocal system consists of two positive (converging) lenses of equal focal length (lenses L1 and L3) with a negative (diverging) lens (L2) between them, with an absolute focal length less than half that of the positive lenses. Lens L3 is fixed, but lenses L1 and L2 can be moved axially, and do so in a fixed, non-linear relationship. This movement is usually performed by a complex arrangement of gears and cams in the lens housing, although some modern zoom lenses use computer-controlled servos to perform this positioning.

While the negative lens L2 moves from the front to the back of the lens, the lens L1 moves forward and then backward in a parabolic arc. In doing so, the overall angular magnification of the system varies, changing the effective focal length of the complete zoom lens. At each of the three points shown, the three-lens system is afocal (neither diverging nor converging the light), and so does not alter the position of the focal plane of the lens. Between these points, the system is not exactly afocal, but the variation in focal plane position can be small enough (~±0.01 mm in a well-designed lens) not to make a significant change to the sharpness of the image.
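One way to check whether a given arrangement of L1, L2 and L3 is afocal is paraxial ray-transfer (ABCD) matrix analysis with ideal thin lenses. This is a standard textbook technique rather than anything taken from the description above, and the focal lengths and spacings used here are hypothetical values chosen only to satisfy the stated condition that the negative lens has an absolute focal length less than half that of the positive lenses.

```python
def thin_lens(focal_length):
    """Ray-transfer matrix of an ideal thin lens."""
    return ((1.0, 0.0), (-1.0 / focal_length, 1.0))

def gap(distance):
    """Ray-transfer matrix for free propagation over `distance`."""
    return ((1.0, distance), (0.0, 1.0))

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

def system_matrix(elements):
    """Combine elements listed in the order the light meets them."""
    total = ((1.0, 0.0), (0.0, 1.0))
    for element in elements:
        total = matmul(element, total)  # later elements multiply from the left
    return total

def is_afocal(matrix, tolerance=1e-9):
    """Afocal means parallel rays in give parallel rays out: the C term is zero."""
    return abs(matrix[1][0]) < tolerance

def three_lens_system(d12, d23, f_positive=100.0, f_negative=-40.0):
    """L1 and L3 share one positive focal length; L2 is the negative lens between them.

    All values are hypothetical and in millimetres.
    """
    return system_matrix([
        thin_lens(f_positive), gap(d12),
        thin_lens(f_negative), gap(d23),
        thin_lens(f_positive),
    ])

if __name__ == "__main__":
    symmetric = three_lens_system(20.0, 20.0)
    print(is_afocal(symmetric), symmetric[1][1])  # afocal; D term is the angular magnification
    shifted = three_lens_system(10.0, 30.0)       # L2 moved back, L1 left in place
    print(is_afocal(shifted))
```

With these numbers the symmetric arrangement is afocal with unit angular magnification, while sliding L2 without repositioning L1 leaves a small residual focusing power. That residual is what the compensating movement of L1 is there to remove.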

An important issue in zoom lens design is the correction of optical aberrations (such as chromatic aberration, and in particular field curvature) across the whole operating range of the lens; this is considerably harder in a zoom lens than a fixed lens, which needs only to correct the aberrations for one focal length. This problem was a major reason for the slow uptake of zoom lenses, with early designs being considerably inferior to contemporary fixed lenses, and usable only with a narrow range of f-numbers. Modern optical design techniques have enabled the construction of zoom lenses with good aberration correction over widely variable focal lengths and apertures.

Whereas lenses used in cinematography and video applications are required to maintain focus while the focal length is changed, there is no such requirement for still photography, or if a zoom lens is used as a projection lens. Since it is harder to construct a lens that does not change focus with the same image quality as one that does, the latter applications often have lenses that require refocusing once the focal length has changed (and thus strictly speaking are varifocal lenses, not zoom lenses). As most modern still cameras are autofocus, this is not a problem.

Designers of zoom lenses with large zoom ratios often trade one or more aberrations for higher image sharpness. For example, lenses that span the range from wide angle to telephoto with a zoom ratio of 10× or more tolerate a greater degree of barrel and pincushion distortion than would be acceptable in a fixed focal length lens or a zoom lens with a lower ratio. Although modern design methods have been continually reducing this problem, barrel distortion of greater than one percent is common in these large-ratio lenses. Another price paid is that at the extreme telephoto setting of the lens the effective focal length changes significantly while the lens is focused on closer objects. The apparent focal length can more than halve while the lens is focused from infinity to medium close-up. To a lesser degree, this effect is also seen in fixed focal length lenses that move internal lens elements, rather than the entire lens, to effect changes in magnification.

Varifocal lens

Many so-called “zoom” lenses, particularly in the case of fixed-lens cameras, are actually varifocal lenses. This gives lens designers more flexibility in optical design trade-offs (focal length range, maximum aperture, size, weight, cost) than a true parfocal zoom would. It is practical because of autofocus, and because the camera processor can move the lens to compensate for the change in the position of the focal plane while changing magnification (“zooming”), making operation essentially the same as with a true parfocal zoom.

Camera Dollies in Filmmaking

A camera dolly is a specialized piece of filmmaking and television production equipment designed to create smooth camera movements (cinematic techniques). The camera is mounted to the dolly, and the camera operator and focus puller or camera assistant usually ride on the dolly to operate the camera. The dolly grip is the dedicated technician trained to operate the dolly.


The camera dolly may be used as a shooting platform on any surface but is often raised onto a track, to create smooth movement on a horizontal axis known as a dolly shot. Additionally, most professional film studio dollies have a hydraulic arm that raises or lowers the camera on the vertical axis. When a dolly grip operates a dolly on perpendicular axes simultaneously, it’s known as a compound move.

Dolly moves may also be executed without track, giving more freedom on the horizontal plane and with it, a higher degree of difficulty. These are called dance floor moves and may either be done on the existing surface (if smooth enough) or on an overlay designed for dolly movement. The ground overlay usually consists of thick plywood as a bottom layer and masonite on top.

Camera dollies have several steering mechanisms available to the dolly grip. The typical mode is rear-wheel steering, where the front wheels remain fixed, while the wheels closest to the operating handle are used to turn. A second mode, round steering, causes the front wheels to turn in the opposite direction from the rear wheels. This mode allows the dolly to move in smooth circles and is frequently used when the dolly is on curved track. A third mode, called crab steering, is when the front wheels steer in the same direction as the rear wheels. This allows the dolly to move in a direction diagonal to the front end of the dolly.


Studio dollies are large and stable and can feature hydraulics. These are the first choice for studio, backlot and location shoots when using professional cameras. A studio dolly usually needs a specialized operator called a “dolly grip”, and many are built for the operator to ride on the dolly with the camera.

Lightweight dolly systems are simpler and more affordable, and are best used with lighter-weight cameras. Lightweight systems are usually favored by independent filmmakers and students because they are easier to carry and operate. These dollies support only the camera, and the operator needs to move alongside. Some lightweight dollies are small enough to be carried in a backpack.

The best way to be able to replicate the same camera movement for multiple takes (which is important for editing) is to use a dolly on track.


Dolly tracks used for heavy cameras have traditionally been constructed of steel or aluminum. Steel, although heavier than aluminum, is less expensive and withstands heavier use. Longer track segments, while heavier to transport, allow track to be laid straighter with less effort. Curved track is also available. Plastic versions of track have been used with lightweight dolly systems. In the 2000s, flexible rubber track allowed quicker setup and easier transportation for use with light cameras.


