On Sergei Eisenstein's Audio-Visual Montage

story © Michael Betancourt, April 10, 2011 all rights reserved.


Sergei Eisenstein (1898 – 1948) proposed a series of techniques in his montage theory that provide a complete system for motion pictures. As optical sound became the dominant technology, his theories turned to the organization of sound and image and the relationship between them. Concerned more with the editing of sequences than the graphic animation of imagery, montage nevertheless does have a direct relevance to the synchronization of sound and image. Eisenstein proposed a special type of montage form, chromo-phonic montage. This conception emerges from his critical engagement with the color-sound relationships surveyed in his article “The Synchronization of the Senses,” a fact that reflects the pervasive influence of synaesthesia on art before World War II.

The foundations of chromo-phonic montage theory depend on a synchronization that is not only a matter of rhythmic connection between the editing and the music, but also includes and is determined by color. This specific montage formulation emerges from Eisenstein’s analysis of the linkages between color and sound created by color organs, from which he concludes that:

The decisive role is played by the image structure of the work, not so much by employing generally accepted correlations [of color to sound], but by establishing in our images of specific creative work whatever correlations (of sound and picture, sound and color, etc.) are dictated by the idea and theme of the particular work.

The construction of color-sound relationships is arbitrary, dependent on the needs and desires of the work itself, and should not reflect any concern other than those internal and specific to the work being made. Reaching this conclusion, Eisenstein rejects the foundations of color music to embrace the more variable construction common to artists engaged with visual music. While his discussion at a theoretical level is concerned with a full range of sensation, on the technical level his interest is in the relationship between sound, color, and image; this expansion of montage theory in the 1930s builds on the five earlier, primarily visual, montage techniques he developed starting in the 1920s.

Eisenstein’s five varieties of montage build towards greater complexity, based upon the weight given to the image’s contents: metric, rhythmic, tonal, overtonal and intellectual (or the montage of attractions). The basic unit for this system was the “shot” (the individual film strip with moving imagery on it), and the organization of shots in the montage sequence was to be based upon collision or conflict, in a literally visual joining of the dialectical process of thesis/antithesis. The resulting new meaning would be the synthesis, appearing spontaneously from these juxtapositions. For Eisenstein, this was a theoretical translation of the dialectical process, leading him to claim the montage construction was an inherently Marxist method for film production:

According to Marx and Engels the dialectic system is only the conscious reproduction of the dialectic course (substance) of the external events of the world.

The projection of the dialectic system of things
into the brain
into the process of thinking
yields: dialectic methods of thinking;
dialectic materialism—
And also:
The projection of the same system of things
while creating concretely
while giving form

Eisenstein’s claim for montage is simple. It is the transformation of mere images into a visual construction that is functionally identical to the dialectical materialist process. This shift causes the artwork itself to become an example of Marxist theory at the level of praxis in a direct movement from theoretical construction through to application in particular media work. The theory of montage, being based in the visual collision of opposites, inherently dramatizes this dialectical procedure. Thus, within Eisenstein’s argument, montage is not only inherently Marxist, it is also immune to being used for purposes that would conflict with its dialectical nature: montage would always assure that the resulting film bears the imprint of Marxist analysis.

However, a closer consideration of the various types of montage suggests an alternate potential, a formal rather than a dialectical one. The essential element that produces the dialectical dimension of Eisenstein’s theory depends on the actual, visual contents of each shot: that the different visual materials present a conflict. This factor becomes clear from Eisenstein’s enumeration of various types of visual conflict: motion, scale, size, apparent volume; he also includes contrasts of light/dark, and conceptual relationships such as young/old. Recognizing the importance of image content also implicates the context of the montage within the larger work. The various montage structures he describes do function without the dialectical element, even if in his own films constructing a dialectic was essential to their meaning.

The various “lower” levels of montage, specifically metric and rhythmic, have immediate and obvious analogues in musical form. In metric montage the cut comes based on a specific frame count, no matter what is happening within the image (i.e. the image’s content is of low importance), while in rhythmic montage the visible content does begin to matter: when there is a rhythm visible in the shot, that must be taken into account in deciding how to cut following a variable meter—hence, the “rhythm” in the name. Both types of montage treat the montage sequence musically, creating patterns and effects through the visual tempo of the editing—incorporating elements of metric cutting into the fabricated rhythm. Yet neither type is necessarily concerned with the actual content of the image, unless the motion contained therein affects the rhythm of the cut.

The visual qualities and contents of the image become increasingly important in the remaining three types of montage. Tonal montage is based on the graphic tone of the image, such as its black and white levels. Overtonal montage is concerned with the emotional tone or affect of the image, a factor dependent on the subject matter shown and the graphic character of its depiction. Different styles of photography produce radically different emotional feelings, independent of what those images portray. Similarly, rhythmic and metric elements influence the emotional sense that a given sequence of shots has, affecting what kinds of feelings the montage sequence produces. All of these types of montage are concerned with the elicitation of feeling and the organization of the sequence around creating a specific emotional tone.

Only intellectual montage approaches the assembly of the sequence in a linguistic fashion, as if each shot were also a word, so that the sequence can be regarded as analogous to a sentence or paragraph in its meaning and effect. It is intellectual montage that most specifically produces the thesis+antithesis combinatory effect: the juxtaposition of two images creates a meaning that is not present in either image when seen by itself. This last type of montage is the one most often simply, generically, identified as being “montage.”

However, the “lower” levels of montage are more important to the history of how sound and image can be synchronized than the intellectual montage that Eisenstein championed in his writing and films. The concerns of intellectual montage, with its aspiration to semiotic construction of filmed sequences comparable to language, have an entirely different character from the other methods he described. When confronted by the issue of sound, Eisenstein proposed a variety of synchronization between image and sound where both would contribute to a composite effect:

Everyone is familiar with the appearance of an orchestral score. There are several staffs, each containing the part for one instrument or a group of like instruments. Each part is developed horizontally. But the vertical structure plays no less important a role, interrelating as it does all the elements of the orchestra within each given unit of time. Through the progression of the vertical line, pervading the entire orchestra, and interwoven horizontally, the intricate harmonic musical movement of the whole orchestra moves forward.
When we turn from this image of the orchestral score to that of the audio-visual score, we find it necessary to add a new part to the instrumental parts: this new part is a “staff” of visuals, succeeding each other and corresponding, according to their own laws, with the movement of the music—and vice versa.

Eisenstein’s recognition of how sound is organized in relation to image necessitated the development of techniques to handle their synchronization. That sound and image can be linked in a technical fashion to achieve a direct, permanent connection is the central factor that enabled the theorization of audio-visual montage.

In his discussion of the “Battle on the Ice” in Alexander Nevsky, Eisenstein considered the visual dramatization of music. His conception of how this synchronization would work reveals an approach to visual structure closely related to the concepts of visual music:

It was exactly this kind of “welding” [of the demands imposed by the actual photographic film strips to the initial plan], further complicated (or perhaps further simplified?) by another line—the sound-track—that we tried to achieve in Alexander Nevsky, especially in the sequence of the German knights advancing across the ice. Here the lines of the sky’s tonality—clouded or clear, of the accelerated pace of the riders, of their direction, of the cutting back and forth from Russians to knights, of the faces in close-up and the total long-shots, the tonal structure of the music, its themes, its tempi, its rhythm, etc.—created a task no less difficult than that of a silent sequence.

Central to this orchestration of sound, movement, and montage is the recognition that sound and picture happen simultaneously—that there is no delay between how sound and image are understood: they are perceptually superimposed one over the other. The issue for his audio-visual montage was the coordination of sound-to-image where the problem posed by music was not so much a matter of locating comparable elements in music, but of rendering the relative movement in visual terms:

Musical and visual imagery are not actually commensurable through narrowly “representational” elements. If one speaks of genuine and profound relations and proportions between music and picture, it can only be in reference to the relations between the fundamental movements of the music and the picture, i.e. compositional and structural elements.

Eisenstein’s solution to these problems is to create visual analogues to the audible elements in a translation of musical arrangement to visual appearances on screen; his conception for relating the sequence of notes in music to the graphic organization of the frame was not based on the individual graphic presentation of tones. It was the relative structure of a sequence of notes, heard simultaneously with a particular image, that was his focus. This organization anticipates the counterpoint synchronization that John Whitney would propose and develop under the name “digital harmony.” However, unlike other experiments and theories of synchronic and contrapuntal structure, the underlying material used in Eisenstein’s montage theory is representational, live-action photography, not abstracted forms or absolute animation.

All Eisenstein quotes are taken from Film Form (first quote) and The Film Sense (all others), trans. Jay Leyda (New York: Harvest/HBJ, 1975).
