A Multi-Agent Model of Compositional Process: Sketching, Assembling, and Managing Musical Salience
Alan Fleming Baird
University of Glasgow (PhD Candidate)
Abstract
This paper proposes a three-stage model of compositional process (sketching, assembling, and evaluating) that explicitly accounts for the multiple layers of agency involved in musical communication. Drawing on cognitive musicology, semiotics, and music theory, the model addresses how musical meaning transforms through contextual relationships and how composers must navigate between grammatical correctness and semantic function across a chain of intentional agents, from composer through performer to listener.
Keywords: composition, sketching, musical salience, semiotics, cognitive musicology, multi-agent systems
1. Introduction
Contemporary compositional practice operates within what Nattiez (1990) characterises as a tripartite semiological framework, encompassing the poietic (compositional), neutral (score-based), and esthesic (receptive) dimensions of musical meaning. However, existing models of compositional process have inadequately addressed the dynamic relationships between these dimensions, particularly how musical materials undergo semantic transformation as they move from private sketch to public performance. This paper proposes a systematic approach to understanding compositional decision-making that integrates insights from cognitive psychology, musical semiotics, and practical compositional experience.
2. Historical Context and Contemporary Practice
The use of sketching as a compositional methodology has extensive historical precedent, though its systematic theorisation remains underdeveloped. Mozart's compositional process, long mischaracterised as spontaneous inspiration, has been revealed through sketch studies to involve extensive drafting and revision. Analysis of Mozart's surviving sketches demonstrates that even his apparently effortless compositions emerged through conscious processes of material generation, contextual testing, and structural refinement (Tyson, 1987). Similarly, Beethoven's compositional method involved what Kerman (1982) characterises as "slow and laborious" sketching processes, with extensive notebooks documenting his systematic approach to developing musical ideas through iterative revision.
Contemporary composition pedagogy increasingly emphasises sketching as a fundamental practice, though often without explicit theoretical frameworks for understanding how sketched materials transform through assembly. Educational approaches typically advocate for multi-stage processes involving initial idea generation, rough structural planning, and subsequent development, yet these pedagogical models rarely address the cognitive and semiotic dimensions of compositional decision-making (Hickey, 2012). Some contemporary composition educators have begun implementing systematic sketching methodologies as teaching tools. Bussick (2017) advocates for structured sketching principles, including limitation-setting, focus on characteristic material, and strategic omission of redundant elements, positioning sketching as "stage two of the composing process" that bridges initial conception and formal assembly. This pedagogical approach aligns with the theoretical framework proposed here, demonstrating practical applications of systematic sketching methodology in contemporary composition education.
The relationship between sketching practices and contemporary cognitive musicology has remained largely unexplored. Whilst research in music perception and cognition has generated a sophisticated understanding of how listeners process musical information, these insights have not been systematically integrated into compositional methodology. The sketching-assembling-evaluating model proposed here represents an attempt to bridge this gap, providing a framework that connects practical compositional experience with theoretical understanding of musical communication and cognition.
3. Positioning the Current Model
What distinguishes the present model from existing approaches is its explicit integration of three theoretical dimensions: semiotic analysis of musical meaning-making, cognitive research on musical perception and expectation, and systematic attention to the multi-agent nature of musical communication. Whilst individual composers may intuitively consider how their materials will be interpreted by performers and received by audiences, the systematic articulation of this "chain of intentional agents" within compositional methodology appears novel.
The model's emphasis on managing musical salience through contextual evaluation also represents a distinctive contribution. Whilst composition teachers often advise students to consider how materials work "in context," the theoretical framework provided by research on musical salience (Lerdahl, 1992; Dibben, 1999) offers specific analytical tools for understanding why certain contextual combinations succeed or fail. The integration of this cognitive research with practical compositional decision-making provides a bridge between scientific understanding of musical perception and creative practice.
Furthermore, the model's focus on "contextual transformation" (how musical materials change meaning through assembly) addresses a gap in both compositional pedagogy and music theory. Traditional analytical approaches often treat musical meaning as relatively stable, focusing on the inherent properties of musical objects rather than their relational dynamics. Schenkerian analysis, for instance, tends to identify structural functions that remain consistent throughout a work, whilst set-theory approaches in atonal music analysis emphasise the invariant properties of pitch collections. These methodologies, whilst valuable for understanding musical structure, provide limited insight into how meaning emerges and transforms through the juxtaposition of materials.
The present model emphasises the dynamic, emergent nature of musical semantics, recognising that a chord progression, melodic gesture, or rhythmic pattern may function differently depending on its contextual position within the larger work. This perspective challenges the notion that musical elements possess fixed semantic properties, instead proposing that meaning arises through what might be termed "semiotic negotiation" between materials in proximity. A dominant seventh chord, for example, may function as a stable sonority in one context, a transitional harmony in another, and a source of harmonic tension in a third, with each contextual placement activating different aspects of its semantic potential.
This dynamic view aligns with recent developments in musical semiotics, particularly Hatten's (2004) work on markedness and correlation, which demonstrates how musical meanings emerge through networks of stylistic associations that can be activated, suppressed, or transformed through contextual manipulation. Similarly, Agawu's (1991) concept of musical "play" emphasises how signs acquire meaning through their participation in larger structural games rather than through inherent referential properties. The sketching-assembling-evaluating model extends these semiotic insights by providing practical methodologies for composers to actively manage these transformative processes.
For compositional pedagogy, this approach suggests moving beyond instruction focused solely on grammatical correctness towards teaching strategies for contextual evaluation and semantic management. Students might be taught to consider not just whether a musical idea "works" in isolation, but how it might function differently when placed in various contextual relationships. This could involve exercises in deliberate recontextualisation, where the same musical material is placed in different structural positions to observe how its semantic function shifts. Such pedagogical approaches would prepare composers to think more systematically about the communicative dimensions of their work, recognising composition as an active process of meaning construction rather than merely the arrangement of pre-existing musical objects.
4. Theoretical Framework
The proposed model builds upon Peirce's (1931-1958) triadic conception of the sign, in which a sign-vehicle (here, the musical material itself) stands for an object (what it refers to) by way of an interpretant (how it is understood). Within compositional practice, this triadic relationship becomes particularly complex as musical materials move through different contextual frames, each potentially altering their semiotic function. As Hatten (1994, 2004) demonstrates in his work on musical topics and gestures, meaning in music emerges through correlational networks rather than fixed referential relationships.
The cognitive dimension of this process draws heavily from research in musical expectation and statistical learning. Huron's (2006) ITPRA theory (Imagination, Tension, Prediction, Reaction, Appraisal) provides a framework for understanding how listeners construct meaning through predictive processing, while Pearce's (2018) work on statistical learning reveals how stylistic knowledge shapes expectation formation. For composers, this research suggests that effective composition requires not merely the organisation of sounds, but the careful management of listener cognition through the strategic deployment of expectation, confirmation, and surprise.
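The link between statistical learning and expectation can be illustrated with a deliberately small sketch. It is not Pearce's model or Huron's theory: it is a first-order (bigram) approximation over pitch classes, and the toy corpus, the function names, and the add-one smoothing are all assumptions introduced here for illustration. The surprisal value (negative log probability) is lower for continuations the "listener" has heard often and higher for unheard ones.

```python
import math
from collections import Counter, defaultdict

def train_bigrams(melodies):
    """Count how often each pitch class is followed by each other pitch class."""
    counts = defaultdict(Counter)
    for melody in melodies:
        for prev, nxt in zip(melody, melody[1:]):
            counts[prev][nxt] += 1
    return counts

def surprisal(counts, prev, nxt, alphabet_size=12, alpha=1.0):
    """Return -log2 P(nxt | prev), using add-alpha smoothing (an assumption)."""
    following = counts.get(prev, Counter())
    total = sum(following.values())
    probability = (following[nxt] + alpha) / (total + alpha * alphabet_size)
    return -math.log2(probability)

# Hypothetical "listening history": three short melodies as pitch classes (C = 0).
CORPUS = [
    [0, 2, 4, 5, 7],
    [0, 2, 4, 2, 0],
    [7, 5, 4, 2, 0],
]

model = train_bigrams(CORPUS)
expected = surprisal(model, prev=2, nxt=4)    # D to E occurs twice in the corpus
unexpected = surprisal(model, prev=2, nxt=6)  # D to F sharp never occurs
print(f"D->E: {expected:.2f} bits, D->F#: {unexpected:.2f} bits")
```

In this toy setting the unheard continuation carries more information than the familiar one, which is the sense in which a composer can be said to deploy confirmation and surprise against a listener's learned expectations.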
5. The Sketching-Assembling-Evaluating Model
5.1 Stage 1: Sketching and Grammatical Constraint
The initial sketching phase involves the generation of musical materials ranging from individual chords and gestures to extended phrases. At this stage, evaluation criteria operate primarily within what Lerdahl and Jackendoff (1983) term "well-formedness rules": the syntactic constraints that determine acceptable combinations within a given musical idiom. This phase aligns with Bregman's (1990) concept of auditory scene analysis, where individual musical elements must first establish their perceptual integrity before entering into more complex relational networks.
However, the grammatical acceptability established at the sketching stage proves insufficient for determining the ultimate function of musical materials within the completed work. As Krumhansl (2001) demonstrates in her research on pitch hierarchies, musical elements exist in states of semantic indeterminacy until contextualised within broader structural frameworks. Individual sketches, therefore, represent what Saussure (1916) would characterise as signs awaiting their full semiotic realisation through syntagmatic combination.
The sketching process also involves what Larson (2012) identifies as the operation of "musical forces" (gravity, magnetism, and inertia) that govern the local behaviour of musical elements. These forces operate most clearly at the sketch level, where individual gestures can be evaluated for their internal coherence and directional tendency. However, as materials move into assembly, these local forces may conflict with or be overridden by larger-scale structural imperatives.
5.2 Stage 2: Assembly and Contextual Transformation
The assembly stage presents qualitatively different compositional challenges, operating within what Agawu (1991) characterises as the realm of musical play, where signs enter into complex relationships that can fundamentally alter their semantic function. Materials that demonstrated grammatical coherence in isolation may undergo what can be termed "contextual revaluation": a process whereby their structural significance shifts through relationship with surrounding elements.
This phenomenon is particularly evident in harmonic progression, where voice-leading relationships between individually acceptable chords can produce unexpected tonal implications. Tymoczko's (2011) geometric approach to harmony reveals how voice-leading parsimony creates pathways between harmonies that may not be evident from their individual properties. A chord that functions as a stable sonority in isolation may, through efficient voice-leading, become a pivot point between tonal areas, fundamentally altering its structural role within the work.
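The idea that voice-leading proximity is a property of chord pairs rather than of individual chords can be made concrete with a minimal sketch. It is a simplification and not Tymoczko's geometric formalism: chords are treated as equal-sized sets of pitch classes, every one-to-one assignment of voices is tried, and the smallest total semitone displacement is reported. The function names and example chords are assumptions chosen for illustration.

```python
from itertools import permutations

def pc_distance(a, b):
    """Shortest distance in semitones between two pitch classes (0-11)."""
    d = abs(a - b) % 12
    return min(d, 12 - d)

def min_voice_leading(chord_a, chord_b):
    """Return (cost, mapping) for the smallest total motion between two chords.

    Brute force over all one-to-one voice assignments; fine for small chords.
    """
    if len(chord_a) != len(chord_b):
        raise ValueError("chords must have the same number of voices")
    best_cost, best_map = None, None
    for perm in permutations(chord_b):
        cost = sum(pc_distance(a, b) for a, b in zip(chord_a, perm))
        if best_cost is None or cost < best_cost:
            best_cost, best_map = cost, list(zip(chord_a, perm))
    return best_cost, best_map

C_MAJOR = (0, 4, 7)
E_MINOR = (4, 7, 11)
A_MINOR = (9, 0, 4)
F_SHARP_MAJOR = (6, 10, 1)

for name, chord in [("E minor", E_MINOR), ("A minor", A_MINOR),
                    ("F sharp major", F_SHARP_MAJOR)]:
    cost, mapping = min_voice_leading(C_MAJOR, chord)
    print(f"C major -> {name}: {cost} semitone(s) via {mapping}")
```

Under these assumptions C major lies one semitone from E minor, two from A minor, and six from F sharp major, so the same triad sits at the centre of several efficient pathways that are not visible when it is examined in isolation.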
The assembly process requires continuous evaluation along two primary dimensions: kinship and contrast. Kinship assessment involves identifying materials that share what Klein (2005) terms "intertextual resonances": structural, gestural, or topical similarities that create coherent musical discourse. Contrast evaluation, conversely, requires understanding how differences between materials can articulate formal boundaries and create a dramatic trajectory. As Meyer (1973, 1989) argues, musical meaning emerges precisely through the tension between conformity and deviation from established patterns.
This stage also involves what might be termed "reverse-engineering" of musical materials: composing backwards from perceived structural needs rather than forward from spontaneous inspiration. When gaps are identified in the assembled materials, composers must generate sketches that fulfil specific functional requirements whilst maintaining stylistic consistency with existing materials. This process draws upon what Cenkerová and Parncutt (2015) identify as "style-dependent expectation," where compositional decisions must account for the specific predictive frameworks established by the emerging work itself.
5.3 Stage 3: Managing Musical Salience and Semantic Control
The evaluative dimension of assembly centres on the management of musical salience as theorised by Lerdahl (1992). Musical salience operates through multiple hierarchical levels (local accent patterns, metrical structure, harmonic rhythm, and large-scale formal articulation), creating what Lerdahl terms "event hierarchies" that guide listener attention and interpretation. Compositional success requires careful calibration of these hierarchical relationships to ensure that salient events correspond to structurally significant moments.
The challenge lies in distinguishing between intended and unintended salience. A musical gesture that appears unremarkable in sketch form may, when contextualised, create implications that exceed the composer's original intentions. Narmour's (1990, 1991) implication-realisation model demonstrates how musical meaning emerges through the dynamic relationship between expectation and fulfilment, making contextual positioning crucial to semantic control. What Narmour characterises as "bottom-up" processing (driven by immediate melodic and harmonic relationships) may conflict with "top-down" processing driven by learned stylistic patterns, creating interpretive ambiguities that composers must carefully manage.
Dibben's (1999) research on structural perception in atonal music reveals how salience functions differently across stylistic contexts. In tonal music, salience often aligns with functional harmonic relationships, whilst in atonal contexts, factors such as registral position, rhythmic placement, and timbral differentiation become primary salience markers. This research suggests that effective composition requires style-specific understanding of how musical elements achieve perceptual prominence.
The concept of "musical red herrings" (gestures that inadvertently signal structural significance through their perceptual prominence rather than their intended function) represents a critical challenge in salience management. Drawing from Schlenker's (2017) work on music semantics, these false signals can be understood as instances where the compositional "speaker meaning" diverges from the performative or receptive "audience meaning." Effective composition requires strategies for preventing such semantic drift whilst maintaining the spontaneity and surprise that characterise compelling musical discourse.
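One way to picture the comparison between intended function and perceptual prominence is the following sketch. It is purely illustrative: the three features, their weights, the threshold, and the example events are hypothetical values chosen here, not quantities drawn from Lerdahl, Dibben, or Schlenker. Each event receives a weighted salience score, and the score is then checked against the composer's own marking of structural significance.

```python
from dataclasses import dataclass

@dataclass
class Event:
    label: str
    loudness: float            # 0..1, relative dynamic level
    duration: float            # 0..1, normalised length
    register_extremity: float  # 0..1, distance from the prevailing register
    structural: bool           # composer's intention: is this a structural moment?

# Hypothetical weights; a real study would need to justify or fit these.
WEIGHTS = {"loudness": 0.4, "duration": 0.3, "register_extremity": 0.3}

def salience(event):
    """Weighted sum of the event's surface features."""
    return sum(weight * getattr(event, name) for name, weight in WEIGHTS.items())

def red_herrings(events, threshold=0.6):
    """Events that are perceptually prominent but not intended as structural."""
    return [e.label for e in events if salience(e) >= threshold and not e.structural]

def buried_events(events, threshold=0.6):
    """Events intended as structural that fall below the salience threshold."""
    return [e.label for e in events if salience(e) < threshold and e.structural]

EVENTS = [
    Event("cadence, bar 8", 0.8, 0.9, 0.3, structural=True),
    Event("ornamental flourish, bar 5", 0.9, 0.4, 0.9, structural=False),
    Event("theme return, bar 17", 0.3, 0.5, 0.2, structural=True),
    Event("passing chord, bar 6", 0.3, 0.2, 0.2, structural=False),
]

print("Red herrings:", red_herrings(EVENTS))
print("Buried structural events:", buried_events(EVENTS))
```

With these invented values the flourish in bar 5 is flagged as a red herring and the thematic return in bar 17 as under-projected. The sketch also shows the complementary failure: a structural event that does not achieve enough prominence to be heard as such.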
6. The Chain of Intentional Agents
This compositional model operates within what can be understood as a series of intentional agents, each contributing to the ultimate musical discourse. The theoretical framework for understanding this chain draws from Cumming's (2000) work on musical subjectivity, which reveals how musical meaning emerges through the interaction of multiple interpretive perspectives rather than through simple authorial intention.
The composer's initial intentionality represents what Nattiez (1990) terms the "poietic" dimension of musical creation. However, this intentionality undergoes transformation as the emerging work develops its own internal logic: what might be termed the "Narrator's Intentionality" of the piece itself. This concept aligns with Hatten's (2004) understanding of musical "agency," where musical works exhibit emergent properties that constrain and direct compositional choice-making. The work begins to "speak" through its own structural imperatives, creating a dialogue between creative agency and systematic necessity.
The transition from compositional to performative agency involves what Barthes (1977) characterises as the "grain of the voice": the irreducible physical and interpretive specificity that performers bring to musical realisation. Performers function as intermediate intentional agents, decoding compositional cues and re-encoding them through performance decisions that will ultimately guide listener comprehension. This process involves what Juslin and Lindström (2010) identify as the translation between "composed" and "performed" features, where structural indications must be converted into acoustical and gestural realities.
The performer's interpretive task is complicated by what London (2012) terms the "temporal" dimension of musical meaning. Unlike visual artworks, music unfolds through time, requiring performers to make real-time decisions about phrasing, timing, and emphasis that directly shape listener cognition. These decisions operate within the constraint networks established by the composition whilst drawing upon performance traditions and individual interpretive insight.
The listener serves as the final intentional agent in this chain, actively constructing musical meaning through what Schäfer et al. (2015) characterise as "emotional memory integration": the process whereby current musical experience is understood in relation to previous listening experiences. This process involves multiple cognitive systems operating simultaneously: expectation formation based on statistical learning (Pearce, 2018), emotional response mechanisms (Varga and Parkinson, 2025), and structural pattern recognition (Müller, 2015).
The composer's task involves not merely organising sounds, but curating the listener's cognitive pathway through the work. This requires understanding how musical structures interact with perceptual and cognitive processes to create what Huron (2006) terms "predictive engagement": the active mental participation that constitutes musical experience. Effective composition must therefore account for the statistical learning processes through which listeners develop stylistic competence whilst providing sufficient novelty to maintain engagement and interest.
7. Implications for Compositional Practice and Pedagogy
This multi-agent model suggests that effective composition requires simultaneous consideration of multiple interpretive perspectives, essentially functioning as cognitive modelling of musical communication. The sketching-assembling-evaluating framework provides a practical methodology for managing this complex interpretive ecology, offering systematic approaches to the contextual testing and refinement of musical materials.
The model highlights the inadequacy of purely grammatical approaches to compositional evaluation. Whilst musical grammar provides necessary constraints for material generation, the semantic function of musical gestures emerges only through contextual relationships. This observation aligns with Tagg's (2013) critique of formalist analytical approaches that divorce musical structure from communicative function. Compositional pedagogy might therefore benefit from greater emphasis on contextual evaluation skills alongside traditional grammatical training.
Furthermore, the model suggests that composition involves managing what Monelle (2000) terms "musical semiotics": the systems of signs and meanings through which musical communication operates. This requires understanding not only how individual musical elements function, but how they combine to create larger-scale semantic networks. Such understanding demands integration of theoretical knowledge about musical structure with empirical research on musical cognition and perception.
The framework also has implications for understanding compositional style and historical change. As musical languages evolve, the relationships between grammar, salience, and meaning shift, requiring composers to adapt their contextual evaluation strategies. This perspective offers a dynamic view of musical style that emphasises the cognitive and communicative dimensions of stylistic change rather than purely formal or technical considerations.
9. Parallels in Poetic Composition: The Sketching-Assembling Model in Literary Practice
The sketching-assembling-evaluating model proposed here for musical composition finds remarkable parallels in poetic practice, suggesting that this approach to creative work may reflect broader cognitive principles of artistic creation. An examination of documented poetic processes reveals that many poets employ remarkably similar strategies of fragment collection, contextual assembly, and semantic evaluation.
Emily Dickinson's compositional method provides perhaps the best-documented example of poetic sketching and assembly. Dickinson left approximately 100 fragments or scraps, which scholar Marta Werner characterises as "extrageneric" documents that exist between prose and poetry. Her process involved jotting first drafts on odd scraps of paper, which were later transcribed and "neatly copied in ink on sheets of folded stationery which she arranged in groups, usually of sixteen to twenty-four pages, and sewed together into packets or fascicles". These fascicles recorded "the variations in word choice Dickinson considered," demonstrating her evaluative process.
The parallel to musical sketching becomes particularly evident in Dickinson's practice of jotting down "single lines and raw snatches" of poetry, which functioned as poetic equivalents to musical gestures or harmonic sketches. Like musical materials that change function when placed in different contextual frameworks, Dickinson's fragments underwent transformation when assembled within the formal constraints of her fascicles.
Walt Whitman's compositional approach demonstrates a similar process of collection and assembly on a larger scale. The Thomas Biggs Harned Collection of Walt Whitman Papers documents his extensive use of "notes and notebooks" in developing his poetry. Whitman's expansion of Leaves of Grass from "a slim book of 12 poems" in 1855 to "a thick compendium of almost 400" by his death exemplifies the assembly process, where individual poetic fragments were continuously reorganised and recontextualised.
Contemporary poets have continued this tradition of fragment-based composition. Mark Strand, in his interviews and essays on poetic craft, described collecting what he termed "scraps of language" in notebooks: fragmentary phrases and observations that would later find their way into finished poems through a process he characterised as "assembling the pieces" (Strand, 1990). This approach mirrors the musical sketching process, where individual materials await contextual integration.
Mary Oliver's compositional practice provides another well-documented example of the sketching-assembly approach. Oliver, described as "an avid walker," often pursued inspiration on foot, and her papers include extensive notebooks documenting her observations during these walks. She would jot down fragments (single lines, image sketches, or brief nature observations), which were later developed and combined into complete poems. The Mary Oliver Papers, comprising "some 40,000 items in more than 118 containers," include extensive documentation of this process through her notebooks and draft materials.
This poetic sketching-assembly process exhibits several characteristics that parallel the musical model proposed here. First, both involve the generation of fragmentary materials that possess local coherence but remain semantically indeterminate until contextualised. Second, the assembly process requires continuous evaluation of how fragments interact to create meaning, with particular attention to what might be termed "poetic salience": lines or images that draw attention through their contextual prominence rather than their intended function.
The evaluative dimension of poetic assembly also involves managing reader cognition in ways analogous to the composer's management of listener expectation. Poets must consider how the juxtaposition of fragments will guide reader interpretation, ensuring that salient elements contribute to rather than distract from the overall poetic discourse. This suggests that the multi-agent model of creative communication (involving creator, work-as-narrator, intermediate interpreter, and ultimate receiver) may operate across artistic domains.
The existence of parallel processes in poetry supports the hypothesis that the sketching-assembling-evaluating model reflects fundamental cognitive principles of creative work. Both musical and poetic compositions appear to involve similar challenges in managing the transformation of meaning through contextual relationships, suggesting that insights from one domain might inform understanding of creative processes more generally. This cross-disciplinary perspective offers potential for developing more comprehensive theories of artistic creation that account for both the cognitive and communicative dimensions of creative practice.
10. Conclusion
The sketching-assembling-evaluating model provides a framework for understanding compositional process that accounts for both the creative and cognitive dimensions of musical communication. By recognising composition as a multi-agent communicative act, the model offers practical insights for composers whilst contributing to broader theoretical discussions about musical meaning and interpretation.
Future research might explore how this model applies across different musical styles and cultural contexts, investigate the specific cognitive processes involved in contextual evaluation, and examine how technological tools might support composers in managing the complex relationships between musical materials and their emergent meanings. The model's emphasis on the communicative function of musical structure also suggests fruitful connections with research in musical emotion, cross-cultural musical cognition, and the social dimensions of musical meaning.
References
Agawu, V.K. (1991) Playing with signs: a semiotic interpretation of classic music. Princeton: Princeton University Press.
Barthes, R. (1977) 'The grain of the voice', in Image-music-text, pp. 179-189. New York: Hill and Wang.
Bregman, A.S. (1990) Auditory scene analysis: the perceptual organization of sound. Cambridge, MA: MIT Press.
Bussick, J. (2017) 'Sketching music: stage two of the composing process', Art of Composing. Available at: https://www.artofcomposing.com/sketching-music-stage-two-of-the-composing-process (Accessed: 19 August 2025).
Cenkerová, Z. and Parncutt, R. (2015) 'Style-dependency of melodic expectation: changing the rules in real time music perception', Music Perception, 33(1), pp. 110-128. doi: 10.1525/mp.2015.33.1.110.
Cumming, N. (2000) The sonic self: musical subjectivity and signification. Bloomington: Indiana University Press.
Dibben, N. (1999) 'The perception of structural stability in atonal music: the influence of salience, stability, horizontal motion, pitch commonality, and dissonance', Music Perception: An Interdisciplinary Journal, 16(3), pp. 265-294. doi: 10.2307/40285794.
Hatten, R.S. (1994) Musical meaning in Beethoven: markedness, correlation, and interpretation. Bloomington: Indiana University Press.
Hatten, R.S. (2004) Interpreting musical gestures, topics, and tropes: Mozart, Beethoven, Schubert. Bloomington: Indiana University Press.
Hickey, M. (2012) Music outside the lines: ideas for composing in K-12 music classrooms. Oxford: Oxford University Press.
Huron, D. (2006) Sweet anticipation: music and the psychology of expectation. Cambridge, MA: MIT Press.
Juslin, P.N. and Lindström, E. (2010) 'Musical expression of emotions: modelling listeners' judgements of composed and performed features', Music Analysis, 29(1/3), pp. 334-364.
Kerman, J. (1982) 'Sketch studies', in The New Grove Dictionary of Music and Musicians. London: Macmillan.
Klein, M.L. (2005) Intertextuality in Western art music. Bloomington: Indiana University Press.
Krumhansl, C.L. (2001) Cognitive foundations of musical pitch. Oxford: Oxford University Press.
Larson, S. (2012) Musical forces: motion, metaphor, and meaning in music. Bloomington: Indiana University Press.
Lerdahl, F. (1987) 'Timbral hierarchies', in Howell, P., West, R., and Cross, I. (eds.) Music, mind, and culture: the perception of musical structures. Oxford: Oxford University Press.
Lerdahl, F. (1989) 'Atonal prolongational structure', Contemporary Music Review, 4(1), pp. 65-87.
Lerdahl, F. (1992) 'Salience and the structure of music', Music Perception, 10(1), pp. 91-114.
Lerdahl, F. and Jackendoff, R. (1983) A generative theory of tonal music. Cambridge, MA: MIT Press.
London, J. (2012) Hearing in time: psychological aspects of musical meter (2nd ed.). Oxford: Oxford University Press.
Meyer, L.B. (1973) Explaining music: essays and explorations. Berkeley: University of California Press.
Meyer, L.B. (1989) Style and music: theory, history, and ideology. Philadelphia: University of Pennsylvania Press.
Monelle, R. (2000) The sense of music: semiotic essays. Princeton: Princeton University Press.
Müller, M. (2015) Fundamentals of music processing: audio, analysis, algorithms, applications. Cham: Springer.
Narmour, E. (1977) Beyond Schenkerism: the need for alternatives in music analysis. Chicago: University of Chicago Press.
Narmour, E. (1990) The analysis and cognition of basic melodic structures. Chicago: University of Chicago Press.
Narmour, E. (1991) 'The top-down and bottom-up systems of musical implication', Music Perception, 9(1), pp. 1-26.
Nattiez, J.-J. (1990) Music and discourse: toward a semiology of music. Translated by C. Abbate. Princeton: Princeton University Press.
Pearce, M.T. (2018) 'Statistical learning and probabilistic prediction in music cognition: mechanisms of stylistic enculturation', Annals of the New York Academy of Sciences, 1423(1), pp. 378-395.
Peirce, C.S. (1931-1958) Collected papers of Charles Sanders Peirce (Vols. 1-8). Cambridge, MA: Harvard University Press.
Saussure, F. de (1916) Course in general linguistics. Translated by W. Baskin. New York: Philosophical Library.
Schäfer, T., Huron, D., Shanahan, D. and Sedlmeier, P. (2015) 'How we remember the emotional intensity of past musical experiences', Frontiers in Psychology, 5, article 911. doi: 10.3389/fpsyg.2014.00911.
Schlenker, P. (2017) 'Outline of music semantics', Music Perception, 35(1), pp. 1-35.
Strand, M. (1990) 'Notes on the craft of poetry', in The Weather of Words: Poetic Invention. New York: Knopf, pp. 15-32.
Tagg, P. (2013) Music's meanings: a modern musicology for non-musos. New York: Mass Media Music Scholars' Press.
Tymoczko, D. (2011) A geometry of music: harmony and counterpoint in the extended common practice. Oxford: Oxford University Press.
Tyson, A. (1987) Mozart: studies of the autograph scores. Cambridge, MA: Harvard University Press.
Werner, M. (1999) Emily Dickinson's open folios: scenes of reading, surfaces of writing. Ann Arbor: University of Michigan Press.