Beneath (or beyond) the surface: Discovering voice-leading patterns with skip-grams

David R.W. Sears, Gerhard Widmer

Research output: Contribution to journal › Article › peer-review

1 Scopus citation


Recurrent voice-leading patterns like the Mi-Re-Do compound cadence (MRDCC) rarely appear on the musical surface in complex polyphonic textures, so finding these patterns using computational methods remains a tremendous challenge. The present study extends the canonical n-gram approach by using skip-grams, which include sub-sequences in an n-gram list if their constituent members occur within a certain number of skips. We compiled four data sets of Western tonal music consisting of symbolic encodings of the notated score and a recorded performance, created a model pipeline for defining, counting, filtering, and ranking skip-grams, and ranked the position of the MRDCC in every possible model configuration. We found that the MRDCC receives a higher rank in the list when the pipeline employs 5 skips, filters the list by excluding n-gram types that do not reflect a genuine harmonic change between adjacent members, and ranks the remaining types using a statistical association measure.
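To make the skip-gram idea concrete, the following is a minimal Python sketch, not the authors' pipeline. It assumes that "within a certain number of skips" means each pair of adjacent members may be separated by at most `max_skip` intervening events; the abstract does not state whether the limit applies per gap or to the whole pattern. The function name `skip_grams`, the toy sequence `surface`, and its labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def skip_grams(seq, n, max_skip):
    """Yield every n-tuple drawn in order from seq such that adjacent
    members are separated by at most max_skip intervening events.

    ASSUMPTION: the skip limit applies to each gap separately.
    With max_skip=0 this reduces to ordinary contiguous n-grams.
    """
    def extend(prefix, last):
        # prefix holds the members chosen so far; last is the index
        # of the most recently chosen member.
        if len(prefix) == n:
            yield tuple(prefix)
            return
        stop = min(last + max_skip + 2, len(seq))
        for nxt in range(last + 1, stop):
            yield from extend(prefix + [seq[nxt]], nxt)

    for start in range(len(seq)):
        yield from extend([seq[start]], start)


# Hypothetical toy surface: a Mi-Re-Do descent interrupted by two
# unrelated events, labeled "x" and "y" here.
surface = ["mi", "x", "re", "y", "do"]

contiguous = Counter(skip_grams(surface, n=3, max_skip=0))
with_skips = Counter(skip_grams(surface, n=3, max_skip=1))

print(("mi", "re", "do") in contiguous)  # pattern hidden on the surface
print(("mi", "re", "do") in with_skips)  # pattern recovered with one skip
```

Allowing skips also admits many sub-sequences that are not meaningful patterns, which is why the study follows counting with filtering and ranking steps.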

Original language: English
Pages (from-to): 1-26
Number of pages: 26
Journal: Journal of Mathematics and Music
State: Published - 2020


  • Skip-gram
  • cadence
  • cadential six-four
  • collocation
  • multi-word expression
  • n-gram
  • pattern discovery
  • tonal music
  • voice-leading pattern

