Analysing Musical Performances


Music plays a central role in our everyday life. A recent report has estimated that the average person listens to music for 961 hours per year, equivalent to over 40 continuous days. And that’s not counting all the other forms of media we might engage with that feature music alongside other content, including television, films, and advertising. With this in mind, it’s not surprising that studying the ways in which music is performed and listened to can tell us a lot about many different aspects of human experience, including our facility for creation, perception, and cognition. In this resource, we explore how and why we might want to analyse musical performances in a scientific manner, including the types of scientific questions we might hope to answer, the methods that are appropriate for extracting and analysing data from a performance, and the potential real-world applications of this work.

Why study music performance?

As noted in the introduction, researchers may have many different
reasons for wanting to study music performance. To the psychologist,
studying musical performances allows them to consider how individuals
create, process information, and communicate through a shared,
non-verbal “language”. To the historian, studying performances enables
them to consider the nature of musical interpretation and changes in
style over time. Finally, to the data scientist, studying performance
allows for the development of classification and recommendation
algorithms that enhance music discovery, user engagement, and
personalised content delivery. However, regardless of their field, there
are broad similarities in how researchers extract and analyse data from
musical performances.

The scientific study of
musical performance dates back to the late 1800s; however, progress has
invariably been constrained by the pace of technological development. In the
earliest studies of performances, measurements had to be made manually from
musical instruments; one experiment involved attaching magnets to the keys of a
piano keyboard, which in turn were placed underneath a spinning metallic drum.
As the performer pressed the keys, these “drew” traces onto the drum, and the
length of the trace could then be measured to estimate how long the key was
held for. Nowadays, with the advent of music streaming and digital downloads,
we can collect data automatically from thousands of hours of audio recordings,
without even having to conduct an experiment in person.

Collecting data from musical performances

We have already touched on two methods for collecting quantitative data
from musical performances: 1) conducting experiments, and 2) analysing
recordings. Experimental studies involve participants performing music
in a controlled environment. We manipulate some aspect of the
performance situation, and this manipulation becomes our “independent
variable”. We then study the effect that this manipulation has on one or
more other variables, called our “dependent variables”. We can measure
these by collecting audio and video recordings of the performers, as
well as “self-report” data like questionnaires and interviews.
In an experiment, we don’t have to manipulate the performance context in an
extreme manner. One example involves an experiment where the members of a
string quartet were instructed to perform the same piece several times, either
“expressively” or “unexpressively”. The researchers then studied how their body
sway changed in response to the instructions they were given.

The second method to collect quantitative
data involves working with commercial audio or video recordings, such as those
we’d find on CDs or streaming services. In this case, we no longer have an
explicit “independent” variable: we’re not manipulating anything to do with the
performance context. But we can still study how musical factors might differ between
performers or historical eras, for instance, and these categories can become
equivalent to our independent variable. The main disadvantage of studying recordings,
as opposed to running experiments, is that we can’t control the quality of the
performances. For instance, when working with historical recordings, we often
have to consider the age of the recorded medium: recordings made on tape or
vinyl might be noisy or play back slower or faster than the actual performance,
and this can affect the quality of the data we extract. However, working with
recordings is often the only way to study important performers or performances,
as well as to study how trends in musical performance have evolved over time.
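
The effect of playback speed on extracted data can be sketched with a little arithmetic. When a tape or vinyl recording plays back at the wrong speed, tempo and pitch are scaled by the same ratio, so if the ratio can be estimated (for example from the pitch drift), the measured tempo can be corrected. The function names and the example values below are hypothetical and chosen only for illustration.

```python
import math


def corrected_tempo(measured_bpm, speed_ratio):
    """Estimate the tempo of the actual performance.

    speed_ratio is playback speed divided by the true speed
    (values above 1 mean the recording plays back too fast).
    """
    if speed_ratio <= 0:
        raise ValueError("speed_ratio must be positive")
    return measured_bpm / speed_ratio


def pitch_shift_semitones(speed_ratio):
    """Pitch shift, in equal-tempered semitones, caused by the speed change."""
    return 12 * math.log2(speed_ratio)


# Illustrative values only: a transfer assumed to run 5% fast,
# with a tempo of 126 beats per minute measured from the audio.
ratio = 1.05
tempo = corrected_tempo(126.0, ratio)
shift = pitch_shift_semitones(ratio)
print(f"Corrected tempo: {tempo:.1f} bpm, pitch shift: {shift:.2f} semitones")
```

In practice the speed ratio is rarely known exactly, so a correction like this is best treated as an estimate with its own uncertainty.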

Regardless of how we have
collected our quantitative data, we also need to think about how we can analyse
it. Here, we are generally concerned with “features” (sometimes also known as
attributes). Each feature represents a measurable piece of information that can
be used in analysis. If we consider the “features” of an animal, this might
include its species, height, weight, and age; for a musical performance, this
could include its tempo, loudness, and duration. When we represent this data in
a table, we would typically see our “observations” (individual performances) as
rows, and our features as columns. We can then use our features to develop models
to help explain our research questions: going back to our earlier string
quartet example, we could produce a regression model that predicts how much a
performer will move based on whether or not we told them to play expressively.
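
A minimal sketch of this idea is shown below, assuming a small table in which each row is one observation and each key is a feature. All of the values, the feature names, and the helper `fit_simple_regression` are invented for illustration; they are not data from the string quartet study. The model is an ordinary least-squares regression with a single predictor: whether the performer was told to play expressively (1) or unexpressively (0).

```python
from statistics import mean

# Each row is one observation (a performance); each key is a feature.
# All values are hypothetical.
performances = [
    {"performer": "violin 1", "expressive": 1, "tempo_bpm": 96, "sway": 4.1},
    {"performer": "violin 1", "expressive": 0, "tempo_bpm": 98, "sway": 2.9},
    {"performer": "cello", "expressive": 1, "tempo_bpm": 96, "sway": 5.0},
    {"performer": "cello", "expressive": 0, "tempo_bpm": 98, "sway": 3.6},
]


def fit_simple_regression(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Independent variable: the instruction. Dependent variable: body sway.
xs = [row["expressive"] for row in performances]
ys = [row["sway"] for row in performances]
intercept, slope = fit_simple_regression(xs, ys)

print(f"Predicted sway when unexpressive: {intercept:.2f}")
print(f"Predicted sway when expressive:   {intercept + slope:.2f}")
```

With a yes/no predictor like this one, the intercept is the average sway in the unexpressive condition and the slope is the average change associated with the expressive instruction. A real analysis would use many more observations and would usually account for repeated measurements from the same performer.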

Real-world applications

There are many possible
applications for this work, some of which you may already be familiar with. One
example includes the development of music recommendation systems. When a
streaming platform like Spotify or YouTube recommends content for you to listen
to next, it needs to be able to extract a variety of different audio features
from the tracks you’ve already heard. Some of these features are relatively
straightforward, like the mode (major or minor key) and tempo. Others are more
complex: for instance, a track’s “danceability” refers to how suitable it is
for dancing, based on a combination of its tempo, rhythmic stability, beat
strength, and overall regularity, while “energy” considers dynamic range, loudness,
and musical density to provide a score for the perceived intensity and activity
of a track. Once these features have been extracted, they can then be
cross-referenced with your listening habits (e.g., the genres and artists you
typically prefer listening to, as well as related genres and artists) and other
metadata in order to compile personalised recommendations for you.
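
The general idea of matching tracks by their audio features can be sketched as follows. This is an illustrative content-based approach using cosine similarity, not a description of how Spotify, YouTube, or any other platform actually works. The track names, the feature values, and the `recommend` function are all hypothetical.

```python
import math

# Hypothetical feature vectors: (danceability, energy, tempo scaled to 0-1).
catalogue = {
    "track_a": (0.80, 0.70, 0.60),
    "track_b": (0.20, 0.30, 0.40),
    "track_c": (0.75, 0.80, 0.65),
}


def cosine_similarity(u, v):
    """Similarity of two feature vectors, based on the angle between them."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def recommend(listened, catalogue, k=1):
    """Rank unheard tracks by similarity to the average of the tracks heard."""
    heard = [catalogue[name] for name in listened]
    # The listener "profile" is the feature-wise mean of the heard tracks.
    profile = tuple(sum(values) / len(heard) for values in zip(*heard))
    candidates = [name for name in catalogue if name not in listened]
    ranked = sorted(
        candidates,
        key=lambda name: cosine_similarity(profile, catalogue[name]),
        reverse=True,
    )
    return ranked[:k]


top = recommend(["track_a"], catalogue)
print(top)
```

Here the unheard track whose features point in the most similar direction to the listener's profile is ranked first. Real systems combine feature similarity with listening history and other metadata, as described above.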

Resource activities

Complete this short worksheet to consolidate your learning


Activity questions

  • How might researchers in different fields and areas use data extracted from musical performances in their work?
  • What are the advantages and disadvantages of different methods for gathering data from musical performances?
  • Describe two potential applications for music performance analysis in the real world.

Reflective questions

Task 1

What are the key arguments, concepts, and points contained within this resource?

Task 2

What are you struggling to understand?

What could you do to improve your understanding of these concepts/terminology etc.?

Task 3

What further questions has this resource raised for you?

What else are you keen to discover about this topic and how could you go about learning more?

Can you make any links between this topic and your prior knowledge or school studies?


Further reading

  • McMaster University LiveLAB

    A state-of-the-art concert hall/laboratory hybrid for conducting research into musical performances.

  • Spotify Audio Documentation

    Describes the types of audio features that are extracted from a track automatically by Spotify and used to make judgements as part of their classification and recommendation algorithms.

  • Music Information Retrieval

    Walks through the process by which features can be extracted from audio recordings.

  • Jazz Duo Analysis

    A recent study of coordination and interaction in improvising groups of jazz musicians.