We extend the evaluation of the generative models of music transcriptions first presented in Sturm, Santos, Ben-Tal, and Korshunova (2016). We evaluate the models in five different ways: 1) at the population level, comparing statistics of 30,000 generated transcriptions with those of over 23,000 training transcriptions; 2) at the practice level, examining the ways in which specific generated transcriptions are successful as music compositions; 3) as a “nefarious tester”, seeking the limits of the models' music knowledge; 4) in the context of assisted music composition, using the models to create music within the conventions of the training data; and finally, 5) taking the models to real-world music practitioners. Our work demonstrates new approaches to evaluating the application of machine learning methods to modelling and making music, and the importance of taking the results back to the realm of music practice to judge their usefulness. Our datasets and software are open and available at https://github.com/IraKorshunova/folk-rnn.

How to Cite
Sturm, B. L. & Ben-Tal, O. (2017) “Taking the Models back to Music Practice: Evaluating Generative Transcription Models built using Deep Learning”, Journal of Creative Music Systems, 2(1).

Bob L. Sturm (Queen Mary University of London)
Oded Ben-Tal (Kingston University)

Creative Commons Attribution 4.0

Peer Review

This article has been peer reviewed.
