Music Information Retrieval - Applications to Jazz Discography

A major new phase of J-DISC will be to explore the ways that Music Information Retrieval can help identify and document jazz (particularly the burgeoning number of online files which may appear with poor session information), and in turn to ask how questions raised by jazz, with its iterative improvisations on similar material, densely layered polyrhythms, and complex melodic style, can inform the field of MIR.

We will be joined in this upcoming research project by engineers who specialize in music signal processing and computer music. This thread may be a useful place to hear from discographers or jazz historians about how MIR might be useful to the field, and to start a dialogue with people on the engineering side about the issues or goals of the research.

This sounds like potentially a very rewarding direction to pursue, even if at this point it's pure research without a specific application in mind. The idea of actually analyzing jazz recordings to see if there are characteristics that could be used to positively identify an individual recording regardless of the media it’s on is great, and might resolve a lot of the problems of the digital download age. A book like the Coltrane Reference is potentially obsolete, not so much because the information goes stale so quickly as because the information captured in the book is slowly becoming less relevant to the music. By that I mean, discographies catalogue “issues”, but for one thing there are so many reissues and off-brand issues that it becomes impossible to keep up with them (and doing so is probably of little value to a collector or scholar). And as physical representations are replaced by digital representations, the ability of a discographer to say “if your source is this issue with this catalogue number, the music you are hearing is exactly this” slowly fades.

So signal analysis offers a couple of possibilities. Of primary importance, it might give us a replacement for the catalogue number as a means of uniquely identifying a recording (or at least offer an alternate identifier). If we can analyze a recording the way we analyze a thumbprint and identify it uniquely by a set of audio artifacts, we can use that unique set of data as a way to identify otherwise untagged music. I could see J-DISC offering a service where a subscriber could present a music file (MP3 or whatever), J-DISC could compare that file to digital records in the database, identify the recording, and present the subscriber with a match in the discography. That would be amazing. Of course there are many major issues. My experience doing that the old-fashioned way (running recordings in tandem on two tape recorders) tells me, for example, that you have to decide immediately how to pitch-adjust the two files being compared, and that you have to automate the human ear’s ability to exclude noise and focus on the elements that have to be compared. I've done comparisons more recently with notebooks running CDs, where pitch adjustment is easier--but ultimately it's still the ears that have to decide.

A software equivalent comparing two recordings would have to be able to identify and ignore "noise" elements (an audience recording with crowd noise versus a soundboard recording of the same music, or a dub of a scratchy LP versus a pristine original studio tape). Having isolated the important elements, the software would have to deal with pitch variation--perhaps by being able to identify identical audio events occurring in an identical sequence with the only variation being time (when they occur). But the potential to preserve our ability to link a recording with its context and history is immense. --david wild
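To make the "identical events in an identical sequence, varying only in time" idea concrete, here is a minimal sketch in Python. It is not an existing MIR tool. It assumes each recording has already been reduced to a list of event onset times in seconds, with the noise stripped out (the hard part), and the function names and the 2% tolerance are made up for illustration. If one transfer is simply the other played at a different speed, every gap between events should be stretched by the same factor.

```python
from statistics import median

def intervals(onsets):
    """Gaps between consecutive event times, in seconds."""
    return [later - earlier for earlier, later in zip(onsets, onsets[1:])]

def speed_ratio(onsets_a, onsets_b, tolerance=0.02):
    """Return the estimated speed ratio if B looks like A replayed at a
    constant different speed, otherwise None.

    Assumption: both lists hold the same events in the same order, i.e.
    crowd noise, surface scratches, etc. were already filtered out.
    """
    gaps_a, gaps_b = intervals(onsets_a), intervals(onsets_b)
    if not gaps_a or len(gaps_a) != len(gaps_b):
        return None
    ratios = [b / a for a, b in zip(gaps_a, gaps_b) if a > 0]
    if not ratios:
        return None
    estimate = median(ratios)
    # Same performance: every gap is stretched by (nearly) the same factor.
    if all(abs(r - estimate) <= tolerance * estimate for r in ratios):
        return estimate
    return None

# Hypothetical onset times: B is A starting 1 s later and running 3% slow.
tape_a = [0.0, 0.5, 1.25, 2.0, 3.5]
tape_b = [1.0 + t * 1.03 for t in tape_a]
other_take = [0.0, 0.6, 1.1, 2.3, 3.4]

print(speed_ratio(tape_a, tape_b))      # roughly 1.03
print(speed_ratio(tape_a, other_take))  # None
```

A ratio above 1 means the second transfer runs slower (and so lower in pitch) than the first. A real comparison would also have to cope with missing or extra events, which this sketch deliberately ignores.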

Several brief observations. If the proliferation of issues makes them difficult to keep track of, I would think it makes the session-based principle of reference in traditional discography all the more important and central. Whatever the changes in distribution, people still make records in studios, and the information about it shouldn’t change with the playback carrier. (Whether they are capturing or sharing the session information is another story; and, yes, there is the world of splicing and other post-production editing.)

Secondly, MIR already has the technology to analyze and discriminate among production qualities of different studios or recording venues. We just need it for jazz, where it could help identify what is live versus studio, what is by a certain engineer or not, etc.

Finally, it would be very desirable to use MIR to identify a song and match it against a database of songs, but such a database has to be built up over time, which has many challenges. On the other hand, the text information in J-DISC (if more robustly populated) or any comprehensive discographic work would be sufficient, as long as the MIR process returned enough clues to be able to look it up. Does that make sense?


Taking a step back from the larger questions, what would we want to be able to do with a given performance?

For example, someone posts an audio recording of a live Coltrane gig to YouTube. We definitely want to know some things about it, maybe first and foremost whether it is the same as some other material, or whether it is related to other known recordings.

Another case is self-produced, independent sessions. My CD had a physical release, and my information went into Gracenote or CDDB (or whatever it's called today). Is this true for every EP that only results in an MP3?

Tad: is it always a professional studio? How about a basement studio, or a high-quality live recording?



AJ--Let me try to answer your questions, to the best of my limited knowledge at this point.

We now have a checklist of areas that MIR handles that we might want to apply to a given performance. That said, we're not aware of anyone doing it systematically for jazz and it's going to be up to a postdoctoral researcher we're just hiring to guide that exploration over two years.

We're interested in fingerprinting an artist through repetitive figures or timbre. That would be very valuable--and is now very difficult even at the state of the art. The problem seems to be separating out frequencies of accompanying instruments from the solo instrument.

Nevertheless, artists repeat themselves. It might be the least sexy part in terms of studying someone's art, but even the greats have their tics that they may play over and over. This is my idea and will have to be fleshed out.
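As a rough illustration of hunting for those recurring figures, here is a small Python sketch. It assumes the solo line has already been transcribed into MIDI-style note numbers, which, as noted above, is exactly the step that is still very difficult to automate. The function names and the toy note lists are hypothetical. Figures are stored as interval patterns (semitone steps), so the same lick played in a different key still counts as a match.

```python
from collections import Counter

def interval_ngrams(pitches, n=4):
    """Yield every run of n successive melodic intervals (in semitones)."""
    steps = [b - a for a, b in zip(pitches, pitches[1:])]
    for i in range(len(steps) - n + 1):
        yield tuple(steps[i:i + n])

def recurring_figures(solos, n=4, min_solos=2):
    """Interval patterns that turn up in at least min_solos different solos."""
    seen_in = Counter()
    for pitches in solos:
        # set() so a figure repeated inside one solo is counted once per solo
        seen_in.update(set(interval_ngrams(pitches, n)))
    return {fig: count for fig, count in seen_in.items() if count >= min_solos}

# Toy data: the second solo opens with the first solo's figure, a fifth higher.
solo_1 = [60, 62, 64, 65, 67, 72]
solo_2 = [67, 69, 71, 72, 74, 70]
print(recurring_figures([solo_1, solo_2]))  # {(2, 2, 1, 2): 2}
```

Counts like these would still be statistical evidence at best, since common scale runs are shared by many players.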

The problem with referring to a song--aside from multiple versions of the same song, that is--is that MIR has previously relied on testing for "covers" in pop, or standard notated rep in classical. In jazz, what counts as a "song" is much looser, and so the song seems an unreliable way to identify a recording. Set compositions, like bop heads or big band arrangements, might be a way to start, though that is hardly all of jazz.

Some interesting initial attempts have been made to identify chord sequences, but there is a long way to go before the complexity of substitutions or voicings could be handled. (It would still be statistical.)

As to how CDDB works: either someone inputs the text and it becomes shared knowledge, or a profile is made from track times of a given EP, or both. That said, I think Google/YouTube have something more sophisticated, because I've seen it happen. But it's under wraps right now.
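For the curious, the "profile made from track times" can be sketched in a few lines of Python. This follows the classic CDDB/freedb disc ID scheme as I understand it, so treat it as an illustration rather than a reference implementation. The function names and the example offsets are made up. Offsets are in CD frames (75 per second) and include the usual 150-frame lead-in.

```python
FRAMES_PER_SECOND = 75

def digit_sum(n):
    """Sum of the decimal digits of n."""
    return sum(int(d) for d in str(n))

def cddb_disc_id(track_offsets, leadout_offset):
    """Build an 8-hex-digit disc ID from track start times.

    track_offsets:  start of each track, in frames
    leadout_offset: start of the lead-out (end of the disc), in frames
    """
    starts = [offset // FRAMES_PER_SECOND for offset in track_offsets]
    checksum = sum(digit_sum(s) for s in starts) % 255
    total_seconds = leadout_offset // FRAMES_PER_SECOND - starts[0]
    disc_id = (checksum << 24) | (total_seconds << 8) | len(starts)
    return f"{disc_id:08x}"

# Hypothetical three-track EP, ten minutes long.
print(cddb_disc_id([150, 15000, 30000], 45000))  # 08025603
```

Note that nothing in this ID comes from the music itself, only from the track layout, which is why a download with no disc behind it has nothing to look up.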

As to production or studio quality: it may be useful for any kind of recording, as I imagine it. It can't identify a basement tape per se (though it might point out some quality threshold; I'm not sure). But it could distinguish it from one made in the studio of the same rep by the same artist--it can tell different studios, in other words, and that could be valuable. Even if someone goes back 4 months later and records the same thing, but at a different venue, it should be distinguishable.

If two possible venues were known and profiled, they could then be compared. E.g. the dates Rudy Van Gelder recorded, versus the ones that he didn't record but which were at some identifiable venue, because they were Prestige or Blue Note stalwarts. A hypothesis.
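A toy version of that two-venue comparison, in Python, might look like the following. Everything here is an assumption for illustration: each recording is stood in for by a short list of energy values in a few frequency bands, the numbers are invented, and real MIR systems use much richer features. The idea is only to show "profile the known venues, then ask which profile the unknown recording resembles more."

```python
from math import sqrt

def mean_profile(recordings):
    """Average the per-band values of several recordings from one venue."""
    count = len(recordings)
    return [sum(band) / count for band in zip(*recordings)]

def cosine(u, v):
    """Cosine similarity: 1.0 means the two profiles have the same shape."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def closest_venue(unknown, venue_profiles):
    """Name of the profiled venue whose average is most like the unknown."""
    return max(venue_profiles,
               key=lambda name: cosine(unknown, venue_profiles[name]))

# Invented band energies (low, mid, high) for sessions of known origin.
profiles = {
    "venue_a": mean_profile([[0.9, 0.5, 0.1], [0.8, 0.6, 0.2]]),
    "venue_b": mean_profile([[0.2, 0.5, 0.9], [0.3, 0.4, 0.8]]),
}
print(closest_venue([0.85, 0.5, 0.2], profiles))  # venue_a
```

This always returns one of the known venues, so it only makes sense when, as in the hypothesis above, the candidates have been narrowed down beforehand.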

The engineers and computer experts may want to weigh in here but it's up to discographers to list and tightly define what we want, as much as possible.


Absolutely, the classic session-centric model of traditional discographies becomes that much more valuable with the proliferation of leased, indifferently documented reissues, and the growth in downloaded files with no documentation at all. I didn't mean to imply that any of what we are about here becomes obsolete because of changes in distribution of the music. But the portion of a discography that identifies these sessions for end-users by identifying the media that contain them needs new tools to remain useful.


The promise of MIR for me lies in its potential to identify a specific version of a song from the music alone.  MIR's existing ability to analyze and discriminate among such elements as production qualities or venues is very useful of course.  But we started this last year with a discussion of the Gracenote database and its ability to identify a CD inserted into a computer, without reference to anything other than the files on the CD (and of course a huge database of some sort)--and MIR capability in J-DISC would seem to offer us as jazz scholars a similar tool. 


My interest in this potential in MIR is undoubtedly a result of my work on the Coltrane discography.  Coltrane's European tours (1961, 1962 and 1963) produced multiple versions of standard songs in the band's repertoire.  These have shown up in a variety of collector-traded tapes, bootlegs and legitimate issues, often incorrectly documented, and occasionally touted as newly discovered (when in fact they usually are not).  And of course these kinds of problems are not unique to Coltrane's body of work. 


A scholar-oriented database like J-DISC equipped with MIR would seem potentially to offer considerable assistance in these research tasks. And of course it would take time to build up the database of samples to allow for this kind of research--but it's going to take time to build up the text information in J-DISC as well. As you say, the text information in J-DISC or other comprehensive discographies is often sufficient to identify songs, but the more support something like MIR can provide, the more accurate and quick that process becomes. --david wild

And thanks to Aaron for mentioning YouTube. A wide variety of material gets posted to YouTube, usually without much information about provenance, personnel, etc. Coltrane is a good example, but consider Wayne Shorter, whose current working quartet (with Danilo Perez, John Patitucci and Brian Blade) is very poorly represented in commercial studio recordings, but well documented in concert recordings and broadcasts. Jazz researchers in the future who study Shorter's work are going to have to deal with this body of non-commercial recordings as documentation of the development of the group. In cataloguing some of those recordings I've found audience video posted to YouTube of concerts that were otherwise unknown. The possibility of a program that could identify such recordings, or at least reduce the number that have to be reviewed to identify the recording, is tantalizing... --david wild


Thanks for the examples of fertile ground for MIR research. I'm now going through everything the Center for Jazz Studies has in digital form to load into a test dataset. One thing that turned up was a box set called John Coltrane: Live Trane - The European Tours (Pablo 7PACD-4433-2). Does that contain some of the material you're referring to? What's the quality of the info in the set, if you're familiar with it?

That's not to say that either the CDs in this collection or its notes resolve the issues. I just want to identify recordings we might use in the artificial "lab" experiment to see what can eventually be machine-identified and what cannot. (This box set is one of thousands of discs bequeathed to us that had never been filed or catalogued until now.) One task in the process would be to listen to them and see what areas of continuity there are, then "train" our applications to find them. With Bird, or Trane, or Red Nichols or Grover Washington or whoever.

Yes, the value added by seasoned discography to the field of jazz studies and to the industry should be immense.

For an example of a problem that could be solved by signal analysis, see this J-DISC record:



There is some question whether tracks from an Attila Zoller film score were recorded on a different date, with a different group, than those with similar titles referenced in the above session. If there were a way to cross-reference the signal from a given studio (one of the two potentially involved here), MIR techniques already in existence could potentially help. Thanks to John Szwed for pointing this out.