Big Data's Unexplored Frontier: Recorded Music
by The Daily Eye Team, January 10 2017, 2:33 pm
Machine learning is a vast field, yet a huge part of it is devoted to what may seem a relatively narrow subset of problems: those involving visual processing, such as character recognition, facial recognition, and the generation of trippy images dominated by populations of dogslugs, birdlegs, and spidereyes. This isn't accidental. Image data is unusually well suited to machine learning tasks, because it naturally occurs as multidimensional arrays (tensors, really) of pixel data. Audio data gets its turn mostly at the fringes of machine learning. Part of the problem is that, despite the vast amounts of digital audio data that exist in the world, there is a relative lack of openly accessible computational datasets. There's pretty much just one, actually: the Million Song Dataset, which offers some 280 GB of feature data extracted from 1 million audio tracks. Musicology remains largely old-school.
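To make the "tensors of pixel data" point concrete, here is a minimal sketch in plain Python. The tiny image, the short audio clip, and the `shape` helper are all hypothetical, invented purely for illustration. For contrast, the raw mono audio is shown as what it typically is before any feature extraction: a single sequence of samples along one time axis.

```python
def shape(data):
    """Return the dimensions of a regularly nested list as a tuple."""
    dims = []
    while isinstance(data, list):
        dims.append(len(data))
        data = data[0]
    return tuple(dims)

# Hypothetical toy image: 2 rows x 3 columns x 3 colour channels
# (red, green, blue), each value in the range 0-255.
image = [
    [[255, 0, 0], [0, 255, 0], [0, 0, 255]],
    [[0, 0, 0], [128, 128, 128], [255, 255, 255]],
]

# Hypothetical toy audio clip: raw mono samples, one amplitude per
# time step, so there is only a single axis.
audio = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]

image_shape = shape(image)  # height, width, channels
audio_shape = shape(audio)  # number of samples
print(image_shape, audio_shape)
```

The image arrives already organised along height, width, and colour, while the audio clip only takes on a richer structure once features are extracted from it, which is the kind of precomputed data the Million Song Dataset provides.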