Audio-based music classification with a pretrained convolutional network


(2011) Proceedings of the 12th International Society for Music Information Retrieval Conference: Proc. ISMIR 2011. p. 669-674

Abstract: Recently the ‘Million Song Dataset’, containing audio features and metadata for one million songs, was made available. In this paper, we build a convolutional network that is then trained to perform artist recognition, genre recognition and key detection. The network is tailored to summarize the audio features over musically significant timescales. It is infeasible to train the network on all available data in a supervised fashion, so we use unsupervised pretraining to be able to harness the entire dataset: we train a convolutional deep belief network on all data, and then use the learnt parameters to initialize a convolutional multilayer perceptron with the same architecture. The MLP is then trained on a labeled subset of the data for each task. We also train the same MLP with randomly initialized weights. We find that our convolutional approach improves accuracy for the genre recognition and artist recognition tasks. Unsupervised pretraining improves convergence speed in all cases. For artist recognition it improves accuracy as well.
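The two ideas in the abstract (convolving over time to summarize per-segment features, and starting supervised training from parameters learnt elsewhere rather than from random values) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The feature dimensionality, filter width, pooling size, rectified activation, and max-pooling are all assumptions made for illustration, and the "pretrained" parameters here are only a random stand-in for what a convolutional deep belief network would learn.

```python
import numpy as np


def conv1d_over_time(x, filters, bias):
    """Valid 1D convolution along the time axis, followed by rectification.

    x:       (T, D) sequence of D-dimensional feature vectors
    filters: (K, W, D) K filters, each spanning W consecutive frames
    bias:    (K,)
    returns: (T - W + 1, K) non-negative activations
    """
    T, _ = x.shape
    K, W, _ = filters.shape
    out = np.empty((T - W + 1, K))
    for t in range(T - W + 1):
        window = x[t:t + W]  # (W, D) slice of consecutive frames
        out[t] = np.tensordot(filters, window, axes=([1, 2], [0, 1])) + bias
    return np.maximum(out, 0.0)  # assumption: rectified units


def max_pool_time(h, size):
    """Non-overlapping max-pooling along time; trailing frames are dropped."""
    n = h.shape[0] // size
    return h[:n * size].reshape(n, size, h.shape[1]).max(axis=1)


def init_params(rng, n_filters, width, n_dims, pretrained=None):
    """Return layer parameters, copied from `pretrained` if given, else random."""
    if pretrained is not None:
        return {name: value.copy() for name, value in pretrained.items()}
    return {
        "filters": 0.01 * rng.standard_normal((n_filters, width, n_dims)),
        "bias": np.zeros(n_filters),
    }


rng = np.random.default_rng(0)

# Hypothetical input: 120 segments with 24 features each.
x = rng.standard_normal((120, 24))

# Stand-in for parameters learnt without labels (here just random values).
pretrained = init_params(rng, n_filters=8, width=4, n_dims=24)

# Same architecture, two starting points: pretrained vs. random.
params_pre = init_params(rng, 8, 4, 24, pretrained=pretrained)
params_rand = init_params(rng, 8, 4, 24)

h = conv1d_over_time(x, params_pre["filters"], params_pre["bias"])
pooled = max_pool_time(h, size=4)
print(h.shape, pooled.shape)  # (117, 8) (29, 8)
```

Because both starting points share one architecture, the only difference between the two supervised runs is the initial value of the parameters, which is what makes a comparison of convergence speed and accuracy meaningful.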


Authors: Sander Dieleman, Philémon Brakel and Benjamin Schrauwen


