In this paper, we propose a method for detecting traces of lossy compression encoding, such as MP3 or AAC, in PCM audio. The method is based on a convolutional neural network (CNN) applied to audio spectrograms and trained on the output of various lossy audio codecs at various bitrates. Our method shows good performance on a large database and is robust to codec type and resampling.
The core idea is that lossy compression leaves traces in the spectrogram of processed files, namely holes (areas of the time-frequency plane where values are set to zero), frequency band cuts, and clusters.
Given proper training data, our system detects most existing lossy compression algorithms with high accuracy.
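To make the spectrogram traces concrete, the following minimal sketch measures two of them by hand: a frequency band cut and the share of near-zero time-frequency bins. It is an illustration only and not the CNN proposed here. The function names (`stft_magnitude`, `band_cutoff_hz`, `hole_ratio`, `zero_band`), the STFT parameters, and the -60 dB floor are all assumptions chosen for the example, and `zero_band` is a crude stand-in for a codec, not a real encoder. The input is assumed to be a non-silent mono signal.

```python
import numpy as np


def stft_magnitude(x, n_fft=1024, hop=512):
    """Magnitude spectrogram of a mono signal, shape (n_frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))


def band_cutoff_hz(mag, sample_rate, floor_db=-60.0):
    """Highest frequency whose time-averaged power exceeds floor_db relative to the peak bin."""
    mean_power = (mag ** 2).mean(axis=0)
    level_db = 10.0 * np.log10(mean_power / mean_power.max() + 1e-30)
    active = np.nonzero(level_db > floor_db)[0]
    bin_hz = (sample_rate / 2.0) / (mag.shape[1] - 1)
    return active[-1] * bin_hz


def hole_ratio(mag, sample_rate, cutoff_hz, floor_db=-60.0):
    """Fraction of time-frequency bins below cutoff_hz that lie under floor_db (relative to the global peak)."""
    bin_hz = (sample_rate / 2.0) / (mag.shape[1] - 1)
    n_bins = int(cutoff_hz / bin_hz)
    level_db = 20.0 * np.log10(mag[:, :n_bins] / mag.max() + 1e-15)
    return float((level_db < floor_db).mean())


def zero_band(x, sample_rate, lo_hz, hi_hz):
    """Hypothetical codec stand-in: zero all spectral content in [lo_hz, hi_hz)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    spectrum[(freqs >= lo_hz) & (freqs < hi_hz)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))


# Synthetic demo: white noise with a simulated 16 kHz band cut.
rng = np.random.default_rng(0)
sr = 44100
noise = rng.standard_normal(2 * sr)
lowpassed = zero_band(noise, sr, 16000, sr)
cutoff = band_cutoff_hz(stft_magnitude(lowpassed), sr)
```

On the synthetic signal, `cutoff` lands near the simulated 16 kHz cut, while the untouched noise reaches the Nyquist frequency. Such fixed thresholds are fragile on real decoded audio, which is one motivation for learning the traces from data rather than hand-crafting them.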

This paper has been published in the proceedings of the 42nd IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2017).