Audio Fingerprinting with Robustness to Pitch Scaling and Time Stretching
Díaz Millet, Yesenia
Current audio fingerprinting systems are becoming increasingly robust to noise and filter distortions; however, songs that have been pitch scaled and time stretched are still likely to pass undetected. This research expands an existing landmark-based fingerprinting method to identify songs that have been pitch scaled and time stretched to evade current systems while still sounding natural to the human ear. Two feature extraction methods were explored, each addressing one of these transformations individually. The constant-Q spectrogram was used for feature extraction, instead of a conventional spectrogram, to identify songs that have been pitch scaled, while Mel-frequency cepstral coefficients (MFCCs) were used as features for the time-stretching task. The goal is to verify whether low-level spectral features alone are capable of handling such transformations in a song, rather than requiring the mid-level or high-level musical features used by other song identification methods.

Key Terms - Audio Fingerprinting, Feature Extraction, Music Information Retrieval, Music Similarity.
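The intuition behind using the constant-Q spectrogram for pitch-scaled songs can be illustrated with a short sketch (the values below are illustrative assumptions, not taken from the thesis): because constant-Q bin centers are geometrically spaced, f_k = f_min * 2^(k/B) with B bins per octave, multiplying every frequency by a pitch-scale factor r becomes a pure translation of B*log2(r) bins along the frequency axis, so landmark patterns survive up to a constant offset.

```python
import numpy as np

B = 12                       # assumed bins per octave (semitone resolution)
f_min = 32.70                # assumed reference frequency (C1, in Hz)
k = np.arange(4 * B)         # four octaves of bin indices
freqs = f_min * 2.0 ** (k / B)

r = 2.0 ** (3 / 12)          # pitch-scale upward by 3 semitones
shifted = r * freqs

# In log-frequency space the scaling is a constant shift of B*log2(r) bins:
offset = B * np.log2(r)
print(round(offset))         # 3

# The scaled frequencies land exactly on bins k + 3 -- a translation,
# which is why landmark geometry is preserved under pitch scaling.
assert np.allclose(shifted[:-3], freqs[3:])
```

A conventional (linear-frequency) spectrogram has no such property: there, pitch scaling stretches the frequency axis multiplicatively, distorting the relative spacing of spectral peaks that landmark-based matching relies on.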