Audio Fingerprinting with Robustness to Pitch Scaling and Time Stretching

Díaz Millet, Yesenia

Date

2013

Author

Díaz Millet, Yesenia

Abstract

Current audio fingerprinting systems are becoming increasingly robust against noise and filter distortions; however, songs that have been pitch scaled or time stretched are still likely to pass undetected. This research focuses on extending an existing landmark-based fingerprinting method to identify songs that have been pitch scaled and time stretched to evade current systems while still sounding natural to the human ear. Two feature extraction methods were explored, one for each task. To identify pitch-scaled songs, a constant-Q spectrogram was used for feature extraction in place of a conventional spectrogram. To identify time-stretched songs, Mel-frequency cepstral coefficients (MFCCs) were used as features. The goal is to verify whether low-level spectral features alone can handle such transformations of a song, without resorting to the mid-level or high-level musical features that other song identification methods rely on.

Key Terms: Audio Fingerprinting, Feature Extraction, Music Information Retrieval, Music Similarity.
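
The abstract does not say why a constant-Q spectrogram suits pitch-scaled audio. One common explanation is that its frequency bins are spaced logarithmically, so scaling every frequency by the same ratio moves every spectral peak by the same number of bins, whereas on a linear frequency axis the shift grows with frequency. The sketch below illustrates only that property. It is not the method from the work described here, and the function names, the 55 Hz minimum frequency, the 12 bins per octave, and the 10 Hz linear bin width are all assumed values chosen for illustration.

```python
import math


def cq_bin(freq_hz, fmin_hz=55.0, bins_per_octave=12):
    """Fractional bin index on a log-spaced (constant-Q style) axis.

    fmin_hz and bins_per_octave are illustrative assumptions.
    """
    return bins_per_octave * math.log2(freq_hz / fmin_hz)


def linear_bin(freq_hz, bin_width_hz=10.0):
    """Fractional bin index on a linear axis (conventional spectrogram).

    bin_width_hz is an illustrative assumption.
    """
    return freq_hz / bin_width_hz


def bin_shifts(peaks_hz, ratio, to_bin):
    """How far each peak moves, in bins, when all frequencies scale by ratio."""
    return [to_bin(f * ratio) - to_bin(f) for f in peaks_hz]


# Hypothetical spectral peaks and a pitch scaling of one semitone upward.
peaks = [110.0, 440.0, 1760.0]
ratio = 2 ** (1 / 12)

cq_shifts = bin_shifts(peaks, ratio, cq_bin)       # same shift for every peak
lin_shifts = bin_shifts(peaks, ratio, linear_bin)  # shift grows with frequency

print("log-axis shifts:   ", [round(s, 3) for s in cq_shifts])
print("linear-axis shifts:", [round(s, 3) for s in lin_shifts])
```

Because the shift on the logarithmic axis is the same for every peak, the relative frequency spacing between peaks is preserved under pitch scaling. That is the kind of invariance a peak-pair (landmark) scheme could in principle exploit, under the assumptions stated above.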

URI

http://hdl.handle.net/20.500.12475/956

Collections

Computer Engineering