

Workflows for generating AV editions and exhibits using IIIF manifests by HiPSTAS and Brumfield Labs.

Audio Annotation Standards and Frameworks

compiled by Bethany Radcliff and Kylie Warkentin


The list below includes a selection of citations and resources related to audio annotation. Its goal is to compile the standards and reference frameworks currently used to structure audio annotation and analysis. We found that such standards and frameworks are being used and researched in linguistics, archives and libraries, music annotation, and machine learning and automation, among other areas. Standards for transcription, both for accessibility of recordings and for oral histories, are also included below as related resources.

Linguistic Annotation

Auer, E., Russel, A., Sloetjes, H., Wittenburg, P., Schreer, O., Masnieri, S., Schneider, D., & Tschöpel, S. (2010). ELAN as flexible annotation framework for sound and image processing detectors. In Seventh conference on International Language Resources and Evaluation [LREC 2010] (pp. 890-893). European Language Resources Association (ELRA).

Bergelson, E. (2020). Annotation Introduction for SEEDLingS Annotations. Retrieved from

Bird, S., & Liberman, M. (2001). A Formal Framework for Linguistic Annotation. Speech Communication, 33(1-2), 23-60. doi:10.1016/s0167-6393(00)00068-6

De Sutter, R., Notebaert, S., & Van de Walle, R. (2006). Evaluation of Metadata Standards in the Context of Digital Audio-Visual Libraries. In J. Gonzalo, C. Thanos, M. F. Verdejo, & R. C. Carrasco (Eds.), Research and Advanced Technology for Digital Libraries. ECDL 2006. Lecture Notes in Computer Science, vol 4172. Springer, Berlin, Heidelberg.

Hedeland, H. (2020). Providing Digital Infrastructure for Audio-Visual Linguistic Research Data with Diverse Usage Scenarios: Lessons Learnt. Publications, 8(2), 33. doi:

Meléndez Catalán, B., Molina, E., & Gómez Gutiérrez, E. (2017). BAT: An open-source, web-based audio events annotation tool.

Simon, R., Jung, J., & Haslhofer, B. (2011). The YUMA Media Annotation Framework. In S. Gradmann, F. Borri, C. Meghini, & H. Schuldt (Eds.), Research and Advanced Technology for Digital Libraries (pp. 434–437). Springer.

Library/Archive Specific Annotation

Egan, P. (2020). Enriching Metadata for Irish Traditional Music at the American Folklife Center.

Kowalczyk, S. T., & Holmes, A. S. (2020). The Studs Terkel Radio Archive: A Journey to Enhanced Usability for Audio. Journal of Archival Organization, 17(1–2), 95–112.

Music Annotation

Fu, Z., Lu, G., Ting, K. M., & Zhang, D. (2011). A Survey of Audio-Based Music Classification and Annotation. IEEE Transactions on Multimedia, 13(2), 303–319.

Music Information Retrieval Evaluation eXchange

MEI Music Encoding Initiative

Machine learning/Automated Annotation

Li, B., Burgoyne, J., & Fujinaga, I. (2006). Extending Audacity for Audio Annotation. (p. 380).

Wang, Y., Mendez, A. E. M., Cartwright, M., & Bello, J. P. (2019). Active Learning for Efficient Audio Annotation and Classification with a Large Amount of Unlabeled Data. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 880–884.


Transcription Guidelines