Heart Sounds Catania 2011 (HSCT11)
==================================

This is the Heart Sounds Catania 2011 Database, a collection of heart sounds
intended for research in the field of heart-sounds biometry, collected by the
University of Catania, Italy.

If you use this database in your research, please cite the following paper:

A. Spadaccini and F. Beritelli, "Performance Evaluation of Heart Sounds
Biometric Systems on an Open Dataset", in Proceedings of the 18th IEEE
International Conference on Digital Signal Processing, 1-3 July 2013,
Santorini, Greece.

Description of the database
---------------------------

The database contains heart sounds acquired from 206 people (157 male and
49 female). The sensor used for the acquisition is a ThinkLabs Rhythm Digital
Electronic Stethoscope; the recordings were acquired at a sampling frequency of
11025 Hz with 16 bits per sample, and are stored in the WAVE format.

During the acquisition the person was sitting, in a resting state, and the
stethoscope was positioned near the pulmonary valve.

The filenames encode the following metadata about the person:

- the first character encodes the sex of the person (M or F);
- the next 4 characters are the numeric ID of the person;
- the next character encodes the heart valve near which the auscultation was
  performed (M: mitral, P: pulmonary, A: aortic, T: tricuspid); this database
  contains only sequences recorded near the pulmonary valve;
- the next character encodes whether the recording was made with the subject
  in resting condition (N) or after some light physical activity (C); so far
  the database contains only sequences recorded in resting condition;
- the next 3 characters encode the sequential number of the recording acquired
  from a given person; the first of these 3 characters is always the letter R;
- the next 7 characters encode the date of the acquisition; the first one is
  always the letter D, the others represent the date in the format MMDDYY;
- the next 7 characters encode the birth date of the subject; the first one is
  always the letter N, the others represent the date in the format MMDDYY.

The letters between fields are not strictly necessary, since the fields have a
fixed length, but they make it easier for a human reader to scan the filename
and extract the required information.

An example filename is: F7007NR01D290610N051077.wav.

Evaluation protocol
-------------------

The comparison should be carried out as follows: for each person, one
recording is used for model training and one is used for the computation of
matching scores.

Let X be a given person, Xa their first recording, Xb their second recording,
and MX the identity model of X, trained on Xa. Let D be the set of all the
people in the database, and let N = |D| = 206 be the number of people in it.
Let S be the matching function that, given an identity model and a recording,
returns a similarity score.

For each person X, the database user should compute one genuine matching
score, S(MX, Xb), and N - 1 impostor matching scores, S(MY, Xb), one for each
Y in D \ {X}.

This will yield N genuine matching scores and N x (N - 1) impostor matching
scores.

The baseline EER (Equal Error Rate) value for this database is 13.66 %,
obtained using one of the systems described in the paper mentioned in the
introduction. The system uses the UBM/GMM method and is based on the
Alize/LIA_RAL toolkit.

Contacts
--------

For enquiries related to the database or to research activities on
heart-sounds biometry, please contact:

- Prof. Francesco Beritelli
- Ing. Andrea Spadaccini
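
Example: protocol sketch
------------------------

The evaluation protocol described above can be sketched in a few lines of
Python. This is only an illustration, not the code used to obtain the baseline:
the names `protocol_scores`, `equal_error_rate`, `models`, `probes` and `score`
are hypothetical, the toy "models" and "recordings" are plain numbers, and the
EER is estimated with a simple threshold sweep that may differ from the method
used in the cited paper. It assumes that a higher score means a closer match.

```python
from bisect import bisect_left


def protocol_scores(models, probes, score):
    """Return (genuine, impostor) score lists.

    models[x] plays the role of MX (trained on Xa) and probes[x] of Xb;
    score(model, recording) plays the role of the matching function S.
    """
    genuine, impostor = [], []
    for x, probe in probes.items():
        # One genuine score per person: S(MX, Xb).
        genuine.append(score(models[x], probe))
        # N - 1 impostor scores: S(MY, Xb) for every other person Y.
        for y, model in models.items():
            if y != x:
                impostor.append(score(model, probe))
    return genuine, impostor


def equal_error_rate(genuine, impostor):
    """Estimate the EER by sweeping the threshold over the observed scores.

    A recording is accepted when its score is >= the threshold. The value
    returned is the mean of FAR and FRR at the threshold where they are
    closest, which is one common approximation of the EER.
    """
    gen, imp = sorted(genuine), sorted(impostor)
    best_gap, eer = None, None
    for t in sorted(set(gen) | set(imp)):
        far = (len(imp) - bisect_left(imp, t)) / len(imp)  # impostors accepted
        frr = bisect_left(gen, t) / len(gen)               # genuines rejected
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy data (hypothetical): three people, numbers instead of models/recordings.
toy_models = {"a": 0, "b": 10, "c": 20}
toy_probes = {"a": 1, "b": 12, "c": 19}


def toy_score(model, recording):
    # Similarity is higher when the two numbers are closer.
    return -abs(model - recording)


toy_genuine, toy_impostor = protocol_scores(toy_models, toy_probes, toy_score)
print(len(toy_genuine), len(toy_impostor))  # 3 6
print(equal_error_rate(toy_genuine, toy_impostor))  # 0.0
```

With the full database (N = 206), the same loop produces 206 genuine scores and
206 x 205 = 42230 impostor scores.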