Abstract
This study aims to evaluate a method for distinguishing between healthy and pathological voices. The evaluation used acoustic parameters implemented in COVAREP (collaborative voice analysis repository for speech technologies), the auditory-perceptual RBH (roughness, breathiness, hoarseness) scale, and the AVQI (acoustic voice quality index). Classifiers were then trained using machine learning algorithms from the WEKA (Waikato Environment for Knowledge Analysis) platform. The study group comprised 75 voice recordings of individuals affected by vocal fold paralysis, and the control group consisted of 49 voice recordings of healthy individuals. The results indicate that the voice quality of the study group differs significantly from that of the control group. The acoustic parameters implemented in COVAREP and the RBH scale proved to be reliable methods for assessing voice quality. In addition, every classifier achieved a classification accuracy of over 90%.
Keywords:
voice quality, AVQI, COVAREP, RBH scale, vocal fold paralysis