Generative artificial intelligence (AI) for reporting the performance of laboratory biomarkers: not ready for prime time

Pighi, Laura; Negrini, Davide; Lippi, Giuseppe
2025-01-01

Abstract

Artificial intelligence (AI), especially generative AI, has become a valuable resource in daily life and in many professions, including healthcare and laboratory medicine. On July 26, 2024, we queried the latest versions of five of the most common freely available online generative AI tools (Chat-GPT, Perplexity, Google Gemini, Cohere, and You.com) with three specific questions: “Which are the diagnostic sensitivity and specificity of cardiac troponins for diagnosing myocardial infarction?”; “Which are the diagnostic sensitivity and specificity of D-dimer for diagnosing pulmonary embolism?”; and “Which are the diagnostic sensitivity and specificity of procalcitonin for diagnosing sepsis?”. We specifically asked each tool to provide numerical data on the diagnostic sensitivity and specificity of the three biomarkers. Although the diagnostic performance figures reported for each biomarker partially overlapped, the responses of the five generative AI tools to these questions were highly heterogeneous, which makes generative AI still questionable as a source for laboratory experts, clinicians, and even patients seeking information on the accuracy of laboratory tests.
Keywords: Artificial intelligence, performance, laboratory biomarkers

Use this identifier to cite or link to this document: https://hdl.handle.net/11562/1132606