Abstract
Study Design: A retrospective analysis.
Objectives: To assess the efficacy of large language model (LLM)-based automation in processing clinical questionnaires and compare performance between ChatGPT and Claude.
Summary of Literature Review: Although patient-reported outcome measures are crucial in spine surgery, manual processing remains time-consuming and error-prone. Recent LLM developments offer potential automation solutions.
Materials and Methods: Fifty-six questionnaire sets (336 pages) were processed thrice using both ChatGPT and Claude. A Python program incorporating PDF preprocessing, optical character recognition processing, and LLM analysis was developed. The performance metrics included accuracy, processing time, token usage, and cost efficiency.
Results: Claude showed higher accuracy (96.76%) than ChatGPT (86.54%). Both models processed questionnaires in approximately 27 seconds, compared to 85 seconds for manual entry. Claude used fewer tokens (16,568.8 vs. 18,331.4) but had higher costs ($0.056 vs. $0.023 per questionnaire). High repeatability was observed (Claude: κ=0.97, ChatGPT: κ=0.86).
Conclusions: LLM-based automation demonstrates significant potential for processing clinical questionnaires, offering substantial time savings and high accuracy. While manual verification remains necessary, the efficiency of LLMs suggests their viability for large-scale clinical research, particularly using the Claude model.
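The accuracy, repeatability, and time figures reported above can be illustrated with a small Python sketch. This is not the authors' program. The abstract does not say which kappa statistic was used or how accuracy was scored, so the sketch assumes field-level exact-match accuracy and Cohen's kappa between two repeated runs. The function names (`accuracy`, `cohen_kappa`) and the toy answer lists are hypothetical; only the 27-second and 85-second timings come from the abstract.

```python
from collections import Counter


def accuracy(predicted, reference):
    """Fraction of fields whose extracted value equals the reference value."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


def cohen_kappa(run_a, run_b):
    """Cohen's kappa: chance-corrected agreement between two runs."""
    if len(run_a) != len(run_b) or not run_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(run_a)
    # Observed agreement: share of items on which both runs gave the same label.
    observed = sum(a == b for a, b in zip(run_a, run_b)) / n
    # Expected agreement by chance, from each run's label frequencies.
    count_a, count_b = Counter(run_a), Counter(run_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both runs used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical toy data: item scores from one questionnaire (not study data).
reference = [3, 1, 4, 0, 2, 5, 1, 3]  # manually verified answers
run1 = [3, 1, 4, 0, 2, 5, 1, 2]       # first LLM pass, one field wrong
run2 = [3, 1, 4, 0, 2, 5, 1, 3]       # second LLM pass, all fields correct

print(f"accuracy of run1: {accuracy(run1, reference):.3f}")
print(f"kappa between run1 and run2: {cohen_kappa(run1, run2):.3f}")

# Relative time saving implied by the abstract's timings (27 s vs. 85 s).
time_saving = 1 - 27 / 85
print(f"time saving vs. manual entry: {time_saving:.1%}")
```

With three runs per questionnaire, as in the study, one option is to average the pairwise kappa values; a multi-rater statistic such as Fleiss' kappa is another. The abstract does not say which approach the authors took.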
| Original language | English |
|---|---|
| Pages (from-to) | 23-30 |
| Number of pages | 8 |
| Journal | Journal of Korean Society of Spine Surgery |
| Volume | 32 |
| Issue number | 2 |
| DOIs | |
| Publication status | Published - 2025 Jun |
Bibliographical note
Publisher Copyright: © 2025 Korean Society of Spine Surgery.
Keywords
- Automation
- Clinical questionnaire
- Degenerative lumbar spine
- Large language model
- Spine surgery
ASJC Scopus subject areas
- Orthopedics and Sports Medicine
- Surgery
Fingerprint
Dive into the research topics of 'Automated Analysis of Spinal Questionnaires Using Large Language Models'. Together they form a unique fingerprint.