NLP·November 20, 2024

Fine-Tuning Transformers for Medical Diagnosis

Working with medical NLP datasets requires careful consideration of both the technical and ethical aspects of the problem. For SMP2DIAG, I worked with the GretelAI dataset containing 1,065 samples across 22 different medical diagnoses.
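The dataset is small enough to inspect directly. Here is a quick sanity check of the size and the per-class counts, assuming the data is published on the Hugging Face Hub under `gretelai/symptom_to_diagnosis` with an `output_text` label column (both assumptions on my part, so adjust to the actual schema):

```python
from collections import Counter
from datasets import load_dataset

# Hub identifier and column name are assumptions; adjust to the published schema.
ds = load_dataset("gretelai/symptom_to_diagnosis")
total = sum(split.num_rows for split in ds.values())
print(total)  # should be on the order of 1,065 examples in total

# Per-diagnosis counts, to see how few examples each of the 22 classes gets.
counts = Counter(label for split in ds.values() for label in split["output_text"])
print(counts.most_common(5))
```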

The limited dataset size was a significant challenge. I applied extensive preprocessing, including tokenization and lemmatization, and experimented with different embedding strategies, using PPMI and GloVe as baselines.
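To make that concrete, here is a minimal sketch of the kind of preprocessing and PPMI baseline I mean. The helper names and example sentences are purely illustrative rather than the project code, and the GloVe baseline is omitted for brevity.

```python
import re
import numpy as np
import nltk
from nltk.stem import WordNetLemmatizer

nltk.download("wordnet", quiet=True)  # data needed by the lemmatizer
lemmatizer = WordNetLemmatizer()

def preprocess(text: str) -> list[str]:
    """Lowercase, tokenize into alphabetic words, and lemmatize."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [lemmatizer.lemmatize(tok) for tok in tokens]

def ppmi_matrix(docs: list[list[str]], window: int = 2):
    """Count word co-occurrences within a window and convert to PPMI."""
    vocab = sorted({tok for doc in docs for tok in doc})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for doc in docs:
        for i, w in enumerate(doc):
            lo, hi = max(0, i - window), min(len(doc), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[index[w], index[doc[j]]] += 1.0
    total = counts.sum()
    p_wc = counts / total
    p_w = counts.sum(axis=1, keepdims=True) / total
    p_c = counts.sum(axis=0, keepdims=True) / total
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log2(p_wc / (p_w * p_c))
    # Keep only positive PMI values; zero out unseen pairs.
    return np.where(p_wc > 0, np.maximum(pmi, 0.0), 0.0), vocab

# Hypothetical symptom descriptions, just to exercise the pipeline.
docs = [preprocess("Patient reports severe headaches and nausea for three days."),
        preprocess("Persistent coughing, mild fever, and nausea observed.")]
ppmi, vocab = ppmi_matrix(docs)
print(vocab)
print(ppmi.shape)
```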

Fine-tuning BERT and ClinicalBERT proved to be the most effective approach. ClinicalBERT, pre-trained on clinical and biomedical text, showed superior performance with 97% accuracy and an F1 score of 0.97. I also experimented with Flan-T5, which achieved 94% accuracy.
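The fine-tuning itself follows the standard Hugging Face Transformers recipe. Below is a trimmed sketch assuming the `emilyalsentzer/Bio_ClinicalBERT` checkpoint and the `input_text`/`output_text` column names from the loading snippet above; the hyperparameters are placeholders rather than the exact settings I used.

```python
import numpy as np
from datasets import load_dataset
from sklearn.metrics import accuracy_score, f1_score
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

CHECKPOINT = "emilyalsentzer/Bio_ClinicalBERT"  # assumed ClinicalBERT checkpoint

# Assumed Hub identifier; make an 80/20 split for held-out evaluation.
ds = load_dataset("gretelai/symptom_to_diagnosis", split="train")
ds = ds.train_test_split(test_size=0.2, seed=42)

# Map the string diagnoses onto integer class ids.
label_names = sorted(set(ds["train"]["output_text"]))
label2id = {name: i for i, name in enumerate(label_names)}

tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)

def encode(batch):
    enc = tokenizer(batch["input_text"], truncation=True, max_length=256)
    enc["labels"] = [label2id[name] for name in batch["output_text"]]
    return enc

ds = ds.map(encode, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    CHECKPOINT, num_labels=len(label_names))

def compute_metrics(eval_pred):
    logits, y_true = eval_pred
    y_pred = np.argmax(logits, axis=-1)
    return {"accuracy": accuracy_score(y_true, y_pred),
            "macro_f1": f1_score(y_true, y_pred, average="macro")}

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="clinicalbert-diagnosis",
                           num_train_epochs=5,
                           per_device_train_batch_size=16,
                           learning_rate=2e-5),
    train_dataset=ds["train"],
    eval_dataset=ds["test"],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)
trainer.train()
print(trainer.evaluate())  # reports accuracy and macro F1 on the held-out split
```

Swapping `CHECKPOINT` for `bert-base-uncased` or a Flan-T5 variant (with the corresponding sequence-classification head) is all it takes to reproduce the model comparison.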

The key lesson was that domain-specific pre-training makes a significant difference. ClinicalBERT's medical knowledge gave it an edge over general-purpose models, demonstrating the importance of choosing the right foundation model for your specific domain.