Regression Model for Respiration Functioning Levels

Description

A fine-tuned regression model that assigns a functioning level to Dutch sentences describing respiration functions. The model is based on a pre-trained Dutch medical language model (link to be added): a RoBERTa model trained from scratch on clinical notes of the Amsterdam UMC. To detect sentences about respiration functions in Dutch clinical text, use the icf17-domains classification model. We use a single classifier for 17 different ICF categories to determine the level of functioning.

The following ICF categories are covered:

ICF code | Domain | Name in repo
b1300 | Energy level | ENR
b140 | Attention functions | ATT
b152 | Emotional functions | STM
b440 | Respiration functions | ADM
b455 | Exercise tolerance functions | INS
b530 | Weight maintenance functions | MBW
d450 | Walking | FAC
d550 | Eating | ETN
d840-d859 | Work and employment | BER
b280 | Sensations of pain | SOP
b134 | Sleep functions | SLP
d760 | Family relationships | FML
b164 | Higher-level cognitive functions | HLC
d465 | Moving around using equipment | MAE
d410 | Changing basic body position | CBP
b230 | Hearing functions | HRN
d240 | Handling stress and other psychological demands | HSP

Functioning levels

Level | Meaning
5 | No problem functioning
4 | No problem functioning or almost complete functioning
3 | Shortness of breath during exercise (saturation ≥90), and/or respiratory rate is slightly increased (EWS: 21-30).
2 | Shortness of breath at rest (saturation ≥90), and/or respiratory rate is fairly increased (EWS: 31-35).
1 | Needs oxygen at rest or during exercise (saturation <90), and/or respiratory rate >35.
0 | Mechanical ventilation is needed.

The model outputs continuous values, so a prediction may fall between two levels (e.g. 4.2) or occasionally outside the scale altogether; this is normal in a regression model.
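If downstream code needs values that lie on the scale itself, the raw predictions can be clipped and, optionally, rounded to the nearest level. The following is a minimal sketch, not part of the released model: the helper name `to_level` is hypothetical, and the bounds are passed in explicitly so that they can be set to the scale in the table above.

```python
import numpy as np

def to_level(raw, lowest, highest, round_to_int=False):
    """Clip raw regression outputs to [lowest, highest]; optionally round to whole levels.

    Hypothetical helper: the model itself returns unclipped floats.
    """
    clipped = np.clip(np.asarray(raw, dtype=float), lowest, highest)
    if round_to_int:
        return np.rint(clipped).astype(int)
    return clipped

# Illustrative values only; the bounds 0 and 4 are an assumption for this demo.
clipped = to_level([-0.3, 2.26, 4.2], lowest=0, highest=4)
rounded = to_level([-0.3, 2.26, 4.2], lowest=0, highest=4, round_to_int=True)
```

Whether to round depends on the use case: the continuous value carries more information, while whole levels are easier to compare with the annotation scale.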

Intended uses and limitations

  • The model was fine-tuned (trained, validated and tested) on medical records from the Amsterdam UMC (the two academic medical centers of Amsterdam). It might perform differently on text from a different hospital or text from non-hospital sources (e.g. GP records).
  • The model was fine-tuned with the Simple Transformers library. This library is based on Transformers, but the model cannot be used directly with Transformers pipelines and classes; doing so would generate incorrect outputs. For this reason, the API on this page is disabled.

How to use

To generate predictions with the model, use the Simple Transformers library:

import numpy as np
from simpletransformers.classification import ClassificationModel

# Load the fine-tuned regression model on CPU
model = ClassificationModel(
    'roberta',
    'CLTL/icf-levels-adm',
    use_cuda=False,
)

example = 'Nu sinds 5-6 dagen progressieve benauwdheidsklachten (bij korte stukken lopen al kortademig), terwijl dit eerder niet zo was.'
# predict() returns (predictions, raw_outputs); the raw outputs hold the regression values
_, raw_outputs = model.predict([example])
predictions = np.squeeze(raw_outputs)

The prediction on the example is:

2.26

The raw outputs look like this:

[[2.26074648]]
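Note that `np.squeeze` returns a zero-dimensional array when a single sentence is passed and a one-dimensional array when several are passed. The sketch below flattens both cases to a plain list of floats, using the raw output shown above. The helper name `flatten_outputs` is hypothetical, and the second value in the batch is made up purely to illustrate the shape; the exact shape Simple Transformers returns for several sentences is an assumption here, which is why the helper reshapes rather than indexes.

```python
import numpy as np

def flatten_outputs(raw_outputs):
    """Return raw regression outputs as a flat list of floats, one per sentence.

    Hypothetical helper: works for shapes (n, 1), (n,) and a single scalar.
    """
    return np.asarray(raw_outputs, dtype=float).reshape(-1).tolist()

single = flatten_outputs([[2.26074648]])         # raw output from the example above
batch = flatten_outputs([[2.26074648], [3.9]])   # second value is illustrative only
```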

Training data

  • The training data consists of clinical notes from medical records (in Dutch) of the Amsterdam UMC. Due to privacy constraints, the data cannot be released.
  • The annotation guidelines used for the project can be found here.

Training procedure

The default training parameters of Simple Transformers were used, including:

  • Optimizer: AdamW
  • Learning rate: 4e-5
  • Num train epochs: 1
  • Train batch size: 8
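For orientation, the parameters listed above could be collected into a Simple Transformers argument dictionary roughly as follows. This is a sketch under assumptions, not the authors' training script: the function name `build_training_args` is hypothetical, the `regression` flag with `num_labels=1` is how Simple Transformers is generally configured for regression, and the base model path in the commented usage is a placeholder, since the link to the pre-trained model is not given here.

```python
def build_training_args(learning_rate=4e-5, num_train_epochs=1, train_batch_size=8):
    """Collect the listed training parameters into an args dict (hypothetical helper)."""
    return {
        "regression": True,          # assumption: regression mode, one continuous output
        "optimizer": "AdamW",
        "learning_rate": learning_rate,
        "num_train_epochs": num_train_epochs,
        "train_batch_size": train_batch_size,
    }

args = build_training_args()

# Hypothetical usage (requires simpletransformers and a labelled DataFrame
# with "text" and "labels" columns; the model path is a placeholder):
# from simpletransformers.classification import ClassificationModel
# model = ClassificationModel("roberta", "path/to/pretrained-model",
#                             num_labels=1, args=args, use_cuda=False)
# model.train_model(train_df)
```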

Evaluation results

The evaluation is done at the sentence level (the classification unit) and at the note level (the aggregated unit, which is the one meaningful to healthcare professionals).

Metric | Sentence-level | Note-level
mean absolute error | 0.48 | 0.37
mean squared error | 0.55 | 0.34
root mean squared error | 0.74 | 0.58
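The three metrics are related: the root mean squared error is the square root of the mean squared error (for example, √0.55 ≈ 0.74). The sketch below shows how such figures could be computed with NumPy. The function names are hypothetical, and averaging sentence-level values per note is an assumption made for illustration; the aggregation method is not specified on this page.

```python
from collections import defaultdict

import numpy as np

def regression_errors(gold, pred):
    """Return MAE, MSE and RMSE for paired gold and predicted levels."""
    err = np.asarray(pred, dtype=float) - np.asarray(gold, dtype=float)
    mse = float(np.mean(err ** 2))
    return {
        "mae": float(np.mean(np.abs(err))),
        "mse": mse,
        "rmse": float(np.sqrt(mse)),
    }

def aggregate_by_note(note_ids, values):
    """Average sentence-level values per note (assumed aggregation, for illustration)."""
    grouped = defaultdict(list)
    for note_id, value in zip(note_ids, values):
        grouped[note_id].append(value)
    return {note_id: float(np.mean(vals)) for note_id, vals in grouped.items()}

# Toy data, not taken from the evaluation set.
sentence_scores = regression_errors(gold=[2, 4], pred=[3, 4])
note_means = aggregate_by_note(["a", "a", "b"], [1.0, 3.0, 4.0])
```

Note-level errors would then be computed by applying `regression_errors` to the aggregated gold and predicted values per note.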

Authors and references

Authors

Jenia Kim, Piek Vossen

References

When using this repository, please cite:

J. Kim, S. Verkijk, E. Geleijn, M. van der Leeden, C. Meskers, C. Meskers, S. van der Veen, P. Vossen, and G. Widdershoven. Modeling Dutch Medical Texts for Detecting Functional Categories and Levels of COVID-19 Patients. In: Proceedings of the 13th Language Resources and Evaluation Conference, Marseille, June 2022.

BibTeX:

@inproceedings{kim-etal-lrec2022,
  author    = {Jenia Kim and Stella Verkijk and Edwin Geleijn and Marieke van der Leeden and Carel Meskers and Caroline Meskers and Sabina van der Veen and Piek Vossen and Guy Widdershoven},
  title     = {Modeling Dutch Medical Texts for Detecting Functional Categories and Levels of COVID-19 Patients},
  booktitle = {Proceedings of the 13th Language Resources and Evaluation Conference, Marseille, June, 2022},
  year      = {2022}
}
