Attributes and predictors of long COVID

Carole H Sudre, Benjamin Murray, Thomas Varsavsky, Mark S Graham, Rose S Penfold, Ruth C Bowyer, Joan Capdevila Pujol, Kerstin Klaser, Michela Antonelli, Liane S Canas, Erika Molteni, Marc Modat, M Jorge Cardoso, Anna May, Sajaysurya Ganesh, Richard Davies, Long H Nguyen, David A Drew, Christina M Astley, Amit D Joshi, Jordi Merino, Neli Tsereteli, Tove Fall, Maria F Gomez, Emma L Duncan, Cristina Menni, Frances M K Williams, Paul W Franks, Andrew T Chan, Jonathan Wolf, Sebastien Ourselin, Tim Spector, Claire J Steves

Abstract

Reports of long-lasting coronavirus disease 2019 (COVID-19) symptoms, the so-called 'long COVID', are rising, but little is known about prevalence, risk factors or whether it is possible to predict a protracted course early in the disease. We analyzed data from 4,182 incident cases of COVID-19 in which individuals self-reported their symptoms prospectively in the COVID Symptom Study app¹. A total of 558 (13.3%) participants reported symptoms lasting ≥28 days, 189 (4.5%) for ≥8 weeks and 95 (2.3%) for ≥12 weeks. Long COVID was characterized by symptoms of fatigue, headache, dyspnea and anosmia and was more likely with increasing age and body mass index and female sex. Experiencing more than five symptoms during the first week of illness was associated with long COVID (odds ratio = 3.53 (2.76-4.50)). A simple model to distinguish between short COVID and long COVID at 7 days (total sample size, n = 2,149) showed an area under the receiver operating characteristic curve of 76%, with replication in an independent sample of 2,472 individuals who were positive for severe acute respiratory syndrome coronavirus 2. This model could be used to identify individuals at risk of long COVID for trials of prevention or treatment and to plan education and rehabilitation services.
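As a quick check on the arithmetic, the reported proportions can be recomputed from the counts given in the abstract. The following is a minimal sketch in Python; the variable and function names are illustrative and are not taken from the study's own code.

```python
# Counts as reported in the abstract; names here are illustrative only.
TOTAL_CASES = 4182
COUNTS = {
    ">=28 days": 558,
    ">=8 weeks": 189,
    ">=12 weeks": 95,
}


def percent(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


proportions = {label: percent(n, TOTAL_CASES) for label, n in COUNTS.items()}

for label, pct in proportions.items():
    print(f"Symptoms lasting {label}: {pct}%")
```

The recomputed values agree with the 13.3%, 4.5% and 2.3% quoted above.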

Conflict of interest statement

Competing interests

Zoe Global codeveloped the app pro bono for noncommercial purposes. Investigators received support from the Wellcome Trust, the MRC/BHF, European Union, French government, Alzheimer’s Society, NIHR, CDRF and the NIHR-funded BioResource, Clinical Research Facility and BRC based at GSTT NHS Foundation Trust in partnership with KCL. R.D., J.W., J.C.P., A.M. and S.G. work for Zoe Global, and T.S. and P.W.F. are consultants to Zoe Global. L.H.N., D.A.D., J.M., P.W.F. and A.T.C. previously participated as investigators on a diet study unrelated to this work that was supported by Zoe Global. C.H.S., M.S.G., E.M., K.K., M.A., L.S.C., M.M., T.V., M.J.C. and S.O. declare no competing interests.

Figures

Extended Data Fig. 1. Study inclusion criteria.
Individuals reporting symptoms for at most 1 day were considered asymptomatic for the purpose of this analysis. We further excluded users who joined the app already unhealthy, for whom the onset of disease could not be calculated. Of the remainder, we excluded those who logged unhealthy reports only intermittently and restricted the sample to individuals reporting symptoms prospectively at least once a week over the course of the disease. The left side of the diagram represents the inclusion flowchart for individuals reporting a positive swab test, while the right side reflects the inclusion flowchart for individuals with a positive antibody test only.
Extended Data Fig. 2. IMD ratio compared to short-COVID.
Ratio of LC28 (n=558) and LC56 (n=189) vs short-COVID (n=1591) by Index of Multiple Deprivation (IMD) quintile.
Extended Data Fig. 3. Odds ratio of LC28 per comorbidity.
Odds ratios and associated 95% confidence intervals for the risk of developing LC28 for each comorbidity or risk factor, correcting for age and gender in each age group (18-49 n=1466, 50-69 n=621, >=70 n=62).
Extended Data Fig. 4. Symptom clustering in LC28.
Clustering of symptoms in the LC28 group, indicating a strong upper airway component with fatigue, headache and loss of smell common to both groups, and a more multisystem presentation in the second group. Colouring represents the frequency of reporting of a given symptom. Abbreviations: DE – delirium, AP – Abdominal Pain, HV – Hoarse Voice, DI – Diarrhoea, CP – Chest Pain, SM – skipped meals, UMP – Unusual Muscle pains, FV – Fever, ST – Sore Throat, PC – Persistent Cough, LOS – Loss of smell, SOB – Shortness of breath, HA – Headache, FA – Fatigue.
Extended Data Fig. 5. Odds ratios of LC28 per sex and age group.
Odds ratios and associated 95% confidence intervals of LC28 when presenting a given symptom during the first week, correcting for age and gender (if necessary), in different subgroups: female (a) (n=1516), male (b) (n=633), 18-49 (c) (n=1466), 50-69 (d) (n=621), >=70 (e) (n=62). Abbreviations: DE – delirium, AP – Abdominal Pain, HV – Hoarse Voice, DI – Diarrhoea, CP – Chest Pain, SM – skipped meals, UMP – Unusual Muscle pains, FV – Fever, ST – Sore Throat, PC – Persistent Cough, LOS – Loss of smell, SOB – Shortness of breath, HA – Headache, FA – Fatigue.
Extended Data Fig. 6. Comparison of feature importance.
Comparison of mean feature importance (proportion ranging from 0 to 1) for the cross-validated random forest models across the different age groups when considering personal characteristics and presented symptoms during the first week of the disease. Abbreviations: DE – delirium, AP – Abdominal Pain, HV – Hoarse Voice, DI – Diarrhoea, CP – Chest Pain, SM – skipped meals, UMP – Unusual Muscle pains, FV – Fever, ST – Sore Throat, PC – Persistent Cough, LOS – Loss of smell, SOB – Shortness of breath, HA – Headache, FA – Fatigue.
Extended Data Fig. 7. Decision Analysis Curve.
Decision analysis curve comparing the final simple model to other models of simple logistic regression considering different feature associations.
Extended Data Fig. 8. Nomograms.
Example of nomograms that could be used to assess the risk of developing LC28 based on 7 days of symptoms, with the corresponding table of sensitivity, specificity, and positive and negative predictive values at the different thresholds, given a prevalence of 13.3%. For a sensitive model, for example to apply further monitoring for the development of Long-COVID, the threshold between white and pink could be used, with a PPV of 34% and an NPV of 98%, whereas a more specific model, for example to recruit to trials to prevent Long-COVID, might use the dark red threshold, with a PPV of 60%, although some individuals who would go on to have Long-COVID would not be recruited (NPV 82%). Symptoms considered for the count: Fatigue – Headache – Shortness of breath – Fever – Persistent cough – Sore throat – Hoarse voice – Abdominal pain – Diarrhoea – Delirium – Chest pain – Loss of smell – Skipped meals – Unusual muscle pains.
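The dependence of predictive values on prevalence follows from Bayes' rule. The sketch below, in Python, shows how a PPV and an NPV could be derived from a sensitivity, a specificity and the 13.3% prevalence quoted in the caption. The sensitivity and specificity used in the example call are hypothetical placeholders, not values from the nomogram table, and the function name is illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test via Bayes' rule.

    All arguments are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Prevalence of LC28 as stated in the caption.
PREVALENCE = 0.133

# Hypothetical operating point, chosen only to illustrate the calculation;
# these are NOT the thresholds reported in the study.
ppv, npv = predictive_values(sensitivity=0.90, specificity=0.70,
                             prevalence=PREVALENCE)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

Lowering the threshold trades specificity for sensitivity, which generally raises the NPV and lowers the PPV at a fixed prevalence.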
Fig. 1. Distribution of disease duration and age effect on duration.
a, Distribution of symptom duration in COVID-19. The colored bars indicate the limits to define short, LC28 and LC56 disease duration. The y axis represents the normalized frequency of symptom duration; 2.4% of negative controls and 13.3% of individuals with COVID-19 reported symptoms for ≥28 days. b, ORs and 95% CIs of LC28 for each age decile compared to the 20- to 30-year-old age group when considering LC28 versus short COVID (1,516 females and 633 males). For males aged 20-30 years (n = 117), the proportion who had LC28 was 4.5%, compared with 5.6% of females in the same age range (n = 357).
Fig. 2. Symptoms by short, LC28 and LC56 disease duration.
Each symptom is ordered from top to bottom by increasing frequency of occurrence. For short (n = 1,591), LC28 (n = 558) and LC56 (n = 189) disease durations, the median duration of report is represented by the total (hollowed) bar height and associated IQR is represented by the black line. The filled bars represent the number of times a report has been given. For both duration and the number of reported days of symptoms, the x axis reflects the number of days. This highlights the differences in the symptoms in terms of their intermittence throughout the course of the disease. DE, delirium; AP, abdominal pain; HV, hoarse voice; DI, diarrhea; CP, chest pain; SM, skipped meals; UMP, unusual muscle pains; FV, fever; ST, sore throat; PC, persistent cough; LOS, loss of smell; SOB, shortness of breath; HA, headache; FA, fatigue.
Fig. 3. Prediction of long COVID compared with short COVID and illustration of multi-system presentation.
a,b, Symptom correlates of long COVID for LC28 (n = 558; a) and LC56 (n = 189; b) compared to short COVID (n = 1,591) with correction for age and sex. Error bars indicate the 95% CI for the ORs. c, Co-occurrence network of symptom pairs in which nodes represent symptoms, the frequency of symptoms corresponds to the size of the node, and the likelihood of symptom pair co-occurrence is represented by the weight of the edges linking them. Edges representing a co-occurrence of less than 10% were removed. d, ROC curve of the cross-validated full and reduced models on the PCR cohort. e, ROC curve when training on the whole PCR cohort of short and LC28 (n = 2,149) and testing on the antibody-positive cohort (n = 1,440 short COVID and n = 165 LC28) for the full (blue) and reduced (magenta) models. Random predictive probability is indicated by the dashed red line.

Source: PubMed
