A retrospective cohort study predicting and validating impact of the COVID-19 pandemic in individuals with chronic kidney disease

Chronic kidney disease (CKD) is associated with increased risk of baseline mortality and severe COVID-19, but analyses across CKD stages and comorbidities are lacking. In prevalent and incident CKD, we investigated comorbidities, baseline risk, COVID-19 incidence, and predicted versus observed one-year excess death. In a national dataset for England encompassing 56 million individuals (NHS Digital Trusted Research Environment [NHSD TRE]), we conducted a retrospective cohort study (March 2020 to March 2021) of the prevalence of comorbidities by incident and prevalent CKD, SARS-CoV-2 infection, and mortality. Baseline mortality risk and the incidence and outcome of infection by comorbidities, controlling for age, sex, and vaccination, were assessed. Observed one-year mortality was compared with that predicted at varying population infection rates and pandemic-related relative risks using our published model in pre-pandemic CKD cohorts (NHSD TRE and Clinical Practice Research Datalink [CPRD]). Among individuals with CKD (prevalent: 1,934,585; incident: 144,969), comorbidities were common (73.5% and 71.2% with one or more conditions, and 13.2% and 11.2% with three or more conditions, in prevalent and incident CKD, respectively) and associated with SARS-CoV-2 infection, particularly dialysis/transplantation (odds ratio 2.08, 95% confidence interval 2.04-2.13) and heart failure (1.73, 1.71-1.76), but not cancer (1.01, 1.01-1.04). One-year all-cause mortality varied by age, sex, multi-morbidity and CKD stage. Compared with 34,265 observed excess deaths, we predicted 28,746 and 24,546 deaths in the NHSD TRE and CPRD databases, respectively (infection rate 10% and relative risk 3.0), and 23,754 and 20,283 deaths (observed infection rate 6.7% and relative risk 3.7). Thus, in this largest, national-level study, individuals with CKD have a high burden of comorbidities and multi-morbidity, and high risk of pre-pandemic and pandemic mortality. Hence, treatment of comorbidities, non-pharmaceutical measures, and vaccination are priorities for people with CKD, and management of long-term conditions is important during and beyond the pandemic.

Chronic kidney disease (CKD) carries major global disease burden, as a risk factor for morbidity and mortality, and as the end syndrome of underlying risk factors and diseases, 1,2 such as cancers 3 and cardiovascular disease (CVD). 4 During the coronavirus disease 2019 (COVID-19) pandemic, CKD has been associated with poor prognosis. 5,6 Despite clinical and public health importance, CKD research to date in all stages, multimorbidity, or the general population 7 using national-level data has been limited.
The pandemic has had both direct (through infection) and indirect (through changes in health services, economic upheaval, and behavioural factors 8,9 ) impacts. The direct impact in individuals with CKD and other underlying conditions is related to baseline risk, influenced by age, sex, multimorbidity, and other sociodemographic factors. 10 However, previous studies of COVID-19 in CKD have been small scale (12-1099 cases 5 ), have mostly focused on end-stage CKD, and have ignored major comorbidities (either most common in CKD or related to risk of COVID-19 mortality). Few risk stratification tools are used in clinical practice for individuals with CKD or prediction of CKD, and those that include CKD usually do not consider different CKD stages. Better characterization of baseline risk in people with CKD may inform individual and population approaches to CKD prevention and treatment and integrated management of chronic diseases.
CKD, already known to increase baseline risk of mortality, is associated with increased risk of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, disease severity, hospital 11 and intensive care admission, 12 and mortality. The role of other risk factors and underlying conditions in risk of COVID-19 in people with CKD requires more detailed investigation. 13-15 There are clinical practice tools for risk stratification of COVID-19 patients in the community and hospitals, but CKD is included only as a binary variable, and so the spectrum of risk faced by individuals with CKD has not been fully considered. Such analyses are important in risk communication to patients, public and health professionals, as well as policies to suppress infection rate (IR), such as social distancing and physical isolation. Meanwhile, more nuanced investigation of the risk associated with CKD may inform clinical care, COVID-19 vaccination strategies, as well as public health approaches to CKD after the pandemic. 16-19
Using national, population-based electronic health records (EHRs), in individuals with prevalent and incident CKD, we investigated the following: (i) underlying conditions; (ii) mortality risk; (iii) incidence of SARS-CoV-2 infection; and (iv) prediction and validation of pandemic-related excess deaths.

Study design and data sources
We conducted a retrospective, population-based cohort study using NHS Digital Trusted Research Environment for England (NHSD TRE), 20 a national database developed for pandemic-related research, linking primary care, 21 Hospital Episode Statistics Admitted Patient Care, COVID-19 trajectories, 22 COVID-19 vaccination, and mortality information from the Office for National Statistics Civil Registration of Deaths (Supplementary Figure S1). To investigate multimorbidity, baseline risk, incidence, and mortality in individuals with CKD (aged ≥18 years), we defined "prevalent CKD" as CKD recorded ≥6 months before the onset of the pandemic (March 1, 2020) without history of COVID-19, and "incident CKD" as new onset from March 1, 2020, to March 1, 2021, without history of COVID-19 before developing CKD. To predict 1-year COVID-19-related excess deaths based on prepandemic mortality risk, prevalent CKD at January 1, 2019, was defined using similar criteria. To show applicability of our methods to less complete, less up-to-date data sets, we also used Clinical Practice Research Datalink (CPRD) Gold data (as in our previous research 15 ) to define prevalent CKD at April 6, 2014, by either diagnosed CKD or 2 estimated glomerular filtration rate measures (by Modification of Diet in Renal Disease-4 algorithm 23 ) ≥6 months before index date.
Having an underlying condition, for all cohorts, was defined as having ≥6 months' history of the condition: (i) before index date for prevalent CKD and (ii) before incidence date for incident CKD. Number of underlying conditions, where stated, was based on 6 conditions: chronic obstructive pulmonary disease, asthma, CVD, cancer, diabetes, and chronic liver disease. COVID-19 mortality was defined as mortality within 28 days of a positive test result. For SARS-CoV-2 incidence rate in prevalent CKD, disease-free time was estimated from earliest date before death or first-dose vaccination. In incident CKD, infection was defined as a SARS-CoV-2-positive result ≥14 days after developing CKD, and disease-free time was measured from date of incident CKD. Crude incidence rate did not account for vaccination or other factors.
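The history and 28-day rules described above can be sketched in a few lines. This is an illustration only, not the study code: the 183-day approximation of 6 months, the condition labels, the dictionary layout, and the function names are all assumptions introduced here.

```python
from datetime import date, timedelta

# The six conditions that count toward "number of underlying conditions" in the text.
COUNTED_CONDITIONS = {
    "copd", "asthma", "cvd", "cancer", "diabetes", "chronic_liver_disease",
}

SIX_MONTHS = timedelta(days=183)        # assumption: 6 months approximated as 183 days
COVID_DEATH_WINDOW = timedelta(days=28)  # 28-day window stated in the text


def count_underlying_conditions(first_recorded, reference_date):
    """Count counted conditions with >=6 months' history at the reference date.

    first_recorded: dict mapping condition label -> date first recorded
                    (hypothetical layout).
    reference_date: index date (prevalent CKD) or incidence date (incident CKD).
    """
    return sum(
        1
        for name, recorded in first_recorded.items()
        if name in COUNTED_CONDITIONS and reference_date - recorded >= SIX_MONTHS
    )


def is_covid_death(positive_test, death):
    """Return True if death occurred within 28 days of a positive test."""
    if positive_test is None or death is None:
        return False
    return timedelta(0) <= death - positive_test <= COVID_DEATH_WINDOW
```

For a person with diabetes recorded in 2015 and cancer recorded 3 months before the index date, only diabetes would count under this reading of the rule.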

Phenotypes
Definitions of underlying conditions were derived from Health Data Research UK-Clinical diseAse research using LInked Bespoke studies and Electronic health Records (CALIBER), a comprehensive platform with validated definitions of underlying conditions 24 (Supplementary Table S1). CVD was defined as a composite of stroke (nonspecified, ischemic, hemorrhagic, transient ischemic attack, or subarachnoid hemorrhagic), heart failure, arrhythmias, acute myocardial infarction, cardiomyopathy, atrial fibrillation, deep vein thrombosis, isolated calf vein thrombosis, and pulmonary embolism. 25 Obesity was defined as body mass index >40 kg/m². Diabetes included all types of diabetes. Implementation of phenotypes is publicly available (https://github.com/BHFDSC/CCU003_03/tree/main/phenotypes).

Statistical analysis
Underlying conditions. We estimated prevalence of underlying conditions in prevalent and incident CKD, stratifying by age, gender, CKD stage, or dialysis/transplantation. We compared prevalence of underlying conditions in infected versus noninfected individuals for (i) all CKD patients and (ii) the nonsurvival group, using odds ratio (Wald method) and Mantel-Haenszel χ² test with 95% confidence intervals.
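As an illustration of the odds ratio with a Wald confidence interval named above, the following minimal sketch computes both from a 2×2 table. The function name and the cell layout are assumptions made for this example; it does not cover the Mantel-Haenszel χ² test.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: infected, with the condition      b: infected, without the condition
    c: noninfected, with the condition   d: noninfected, without the condition
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald method")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

The interval is built on the log scale and then exponentiated, which is why it is asymmetric around the point estimate.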
Mortality risk. With SARS-CoV-2 infection as exposure and 1-year all-cause mortality as outcome, we estimated adjusted relative risk (RR), stratified by underlying conditions, for both prevalent and incident CKD, using a generalized linear model with Poisson distribution (log link) after adjusting for the following: (i) age and (ii) age and other potential confounders by exact matching based on ≥1 vaccination dose, age groups (5-year intervals), and sex, assessing matching quality using distributional plots. To estimate the overall effect of having an underlying condition, analyses were repeated with a generalized linear model for each condition, reporting respective RRs (with "SARS-CoV-2 positive" as another potential confounder in exact matching).
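The exact-matching step can be illustrated with a small sketch. Note the substitution: instead of the Poisson generalized linear model used in the study, the example pools strata with a Mantel-Haenszel risk ratio, which is a simpler stand-in chosen here so the code needs no external libraries. The record layout (`vaccinated`, `age`, `sex`, `infected`, `died`) and the function names are hypothetical.

```python
from collections import defaultdict


def stratum_key(person):
    """Exact-matching key: >=1 vaccine dose, 5-year age band, and sex."""
    return (person["vaccinated"], person["age"] // 5, person["sex"])


def matched_risk_ratio(people):
    """Mantel-Haenszel pooled risk ratio of death, infected vs noninfected.

    Strata lacking either infected or noninfected people are unmatched
    and are dropped, mirroring exact matching.
    """
    # Per stratum: [infected deaths, infected n, noninfected deaths, noninfected n]
    strata = defaultdict(lambda: [0, 0, 0, 0])
    for person in people:
        cell = strata[stratum_key(person)]
        offset = 0 if person["infected"] else 2
        cell[offset] += int(person["died"])
        cell[offset + 1] += 1

    numerator = denominator = 0.0
    for a, n1, c, n0 in strata.values():
        if n1 == 0 or n0 == 0:
            continue  # no match available in this stratum
        total = n1 + n0
        numerator += a * n0 / total
        denominator += c * n1 / total
    if denominator == 0:
        raise ValueError("no deaths among matched noninfected individuals")
    return numerator / denominator
```

A regression model, as used in the paper, would additionally allow adjustment for continuous covariates and yield model-based confidence intervals; the stratified ratio only conveys the matching logic.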
Incidence of SARS-CoV-2 infection. We estimated crude incidence rate of SARS-CoV-2 infection per 10,000 person-weeks, stratified by underlying conditions for incident and prevalent CKD.
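A crude incidence rate per 10,000 person-weeks is the number of infections divided by the total disease-free follow-up time. The sketch below shows the arithmetic; the record layout and the choice of follow-up end date (assumed here to be supplied by the caller, for example the earliest of infection, death, first-dose vaccination, or end of study) are assumptions rather than the study's implementation.

```python
from datetime import date


def person_weeks(start, end):
    """Follow-up time in weeks between two dates (never negative)."""
    return max((end - start).days, 0) / 7


def crude_incidence_rate(records, per=10_000):
    """Crude incidence rate per `per` person-weeks.

    records: iterable of (start_date, end_date, infected) tuples, where
             end_date is when disease-free follow-up stops (assumed to be
             precomputed by the caller).
    """
    records = list(records)
    events = sum(1 for _, _, infected in records if infected)
    total_weeks = sum(person_weeks(start, end) for start, end, _ in records)
    if total_weeks == 0:
        raise ValueError("no follow-up time recorded")
    return per * events / total_weeks


# Hypothetical example: one infection over 2 + 6 = 8 person-weeks.
example = [
    (date(2020, 3, 1), date(2020, 3, 15), True),
    (date(2020, 3, 1), date(2020, 4, 12), False),
]
rate = crude_incidence_rate(example)  # 1250.0 per 10,000 person-weeks
```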
Predicting and validating pandemic-related excess deaths. By Kaplan-Meier analyses, we estimated prepandemic baseline risk of 1-year all-cause mortality for prevalent CKD in NHSD TRE (2019) and CPRD cohorts (2014). We validated our published model 14,15 (to predict COVID-19-related excess death) using our risk estimates and applying 1-year population IR of 10%, and overall RR of mortality (set at 3) based on previous reports. 15,26 We predicted total excess deaths by (i) age groups and number of underlying conditions and (ii) underlying conditions, using assumed and observed IR and RR. The analysis was performed according to a prespecified analysis plan published on GitHub (https://github.com/BHFDSC/CCU003_01), including implementations and phenotypes.
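To make the prediction step concrete, the sketch below assumes that the published model takes the form: excess deaths = population × baseline 1-year mortality risk × infection rate × (RR − 1), summed over strata. That functional form is an assumption about the cited model and is not spelled out in this text; the function names and the example inputs are hypothetical.

```python
def predicted_excess_deaths(population, baseline_risk, infection_rate, relative_risk):
    """Excess deaths in one stratum under the assumed model form.

    population:     number of individuals in the stratum
    baseline_risk:  prepandemic 1-year all-cause mortality risk (0-1)
    infection_rate: 1-year population infection rate, IR (0-1)
    relative_risk:  RR of mortality associated with the pandemic (>= 1)
    """
    if not 0 <= baseline_risk <= 1 or not 0 <= infection_rate <= 1:
        raise ValueError("risks and rates must lie between 0 and 1")
    if relative_risk < 1:
        raise ValueError("relative risk below 1 is outside this sketch")
    # Only the infected share experiences the extra (RR - 1) multiple of baseline risk.
    return population * baseline_risk * infection_rate * (relative_risk - 1)


def total_excess_deaths(strata, infection_rate, relative_risk):
    """Sum excess deaths over (population, baseline_risk) strata,
    e.g. age group by number of underlying conditions."""
    return sum(
        predicted_excess_deaths(n, risk, infection_rate, relative_risk)
        for n, risk in strata
    )
```

Under this form, a relative risk of 1 gives zero excess deaths, and the prediction scales linearly with the infection rate, so swapping assumed values (IR 10%, RR 3) for observed values changes only the multiplier applied to baseline deaths.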

Role of the funding source
The funders of the study had no role in study design, data collection, data analysis, data interpretation, or writing of the report. AD, MAM, and AB had full access to all the data in the study; and AB had final responsibility for the decision to submit for publication.

Mortality risk
One-year all-cause mortality varied by age, sex, multimorbidity, and CKD stage (e.g., 0.2% in those aged ≤50 years, with no comorbidities, and with stage 3 CKD; 29.9% in those aged >80 years, with ≥3 comorbidities, and with stage 5 CKD; Supplementary Figure S6). The RR of 1-year all-cause mortality associated with SARS-CoV-2 infection was comparable between incident and prevalent cases of CKD, and highest for those on dialysis/transplantation (Supplementary Table S5). The risk of mortality was significantly lower in vaccinated than in nonvaccinated individuals with CKD (Supplementary Table S6) after exact matching and adjusting for age, sex, and testing positive for SARS-CoV-2 infection. Vaccine efficacy seemed to be highest in CKD patients on dialysis or with asthma, compared with other underlying conditions.

DISCUSSION
In this large, nationally representative cohort study of individuals with CKD, we had 4 findings. First, comorbidities and multimorbidity were common, and associated with SARS-CoV-2 infection and severe COVID-19. Second, 1-year mortality risk was high and dependent on age, underlying condition, stage of CKD, and incidence or prevalence of CKD, ranging from 0.5% to 37.2%. Third, the UK burden of COVID-19 excess deaths in individuals with CKD was >34,000 in 1 year and predictable using a simple, parsimonious model and routine EHRs. Fourth, we showed that vaccination was associated with reduced mortality risk.
Diabetes and CVD are well documented as major risk factors and comorbidities in people with CKD, whether in epidemiologic 27,28 or therapeutic research. 29 We describe, for the first time, distribution of comorbidities and multimorbidity across the whole spectrum of CKD, both prevalent and incident CKD in up-to-date national data for England. These data are important for planning services for treatment and prevention in individuals with CKD both during and after the pandemic. For example, 7% of individuals with incident or prevalent CKD have both diabetes and cancer; >10% have CVD and cancer. Projections of direct and indirect impact of COVID-19 have not considered overlap between diseases and treatments, probably leading to underestimation. Our finding of higher infection rates in those with dialysis/transplantation may be related to detection bias due to some regular monitoring of those patients for COVID-19 symptoms, resulting in a better detection of SARS-CoV-2 infection. In this context, developing a new condition (such as incident CKD) could potentially increase the contacts with health service that could have resulted in higher detection of infection in incident CKD than prevalent CKD. Despite that, the low rates observed for cancer patients could be related to shielding strategy in clinically vulnerable patients in the United Kingdom. Our results are in line with prior studies 13 showing higher infection rates in those with CKD. Future research should also address subtypes of CKD and trajectory by comorbidity profile to guide and prioritize preventive clinical and public health interventions.
We provide detailed large-scale, population-based analyses to give patients, health professionals, and policy makers an understanding of pre-COVID-19 and post-COVID-19 mortality risk in people with CKD, based on age, underlying conditions, and incident versus prevalent disease. Despite increasing clinical, societal, and scientific interest in precision medicine, CKD has not been comprehensively investigated, whether in terms of etiology, prognosis, or prevention research. 1,2,28 Such granular, personalized data can inform everything from risk prediction and public health projections to translational research and conversations with patients about individual risk.
Table 2 | Estimated 1-year excess deaths by population infection rate and relative impact of the pandemic using Lancet 2020 model 15
Excess deaths have been the main metric to measure direct and indirect COVID-19 impact, whether overall or in individuals with particular diseases. 14,15 We present the first analyses in individuals with CKD. These are projections over 1 year based on a published model 15 and consistent with current estimates of the UK's COVID-19 deaths. 26,27,30,31 The variations in pre-COVID-19 and post-COVID-19 mortality based on age and underlying conditions are consistent with observed variation in mortality rates during the pandemic. 27,32 The greater prediction accuracy of our model using assumed IR and RR values (10% and 3, respectively), compared with observed values (6.7% and 3.7, respectively), is likely to reflect underestimation of infection rate, even in near-complete national data. Further validation of our prediction model is required across different diseases, patterns of multimorbidity, and countries. Our approach highlights the feasibility of large-scale use of EHRs for pandemic preparedness, even with less contemporary, less complete data (e.g., CPRD from 2014), and the validity of our estimates of infection and excess deaths.
For example, our infection rate estimates in nondialysis patients with prevalent CKD (14.4

Strengths and limitations
This is the largest study to date of individuals with CKD in national EHRs to consider a wide range of comorbidities and COVID-19 mortality, but it has several limitations. Laboratory testing was not available, and phenotyping was based on SNOMED CT concepts with potential underestimation. We used validated CALIBER phenotypes 25 and methods, 34 but biases are possible. 35 We only investigated impact of underlying conditions, or effect of SARS-CoV-2 infection by individual comorbidities. Further studies should investigate comorbidity clusters and progression of CKD and outcomes. We were unable to study detailed ethnic categories because of data quality in EHRs. Our model rests on baseline risks. Underestimation or overestimation of excess deaths is possible for some underlying conditions being differentially affected by specific health policies (e.g., shielding) or by indirect effects of the pandemic (e.g., canceled procedures).

Implications for research and policy
There are 3 policy implications. First, our findings are consistent with a "syndemic," describing convergence of an infectious disease, undertreated noncommunicable diseases, and social determinants of health, 36 requiring multidisciplinary, rather than traditional, disease- and specialty-specific responses. Second, given high comorbidity burden, particularly CVD and cancer, it is important to mitigate against indirect effects, likely to disproportionately affect people with CKD. 14 Third, routine data can provide patients, public, professionals, and policy makers with tailored risk information because mortality is highly variable based on age, sex, multimorbidity, and disease stage, which can inform prepandemic and pandemic management, such as social isolation policies and vaccination prioritization in individuals with CKD.
There are 3 research implications. First, clustering approaches may inform and clarify subtype classification, trajectories, and risk prediction in CKD. Second, possible mechanisms underlying observed differences in mortality by age, comorbidities, ethnicity, stage of CKD, and other factors need investigation. Third, pathophysiology of CKD as a risk factor and an outcome in COVID-19 warrants further study, informing etiology, prevention, and intervention research.

Conclusions
In conclusion, individuals with CKD have a high burden of multimorbidity and high risk of prepandemic mortality across all stages of CKD and in prevalent and incident disease. We showed that the direct burden of the pandemic could be predicted using prepandemic, large-scale EHR data. The combined data for multimorbidity, CKD stage, and age could help prioritize patients for vaccination and post-COVID-19 policies, and inform the design of stratified pathways for CKD patients.

DATA STATEMENT
The data used in this study are available in NHS Digital's Trusted Research Environment for England (TRE), but as restrictions apply, they are not publicly available (https://digital.nhs.uk/coronavirus/coronavirus-data-services-updates/trusted-research-environment-service-for-england). The CVD-COVID-UK/COVID-IMPACT programme led by the British Heart Foundation (BHF) Data Science Centre (https://www.hdruk.ac.uk/helping-with-health-data/bhf-data-science-centre/) received approval to access data in NHS Digital's TRE from the Independent Group Advising on the Release of Data (https://digital.nhs.uk/about-nhs-digital/corporate-information-and-documents/independent-group-advising-on-the-release-of-data) via an application made in the Data Access Request Service Online system (reference DARS-NIC-381078-Y9C5K; https://digital.nhs.uk/services/data-access-request-service-dars/dars-products-and-services). The CVD-COVID-UK/COVID-IMPACT Approvals and Oversight Board (https://www.hdruk.ac.uk/projects/cvd-covid-uk-project/) subsequently granted approval to this project to access the data within NHS Digital's TRE for England. The deidentified data used in this study were made available to accredited researchers only. Those wishing to gain access to the data should contact bhfdsc@hdruk.ac.uk in the first instance.

ACKNOWLEDGMENTS
This work is performed with the support of the BHF Data Science Centre led by Health Data Research (HDR) UK (BHF grant SP/19/3/34678). This study makes use of deidentified data held in NHS Digital's Trusted Research Environment for England and made available via the BHF Data Science Centre's CVD-COVID-UK/COVID-IMPACT consortium. This work uses data provided by patients and collected by the NHS as part of their care and support. We would also like to acknowledge all data providers who make health-relevant data available for research. This study was funded by AstraZeneca UK Ltd. The BHF Data Science Centre (grant SP/19/3/34678, awarded to HDR UK) funded co-development (with NHS Digital) of the trusted research environment, provision of linked data sets, data access, user software licences, computational usage, and data management and wrangling support, with additional contributions from the HDR UK Data and Connectivity component of the UK Government Chief Scientific Adviser's National Core Studies program to coordinate national COVID-19 priority research. Consortium partner organizations funded the time of contributing data analysts, biostatisticians, epidemiologists, and clinicians.
Approval for the study was granted by the Independent Scientific Advisory Committee (20_074R) of the Medicines and Healthcare Products Regulatory Agency in the United Kingdom in accordance with the Declaration of Helsinki. The North East-Newcastle and North Tyneside 2 research ethics committee provided ethical approval for the CVD-COVID-UK/COVID-IMPACT research programme (REC 20/NE/0161) to access, within secure trusted research environments, unconsented, whole-population, de-identified data from electronic health records collected as part of patients' routine health care.
clinical investigation A Dashtban et al.: COVID-19 pandemic impact in chronic kidney disease

AUTHOR CONTRIBUTIONS
AB conceived the research question. AB, JBM, and TM obtained funding. AB and AD designed the study and analysis plan. SD, CS, and the BHF Data Science Centre CVD-COVID-UK/COVID-IMPACT consortium prepared the data, including electronic health record phenotyping in the CALIBER open portal. CS is the Director of the BHF Data Science Centre and coordinated approvals for and access to data within the NHS Digital Trusted Research Environment for England (TRE) for CVD-COVID-UK/COVID-IMPACT. AD prepared the chronic kidney disease (CKD) cohorts (including phenotyping of CKD stages), designed the incidence study, and performed the statistical analysis. MAM provided all required implementations for adding phenotypes and vaccination data in the TRE, as well as insightful comments throughout the research. AB and AD drafted the initial and final versions of the manuscript. All authors critically reviewed early and final versions of the manuscript.

SUPPLEMENTARY MATERIAL
Supplementary File (Word)
Figure S1. Study population of prevalent and incident chronic kidney disease in England (NHS Digital Trusted Research Environment for England [NHSD TRE] data for England).
Figure S4. Prevalence of underlying conditions by age group, chronic kidney disease (CKD) stage, and sex, in individuals with prevalent (n = 1,934,585) and incident (n = 144,969) CKD during the coronavirus disease 2019 (COVID-19) pandemic.
Figure S5. Association between severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection and underlying conditions in individuals with chronic kidney disease (CKD) in (A) all prevalent CKD (n = 1,934,585) and (B) the nonsurvival group (i.e., those not surviving to 1-year follow-up during pandemic; n = 172,789).
Figure S6. One-year all-cause mortality (percentage) in individuals with prevalent (n = 1,934,585) chronic kidney disease (CKD) by number of underlying conditions, age, sex, and CKD stage, using NHS Digital Trusted Research Environment for England (NHSD TRE) data on March 1, 2020.
Figure S7. Covariate balance before and after exact matching for prevalent (n = 1,934,585) and incident (n = 144,969) chronic kidney disease, using standardized mean difference in all individuals and those with cancer, diabetes, and dialysis/transplantation.
Figure S8. Observed age-specific, unadjusted relative risk of mortality and population infection rate during first year of coronavirus disease 2019 (COVID-19) pandemic in individuals with prevalent chronic kidney disease (n = 1,934,585).
-19) deaths in individuals with prevalent chronic kidney disease by age group during 1 year of pandemic, predicted by Lancet 2020 model 15 (population infection rate, 10%; relative risk, 3) using prepandemic study population in NHS Digital Trusted Research Environment for England (NHSD TRE; predicted n = 28,746) and Clinical Practice Research Datalink (CPRD; predicted n = 24,546), compared with actual excess deaths (observed n = 34,265).
Table S1. Code list used to identify chronic kidney disease in primary and secondary care, including International Classification of Diseases, Tenth Revision (ICD-10), codes and SNOMED CT concepts.
Table S2. Baseline characteristics in individuals with prevalent chronic kidney disease (CKD; n = 1,934,585) at the onset of and during coronavirus disease 2019 (COVID-19) pandemic (from March 1, 2020): age, sex, stages of CKD, underlying conditions, and COVID-19 mortality.
Table S3. Baseline characteristics of incident (n = 144,969) chronic kidney disease (CKD) during coronavirus disease 2019 (COVID-19) pandemic (March 1, 2020, to March 1, 2021): age, sex, stages of CKD, underlying conditions, and 28-day COVID-19 mortality.
Table S4. Association between underlying conditions and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection (compared with noninfected individuals) in individuals with chronic kidney disease (CKD) in (A) all prevalent (n = 1,934,585) or incident CKD (n = 144,969) and (B) the nonsurvival group (i.e., those not surviving to 1-year follow-up during pandemic; n = 172,789).
Table S5. Association between underlying conditions and 1-year all-cause mortality for prevalent (n = 1,934,585) and incident (n = 144,969) chronic kidney disease.
Table S6. Association between coronavirus disease 2019 (COVID-19) vaccination and 1-year all-cause mortality by underlying condition for prevalent (n = 1,934,585) and incident (n = 144,969) chronic kidney disease.
Table S7. Incidence rate of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection per 10,000 person-weeks in prevalent (n = 1,934,585) and incident (n = 144,969) chronic kidney disease (CKD) over 1 year of the coronavirus disease 2019 (COVID-19) pandemic: (A) crude and (B) adjusted based on first COVID-19 vaccination, underlying conditions, and CKD stage.
Table S8. Baseline characteristics of prepandemic prevalent chronic kidney disease (CKD) in the NHS Digital Trusted Research Environment for England (NHSD TRE; n = 1,727,130; January 1, 2019) and Clinical Practice Research Datalink (CPRD; n = 174,648; April 1, 2014) by age, sex, stages of CKD, and underlying conditions.