Objectives: Clinicians often face uncertainty when interpreting whether a decline in estimated glomerular filtration rate is within the patient’s expected range of fluctuation or if the decline signals a substantial deviation. Thus, accurate predictions of glomerular filtration rate can be an early warning system, prompting timely interventions, such as biopsies to preclude early graft rejection and adjustments in immunosuppression. Traditional models, encompassing linear and conventional methods, typically struggle with variabilities and complexities in posttransplant data.
Materials and Methods: We evaluated the efficacy of a gradient boosting model in predicting posttransplant glomerular filtration rate, to potentially enhance accuracy over traditional prediction approaches. Our patient dataset included 68 pediatric patients aged 1 to 18 years who underwent kidney transplant between 2017 and 2023 at Baskent University Hospital (Ankara, Turkey). The dataset comprised 2285 glomerular filtration rate measurements, along with patient demographics and transplant-related data. For our model, we included “days to transplant” (glomerular filtration rate values pretransplant), “days from transplant” (glomerular filtration rate values up to 7 days posttransplant), patient age, sex, and donor types. We divided the dataset into a training set (70%) and a test set (30%). To evaluate model performance, we used mean absolute error and root mean squared error, with a focus on the accuracy of glomerular filtration rate predictions at various posttransplant stages.
Results: In the training set, the gradient boosting model demonstrated a significant improvement in prediction accuracy, achieving a mean absolute error of ~5.64 mL/min/1.73 m2.
Conclusions: Our model underscored the promise of advanced machine learning techniques in refining prediction of glomerular filtration rate after kidney transplant. With its augmented precision, the model can support clinicians in making informed decisions regarding early biopsies and interventions, thus highlighting the vital role of sophisticated analytical methods in medical prognosis and the monitoring of pediatric patient care.
Key words : Graft function, Machine learning technique, Renal transplantation
Introduction
Recent decades have witnessed substantial improvements in the field of solid-organ transplant, which has greatly improved the lives of those with chronic kidney disease (CKD). Kidney transplants have emerged as a key intervention, substantially boosting the quality of life of pediatric patients. Although recent advancements in pre- and posttransplant care have successfully improved short-term graft survival rates, the long-term prognosis remains challenging. Complicating factors like acute rejection, delayed graft function, the ages of recipients and donors, and the extended use of immunosuppressive drugs persist as major obstacles to long-term graft success.1,2
Thus, the early and accurate diagnosis of rejection is crucial for graft survival. Currently, the allograft biopsy, interpreted via the Banff classification, stands as the gold standard for diagnosing rejection in kidney transplant patients.3 This system categorizes rejection into T-cell-mediated and antibody-mediated types, offering a detailed histopathological basis for evaluation of acute transplant lesions.3 However, the invasiveness of biopsies, along with their associated risks, such as potential bleeding, sampling errors, and costs, has spurred research into less invasive diagnostic alternatives. These include analyzing blood and urine for biochemical and molecular markers, a method referred to as “liquid biopsy,” which has shown promise for improving rejection prevention, monitoring, and treatment.4,5
The use of estimated glomerular filtration rate (eGFR) is pivotal in assessing graft function, with numerous studies underscoring its importance in predicting the long-term success of kidney transplants.1,6,7 Yet, the broad variability and complex interpretation of eGFR results have led to clinical uncertainties; that is, a challenge remains regarding whether fluctuations in eGFR represent normal variations or whether fluctuations indicate changes that warrant further evaluations.7 Accurate eGFR predictions are invaluable, acting as an early warning system to prompt clinical interventions like biopsies to forestall early graft rejection or adjustments in immunosuppression strategies.
This is where machine learning (ML), a key branch of artificial intelligence, comes into play; ML can address these challenges by analyzing vast datasets to identify patterns that could lead to improvements in prediction accuracy, thereby enhancing patient and graft outcomes in the long term.8,9 Recent advancements in ensemble learning have become crucial in refining the precision of predictive analytics across various fields, including health care. Ensemble learning techniques, such as boosting and stacking, have been shown to enhance disease prediction rates by leveraging multiple methods to reduce bias during training and variance during testing phases.10 Specifically, boosting algorithms have demonstrated high accuracy in CKD prediction, with results up to 100% for training datasets and 98.47% for testing datasets.11
The use of gradient boosting for eGFR prediction after kidney transplant marks a notable advance in predictive methodologies. Unlike methods like random forests, where decision trees are built independently, gradient boosting constructs decision trees incrementally, with each tree correcting the errors left by the previous ones. This iterative correction process can make gradient boosting a more precise tool. Our goal is to evaluate, for the first time in pediatric kidney transplant recipients, the effectiveness of gradient boosting in predicting eGFR 3 months posttransplant.
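The stage-wise error correction described above can be illustrated with a minimal sketch. It assumes squared-error loss, shallow regression trees, and synthetic data. The function names `fit_boosted_trees` and `predict_boosted`, the learning rate, and the number of stages are illustrative choices, not details taken from the study.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def fit_boosted_trees(X, y, n_stages=50, learning_rate=0.1, max_depth=2):
    """Fit trees one after another; each tree is trained on the residuals
    (the negative gradient of squared-error loss) left by earlier stages."""
    baseline = float(np.mean(y))            # stage 0: a constant prediction
    prediction = np.full(len(y), baseline)
    trees = []
    for _ in range(n_stages):
        residuals = y - prediction          # errors of the current ensemble
        tree = DecisionTreeRegressor(max_depth=max_depth, random_state=0)
        tree.fit(X, residuals)
        prediction += learning_rate * tree.predict(X)
        trees.append(tree)
    return baseline, trees


def predict_boosted(X, baseline, trees, learning_rate=0.1):
    """Add the shrunken contribution of every tree to the baseline."""
    prediction = np.full(X.shape[0], baseline)
    for tree in trees:
        prediction += learning_rate * tree.predict(X)
    return prediction


# Synthetic, purely illustrative data (not patient data).
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 90, size=(200, 1))
y_demo = 60 + 20 * np.sin(X_demo[:, 0] / 15) + rng.normal(0, 3, size=200)

baseline_demo, trees_demo = fit_boosted_trees(X_demo, y_demo)
fitted_demo = predict_boosted(X_demo, baseline_demo, trees_demo)

# Training error of the constant baseline versus the boosted ensemble.
mse_baseline = float(np.mean((y_demo - baseline_demo) ** 2))
mse_boosted = float(np.mean((y_demo - fitted_demo) ** 2))
```

In a random forest, by contrast, each tree would be fitted to the target itself on a bootstrap sample, and the trees would be averaged rather than summed.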
Materials and Methods
In this single-center retrospective study, we included 68 pediatric patients aged 1 to 18 years who underwent kidney transplant from 2017 to 2023 at Baskent University Hospital (Ankara, Turkey). Exclusion criteria were applied to recipients older than 18 years, those receiving repeated kidney transplants to the same or a different recipient, and recipients lacking preoperative kidney function data for at least 10 consecutive measurements. Preoperative details such as primary disease, clinical presentation, induction therapy, and panel reactive antibody specifics, as well as donor information, including donor type and renal transplant laterality, were extracted from the hospital’s electronic medical records system.
Data handling
The dataset comprised 2285 eGFR measurements, along with patient demographics and transplant-related data. Patient eGFR values were obtained through the revised Schwartz equation. Key features algorithmically selected for their predictive value included “days to transplant” (all available GFR values before transplant), “days from transplant” (GFR values up to 7 days posttransplant), patient age, sex, and donor type (Figure 1). Data were stored in an Excel file.
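As a point of reference, the eGFR calculation can be sketched as follows. The sketch assumes the commonly cited bedside form of the revised Schwartz equation (eGFR = 0.413 × height in cm / serum creatinine in mg/dL). The text does not state which variant or units were used, so the function name, the input values, and the constant should be read as assumptions.

```python
def egfr_bedside_schwartz(height_cm: float, serum_creatinine_mg_dl: float) -> float:
    """Return eGFR in mL/min/1.73 m2 using the bedside Schwartz form.

    Assumes height in centimeters and serum creatinine in mg/dL.
    """
    if height_cm <= 0 or serum_creatinine_mg_dl <= 0:
        raise ValueError("height and creatinine must be positive")
    return 0.413 * height_cm / serum_creatinine_mg_dl


# Hypothetical values for illustration only (not patient data).
example_egfr = egfr_bedside_schwartz(height_cm=140.0, serum_creatinine_mg_dl=0.7)
```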
Model development and validation
Data were imported from an Excel file using the pandas library, encompassing demographic, transplant-related, and temporal data. The model included only specific features for prediction (days to transplant, days from transplant, age at test, and sex), with GFR as the target variable. Instances with missing values were excluded to maintain data integrity. The dataset was then divided into a training set (70%) and a test set (30%) using the “train_test_split” function, ensuring class proportions were maintained. We used the GradientBoostingRegressor class from the scikit-learn package, taking advantage of gradient boosting as a method that builds an additive model in a forward stage-wise manner, allowing for the optimization of arbitrary differentiable loss functions (Figure 2). Finally, we stored the trained model with joblib so that it could be reused to predict GFR values during the initial 3 months posttransplant.
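The workflow described above can be sketched roughly as follows. This is not the study code: a small synthetic table stands in for the Excel file (which would normally be loaded with `pd.read_excel`), and the column names, random seeds, default hyperparameters, and file name are all hypothetical.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the study spreadsheet; column names are hypothetical.
rng = np.random.default_rng(42)
n_rows = 500
data = pd.DataFrame({
    "days_to_transplant": rng.integers(0, 365, n_rows),
    "days_from_transplant": rng.integers(0, 90, n_rows),
    "age_at_test": rng.uniform(1, 18, n_rows),
    "sex": rng.integers(0, 2, n_rows),  # encoded as 0/1
})
data["gfr"] = (
    40
    + 0.4 * data["days_from_transplant"]
    + 0.5 * data["age_at_test"]
    + rng.normal(0, 5, n_rows)
)
data.loc[data.index[:10], "gfr"] = np.nan  # simulate missing measurements

feature_columns = ["days_to_transplant", "days_from_transplant", "age_at_test", "sex"]

# Drop rows with missing values in any feature or in the target.
clean = data.dropna(subset=feature_columns + ["gfr"])

# 70% training / 30% test split.
X_train, X_test, y_train, y_test = train_test_split(
    clean[feature_columns], clean["gfr"], test_size=0.30, random_state=42
)

model = GradientBoostingRegressor(random_state=42)
model.fit(X_train, y_train)

# Persist the fitted model and reload it, as one would for later predictions.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "gfr_gbr.joblib")
    joblib.dump(model, model_path)
    reloaded = joblib.load(model_path)

test_predictions = reloaded.predict(X_test)
```

Because each patient contributes many measurements, a split by patient (for example with scikit-learn's `GroupShuffleSplit`) is one way to keep a patient's values from appearing in both sets. The text does not say whether the split was made by row or by patient.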
Statistical analyses and programming language
We conducted statistical analyses with Python, leveraging libraries such as pandas for data manipulation and analysis and scikit-learn for ML tasks. Python’s comprehensive ecosystem, including NumPy for numerical operations and Matplotlib for data visualization, facilitated detailed data analysis and model development. We used gradient boosting, a robust predictive modeling technique known for its effectiveness in handling diverse data types and improving prediction accuracy compared with traditional models. We evaluated the model’s performance with mean absolute error, which measures the average magnitude of prediction errors, and root mean squared error, which penalizes large errors more heavily and equals the standard deviation of the prediction errors when their mean is zero.
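The two error metrics can be computed as in the short sketch below. The observed and predicted values are made up for illustration and are not study data.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Hypothetical observed and predicted GFR values (mL/min/1.73 m2).
observed = np.array([62.0, 75.0, 88.0, 54.0])
predicted = np.array([60.0, 80.0, 85.0, 60.0])

# Mean absolute error: the average of |observed - predicted|.
mae = mean_absolute_error(observed, predicted)

# Root mean squared error: the square root of the mean squared difference.
rmse = float(np.sqrt(mean_squared_error(observed, predicted)))
```

Root mean squared error is never smaller than mean absolute error on the same predictions, and a large gap between the two suggests that a few large errors dominate.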
Results
Sixty-eight pediatric kidney transplant recipients (male/female, 39/29) with a median age of 12 years (interquartile range, 2-18 y) met the inclusion criteria. The main etiologies for CKD were congenital anomalies of the kidney and urinary tract (n = 25; 36.76%), glomerulonephritis (n = 9; 13.24%), and nephronophthisis (n = 8; 11.76%). Living related donors contributed to 85.29% of the transplants, whereas deceased donors accounted for 14.71%. Before transplant, 86.76% of patients underwent dialysis, with 13.24% receiving a preemptive transplant.
We used 1600 eGFR results (70%) to train the model (training set) and 685 eGFR results (30%) to test the model (testing set). The gradient boosting model demonstrated a mean absolute error of 5.64 mL/min/1.73 m2 and a root mean squared error of 9.26 mL/min/1.73 m2, indicating a high level of accuracy in predicting posttransplant GFR values within the first 3 months.
Discussion
To the best of our knowledge, this study is the first to use ML for predicting posttransplant eGFRs among pediatric kidney transplant recipients, marking an important advancement in improving postoperative care and monitoring.
Pediatric patients with end-stage renal disease undergoing hemodialysis are closely monitored, with their blood creatinine levels maintained within a stable range. However, the organ shortage and the resulting increase in patients with end-stage renal disease often extend the waiting period before transplant can occur.11 Patients are closely followed both before and after transplant, generating thousands of blood parameter data points that require careful analysis. Despite this extensive follow-up, the underlying message of the collective data remains elusive. The variability and fluctuation of eGFR measurements, a focal point of numerous studies, present challenges for clinicians in drawing solid conclusions about whether these fluctuations indicate a failing graft or are part of the normal disease course.7,13,14 The use of ML techniques, such as gradient boosting methods, allows for a more accurate prediction of outcomes after solid-organ transplant, demonstrating the potential of these advanced approaches in enhancing patient care.
Previously, several studies have demonstrated the utility of ML approaches such as latent class linear mixed models, autoregressive integrated moving average (ARIMA) models, and sequence-to-sequence modeling to distinguish or predict renal function trajectories in patients with CKD.7,13,14 In our study, we used a gradient boosting model to predict eGFR after kidney transplant, with an aim to enhance the precision of posttransplant care, particularly for pediatric recipients. This approach aligns with the previous work of Van Loon and colleagues, who developed a sequence-to-sequence deep learning model to forecast patient-specific eGFR trajectories, highlighting the potential of ML in transplant medicine.7 Although our methodology diverges by focusing on gradient boosting to leverage historical data for accurate future GFR estimations, both studies underscore the critical role of advanced analytics in improving outcome predictions posttransplant. This methodological advancement enables clinicians to monitor GFR more effectively, facilitating early identification of potential complications and allowing for timely clinical interventions. This capability is paramount in improving patient outcomes, reducing the incidence of graft rejection, and optimizing utilization of health care resources.15
Our findings underscore the vital role of sophisticated analytical methods in refining clinical decision-making and patient care strategies in the context of kidney transplant. Advancing model interpretability remains a critical challenge. Making gradient boosting models easier to understand will facilitate their clinical adoption, enabling clinicians to understand and trust the factors driving the predictions.16 Furthermore, integrating these predictive models into health care information technology ecosystems, thus allowing real-time data analysis, will enhance their utility in clinical practice. The exploration of novel ML architectures and patient-centered research avenues promises to further refine predictive models for posttransplant care, aligning with the principles of personalized medicine and improving quality of life for transplant recipients.
Although our findings are promising, the study’s scope is currently limited to a pediatric population within a single-center setting. Future research should aim to validate these results across broader populations and multiple centers, incorporating a wider array of variables to enhance predictive accuracy and model robustness.
In conclusion, the application of gradient boosting techniques in GFR prediction after kidney transplant offers a pioneering approach to personalized patient monitoring and care. This study not only demonstrated the technical efficacy of these models but also highlighted the broader implications for clinical practice and future research in the field of organ transplant.
Volume : 22
Issue : 10
Pages : 78 - 82
DOI : 10.6002/ect.pedsymp2024.O18
From the 1Department of Pediatrics, the 2Department of Pediatric Nephrology, and the 3Division of Transplantation, Department of General Surgery, Baskent University Faculty of Medicine, Ankara, Turkey
Acknowledgements: The authors have not received any funding or grants in support of the presented research or for the preparation of this work and have no declarations of potential conflicts of interest.
Corresponding author: Meraj Alam Siddiqui, Department of Pediatrics, Başkent University Faculty of Medicine, Ankara, Turkey
Tel: +90 507 8280649
E-mail: siddiqui@baskent.edu.tr
Figure 1. Methodology for Constructing and Validating a Gradient Boosting Machine Learning Model Using Data From 68 Patients Undergoing Kidney Transplant
Figure 2. Data Processing and Machine Learning Approaches Implemented in Python