Objectives: Survival after liver transplant depends on pretransplant, peritransplant, and posttransplant factors. Identifying effective factors for patient survival after transplant can help transplant centers make better decisions.
Materials and Methods: Our study included 902 adults who received livers from deceased donors from March 2011 to March 2014 at the Shiraz Organ Transplant Center (Shiraz, Iran). In a 3-step feature selection method, effective features of 6-month survival were extracted by (1) F statistics, Pearson chi-square, and likelihood ratio chi-square tests and (2) 5 machine-learning techniques; (3) we then constructed a model using all factors identified in the previous steps. To evaluate the performance of the machine-learning techniques, Cox regression was also applied to the data set. Evaluations were based on the area under the receiver operating characteristic curve and the sensitivity of the models.
Results: The model predicted survival based on 26 identified effective factors. In the following order, graft failure, Aspergillus infection, acute renal failure and vascular complications after transplant, as well as graft failure diagnosis interval, previous diabetes mellitus, Model for End-Stage Liver Disease score, donor inotropic support, units of packed cell received, and previous recipient dialysis, were found to be predictive factors in patient survival. The area under the receiver operating characteristic curve and model sensitivity were 0.90 and 0.81, respectively.
Conclusions: Data mining analyses can help identify effective features of patient survival after liver transplant and build models with equal or higher performance than Cox regression. The order of influential factors identified with the machine-learning model was close to clinical experiments.
Key words: Cox regression, Data mining, Deceased-donor transplant, Feature selection
Liver transplant (LT) is the preferred choice of therapy for patients with end-stage liver disease. Posttransplant outcomes have improved as the result of advances in LT.1 However, due to shortages of donated livers and factors affecting successful LT, liver allocation faces several challenges, and graft survival and patient survival are important research problems.2 In recent years, researchers have conducted studies using Model for End-Stage Liver Disease (MELD) to identify whether it can predict survival after LT.3-7 Evidence has proved that MELD alone is not sufficient to predict survival following LT.4,8,9 Several studies have considered various characteristics to predict LT patient survival.10-13 Although the estimation of LT survival based on pre-LT variables could be beneficial for liver allocation policies,14 enhancing post-LT management can prolong survival time.
Health data sets are imbalanced, complex, and ambiguous; therefore, choosing a good predictive technique to analyze health data is challenging.15 Researchers have previously tried to identify key survival factors of LT patients with statistical methods.16-21 These methods rely on assumptions that limit their application; for example, Cox regression assumes that key factors have similar effects on survival over time.22 In recent years, several studies have predicted LT patient survival with machine-learning methods. Unlike common survival analysis methods, machine learning is not limited by any assumption about the data distribution or structure, and it can model negative and positive targets based on nonstructured, complex, and noisy data.23-25 Extracting hidden knowledge through the adaptation of machine-learning algorithms is not straightforward, but such algorithms can recognize relations and patterns in the data by learning and can rank the importance of key factors in unstructured and imbalanced data sets.
Machine-learning techniques have previously been used to model organ transplant outcomes, including in LT.25-29 In some of these studies, machine learning predicted graft and patient survival more accurately than other validated scores and Cox regression.24,25,29,30
This study aimed to model the survival of adult LT recipients by identifying key factors that can affect patient survival and to rank the importance of effective factors. In this regard, we investigated pretransplant, peritransplant, and posttransplant factors by using machine-learning techniques and, to confirm our approach, compared the performances with Cox regression performance.
Materials and Methods
A retrospective cross-sectional study was conducted on adult patients (≥ 18 years old) who received a first LT between March 2011 and March 2014 at the Shiraz Organ Transplant Center (Shiraz, Iran). Because this center performs long-term follow-up, patients who had received a liver from either an extended-criteria or a non-extended-criteria brain-dead donor could be included. An extended-criteria donor was defined as a donor ≥ 60 years old, cold ischemia time ≥ 14 hours, or donor hypotension. Of the 932 patients initially identified, 30 met the following exclusion criteria: lack of medical records or impossibility of 6-month follow-up due to a changed phone number or migration to another city or country. The follow-up status of each included patient was recorded as either alive or dead.
Fifty-eight variables regarding recipients, donors, surgical procedures, and complications (n = 13, 20, 7, and 18, respectively) were collected as input variables. The demographic and clinical variables of all patients were collected until either death or survival at follow-up. During data collection from patient medical records, researchers compiled metadata to provide definition, validity, relevancy, and consistency of data. A database of 932 patients was created. Our data set included categorical, continuous, and nominal data. With the exception of continuous variables, other variables were displayed as numeric codes. Data encoding was conducted based on metadata to improve consistency.
Value consistency was checked during data entry. If inconsistent values were detected, the data were verified against the patient's medical records, and the correct values were entered into the data set. Missing values were first filled by rechecking the medical records, which provided the most accurate information. To retain only relevant and informative features, variables with < 30% missing values were kept in the analysis; thus, 5 donor-related variables (presence of blood hypotension, hepatitis B core antibody in the intensive care unit, donor previous history of diabetes mellitus, alcoholic features, and blood hypertension) were excluded. The remaining missing values were imputed with the mode for categorical variables and the mean for continuous variables.
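The imputation rule described above (mode for categorical variables, mean for continuous ones) can be sketched in a few lines of Python; the column values below are invented for illustration and are not from the study data set.

```python
from statistics import mean, mode

def impute(column, is_categorical):
    """Fill missing entries (None) with the mode of observed values
    for categorical columns, or the mean for continuous columns."""
    observed = [v for v in column if v is not None]
    fill = mode(observed) if is_categorical else mean(observed)
    return [fill if v is None else v for v in column]

# Illustrative columns: recipient blood group (categorical), donor age (continuous)
blood_group = ["A", "O", None, "O", "B", None, "O"]
donor_age = [34.0, None, 51.0, 47.0, None, 28.0]

print(impute(blood_group, is_categorical=True))   # missing filled with "O"
print(impute(donor_age, is_categorical=False))    # missing filled with 40.0
```

In practice the fill value is computed on the model-building set only, then reused for the validation set, so the validation data do not leak into the imputation.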
Our approach was a 3-step data mining process based on feature selection, aimed at finding effective variables. Figure 1 shows a schematic representation of the study methodology. In the first feature selection step, important variables were identified by statistical tests of the association between each variable and the output variable. The importance of each candidate predictor was calculated based on its P value, computed with the F statistic for continuous variables and the Pearson chi-square or likelihood ratio chi-square test for categorical variables. Only 3 variables were unimportant; these were filtered out before modeling.
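As an illustration of the univariate screen in this first step, the F statistic for a continuous predictor against the binary survival outcome can be computed directly. A minimal sketch with invented values follows (in practice, the P value from the F distribution decides inclusion):

```python
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F statistic: between-group variance over
    within-group variance, used here to screen a continuous
    predictor against the binary survival outcome."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = mean(v for g in groups for v in g)
    msb = sum(len(g) * (mean(g) - grand) ** 2 for g in groups) / (k - 1)
    msw = sum((v - mean(g)) ** 2 for g in groups for v in g) / (n_total - k)
    return msb / msw

# Hypothetical MELD-like scores, split by 6-month outcome
survived = [12, 15, 14, 18, 11, 16]
died = [24, 27, 22, 29]
print(round(f_statistic([survived, died]), 2))
```

A larger F statistic indicates a stronger association between the predictor and the outcome, and hence a smaller P value.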
Five well-known machine-learning techniques, that is, support vector machine (SVM),31 Bayesian network,32 C5.0 decision tree,33 multilayer perceptron neural network (MLP),34 and K-nearest neighbors,35 were applied in the second round of feature selection. Classifiers were built on the training set, and the second set was then used to validate the models. Next, to compare the models with a standard and popular method, Cox regression was fitted to the data set. In step 3, the machine-learning models were compared with each other based on performance evaluation criteria, and the best-performing machine-learning model was applied to the effective features selected in step 2.
Statistics and performance metrics
Sensitivity, specificity, precision, accuracy, F measure, and area under the receiver operating characteristic (ROC) curve were calculated to evaluate the effectiveness of features as identified by models. True positives (TP) refer to patients who died and who were correctly predicted as dead. False positives (FP) refer to patients who survived and who were incorrectly predicted as dead. True negatives (TN) refer to patients who survived and who were correctly predicted as having survived. False negatives (FN) refer to deceased patients who were incorrectly predicted as having survived.
The performance metrics were defined as follows: sensitivity = TP/(TP + FN); specificity = TN/(TN + FP); precision = TP/(TP + FP); and accuracy = (TP + TN)/(TP + FP + TN + FN).
F measure is defined as the harmonic mean of sensitivity and specificity and ranges between 0 and 1. The area under the ROC curve (AUC) takes values from 0 to 1, in which a value > 0.9 is considered excellent discrimination, a value between 0.8 and 0.9 a good classifier, and a value between 0.5 and 0.8 useful.
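Using the TP, FP, TN, and FN counts defined above, these metrics can be computed directly. A minimal sketch follows, with the F measure taken as the harmonic mean of sensitivity and specificity as defined in this study; the counts are invented for illustration, not taken from the study's validation set:

```python
def metrics(tp, fp, tn, fn):
    """Performance metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # recall on deaths
    specificity = tn / (tn + fp)          # recall on survivors
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F measure as defined in this study: harmonic mean of
    # sensitivity and specificity
    f_measure = 2 * sensitivity * specificity / (sensitivity + specificity)
    return sensitivity, specificity, precision, accuracy, f_measure

# Hypothetical counts on a 268-patient validation set
sens, spec, prec, acc, f = metrics(tp=25, fp=8, tn=228, fn=7)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

Note how, on an imbalanced outcome, accuracy can stay high even when sensitivity on the minority (death) class is modest, which is why several metrics are reported together.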
According to the AUC and sensitivity on the validation set, we selected the machine-learning models that performed well at steps 2 and 3. Based on the effective factors found across all well-performing models in step 2, the data were analyzed a second time with the best-performing technique in step 3 (Figure 1), and the most important factors affecting patient survival were proposed and ranked by importance.
P values < .05 were considered statistically significant. MedCalc for Windows version 14.8.1 (MedCalc Software, Ostend, Belgium) was used to calculate the AUC. We used IBM SPSS statistics for Windows version 23 (IBM SPSS Inc., Armonk, NY, USA) for statistical analysis. We used IBM SPSS Modeler for Windows, version 14.2 (IBM SPSS Inc.), a predictive analytics platform, to perform data mining analyses. We also used the IBM SPSS Modeler for Cox regression analyses.
Detailed characteristics are shown in Table 1. Follow-up was completed in 902 patients (mean ± standard deviation age of 41.5 ± 13.3 y; range, 18-74 y), comprising 571 men (63.3%) and 331 women (36.7%). Ninety-four patients died from LT complications during the 6-month follow-up period. Survival probability of patients was 89.6% by the Kaplan-Meier method. For the analysis, patients were randomly divided into either the model building (n = 634) or internal validation (n = 268) group. Survival rates of the model building and validation subsets were 90.22% and 88.06%, respectively. Overall mean survival time in this study was 165.06 ± 46.74 days; in the model building set, it was 166 ± 45.32 days, and in the validation set, it was 162.85 ± 49.94 days. The log-rank and Breslow test results showed no significant difference between the survival curves for the model building and validation sets (P = .328 and P = .325, respectively).
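The Kaplan-Meier estimate used above multiplies, over the ordered death times, the conditional probabilities of surviving each time point. A minimal pure-Python sketch follows; the follow-up times and events are invented for illustration, not drawn from the study data:

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate. `times` are follow-up times in
    days; `events` flags death (True) vs censoring (False). Returns
    the survival probability at the last observed death time."""
    s = 1.0
    for t in sorted(set(t for t, e in zip(times, events) if e)):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        s *= 1 - deaths / at_risk  # conditional survival at time t
    return s

# Ten hypothetical patients followed for up to 180 days; 3 deaths
times = [180, 180, 45, 180, 90, 180, 180, 180, 120, 180]
events = [False, False, True, False, True, False, False, False, True, False]
print(round(kaplan_meier(times, events), 3))  # → 0.7
```

The log-rank and Breslow tests mentioned above then compare two such survival curves, weighting the event times equally (log-rank) or by the number at risk (Breslow).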
In this study, the most frequent cause of liver failure was hepatitis B virus, found in 24.7% of all patients. Primary sclerosing cholangitis, cryptogenic cirrhosis, and autoimmune hepatitis were found in 18.8%, 16.9%, and 15% of participants, respectively. Other causes of liver failure, namely alcoholic liver disease, hepatitis C virus, nonalcoholic fatty liver disease, Wilson disease, fulminant hepatic failure, primary biliary cirrhosis, and Budd-Chiari disease, accounted for 1.4%, 3.3%, 1.1%, 5.5%, 0.7%, 2%, and 3.5%, respectively. Hepatocellular carcinoma was found in 7.6% of patients. A MELD score > 20 was found in 402 patients (44.57%), and a MELD score ≤ 20 in 500 patients (55.43%). The percentages of patients with Child-Pugh class A, B, and C scores were 10.4%, 49.8%, and 39.8%, respectively, showing that about one-half of patients were in class B. A previous history of diabetes mellitus was found in 124 recipients (13.75%), and 150 patients (16.63%) were diagnosed with diabetes mellitus after LT.
Data mining results: Classifier performance
The validation results are shown in Table 2. The minimum and maximum accuracies of the machine-learning models were 0.92 and 0.96, shown by the MLP and SVM models, respectively. Thus, the machine-learning models correctly classified at least 92% of all patients, on par with Cox regression (94% accuracy).
However, what matters most in LT studies is correct prediction of death, that is, sensitivity. The sensitivity of the SVM model was higher than that of the other machine-learning techniques in this study, and the sensitivity of the Cox model was at or below that of 3 of the machine-learning models. Because the data set was imbalanced (94 patients died and 808 survived), a measure that evaluates performance on the minority class was essential, although specificity was also important. We therefore considered AUC as a fair base criterion. The ROC curves of the models are shown in Figure 2. Compared with the other models in step 2, MLP also performed well; that is, the SVM and MLP models were not significantly different based on AUC (P = .1968), but SVM was more sensitive (sensitivity of 0.78 vs 0.69 for MLP). Therefore, SVM was selected as the well-performing model in step 2. When we compared SVM, as the selected machine-learning model, with Cox regression based on AUC, SVM performed better (Cox AUC = 0.78), and the 2 models were significantly different (P = .0079). Other criteria, such as precision and F measure, also favored the SVM model.
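As a base criterion, the AUC equals the probability that a randomly chosen patient who died is assigned a higher predicted risk than a randomly chosen survivor. A minimal sketch of this rank-based (Mann-Whitney) formulation follows, with invented risk scores rather than the study's model outputs:

```python
def auc(pos_scores, neg_scores):
    """AUC via the Mann-Whitney formulation: the fraction of
    (death, survivor) score pairs ranked correctly, counting
    ties as half a win."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical predicted risks: deaths (positives) vs survivors (negatives)
died = [0.9, 0.8, 0.6, 0.7]
survived = [0.2, 0.4, 0.6, 0.3, 0.1]
print(auc(died, survived))  # → 0.975
```

Because this formulation weights every death-survivor pair equally, it is insensitive to the 94:808 class imbalance in a way that raw accuracy is not.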
Because the SVM model outperformed the other machine-learning models, we used it to analyze the data set again in the step 3 feature selection based on the effective variables (Figure 2). Performance results of the final model are also shown in Table 2. The final SVM model performed well on the data set (AUC = 0.90 and sensitivity = 0.81). Indeed, the results showed how the 3-step removal of irrelevant variables could improve the AUC in survival prediction.
Feature selection results
Before partitioning, in the first feature selection step, 3 variables (donor liver microvesicular steatosis, recipient sex, and peritransplant portal vein thrombosis) were not selected. In the second step, to discover the hidden influential features in the patient data set, machine-learning analysis was conducted on the data set based on 50 independent variables. Each technique was considered a feature selection method, resulting in 5 models, each suggesting a set of features.
The results revealed that postoperative factors could be more effective than demographic and pretransplant factors with regard to patient survival. Compared with the other machine-learning models in step 2, the order of important variables in SVM was more consistent with clinical experience (Table 3). Although not all of the technically important factors were clinically discriminative, a possibly significant effect on survival could be predicted from a combination of factors.
In the step 2 analysis, graft failure, Aspergillus infection, vascular complications, acute renal failure, graft type, and graft failure diagnosis interval (GFDI) were identified as predictors (in 5, 3, 3, 2, 2, and 2 models, respectively), with other important factors presented in only some models or one model (Table 3).
Based on the 26 features from step 2, the third feature selection step was applied on the data set. The final model, as shown in Table 4, included complications at high ranks and other recipient-, donor-, and surgery-related factors at lower ranks.
Graft failure, with an importance of 0.47, was identified as the most influential factor; Aspergillus infection and acute renal failure, each with an importance of 0.08, were the second most important factors. The importance of vascular complications was 0.07, and that of GFDI (primary nonfunction or malfunction) was 0.05. The importance of the preoperative predictive factors in the model, namely recipient history of diabetes mellitus, MELD score, use of inotropic support for the donor, and units of packed red blood cells received by the recipient, was 0.03, 0.02, 0.02, and 0.01, respectively. Sixteen other factors had less effect on patient survival after LT.
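Predictor importance values such as those above are typically normalized to sum to 1 across all inputs and then ranked. A small sketch of that ranking step, with hypothetical raw scores rather than the study's actual values:

```python
def rank_importances(raw):
    """Normalize raw importance scores to sum to 1 and sort them
    in descending order, as in a predictor-importance table."""
    total = sum(raw.values())
    return sorted(((name, v / total) for name, v in raw.items()),
                  key=lambda item: item[1], reverse=True)

# Hypothetical raw scores for a few of the step 3 predictors
raw = {"graft failure": 47, "Aspergillus infection": 8,
       "acute renal failure": 8, "vascular complications": 7,
       "GFDI": 5, "previous diabetes mellitus": 3}
for name, imp in rank_importances(raw):
    print(f"{name}: {imp:.2f}")
```

Because the values are normalized, each importance can be read as the factor's share of the model's total predictive weight.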
In our study, 115 patients received organs from extended-criteria donors (ECD); these patients had a survival rate of 79%. Patients who received a liver from an ideal donor had a survival rate of 91%. Although ECD was not identified as a predictive factor in the final model, it was identified as an independent risk factor in the SVM model in step 2.
Cox regression was used to compare prediction performances. Table 5 shows the feature selection results of Cox regression; Cox regression was not considered in the third step of feature selection.
The results of our study showed that the machine-learning model outperformed the Cox regression model. The accuracy of the machine-learning model was higher than that of the Cox model, as was its sensitivity in detecting effective factors (Table 2). In addition to increased sensitivity and AUC, the machine-learning model reduced the subset of informative features and ranked the predictive factors appropriately. Furthermore, this study showed that, in high-dimensional LT data sets, the Cox regression model had lower sensitivity than the machine-learning models. Thus, data mining can be a beneficial method for identifying predictive factors for patient survival after LT in data sets with high dimensionality.
Based on our machine-learning results, the order of predictive factors for LT patient survival in step 3 of the machine-learning model was close to that seen in clinical experience (Table 4). The results demonstrated that graft failure is overwhelmingly the strongest predictive factor, as we expected; although not surprising, this result confirmed the performance of our machine-learning method. As shown in Table 4, several variables, including post-LT Aspergillus infection, post-LT acute renal failure, vascular complications, GFDI, recipient history of diabetes mellitus, MELD score, donor inotropic support in the intensive care unit, number of packed red blood cell units, and pre-LT dialysis, have more important effects on patient survival after LT. Haseli and associates36 also reported that patient survival after LT was influenced by graft type, Pediatric End-Stage Liver Disease or MELD score, complications after LT, and initial diagnosis. In a study similar to ours, Khosravi and associates reported that the order of influential factors in the Cox regression model could be close to clinical findings.25 We found, however, that the factors extracted by the machine-learning models were consistent with clinical experience. It is not clear why some technical differences exist between our study and the research from Khosravi and colleagues; a possible explanation is the difference in our approach, in which we filtered irrelevant variables in 3 steps and selected influential variables as inputs for the next phase.
Patients who receive ECD livers have outcomes similar to those who receive normal livers.37 In our study, we considered recipients of both normal and ECD livers and investigated whether ECD is an influential factor. In step 2 of feature selection, it was a predictive factor; however, it was not among the 26 features extracted in the last step. Therefore, ECD may be an essential factor for better understanding patient survival, but it is not more important than the other variables extracted in step 3. Complications are potential causes of death after LT surgery; indeed, based on the step 3 results, the post-LT predictive factors were more influential than the others. Our extracted discriminative factors were in line with the results of Haseli and colleagues36 and Khosravi and associates.25
Our results showed that the number of packed cell units is a predictive factor of survival after LT, which is in accordance with previous studies.38-40 Furthermore, a history of diabetes mellitus could have a negative effect on patient survival.41-43 Diabetes mellitus after LT also increased patient morbidity.44 We found that recipient diabetes mellitus was an important factor; however, development of diabetes mellitus post-LT was removed from the model. Post-LT acute renal failure was extracted as a predictive factor in our study, showing the substantial impact of this factor on survival. Other studies also revealed that post-LT acute renal failure and recipient history of diabetes mellitus could predict survival in LT recipients.45,46 We found that history of dialysis could also be a predictive factor for patient survival. Furthermore, our study showed that donor inotropic support in an intensive care unit was as important as MELD score and the number of packed red blood cell units for LT survival prediction. However, Feng and associates47 did not identify inotropic support as related to mortality after LT.
Most studies on LT survival have had a number of limitations. A major limitation of our study was the lack of a central data repository; hence, data had to be collected manually from different records at different places in the center. Therefore, data collection was time consuming and preprocessing was difficult.
The second limitation of this study was the lack of knowledge of the nutrition and lifestyle characteristics of patients. In addition, investigating other survival times, such as 12 or 24 months or more, with classification techniques would require separate runs on separate data sets, because some complications may occur in the second 6 months or the second year after LT but not in the first 6 months or first year. This situation may introduce missing values into the data set and may change the performance of the models and the possible influential factors of patient survival at 12 and 24 months or more. Another limitation of this study was its imbalanced data set, which made the analysis complex; however, we showed how machine learning could correctly recognize minority class examples.
Our study aimed to model patient survival after LT using machine-learning methods to investigate influential factors and to compare the performance of these methods with a classic statistical method. We applied a 3-step machine-learning feature selection model to determine the factors affecting patient survival after LT. Our data mining approach improved the performance of predicting LT outcomes, and predicting patient survival after LT by machine learning led to a more consistent selection of important factors that agreed with clinical experience. We recognize that a large cohort study including the characteristics considered here, together with post-LT nutrition and lifestyle variables, may allow a better understanding of the factors associated with patient survival after LT. Creating an LT data warehouse can help facilitate research and improve data quality.
DOI: 10.6002/ect.2018.0170
From the 1Student Research Committee, School of Management and Medical Information Sciences, Shiraz University of Medical Sciences, Shiraz, Iran; the 2Abualisina Transplant Hospital, Shiraz University of Medical Sciences, Shiraz, Iran; the 3Department of Computer Science and Engineering and IT, School of Electrical Engineering and Computer, Shiraz University, Shiraz, Iran; and the 4Health Human Resources Research Center, School of Management and Medical Information Sciences, Shiraz University of Medical Sciences, Shiraz, Iran
Acknowledgements: This research was performed in partial fulfillment of the requirements for the MSc degree in Medical Informatics, Shiraz University of Medical Sciences in Shiraz, Iran. This study was supported by the Vice Chancellor for research affairs of Shiraz University of Medical Sciences (proposal number of 93-7456). We thank the Research Vice Chancellor of Shiraz University of Medical Sciences for financially supporting the research, the Research Consultation Center for statistics consultations, and the Center for Development of Clinical Research of Nemazee Hospital for editorial assistance from Dr. Shokrpour. The authors have no conflicts of interest to declare.
Corresponding author: Roxana Sharifian, School of Management and Medical Information Sciences, Almas Building, Alley 29, Qasrodasht Ave, Shiraz, Iran; or Ashkan Sami, Department of CSE and IT, School of Electrical Engineering and Computer Science, Shiraz University, Mollasadra Street, Shiraz, Iran
Phone: +88 6451371989
E-mail: Sharifianr@sums.ac.ir or Sami@shirazu.ac.ir / Ashkan.Sami@gmail.com
Table 1. Characteristics of Study Patients (N = 902)
Table 1 (continued) Characteristics of Study Patients (N = 902)
Table 2. Performance Criteria of All Models on Validation Set, Sorted by Area Under the Curve
Table 3. Effective Variables Based on Data Mining Methods
Table 4. Effective Variables of Final Machine-Learning Model
Table 5. Effective Variables Based on Cox Regression
Figure 1. Methodology Representation
Figure 2. Receiver Operating Characteristic Curves of Step 2 Models to Compare Methods