Using Data Mining to Predict Hospital Admissions from the Emergency Department

Abstract
Crowding within Emergency Departments (EDs) can have significant negative consequences for patients. EDs therefore need to explore innovative methods to improve patient flow and prevent overcrowding. One potential method is data mining with machine learning techniques to predict ED admissions. This study uses routinely collected administrative data (120,600 records) from two major acute hospitals in Northern Ireland to compare contrasting machine learning algorithms in predicting the risk of admission from the ED. Drawing on logistic regression, we identify several factors related to hospital admissions, including hospital site, age, arrival mode, triage category, care group, previous admission in the past month, and previous admission in the past year. This study highlights the potential utility of three common machine learning algorithms in predicting patient admissions. Practical implementation of the models developed in this study in decision support tools would provide a snapshot of predicted admissions from the emergency department at a given time, allowing for advance resource planning and the avoidance of bottlenecks in patient flow, as well as comparison of predicted and actual admission rates. When interpretability is a key consideration, EDs should consider adopting logistic regression models, although gradient boosted machines (GBMs) will be useful where accuracy is paramount.
Existing System
Using a range of clinical and demographic data relating to elderly patients, LaMantia et al. used logistic regression to predict admissions to hospital and ED re-attendance. They predicted admissions with moderate accuracy, but were unable to predict ED re-attendance accurately. The most important factors predicting admission were age, Emergency Severity Index (ESI) triage score, heart rate, diastolic blood pressure, and chief complaint (pg. 255). Baumann and Strout also found an association between the ESI and admission of patients aged over 65. Boyle et al. used historical data to develop forecast models of ED presentations and admissions. Model performance was evaluated using the mean absolute percentage error (MAPE), with the best attendance model achieving a MAPE of around 7%, and the best admission model achieving a MAPE of around 2% for monthly admissions. Using historical data alone to predict future events has the advantage of allowing forecasts further into the future, but the disadvantage of not incorporating data captured at arrival and through triage, which may improve the accuracy of short-term forecasting of admissions.
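To make the MAPE figures above concrete, the following is a minimal sketch of how the metric is computed. The monthly counts are hypothetical and are not taken from Boyle et al. or from this study; the function name `mape` is likewise chosen for illustration.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    # Average the absolute error relative to each actual value.
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical monthly admission counts (illustrative only).
actual_admissions = [1000, 1100, 950, 1200]
forecast_admissions = [980, 1122, 950, 1176]

print(round(mape(actual_admissions, forecast_admissions), 2))  # about 1.5 (percent)
```

Because each error is scaled by the actual count, MAPE is comparable across sites of different size, but it is undefined for periods with zero actual admissions.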
Cameron et al. used logistic regression to predict admission at the time of triage. The most important predictors in their model included ‘triage category, age, National Early Warning Score, arrival by ambulance, referral source, and admission within the last year’ (pg. 1), with an area under the curve of the receiver operating characteristic (AUC-ROC) of 0.877. Other variables, including weekday, out-of-hours attendances, and female gender, were significant but did not have high enough odds ratios to be included in the final models. Kim et al. used routine administrative data to predict emergency admissions, also using a logistic regression model. However, their model was less accurate, with an accuracy of 76% for their best model.
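The AUC-ROC reported above can be read as the probability that a randomly chosen admitted patient receives a higher predicted risk than a randomly chosen discharged patient. The sketch below computes it directly from that definition; the labels and scores are hypothetical, and the pairwise approach is chosen for clarity rather than speed.

```python
def auc_roc(labels, scores):
    """AUC-ROC via pairwise comparison of positive and negative scores.

    labels: 1 for admitted, 0 for not admitted.
    scores: predicted admission risk for each patient.
    Tied scores count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical patients: three admitted, four discharged.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

print(round(auc_roc(labels, scores), 3))  # 11 of 12 pairs ranked correctly
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation, which is why AUC-ROC is less sensitive to the admission rate than raw accuracy.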
Proposed System
The method for this study involved seven data mining tasks: 1. Data extraction; 2. Data cleansing and feature engineering; 3. Data visualisation and descriptive statistics; 4. Data splitting into training (80%) and test (20%) sets; 5. Model tuning using the training set and 10-fold cross validation repeated 5 times; 6. Predicting admissions based on the test data set; and 7. Evaluating model performance based on predictions made on the test data. Tuning on resampled training data and evaluating on a held-out test set helps to ensure the models are well tuned and guards against overfitting.
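Steps 4 and 5 can be sketched as follows. This is an illustrative outline only: the source does not state which software was used, the helper names are hypothetical, and the splits here are purely random rather than stratified by outcome, which is an assumption.

```python
import random


def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle record indices and hold out a fraction as the test set."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]  # (train, test)


def repeated_kfold(train_indices, k=10, repeats=5, seed=0):
    """Yield (fit, validation) index lists: k folds, reshuffled each repeat."""
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = list(train_indices)
        rng.shuffle(shuffled)
        folds = [shuffled[i::k] for i in range(k)]
        for i, validation in enumerate(folds):
            fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
            yield fit, validation


# 1,000 hypothetical records: 80% for training and tuning, 20% held out.
train_idx, test_idx = train_test_split_indices(1000)
splits = list(repeated_kfold(train_idx))

print(len(train_idx), len(test_idx), len(splits))  # 800 200 50
```

Each candidate hyperparameter setting would be fitted on `fit` and scored on `validation` for all 50 resamples, and the test indices would be used only once, for the final evaluation in steps 6 and 7.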
The study was based on administrative data, all of which was recorded on electronic systems and subsequently warehoused for business intelligence, analytics, and reporting purposes. The data was recorded during the 2015 calendar year and includes all ED attendances at two major acute hospitals situated within a single Northern Ireland health and social care trust. The trust offers a full range of acute, community, and social care services across a variety of settings. Both hospitals offer a full range of inpatient, outpatient, and emergency services and have close links to other areas of the healthcare system, such as community and social services. Hospital 1 is the larger of the two, treating approximately 60,000 inpatients and day cases and 75,000 outpatients each year, whilst Hospital 2 treats approximately 20,000 inpatients and day cases and 50,000 outpatients.
CONCLUSION

This study involved the development and comparison of three machine learning models aimed at predicting hospital admissions from the ED. The models were trained on routinely collected ED data using three different data mining algorithms, namely logistic regression, decision trees, and gradient boosted machines (GBMs). Overall, the GBM performed best, but the decision tree and logistic regression also performed well. The three models presented in this study yield comparable, and in some cases improved, performance compared to models presented in other studies. Implementation of the models as a decision support tool could help hospital decision makers to plan and manage resources more effectively based on the expected patient inflow from the ED. This could help to improve patient flow and reduce ED crowding, thereby reducing its adverse effects and improving patient satisfaction. The models also have potential application in performance monitoring and audit by comparing predicted admissions against actual admissions. However, whilst the models could be used to support planning and decision making, individual-level admission decisions still require clinical judgement.
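One way such a snapshot could be derived from patient-level predictions is sketched below. It assumes that the expected number of admissions is the sum of the individual predicted probabilities and that patients' outcomes are independent; both are simplifying assumptions made here for illustration and are not described in the study, and all values are hypothetical.

```python
import math


def expected_admissions(probabilities):
    """Return the expected admission count and its standard deviation.

    Treats each patient as an independent Bernoulli trial (an assumption),
    so the variance of the count is the sum of p * (1 - p).
    """
    expected = sum(probabilities)
    std_dev = math.sqrt(sum(p * (1.0 - p) for p in probabilities))
    return expected, std_dev


# Hypothetical predicted admission risks for five patients currently in the ED.
current_risks = [0.9, 0.5, 0.2, 0.1, 0.3]
expected, std_dev = expected_admissions(current_risks)

# Hypothetical number of these patients who were actually admitted (audit use).
actual_admitted = 3

print(round(expected, 2), round(std_dev, 2), actual_admitted - round(expected, 2))
```

Comparing the actual count with the expected count over many periods, rather than a single snapshot, would be the more meaningful basis for performance monitoring.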
REFERENCES

[1] J.S. Olshaker, N.K. Rathlev, Emergency Department overcrowding and ambulance diversion: The impact and potential solutions of extended boarding of admitted patients in the Emergency Department, J. Emerg. Med. 30 (2006) 351–356. doi:10.1016/j.jemermed.2005.05.023.
[2] J. Boyle, M. Jessup, J. Crilly, D. Green, J. Lind, M. Wallis, P. Miller, G. Fitzgerald, Predicting emergency department admissions, Emerg. Med. J. 29 (2012) 358–365. doi:10.1136/emj.2010.103531.
[3] S.L. Bernstein, D. Aronsky, R. Duseja, S. Epstein, D. Handel, U. Hwang, M. McCarthy, K.J. McConnell, J.M. Pines, N. Rathlev, R. Schafermeyer, F. Zwemer, M. Schull, B.R. Asplin, The effect of emergency department crowding on clinically oriented outcomes, Acad. Emerg. Med. 16 (2009) 1–10. doi:10.1111/j.1553-2712.2008.00295.x.
[4] D.M. Fatovich, Y. Nagree, P. Sprivulis, Access block causes emergency department overcrowding and ambulance diversion in Perth, Western Australia, Emerg. Med. J. 22 (2005) 351–354. doi:10.1136/emj.2004.018002.
[5] M.L. McCarthy, S.L. Zeger, R. Ding, S.R. Levin, J.S. Desmond, J. Lee, D. Aronsky, Crowding Delays Treatment and Lengthens Emergency Department Length of Stay, Even Among High-Acuity Patients, Ann. Emerg. Med. 54 (2009) 9–13. doi:10.1016/j.annemergmed.2009.03.006.
[6] D.B. Richardson, Increase in patient mortality at 10 days associated with emergency department overcrowding, Med. J. Aust. 184 (2006) 213–216.
[7] N.R. Hoot, D. Aronsky, Systematic Review of Emergency Department Crowding: Causes, Effects, and Solutions, Ann. Emerg. Med. 52 (2008). doi:10.1016/j.annemergmed.2008.03.014.
[8] Y. Sun, B.H. Heng, S.Y. Tay, E. Seow, Predicting hospital admissions at emergency department triage using routine administrative data, Acad. Emerg. Med. 18 (2011) 844–850. doi:10.1111/j.1553-2712.2011.01125.x.
[9] M.A. LaMantia, T.F. Platts-Mills, K. Biese, C. Khandelwal, C. Forbach, C.B. Cairns, J. Busby-Whitehead, J.S. Kizer, Predicting hospital admission and returns to the emergency department for elderly patients, Acad. Emerg. Med. 17 (2010) 252–259. doi:10.1111/j.1553-2712.2009.00675.x.
[10] J.S. Peck, S.A. Gaehde, D.J. Nightingale, D.Y. Gelman, D.S. Huckins, M.F. Lemons, E.W. Dickson, J.C. Benneyan, Generalizability of a simple approach for predicting hospital admission from an emergency department, Acad. Emerg. Med. 20 (2013) 1156–1163. doi:10.1111/acem.12244.
[11] A. Cameron, K. Rodgers, A. Ireland, R. Jamdar, G.A. McKay, A simple tool to predict admission at the time of triage, Emerg. Med. J. 32 (2013) 174–179. doi:10.1136/emermed-2013-203200.
[12] N. Esfandiari, M.R. Babavalian, A.M.E. Moghadam, V.K. Tabar, Knowledge discovery in medicine: Current issue and future trend, Expert Syst. Appl. 41 (2014) 4434–4463. doi:10.1016/j.eswa.2014.01.011.