Revision as of 01:45, 24 May 2018
Full article title | DataCare: Big data analytics solution for intelligent healthcare management |
---|---|
Journal | International Journal of Interactive Multimedia and Artificial Intelligence |
Author(s) | Baldominos, Alejandro; de Rada, Fernando; Saez, Yago |
Author affiliation(s) | Universidad Carlos III de Madrid, Camilo José Cela University |
Primary contact | Email: abaldomi at inf dot uc3m dot es |
Year published | 2018 |
Volume and issue | 4(7) |
Page(s) | 13–20 |
DOI | 10.9781/ijimai.2017.03.002 |
ISSN | 1989-1660 |
Distribution license | Creative Commons Attribution 3.0 Unported |
Website | http://www.ijimai.org/journal/node/1621 |
Download | http://www.ijimai.org/journal/sites/default/files/files/2017/03/ijimai_4_7_2_pdf_16566.pdf (PDF) |
Abstract
This paper presents DataCare, a solution for intelligent healthcare management. This product is able not only to retrieve and aggregate data from different key performance indicators in healthcare centers, but also to estimate future values for these key performance indicators and, as a result, fire early alerts when undesirable values are about to occur or provide recommendations to improve the quality of service. DataCare’s core processes are built over a free and open-source cross-platform document-oriented database (MongoDB), and Apache Spark, an open-source cluster computing framework. This architecture ensures high scalability capable of processing very high data volumes coming at rapid speeds from a large set of sources. This article describes the architecture designed for this project and the results obtained after conducting a pilot in a healthcare center. Useful conclusions have been drawn regarding how key performance indicators change based on different situations, and how they affect patients’ satisfaction.
Keywords: Architecture, artificial intelligence, big data, healthcare, management
Introduction
When managing a healthcare center, there are many key performance indicators (KPIs) that can be measured, such as the number of events, the waiting time, the number of planned tours, etc. Often, keeping these KPIs within the expected limits is vital to achieving high user satisfaction.
In this paper we present DataCare, a solution for intelligent healthcare management. DataCare provides a complete architecture to retrieve data from sensors installed in the healthcare center, process and analyze it, and finally obtain relevant information, which is displayed in a user-friendly dashboard.
The advantages of DataCare are twofold: first, it is intelligent. Besides retrieving and aggregating data, the system is able to predict future behavior based on past events. This means that the system can fire early alerts when a KPI is expected to have a future value that falls outside the expected boundaries, and it can provide recommendations for improving the behavior and the metrics, or prevent future problems with attending events.
Second, the core system module is built on top of a big data platform. Processing and analysis are run over Apache Spark, and data are stored in MongoDB, thus enabling a highly scalable system that can process large volumes of data coming in at very high speeds.
This article will discuss many aspects of DataCare. The next section will present context for this research by analyzing the state of the art and related work. After that an overview of DataCare’s architecture will be presented, including the three main modules responsible for retrieving data, processing and analyzing it, and displaying the resulting valuable information.
After the architecture has been explained, the subsequent three sections will describe the preprocessing, processing, and analytics engines in further detail. The design of these systems is crucial to providing a scalable solution with an intelligent behavior. After discussing those engines in detail, the article will then describe the visual analytics engine and the different dashboards that are presented to users.
Finally, the penultimate section will describe how the solution has been validated, and the last section will provide some conclusive remarks, along with potential future work.
State of the art
Because healthcare services are very complex and life-critical, many works have tackled the design of healthcare management systems aimed at monitoring metrics in order to detect undesirable behaviors that decrease patients' satisfaction or even threaten their safety.
Discussion of the design and implementation of healthcare management systems is not new. In 2000, Curtright et al.[1] described a system to monitor KPIs and summarize them in a dashboard report, with a real-world application at the Mayo Clinic. Also, Griffith and King[2] proposed establishing a “championship” in which healthcare systems with consistently good metrics would help improve decision-making processes.
Some of these works explore the sensing technologies that enable such proposals. For instance, Ngai et al.[3] focus on how RFID technology can be applied to build a healthcare management system, yet it is only implemented in a quasi real-world setting. Ting et al.[4] also focus on the application of RFID technology to such a project, from the perspective of its preparation, implementation, and maintenance.
Some previous works have also tackled the design of intelligent healthcare management systems. Recently Jalal et al.[5] have proposed an intelligent, depth video-based human activity recognition system to track elderly patients that could be used as part of a healthcare management and monitoring system. However, the paper does not explore this integration. Also, Ghamdi et al.[6] have proposed an ontology-based system for prediction of patients’ readmission within 30 days so that those readmissions can be prevented.
Regarding the impact of data in a healthcare management system, the importance of data-driven approaches has been addressed by Bossen et al.[7] Roberts et al.[8] have explored how to design healthcare management systems using a design thinking framework. Basole et al.[9] propose a web-based game using organizational simulation for healthcare management. Zeng et al.[10] have proposed an enhanced VIKOR method that can be used as a decision support tool in healthcare management contexts. A relevant work from Mohapatra[11] explores how a hospital information system is used for healthcare management to improve the KPIs; a pilot conducted in Kalinga hospital (India) turned out to be beneficial for all stakeholders.
Some works have also explored how to increase patients’ satisfaction. For example, Fortenberry and McGoldrick[12] suggest improving the patient experience via internal marketing efforts, while Minniti et al.[13] propose a model in which patient feedback is processed in real time, driving rapid cycle improvement.
To place this work into its context, what we have developed is a data-driven intelligent healthcare management system. Because of the volume and velocity of big data, we have used a big data architecture based on the one proposed by Baldominos et al.[14], but updating the tools to use Apache Spark for the sake of efficiency. Also, a pilot has been conducted to evaluate the performance of the proposed system.
Overview of the architecture
DataCare’s architecture comprises three main modules: the first oversees retrieving and aggregating the information generated in the health center or hospital, the second processes and analyzes the data, and the third displays the valuable information in a dashboard, allowing the integration with external information systems.
Figure 1 depicts a broad overview of this architecture, while the following describes each of the modules in further detail.
Data retrieval and aggregation module
Data retrieval is carried out by AdvantCare software, developed by Itas Solutions S.L. AdvantCare is a set of hardware and software tools designed to manage communications between patients and healthcare staff. Its core comprises three main systems: 1) Buslogic manages and aggregates the information of actions carried out by nondoctor personnel (nurses and nursing assistants), 2) AdvantControl monitors and controls the infrastructure, and 3) EasyConf manages voice communication.
In hospital rooms, different data acquisition systems are placed, which often consist of hardware devices connected to an IP network and include one of the following elements:
- sensors such as thermometers or noise or light sensors measuring some current value or status either in a continuous or periodic fashion and sending it to Buslogic or AdvantControl servers;
- assistance devices such as buttons or pull handles that are activated by patients and transmit the assistance call to the Buslogic server;
- voice and video communication systems that send and receive information from other devices or from Jitsi (SIP Communicator), which are handled by EasyConf; or
- data acquisition systems operated by means of graphical user interfaces in devices such as tablets, e.g., surveys or other information systems.
In general terms, the information retrieved by AdvantCare belongs to one of the following:
- Planned tours: Healthcare personnel will periodically visit certain rooms or patients as a part of a pre-established plan. Data about how shifts are carried out is essential to evaluate assistance quality and the efficiency of nurses and nursing assistants.
- Assistance tasks: Nurses and nursing assistants must perform certain tasks as a response to an assistance call. It is desirable to know these tasks in advance so that they can be monitored properly.
- Patient satisfaction: The most important subjective metric of service quality is the patient's satisfaction, which is obtained by means of surveys.
As noted earlier, AdvantCare software comprises three systems, as well as communication/integration interfaces.
Buslogic
This software oversees communication with the assistance call systems. It also handles GestCare and MediaCare, which are the systems used for task planning, personnel work schedules, patient information, satisfaction surveys, and entertainment. Buslogic retrieves core business information about the assistance process, including alerts, waiting times to assist patients, and achieved assistance objectives.
AdvantControl
This software controls and monitors the infrastructure and automation functionalities, including the status of lights, doors, or the DataCare infrastructure itself. It provides real-time alerts about possible quality of service issues.
EasyConf
This software manages SIP Communicator and provides data about calls such as the origin, the destination, and the total call duration.
Communication/Integration APIs
Data can be retrieved from AdvantCare servers by means of SOAP web services, which are used for requests that are stateless and require high processing capacity. The information can also be accessed via a REST application programming interface (API), where calls are performed through HTTP requests and data is exchanged in JSON-serialized format. REST servers are hosted on the software servers themselves (Buslogic, AdvantControl, or EasyConf), thus allowing real-time queries as well as parameter modifications. Finally, a TELNET channel allows asynchronous communication to broadcast events from the servers to the connected clients.
Data processing and analysis module
The Data Processing and Analysis module is part of a big data platform based on Apache Spark[15], which provides an integrated environment for developing and running real-time analyses of massive data. Spark outperforms other solutions such as Hadoop MapReduce or Storm, scales out up to 10,000 nodes, provides fault tolerance[16], and allows queries using a SQL-like language.
As shown in Figure 1, this module comprises four different systems: Preprocessing Engine, Processing Engine, Big Data and Historic Data Warehouses, and Analytics Engine.
Preprocessing Engine
This system performs the ETL (extract-transform-load) processes for the AdvantCare data. It first communicates with AdvantCare using the available APIs to retrieve the data, which is later transformed into a suitable format to be fed to the Processing Engine. Because of the metadata provided by AdvantCare, the information can be classified to ease its analysis. Normalized and consolidated data gets stored in MongoDB, the leading free and open-source document-oriented database, where collections store both data for real-time analysis and historic data that supports batch analysis of how different metrics evolve over time.
Processing Engine
This system runs over the Spark computing cluster and oversees data consolidation processes for periodically aggregating data, also supporting the alert and recommendation subsystems.
Data Warehouses
Data filtered by the Preprocessing Engine and enriched by the Processing Engine gets stored in the Big Data Warehouse, responsible for storing real-time information. Additionally, the Historic Data Warehouse stores aggregated historic data, which gets used by the Analytics Engine to identify new trends or trend shifts for the different quality metrics.
Analytics Engine
This system runs the batch processes that apply statistical analysis methods, as well as machine learning algorithms, over real-time big data. Along with the historic data, time series and ARIMA (autoregressive integrated moving average) techniques provide a diagnosis of the temporal behavior of the model. This engine also implements a Bayes-based early alerts system (EAS) able to detect and predict when the service quality or efficiency metrics fall below a preset threshold, sending alerts in the form of push or email notifications.
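As a rough illustration of forecasting a metric and checking it against a threshold, the sketch below fits an AR(1) model, the simplest special case of ARIMA (ARIMA(1,0,0)), by ordinary least squares and raises an alert when the one-step forecast falls below a preset value. This is not the Bayes-based EAS described in the article, whose details are not given here; the function names, the sample series, and the threshold are all hypothetical.

```python
from typing import Sequence, Tuple


def fit_ar1(series: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    x_prev, x_next = series[:-1], series[1:]
    n = len(x_prev)
    if n < 2:
        raise ValueError("need at least three observations")
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    var_prev = sum((p - mean_prev) ** 2 for p in x_prev)
    if var_prev == 0:
        raise ValueError("series is constant; phi is not identifiable")
    cov = sum((p - mean_prev) * (q - mean_next)
              for p, q in zip(x_prev, x_next))
    phi = cov / var_prev
    c = mean_next - phi * mean_prev
    return c, phi


def early_alert(series: Sequence[float], threshold: float) -> Tuple[float, bool]:
    """Return the one-step-ahead forecast and whether it is below the threshold."""
    c, phi = fit_ar1(series)
    forecast = c + phi * series[-1]
    return forecast, forecast < threshold


# Hypothetical quality metric that has been decreasing over five periods.
metric = [10.0, 9.0, 8.2, 7.56, 7.048]
forecast, alert = early_alert(metric, threshold=7.0)
print(f"forecast={forecast:.4f} alert={alert}")
```

A production system would more likely rely on a full ARIMA implementation with differencing and moving-average terms; the closed-form fit above is only meant to make the forecast-then-compare step concrete.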
Data visualization module
This module provides a reporting dashboard that receives information from the big data platform in real time and displays two panels. The first panel shows the main quality and efficiency metrics in real time, along with their evolution over time and the quality thresholds. The second panel provides the diagnoses computed by the Analytics Engine, as well as intelligent recommendations to prevent reaching undesired situations, such as metrics falling below acceptable thresholds.
The dashboard is implemented using the D3.js library, which provides intuitive visualizations.
Preprocessing Engine
The Preprocessing Engine performs the ETL process over the data, and this section describes how different data are extracted from the various sources, transformed and loaded as a part of this process.
Extraction
This engine extracts the assistance call data by polling the AdvantCare module every five minutes, retrieving all data generated by all the rooms. Data from planned tours are also retrieved by polling the REST API, in this case daily, while patients’ satisfaction surveys are loaded as CSV files.
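The polling step can be sketched as follows. The article does not give the AdvantCare REST paths, so the URL below is a placeholder, and the function names and the injectable `fetch`, `handle`, and `sleep` parameters are assumptions made for illustration and testing; only the five-minute interval and the JSON-over-HTTP exchange come from the text.

```python
import json
import time
from typing import Callable
from urllib.request import urlopen

# Hypothetical endpoint; the real AdvantCare REST paths are not given in the article.
EVENTS_URL = "http://advantcare.example/api/events"
POLL_INTERVAL_SECONDS = 5 * 60  # "every five minutes", as stated in the text


def fetch_events(url: str = EVENTS_URL) -> list:
    """Fetch one JSON-serialized batch of events over HTTP."""
    with urlopen(url, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def poll(fetch: Callable[[], list],
         handle: Callable[[list], None],
         cycles: int,
         interval: float = POLL_INTERVAL_SECONDS,
         sleep: Callable[[float], None] = time.sleep) -> None:
    """Call fetch() `cycles` times, hand each batch to handle(), and wait in between."""
    for i in range(cycles):
        handle(fetch())
        if i < cycles - 1:  # no need to wait after the final cycle
            sleep(interval)


if __name__ == "__main__":
    # Demonstration with a stubbed fetcher and no real waiting.
    poll(lambda: [{"ev": "EV PERA"}], print, cycles=2, sleep=lambda s: None)
```

Passing `fetch_events` as `fetch` would perform real HTTP requests; the stubbed call above keeps the example runnable without a server.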
Transformation
The Preprocessing Engine performs several transformation tasks so that data is in a suitable format to be handled by the Processing Engine and the Analytics Engine.
Assistance task events
Assistance task events get transformed into MongoDB documents, where each event is stored in a separate document, and all of them belong to the events collection. When an event's status changes (e.g., from “activated” to “notified”), the document is updated to reflect the change.
Figure 2 shows a sample document representing an event.
{
  "_id": ObjectId("565c234f152aee26874d7a18"),
  "full_event": true,
  "presence": {
    "ev": "EV PRES",
    "ts": ISODate("2015-10-02T01:35:36.384Z")
  },
  "area": "Madrid",
  "notification": {
    "ev": "EV NOTIF",
    "ts": ISODate("2015-10-02T01:32:21.984Z")
  },
  "room_number": "126",
  "location": "PERA",
  "activation": {
    "week": 40,
    "weekday": 5,
    "user": "Anonimo",
    "hour": 1,
    "minute": 31,
    "year": 2015,
    "month": 10,
    "day": 2,
    "ev": "EV PERA",
    "ts": ISODate("2015-10-02T01:31:45.696Z")
  },
  "room_letter": "-",
  "center": "Aravaca",
  "day_properties": {
    "holiday_or_sunday": true,
    "social_events": true,
    "rain": true,
    "extreme_heat": true,
    "summer_vacation": true,
    "holiday": true,
    "weekend": true,
    "friday_or_eve": true
  },
  "floor": "1",
  "times": {
    "cancellation_notification": 195,
    "used": 194,
    "idle": 36,
    "cancellation_activation": 231,
    "total": 230,
    "cancellation_presence": 1
  },
  "hour_properties": {
    "shift_change": true,
    "shift": "TARDE",
    "sleeptime": true,
    "nurse_count": "8",
    "dinnertime": true,
    "lunchtime": true
  },
  "cancellation": {
    "ev": "EV CPRES",
    "remote": true,
    "ts": ISODate("2015-10-02T01:35:37.248Z")
  }
}
Figure 2. Sample JSON document representing an assistance task event in the MongoDB events collection
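Some fields in the sample event appear to be derived from its timestamps. The sketch below is an inference from the sample values, not a description of the authors' code: it assumes that "idle" is the whole number of seconds from activation to notification, "used" from notification to presence, and "total" from activation to presence, and that "week" and "weekday" follow the ISO calendar. The cancellation-related durations are left out because the article does not say how they are computed. All function names are hypothetical.

```python
from datetime import datetime, timedelta

TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"  # format of the ISODate strings in the sample


def parse_ts(value: str) -> datetime:
    """Parse a timestamp such as '2015-10-02T01:31:45.696Z'."""
    return datetime.strptime(value, TS_FORMAT)


def whole_seconds(start: datetime, end: datetime) -> int:
    """Elapsed time between two timestamps, truncated to whole seconds."""
    return (end - start) // timedelta(seconds=1)


def derive_fields(activation: str, notification: str, presence: str) -> dict:
    """Derive calendar fields and durations from the three event timestamps."""
    act, notif, pres = (parse_ts(v) for v in (activation, notification, presence))
    _iso_year, iso_week, iso_weekday = act.isocalendar()
    return {
        "activation": {
            "year": act.year, "month": act.month, "day": act.day,
            "hour": act.hour, "minute": act.minute,
            "week": iso_week,        # ISO week number (assumed convention)
            "weekday": iso_weekday,  # Monday = 1 ... Sunday = 7 (assumed convention)
        },
        "times": {
            "idle": whole_seconds(act, notif),
            "used": whole_seconds(notif, pres),
            "total": whole_seconds(act, pres),
        },
    }


fields = derive_fields(
    "2015-10-02T01:31:45.696Z",  # activation
    "2015-10-02T01:32:21.984Z",  # notification
    "2015-10-02T01:35:36.384Z",  # presence
)
print(fields["times"])
```

With the timestamps from the sample document, these assumptions reproduce the stored "idle", "used", and "total" values as well as the "week" and "weekday" fields, which is the only evidence offered here for the interpretation.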
Planned tours
Data from planned tours are retrieved daily from AdvantCare using the REST API and are transformed into MongoDB documents in the shifts collection. A sample document is shown in Figure 3.
{
  "_id": ObjectId("569e50b1aa40450a027eb4ec"),
  "floor": 3,
  "room": 326,
  "date": "1/10/15",
  "hour": "9:00:45",
  "center_name": "Aravaca",
  "ts": ISODate("2015-10-01T09:00:45.000Z"),
  "shift_type": "MAÑANA"
}
Figure 3. Sample JSON document representing a shift in the MongoDB shifts collection
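The transformation of a planned-tour record into a shifts document might look like the sketch below. The shape of the raw record and the function name are hypothetical, since the article does not show the REST payload; the day/month/two-digit-year reading of "date" is an assumption inferred from the sample, where "1/10/15" corresponds to a "ts" of 1 October 2015.

```python
from datetime import datetime


def to_shift_document(raw: dict) -> dict:
    """Build a shifts-collection document from one raw planned-tour record."""
    # Assumption: dates are day/month/two-digit-year, as the sample "ts" suggests.
    ts = datetime.strptime(f"{raw['date']} {raw['hour']}", "%d/%m/%y %H:%M:%S")
    return {
        "floor": int(raw["floor"]),
        "room": int(raw["room"]),
        "date": raw["date"],
        "hour": raw["hour"],
        "center_name": raw["center_name"],
        "ts": ts,  # a datetime maps naturally to a BSON date when stored
        "shift_type": raw["shift_type"],
    }


# Hypothetical raw record mirroring the values in the sample document.
raw_tour = {
    "floor": "3",
    "room": "326",
    "date": "1/10/15",
    "hour": "9:00:45",
    "center_name": "Aravaca",
    "shift_type": "MAÑANA",
}
print(to_shift_document(raw_tour)["ts"].isoformat())
```

Keeping the original "date" and "hour" strings alongside the parsed "ts" mirrors the sample document, which stores both forms.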
References
- ↑ Curtright, J.W.; Stolp-Smith, S.C.; Edell, E.S. (2000). "Strategic performance management: Development of a performance measurement system at the Mayo Clinic". Journal of Healthcare Management 45 (1): 58–68. PMID 11066953.
- ↑ Griffith, J.R.; King, J.G. (2000). "Championship management for healthcare organizations". Journal of Healthcare Management 45 (1): 17–30. PMID 11066948.
- ↑ Ngai, E.W.T.; Poon, J.K.L.; Suk, F.F.C.; Ng, C.C. (2009). "Design of an RFID-based Healthcare Management System using an Information System Design Theory". Information Systems Frontiers 11 (4): 405–417. doi:10.1007/s10796-009-9154-3.
- ↑ Ting, S.L.; Kwok, S.K.; Tsang, A.H.; Lee, W.B. (2011). "Critical elements and lessons learnt from the implementation of an RFID-enabled healthcare management system in a medical organization". Journal of Medical Systems 35 (4): 657–69. doi:10.1007/s10916-009-9403-5.
- ↑ Jalal, A.; Kamal, S.; Kim, D. (2017). "A Depth Video-based Human Detection and Activity Recognition using Multi-features and Embedded Hidden Markov Models for Health Care Monitoring Systems". International Journal of Interactive Multimedia and Artificial Intelligence 4 (4): 54–62. doi:10.9781/ijimai.2017.447.
- ↑ Ghamdi, H.A.; Alshammari, R.; Razzak, M.I. (2016). "An ontology-based system to predict hospital readmission within 30 days". International Journal of Healthcare Management 9 (4): 236–244. doi:10.1080/20479700.2016.1139768.
- ↑ Bossen, C.; Danholt, P.; Ubbesen, M.B. et al. (2016). "Challenges of Data-driven Healthcare Management: New Skills and Work". 19th ACM Conference on Computer-Supported Cooperative Work and Social Computing: 5. http://pure.au.dk/portal/da/publications/challenges-of-datadriven-healthcare-management-new-skills-and-work(fd56833b-db7b-44ed-b4fd-15882b382271).html.
- ↑ Roberts, J.P.; Fisher, T.R.; Trowbridge, M.J.; Bent, C. (2016). "A design thinking framework for healthcare management and innovation". Healthcare 4 (1): 11–14. doi:10.1016/j.hjdsi.2015.12.002. PMID 27001093.
- ↑ Basole, R.C.; Bodner, D.A.; Rouse, W.B. (2013). "Healthcare management through organizational simulation". Decision Support Systems 55 (2): 552–563. doi:10.1016/j.dss.2012.10.012.
- ↑ Zeng, Q.L.; Li, D.D.; Yang, Y.B. (2013). "VIKOR method with enhanced accuracy for multiple criteria decision making in healthcare management". Journal of Medical Systems 37 (2): 9908. doi:10.1007/s10916-012-9908-1. PMID 23377778.
- ↑ Mohapatra, S. (2015). "Using integrated information system for patient benefits: A case study in India". International Journal of Healthcare Management 8 (4): 262–71. doi:10.1179/2047971915Y.0000000007.
- ↑ Fortenberry Jr., J.L.; McGoldrick, P.J. (2015). "Internal marketing: A pathway for healthcare facilities to improve the patient experience". International Journal of Healthcare Management 9 (1): 28–33. doi:10.1179/2047971915Y.0000000014.
- ↑ Minniti, M.J.; Blue, T.R.; Freed, D.; Ballen, S. (2016). "Patient-Interactive Healthcare Management, a Model for Achieving Patient Experience Excellence". Healthcare Information Management Systems. Springer. pp. 257–281. doi:10.1007/978-3-319-20765-0_16. ISBN 9783319207650.
- ↑ Baldominos, A.; Albacete, E.; Saez, Y.; Isasi, P. (2014). "A scalable machine learning online service for big data real-time analysis". 2014 IEEE Symposium on Computational Intelligence in Big Data. doi:10.1109/CIBD.2014.7011537.
- ↑ Zaharia, M.; Chowdhury, M.; Franklin, M.J. et al. (2010). "Spark: Cluster computing with working sets". Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing: 10.
- ↑ Zaharia, M.; Chowdhury, M.; Das, T. et al. (2012). "Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing". Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation: 2.
Notes
This presentation is faithful to the original, with only a few minor changes to presentation. Grammar has been updated for clarity. In some cases important information was missing from the references, and that information was added. The original article lists references alphabetically, but this version — by design — lists them in order of appearance.