Detecting and Categorizing Conflicts from Textual Health Advice to Augment Health Safety

Preclude Example
A sample conflict between two pieces of advice, one from a general weight loss app and one from WebMD
Health Apps Result in Conflicts
Semantic decomposition of textual health advice

With the rapid digitalization of the health sector, people often turn to mobile apps and online health websites for health advice. Health advice generated from different sources can be conflicting because the sources address different aspects of health (e.g., weight loss, diet, disease) or are unaware of the context of a user (e.g., age, gender, physiological condition). Conflicts can occur due to lexical features (such as negation, antonyms, or numerical mismatch) or can be conditioned upon time and/or physiological status. We formulate the problem of finding conflicting health advice and develop a comprehensive taxonomy of conflicts. While a similar research area in the natural language processing domain explores the problem of textual contradiction identification, finding conflicts in health advice poses its own unique lexical and semantic challenges. These include large structural variation between text and hypothesis pairs, finding conceptual overlap between pairs of advice, and inference of the semantics of an advice (i.e., what to do, why and how). Hence, we develop Preclude, a novel semantic rule-based solution to detect conflicting health advice derived from heterogeneous sources utilizing linguistic rules and external knowledge bases. As our solution is interpretable and comprehensive, it can also guide users towards conflict resolution. We evaluate Preclude using 1156 real advice statements covering 8 important health topics that are collected from smartphone health apps and popular health websites. Preclude achieves 90% accuracy and outperforms the accuracy and F1 score of the baseline approach by about 1.5 times and 3 times, respectively.
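The lexical conflict cues named above (negation, antonyms, numerical mismatch) can be illustrated with a minimal sketch. This is not Preclude's actual rule set: the stopword list, negation cues, antonym pairs, and the function name `lexical_conflict` are all hypothetical stand-ins, and a real system would draw on external knowledge bases and a semantic decomposition of the advice rather than bag-of-words overlap.

```python
import re

# Hypothetical, hand-written resources for illustration only.
STOPWORDS = {"a", "an", "the", "of", "to", "in", "your", "you", "per", "do", "and"}
NEGATION_CUES = {"not", "no", "never"}
ANTONYM_PAIRS = {frozenset(("increase", "decrease")), frozenset(("more", "less"))}


def tokenize(advice):
    """Lowercase the advice and split it into word and number tokens."""
    return re.findall(r"[a-z]+|\d+(?:\.\d+)?", advice.lower())


def lexical_conflict(advice_a, advice_b):
    """Return a conflict label for a pair of advice statements, or None."""
    tokens_a, tokens_b = tokenize(advice_a), tokenize(advice_b)
    skip = STOPWORDS | NEGATION_CUES
    words_a = {t for t in tokens_a if t.isalpha() and t not in skip}
    words_b = {t for t in tokens_b if t.isalpha() and t not in skip}

    # Without any shared content word the pair is treated as unrelated.
    if not words_a & words_b:
        return None

    # Negation: exactly one of the two statements carries a negation cue.
    negated_a = any(t in NEGATION_CUES for t in tokens_a)
    negated_b = any(t in NEGATION_CUES for t in tokens_b)
    if negated_a != negated_b:
        return "negation"

    # Antonyms: one word from each statement forms a known antonym pair.
    for word_a in words_a:
        for word_b in words_b:
            if frozenset((word_a, word_b)) in ANTONYM_PAIRS:
                return "antonym"

    # Numerical mismatch: both statements give numbers, and they differ.
    numbers_a = {t for t in tokens_a if t[0].isdigit()}
    numbers_b = {t for t in tokens_b if t[0].isdigit()}
    if numbers_a and numbers_b and numbers_a != numbers_b:
        return "numerical mismatch"

    return None
```

The overlap check is the crudest part of the sketch: it stands in for the conceptual-overlap step, which the abstract identifies as one of the hard problems.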

Currently, I am focusing on developing a robust, parametric solution to detect conflicts from health advice using structured prediction algorithms and deep learning. I am also working on automatically detecting the degree of severity of a conflict based on the linguistic features of a textual health advice, e.g., degree of negativity or positivity, degree of risk in case of non-adherence, suggested intervention, etc. Automatically detecting the severity of a conflict can differentiate the major conflicts from the minor ones, reduce patients’ anxiety, and help them prioritize which conflicts to resolve first.

[PerCom'17] [IPSN'17]

Personalized Conflict Detection in Heterogeneous Health and Wellness Applications

Dependency among multiple activities
Drug usage guidelines often pose different dependencies among activities of daily life
Sources of conflicts
Conflicts may occur among health advice suggested by health applications and drug usage guidelines

Conflicting health information is a primary barrier to self-management of chronic diseases. An increasing number of people now rely on mobile health apps and online health websites to meet their information needs and often receive conflicting health advice from these sources. This problem is more prevalent and severe in the setting of multi-morbidities. In addition, medical information can often conflict with the regular activity patterns of an individual.

In this work, we formulate the problem of finding conflicts in heterogeneous health applications including health websites, health apps, online drug usage guidelines, and daily activity logging applications. We develop a comprehensive taxonomy of conflicts based on the semantics of textual health advice and activities of daily living. Finding conflicts in health applications poses its own unique lexical and semantic challenges. These include large structural variation between pairs of textual advice, finding conceptual overlap between pairs of advice, inferring the semantics of an advice (i.e., what to do, why and how) and activities, and aligning activities suggested in advice with the activities of daily living based on their underlying dependencies and polarity. Hence, we develop Preclude2, a novel semantic rule-based solution to detect conflicts in activities and health advice derived from heterogeneous sources. Preclude2 utilizes linguistic rules and external knowledge bases to infer advice. In addition, Preclude2 considers personalization and context-awareness while detecting conflicts. We evaluate Preclude2 using 1156 real advice statements covering 8 important health topics, 90 online drug usage guidelines, 1124 online disease-specific health advice statements covering 34 chronic diseases, and 2 activity datasets. The evaluation is personalized based on 34 real prescriptions. Preclude2 detects direct, conditional, sub-typical, quantitative, and temporal conflicts from 2129 advice statements with 0.91, 0.83, 0.98, 0.85 and 0.98 recall, respectively. Overall, it achieves 0.88 recall for detecting inter-advice conflicts and 0.89 recall for detecting activity–advice conflicts. We also demonstrate the effects of personalization and context-awareness in conflict detection from heterogeneous health applications.

[Pervasive and Mobile Computing Journal, 2017][Dataset]

CognitiveEMS: A Wearable Cognitive Assistant System for Emergency Response Decision Support

The overview of our proposed wearable cognitive assistant

The main objective of this project is to develop a cognitive assistant system that improves the situational awareness and safety of emergency responders by collecting and analyzing data from the incident scene in real time and providing dynamic, data-driven feedback to them. We are developing a natural language processing pipeline that will automatically analyze spoken language data collected from the incident scene, the patient’s physiological data (e.g., vital signs, symptoms), the patient’s past medical history, the emergency protocol database, and medical knowledge bases to recommend relevant interventions according to standard emergency medical services (EMS) protocols that can aid the decision-making process of the emergency responders. The recommendations include what safe actions to perform at the scene in real time to aid the patient and improve the outcome of the emergency incident, and what information to record for future reference (i.e., recommendations for filling out EMS incident reports and forms). The objective of such recommendations is to reduce the cognitive overload of the first responders by automatically analyzing and recording important information from unstructured heterogeneous data streams collected from the emergency scene.

To enable such a cognitive assistant, we need to extract the safety-critical concepts from the spoken language collected at the scene that trigger one or more EMS protocols. This task is similar to the traditional task of concept extraction from the domains of natural language processing and information retrieval. It poses several technical challenges, including lexical variation of a concept, domain mismatch (from the general to the EMS domain), lack of annotated data, and resource constraints of the target application. I have developed a weakly-supervised, knowledge-integrated, data-driven concept extraction approach to extract these safety-critical concepts from the spoken language collected at the emergency scene. The approach relies on minimal annotated data and domain expertise. Based on an extensive evaluation performed on an EMS corpus of over 9000 real EMS narrations, it outperforms the state-of-the-art medical concept extraction solution on EMS data. Specifically, our experimental results show that on average our solution achieves 0.84 recall and 0.83 F1-score for EMS concept extraction and outperforms a state-of-the-art supervised medical concept extraction tool with a three-times increase in F1-score and a 22% increase in recall. The concepts extracted by this approach can enable the cognitive assistant to model and execute EMS protocols and automatically generate suggestions for filling EMS incident reports.

[ACM SIGBED, 2017][ICCPS, 2018]

MAPer: A Multi-scale Adaptive Personalized Model for Temporal Human Behavior Prediction

Sample behavior matrix
An example of behavior sample matrix representation: demonstrating lag and cycle for modeling an individual user’s behavior in the context of Twitter usage. The cells are darkened in proportion to the number of tweets at the corresponding time interval.

The primary objective of this research is to develop a simple and interpretable predictive framework to perform temporal modeling of an individual user’s behavior traits based on each person’s past observed traits/behavior. Individual-level human behavior patterns are possibly influenced by various temporal features (e.g., lag, cycle) and vary across temporal scales (e.g., hour of the day, day of the week). Most of the existing forecasting models do not capture such multi-scale adaptive regularity of human behavior or lack interpretability due to relying on hidden variables. Hence, we build a multi-scale adaptive personalized (MAPer) model that quantifies the effect of both lag and behavior cycle for predicting future behavior. MAPer includes a novel basis vector to adaptively learn behavior patterns and capture the variation of lag and cycle across multi-scale temporal contexts. We also extend MAPer to capture the interaction among multiple behaviors to improve the prediction performance. We demonstrate the effectiveness of MAPer on four real datasets representing different behavior domains, including habitual behavior collected from Twitter, need-based behavior collected from search logs, and activities of daily living collected from a single-resident and a multi-resident home. Experimental results indicate that MAPer significantly improves upon the state-of-the-art and baseline methods and at the same time is able to explain the temporal dynamics of individual-level human behavior.
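To make the notions of lag and cycle concrete, the following sketch builds an hour-by-day behavior matrix from event timestamps and reads off the two values for a single cell. It is not the MAPer model: the function names, the fixed weights in `naive_forecast`, and the choice of a one-hour lag and a 24-hour cycle are assumptions made for illustration, whereas MAPer learns how lag and cycle matter across temporal contexts.

```python
from datetime import date, datetime

HOURS_PER_DAY = 24


def behavior_matrix(timestamps, start, n_days):
    """Rows are days, columns are hours; each cell counts the events in that hour."""
    matrix = [[0] * HOURS_PER_DAY for _ in range(n_days)]
    for ts in timestamps:
        day = (ts.date() - start).days
        if 0 <= day < n_days:
            matrix[day][ts.hour] += 1
    return matrix


def lag_and_cycle(matrix, day, hour, lag=1, cycle=HOURS_PER_DAY):
    """Return the counts `lag` hours earlier and `cycle` hours earlier (0 if out of range)."""
    series = [count for row in matrix for count in row]  # flatten to an hourly series
    t = day * HOURS_PER_DAY + hour
    lag_value = series[t - lag] if t - lag >= 0 else 0
    cycle_value = series[t - cycle] if t - cycle >= 0 else 0
    return lag_value, cycle_value


def naive_forecast(matrix, day, hour, w_lag=0.5, w_cycle=0.5):
    """Weighted sum of the lag and cycle values; the weights are fixed here, not learned."""
    lag_value, cycle_value = lag_and_cycle(matrix, day, hour)
    return w_lag * lag_value + w_cycle * cycle_value
```

For example, with two events at 09:00 on the first day and one event at 08:00 on the second day, the cell for 09:00 on the second day has a lag value of 1 (the previous hour) and a cycle value of 2 (the same hour one day earlier).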

Demonstrating interpretability of MAPer: how it explains the effect of different parameters using a heatmap for a set of 100 randomly chosen search log users. Each row represents a user and each column of a row represents the significance of a parameter for modeling the user’s behavior. Red (the darker shade) and green indicate significance and insignificance of the corresponding parameters, respectively. Users who are similar in terms of their behavior model (i.e., the set of parameters learned during training) are clustered together. The left and right images represent the heatmaps from the autoregressive model and the MAPer model, respectively. MAPer captures not only the significance of lag and cycle, but also the significance of hour of the day and day of the week. Thus MAPer not only predicts future behavior but also provides a set of temporal contexts in which the prediction accuracy of behavior is higher.
In addition, MAPer yields explanatory results which can be useful for several interesting applications, such as detecting user communities or recommending friends based on similarity of temporal behavior. MAPer can be paired with text mining approaches to provide personalized information retrieval, e.g., retrieving entertainment-related queries during weekends. Other potential application domains where our model is applicable include personalized human activity understanding, temporal user profiling, and social recommendations from an individual user’s online behavior.


Holmes: A Comprehensive Anomaly Detection System for Daily In-home Activities

Holmes framework
Holmes Framework for Anomaly Detection

Advances in wireless sensor networks have enabled the monitoring of daily activities of elderly people. The goal of these monitoring applications is to learn normal behavior in terms of daily activities and look for any deviation, i.e., anomalies, so that alerts can be sent to relatives or caregivers. However, human behavior is very complex, and many existing anomaly detection systems are too simplistic and cause many false alarms, resulting in unreliable systems. We present Holmes, a comprehensive anomaly detection system for daily in-home activities. Holmes accurately learns a resident’s normal behavior by considering variability in daily activities based not only on a per-day basis, but also considering specific days of the week, different time periods such as per week and per month, and collective, temporal, and correlation-based features. This approach of learning complicated normal behaviors reduces false alarms. Also, based on resident and expert feedback, Holmes learns semantic rules that explain specific variations of activities in specific scenarios to further reduce false alarms. We evaluate Holmes using data collected from our own deployed system, public data sets, and data collected by a senior safety system provider company from an elderly resident’s home. Our evaluation shows that compared to state-of-the-art systems, Holmes reduces false positives and false negatives by at least 46% and 27%, respectively.
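The benefit of conditioning normal behavior on the day of the week can be shown with a small sketch. This is not the Holmes implementation: the per-weekday mean and standard deviation model, the threshold `k`, the floor `min_std`, and the function names are assumptions chosen for illustration, and Holmes additionally uses longer time periods, collective, temporal, and correlation-based features, and learned semantic rules.

```python
from collections import defaultdict
from statistics import mean, pstdev


def learn_baseline(history):
    """history: iterable of (weekday, duration_minutes) for one activity.

    Returns {weekday: (mean, population standard deviation)}.
    """
    by_weekday = defaultdict(list)
    for weekday, duration in history:
        by_weekday[weekday].append(duration)
    return {day: (mean(values), pstdev(values)) for day, values in by_weekday.items()}


def is_anomalous(baseline, weekday, duration, k=3.0, min_std=1.0):
    """Flag a duration that deviates more than k standard deviations from that weekday's mean."""
    if weekday not in baseline:
        return False  # no evidence for this weekday, so raise no alarm
    mu, sigma = baseline[weekday]
    # min_std keeps a perfectly regular history from flagging every tiny deviation.
    return abs(duration - mu) > k * max(sigma, min_std)
```

With sleep durations of about 480 minutes on Mondays and about 600 minutes on Saturdays, a 600-minute night is flagged on a Monday but not on a Saturday. A single baseline pooled over all days would blur that distinction, which is one way overly simple models end up raising false alarms.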


Safety-Aware Decision Making in Smart City Applications

CityGuard acts as the connecting layer between the infrastructure layer consisting of the sensors and actuators and software layer consisting of IoT platforms and individual smart services. CityGuard intercepts the actions from smart services, detects whether potential conflicts exist ahead of time and provides resolution whenever possible so that the infrastructure layer can operate with increased safety and efficiency.

Cities are projected to become smart in the near future by using IoT platforms and services that are dedicated to improving the performance of different domains, e.g., transportation, emergency, environment, public safety, etc. In this project, we consider safety control in the context of integrated smart services in terms of potential runtime conflicts among services and violation of safety thresholds resulting from the simultaneous operations of different services. We have developed CityGuard, a watchdog system with sophisticated safety requirements that can be embedded between the infrastructure layer and the services of a smart city. To the best of our knowledge, it is the first architecture that detects and resolves conflicts among actions of different services considering both safety and performance requirements. To start with, safety and performance requirements and a spectrum of conflicts are specified. Sophisticated models are used to analyze secondary effects and to detect device and environmental conflicts. A simulation based on New York City is used for the evaluation. The results show that CityGuard (i) identifies unsafe actions and thus helps to protect the city from safety hazards, (ii) detects and resolves two major types of conflicts, i.e., device and environmental conflicts, and (iii) improves the overall city performance.
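The two conflict types can be sketched as checks over a batch of intercepted actions. This is an illustrative simplification, not CityGuard's architecture: the `Action` record, its `effects` field, the additive effect model, and the example device and variable names are all hypothetical, and the sketch covers detection only, not resolution or the analysis of secondary effects.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Action:
    service: str   # smart service requesting the action
    device: str    # actuator the action targets
    command: str   # requested command for that device
    # Hypothetical: environmental variable -> change this action is expected to cause.
    effects: dict = field(default_factory=dict)


def device_conflicts(actions):
    """Devices that receive more than one distinct command in the same batch."""
    commands = defaultdict(set)
    for action in actions:
        commands[action.device].add(action.command)
    return sorted(device for device, cmds in commands.items() if len(cmds) > 1)


def environmental_conflicts(actions, thresholds):
    """Variables whose summed expected change exceeds the given safety threshold."""
    totals = defaultdict(float)
    for action in actions:
        for variable, delta in action.effects.items():
            totals[variable] += delta
    return {
        variable: total
        for variable, total in totals.items()
        if variable in thresholds and total > thresholds[variable]
    }
```

Note that an environmental conflict can arise even when every individual action is within limits; it is the simultaneous operation of services that pushes the total past the threshold.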


An Annotation Tool and Corpus for Drug Usage Guidelines

Sample drug usage guidelines
An excerpt from the drug usage guideline document of the drug Warfarin. Text underlined in blue, green, and red indicates advice related to drug administration, food interaction, and pregnancy, respectively.

Adherence to drug usage guidelines for prescription and over-the-counter drugs is critical for drug safety and effectiveness of treatment. Drug usage guideline documents contain advice on potential drug-drug interaction, drug-food interaction, and the drug administration process. Current research on drug safety and public health indicates that patients are often either unaware of such critical advice or overlook it. Categorizing advice statements from these documents according to their topics can enable patients to find safety-critical information. However, automatically categorizing drug usage guidelines based on their topic is an open challenge, and there is no annotated dataset on drug usage guidelines. To address the latter issue, this paper presents (i) an annotation scheme for annotating safety-critical advice from drug usage guidelines, (ii) an annotation tool for such data, and (iii) an annotated dataset containing drug usage guidelines for 90 drugs. This work is expected to accelerate further release of annotated drug usage guideline datasets and research on automatically filtering safety-critical information from these textual documents.

[LREC’18] [dataset]