publications
2024
- Conference: Probabilistic Modeling for Sequences of Sets in Continuous-Time. Yuxin Chang, Alex Boyd, and Padhraic Smyth. In International Conference on Artificial Intelligence and Statistics (AISTATS) 2024. [Oral Presentation]
Neural marked temporal point processes have been a valuable addition to the existing toolbox of statistical parametric models for continuous-time event data. These models are useful for sequences where each event is associated with a single item (a single type of event or a “mark”), but they are not suited to the practical situation where each event is associated with a set of items. In this work, we develop a general framework for modeling set-valued data in continuous time, compatible with any intensity-based recurrent neural point process model. In addition, we develop inference methods that can use such models to answer probabilistic queries such as “the probability of item 𝐴 being observed before item 𝐵,” conditioned on sequence history. Computing exact answers for such queries is generally intractable for neural models due to both the continuous-time nature of the problem setting and the combinatorially large space of potential outcomes for each event. To address this, we develop a class of importance sampling methods for querying with set-based sequences and demonstrate orders-of-magnitude improvements in efficiency over direct sampling via systematic experiments with four real-world datasets. We also illustrate how to use this framework to perform model selection using likelihoods that do not involve one-step-ahead prediction.
2023
- Conference: Inference for Mark-Censored Temporal Point Processes. Alex Boyd, Yuxin Chang, Stephan Mandt, and Padhraic Smyth. In Uncertainty in Artificial Intelligence (UAI) 2023. [Spotlight Presentation]
Marked temporal point processes (MTPPs) are a general class of stochastic models for the evolution of events of different types (“marks”) in continuous time. These models have broad applications in areas such as medical data monitoring, financial prediction, user modeling, and communication networks. Of significant practical interest in such problems is the issue of missing or censored data over time. In this paper, we focus on the specific problem of inference for a trained MTPP model when events of certain types are not observed over a period of time during prediction. We introduce the concept of mark-censored sub-processes and use this framework to develop a novel marginalization technique for inference in the presence of censored marks. The approach is model-agnostic and applicable to any MTPP model with a well-defined intensity function. We illustrate the flexibility and utility of the method in the context of both parametric and neural MTPP models, with results across a range of datasets including data from simulated Hawkes processes, self-correcting processes, and multiple real-world event datasets.
- Conference: Probabilistic Querying of Continuous-Time Event Sequences. Alex Boyd, Yuxin Chang, Stephan Mandt, and Padhraic Smyth. In International Conference on Artificial Intelligence and Statistics (AISTATS) 2023.
Continuous-time event sequences, i.e., sequences consisting of continuous time stamps and associated event types (“marks”), are an important type of sequential data with many applications, e.g., in clinical medicine or user behavior modeling. Since these data are typically modeled in an autoregressive manner (e.g., using neural Hawkes processes or their classical counterparts), it is natural to ask questions about future scenarios such as “what kind of event will occur next” or “will an event of type 𝐴 occur before one of type 𝐵.” Addressing such queries with direct methods such as naive simulation can be highly inefficient from a computational perspective. This paper introduces a new typology of query types and a framework for addressing them using importance sampling. Example queries include predicting the 𝑛th event type in a sequence and the hitting time distribution of one or more event types. We further extend these results to estimate general “𝐴 before 𝐵” queries. We prove theoretically that our estimation method is effectively always better than naive simulation and demonstrate empirically, on three real-world datasets, that our approach can produce orders-of-magnitude improvements in sampling efficiency compared to naive methods.
- Conference: Fair Survival Time Prediction via Mutual Information Minimization. Hyungrok Do, Yuxin Chang, Yoon Sang Cho, Padhraic Smyth, and Judy Zhong. In Machine Learning for Healthcare (MLHC) 2023.
Survival analysis is a general framework for predicting the time until a specific event occurs, often in the presence of censoring. Although this framework is widely used in practice, few studies to date have considered fairness for time-to-event outcomes, despite recent significant advances in the algorithmic fairness literature more broadly. In this paper, we propose a framework to achieve demographic parity in survival analysis models by minimizing the mutual information between predicted time-to-event and sensitive attributes. We show that our approach effectively minimizes mutual information to encourage statistical independence of time-to-event predictions and sensitive attributes. Furthermore, we propose four types of disparity assessment metrics based on common survival analysis metrics. Through experiments on multiple benchmark datasets, we demonstrate that by minimizing the dependence between the prediction and the sensitive attributes, our method can systematically improve the fairness of survival predictions and is robust to censoring.