Loss Functions for Multiset Prediction

About

We study the problem of multiset prediction. The goal of multiset prediction is to train a predictor that maps an input to a multiset consisting of multiple items. Unlike existing problems in supervised learning, such as classification, ranking and sequence generation, there is no known order among items in a target multiset, and each item in the multiset may appear more than once, making this problem extremely challenging. In this paper, we propose a novel multiset loss function by viewing this problem from the perspective of sequential decision making. The proposed multiset loss function is empirically evaluated on two families of datasets, one synthetic and the other real, with varying levels of difficulty, against various baseline loss functions including reinforcement learning, sequence, and aggregated distribution matching loss functions. The experiments reveal the effectiveness of the proposed loss function over the others.
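To make the two defining properties above concrete (order does not matter, and items may repeat), here is a minimal Python sketch. It is an illustration only, not the loss function proposed in the paper. The helper names `exact_match` and `remaining_after` are hypothetical, and representing a multiset with `collections.Counter` is an assumption made for the example.

```python
from collections import Counter


def exact_match(predicted, target):
    """Return True when both item lists form the same multiset.

    Order is ignored, but the number of times each item appears must agree.
    """
    return Counter(predicted) == Counter(target)


def remaining_after(target, predicted_prefix):
    """Return the target items not yet accounted for by the predictions so far.

    This is a multiset difference: each predicted item cancels at most one
    matching occurrence in the target. Predictions that are not in the target
    are simply ignored, because Counter subtraction drops non-positive counts.
    """
    return Counter(target) - Counter(predicted_prefix)


if __name__ == "__main__":
    target = ["cat", "cat", "dog"]
    print(exact_match(["cat", "dog", "cat"], target))  # same items, different order
    print(exact_match(["cat", "dog"], target))         # one "cat" is missing
    print(remaining_after(target, ["cat"]))            # what is still left to predict
```

Under the sequential view mentioned in the abstract, a predictor would emit one item per step, and something like `remaining_after` describes which items would still count as correct at the next step.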

Sean Welleck, Zixin Yao, Yu Gai, Jialin Mao, Zheng Zhang, Kyunghyun Cho • 2017

Related benchmarks

Task	Dataset	Result
Disease onset forecasting	Patient EHR dataset Knee OA 2-year horizon (test)	AUROC 0.692	5
Disease onset forecasting	Patient EHR dataset COPD 2-year horizon (test)	AUROC 63.7	5
Disease onset forecasting	Patient EHR dataset CHF 2-year horizon (test)	AUROC 74.2	5
Disease onset forecasting	Patient EHR dataset Prostate Cancer 2-year horizon (test)	AUROC 80.1	5
Disease onset forecasting	Patient EHR dataset Disease Suite Aggregate 2-year horizon (test)	AUROC 0.665	5
Disease onset forecasting	EHR 5-year horizon (test)	Macro AUROC 0.661	5
Disease onset forecasting	Patient EHR dataset Dementia 2-year horizon (test)	AUROC 70	5
Disease onset forecasting	Patient EHR dataset Acute MI 2-year horizon (test)	AUROC 58.6	5
Disease onset forecasting	Patient EHR dataset Pancreatic Cancer 2-year horizon (test)	AUROC 0.494	5

