
Modeling Hierarchical Structures with Continuous Recursive Neural Networks

About

Recursive Neural Networks (RvNNs), which compose sequences according to their underlying hierarchical syntactic structure, have performed well in several natural language processing tasks compared to similar models without structural biases. However, traditional RvNNs are incapable of inducing the latent structure in a plain text sequence on their own. Several extensions have been proposed to overcome this limitation. Nevertheless, these extensions tend to rely on surrogate gradients or reinforcement learning at the cost of higher bias or variance. In this work, we propose Continuous Recursive Neural Network (CRvNN) as a backpropagation-friendly alternative to address the aforementioned limitations. This is done by incorporating a continuous relaxation to the induced structure. We demonstrate that CRvNN achieves strong performance in challenging synthetic tasks such as logical inference and ListOps. We also show that CRvNN performs comparably or better than prior latent structure models on real-world tasks such as sentiment analysis and natural language inference.

Jishnu Ray Chowdhury, Cornelia Caragea • 2021
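
The core idea in the abstract, replacing discrete merge decisions with a continuous relaxation so that ordinary backpropagation applies, can be illustrated with a toy sketch. The snippet below is not the authors' CRvNN implementation: the class name SoftRecursiveEncoder, the pairwise scorer, and the soft-merge scheme are illustrative assumptions (PyTorch assumed), intended only to show how relaxing a hard merge choice into a softmax keeps bottom-up composition fully differentiable.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SoftRecursiveEncoder(nn.Module):
    """Toy continuous relaxation of bottom-up recursive composition.

    Each step scores every adjacent pair of nodes and, instead of hard-picking
    one pair to merge (which would need surrogate gradients or reinforcement
    learning), relaxes the choice into a softmax distribution. The shortened
    sequence is a convex combination of the possible outcomes, so the whole
    reduction stays differentiable and trainable with plain backpropagation.
    """

    def __init__(self, dim: int):
        super().__init__()
        self.compose = nn.Sequential(nn.Linear(2 * dim, dim), nn.Tanh())
        self.score = nn.Linear(2 * dim, 1)

    def step(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, length, dim), length >= 2
        left, right = h[:, :-1], h[:, 1:]            # adjacent pairs
        pair = torch.cat([left, right], dim=-1)      # (B, L-1, 2*dim)
        merged = self.compose(pair)                  # candidate parent nodes
        w = F.softmax(self.score(pair), dim=1)       # soft merge position

        # Position i of the shorter sequence is:
        #   merged[i] if the merge happens at i      (prob w[i])
        #   right[i]  if the merge happens before i  (prob sum_{j<i} w[j])
        #   left[i]   if the merge happens after i   (prob sum_{j>i} w[j])
        cum = torch.cumsum(w, dim=1)
        p_before, p_after = cum - w, 1.0 - cum
        return w * merged + p_before * right + p_after * left

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        while h.size(1) > 1:                         # reduce to a single root
            h = self.step(h)
        return h.squeeze(1)                          # (batch, dim)


if __name__ == "__main__":
    enc = SoftRecursiveEncoder(dim=16)
    tokens = torch.randn(2, 7, 16)   # batch of 2 sequences, 7 tokens each
    root = enc(tokens)               # (2, 16) sentence representations
    root.sum().backward()            # gradients flow without REINFORCE
    print(root.shape)
```

Because each step is a convex combination rather than an argmax, gradients reach the merge scores directly, which is the property that lets latent-structure models of this kind avoid surrogate gradients or reinforcement learning.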

Related benchmarks

Task | Dataset | Metric | Result | Rank
Natural Language Inference | SNLI | Accuracy | 85.3 | 174
Natural Language Inference | MNLI (matched) | Accuracy | 72.2 | 110
Natural Language Inference | MNLI (mismatched) | Accuracy | 72.6 | 68
Natural Language Inference | SNLI hard 1.0 (test) | Accuracy | 70.6 | 27
Sentiment Classification | SST2 phrase | Accuracy | 88.3 | 16
Paraphrase Detection | PAWS QQP | Accuracy | 34.8 | 16
Paraphrase Detection | QQP IID | Accuracy | 84.8 | 8
Sentiment Classification | IMDB (Contrast) | Accuracy | 77.8 | 8
Sentiment Classification | IMDB Counterfactual | Accuracy | 85.38 | 8
Natural Language Inference | SNLI Counterfactual | Accuracy | 59.8 | 8
Showing 10 of 18 rows
