Differentiable Normative Guidance for Nash Bargaining Solution Recovery

About

Autonomous artificial intelligence agents in negotiation systems must generate equitable utility allocations that satisfy individual rationality (IR), meaning each agent receives at least its outside option, and that approach the Nash Bargaining Solution (NBS), which maximizes the product of the agents' surpluses over their outside options (the Nash product). Existing generative models often learn suboptimal human behaviors, producing solutions far from Pareto efficiency, while classical methods require full knowledge of the Pareto frontier, which is unavailable in real datasets. We propose a guided graph diffusion framework that generates individually rational utility vectors while approximating the NBS without frontier knowledge at inference time. Negotiations are modeled as directed graphs, with graph attention capturing asymmetric agent attributes, and a conditional diffusion model maps these graphs to utility vectors. A differentiable composite guidance loss, applied in the final reverse diffusion steps, penalizes IR violations and Nash product gaps. We prove that, under sufficient penalty weighting, solutions enter the IR region in finite time. Across datasets, the method achieves 100% IR compliance. Nash efficiency reaches 99.45% on synthetic data (within 0.55 percentage points of an oracle), 54.24% on CaSiNo, and 88.67% on Deal or No Deal, an improvement of 20 to 60 percentage points over unconstrained generative baselines.
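The two quantities the abstract refers to, IR violation and Nash product gap, can be illustrated on a toy problem. The sketch below is not the paper's guidance loss: the function names, the linear hinge penalty, the weights w_ir and w_nash, and the two-agent split of a unit pie are all assumptions made for illustration. The reference Nash product is available here only because the toy frontier (u1 + u2 = 1) is known in closed form, whereas the paper states that its method needs no frontier knowledge at inference time.

```python
def ir_violation(u, d):
    """Total shortfall of utilities u below outside options d (0 if u is individually rational)."""
    return sum(max(0.0, di - ui) for ui, di in zip(u, d))


def nash_product(u, d):
    """Product of surpluses u_i - d_i; treated as 0 if any agent is below its outside option."""
    prod = 1.0
    for ui, di in zip(u, d):
        surplus = ui - di
        if surplus < 0:
            return 0.0
        prod *= surplus
    return prod


def composite_loss(u, d, nbs_product, w_ir=10.0, w_nash=1.0):
    """Hypothetical composite penalty: weighted IR shortfall plus Nash product gap."""
    gap = max(0.0, nbs_product - nash_product(u, d))
    return w_ir * ir_violation(u, d) + w_nash * gap


def nash_efficiency(u, d, nbs_product):
    """Ratio of the achieved Nash product to the reference (NBS) Nash product."""
    return nash_product(u, d) / nbs_product


# Toy problem (assumed): two agents split a unit pie, u1 + u2 = 1.
d = (0.2, 0.1)                           # outside options
total_surplus = 1.0 - sum(d)             # 0.7 left after both outside options are paid
nbs_u = tuple(di + total_surplus / 2 for di in d)  # NBS splits the surplus equally
nbs_product = nash_product(nbs_u, d)

infeasible = (0.1, 0.9)                  # agent 1 gets less than its outside option
feasible = (0.6, 0.4)                    # individually rational but not the NBS

print(ir_violation(infeasible, d), composite_loss(infeasible, d, nbs_product))
print(ir_violation(feasible, d), nash_efficiency(feasible, d, nbs_product))
```

In this toy setting the IR term dominates whenever an agent falls below its outside option, and the Nash term only differentiates among allocations that are already individually rational. That ordering is one plausible reading of "sufficient penalty weighting", not a statement of the paper's actual construction.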

Moirangthem Tiken Singh, Surajit Borkotokey, Rajnish Kumar • 2026

Related benchmarks

Task	Dataset	Result
Bargaining Allocation	Synthetic NTU	IR (%) 100	8
Bargaining Allocation	CaSiNo	IR (%) 100	8
Bargaining Allocation	Deal or No Deal	IR (%) 100	8
