Differentiable Normative Guidance for Nash Bargaining Solution Recovery

About

Autonomous artificial intelligence agents in negotiation systems must generate equitable utility allocations that satisfy individual rationality (IR), meaning each agent receives at least its outside option, and that realize the Nash Bargaining Solution (NBS), which maximizes the product of the agents' gains over their outside options (the Nash product). Existing generative models often learn suboptimal human behaviors and produce solutions far from Pareto efficiency, while classical methods require full knowledge of the Pareto frontier, which is unavailable in real datasets. We propose a guided graph diffusion framework that generates individually rational utility vectors while approximating the NBS without frontier knowledge at inference time. Negotiations are modeled as directed graphs, with graph attention capturing asymmetric agent attributes, and a conditional diffusion model maps these graphs to utility vectors. A differentiable composite guidance loss, applied in the final reverse diffusion steps, penalizes IR violations and Nash product gaps. We prove that, under sufficient penalty weighting, solutions enter the IR region in finite time. Across all datasets, the method achieves 100% IR compliance. Nash efficiency reaches 99.45% on synthetic data (within 0.55 percentage points of an oracle), 54.24% on CaSiNo, and 88.67% on Deal or No Deal, an improvement of 20 to 60 percentage points over unconstrained generative baselines.
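The abstract does not give the form of the composite guidance loss, so the following is only a minimal sketch of the idea under stated assumptions, not the authors' implementation. It assumes a hinge penalty on IR violations plus a negative log Nash product, and it replaces the diffusion model with a toy two-agent problem whose feasible set is sum(u) <= 1. The names `guidance_grad`, `project_budget`, `guided_refinement`, and `nash_efficiency`, and the parameters `lam_ir`, `eps`, and `step`, are all hypothetical.

```python
import numpy as np

def guidance_grad(u, d, lam_ir=20.0, eps=0.05):
    """Gradient of an ASSUMED composite loss (not taken from the paper):
         lam_ir * sum(max(d + eps - u, 0)) - sum(log(max(u - d, eps)))
    The first term penalizes IR violations, the second rewards a large Nash product."""
    gain = u - d                    # each agent's surplus over its outside option
    safe = np.maximum(gain, eps)    # keeps the division finite near the IR boundary
    # Below the margin only the hinge is active; above it only the log term is.
    return np.where(gain < eps, -lam_ir, -1.0 / safe)

def project_budget(u, total=1.0):
    """Euclidean projection onto the toy feasible set {u : sum(u) <= total}.
    This stands in for the feasibility a trained denoiser would supply."""
    excess = u.sum() - total
    return u - excess / u.size if excess > 0 else u

def guided_refinement(u0, d, steps=300, step=0.01):
    """Repeated guided updates, loosely mimicking the final reverse diffusion steps."""
    u = np.asarray(u0, dtype=float)
    for _ in range(steps):
        u = project_budget(u - step * guidance_grad(u, d))
    return u

def nash_efficiency(u, d, u_star):
    """Nash product of u relative to that of a reference solution u_star."""
    return np.prod(np.maximum(u - d, 0.0)) / np.prod(u_star - d)

d = np.array([0.2, 0.1])              # outside options (toy values)
u_star = d + (1.0 - d.sum()) / 2      # closed-form NBS when the frontier is sum(u) = 1
u_first = guided_refinement([0.9, 0.05], d, steps=1)
u_final = guided_refinement([0.9, 0.05], d)
print(u_first, u_final, nash_efficiency(u_final, d, u_star))
```

In this toy the starting point violates IR for the second agent. With `lam_ir` large relative to the step size, the hinge term moves the sample into the IR region after a single update, and later updates move it toward the closed-form NBS (0.55, 0.45). The closed form is used only to score the result, which mirrors the abstract's claim that frontier knowledge is not needed at inference time; the projection step, however, does rely on the toy feasible set being known.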

Moirangthem Tiken Singh, Surajit Borkotokey, Rajnish Kumar • 2026

Related benchmarks

Task | Dataset | Result | Rank
Bargaining Allocation | Synthetic NTU | IR (%): 100 | 8
Bargaining Allocation | CaSiNo | IR (%): 100 | 8
Bargaining Allocation | Deal or No Deal | IR (%): 100 | 8