Measuring Successful Cooperation in Human-AI Teamwork: Development and Validation of the Perceived Cooperativity and Teaming Perception Scales
About
As human-AI cooperation becomes increasingly prevalent, reliable instruments for assessing the subjective quality of cooperative human-AI interaction are needed. We introduce two theoretically grounded scales: the Perceived Cooperativity Scale (PCS), grounded in joint activity theory, and the Teaming Perception Scale (TPS), grounded in evolutionary cooperation theory. The PCS captures an agent's perceived cooperative capability and practice within a single interaction sequence; the TPS captures the emergent sense of teaming arising from mutual contribution and support. Both scales were adapted for human-human cooperation to enable cross-agent comparisons. Across three studies (N = 409) encompassing a cooperative card game, LLM interaction, and a decision-support system, analyses of dimensionality, reliability, and validity indicated that both scales successfully differentiated between cooperation partners of varying cooperative quality and showed construct validity in line with expectations. The scales provide a basis for empirical investigation and system evaluation across a wide range of human-AI cooperation contexts.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Teaming Perception Scale (TPS) Construct Validity Analysis | Han-HH interaction context | PCS (Spearman rho) | 0.82 | 3 |
| Teaming Perception Scale (TPS) Construct Validity Analysis | Han-RB interaction context | PCS (Spearman rho) | 0.83 | 3 |
| Teaming Perception Scale (TPS) Construct Validity Analysis | Han-RL interaction context | PCS (Spearman rho) | 0.79 | 3 |
| Teaming Perception Scale (TPS) Construct Validity Analysis | Focused interaction context | PCS (Spearman rho) | 0.77 | 3 |
| Teaming Perception Scale (TPS) Construct Validity Analysis | Augmented interaction context | PCS (Spearman rho) | 0.67 | 3 |
| Teaming Perception Scale (TPS) Construct Validity Analysis | Conflicted interaction context | PCS (Spearman rho) | 0.71 | 3 |
| Teaming Perception Scale (TPS) Construct Validity Analysis | Failed interaction context | PCS (Spearman rho) | 0.77 | 3 |
| Construct Validity Assessment | Hanabi Han-HH sample | SIPA Score | 0.76 | 1 |
| Construct Validity Assessment | Hanabi Han-RB sample | SIPA Score | 0.62 | 1 |
| Construct Validity Assessment | Hanabi Han-RL sample | Warmth Score | 0.43 | 1 |
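The TPS construct validity rows above report Spearman rank correlations with the PCS. As a rough illustration of how such a coefficient is computed, the following is a minimal sketch in plain Python. It is not the authors' analysis code: the function names and the per-participant scores are hypothetical, chosen only to show the calculation (rank both variables with ties averaged, then take the Pearson correlation of the ranks).

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-participant mean scale scores (illustrative, not study data).
tps_scores = [4.2, 3.1, 4.8, 2.5, 3.9, 4.5]
pcs_scores = [4.0, 3.3, 4.6, 2.9, 3.5, 4.7]

rho = spearman_rho(tps_scores, pcs_scores)
print(f"Spearman rho between TPS and PCS: {rho:.2f}")
```

In practice a statistics package would normally be used for this, since it also provides p-values and confidence intervals; the hand-rolled version here only makes the rank-then-correlate step explicit.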