
Emotion-to-Response Content Generation via Multi-Agent AI System with Real-Time Safety Verification

About

This paper proposes a multi-agent artificial intelligence system that generates response-oriented media content in real time based on audio-derived emotional signals. Unlike conventional speech emotion recognition studies that focus primarily on classification accuracy, our approach emphasizes the transformation of inferred emotional states into safe, age-appropriate, and controllable response content through a structured pipeline of specialized AI agents. The proposed system comprises four cooperative agents: (1) an Emotion Recognition Agent with CNN-based acoustic feature extraction, (2) a Response Policy Decision Agent for mapping emotions to response modes, (3) a Content Parameter Generation Agent for producing media control parameters, and (4) a Safety Verification Agent enforcing age-appropriateness and stimulation constraints. We introduce an explicit safety verification loop that filters generated content before output, ensuring compliance with predefined rules. Experimental results on public datasets demonstrate that the system achieves 73.2% emotion recognition accuracy, 89.4% response mode consistency, and 100% safety compliance while maintaining sub-100ms inference latency suitable for on-device deployment. The modular architecture enables interpretability and extensibility, making it applicable to child-adjacent media, therapeutic applications, and emotionally responsive smart devices.
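The four-agent flow described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the emotion labels, the response modes, the parameter names (`brightness`, `volume`, `tempo`), the age groups and their limits, and the choice to clamp out-of-range values rather than reject the content. The Emotion Recognition Agent is reduced to picking the top label from precomputed classifier scores; the CNN itself is not modeled.

```python
# Hypothetical emotion -> response mode policy (Agent 2). Labels are assumed.
POLICY = {
    "angry": "calming",
    "sad": "comforting",
    "happy": "playful",
    "neutral": "ambient",
}

# Hypothetical media control parameters per response mode (Agent 3), in [0, 1].
MODE_PARAMS = {
    "calming": {"brightness": 0.3, "volume": 0.3, "tempo": 0.2},
    "comforting": {"brightness": 0.5, "volume": 0.4, "tempo": 0.3},
    "playful": {"brightness": 0.9, "volume": 0.7, "tempo": 0.8},
    "ambient": {"brightness": 0.5, "volume": 0.3, "tempo": 0.4},
}

# Hypothetical stimulation limits per age group (Agent 4).
SAFETY_LIMITS = {
    "child": {"brightness": 0.6, "volume": 0.5, "tempo": 0.5},
    "adult": {"brightness": 1.0, "volume": 0.9, "tempo": 1.0},
}


def recognize_emotion(scores):
    """Agent 1 stand-in: return the label with the highest classifier score."""
    return max(scores, key=scores.get)


def decide_response_mode(emotion):
    """Agent 2: map an emotion label to a response mode, defaulting to 'ambient'."""
    return POLICY.get(emotion, "ambient")


def generate_parameters(mode):
    """Agent 3: return a copy of the control parameters for the given mode."""
    return dict(MODE_PARAMS[mode])


def verify_safety(params, age_group):
    """Agent 4: clamp each parameter to its limit and report what was clamped."""
    limits = SAFETY_LIMITS[age_group]
    violations = [name for name, value in params.items() if value > limits[name]]
    safe = {name: min(value, limits[name]) for name, value in params.items()}
    return safe, violations


def run_pipeline(scores, age_group):
    """Run the four stages in order; only verified parameters are returned."""
    emotion = recognize_emotion(scores)
    mode = decide_response_mode(emotion)
    params = generate_parameters(mode)
    safe, violations = verify_safety(params, age_group)
    return {"emotion": emotion, "mode": mode, "params": safe, "violations": violations}


if __name__ == "__main__":
    result = run_pipeline({"happy": 0.7, "sad": 0.2, "neutral": 0.1}, "child")
    print(result)
```

In this sketch the safety stage sits between parameter generation and output, so nothing reaches the caller unverified. With the assumed limits, a "playful" response for the "child" group has all three parameters clamped, and the clamped names are reported in `violations` so the decision stays inspectable.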

HyeYoung Lee • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Emotion Recognition | RAVDESS (test) | Accuracy | 0.785 | 17 |
| Content Parameter Generation | Expert-designed targets (test) | MAE | 0.05 | 7 |
| Emotion Recognition | IEMOCAP (test) | Accuracy | 73.2 | 7 |
| Inference Latency | Internal audio-based emotional signals (test) | Mean Latency | 12.3 | 4 |
| Safety Verification | Safety Verification scenarios (test) | Pass Rate | 100 | 4 |
| Emotion Recognition | Synthetic (test) | Accuracy | 89.3 | 3 |
| Emotion Recognition | Cross-domain (test) | Accuracy | 61.4 | 3 |
| Emotion Recognition | AIHub (test) | Accuracy | 75.1 | 2 |
| Response Mode Prediction | Response Mode Prediction dataset (test) | Accuracy | 89.4 | 1 |
