Dialogue Generation on DSTC-8 Reddit (test)
[Chart: R-L Score over time. Best result: 12.56, SECOND THOUGHTS (AEM + VM default), Jan 1, 2023. Available metrics: R-L Score, Perplexity.]
Updated 1mo ago
Evaluation Results
Method                              Details                    Date     R-L Score  Perplexity
SECOND THOUGHTS (AEM + VM default)  variant=AEM + VM (defa...  2023.01  12.56      12.4
SECOND THOUGHTS (AEM + AIL)         variant=AEM + AIL          2023.01  11.31      12.85
SECOND THOUGHTS (AEM Only)          variant=AEM Only           2023.01  9.8        11.56
InstructGPT                         Service=Huge LM API se...  2023.01  8.8        10.57
GPT-3                               Service=Huge LM API se...  2023.01  7.31       11.44