Eval Explorer
NanoBEIR
Subset
Metric
NDCG
MRR
Accuracy
Recall
K
1
3
10
Base Model
Model Groups
Display
Std bars
Group lines
Model Comparison
Size vs Perf
All Metrics Table
Query Browser
Significance
Metric@K Trend
Training Trajectory
Show as compression ratio
Metrics
K values
Embedding size (MB)
Standard deviation
Aggregate metrics by model
Queries
Select a query to view per-model rankings
Subset
Metric
Wilcoxon
Paired t-test
p < 0.05 = significant
↓ Click a cell for normality check
Normality Check · Differences (A − B)
Q-Q Plot (vs Normal)
Distribution of Differences
Metric
NDCG
MRR
Accuracy
Subset
Metric
NDCG
MRR
Accuracy
@K
1
3
5
10
Subset