# Sign Language Recognition Benchmarks
Compiled from published papers. Only well-attested numbers are included; entries marked ⚠️ are approximate recollections and should be verified against the source paper.
Metrics: WER = Word Error Rate (lower is better), BLEU = BLEU score (higher is better), Acc = Top-1 Accuracy (%).
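For reference, WER is the word-level edit distance (substitutions + insertions + deletions) divided by the number of reference words. A minimal sketch in plain Python, assuming whitespace-tokenized gloss sequences:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Levenshtein distance over words via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("heute regen im norden", "heute regen norden"))  # one deletion / 4 words = 0.25
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions, which is why some papers report values above 100%.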
## Isolated Sign Language Recognition

### AUTSL (Turkish Sign Language) — 226 Classes

### WLASL (American Sign Language) — 2000 Glosses
| Method | Modality | Top-1 Acc (%) | Top-5 Acc (%) | Source |
|---|---|---|---|---|
| I3D | RGB | 62.8 ⚠️ | 83.5 ⚠️ | Li et al. 2020 |
| VAC + I3D | RGB | 66.2 ⚠️ | — | Literature |
| SOTA (various) | RGB | ~70–75 ⚠️ | — | Recent papers |
### MS-ASL (American Sign Language) — 1000 Classes

## Continuous Sign Language Recognition

### RWTH-PHOENIX-2014 (DGS, Weather) — 1087 Glosses
| Method | Year | WER (%) | Source |
|---|---|---|---|
| CTC + CNN + BiLSTM (Koller et al.) | 2019 | ~22.0 ⚠️ | Koller et al., IEEE PAMI 2019 |
| ReLU + BiLSTM (Koller et al.) | 2019 | 21.8 ⚠️ | Koller et al., IEEE PAMI 2019 |
| DenseNet + BiLSTM | 2019 | 24.0 ⚠️ | Various |
| VAC (Visual Alignment Constraint) | 2021 | ~20.3 ⚠️ | Min et al., ICCV 2021 |
| Squeeze-and-Excitation + CTC | 2021 | 20.8 ⚠️ | Various |
| SlowFast + CTC | 2024 | ~18.0 ⚠️ | Ahn et al., ICASSP 2024 |
Note: PHOENIX-2014 WER is reported on the development set in many papers. Test set WERs may differ. Verify set usage in each paper.
### CSL-Daily (Chinese Sign Language) — 2000 Glosses
| Method | Year | WER (%) | Source |
|---|---|---|---|
| 2S-AGCN | 2021 | 24.4 ⚠️ | Zhou et al., CVPR 2021 |
| VAC | 2021 | ~23.0 ⚠️ | Literature |
| SlowFast + CTC | 2024 | ~19.7 ⚠️ | Ahn et al., ICASSP 2024 |
### How2Sign (ASL, Continuous) — ~350 Glosses
Note: How2Sign is significantly harder than PHOENIX due to larger vocabulary, longer sentences, and more varied content. WERs are substantially higher.
## Sign Language Translation

### RWTH-PHOENIX-2014T (DGS → German)
| Method | Year | BLEU-4 | Source |
|---|---|---|---|
| Neural SLT (Camgoz et al., gloss-based) | 2018 | 19.3 ⚠️ | Camgoz et al., CVPR 2018 |
| STMC-Transformer (gloss-based) | 2020 | 26.0 ⚠️ | Yin & Read, COLING 2020 |
| Sign Language Transformers (end-to-end) | 2020 | ~22.0 ⚠️ | Camgoz et al., CVPR 2020 |
| Multi-Modality Transfer Learning | 2022 | 34.3 ⚠️ | Chen et al., CVPR 2022 |
| Recent SOTA | 2024+ | ~38–42 ⚠️ | Recent papers |
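For context on the metric: BLEU-4 combines modified n-gram precisions up to order 4 with a brevity penalty. Published numbers are corpus-level scores, usually computed with sacrebleu or NLTK; the sentence-level sketch below uses simple add-one smoothing, an assumption for illustration rather than the smoothing used in any particular paper:

```python
import math
from collections import Counter

def bleu4(reference: str, hypothesis: str) -> float:
    """Sentence-level BLEU-4 (0-100) with add-one smoothing on each n-gram precision."""
    ref, hyp = reference.split(), hypothesis.split()
    if not hyp:
        return 0.0
    log_prec = 0.0
    for n in range(1, 5):
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
        overlap = sum((hyp_ngrams & ref_ngrams).values())  # clipped n-gram matches
        total = max(sum(hyp_ngrams.values()), 1)
        log_prec += math.log((overlap + 1) / (total + 1)) / 4  # geometric mean of 4 orders
    brevity = min(1.0, math.exp(1 - len(ref) / len(hyp)))  # penalize short hypotheses
    return 100 * brevity * math.exp(log_prec)

print(bleu4("heute regnet es im norden", "heute regnet es im norden"))  # 100.0
```

Because smoothing and tokenization choices shift BLEU by several points, scores are only comparable when computed with the same toolchain.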
### How2Sign (ASL → English)

### CSL-Daily (CSL → Chinese)

## Notes on Comparing Results
⚠️ Caution when comparing numbers across papers:
- Different papers may use different train/test splits
- Some report on development set, others on test set
- PHOENIX-2014 has multiple common experimental setups
- Pre-processing (cropping, resolution) varies
- Modalities differ (RGB only vs. RGB+optical flow vs. pose)
- Gloss vocabulary sizes differ across datasets
Recommended approach: Always compare within the same paper’s table when possible, and verify experimental setup.
Last updated: March 2026. Numbers marked ⚠️ are from memory of published work; please verify against the cited source papers. Contributions and corrections welcome via pull request.