What’s the Point? Spatial Grammar & Index Resolution for Sign Language Recognition

Published: June 18, 2026

What's the Point? Spatial Grammar & Index Resolution for Sign Language Recognition is available on arXiv.

Venue: arXiv preprint
Authors: Ranum, O., Hadfield, S., Bowden, R.

🌐 Project Page · 📄 arXiv

Abstract: Sign language models are predominantly trained with gloss-sequence or text supervision, thereby under-modeling non-lexical and productive constructions. One comparatively tractable instance is spatial indexing: pointing gestures that assign discourse entities to spatial loci for subsequent co-reference, which lexicon-centric objectives largely fail to capture. We present a targeted evaluation of indexing in Sign Language Recognition, showing that despite comprising 10-15% of signing content, indexing is poorly recovered. We introduce a framework for training and evaluating indexing experts, establishing a baseline for index-aware sign language modeling. Our approach decomposes spatial reference resolution into index detection and discourse entity linking. The resulting mention representations enable automatic annotation and non-lexical structure modeling, and serve as an auxiliary indexing expert that augments a frozen SLR model at inference time.
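As a rough illustration of the two-stage decomposition described in the abstract (index detection, then discourse entity linking), the toy sketch below may help. It is not the authors' implementation: the per-frame pointing scores, the pointing angles, the locus table, the threshold, and every function name are hypothetical stand-ins chosen only to show how the two stages fit together.

```python
# Toy sketch only: all names, inputs, and the threshold are illustrative
# assumptions, not taken from the paper.

def detect_indices(pointing_scores, threshold=0.5):
    """Stage 1 (index detection): frames whose pointing score meets the threshold."""
    return [t for t, score in enumerate(pointing_scores) if score >= threshold]

def link_entity(direction, loci):
    """Stage 2 (entity linking): the entity whose locus angle is closest.

    Assumes angles lie in a small range, so wrap-around is ignored.
    """
    return min(loci, key=lambda name: abs(loci[name] - direction))

def resolve(pointing_scores, directions, loci, threshold=0.5):
    """Run both stages and return (frame, entity) pairs."""
    frames = detect_indices(pointing_scores, threshold)
    return [(t, link_entity(directions[t], loci)) for t in frames]

# Hypothetical discourse: two entities previously assigned to left/right loci.
loci = {"MOTHER": -0.5, "DOCTOR": 0.5}   # locus angles in radians (assumed)
scores = [0.1, 0.9, 0.2, 0.8]            # per-frame pointing confidence (assumed)
directions = [0.0, -0.6, 0.0, 0.5]       # per-frame pointing angle (assumed)

mentions = resolve(scores, directions, loci)
print(mentions)  # [(1, 'MOTHER'), (3, 'DOCTOR')]
```

In a real system each stage would presumably be a learned model operating on video or pose features. The nearest-angle lookup here only stands in for whatever linking mechanism is actually used.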

Oline Ranum
