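Attention mechanism that operates over both the spatial dimensions (positions within a frame) and the temporal dimension (positions across frames) of a video, so that each token can weigh information from other locations in the same frame and from other frames.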
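
As a rough illustration, the sketch below shows one possible variant: joint attention, where all patch tokens from all frames are flattened into a single sequence and attend to one another. This is a minimal sketch under assumptions, not a reference implementation. The function name `joint_space_time_attention`, the `(T, N, D)` token layout, the single attention head, and the random projection matrices are all hypothetical choices made for the example. Other variants apply spatial and temporal attention in separate steps.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def joint_space_time_attention(video_tokens, w_q, w_k, w_v):
    """Single-head joint space-time attention (illustrative only).

    video_tokens: array of shape (T, N, D) with T frames,
                  N spatial patches per frame, D channels (assumed layout).
    w_q, w_k, w_v: projection matrices of shape (D, D).
    """
    t, n, d = video_tokens.shape
    # Flatten space and time into one sequence of T*N tokens.
    tokens = video_tokens.reshape(t * n, d)
    q = tokens @ w_q
    k = tokens @ w_k
    v = tokens @ w_v
    # Scaled dot-product scores between every pair of tokens,
    # including pairs that lie in different frames.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    out = weights @ v
    return out.reshape(t, n, -1), weights


# Toy input: 4 frames, 9 patches per frame, 16 channels (arbitrary sizes).
rng = np.random.default_rng(0)
T, N, D = 4, 9, 16
video = rng.normal(size=(T, N, D))
w_q, w_k, w_v = (rng.normal(size=(D, D)) / np.sqrt(D) for _ in range(3))

out, weights = joint_space_time_attention(video, w_q, w_k, w_v)
print(out.shape)      # (4, 9, 16)
print(weights.shape)  # (36, 36)
```

Each row of `weights` covers all 36 tokens, so a patch in one frame can draw on patches in every other frame. The cost grows quadratically with `T * N`, which is one reason the attention is sometimes split into separate spatial and temporal steps.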