An efficient attention mechanism that groups similar tokens together and restricts attention to those groups instead of comparing every pair of tokens. This reduces computation and memory use, allowing the model to handle longer texts.
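
To make the idea concrete, here is a minimal NumPy sketch of attention restricted to groups of similar tokens. The description above does not say how tokens are grouped, so the grouping rule used here (the sign pattern of a few random projections, in the spirit of locality-sensitive hashing) is an assumption, as are the function name `grouped_attention`, the parameter `n_bits`, and the choice to use the same vectors as queries and keys. It illustrates the general technique and is not a specific model's implementation.

```python
import numpy as np


def grouped_attention(x, v, n_bits=2, seed=0):
    """Self-attention where each token attends only to tokens in its own bucket.

    x: (n, d) array used as both queries and keys (an assumption of this sketch).
    v: (n, d_v) array of values.
    Returns (out, buckets): outputs of shape (n, d_v) and a bucket id per token.
    """
    n, d = x.shape
    rng = np.random.default_rng(seed)

    # Hypothetical grouping rule: hash each token by the signs of n_bits
    # random projections, so vectors pointing in similar directions tend
    # to land in the same bucket.
    planes = rng.standard_normal((d, n_bits))
    bits = (x @ planes > 0).astype(int)          # (n, n_bits) of 0/1
    buckets = bits @ (1 << np.arange(n_bits))    # bucket id in [0, 2**n_bits)

    out = np.zeros(v.shape, dtype=float)
    for b in np.unique(buckets):
        idx = np.where(buckets == b)[0]          # tokens in this bucket
        # Scaled dot-product attention, but only inside the bucket.
        scores = x[idx] @ x[idx].T / np.sqrt(d)
        scores -= scores.max(axis=1, keepdims=True)  # numerical stability
        weights = np.exp(scores)
        weights /= weights.sum(axis=1, keepdims=True)
        out[idx] = weights @ v[idx]
    return out, buckets


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.standard_normal((16, 8))
    v = rng.standard_normal((16, 4))
    out, buckets = grouped_attention(x, v)
    print(out.shape, np.bincount(buckets))
```

With n tokens split into buckets of roughly m tokens each, the score matrices total about n × m entries rather than n × n, which is where the savings in computation and memory come from. The trade-off in this sketch is that tokens in different buckets never attend to each other.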