A smaller variant of a model architecture that prioritizes speed and lower memory usage over maximum predictive performance, making it suitable for resource-constrained environments.
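
As a rough illustration of the memory side of this trade-off, the sketch below estimates the space needed for model weights alone from a parameter count. The function name `weight_memory_mib`, the 4-bytes-per-parameter default (32-bit floats), and both parameter counts are assumptions chosen for illustration; they do not describe any particular model, and activations, optimizer state, and runtime overhead are ignored.

```python
def weight_memory_mib(num_params: int, bytes_per_param: int = 4) -> float:
    """Approximate memory for model weights alone, in MiB.

    Assumes every parameter is stored at the same width
    (4 bytes corresponds to 32-bit floats).
    """
    return num_params * bytes_per_param / (1024 ** 2)


# Hypothetical parameter counts, chosen only for illustration.
full_params = 100_000_000
small_params = 25_000_000

full_mib = weight_memory_mib(full_params)
small_mib = weight_memory_mib(small_params)

print(f"full variant:  {full_mib:.1f} MiB")
print(f"small variant: {small_mib:.1f} MiB")
print(f"reduction:     {full_mib / small_mib:.1f}x")
```

Under these assumptions the weight footprint scales linearly with the parameter count, so a variant with a quarter of the parameters needs about a quarter of the weight memory. Speed usually improves as well, but not necessarily by the same factor.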