Adaptive convolution kernels guided by object-size proportions, combined with a transformer-based backbone, significantly improve detection of objects at different scales in satellite imagery.
RDNet improves salient object detection in satellite images by replacing the traditional CNN backbone with a Swin Transformer and adding three specialized modules that adapt to different object sizes and use frequency analysis to capture context. As a result, it detects objects of varying scales in remote sensing imagery more accurately than existing methods.
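The idea of letting object-size proportion guide kernel choice can be illustrated with a small sketch. This is not RDNet's implementation: the summary above does not describe the modules in detail, so the function names (`object_proportion`, `select_kernel_sizes`), the thresholds, and the kernel-size sets below are all hypothetical, chosen only to show the general pattern of estimating how much of the image an object covers and picking larger receptive fields for larger objects.

```python
def object_proportion(mask):
    """Return the fraction of pixels marked salient in a binary mask.

    `mask` is a list of rows, each a list of 0/1 values (a hypothetical
    coarse saliency estimate, e.g. from an earlier network stage).
    """
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask must contain at least one pixel")
    return sum(sum(row) for row in mask) / total


def select_kernel_sizes(
    proportion,
    thresholds=(0.05, 0.25),                          # assumed cut-offs
    kernel_sets=((1, 3, 5), (3, 5, 7), (5, 7, 9)),    # assumed kernel sizes
):
    """Pick a set of convolution kernel sizes from the object proportion.

    Small objects get small kernels, large objects get large kernels.
    There must be exactly one more kernel set than thresholds.
    """
    if len(kernel_sets) != len(thresholds) + 1:
        raise ValueError("need len(thresholds) + 1 kernel sets")
    for threshold, kernels in zip(thresholds, kernel_sets):
        if proportion < threshold:
            return kernels
    # Proportion is at or above every threshold: use the largest kernels.
    return kernel_sets[-1]


if __name__ == "__main__":
    # A 4x4 mask with a single salient pixel covers 1/16 of the image.
    small_object = [
        [0, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
    ]
    p = object_proportion(small_object)
    print(p, select_kernel_sizes(p))
```

In a real network the selected sizes would parameterize parallel convolution branches rather than being returned as a tuple; the discrete thresholds here stand in for whatever learned or continuous weighting the actual model uses.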