News

Intuitively, relations among objects help a model perform inference in constrained environments. However, the top-down information flow in the Feature Pyramid Network (FPN) dilutes the ...
In this paper, we propose a scene-relation-level pre-training task that treats relations as more valuable bridges between modalities. Based on this, we construct a novel Visual-Language Scene ...