1. University-1652

University-1652 is a multi-view, multi-source benchmark for drone-based geo-localization that covers 1652 buildings at 72 universities around the world. We provide images collected from a virtual drone, satellites, and the ground.

[Paper] [Slide] [Dataset] [Explore Drone-view Data] [Explore Satellite-view Data] [Explore Street-view Data] [Video Sample] [Introduction in Chinese]

Task 1: Drone-view target localization (Drone -> Satellite). Given one drone-view image or video, the task is to find the most similar satellite-view image, thereby localizing the target building in the satellite view.

Task 2: Drone navigation (Satellite -> Drone). Given one satellite-view image, the drone aims to find the most relevant place (drone-view images) that it has passed by. Using its flight history, the drone can then be navigated back to the target place.
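Both tasks can be read as cross-view retrieval: a query from one view is compared against a gallery from the other view, and the gallery is ranked by similarity. The sketch below illustrates that ranking step only. It is not the benchmark's official evaluation code. The feature vectors, the building names, and the helpers `cosine_similarity` and `rank_gallery` are hypothetical, and cosine similarity is an assumed choice of metric.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_gallery(query, gallery):
    """Return gallery keys sorted from most to least similar to the query."""
    return sorted(
        gallery,
        key=lambda name: cosine_similarity(query, gallery[name]),
        reverse=True,
    )

# Toy, made-up 3-D features; real features would come from a trained model.
satellite_gallery = {
    "building_A": [1.0, 0.0, 0.0],
    "building_B": [0.0, 1.0, 0.0],
    "building_C": [0.7, 0.7, 0.0],
}
drone_query = [0.9, 0.1, 0.0]

# Task 1 (Drone -> Satellite): rank satellite images for one drone query.
ranking = rank_gallery(drone_query, satellite_gallery)
print(ranking[0])  # top-ranked satellite candidate
```

For Task 2 (Satellite -> Drone) the roles are swapped: the satellite feature becomes the query and the drone-view features form the gallery.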

2. DG-Market

We release our generated images as a large-scale synthetic dataset called DG-Market. The dataset is generated by our DG-Net and consists of 128,307 images (613MB), about 10 times larger than the training set of the original Market-1501 (many more can be generated with DG-Net). It can be used as a source of unlabeled training data for semi-supervised learning. You may download the dataset from Google Drive (or Baidu Disk, password: qxyh).

             DG-Market    Market-1501 (training)
#identity    -            751
#images      128,307      12,936

3. VSPW: A large-scale dataset for video scene parsing in the wild

[Project Page]

  1. Large scale: 251,632 pixel-level annotated frames from 124 categories, across 3,536 videos from 231 scenarios (indoor and outdoor).
  2. Well-trimmed, long-temporal clips: each clip is a complete shot lasting 5 seconds on average.
  3. Dense annotation: pixel-level annotations are provided at 15 fps.
  4. High resolution: over 96% of the videos are high resolution, from 720P to 4K.
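As a rough sanity check, the figures in the list above can be related to each other. The short calculation below assumes that every annotated frame belongs to one of the 3,536 videos and that frames are spaced evenly at the stated annotation rate. The variable names are illustrative only.

```python
# Figures quoted in the list above.
annotated_frames = 251_632
videos = 3_536
annotation_fps = 15  # annotated frames per second

# Assumption: annotated frames are spread over all videos at a uniform rate.
frames_per_video = annotated_frames / videos
seconds_per_video = frames_per_video / annotation_fps

print(round(frames_per_video, 1), round(seconds_per_video, 2))
```

Under these assumptions the average works out to roughly 71 annotated frames, or about 4.7 seconds, per video, which is in line with the stated average shot length of about 5 seconds.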

4. Awesome Lists