
Name: geomapnet

Owner: NVIDIA Research Projects

Description: Geometry-Aware Learning of Maps for Camera Localization (CVPR2018)

Created: 2018-04-06 20:15:13.0

Updated: 2018-05-14 18:19:42.0

Pushed: 2018-05-12 03:50:38.0

Homepage: https://goo.gl/mRB3Au.

Size: 2542

Language: Python


License: CC BY-NC-SA 4.0; Python 2.7

Geometry-Aware Learning of Maps for Camera Localization


Copyright © 2018 NVIDIA Corporation. All rights reserved. Licensed under the CC BY-NC-SA 4.0 license (https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode).


This is the PyTorch implementation of our CVPR 2018 paper

Samarth Brahmbhatt, Jinwei Gu, Kihwan Kim, James Hays, and Jan Kautz. Geometry-Aware Learning of Maps for Camera Localization. CVPR 2018.

A four-minute video summary (click below for the video)



MapNet uses a Conda environment that makes it easy to install all dependencies.

  1. Install miniconda with Python 2.7.

  2. Create the mapnet Conda environment: conda env create -f environment.yml.

  3. Activate the environment: conda activate mapnet_release.


We support the 7Scenes and Oxford RobotCar datasets right now. You can also write your own PyTorch dataloader for other datasets and put it in the dataset_loaders directory. Refer to this README file for more details.

The datasets live in the data/deepslam_data directory. We provide skeletons with symlinks to get you started. Let us call your 7Scenes download directory 7SCENES_DIR and your main RobotCar download directory (in which you untar all the downloads from the website) ROBOTCAR_DIR. You will need to make the following symlinks:

cd data/deepslam_data && ln -s 7SCENES_DIR 7Scenes && ln -s ROBOTCAR_DIR RobotCar_download
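The same links can also be created from Python, which may be convenient when the download locations are produced by another script. This is only a minimal sketch and not part of the repository: `link_datasets` and its arguments are hypothetical names, and it assumes a system where `os.symlink` is permitted.

```python
import os


def link_datasets(data_root, scenes_dir, robotcar_dir):
    """Create the 7Scenes and RobotCar_download symlinks inside data_root.

    Hypothetical helper: data_root stands for data/deepslam_data,
    scenes_dir for 7SCENES_DIR and robotcar_dir for ROBOTCAR_DIR.
    """
    links = {"7Scenes": scenes_dir, "RobotCar_download": robotcar_dir}
    created = []
    for name, target in sorted(links.items()):
        link_path = os.path.join(data_root, name)
        if os.path.lexists(link_path):
            # Assumption: anything already at this path is left untouched.
            continue
        os.symlink(os.path.abspath(target), link_path)
        created.append(link_path)
    return created
```

Unlike a plain `ln -s`, this sketch skips any path that already exists, so it can be re-run safely.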

Special instructions for RobotCar (only needed for RobotCar data):
  1. Download this fork of the dataset SDK, and run cd scripts && ./make_robotcar_symlinks.sh after editing the ROBOTCAR_SDK_ROOT variable in it appropriately.

  2. For each sequence, you need to download the stereo_centre, vo and gps tar files from the dataset website.

  3. The directory for each 'scene' (e.g. full) has .txt files defining the train/test split. While training MapNet++, you must put the sequences for self-supervised learning (dataset T in the paper) in the test_split.txt file. The dataloader for the MapNet++ models will use both images and ground-truth pose from sequences in train_split.txt and only images from the sequences in test_split.txt.

  4. To make training faster, we pre-processed the images using scripts/process_robotcar_images.py. This script undistorts the images using the camera models provided by the dataset, and scales them such that the shortest side is 256 pixels.
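The "shortest side is 256 pixels" scaling in step 4 can be illustrated with a little arithmetic. The function below is a sketch of that rule only; `shortest_side_size` is a hypothetical name, and the actual resizing code in scripts/process_robotcar_images.py may round or interpolate differently.

```python
def shortest_side_size(width, height, shortest=256):
    """Return (new_width, new_height) with the shorter side equal to `shortest`.

    The aspect ratio is preserved up to rounding of the longer side.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = float(shortest) / min(width, height)
    # The shorter side is set exactly; the longer side is rounded.
    if width <= height:
        return shortest, int(round(height * scale))
    return int(round(width * scale)), shortest
```

For instance, a 1280x960 input would map to 341x256 under this rule.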

Running the code

The trained models for all experiments presented in the paper can be downloaded here. The inference script is scripts/eval.py. Here are some examples, assuming the models are downloaded in scripts/logs. Please go to the scripts folder to run the commands.


The executable script is scripts/train.py. Please go to the scripts folder to run these commands.


For example, we can train a MapNet++ model on heads, starting from a pretrained MapNet model:

python train.py --dataset 7Scenes \
--checkpoint logs/7Scenes_heads_mapnet_mapnet_learn_beta_learn_gamma/epoch_250.pth.tar \
--scene heads --config_file configs/mapnet++_7Scenes.ini --model mapnet++ \
--device 0 --learn_beta --learn_gamma

For MapNet++ training, you will need visual odometry (VO) data (or other sensory inputs such as noisy GPS measurements). For 7Scenes, we provide preprocessed VO computed with the DSO method. For RobotCar, we use the provided stereo_vo. If you plan to use your own VO data (especially from a monocular camera) for MapNet++ training, you must first align the VO with the world coordinate frame (for rotation and scale). Please refer to the “Align VO” section below for more detailed instructions.

The meanings of various command-line parameters are documented in scripts/train.py. The values of various hyperparameters are defined in a separate .ini file. We provide some examples in the scripts/configs directory, along with a README file explaining some hyper-parameters.
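As an illustration of how hyperparameters in an .ini file can be read, here is a minimal sketch using Python 3's standard `configparser` module. The section names, key names, and values are made up for this example (only the `visdom = yes` setting is mentioned in this README); consult the files in scripts/configs and their README for the real ones.

```python
import configparser

# Hypothetical file contents; the real files live in scripts/configs.
EXAMPLE_INI = """
[training]
n_epochs = 300
visdom = yes

[hyperparameters]
beta = -3.0
"""


def load_config(text):
    """Parse .ini text into typed values (section and key names are assumed)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "n_epochs": parser.getint("training", "n_epochs"),
        # getboolean accepts yes/no, true/false, on/off and 1/0.
        "visdom": parser.getboolean("training", "visdom"),
        "beta": parser.getfloat("hyperparameters", "beta"),
    }
```

Typed getters such as `getint` and `getboolean` fail early on malformed values, which is usually preferable to discovering a bad setting in the middle of training.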

If you have visdom = yes in the config file, you will need to start a Visdom server for logging the training progress:

python -m visdom.server -env_path=scripts/logs/

Network Attention Visualization

Calculates the network attention visualizations and saves them in a video

Other Tools
Align VO to the ground truth poses

This has to be done before using VO in MapNet++ training. The executable script is scripts/align_vo_poses.py.
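To give a feel for what aligning "for rotation and scale" means, the sketch below fits a least-squares similarity transform between two planar trajectories using complex arithmetic. It is a simplified 2D illustration under stated assumptions, not the method implemented in scripts/align_vo_poses.py: the function names are hypothetical, only x/y positions are used, and real poses are 3D with orientations.

```python
import cmath


def fit_similarity_2d(vo_xy, gt_xy):
    """Least-squares scale, rotation and offset mapping vo_xy onto gt_xy.

    Points are (x, y) tuples. Returns (scale, angle_in_radians, (tx, ty)).
    """
    if len(vo_xy) != len(gt_xy) or len(vo_xy) < 2:
        raise ValueError("need two equally long lists with at least 2 points")
    p = [complex(x, y) for x, y in vo_xy]
    q = [complex(x, y) for x, y in gt_xy]
    n = float(len(p))
    p_mean = sum(p) / n
    q_mean = sum(q) / n
    # With centred points, the optimal complex factor a minimises
    # sum |a * p_i - q_i|^2, giving a = sum(conj(p_i) * q_i) / sum |p_i|^2.
    num = sum((pi - p_mean).conjugate() * (qi - q_mean) for pi, qi in zip(p, q))
    den = sum(abs(pi - p_mean) ** 2 for pi in p)
    if den == 0:
        raise ValueError("VO points are all identical")
    a = num / den
    b = q_mean - a * p_mean
    return abs(a), cmath.phase(a), (b.real, b.imag)


def apply_similarity_2d(points, scale, angle, offset):
    """Apply the fitted transform to a list of (x, y) points."""
    a = cmath.rect(scale, angle)
    b = complex(offset[0], offset[1])
    out = []
    for x, y in points:
        z = a * complex(x, y) + b
        out.append((z.real, z.imag))
    return out
```

The modulus of the fitted complex factor is the scale correction (the quantity a monocular VO system cannot observe), and its phase is the in-plane rotation.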

Mean and stdev pixel statistics across a dataset

This must be calculated before any training. Use scripts/dataset_mean.py, which also saves the information at the proper location. We provide pre-computed values for RobotCar and 7Scenes.
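For reference, per-channel statistics of this kind can be accumulated in a single pass with running sums. The sketch below is illustrative only and makes several assumptions: `channel_mean_std` is a hypothetical name, images are given as plain sequences of (r, g, b) tuples, and the population standard deviation is used. The repository script may compute or store these values differently.

```python
import math


def channel_mean_std(images):
    """Per-channel mean and population stdev over all pixels of all images.

    `images` is an iterable of images, each an iterable of (r, g, b) tuples.
    """
    count = 0
    total = [0.0, 0.0, 0.0]
    total_sq = [0.0, 0.0, 0.0]
    for image in images:
        for pixel in image:
            count += 1
            for c in range(3):
                total[c] += pixel[c]
                total_sq[c] += pixel[c] * pixel[c]
    if count == 0:
        raise ValueError("no pixels given")
    mean = [s / count for s in total]
    # Var[x] = E[x^2] - E[x]^2; clamp at zero to absorb rounding error.
    std = [math.sqrt(max(sq / count - m * m, 0.0))
           for sq, m in zip(total_sq, mean)]
    return mean, std
```

Accumulating sums rather than storing pixels keeps memory use constant, which matters when the statistics cover an entire dataset.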

Calculate pose translation statistics

Calculates the mean and stdev and saves them automatically to appropriate files:

python calc_pose_stats.py --dataset 7Scenes --scene redkitchen

This information is needed to normalize the pose regression targets, so this script must be run before any training. We provide pre-computed values for RobotCar and 7Scenes.
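The role of these statistics can be sketched as follows: translations are shifted by the mean and divided by the stdev before regression, and the inverse is applied to predictions. This is a hedged illustration with hypothetical function names and per-axis statistics assumed; it is not a copy of calc_pose_stats.py or of the dataloaders.

```python
import math


def translation_stats(translations):
    """Per-axis mean and population stdev of a list of (x, y, z) tuples."""
    if not translations:
        raise ValueError("no translations given")
    n = float(len(translations))
    dims = len(translations[0])
    mean = [sum(t[d] for t in translations) / n for d in range(dims)]
    std = [math.sqrt(sum((t[d] - mean[d]) ** 2 for t in translations) / n)
           for d in range(dims)]
    return mean, std


def normalize(t, mean, std):
    """Map a translation to the normalized regression target."""
    # Assumption: an axis with zero spread is left unscaled.
    return [(x - m) / (s if s > 0 else 1.0) for x, m, s in zip(t, mean, std)]


def denormalize(t_norm, mean, std):
    """Invert normalize() to recover a metric translation."""
    return [x * (s if s > 0 else 1.0) + m for x, m, s in zip(t_norm, mean, std)]
```

Because the statistics define the target space, the same values need to be used at training and at evaluation time.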

Plot the ground truth and VO poses for debugging

python plot_vo_poses.py --dataset 7Scenes --scene heads --vo_lib dso --val

To save the output instead of displaying it on screen, add the --output_dir ../results/ flag.

Process RobotCar GPS

The scripts/process_robotcar_gps.py script must be run before using GPS for MapNet++ training. It converts the csv file into a format usable for training.

Demosaic and undistort RobotCar images

This is advisable to do beforehand to speed up training. The scripts/process_robotcar_images.py script will do that and save the output images to a centre_processed directory in the stereo directory. After the script finishes, you must rename this directory to centre so that the dataloader uses these undistorted and demosaiced images.
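The final rename can be scripted. The sketch below is one possible way to do it, not something shipped with the repository: `promote_processed_images` is a hypothetical helper, and moving an existing centre directory aside under the name `centre_raw` is an assumption made here so that nothing is overwritten.

```python
import os


def promote_processed_images(stereo_dir, backup_name="centre_raw"):
    """Rename stereo_dir/centre_processed to stereo_dir/centre.

    An existing centre directory is first moved to stereo_dir/backup_name
    (the backup name is an assumption of this sketch).
    """
    processed = os.path.join(stereo_dir, "centre_processed")
    target = os.path.join(stereo_dir, "centre")
    if not os.path.isdir(processed):
        raise IOError("missing directory: " + processed)
    if os.path.lexists(target):
        backup = os.path.join(stereo_dir, backup_name)
        if os.path.lexists(backup):
            raise IOError("backup already exists: " + backup)
        os.rename(target, backup)
    os.rename(processed, target)
    return target
```

It would be called once per sequence, with `stereo_dir` pointing at that sequence's stereo directory.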


If you find this code useful for your research, please cite our paper

title={Geometry-Aware Learning of Maps for Camera Localization},
author={Samarth Brahmbhatt and Jinwei Gu and Kihwan Kim and James Hays and Jan Kautz},
booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
