Name: geomapnet
Owner: NVIDIA Research Projects
Description: Geometry-Aware Learning of Maps for Camera Localization (CVPR2018)
Created: 2018-04-06 20:15:13.0
Updated: 2018-05-14 18:19:42.0
Pushed: 2018-05-12 03:50:38.0
Homepage: https://goo.gl/mRB3Au
Size: 2542
Language: Python
Copyright © 2018 NVIDIA Corporation. All rights reserved. Licensed under the CC BY-NC-SA 4.0 license (https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode).
This is the PyTorch implementation of our CVPR 2018 paper "Geometry-Aware Learning of Maps for Camera Localization".
MapNet uses a Conda environment that makes it easy to install all dependencies.
Install miniconda with Python 2.7.
Create the mapnet Conda environment: conda env create -f environment.yml.
Activate the environment: conda activate mapnet_release.
We support the 7Scenes and Oxford RobotCar datasets right now. You can also write your own PyTorch dataloader for other datasets and put it in the dataset_loaders directory. Refer to this README file for more details.
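If you do add your own dataset, the new loader only needs to follow the standard PyTorch Dataset interface. Below is a minimal sketch of such a class; the class name, directory layout, and pose file format are illustrative assumptions, not the repository's actual API.

import os
import os.path as osp

import numpy as np
from PIL import Image
from torch.utils import data


class MyScenes(data.Dataset):
    """Hypothetical dataset: one RGB image per frame plus a poses.txt file
    holding a 7-DoF pose (x y z qw qx qy qz) per line."""

    def __init__(self, scene, data_path, train=True, transform=None,
                 target_transform=None):
        split = 'train' if train else 'test'
        base_dir = osp.join(data_path, scene, split)
        img_dir = osp.join(base_dir, 'rgb')
        self.img_fns = sorted(osp.join(img_dir, f) for f in os.listdir(img_dir))
        self.poses = np.loadtxt(osp.join(base_dir, 'poses.txt')).astype(np.float32)
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        return len(self.img_fns)

    def __getitem__(self, index):
        img = Image.open(self.img_fns[index]).convert('RGB')
        pose = self.poses[index]
        if self.transform is not None:
            img = self.transform(img)
        if self.target_transform is not None:
            pose = self.target_transform(pose)
        return img, pose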
The datasets live in the data/deepslam_data directory. We provide skeletons with symlinks to get you started. Let us call your 7Scenes download directory 7SCENES_DIR and your main RobotCar download directory (in which you untar all the downloads from the website) ROBOTCAR_DIR. You will need to make the following symlinks:
cd data/deepslam_data &&
ln -s 7SCENES_DIR 7Scenes &&
ln -s ROBOTCAR_DIR RobotCar_download
Download this fork of the dataset SDK, and run cd scripts && ./make_robotcar_symlinks.sh after editing the ROBOTCAR_SDK_ROOT variable in it appropriately. For each sequence, you need to download the stereo_centre, vo and gps tar files from the dataset website.
The directory for each 'scene' (e.g. full) has .txt files defining the train/test split. While training MapNet++, you must put the sequences for self-supervised learning (dataset T in the paper) in the test_split.txt file. The dataloader for the MapNet++ models will use both images and ground-truth poses from sequences in train_split.txt, and only images from the sequences in test_split.txt.
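As a rough illustration of how these two files drive MapNet++ training, the snippet below reads both splits and returns the sequence lists separately; the file layout is an assumption for illustration, not the repository's exact loader code.

import os.path as osp


def read_split(scene_dir, split_file):
    # one sequence name per line
    with open(osp.join(scene_dir, split_file)) as f:
        return [line.strip() for line in f if line.strip()]


def build_mapnetpp_sequence_lists(scene_dir):
    labeled = read_split(scene_dir, 'train_split.txt')    # images + ground-truth poses
    unlabeled = read_split(scene_dir, 'test_split.txt')   # images only (dataset T)
    return labeled, unlabeled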
To make training faster, we pre-processed the images using scripts/process_robotcar_images.py. This script undistorts the images using the camera models provided by the dataset, and scales them such that the shortest side is 256 pixels.
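For reference, the rescaling step alone can be sketched as below; the undistortion step needs the camera models from the RobotCar SDK and is omitted here, so this is only an illustration of the resizing, not a replacement for scripts/process_robotcar_images.py.

from PIL import Image


def scale_shortest_side(img, target=256):
    # scale so that min(width, height) == target, preserving aspect ratio
    w, h = img.size
    s = float(target) / min(w, h)
    return img.resize((int(round(w * s)), int(round(h * s))), Image.BILINEAR)

# usage (hypothetical file name):
# img = scale_shortest_side(Image.open('frame.png'))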
The trained models for all experiments presented in the paper can be downloaded here. The inference script is scripts/eval.py. Here are some examples, assuming the models are downloaded in scripts/logs. Please go to the scripts folder to run the commands.
MapNet++ with pose-graph optimization (i.e., MapNet+PGO) on heads:
python eval.py --dataset 7Scenes --scene heads --model mapnet++ \
--weights logs/7Scenes_heads_mapnet++_mapnet++_7Scenes/epoch_005.pth.tar \
--config_file configs/pgo_inference_7Scenes.ini --val --pose_graph
Median error in translation = 0.12 m
Median error in rotation = 8.46 degrees
For evaluating on the train split, remove the --val flag.
To save the results to disk without showing them on screen (useful for scripts), add the --output_dir ../results/ flag.
See this README file for more information on hyper-parameters and which config files to use.
MapNet++ on heads:
python eval.py --dataset 7Scenes --scene heads --model mapnet++ \
--weights logs/7Scenes_heads_mapnet++_mapnet++_7Scenes/epoch_005.pth.tar \
--config_file configs/mapnet.ini --val
Median error in translation = 0.13 m
Median error in rotation = 11.13 degrees
MapNet on heads:
python eval.py --dataset 7Scenes --scene heads --model mapnet \
--weights logs/7Scenes_heads_mapnet_mapnet_learn_beta_learn_gamma/epoch_250.pth.tar \
--config_file configs/mapnet.ini --val
Median error in translation = 0.18 m
Median error in rotation = 13.33 degrees
PoseNet (CVPR2017) on heads:
python eval.py --dataset 7Scenes --scene heads --model posenet \
--weights logs/7Scenes_heads_posenet_posenet_learn_beta_logq/epoch_300.pth.tar \
--config_file configs/posenet.ini --val
Median error in translation = 0.19 m
Median error in rotation = 12.15 degrees
MapNet++ with pose-graph optimization on loop:
python eval.py --dataset RobotCar --scene loop --model mapnet++ \
--weights logs/RobotCar_loop_mapnet++_mapnet++_RobotCar_learn_beta_learn_gamma_2seq/epoch_005.pth.tar \
--config_file configs/pgo_inference_RobotCar.ini --val --pose_graph
Mean error in translation = 6.74 m
Mean error in rotation = 2.23 degrees
MapNet++ on loop:
python eval.py --dataset RobotCar --scene loop --model mapnet++ \
--weights logs/RobotCar_loop_mapnet++_mapnet++_RobotCar_learn_beta_learn_gamma_2seq/epoch_005.pth.tar \
--config_file configs/mapnet.ini --val
Mean error in translation = 6.95 m
Mean error in rotation = 2.38 degrees
MapNet on loop:
python eval.py --dataset RobotCar --scene loop --model mapnet \
--weights logs/RobotCar_loop_mapnet_mapnet_learn_beta_learn_gamma/epoch_300.pth.tar \
--config_file configs/mapnet.ini --val
Mean error in translation = 9.84 m
Mean error in rotation = 3.96 degrees
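For reference, the translation and rotation errors quoted above can be computed from predicted and ground-truth poses as in the sketch below (positions plus unit quaternions); this is the standard formulation, not necessarily the exact code path inside eval.py.

import numpy as np


def translation_error(t_pred, t_gt):
    # Euclidean distance between predicted and ground-truth positions
    return np.linalg.norm(np.asarray(t_pred) - np.asarray(t_gt))


def rotation_error_deg(q_pred, q_gt):
    # angle of the relative rotation between two unit quaternions
    d = abs(np.dot(np.asarray(q_pred), np.asarray(q_gt)))
    return 2.0 * np.degrees(np.arccos(min(1.0, d)))

# median over a trajectory, as reported for the 7Scenes examples above:
# t_errs = [translation_error(tp, tg) for tp, tg in zip(pred_t, gt_t)]
# print(np.median(t_errs))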
The executable script is scripts/train.py. Please go to the scripts folder to run these commands. For example:
PoseNet on chess from 7Scenes:
python train.py --dataset 7Scenes --scene chess --config_file configs/posenet.ini --model posenet --device 0 --learn_beta --learn_gamma
MapNet on chess from 7Scenes:
python train.py --dataset 7Scenes --scene chess --config_file configs/mapnet.ini --model mapnet --device 0 --learn_beta --learn_gamma
MapNet++ is finetuned on top of a trained MapNet model:
python train.py --dataset 7Scenes --checkpoint <trained_mapnet_model.pth.tar> --scene chess --config_file configs/mapnet++_7Scenes.ini --model mapnet++ --device 0 --learn_beta --learn_gamma
For example, we can train a MapNet++ model on heads from a pretrained MapNet model:
python train.py --dataset 7Scenes \
--checkpoint logs/7Scenes_heads_mapnet_mapnet_learn_beta_learn_gamma/epoch_250.pth.tar \
--scene heads --config_file configs/mapnet++_7Scenes.ini --model mapnet++ \
--device 0 --learn_beta --learn_gamma
For MapNet++ training, you will need visual odometry (VO) data (or other sensory inputs such as noisy GPS measurements). For 7Scenes, we provide the preprocessed VO computed with the DSO method. For RobotCar, we use the provided stereo_vo. If you plan to use your own VO data (especially from a monocular camera) for MapNet++ training, you will need to first align the VO with the world coordinate frame (for rotation and scale). Please refer to the "Align VO" section below for more detailed instructions.
The meanings of various command-line parameters are documented in scripts/train.py. The values of various hyperparameters are defined in a separate .ini file. We provide some examples in the scripts/configs directory, along with a README file explaining some hyperparameters.
If you have visdom = yes in the config file, you will need to start a Visdom server for logging the training progress: python -m visdom.server -env_path=scripts/logs/
The scripts/plot_activations.py script calculates the network attention visualizations and saves them in a video. For example, for chess in 7Scenes:
python plot_activations.py --dataset 7Scenes --scene chess \
--weights <filename.pth.tar> --device 1 --val --config_file configs/mapnet.ini \
--output_dir ../results/
Check here for an example video of computed network attention of PoseNet vs. MapNet++.
Align VO: this has to be done before using VO in MapNet++ training. The executable script is scripts/align_vo_poses.py.
For the first sequence from chess in 7Scenes: python align_vo_poses.py --dataset 7Scenes --scene chess --seq 1 --vo_lib dso.
Note that alignment for 7Scenes needs to be done separately for each sequence, so the --seq flag is needed.
For all of 7Scenes, you can also use the script align_vo_poses_7scenes.sh.
The script stores the information at the proper location in data.
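For intuition, aligning a monocular VO trajectory to the world frame amounts to estimating a similarity transform (rotation, scale, and translation) from corresponding positions. The sketch below uses an Umeyama-style closed-form solution and is only illustrative; it is not scripts/align_vo_poses.py itself.

import numpy as np


def align_similarity(vo_xyz, gt_xyz):
    """vo_xyz, gt_xyz: (N, 3) arrays of corresponding positions.
    Returns scale s, rotation R, translation t with gt ~= s * R.dot(vo) + t."""
    mu_v, mu_g = vo_xyz.mean(axis=0), gt_xyz.mean(axis=0)
    Xv, Xg = vo_xyz - mu_v, gt_xyz - mu_g
    U, S, Vt = np.linalg.svd(Xg.T.dot(Xv) / len(vo_xyz))
    D = np.eye(3)
    if np.linalg.det(U.dot(Vt)) < 0:
        D[2, 2] = -1.0   # avoid a reflection
    R = U.dot(D).dot(Vt)
    s = np.trace(np.diag(S).dot(D)) / Xv.var(axis=0).sum()
    t = mu_g - s * R.dot(mu_v)
    return s, R, t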
The mean and stdev pixel statistics must be calculated before any training. Use scripts/dataset_mean.py, which also saves the information at the proper location. We provide pre-computed values for RobotCar and 7Scenes.
The scripts/calc_pose_stats.py script calculates the pose statistics (mean and stdev) and saves them automatically to the appropriate files: python calc_pose_stats.py --dataset 7Scenes --scene redkitchen. This information is needed to normalize the pose regression targets, so the script must be run before any training. We provide pre-computed values for RobotCar and 7Scenes.
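As an illustration of what these pose statistics are used for, the sketch below computes the mean and stdev of the translation components and normalizes a regression target with them; the variable names and assumed pose layout are illustrative, not the script's exact code.

import numpy as np


def pose_translation_stats(poses):
    # poses: (N, 7) array, columns assumed to be x y z followed by a quaternion
    t = poses[:, :3]
    return t.mean(axis=0), t.std(axis=0)


def normalize_translation(t, mean_t, std_t):
    # zero-mean, unit-stdev translation target for regression
    return (t - mean_t) / std_t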
To plot the VO poses for a scene (e.g. heads in 7Scenes), run python plot_vo_poses.py --dataset 7Scenes --scene heads --vo_lib dso --val. To save the output instead of displaying it on screen, add the --output_dir ../results/ flag.
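If you want a quick custom plot outside that script, a trajectory can be drawn in the x-y plane as in the sketch below; the input files are hypothetical arrays of positions, one x y z triple per row.

import numpy as np
import matplotlib.pyplot as plt

vo_xyz = np.loadtxt('vo_positions.txt')   # hypothetical file of x y z rows
gt_xyz = np.loadtxt('gt_positions.txt')   # hypothetical file of x y z rows

plt.plot(gt_xyz[:, 0], gt_xyz[:, 1], label='ground truth')
plt.plot(vo_xyz[:, 0], vo_xyz[:, 1], label='VO')
plt.axis('equal')
plt.legend()
plt.show()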
The scripts/process_robotcar_gps.py script must be run before using GPS for MapNet++ training. It converts the csv file into a format usable for training.
Undistorting and demosaicing the RobotCar images beforehand is advisable to speed up training. The scripts/process_robotcar_images.py script will do that and save the output images to a centre_processed directory inside the stereo directory. After the script finishes, you must rename this directory to centre so that the dataloader uses these undistorted and demosaiced images.
If you find this code useful for your research, please cite our paper:
@inproceedings{mapnet2018,
  title={Geometry-Aware Learning of Maps for Camera Localization},
  author={Samarth Brahmbhatt and Jinwei Gu and Kihwan Kim and James Hays and Jan Kautz},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2018}
}