Tom Drummond

I am the Melbourne Connect Chair of Digital Innovation for Society
at the University of Melbourne.


Research Topics:

Adversarial Pulmonary Pathology Translation for Pairwise Chest X-Ray Data Augmentation (with Yunyan Xing, Zongyuan Ge, Rui Zeng, Dwarikanath Mahapatra, Jarrel Seah and Meng Law)

This paper shows how to use Generative Adversarial Networks (GANs) to augment data for classes that are rare in the dataset.  The expanded dataset can then be used to train a classifier that outperforms a classifier trained only on the original real images.

[MICCAI 2019 paper]

Parallel Optimal Transport GAN (with Gil Avraham and Yan Zuo)

This paper shows how to address the diversity problem in Generative Adversarial Networks by introducing a loss that explicitly pulls the generated distribution towards the real distribution in a low-dimensional latent space.

[CVPR 2019 paper]

Look no deeper: Recognizing places from opposing viewpoints under varying scene appearance using single-view depth estimation (with Sourav Garg, Madhu Babu, Thanuja Dharmasiri, Stephen Hausler, Niko Suenderhauf, Swagat Kumar and Michael Milford)

This paper presents an approach to the hard problem of finding correspondences and localising under extreme (180°) viewpoint variations.  Depth estimation is used to keep only the key points that are likely to be visible from the opposing view, and a robust descriptor is learned to aid matching.

[ICRA 2019 paper]

The Importance of Metric Learning for Robotic Vision: Open Set Recognition and Active Learning (with Ben Meyer)

This paper shows how to use metric learning for active learning.  In a metric space, examples of novel classes typically map to empty parts of the space.  This can be detected automatically using the local ratio of unlabelled to labelled densities to select examples for active labelling.
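The density-ratio idea can be illustrated with a minimal sketch. It assumes embeddings are already available as coordinate tuples and approximates local density by counting neighbours inside a fixed radius; the function names, the radius and the +1 smoothing term are illustrative choices, not details taken from the paper.

```python
import math


def local_density_ratio(query, labelled, unlabelled, radius):
    """Ratio of unlabelled to labelled neighbour counts within `radius` of `query`.

    Counting neighbours in a fixed-radius ball is a simple stand-in for a
    density estimate (an assumption made for this sketch).
    """
    n_labelled = sum(1 for p in labelled if math.dist(query, p) <= radius)
    n_unlabelled = sum(1 for p in unlabelled if math.dist(query, p) <= radius)
    # +1 avoids division by zero where no labelled example is nearby.
    return n_unlabelled / (n_labelled + 1)


def select_for_labelling(labelled, unlabelled, radius, budget):
    """Return the `budget` unlabelled points with the highest density ratio."""
    ranked = sorted(
        unlabelled,
        key=lambda q: local_density_ratio(q, labelled, unlabelled, radius),
        reverse=True,
    )
    return ranked[:budget]
```

Points that sit in a region crowded with unlabelled examples but empty of labelled ones score highest, which is where examples of a novel class would be expected to fall.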

[ICRA 2019 paper]

Real-time joint semantic segmentation and depth estimation using asymmetric annotations (with Vladimir Nekrasov, Thanuja Dharmasiri, Andrew Spek, Chunhua Shen and Ian Reid)

This paper shows the benefits of simultaneously estimating semantics and depth for a monocular image input stream.  The resulting network produces both estimates in 13 ms per frame, enabling it to be used in real-time systems.

[ICRA 2019 paper]

Learning factorized representations for open-set domain adaptation (with Mahsa Baktashmotlagh, Masoud Faraki, and Mathieu Salzmann)

This paper describes how to learn representations that transfer well between domains, while acknowledging that the target domain may have classes of data that are not present in the source domain.

[ICLR 2019 paper]

ENG: End-to-end neural geometry for robust depth and pose estimation using CNNs (with Thanuja Dharmasiri and Andrew Spek)

This paper shows how to compute camera motion using networks to estimate depth per frame and optical flow between frames with uncertainty.  These estimates and uncertainties are then combined using conventional optimisation to obtain motion.
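As a much simplified illustration of combining estimates with their uncertainties, the sketch below fuses several scalar measurements of the same quantity by inverse-variance weighting, which is the least-squares solution when the measurement errors are independent and Gaussian. The scalar setting and the function name are assumptions made for this example; the paper's optimisation over full camera motion is more involved.

```python
def fuse_measurements(values, variances):
    """Inverse-variance weighted mean of independent scalar measurements.

    Returns the fused estimate and its variance. Measurements with a
    smaller variance (more certainty) receive a larger weight.
    """
    if len(values) != len(variances) or not values:
        raise ValueError("need one positive variance per measurement")
    if any(v <= 0.0 for v in variances):
        raise ValueError("variances must be positive")
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    estimate = sum(w * x for w, x in zip(weights, values)) / total_weight
    return estimate, 1.0 / total_weight
```

The fused variance is never larger than the smallest input variance, which is why carrying uncertainty through to the optimisation step can pay off.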

[ACCV 2018 paper]

Traversing latent space using decision ferns (with Yan Zuo and Gil Avraham)

This paper shows how to use a controller (based on decision ferns) to impose structure on the latent space of an autoencoder.

[ACCV 2018 paper] 

Approximate Fisher Information Matrix to Characterize the Training of Deep Neural Networks (with Zhibin Liao, Ian Reid and Gustavo Carneiro)

This paper looks at how to measure the condition number of the training problem for deep networks, how this is affected by batch size and learning rate, and how this characterises performance in terms of convergence and generalisation.

[PAMI 2018 paper]

Deep metric learning and image classification with nearest neighbour Gaussian kernels (with Ben Meyer and Ben Harwood)

This paper shows how embedding a Gaussian kernel density classifier in latent space can be used to learn metric space representations of images.  The learned representation transfers well to novel classes, providing good clustering performance on this unseen data.
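A Gaussian kernel density classifier over embeddings can be sketched as follows. This is a minimal illustration that assumes fixed, precomputed embeddings and sums kernels over every stored example; restricting the sum to nearest neighbours, as the title suggests, and training the embedding itself are left out. The function name and the `sigma` bandwidth parameter are illustrative.

```python
import math


def class_probabilities(query, embeddings, labels, sigma):
    """Class probabilities for `query` from Gaussian kernels centred on stored embeddings.

    Each stored embedding votes for its own label with weight
    exp(-d^2 / (2 * sigma^2)), where d is its Euclidean distance to the query.
    """
    scores = {}
    for embedding, label in zip(embeddings, labels):
        squared_distance = sum((q - e) ** 2 for q, e in zip(query, embedding))
        kernel = math.exp(-squared_distance / (2.0 * sigma ** 2))
        scores[label] = scores.get(label, 0.0) + kernel
    total = sum(scores.values())
    if total == 0.0:
        # Every kernel underflowed: fall back to a uniform distribution.
        return {label: 1.0 / len(scores) for label in scores}
    return {label: score / total for label, score in scores.items()}
```

Because the probabilities depend only on distances in the embedding space, a classifier of this kind rewards embeddings in which examples of the same class lie close together.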

[ICIP 2018 paper]

CReaM: Condensed real-time models for depth prediction using convolutional neural networks (with Andrew Spek and Thanuja Dharmasiri)

This paper shows how to use a complex model to train a simpler one by applying a loss to cause the embeddings in latent space to converge.  This accelerates depth estimation so that it can run at 30 frames per second on a TX2 for use in a VO/SLAM pipeline.
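One simple form such an embedding-matching loss could take is the mean squared difference between the two models' latent vectors for the same input. The sketch below assumes that form and plain Python lists as embeddings; the exact loss used in the paper may differ.

```python
def embedding_matching_loss(teacher_embedding, student_embedding):
    """Mean squared difference between teacher and student latent vectors.

    Minimising this over the student's parameters pulls its embedding
    towards the (fixed) teacher embedding for the same input.
    """
    if len(teacher_embedding) != len(student_embedding) or not teacher_embedding:
        raise ValueError("embeddings must be non-empty and of equal length")
    squared_errors = [
        (t - s) ** 2 for t, s in zip(teacher_embedding, student_embedding)
    ]
    return sum(squared_errors) / len(squared_errors)
```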

[IROS 2018 paper] 

Efficient Subpixel Refinement with Symbolic Linear Predictors (with Vincent Lui, Jonathon Geeves and Winston Yii)

This paper introduces a method for very rapid sub-pixel refinement of correspondences.  The method precomputes an update matrix that is applied to the pixel intensity errors.  The update matrix is itself computed from a quadratic and linear form of the pixels in the image patch.  Finally, the method also provides an estimate of the precision that will be obtained from a given reference patch.
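The general pattern of a precomputed update applied to intensity errors can be illustrated in one dimension. The sketch below swaps in a standard least-squares (Lucas-Kanade-style) update built from central-difference gradients of the reference patch, not the symbolic predictor described in the paper; all function names are illustrative. The same precomputed quantity also gives a predicted precision for the patch under an assumed level of pixel noise.

```python
import math


def build_update_row(reference):
    """Precompute the 1-D update row from a reference patch.

    Uses central-difference gradients on the interior pixels. Returns the
    row (one weight per interior pixel) and the gradient energy sum(g^2).
    """
    gradients = [
        (reference[i + 1] - reference[i - 1]) / 2.0
        for i in range(1, len(reference) - 1)
    ]
    energy = sum(g * g for g in gradients)
    if energy == 0.0:
        raise ValueError("reference patch has no intensity gradient")
    return [g / energy for g in gradients], energy


def refine_shift(update_row, reference, current):
    """Estimate the sub-pixel shift as update_row . (current - reference)."""
    errors = [current[i] - reference[i] for i in range(1, len(reference) - 1)]
    return sum(a * e for a, e in zip(update_row, errors))


def predicted_shift_std(energy, pixel_noise_std):
    """Standard deviation of the shift estimate under independent pixel noise."""
    return pixel_noise_std / math.sqrt(energy)
```

At run time only the dot product in `refine_shift` is needed per correspondence, which is what makes a precomputed update fast; patches with little gradient energy are predicted to give imprecise shifts.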

[CVPR 2018 paper]