Tom Drummond

I am a professor in the Department of Electrical and
Computer Systems Engineering at Monash University.

email: tom.drummond@monash.edu

Algorithmic methodologies for FPGA-based vision (with Yoong Kang Lim and Lindsay Kleeman)


This paper presents a methodology for developing computer vision algorithms for FPGAs and shows how the methodology can be implemented in two case studies.

[2012 Machine Vision and Applications paper]

Distributed visual processing for augmented reality (with Winston Yii and Wai Ho Li)



This paper presents a system which combines smartphones with networked infrastructure and fixed sensors and shows how these elements can be combined to deliver real-time augmented reality.  We use a Kinect to generate dynamic trackable models of the environment as it changes at video frame rate.

[ISMAR 2012 paper]

Corner matching refinement for monocular pose estimation (with Dinesh Gamage)

Many tasks in computer vision rely on accurate detection and matching of visual landmarks (e.g. image corners) between two images.  This paper presents a method for refining the coordinates of correspondences directly.  Thus, given some coordinates in the first image, our goal is to maximise the accuracy of the estimated coordinates in the second image that correspond to the same real-world point, without being too concerned about which real-world point is being matched.


[Draft BMVC 2012 paper]

Transformative Reality: Improving bionic vision with robotic sensing (with Dennis Lui, Damien Browne, Lindsay Kleeman and Wai Ho Li)

Implanted visual prostheses provide bionic vision with very low spatial and intensity resolution when compared against healthy vision. Vision processing can make better use of the limited resolution by highlighting salient features such as edges. In this paper, we show how Transformative Reality extends and improves upon traditional vision processing in three ways.


[EMBC 2012 paper]

Robust egomotion estimation using ICP in inverse depth coordinates (with Dennis Lui, Titus Tang and Wai Ho Li)


This paper presents a six-degree-of-freedom egomotion estimation method using Iterative Closest Point (ICP) for low-cost, low-accuracy range cameras. Instead of Euclidean coordinates, the method uses inverse depth coordinates, which better conform to the error characteristics of raw sensor data. Extensive experiments were performed to evaluate different combinations of error metrics and parameters. The result is a real-time system that is accurate and robust across a variety of motion trajectories.
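
As a rough illustration of why inverse depth can suit range-camera data, the sketch below maps a Euclidean point to one common inverse depth parameterisation, (x/z, y/z, 1/z). The parameterisation, the function names and the noise model (uniform noise on 1/z) are assumptions made for this example and are not taken from the paper.

```python
def to_inverse_depth(point):
    """Map a Euclidean point (x, y, z) with z > 0 to (x/z, y/z, 1/z).

    Assumed parameterisation for illustration only.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("depth must be positive")
    q = 1.0 / z
    return (x * q, y * q, q)


def from_inverse_depth(coords):
    """Invert to_inverse_depth: (u, v, q) -> (u/q, v/q, 1/q)."""
    u, v, q = coords
    if q <= 0:
        raise ValueError("inverse depth must be positive")
    z = 1.0 / q
    return (u * z, v * z, z)


def depth_error_from_inverse_depth_noise(z, sigma_q):
    """First-order depth error caused by noise sigma_q on q = 1/z.

    Since z = 1/q, |dz/dq| = z**2, so a fixed sigma_q produces a depth
    error that grows quadratically with distance.
    """
    return z * z * sigma_q
```

Under this assumed noise model, an error that is constant in inverse depth becomes sixteen times larger in Euclidean depth at 4 m than at 1 m, which is one reason to measure alignment error in inverse depth rather than in metres.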


Rapidly constructed appearance models for tracking in augmented reality applications (with Jeremiah Neubert and John Pretlove)

This paper shows how a user can rapidly construct a 3D model that can be used for visual tracking.  The system uses point features for initialization and edge features for tracking.


[2011 Machine Vision and Applications paper]

Visual localisation of a robot with an external RGBD sensor (with Winston Yii, Nalika Damayanthi and Wai Ho Li)

This paper presents a novel approach to visual localisation that uses a camera on the robot coupled wirelessly to an external RGB-D sensor. Unlike systems where an external sensor observes the robot, our approach merely assumes that the robot's camera and the external sensor share a portion of their field of view. Experiments were performed using a Microsoft Kinect as the external sensor and a small mobile robot. The robot carries a smartphone, which acts as its camera, sensor processor, control platform and wireless link. Computational effort is distributed between the smartphone and a host PC connected to the Kinect. Experimental results show that the approach is accurate and robust in dynamic environments with substantial object movement and occlusions.  This work won the best student paper prize at ACRA 2011.
[ACRA 2011 paper]

eBug - an open robotics platform for teaching and research (with Nick D'Ademo, Dennis Lui, Wai Ho Li and Ahmet Sekercioglu)


The eBug is a low-cost and open robotics platform designed for undergraduate teaching and academic research in areas such as multimedia smart sensor networks, distributed control, mobile wireless communication algorithms and swarm robotics. The platform is easy to use, modular and extensible.


[ACRA 2011 paper]





Transformative Reality: Augmented reality for visual prostheses (with Dennis Lui, Damien Browne, Lindsay Kleeman and Wai Ho Li)

Visual prostheses such as retinal implants provide bionic vision that is limited in spatial and intensity resolution. This limitation is a fundamental challenge of bionic vision as it severely truncates salient visual information. We propose to address this challenge by performing real time transformations of visual and non-visual sensor data into symbolic representations that are then rendered as low resolution vision; a concept we call Transformative Reality.
[ISMAR 2011 paper]
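
To give a feel for what "limited in spatial and intensity resolution" means, here is a minimal sketch that block-averages a greyscale image and quantises it to a handful of intensity levels. It only illustrates the limitation; it is not the Transformative Reality pipeline, and the function name, block size and level count are hypothetical.

```python
def render_low_resolution(image, block, levels):
    """Reduce a greyscale image to coarse spatial and intensity resolution.

    image  : list of equal-length rows with values in 0..255
    block  : side length of the square region averaged into one output cell
    levels : number of distinct output intensities (indices 0..levels-1)

    Rows and columns that do not fill a whole block are discarded.
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(0, rows - block + 1, block):
        out_row = []
        for c in range(0, cols - block + 1, block):
            total = sum(
                image[r + dr][c + dc]
                for dr in range(block)
                for dc in range(block)
            )
            mean = total / (block * block)
            # Map the mean intensity (0..255) to a level index.
            out_row.append(min(levels - 1, int(mean * levels / 256)))
        out.append(out_row)
    return out
```

With two levels and large blocks, fine detail disappears entirely, which is the kind of information loss the paper sets out to address.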

Rapid scene reconstruction on mobile phones from panoramic images (with Qi Pan, Clemens Arth, Gerhard Reitmayr and Ed Rosten)

This work presents a novel system that allows for the generation of a coarse 3D model of the environment within several seconds on mobile smartphones. Using a very fast and flexible algorithm, a set of panoramic images is captured to form the basis of the wide field-of-view images required for reliable and robust reconstruction. A cheap on-line space carving approach based on Delaunay triangulation is employed to obtain dense, polygonal, textured representations. The intuitive method used to capture these images, together with the efficiency of the reconstruction approach, allows the system to run on recent mobile phone hardware, giving visually pleasing results almost instantly.
[ISMAR 2011 paper]

Deterministic sample consensus with multiple match hypotheses (with Paul McIlroy, Ed Rosten and Simon Taylor)

This paper proposes a deterministic scheme for selecting correspondences from feature matching to generate motion hypotheses. The method combines matching scores, ambiguity and the past performance of motion hypotheses generated by the matches to estimate the probability that a feature match is correct. At every stage the best correspondences are chosen to generate a hypothesis. The method will therefore only spend time on poor or ambiguous matches when the best correspondences have proven themselves to be unsuitable. The result is a system that is able to operate efficiently on ambiguous data and is suitable for implementation on devices with limited computing resources.
[2010 BMVC Paper]
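
The best-first idea described above can be sketched on a toy problem. The example below estimates a 1D translation, always generates the next hypothesis from the match that currently looks most likely to be correct, and lowers that match's estimate when its hypothesis performs poorly. The motion model, the multiplicative penalty and all names are simplifying assumptions for illustration; the paper's actual probability model is not reproduced here.

```python
def deterministic_consensus(matches, tol=0.5, min_inliers=3,
                            penalty=0.5, max_iters=20):
    """Toy best-first consensus for a 1D translation model.

    matches: list of (src, dst, prior), where prior is an initial
             estimate that the match is correct.
    Returns (translation, inlier_count) for the best hypothesis found.
    """
    if not matches:
        return None, 0
    probs = [prior for _, _, prior in matches]
    best_t, best_count = None, 0
    for _ in range(max_iters):
        # Deterministic choice: the currently most probable match
        # (ties resolve to the lowest index, so no randomness is involved).
        i = max(range(len(matches)), key=lambda k: probs[k])
        src, dst, _ = matches[i]
        t = dst - src
        count = sum(1 for s, d, _ in matches if abs((d - s) - t) <= tol)
        if count > best_count:
            best_t, best_count = t, count
        if count >= min_inliers:
            break
        # The hypothesis performed poorly: trust its source match less.
        probs[i] *= penalty
    return best_t, best_count
```

For example, with matches `[(0, 10, 0.9), (1, 3, 0.8), (2, 4, 0.7), (5, 7, 0.6)]` the first hypothesis comes from the highest-scoring but wrong match and finds only itself as an inlier; its estimate is halved, and the second hypothesis (translation 2) is accepted with three inliers.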

Handheld augmented reality (with Simon Taylor and Connell Gauld)

Simon and Connell have launched their mobile Augmented Reality software, called Popcode. It is now available for iOS devices as well as Android.


Popcode is now called Zappar.

ProFORMA: Probabilistic Feature-based On-line Rapid Model Acquisition (with Qi Pan and Gerhard Reitmayr)

The generation of 3D models of real objects is very useful for many computer vision applications. This paper introduces ProFORMA, a system designed to enable on-line reconstruction of textured 3D objects rotated by a user's hand. Partial models are created very rapidly and displayed to the user to aid view planning, and are also used by the system to robustly track the object pose. The system works by calculating the Delaunay tetrahedralisation of a point cloud obtained from on-line structure-from-motion estimation; the tetrahedralisation is then carved using a recursive probabilistic algorithm to rapidly obtain the surface mesh. This work won the Best Demo prize at ISMAR 2009.

[2009 BMVC Paper]

Reconstruction from uncalibrated affine silhouettes (with Paul McIlroy)

This work addresses the problem of model building from multiple affine silhouette views of an object in an uncontrolled environment, such as an aircraft in flight. Each pair of silhouette views provides two outer epipolar tangency constraints on the relative motion between the cameras. For a scaled orthographic camera model with six degrees of freedom, we show that it is possible to recover structure and motion from six or more silhouette views by solving the outer epipolar tangency constraints simultaneously. This work won the Best Student Poster award at BMVC 2009.

[2009 BMVC Paper]

High speed feature matching (with Simon Taylor and Ed Rosten)

This work presents a novel local feature matching method designed with a focus on runtime speed. This enables frame-rate localisation of known targets on low-powered devices such as mobile phones. This work won the Best Demo prize at CVPR 2009.


[2009 CVPR Workshop on Feature Detectors and Descriptors Paper]
[2009 BMVC Paper]
[Video showing operation]
[Video showing target with few features]
[Video showing multiple targets]