Deep learning based stereo matching on a small dataset

dc.contributor.advisor Sun, Changming
dc.contributor.advisor Sowmya, Arcot
dc.contributor.author Wu, Rongcheng
dc.date.accessioned 2022-02-09T02:12:21Z
dc.date.available 2022-02-09T02:12:21Z
dc.date.issued 2021
dc.description.abstract Deep learning (DL) has been used in many computer vision tasks, including stereo matching. However, DL is data-hungry, and acquiring a large number of highly accurate real-world training images for stereo matching is too expensive in practice. Most studies rely on large simulated datasets during training, which inevitably causes domain shift problems that are commonly compensated for by fine-tuning. This work proposes a recursive 3D convolutional neural network (CNN) that improves the accuracy of DL-based stereo matching and is suitable for real-world scenarios where only a small set of images is available, without using a large simulated dataset and without fine-tuning. In addition, we propose a novel scale-invariant feature transform (SIFT) based adaptive window for matching cost computation, a crucial step in the stereo matching pipeline, to enhance accuracy. Extensive end-to-end comparative experiments demonstrate the superiority of the proposed recursive 3D CNN and SIFT-based adaptive windows. The method generalizes effectively, as shown by training solely on the indoor Middlebury Stereo 2014 dataset and validating on the outdoor KITTI 2012 and KITTI 2015 datasets. For comparison, our bad-4.0 error is 24.2, which is on par with the AANet (CVPR 2020) method according to the public evaluation report of the Middlebury Stereo Evaluation Benchmark.
dc.language English
dc.language.iso en
dc.publisher UNSW, Sydney
dc.rights CC BY 4.0
dc.subject.other stereo matching
dc.subject.other deep learning
dc.title Deep learning based stereo matching on a small dataset
dc.type Thesis
dcterms.accessRights open access
dcterms.rightsHolder Wu, Rongcheng
dspace.entity.type Publication
unsw.relation.faculty Engineering
unsw.relation.school School of Computer Science and Engineering
unsw.subject.fieldofresearchcode 4603 Computer vision and multimedia computation
unsw.thesis.degreetype Masters Thesis