Publication:
Deep learning based stereo matching on a small dataset

dc.contributor.advisor Sun, Changming
dc.contributor.advisor Sowmya, Arcot
dc.contributor.author Wu, Rongcheng
dc.date.accessioned 2022-02-09T02:12:21Z
dc.date.available 2022-02-09T02:12:21Z
dc.date.issued 2021
dc.description.abstract Deep learning (DL) has been used in many computer vision tasks, including stereo matching. However, DL is data hungry, and large numbers of highly accurate real-world training images for stereo matching are too expensive to acquire in practice. Most studies rely on large simulated datasets during training, which inevitably causes domain shift problems that are commonly compensated for by fine-tuning. This work proposes a recursive 3D convolutional neural network (CNN) that improves the accuracy of DL-based stereo matching and is suitable for real-world scenarios where only a small set of images is available, without requiring a large simulated dataset or fine-tuning. In addition, we propose a novel scale-invariant feature transform (SIFT) based adaptive window to enhance the accuracy of matching cost computation, which is a crucial step in the stereo matching pipeline. Extensive end-to-end comparative experiments demonstrate the superiority of the proposed recursive 3D CNN and SIFT-based adaptive windows. Our work generalizes effectively, as corroborated by training solely on the indoor Middlebury Stereo 2014 dataset and validating on the outdoor KITTI 2012 and KITTI 2015 datasets. For comparison, our bad-4.0 error is 24.2, which is on par with the AANet (CVPR 2020) method according to the publicly available evaluation report of the Middlebury Stereo Evaluation Benchmark.
dc.identifier.uri http://hdl.handle.net/1959.4/100072
dc.language English
dc.language.iso en
dc.publisher UNSW, Sydney
dc.rights CC BY 4.0
dc.rights.uri https://creativecommons.org/licenses/by/4.0/
dc.subject.other stereo matching
dc.subject.other deep learning
dc.title Deep learning based stereo matching on a small dataset
dc.type Thesis
dcterms.accessRights open access
dcterms.rightsHolder Wu, Rongcheng
dspace.entity.type Publication
unsw.accessRights.uri https://purl.org/coar/access_right/c_abf2
unsw.identifier.doi https://doi.org/10.26190/unsworks/1982
unsw.relation.faculty Engineering
unsw.relation.school School of Computer Science and Engineering
unsw.subject.fieldofresearchcode 4603 Computer vision and multimedia computation
unsw.thesis.degreetype Masters Thesis
Files
Original bundle
Name: public version.pdf
Size: 13.86 MB
Format: application/pdf