Engineering

Publication Search Results

  • (2018) Chapre, Yogita Gunwant
    Thesis
    Indoor localization traditionally uses fingerprinting approaches based on Received Signal Strength (RSS), where RSS plays a crucial role in determining the nature and characteristics of the location fingerprints stored in a radio map. The RSS is a function of the distance between transmitter and receiver, which can vary due to in-path interference. This thesis identifies the factors affecting the RSS in indoor localization, discusses the effect of the identified factors, such as spatial, temporal, environmental, hardware and human-presence factors, on the RSS through extensive measurements in a typical IEEE 802.11a/g/n network, and demonstrates the reliability of RSS-based location fingerprints for indoor localization using statistical analysis of the measured data. This thesis presents a novel Wi-Fi fingerprinting system, CSI-MIMO, which uses fine-grained information known as Channel State Information (CSI). CSI-MIMO exploits frequency diversity and spatial diversity in an Orthogonal Frequency Division Multiplexing (OFDM) system with Multiple Input Multiple Output (MIMO). CSI-MIMO uses either the magnitude of the CSI or a complex CSI location signature, depending on mobility in the indoor environment. The performance of CSI-MIMO is compared to the Fine-grained Indoor Fingerprinting System (FIFS), CSI with Single Input Single Output (SISO), and simple CSI with MIMO. The experimental results show a significant improvement over existing CSI-based fingerprinting systems, with an accuracy of 0.98 meters in a static environment and 0.31 meters in a dynamic environment under optimal war-driving.
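
    Illustrative only, not from the thesis: a minimal Python sketch of the basic fingerprinting step that systems such as CSI-MIMO build on, matching an observed signature against a stored radio map by nearest neighbour. The radio_map contents and the plain magnitude vectors are made-up placeholders; the actual CSI-MIMO signature construction is considerably more involved.

        import numpy as np

        # Hypothetical radio map: location label -> stored signature
        # (e.g. per-subcarrier CSI magnitudes collected during war-driving).
        radio_map = {
            "room_101": np.array([0.82, 0.75, 0.91, 0.64]),
            "room_102": np.array([0.40, 0.55, 0.48, 0.70]),
            "corridor": np.array([0.15, 0.22, 0.30, 0.25]),
        }

        def locate(observed, radio_map):
            """Return the location whose stored signature is closest (Euclidean) to the observation."""
            return min(radio_map, key=lambda loc: np.linalg.norm(radio_map[loc] - observed))

        print(locate(np.array([0.80, 0.70, 0.88, 0.60]), radio_map))  # -> room_101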

  • (2012) Xu, Jing
    Thesis
    The join query is a fundamental tool in many modern application areas, including location-based services, geographic information systems (GIS), and finance and capital markets analysis. Given two sets of objects U and V, a top-k similarity join returns the k most similar pairs of objects from U x V. Top-k similarity joins have been extensively studied and used in a wide spectrum of applications such as information retrieval, decision making, spatial data analysis and data mining. In the conventional model of top-k similarity join processing, an object is usually regarded as a point in a multi-dimensional space and the similarity between two objects is usually measured by a distance metric such as Euclidean distance. However, in many applications such as decision making and e-business, an object may be described by multiple values (instances), and the conventional model is not applicable since it does not address the distributions of object instances. In this thesis, we study top-k similarity join queries over multi-valued objects. We formalize the problem of top-k similarity joins over multi-valued objects with respect to a quantile-based distance metric, which is applied to explore the relative instance distribution among the multiple instances of objects. Efficient and effective techniques to process top-k similarity joins over multi-valued objects are developed following a filtering-refinement framework. Novel distance-, statistics- and weight-based pruning techniques are proposed. Comprehensive experiments on both real and synthetic datasets demonstrate the efficiency and effectiveness of our techniques.
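
    A minimal sketch (Python, not the thesis's implementation) of one plausible reading of a quantile-based distance between multi-valued objects, followed by a naive top-k join with no pruning. The phi parameter and the Euclidean instance distance are illustrative assumptions.

        import numpy as np

        def quantile_distance(u, v, phi=0.5):
            """Distance between two multi-valued objects u, v (arrays of instances):
            the phi-quantile of all pairwise Euclidean instance distances."""
            pairwise = [np.linalg.norm(a - b) for a in u for b in v]
            return np.quantile(pairwise, phi)

        def topk_similarity_join(U, V, k, phi=0.5):
            """Naive (no filtering-refinement) top-k join: the k pairs with the smallest quantile distance."""
            scored = [(quantile_distance(u, v, phi), i, j)
                      for i, u in enumerate(U) for j, v in enumerate(V)]
            return sorted(scored)[:k]

        U = [np.random.rand(5, 2) for _ in range(4)]   # 4 objects, 5 instances each, 2-D
        V = [np.random.rand(5, 2) for _ in range(4)]
        print(topk_similarity_join(U, V, k=3))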

  • (2012) An, Fengqi
    Thesis
    Non-photorealistic art has been an invaluable form of media for tens of thousands of years and is widely used in animation and games today, motivating research in this field. Recently the novel 2.5D model has emerged, targeting the limitations of both 2D and 3D forms of cartoons; the most recent development is the 2.5D Cartoon Model. Building such models manually is labour intensive, and no automatic building method for 2.5D models currently exists. This dissertation proposes a novel approach to the problem of automatically creating 2.5D Cartoon Models, termed Auto-2CM in this thesis, which is the first attempted solution to the problem. The proposed approach aims to build 2.5D models from real-world objects. Auto-2CM collects 3D information about the candidate object using 3D reconstruction methods from computer vision, then partitions it into meaningful parts using segmentation methods from computer graphics. A novel 3D-to-2.5D conversion method, the first of its kind, is introduced to create the final 2.5D model. The Auto-2CM framework does not mandate specific reconstruction or segmentation algorithms, so different algorithms may be used for different kinds of objects. Since the effect of different algorithms on the final 2.5D model has been unknown, a perceptual evaluation of Auto-2CM is performed; it shows that by using different combinations of algorithms within Auto-2CM for specific kinds of objects, the performance of the system may be increased significantly. The approach can produce acceptable models both for manual sketching and for direct use. It is also the first experimental study of the problem.
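
    A schematic sketch (Python, hypothetical function names) of the pluggable three-stage pipeline described above: reconstruction, segmentation and 3D-to-2.5D conversion are injected as interchangeable functions, reflecting the claim that Auto-2CM does not mandate specific algorithms. The stand-in stages are placeholders, not real algorithms.

        from typing import Callable, List

        def auto_2cm(images: List[str],
                     reconstruct: Callable,   # images -> 3D mesh (algorithm is pluggable)
                     segment: Callable,       # 3D mesh -> list of meaningful parts
                     convert_25d: Callable):  # parts   -> 2.5D cartoon model
            """Skeleton of the three-stage pipeline; each stage can be swapped per object type."""
            mesh = reconstruct(images)
            parts = segment(mesh)
            return convert_25d(parts)

        # Toy stand-ins so the skeleton runs end to end.
        model = auto_2cm(
            images=["front.png", "side.png"],
            reconstruct=lambda imgs: {"vertices": len(imgs) * 100},
            segment=lambda mesh: [mesh],
            convert_25d=lambda parts: {"views": 8, "parts": len(parts)},
        )
        print(model)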

  • (2012) Massoudi, Amir
    Thesis
    Analysing cell motility is an important process in medical and biomedical studies, since most active cellular functions involve changes in shape and movement. Manual observation and analysis of cellular images and data sets is a tedious and error-prone task; therefore, designing a reliable automatic cell tracking system could considerably ease the burden on biologists. Because of the limitations of microscopic imaging techniques, together with cell characteristics, analysing biological cells is a challenging task. This thesis proposes novel methods for cell segmentation and tracking as well as mitosis detection. It presents a graph cut-based cell segmentation algorithm that is fully automatic and exploits temporal information in video microscopy to achieve better segmentation results. It also presents a cell tracking method based on a network flow algorithm that does not rely on perfect cell segmentation and uses information from multiple frames for cell association. The tracking algorithm is able to cope with cells entering or exiting a frame at any time. To detect mitosis events, the network flow cell tracking algorithm is extended in a novel way so that it can detect mitosis events and subsequently track the daughter cells. The proposed methods have been tested on phase-contrast microscopy videos provided by the Garvan Institute of Medical Research. The quantitative and qualitative analyses presented in this thesis show that employing temporal information for both cell segmentation and mitosis detection improves the results considerably.
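
    A simplified stand-in (Python, assuming SciPy is available) for the frame-to-frame association step: detected cell centroids in consecutive frames are linked by a minimum-cost assignment over centroid distances, with a max_dist cut-off for cells entering or exiting the frame. The thesis's multi-frame network flow formulation is richer than this two-frame sketch.

        import numpy as np
        from scipy.optimize import linear_sum_assignment

        def associate(cells_prev, cells_next, max_dist=20.0):
            """Match cell centroids between two consecutive frames by minimum total distance.
            Pairs farther apart than max_dist are treated as unmatched (entering/exiting cells)."""
            cost = np.linalg.norm(cells_prev[:, None, :] - cells_next[None, :, :], axis=2)
            rows, cols = linear_sum_assignment(cost)
            return [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_dist]

        prev_frame = np.array([[10.0, 12.0], [40.0, 45.0], [80.0, 15.0]])
        next_frame = np.array([[12.0, 13.0], [42.0, 47.0]])  # one cell has left the frame
        print(associate(prev_frame, next_frame))  # -> [(0, 0), (1, 1)]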

  • (2012) Vassar, Alexandra
    Thesis
    The aim of this thesis is to determine whether selecting usability testing participants on the basis of their personality, as measured by the Myers-Briggs Type Indicator (MBTI) extraversion/introversion scale, can enhance the results obtained in usability testing in a web context. This thesis assesses whether extraverted subjects can uncover a more significant subset of problems, and in particular a larger number of more severe problems, than introverts during usability testing. If this were the case, the process of usability testing could become more efficient by using participants who provide the highest-quality feedback and find the greatest number of, and the most severe, problems, leading to a decrease in the number of necessary participants, a reduction in project costs and a higher return on investment. Forty-three randomly selected candidates were given the MBTI test. Of these, twenty qualified as either extraverts or introverts and were therefore selected to take part in the study. Two sample groups were constituted, one comprising ten extraverts and the other ten introverts. Each participant was then asked to take part in formal laboratory-based usability testing of an e-commerce website. The participants were required to complete a total of five tasks. Performance metrics from these tasks were observed and recorded, with the main focus being the number and severity of usability problems found by each participant. The severity rating was assigned based on an average severity score taken from four evaluators; this was then combined with the frequency of problem occurrence to give each usability problem an overall severity rating ranging from two to eight, with two representing a low-level cosmetic problem and eight representing a major usability problem that rendered the system unusable. The study established that extraverts found more usability problems than introverts (p<0.001), and also a larger number of more severe problems. Additionally, extraverts found on average 96% of all unique usability problems, whereas introverts found just 28% of all such problems. A strong positive correlation was found between the degree of extraversion and the overall number of usability problems found (Pearson correlation coefficient R=0.85, p<0.01). Extraverted participants were more confident in their feedback and more comfortable voicing their opinions than introverted participants. Overall, extraverts talked more often, as measured by the words-per-second metric (p<0.001), and offered more helpful commentary than introverts; through the feedback of extraverted subjects, 100% of all Category 6 high-severity usability problems were uncovered. Based on the results of the usability testing carried out, this study recommends a selection process for usability testing participants based on their levels of extraversion and their realistic use of the product. It is hoped that this approach to participant selection will ensure that the greatest number of, and the most severe, usability problems are found during usability testing. This will lead to a more efficient testing process and make it possible to decrease sample sizes in usability testing without reducing the quality of the results obtained, thereby providing a decrease in project costs and a higher return on investment.
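
    For illustration only, a small Python snippet showing the kind of correlation analysis reported above, computed on made-up numbers; the real participant data is in the thesis.

        from scipy.stats import pearsonr

        # Hypothetical data: degree of extraversion vs. usability problems found per participant.
        extraversion = [12, 15, 18, 22, 25, 28, 31, 35, 38, 41]
        problems_found = [4, 5, 5, 7, 8, 9, 11, 12, 13, 15]

        r, p = pearsonr(extraversion, problems_found)
        print(f"Pearson R = {r:.2f}, p = {p:.4f}")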

  • (2015) Sun, Yu-Jen
    Thesis
    The rising popularity of SaaS allows individuals and enterprises to leverage various services (e.g. Dropbox, GitHub, GDrive and Yammer) for everyday processes. Consequently, an enormous number of Application Programming Interfaces (APIs) have been created to meet the demand for cloud services, allowing third-party developers to integrate these services into their processes. However, the explosion of APIs and their heterogeneous interfaces make the discovery and integration of Web services a complex technical issue. Moreover, these disparate services do not, in general, communicate with each other; rather, they are used in an ad hoc manner with little or no customizable process support. This inevitably leads to "shadow processes", often only informally managed by e-mail or the like. We propose a framework to simplify the integration of disparate services and effectively build customized processes. We propose a platform for managing API-related knowledge, together with a declarative language and model for composing APIs. The implementation of the proposed framework includes a knowledge graph for APIs, called "APIBase", and an agile services integration platform, called "CaseWalls". We provide a knowledge-based event bus for unified interactions between disparate services, while allowing process participants to interact and collaborate on relevant cases.
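
    A minimal sketch (Python, hypothetical event names and services) of the unified event-bus idea: services publish and subscribe to named events instead of calling each other's heterogeneous APIs directly. This is not the actual APIBase/CaseWalls implementation.

        from collections import defaultdict

        class EventBus:
            """Minimal publish/subscribe bus: services interact through named events
            rather than invoking each other's heterogeneous APIs directly."""
            def __init__(self):
                self.handlers = defaultdict(list)

            def subscribe(self, event, handler):
                self.handlers[event].append(handler)

            def publish(self, event, payload):
                for handler in self.handlers[event]:
                    handler(payload)

        bus = EventBus()
        # Hypothetical case: a new file in a storage service triggers a notification elsewhere.
        bus.subscribe("file.created", lambda p: print(f"notify team: {p['name']} uploaded"))
        bus.publish("file.created", {"name": "report.pdf", "source": "Dropbox"})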

  • (2019) Vaghani, Kushal
    Thesis
    Social media platforms have empowered the democratization of the pulse of people in the modern era. Due to their immense popularity and high usage, data published on social media sites (e.g., Twitter, Facebook and Tumblr) is a rich ocean of information. Data-driven analytics of these social imprints has therefore become a vital asset for organisations and governments seeking to further improve their products and services. However, due to the dynamic and noisy nature of social media data, performing accurate analysis on raw data is a challenging task. A key requirement is to curate the raw data before it is fed into analytics pipelines; this curation process transforms the raw data into contextualized data and knowledge. We propose a data curation pipeline, namely CrowdCorrect, to enable analysts to cleanse and curate social data and prepare it for reliable analytics. Our pipeline provides automatic feature extraction from a corpus of social media data using existing in-house tools. Further, we offer a dual-correction mechanism using both automated and crowdsourced approaches. The implementation of this pipeline also includes a set of tools for automatically creating micro-tasks to facilitate the contribution of crowd users in curating the raw data. For the purposes of this research, we use Twitter as our motivating social media data platform due to its popularity.
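
    A minimal sketch (Python) of the automatic extraction and cleansing step of such a curation pipeline, using simple regular expressions; the thesis's in-house tools and the crowd-correction stage are not represented here.

        import re

        def extract_features(tweet: str) -> dict:
            """Pull simple features (mentions, hashtags, URLs) out of a raw tweet
            and return the cleansed text alongside them."""
            features = {
                "mentions": re.findall(r"@\w+", tweet),
                "hashtags": re.findall(r"#\w+", tweet),
                "urls": re.findall(r"https?://\S+", tweet),
            }
            cleaned = re.sub(r"(@\w+|#\w+|https?://\S+)", "", tweet).strip()
            return {"clean_text": " ".join(cleaned.split()), **features}

        print(extract_features("Loving the new budget!! #auspol @Treasury https://example.com/doc"))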

  • (2015) Roldugin, George
    Thesis
    The benefits of a high-level approach to parallel programming are well understood and are often desired in order to separate the domain view of the problem from the intricate implementation details. Yet a naive execution of the resulting programs attracts unnecessary, and even prohibitive, performance costs. One convenient way of expressing a program is by composing collective operations on large data structures. Even if these collective operations are implemented efficiently and provide a high degree of parallelism, the result of each operation must be fully computed and written to memory before the next operation can consume it as input. The cost of transferring these intermediate results to and from memory has a very noticeable impact on the performance of the algorithm and becomes a serious drawback of this high-level approach. Program optimisation that attempts to detect and eliminate the creation of intermediate results by combining multiple operations into one is known as fusion. While fusion is a well-studied problem, there are unfilled gaps when it comes to fusing data-parallel programs. In particular, I demonstrate solutions to the problems of fusion with multiple consumers as well as producing multiple results from one fused computation (tupling). Through my research, I have designed and implemented an embedded domain-specific language called LiveFusion that offers fusible combinators operating on flat and segmented arrays. To achieve fusion I propose a generic loop representation and use the concept of rates to guide fusion. The results show that LiveFusion is considerably more effective at exploiting opportunities for fusion than previous systems. Specifically, an average performance increase by a factor of 3.2 for a non-trivial program indicates the attractiveness of the approach.
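
    LiveFusion itself is a Haskell EDSL; the Python fragment below only illustrates the intermediate-result problem that fusion removes: composing collective operations materialises arrays, while a single fused traversal computes the same value without them.

        import numpy as np

        xs = np.arange(100_000, dtype=np.float64)

        # Unfused: each collective operation materialises a full intermediate array in memory.
        ys = xs * 2.0        # first intermediate result
        zs = ys + 1.0        # second intermediate result
        total_unfused = zs.sum()

        # Fused: a single traversal computes the same value with no intermediate arrays,
        # which is what a fusion system aims to generate automatically.
        total_fused = sum(x * 2.0 + 1.0 for x in xs)

        print(total_unfused, total_fused)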

  • (2019) Yang, Jiahui
    Thesis
    Online multi-label classification has been widely used in various real-world applications, such as Twitter and Facebook posts, Instagram, video search and RSS feeds. With the proliferation of multi-label classification, significant research effort has been devoted to it. This thesis studies approaches to online multi-label classification. Existing work on online multi-label classification, such as the online sequential multi-label extreme learning machine (OSML-ELM) and stochastic gradient descent (SGD), has shown promising performance on multi-label classification. However, these works lack an analysis of the loss function and do not consider label dependency. To fill this gap, we propose a novel online metric learning paradigm for multi-label classification. Specifically, we first project instances and labels into a lower-dimensional space for comparison, and then leverage the large-margin principle to learn a metric with an efficient optimization algorithm. Moreover, we provide a theoretical analysis of the upper bound on the cumulative loss of our method. Comprehensive experiments on a number of benchmark multi-label datasets validate our theoretical results and illustrate that our proposed online metric learning (OML) algorithm outperforms state-of-the-art approaches.
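
    A simplified sketch (Python) of the general idea of online metric learning with projections and a large-margin update: instances and labels are mapped into a shared low-dimensional space, and a hinge-style step pulls an instance towards a relevant label and away from an irrelevant one. The dimensions, learning rate and update rule are illustrative assumptions, not the thesis's OML algorithm.

        import numpy as np

        rng = np.random.default_rng(0)
        d, L, r, lr, margin = 20, 5, 3, 0.05, 1.0
        P = rng.normal(scale=0.1, size=(r, d))   # projects instances into the shared space
        Q = rng.normal(scale=0.1, size=(r, L))   # one column per label: label embeddings

        def update(x, y_pos, y_neg):
            """One online large-margin step: pull the projection of x towards a relevant
            label's embedding and push it away from an irrelevant label's embedding."""
            global P, Q
            px, qp, qn = P @ x, Q[:, y_pos], Q[:, y_neg]
            loss = margin + np.sum((px - qp) ** 2) - np.sum((px - qn) ** 2)
            if loss > 0:                      # margin violated: take a gradient step
                g = 2 * (qn - qp)             # d loss / d px
                P -= lr * np.outer(g, x)
                Q[:, y_pos] -= lr * 2 * (qp - px)
                Q[:, y_neg] -= lr * 2 * (px - qn)

        # Toy stream: each instance arrives with one relevant and one irrelevant label index.
        for _ in range(200):
            x = rng.normal(size=d)
            y_pos, y_neg = rng.choice(L, size=2, replace=False)
            update(x, y_pos, y_neg)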

  • (2018) Kaul, Ity
    Thesis
    In the financial domain, the problem of extracting insights from either symbolic data, such as news documents and public disclosure filings, or numerical data, such as stock prices, has been well researched. The aim of our research was to gain and merge insights from each of these data sets into a single coherent model. We experimentally demonstrated that combining insights from price data, obtained using pattern and trend detection techniques, with insights gained from applying text mining techniques to news documents can yield a powerful signal for stock trading. The results showed that, overall, using various trading strategies we were able to leverage the combined sentiment signal from news documents and the trends from price data to make better trading decisions. Our trading strategies performed especially well when we were able to detect high-level trends in price data and then correlate them with sentiment from news articles within that time frame.
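
    A toy illustration (Python) of combining a price-trend signal with a news-sentiment score into a trading decision; the moving-average windows, the decision rule and the sentiment input are made up and do not reflect the strategies evaluated in the thesis.

        import numpy as np

        def moving_average(prices, window):
            return np.convolve(prices, np.ones(window) / window, mode="valid")

        def decide(prices, sentiment, short=3, long=7):
            """Toy rule: go long only when the short moving average is above the long one
            (uptrend) AND same-period news sentiment is positive; go short on the opposite."""
            trend = moving_average(prices, short)[-1] - moving_average(prices, long)[-1]
            if trend > 0 and sentiment > 0:
                return "buy"
            if trend < 0 and sentiment < 0:
                return "sell"
            return "hold"

        prices = np.array([100, 101, 103, 102, 104, 106, 107, 109, 108, 110], dtype=float)
        print(decide(prices, sentiment=+0.6))  # -> buy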