Dataset:
New Generations of Internet of Things Datasets for Cybersecurity Applications based Machine Learning: TON_IoT Datasets

dc.date.accessioned 2021-11-26T10:46:18Z
dc.date.available 2021-11-26T10:46:18Z
dc.date.issued 2019 en_US
dc.description.abstract Collecting and analysing heterogeneous data sources from the Internet of Things (IoT) and Industrial IoT (IIoT) are essential for training and validating the fidelity of machine learning-based cybersecurity applications. However, analysing those data sources remains challenging, because it requires reducing a high-dimensional space and selecting important features and observations from different data sources. The study proposes a new testbed for an IIoT network that was utilised for creating new datasets called TON_IoT, which collect Telemetry data, Operating systems data and Network data. The testbed is deployed using multiple virtual machines, including hosts running Windows, Linux and Kali Linux operating systems, to manage the interconnections between the three layers of IIoT, Cloud and Edge/Fog systems. The initial statistical evaluation of the datasets reveals their capability for evaluating cybersecurity applications such as intrusion detection, threat intelligence, adversarial machine learning and privacy-preserving models. en_US
dc.identifier.uri http://hdl.handle.net/1959.4/resource/collection/resdatac_921/1
dc.language English
dc.language.iso EN en_US
dc.rights GPL en_US
dc.rights.uri https://www.gnu.org/licenses/gpl-3.0.html en_US
dc.subject.other datasets for cyber applications en_US
dc.subject.other Intrusion detection datasets en_US
dc.subject.other datasets for machine learning and cyber en_US
dc.title New Generations of Internet of Things Datasets for Cybersecurity Applications based Machine Learning: TON_IoT Datasets en_US
dc.type Dataset en_US
dcterms.accessRights open access
dcterms.accrualMethod The research methodology includes three main phases: 1) extending the testbed of IoT network at the Cyber Range labs at UNSW Canberra; 2) collecting and filtering heterogeneous datasets; and 3) initial evaluation of datasets using statistical and deep learning models, as briefly explained below.
1) Extending the testbed of IoT network at the Cyber Range labs at UNSW Canberra
In the Cyber Range Labs of UNSW Canberra, a testbed network for the Industry 4.0 network that includes IoT and IIoT devices and services has been designed. The testbed will be extended to generate a new systematic testbed of IIoT networks for creating new realistic datasets, as presented in Figure 1. The testbed is deployed using multiple virtual machines and hosts running Windows, Linux and Kali Linux operating systems to manage the interconnection between the three layers of IoT, Cloud and Edge/Fog systems. A set of IoT devices and sensors, such as green gas IoT and industrial IoT actuators, is connected to MQTT gateways to publish and subscribe to various topics, such as temperature and humidity measurements.
2) Collecting and analysing heterogeneous datasets
From the designed testbed network, four heterogeneous data sources are collected: telemetry data of IoT systems, data of Windows systems, data of Linux Ubuntu systems, and their network traffic. The datasets contain a wide range of new attack surfaces and vectors, as well as legitimate events. For analysing the datasets, existing and new tools are utilised to extract multiple features, which are used to evaluate how well the datasets support validating cyber applications and improving big data analytics tools.
3) Evaluation of datasets using statistical and deep learning models
The datasets have diverse patterns and large-scale events for assessing different learning-based cyber applications, such as intrusion detection, privacy-preserving, and digital forensics systems. Deep learning and statistical algorithms can be used for evaluating the new datasets compared with current benchmark network and IoT datasets. en_US
dcterms.rights This dataset has been sponsored by the Australian Research Data Commons (ARDC). The copyright is reserved to the author, Dr Nour Moustafa, who is a Lecturer and Offensive Security Theme Lead at UNSW Canberra. These datasets should be published publicly, and their online availability should be sustained. en_US
dcterms.rightsHolder Copyright 2019, Nour Moustafa en_US
dspace.entity.type Dataset en_US
unsw.accessRights.uri https://purl.org/coar/access_right/c_abf2
unsw.contributor.leadChiefInvestigator Moustafa, Nour en_US
unsw.contributor.researchDataCreator Moustafa, Nour en_US
unsw.coverage.temporalFrom 2019-09-02 en_US
unsw.description.storageURL https://cloudstor.aarnet.edu.au/plus/s/ds5zW91vdgjEj9i en_US
unsw.identifier.doi https://doi.org/10.26190/5d7ac9bfe8487 en_US
unsw.isPublicationRelatedToDataset Deep Neural Networks for Network Intrusion Detection
unsw.relation.FunderRefNo 34361
unsw.relation.OriginalPublicationAffiliation Moustafa, Nour, Sch of Engineering & IT (Sum), UNSW Canberra, en_US
unsw.relation.faculty UNSW Canberra
unsw.relation.fundingAgency MONASH UNIVERSITY
unsw.relation.fundingScheme AUSTRALIAN RESEARCH DATA COMMONS - TRANSFORMATIVE DATA PROGRAM SHARED GRANT
unsw.relation.projectDesc The project addresses the question of what standard platforms and methods could be utilised to collect and inspect heterogeneous Industrial Internet of Things (IIoT) data. It is urgent to address this question for enhancing the sustainability of significant data collections that will drive innovation in cyber security research in Australia. The project establishes and shares the data from a testbed Industrial IoT (IIoT) network used for collecting heterogeneous datasets of telemetry sensors, network traffic, and operating systems of both Windows and Linux systems, empowered by standard formats, agreed protocols and well-defined data properties. We will be the first to build, collect and examine these large distributed data collections generated from IIoT for cyber security applications. Our research measures the prevalence, severity and mode of online cyber activity affecting Australian cyberspace. The data and methods used and shared in this project will also help in identifying cyber-attack patterns, providing a basis for further developments of cybersecurity prevention strategies and guiding cyber-incident responses. Innovative statistical and deep learning algorithms will be used to explore the technical and textual data acquired. The data and methods shared by this project will have a profound impact on cybersecurity in Australia, as they will improve advanced cyber defence tools.
unsw.relation.projectEndDate 2019-12-31
unsw.relation.projectStartDate 2019-01-01
unsw.relation.projectTitle A Pilot Database of Industrial Internet of Things Networks for Cyber Security Applications en_US
unsw.relation.school School of Engineering and Information Technology
unsw.relation.unswGrantNo RG192500