Methodology Plan for Mock Dissertation
RESEARCH METHODOLOGY PLAN
Using Data Science Techniques to Enhance Data Security
Bindu Priyanka Ganta
University of the Cumberlands
ITS 837 - Prof Writing & Proposal Devel
DeAnna Miller
09/09/2020
Problem Statement
The purpose of this study was to describe how data security can be enhanced using data science techniques. Data science has become a prominent field of study, used across many domains to solve problems that humans cannot solve unaided. Data science is the study of data, in which machine learning is used to develop algorithms called models. These models are trained on existing datasets of scenarios and their possible outcomes. With the rise of artificial intelligence (AI), data science has adopted AI so that models can adapt to varied scenarios and threats. This study discusses a solution that enhances data security using machine learning (ML) techniques, including supervised learning and unsupervised learning.
Research Approach and Strategy
The researcher's goal was to use ML models to enhance data security by developing ML-based authentication and an ML-based automatic audit feature on access controls to protect data privacy. The researcher used supervised learning, unsupervised learning, and reinforcement learning as the main strategies. The main goal was to arrive at a framework, powered by multiple ML models, that implements an effective and efficient security layer on top of the data. This layer includes ML-based IoT authentication, access control, secure offloading, and malware detection schemes to protect data privacy. The researcher followed the approaches below:
1) Learning-based authentication
2) Learning-based malware detection
3) Securing IoT devices using learning
4) Machine learning-based access control
Sources and Data Collection
The researcher chose the NSL-KDD dataset to evaluate the performance of the detection method. NSL-KDD is a benchmark dataset for network intrusion detection and contains both training and test samples. The training samples are needed to train the machine learning model. In total, there are 125,973 training traffic samples and 22,544 test traffic samples.
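As a rough illustration of how labelled NSL-KDD samples might be read before any model is trained, the sketch below parses comma-separated records and counts normal versus attack traffic. It assumes the commonly distributed NSL-KDD layout of 41 features followed by a class label and a difficulty score. The helper names (`parse_records`, `class_balance`, `make_row`) and the two sample rows are hypothetical and are not taken from the dataset or from the cited studies.

```python
import csv
from collections import Counter
from io import StringIO

# Assumption: each NSL-KDD record has 41 features, then a label, then a difficulty score.
N_FEATURES = 41


def parse_records(text):
    """Parse NSL-KDD style CSV text into (features, label) pairs."""
    records = []
    for row in csv.reader(StringIO(text)):
        if len(row) < N_FEATURES + 1:
            continue  # skip blank or truncated lines
        features = row[:N_FEATURES]
        label = row[N_FEATURES]
        records.append((features, label))
    return records


def class_balance(records):
    """Count normal versus attack traffic (a binary view of the labels)."""
    return Counter(
        "normal" if label == "normal" else "attack" for _, label in records
    )


def make_row(label):
    """Build one made-up record: 4 leading fields plus 37 zero-valued features."""
    return ",".join(["0", "tcp", "http", "SF"] + ["0"] * 37 + [label, "20"])


# Hypothetical two-row sample; real training and test files would be read from disk.
sample = "\n".join([make_row("normal"), make_row("neptune")])
records = parse_records(sample)
balance = class_balance(records)
print(balance)
```

Running the same two functions over the full training and test files would give the per-class counts needed to check the sample totals reported above.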
Seven weeks of network traffic, captured as raw tcpdump data, were used for training.
Data Collection and Analysis Methods
The researcher used neural network-based machine learning models to analyze the data. The model has two phases: a training phase and a test phase. As Jiang et al. (2020) describe, "There are three steps in training phase - data preprocessing, multi-feature extraction, multi-channel training, and also there are three steps in testing phase - data preprocessing, multi-feature extraction and attack detection. The first two steps are the same in the two phases. In the training phase, data preprocessing is a combination of processing steps to provide high-quality data, including data sampling, data cleansing and data dimensionality." The researcher also used a recurrent neural network (RNN), a type of model widely used in natural language processing. RNNs are powerful for modeling sequences because their cyclic connections carry information from earlier inputs forward to later ones.
Comment by DeAnna: Independent variable? Dependent variable?
Comment by DeAnna: What are your research questions?
Comment by DeAnna: See rubric – you need more details here
Comment by DeAnna: This is not a research data analysis tool. This is an IT tool. This paper is about your research study. What tool will you use to analyze your data from the study?
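The cyclic connection that makes an RNN suited to sequence data, as noted under Data Collection and Analysis Methods, can be illustrated with a minimal single-layer recurrent forward pass. This is a generic Elman-style sketch, not the multi-channel architecture of Jiang et al. (2020). The function name `rnn_forward` and the toy weights are hypothetical and chosen only for illustration.

```python
import math


def rnn_forward(sequence, w_in, w_rec, bias):
    """Run a single-layer Elman-style RNN over a sequence of input vectors.

    sequence: list of input vectors (lists of floats)
    w_in:     hidden-by-input weight matrix
    w_rec:    hidden-by-hidden recurrent weight matrix (the cyclic connection)
    bias:     bias vector, one entry per hidden unit
    Returns the hidden state after each time step.
    """
    hidden = [0.0] * len(bias)  # initial hidden state is all zeros
    states = []
    for x in sequence:
        new_hidden = []
        for i in range(len(bias)):
            total = bias[i]
            total += sum(w * xj for w, xj in zip(w_in[i], x))
            # The previous hidden state feeds back into the current step.
            total += sum(w * hj for w, hj in zip(w_rec[i], hidden))
            new_hidden.append(math.tanh(total))
        hidden = new_hidden
        states.append(hidden)
    return states


# Hypothetical toy weights: two inputs, two hidden units.
w_in = [[0.5, -0.2], [0.1, 0.3]]
w_rec = [[0.0, 0.4], [-0.3, 0.0]]
bias = [0.0, 0.0]

# The second input is all zeros, so any activity at step 2 comes from recurrence.
sequence = [[1.0, 0.0], [0.0, 0.0]]
states = rnn_forward(sequence, w_in, w_rec, bias)
print(states)
```

Because the second input vector is zero, the non-zero hidden state at the second step arises entirely from the recurrent weights, which is the sense in which the network retains information about earlier traffic in a sequence.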
Ethical Considerations and Limitations
Data science is a vast subject, and mastering it well enough to enhance data security is demanding. Gaining the depth of knowledge needed to apply its concepts to data security is difficult to achieve. In addition, the data involved are highly contextual and, because of their arbitrary nature, may not help in achieving the expected result.
Comment by DeAnna: This section is about the ethical considerations and limitations of your research study – not data science.
References
Jiang, F., Fu, Y., Gupta, B. B., Liang, Y., Rho, S., Lou, F., Meng, F., & Tian, Z. (2020). Deep Learning Based Multi-Channel Intelligent Attack Detection for Data Security. IEEE Transactions on Sustainable Computing, 5(2), 204–212. https://doi.org/10.1109/tsusc.2018.2793284
Xiao, L., Wan, X., Lu, X., Zhang, Y., & Wu, D. (2018). IoT Security Techniques Based on Machine Learning: How Do IoT Devices Use AI to Enhance Security? IEEE Signal Processing Magazine, 35(5), 41–49. https://doi.org/10.1109/msp.2018.2825478