Phd: Research paper-3
7/26/2020 Originality Report
https://ucumberlands.blackboard.com/webapps/mdb-sa-BB5a31b16bb2c48/originalityReport/ultra?attemptId=a78cf268-000c-45d1-b5c1-c493fceaaca4&course_id=_… 1/9
%81
%1
SafeAssign Originality Report Summer 2020 - Data Science & Big Data Analy (ITS-83… • Week 4 Research Paper
%82Total Score: High risk Sunil Kumar Parisa
Submission UUID: 16bd6e1d-6f1e-4e70-65b2-53879007709b
Total Number of Reports
1 Highest Match
82 % Research_Paper_4.docx
Average Match
82 % Submitted on
07/26/20 12:37 PM PDT
Average Word Count
1,483 Highest: Research_Paper_4…
%82Attachment 1
Institutional database (6)
Student paper Student paper Student paper
Student paper Student paper Student paper
Internet (1)
springeropen
Top sources (3)
Excluded sources (0)
View Originality Report - Old Design
Word Count: 1,483 Research_Paper_4.docx
3 2 7
5 1 6
4
3 Student paper 2 Student paper 7 Student paper
8
Modeling Uncertainty in ML and NLP
Sunil Kumar Parisa
University of Cumberland’s
ITS 836 – Data Science and Big Data Analytics
Dr. Kelly Wibbenmeyer
26th July 2020
Abstract
Big data analytics is the capacity to deal with huge volumes of information with shifting arrangements and multifaceted nature from related in- formation, semi-organized information, weblogs, gadget information, and unstructured configurations. Capacity to get bits of knowledge about your items (brands), clients, and workers from online life information and connect with your exchange framework information. Big data analytics is mak- ing life is easier. Anything which you use on a daily basis might be a result of big data analytics. ML techniques are normally not computationally efficient or effective enough to handle big data features as well as vulnerability. NLP procedures can help with making new traceable interfaces and recoup detectability joins by finding semantic closeness among accessible printed ancient rarities.
Keywords: Big data analytics, machine learning techniques, natural language processing
Addressing Uncertainty in ML and NLP
1
2
3
3
7/26/2020 Originality Report
https://ucumberlands.blackboard.com/webapps/mdb-sa-BB5a31b16bb2c48/originalityReport/ultra?attemptId=a78cf268-000c-45d1-b5c1-c493fceaaca4&course_id=_… 2/9
Addressing Uncertainty in ML and NLP
When working with big data analytics, ML is usually used to develop prediction models as well as knowledge gathering to enhance information-dri- ven dynamic. There are several ML procedures recommended for big data assessment; these procedures comprise element learning, deep learning, move to learn, circulated learning, and dynamic learning (Hassani, 2018). Feature learning involves several methods that help a system automatical- ly find the designs needed to collect data and the classification of unprocessed data. ML algorithm's performance is mainly affected by the choice of information depiction. Deep learning algorithms are meant to break down as well as generate critical information from huge sums of information and also information collected from various sources; however, current deep learning models require a high computational cost. Distributed learning can moderate the adaptability application of traditional ML via completing computations on informational indexes adopted among a few workstations to scale up the learning process. Transfer learning involves the use of information gathered from one source to a new source; at the same time, enhancing data movement from one area by moving data from a related space. Dynamic learning involves calculations that use ver- satile information collection forms that subsequently change parameters to gather the most useful information as fast as it could reasonably be ex- pected to accelerate ML activities and avoid naming challenges (Lue, 2019). The vulnerability challenges of ML procedures can be basically ascribed to gathering information with low integrity i.e., dubious and inadequate information as well as information with low value irrelevant to the present issue. Among the ML techniques, active learning, deep learning, as well as fuzzy rationale hypothesis, are extremely recommended to assist in vul- nerability test that reduces the level of risks. Risks can have a big impact on ML so long as poor or uncertain training tests, indistinct classifica- tion limits, and harsh information on the objective information. At times, the information is presented without names, which can be a challenge.
Representing Uncertainty Resulting From Big Data Analytics
Marking big data physically can be a challenge in terms of cost and exhausting in terms of labor. At the same time, using unlabeled data is very difficult as classifying information with hazy rules muddled outcomes. Active learning has addressed this problem by determining a subset of the most significant event for marking. Deep learning is another learning technique that can deal with inadequacy and irregularity challenges in the classification methodology. NLP has a reputable set of techniques and tools which cover both written and spoken languages (Walker, 2015). NLP is also applicable in many areas such as machine translation, information gathering, speech recognition, optical character recognition, spell checking, and many others. Machine Learning (ML), on the other hand, is an approach that could be used in Natural Language Processing and many other fields such as data sciences, decision-making systems, and artificial intelligence. We can easily say that NLP is an interdisciplinary computing field, while ML is a set of strategies and tools to address as well as solve different challenges in a variety of computing fields, including NLP. However, we should not forget that these topics are so getting entangled and intertwined, which makes it difficult to establish a clear line between their defini- tions. Natural language processing provides clarification to the above-mentioned problems using the vocabulary selection method, understanding synonyms, antonyms, homonyms using wordnet, lexicon formation, relationship identification, and Name entity recognition Stanford parser. NLP is an aid to ML and also Deep futuristic learning. Moreover, NLP augmented to ML reduce the search space and make it a guided search.
3
3
3
4
3
As a result, classier don't overfit while training and accuracy are improved. The addition of Semantics to NLP is a major thrust in today’s Learn- ing community.
Enhancing ML and NLP to Handle Big Data
NLP technique is integrated with ML, which helps gadgets to assess, decode, and even create content. NLP and big data handle huge amounts of content information and gradually get an incentive from such a dataset. Some common NLP practices comprise lexical procurement, word sense disambiguation i.e., determining which type of word is used in a sentence in an event a word has different implications and grammatical feature (POS) labeling i.e., hinder mining the capacity of the words through marking classes, for example, action word, thing, and so forth. Several NLP- based techniques have been used to conduct mining, including data gathering, theme demonstration, content outline, classification, grouping, ques- tion feedback, and supposition mining. For instance, financial and fraud detection may include finding proof of wrongdoing in huge datasets (Mora- bito, 2017). NLP technique especially named content extraction and data recovery can help oversee and scan through colossal measures of factual data, for example, criminal names and bank records, to support misrepresentation evaluation.
Impacts of Natural Language Programming in Big Data
Moreover, NLP and big data can be utilized to assess news stories and foresee rises and falls on the composite stock value file. The vulnerability affects NLP in big data in various ways. For instance, the catchphrase search is an exemplary methodology in data mining that is used to deal with a lot of factual information. Watchword search acknowledges as information a rundown of applicable words or expressions and searches the ideal arrangement of information for events of the significant words. The vulnerability can affect catchphrase search, as an archive that contains a watch- word isn't a confirmation of a report's pertinence. For instance, a catchphrase search, for the most part, coordinates accurate strings and overlooks words with a spelling error that may at present be important. Boolean administrators and fluffy pursuit innovations license more prominent flexibil- ity in that they can be utilized to scan for words like the ideal spelling. While big data using AI holds a ton of guarantee, a wide scope of challenges is presented when such methods are exposed to vulnerability. For example, every one of the attributes presents various sources of vulnerability, un- structured, inadequate, or noisy data. Moreover, the vulnerability can be installed in the whole assessment process. For instance, managing inade- quate and loose data is a basic test for most information mining and ML procedures (Hussain, 2016). Also, an ML algorithm may not get the ideal outcome if the preparation information is one-sided in any capacity. Scaling these worries up to the big data level will effectively exacerbate any er- rors or inadequacies of the whole investigation process. Accordingly, a moderating vulnerability in big data analytics must be at the cutting edge of any robotized procedure, as the vulnerability can have a significant influence on the exactness of its outcomes.
Conclusion
Data Analytics and Data Science can solve any and all business problems regardless of whether we have big data or regular data. However, the only difference with the big data analytics will be that we will be typically dealing with large and unstructured data on some sort of distributed computing such as Hadoop, AWS, etc. E-commerce issues with optimization of raw material stocks, rotation of goods, a decrease in warehouse space, and lo- gistics cost can be solved with the help of linear programming and the methods of big data analysis.
3
3
4
3
7/26/2020 Originality Report
https://ucumberlands.blackboard.com/webapps/mdb-sa-BB5a31b16bb2c48/originalityReport/ultra?attemptId=a78cf268-000c-45d1-b5c1-c493fceaaca4&course_id=_… 3/9
Source Matches (24)
Student paper 99% Student paper 97%
References
Hassani, M. (2018). Overview of efficient clustering methods for high-dimensional big data streams. Clustering Methods for Big Data Analytics, 25-42. https://doi.org/10.1007/978-3-319-97864-2_2
Hussain, A., & Roy, A. (2016). The emerging era of big data analytics. Big Data Analytics, 1(1). https://doi.org/10.1186/s41044-016-0004-2
Lue, R. (2019). Data science as a foundation for inclusive learning. 1.2. https://doi.org/10.1162/99608f92.c9267215
Morabito, V. (2015). Big data and analytics innovation practices. Big Data and Analytics, 157-176. https://doi.org/10.1007/978-3-319-10665- 6_8
Walker, R. (2015). Impact of analytics and big data on corporate culture and recruitment. From Big Data to Big Profits, 184-201. https://doi.org/10.1093/acprof:oso/9780199378326.003.0009
3
5 5
6 1
7
1
Student paper
University of Cumberland’s ITS 836 – Data Science and Big Data Analytics
Original source
University of the Cumberland’s ITS- 836 Data Science & Big Data Analytics
2
Student paper
Big data analytics is the capacity to deal with huge volumes of informa- tion with shifting arrangements and multifaceted nature from related in- formation, semi-organized informa- tion, weblogs, gadget information, and unstructured configurations. Ca- pacity to get bits of knowledge about your items (brands), clients, and workers from online life information and connect with your exchange framework information. Big data an- alytics is making life is easier. Any- thing which you use on a daily basis might be a result of big data analytics.
Original source
Big data analytics is the capacity to deal with huge volumes to informa- tion with shifting arrangements and multifaceted nature from organized information, semi-organized infor- mation, weblogs, gadget information and unstructured configuration Ca- pacity to get bits of knowledge about your items (brands), clients and workers from online life information and connect with your exchange framework information Big data an- alytics is making life is easier Any- thing which you use on daily basis might be result of big data analytics
7/26/2020 Originality Report
https://ucumberlands.blackboard.com/webapps/mdb-sa-BB5a31b16bb2c48/originalityReport/ultra?attemptId=a78cf268-000c-45d1-b5c1-c493fceaaca4&course_id=_… 4/9
Student paper 89%
Student paper 74%
Student paper 81%
Student paper 72%
Student paper 69%
3
Student paper
NLP procedures can help with mak- ing new traceable interfaces and re- coup detectability joins by finding se- mantic closeness among accessible printed ancient rarities.
Original source
Additionally, NLP procedures can as- sist with making new traceable inter- faces and recoup detectability joins by finding semantic closeness among available printed ancient rarities
3
Student paper
Big data analytics, machine learning techniques, natural language processing
Original source
Analytics techniques in data mining, deep learning and natural language processing
3
Student paper
Distributed learning can moderate the adaptability application of tradi- tional ML via completing computa- tions on informational indexes adopted among a few workstations to scale up the learning process.
Original source
Distributed learning can be utilized to moderate the adaptability issue of customary ML via completing com- putations on informational indexes appropriated among a few worksta- tions to scale up the learning procedure
3
Student paper
Dynamic learning involves calcula- tions that use versatile information collection forms that subsequently change parameters to gather the most useful information as fast as it could reasonably be expected to ac- celerate ML activities and avoid nam- ing challenges (Lue, 2019). The vul- nerability challenges of ML proce- dures can be basically ascribed to gathering information with low in- tegrity i.e., dubious and inadequate information as well as information with low value irrelevant to the present issue.
Original source
Dynamic learning alludes to calcula- tions that utilize versatile informa- tion collection (i.e., forms that conse- quently alter parameters to gather the most helpful information as fast as could reasonably be expected) so as to quicken ML exercises and over- come naming issues (Dasgupta, 2018) The vulnerability difficulties of ML procedures can be basically as- cribed to gaining from information with low veracity (i.e., dubious and inadequate information) and infor- mation with little value (i.e., irrele- vant to the present issue)
3
Student paper
Risks can have a big impact on ML so long as poor or uncertain training tests, indistinct classification limits, and harsh information on the objec- tive information. At times, the infor- mation is presented without names, which can be a challenge.
Original source
The vulnerability can affect ML as far as inadequate or uncertain training tests, indistinct classification limits, and harsh information on the objec- tive information At times, the infor- mation is spoken to without names, which can turn into a test
7/26/2020 Originality Report
https://ucumberlands.blackboard.com/webapps/mdb-sa-BB5a31b16bb2c48/originalityReport/ultra?attemptId=a78cf268-000c-45d1-b5c1-c493fceaaca4&course_id=_… 5/9
springeropen 71%
Student paper 70%
Student paper 96%4
Student paper
Representing Uncertainty Resulting From Big Data Analytics
Original source
Uncertainty in big data analytics
3
Student paper
Active learning has addressed this problem by determining a subset of the most significant event for mark- ing. Deep learning is another learn- ing technique that can deal with in- adequacy and irregularity challenges in the classification methodology. NLP has a reputable set of tech- niques and tools which cover both written and spoken languages (Walk- er, 2015). NLP is also applicable in many areas such as machine transla- tion, information gathering, speech recognition, optical character recog- nition, spell checking, and many others.
Original source
Active learning has explained this is- sue by choosing a subset of the most significant occasions for mark- ing Profound learning is another learning strategy that can deal with inadequacy and irregularity issues in the classification methodology NLP has an established set of method- ologies, tools, and techniques that cover both written and spoken (not to mention signed) languages (Nas- raoui & N'Cir, 2018) Also, it has large application areas such as machine translation, information extraction, speech recognition, optical character recognition, spell checking, and such
3
Student paper
Machine Learning (ML), on the other hand, is an approach that could be used in Natural Language Processing and many other fields such as data sciences, decision-making systems, and artificial intelligence. We can easily say that NLP is an in- terdisciplinary computing field, while ML is a set of strategies and tools to address as well as solve different challenges in a variety of computing fields, including NLP. However, we should not forget that these topics are so getting entangled and inter- twined, which makes it difficult to es- tablish a clear line between their de- finitions. Natural language process- ing provides clarification to the above-mentioned problems using the vocabulary selection method, understanding synonyms, antonyms, homonyms using wordnet, lexicon formation, relationship identifica- tion, and Name entity recognition Stanford parser.
Original source
Machine Learning (ML), on the other hand, is an approach that could be used in Natural Language Processing and many other fields such as data sciences, decision-making systems, and artificial intelligence We can per- haps say that NLP is an in- terdisciplinary field in computing, while ML is a set of approaches and tools to address and solve different problems in a variety of computing fields, including NLP However, we should not forget that these topics are so getting entangled and inter- twined, which makes it difficult to es- tablish a clear line between their de- finitions Natural language process- ing provides clarification to the above-mentioned problems using the vocabulary selection method, understanding synonyms, antonyms, homonyms using wordnet, lexicon formation, relationship identifica- tion, and Name entity recognition (Stanford parser)
7/26/2020 Originality Report
https://ucumberlands.blackboard.com/webapps/mdb-sa-BB5a31b16bb2c48/originalityReport/ultra?attemptId=a78cf268-000c-45d1-b5c1-c493fceaaca4&course_id=_… 6/9
Student paper 100% Student paper 77%3
Student paper
NLP is an aid to ML and also Deep futuristic learning. Moreover, NLP augmented to ML reduce the search space and make it a guided search. As a result, classier don't overfit while training and accuracy are im- proved. The addition of Semantics to NLP is a major thrust in today’s Learning community.
Original source
NLP is an aid to ML and also Deep futuristic learning Moreover, NLP augmented to ML reduce the search space and make it a guided search As a result, classier don't overfit while training and accuracy are im- proved The addition of Semantics to NLP is a major thrust in today's Learning community
3
Student paper
NLP technique is integrated with ML, which helps gadgets to assess, de- code, and even create content. NLP and big data handle huge amounts of content information and gradually get an incentive from such a dataset. Some common NLP practices com- prise lexical procurement, word sense disambiguation i.e., determin- ing which type of word is used in a sentence in an event a word has dif- ferent implications and grammatical feature (POS) labeling i.e., hinder mining the capacity of the words through marking classes, for exam- ple, action word, thing, and so forth. Several NLP-based techniques have been used to conduct mining, includ- ing data gathering, theme demon- stration, content outline, classifica- tion, grouping, question feedback, and supposition mining.
Original source
NLP is a strategy integrated into ML that empowers gadgets to assess, decipher, and even create content NLP and big data handle large mea- sures of content information and can get an incentive from such a dataset progressively Some stan- dard NLP practices include lexical procurement, word sense disam- biguation (i.e., figuring out which feeling of the word is utilized in a sentence when a name has different implications), and grammatical fea- ture (POS) labeling (i.e., hinder min- ing the capacity of the words through marking classes, for exam- ple, action word, thing, and so forth) A few NLP-based methods have been used to content mining, includ- ing data extraction, theme demon- strating, content outline, classifica- tion, grouping, and question feed- back, as well as supposition mining
7/26/2020 Originality Report
https://ucumberlands.blackboard.com/webapps/mdb-sa-BB5a31b16bb2c48/originalityReport/ultra?attemptId=a78cf268-000c-45d1-b5c1-c493fceaaca4&course_id=_… 7/9
Student paper 80%
springeropen 62%
Student paper 83%3
Student paper
For instance, financial and fraud de- tection may include finding proof of wrongdoing in huge datasets (Mora- bito, 2017). NLP technique especially named content extraction and data recovery can help oversee and scan through colossal measures of factual data, for example, criminal names and bank records, to support mis- representation evaluation.
Original source
For instance, financial and extortion examinations may include finding proof of wrongdoing in massive datasets (Morabito, 2017) NLP strategies (uniquely named sub- stance extraction and data recovery) can help oversee and filter through colossal measures of literary data, for example, criminal names and bank records, to support misrepre- sentation evaluation
4
Student paper
Impacts of Natural Language Pro- gramming in Big Data
Original source
Natural language processing and big data
3
Student paper
Moreover, NLP and big data can be utilized to assess news stories and foresee rises and falls on the com- posite stock value file. The vulnera- bility affects NLP in big data in vari- ous ways. For instance, the catch- phrase search is an exemplary methodology in data mining that is used to deal with a lot of factual in- formation. Watchword search ac- knowledges as information a run- down of applicable words or expres- sions and searches the ideal arrangement of information for events of the significant words.
Original source
Moreover, NLP and big data can be utilized to investigate news stories, and foresee rises and falls on the composite stock value file Vulnera- bility influences NLP in vast informa- tion in various ways For instance, a catchphrase search is an exemplary methodology in content mining that is used to deal with a lot of literary knowledge Watchword search ac- knowledges as information a run- down of applicable words or expres- sions and searches the ideal arrangement of data for events of the significant words
7/26/2020 Originality Report
https://ucumberlands.blackboard.com/webapps/mdb-sa-BB5a31b16bb2c48/originalityReport/ultra?attemptId=a78cf268-000c-45d1-b5c1-c493fceaaca4&course_id=_… 8/9
Student paper 97% Student paper 87%
Student paper 97%
3
Student paper
The vulnerability can affect catch- phrase search, as an archive that contains a watchword isn't a confir- mation of a report's pertinence. For instance, a catchphrase search, for the most part, coordinates accurate strings and overlooks words with a spelling error that may at present be important. Boolean administrators and fluffy pursuit innovations license more prominent flexibility in that they can be utilized to scan for words like the ideal spelling. While big data using AI holds a ton of guar- antee, a wide scope of challenges is presented when such methods are exposed to vulnerability.
Original source
The vulnerability can affect catch- phrase search, as an archive that contains a watchword isn't a confir- mation of a report's pertinence For instance, a catchphrase search, for the most part, coordinates accurate strings and overlooks words with a spelling error that may at present be important Boolean administrators and fluffy pursuit innovations license more prominent flexibility in that they can be utilized to scan for words like the ideal spelling While big data using AI holds a ton of guar- antee, a broad scope of difficulties is presented when such methods are exposed to vulnerability
3
Student paper
For example, every one of the attrib- utes presents various sources of vul- nerability, unstructured, inadequate, or noisy data. Moreover, the vulner- ability can be installed in the whole assessment process. For instance, managing inadequate and loose data is a basic test for most informa- tion mining and ML procedures (Hussain, 2016). Also, an ML algo- rithm may not get the ideal outcome if the preparation information is one-sided in any capacity.
Original source
For example, every one of the V at- tributes present various sources of weakness, for example, unstruc- tured, inadequate, or noisy data Moreover, the vulnerability can be installed in the whole assessment process For instance, managing in- sufficient and lose data is a basic test for most information mining and ML procedures (Ghosh & Liv- ingston, 2019) Also, an ML algorithm may not get the ideal outcome if the preparation information is one-sided in any capacity
3
Student paper
Scaling these worries up to the big data level will effectively exacerbate any errors or inadequacies of the whole investigation process. Accord- ingly, a moderating vulnerability in big data analytics must be at the cut- ting edge of any robotized proce- dure, as the vulnerability can have a significant influence on the exact- ness of its outcomes.
Original source
Scaling these worries up to the high data level will effectively exacerbate any errors or inadequacies of the whole investigation process Accord- ingly, a moderating vulnerability in big data analytics must be at the cut- ting edge of any robotized proce- dure, as vulnerability can have a sig- nificant influence on the exactness of its outcomes
7/26/2020 Originality Report
https://ucumberlands.blackboard.com/webapps/mdb-sa-BB5a31b16bb2c48/originalityReport/ultra?attemptId=a78cf268-000c-45d1-b5c1-c493fceaaca4&course_id=_… 9/9
Student paper 81%
Student paper 100%
Student paper 96%
Student paper 100%
Student paper 90%
Student paper 96%
3
Student paper
Clustering Methods for Big Data An- alytics, 25-42.
Original source
Clustering methods for big data analytics
5
Student paper
Hussain, A., & Roy, A.
Original source
Hussain, A., & Roy, A
5
Student paper
The emerging era of big data analyt- ics. Big Data Analytics, 1(1). https://doi.org/10.1186/s41044-016- 0004-2
Original source
The emerging era of Big Data Analyt- ics Big Data Analytics, 1(1) doi:10.1186/s41044-016-0004-2
6
Student paper
Big data and analytics innovation practices. Big Data and Analytics, 157-176.
Original source
Big Data and Analytics Innovation Practices Big Data and Analytics, 157-176
1
Student paper
https://doi.org/10.1007/978-3-319- 10665-6_8
Original source
doi:10.1007/978-3-319-10665-6_8
7
Student paper
Impact of analytics and big data on corporate culture and recruitment. From Big Data to Big Profits, 184- 201. https://doi.org/10.1093/acprof:oso/9 780199378326.003.0009
Original source
Impact of Analytics and Big Data on Corporate Culture and Recruitment From Big Data to Big Profits, 184-201 doi:10.1093/acprof:oso/9780199378 326.003.0009