Saeed, F., Mohammed, F., & Gazem, N. (Eds.). (2019). Emerging Trends in Intelligent Computing and Informatics: Data Science, Intelligent Information Systems and Smart Computing (Vol. 1073). Springer Nature.

According to Saeed, Mohammed, and Gazem (2019), the role data plays in human livelihood is difficult to overstate. Few people in the 21st century would dispute the impact intelligent computing and informatics have had on their lives, since virtually everything today is connected to information systems. It is hard to imagine a world without technology, because almost everything that sustains humanity requires some input from it. To put the role technology has played, and continues to play, in human livelihood and in business into context, it helps to look back briefly. For centuries the idea of an airplane was regarded as fantasy, and only a few pioneers, such as the Wright brothers at the start of the 20th century, could imagine it becoming real. Today aviation is one of the leading industries in the transport sector and has opened the world to vast opportunities. The same can be said of many other success stories and revolutions that technology has brought to today’s world order (Banister, 2007). The late Steve Jobs, co-founder and former CEO of Apple, argued that in the near future technology would become an integral part of the community and that its spread would be hard to resist. Jobs’s sentiments have been shared by other prominent figures such as Elon Musk, Mark Zuckerberg, Warren Buffett, and Bill Gates. They all point to a future in which even the business world will be unable to do without cutting-edge developments such as digital twin technology, digital ethics and privacy, and automation and AI-driven development.

Digital Twin Technology

Before the advent of digital twin technology, industries such as manufacturing struggled to raise operational efficiency so that products and services would meet international standards and manufacturing would remain sustainable. Failures along the manufacturing line were also hard to detect and diagnose, because there was no way to monitor data from each manufacturing point and combine it to reach well-informed decisions (Saeed, Mohammed, & Gazem, 2019). These challenges have been greatly eased by digital twin (DT) technology, which has streamlined the manufacturing sector. With it, sensors and data transmission equipment can be installed at every point of the production line to collect and analyze critical process data. From the collected data, analysts can determine where adjustments would optimize the whole manufacturing line. The technology thus addresses the problem of bringing data and data analytics to bear on the entire manufacturing division.
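
The idea of collecting readings at each point of a line and using them to find where an adjustment would help can be sketched in a few lines. This is only an illustrative sketch in Python, not a method taken from the cited source: the station names, the cycle-time readings, and the helper functions `mean_cycle_times` and `find_bottleneck` are all hypothetical.

```python
from statistics import mean

# Hypothetical sensor readings: (station, seconds taken per unit).
# Station names and values are invented for illustration only.
readings = [
    ("cutting", 41.0), ("welding", 58.5), ("painting", 47.2),
    ("cutting", 39.5), ("welding", 61.0), ("painting", 46.8),
]

def mean_cycle_times(samples):
    """Group readings by station and average the cycle time of each."""
    by_station = {}
    for station, seconds in samples:
        by_station.setdefault(station, []).append(seconds)
    return {station: mean(values) for station, values in by_station.items()}

def find_bottleneck(samples):
    """Return the station with the highest average cycle time."""
    averages = mean_cycle_times(samples)
    return max(averages, key=averages.get)

print(find_bottleneck(readings))
```

In a real digital twin the readings would stream from the line continuously and feed a simulation model; the sketch only shows the basic step of turning per-station data into a decision about where to look first.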

Today it is possible, through simulation, to integrate almost all physical and virtual features of a system, and this can be done for an entire manufacturing line. For instance, if Boeing experiences a problem with one of its aircraft, as it did with the Boeing 737 MAX, it can go back through the whole production cycle and determine where the problem may have arisen. This is made possible by the data analytics that DT technology facilitates. Although one of the challenges of smart manufacturing has been finding a seamless way to integrate physical and virtual spaces, progress appears to be underway, especially in simulation, data communication, and data acquisition (Saeed, Mohammed, & Gazem, 2019). Moreover, in 2017 the manufacturing industry in the US and China saw a rise in overall returns, with estimates putting the value of the manufacturing industry globally at close to 12 trillion US dollars. This figure represents a 12.56% rise over the figure reported in 2007.

Digital Ethics and Privacy

With the continuous advancement of technology, issues such as cybercrime and data privacy have become a great concern to many players, particularly those handling large volumes of data. Cybercrime has risen steadily over the last ten years, with estimates putting its total global cost at a staggering one trillion US dollars a year. The figures alone show why cybercrime is a concern to the world. The issue of data security and compromised privacy was evident recently in the case in which Facebook was accused of improperly sharing its users’ private data with Cambridge Analytica. The case caused an uproar, with many people accusing Facebook’s management of compromising the integrity of their data. That is just one example. To raise the stakes, consider what would happen if the launch codes for the nuclear weapons of the US, Russia, or China were compromised. The aftermath could be catastrophic. For this reason and many more, digital ethics and privacy technology aims to reduce instances of data being compromised and, at the same time, to develop technologies that keep cybercrime under control, if not eradicate it entirely. In addition, the challenge that digital ethics and data privacy technologies such as data masking face is the dynamism needed to deal with different forms of replicating

Automation and AI Development Technologies

Automation has changed, and continues to change, manufacturing and business significantly. Before the invention of autonomous robots and artificial intelligence (AI), manufacturing and even service delivery in some industries and companies were considerably slower. For instance, before Toyota introduced autonomous robots in its production factory, the company was able to assemble only 15 cars a day. With AI technology and robotics in play, the company has greatly improved its efficiency and raised the number of cars assembled from 15 to 140 a day. It follows that the fusion of technology with business and human life has created more opportunities and improved the way of life.

Data Governance and Embedded Data Encryption

Many institutions face the pressing challenge of protecting the data they hold against external intrusion. Data is crucial to any organization, and firms spend millions of dollars on security systems only to realize that vulnerability to cyber-attack is not the only problem they face. It is therefore advisable to review all security systems regularly, from households to firms, to ensure that no problems are hidden behind the visible threat. Trends in data analytics and business intelligence include data security and embedded data encryption, which help to deal with the challenges of data insecurity and integrity (Saeed, Mohammed, & Gazem, 2019). The use of technological solutions such as data masking, data encryption, and data resilience systems, which prevent unauthorized access to a company’s data, is also continually increasing as a way to combat data insecurity amid the emergence of Big Data.

Data masking is one of the effective technologies widely used in the corporate world to secure a company’s data. Data masking, also known as data obfuscation, is the process of hiding a company’s original data behind modified content, so that the real values cannot be read without the steps needed to recover them. The main reason for masking is to protect data that requires top-level security. For instance, the Coca-Cola Company’s formula is classified information that the company would not want to fall into the wrong hands, least of all those of its competitors, since that could put it out of business. For this solution to be considered effective, it must consistently provide top-level security to the data whenever the system is tested.

The data masking process involves several steps. The first step is finding the data: identifying the data that is classified and separating it from the data that is not. This is usually carried out by the company’s chief information specialist or data security analyst, who puts together a detailed list of all classified data that needs to be protected from unauthorized access. The second step is assessing the situation. At this stage the company needs oversight from its security administrator on the current state of its data security, since this determines the type of data masking technique to adopt (Saeed, Mohammed, & Gazem, 2019). He or she will also suggest the best location for the data and the level to which the data should be masked. The third step is implementing the masking: once the situation has been assessed and the data that requires masking has been found, the company implements the data masking technique it finds appropriate. For a big organization that deals with large volumes of data, it is not feasible to assume that a single, easy-to-incorporate system can be applied to all of the company’s data (Saeed, Mohammed, & Gazem, 2019). Rather, implementation requires effective and proper planning so that no data is lost in the process and the data as a whole is secured. The last step is testing the masking results, since testing allows leaks to be identified early and remedied. Testing confirms that the masking configuration yields the expected outcome. If it does not, the database administrator (DBA) restores the database to its pre-mask state, tweaks the masking algorithms, and repeats the data masking process from the start. Common data masking techniques include encryption, character scrambling, nulling out or deletion, number and date variance, substitution, and shuffling.
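
Several of the techniques listed above can be illustrated with a short sketch. The Python below is only one possible reading of substitution, shuffling, character scrambling, nulling out, and number variance; the record fields, the stand-in names, the 10% spread, and all function names are assumptions made for the example and do not come from the cited source.

```python
import random

def substitute(replacements, rng):
    """Substitution: use a realistic stand-in instead of the real value."""
    return rng.choice(replacements)

def shuffle_column(values, rng):
    """Shuffling: keep the same values but reassign them across rows."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled

def scramble_characters(text, rng):
    """Character scrambling: reorder the characters of a value."""
    chars = list(text)
    rng.shuffle(chars)
    return "".join(chars)

def null_out(_value):
    """Nulling out: remove the value entirely."""
    return None

def add_variance(number, rng, spread=0.10):
    """Number variance: shift a number by up to +/- spread (10% here)."""
    return round(number * (1 + rng.uniform(-spread, spread)), 2)

def mask_record(record, rng):
    """Apply one technique per field of a hypothetical employee record."""
    return {
        "name": substitute(["Alex Doe", "Sam Roe"], rng),
        "account": scramble_characters(record["account"], rng),
        "national_id": null_out(record["national_id"]),
        "salary": add_variance(record["salary"], rng),
    }

rng = random.Random(42)  # fixed seed so the sketch is repeatable
original = {"name": "Jane Smith", "account": "AB12345",
            "national_id": "000-00-0000", "salary": 50000.0}
masked = mask_record(original, rng)
print(masked)
```

The testing step described above would then compare `masked` with `original` to confirm that no protected value survives in readable form before the masked copy is released.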

Data resilience is one of the techniques used to protect data and make it available when it is needed, especially on the production line of a company or factory. Several technologies address data resilience. One is logical replication, a widely used multisystem data resiliency topology for high availability, particularly in IBM environments. Logical replication is normally deployed through a product provided by a high availability independent software vendor (ISV). Replication is carried out in software on objects: changes to the objects are replicated to a backup copy. In addition, most logical replication solutions allow for additional features beyond object replication, which helps make the resulting systems more robust.
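
The core idea of logical replication, recording changes to objects and replaying them on a backup copy, can be shown with a toy model. The Python sketch below is an assumption-laden illustration, not the design of any vendor product: the `ReplicatedStore` class, its in-memory dictionaries, and its journal are all hypothetical.

```python
class ReplicatedStore:
    """Toy primary store that journals every change for a backup copy."""

    def __init__(self):
        self.primary = {}
        self.backup = {}
        self.journal = []  # changes not yet applied to the backup

    def put(self, key, value):
        """Write to the primary and record the change."""
        self.primary[key] = value
        self.journal.append(("put", key, value))

    def delete(self, key):
        """Remove from the primary and record the change."""
        self.primary.pop(key, None)
        self.journal.append(("delete", key, None))

    def replicate(self):
        """Replay journaled changes on the backup in order, then clear them."""
        applied = len(self.journal)
        for operation, key, value in self.journal:
            if operation == "put":
                self.backup[key] = value
            else:
                self.backup.pop(key, None)
        self.journal.clear()
        return applied

store = ReplicatedStore()
store.put("order-1", "pending")
store.put("order-1", "shipped")
store.put("order-2", "pending")
store.delete("order-2")
changes_applied = store.replicate()
print(changes_applied, store.backup)
```

Because the journal is replayed in the order the changes were made, the backup ends in the same state as the primary; a production system would also have to handle network failures and changes that arrive while replication is running.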

Data encryption is one of the rapidly spreading technologies that many companies use to keep their highly classified data from being misused by people who are not meant to handle it. With encryption, the data is guarded by embedded security layers and can only be read using the right key (Saeed, Mohammed, & Gazem, 2019). This technique, which has been adopted by many large data firms, has helped to safeguard data integrity and reliability.
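
The principle that encrypted data is readable only with the right key, and that tampering can be detected, can be demonstrated with Python’s standard library. The sketch below is a teaching toy built from HMAC-SHA256 and is an assumption of this illustration, not a scheme described in the cited source; it must not be used to protect real data, for which a vetted cryptographic library is the appropriate choice. All function names are hypothetical.

```python
import hashlib
import hmac
import secrets

def _subkey(key, label):
    """Derive a separate key for each purpose from the master key."""
    return hmac.new(key, label, hashlib.sha256).digest()

def _keystream(key, nonce, length):
    """Produce `length` key-dependent bytes from the key and nonce."""
    stream = b""
    counter = 0
    while len(stream) < length:
        block = nonce + counter.to_bytes(8, "big")
        stream += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return stream[:length]

def encrypt(key, plaintext):
    """Return nonce + ciphertext + integrity tag."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(_subkey(key, b"enc"), nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(_subkey(key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag

def decrypt(key, message):
    """Check the tag first; refuse to decrypt if the key or data is wrong."""
    nonce, ciphertext, tag = message[:16], message[16:-32], message[-32:]
    expected = hmac.new(_subkey(key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered data")
    stream = _keystream(_subkey(key, b"enc"), nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))

key = secrets.token_bytes(32)
sealed = encrypt(key, b"classified formula")
print(decrypt(key, sealed))
```

The integrity tag is what connects encryption to the data integrity mentioned above: a holder of the wrong key, or anyone who alters the stored bytes, gets an error instead of silently corrupted data.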

Big Data conditioning also affects the quality of the final data obtained from any raw data collected. The conditions applied in data analysis depend largely on the data type. For example, data can be classified by gender, clustering records as either female or male, which in an R script can be abbreviated as (‘gender’, ‘F’) or (‘gender’, ‘M’). Such conditions help to group the data into recognizable, easily understood clusters for analysis and visualization (Saeed, Mohammed, & Gazem, 2019). Conditions can also be based on nationality, economic status, or religion. Data types readily recognized in the R programming language include integer, numeric, logical, and factor. Alternatively, data can be classified as discrete or continuous: discrete data take countable values such as 1, 2, and 3, while continuous data can take any value within a range and so have infinitely many possibilities. R data structures include vectors, matrices, and data frames, and these structures and types have proven very useful in clustering and data analytics. Although R is widely preferred for simulation, the size of data it can handle is limited: R scripts rely on libraries that are restricted to 32-bit integers, which means that some vectors and indices are constrained to the 32-bit range. Data frames can also run out of memory while an R script is executing, even on PCs with a large amount of memory. Lastly, common data issues faced from a programming standpoint in any type of data analysis and visualization include missing values, leading and trailing spaces that interrupt the flow of data, and dates that are not interpreted correctly.
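
The conditioning issues named above (grouping by a category such as gender, missing values, leading and trailing spaces, and badly interpreted dates) can be sketched briefly. The essay discusses R; the example below uses Python instead, and the sample rows, the two assumed date formats, and the helper names are all hypothetical.

```python
from datetime import datetime

# Invented raw rows showing stray spaces, a missing age, and mixed dates.
raw_rows = [
    {"gender": " F ", "age": "34", "joined": "2019-03-01"},
    {"gender": "M", "age": "", "joined": "01/04/2019"},
    {"gender": "f", "age": "29", "joined": "not recorded"},
]

DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")  # assumed input formats

def parse_date(text):
    """Try each known format; return None if the date cannot be read."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None

def condition(row):
    """Trim spaces, unify case, and mark missing values explicitly as None."""
    age_text = row["age"].strip()
    return {
        "gender": row["gender"].strip().upper(),
        "age": int(age_text) if age_text else None,
        "joined": parse_date(row["joined"]),
    }

def group_by(rows, column):
    """Cluster rows by the value found in one column."""
    groups = {}
    for row in rows:
        groups.setdefault(row[column], []).append(row)
    return groups

clean_rows = [condition(row) for row in raw_rows]
by_gender = group_by(clean_rows, "gender")
print({gender: len(rows) for gender, rows in by_gender.items()})
```

Without the trimming and case step, " F ", "F", and "f" would form three separate clusters, which is exactly the kind of interruption to the flow of data the paragraph describes.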

Data cleaning is as important as data conditioning, since not all raw data, whether collected or mined, is useful. Data of little significance needs to be filtered out so that only the vital data is left for analysis and visualization. Data cleaning therefore involves the back-and-forth work of converting raw data into reliable figures that can support reasonable decisions. Some of the biggest data firms, such as Facebook and Amazon, receive such large volumes of data on their servers that analysis can be a problem unless the data is first cleaned and properly structured (Saeed, Mohammed, & Gazem, 2019). Cleaning also increases the reliability of the data and improves the quality of its content. Data cleaning techniques used in R include removing duplicates, so that the remaining data is free of repetition that might reduce its reliability and integrity. In addition, checking the data for errors, normalizing it, and fixing or imputing values helps to convert raw data from being technically correct to being consistent. Another technique is error highlighting, which flags problem entries, or extra spaces in the data that consume unnecessary space, so that they can be corrected. These techniques sieve the data so that the subsequent analysis and visualization are effective and produce reliable results.
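
Three of the cleaning techniques described above (normalizing, removing duplicates, and highlighting errors) can be combined in one pass. The essay refers to R; this sketch uses Python instead, and the record fields, the two validity rules, and the function names are assumptions chosen only for illustration.

```python
def normalize(record):
    """Trim stray spaces and unify letter case so equal values compare equal."""
    return {
        "email": record["email"].strip().lower(),
        "country": record["country"].strip().title(),
    }

def find_errors(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    if "@" not in record["email"]:
        problems.append("email has no @")
    if not record["country"]:
        problems.append("country is missing")
    return problems

def clean(records):
    """Normalize, drop duplicates, and set aside records with errors."""
    seen = set()
    kept, flagged = [], []
    for record in map(normalize, records):
        key = (record["email"], record["country"])
        if key in seen:
            continue  # duplicate of a record already processed
        seen.add(key)
        problems = find_errors(record)
        if problems:
            flagged.append((record, problems))
        else:
            kept.append(record)
    return kept, flagged

raw = [
    {"email": "Ann@Example.com ", "country": "kenya"},
    {"email": "ann@example.com", "country": "Kenya "},
    {"email": "bob.example.com", "country": "uganda"},
    {"email": "cy@example.com", "country": "  "},
]
kept, flagged = clean(raw)
print(len(kept), "kept;", len(flagged), "flagged")
```

Normalizing before checking for duplicates matters here: the first two rows differ only in case and spacing, so they are recognized as the same record only after normalization. Flagged records are set aside rather than deleted, so they can be fixed or imputed later.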

Among the most effective and widely used statistical tools for analyzing and visualizing data is the Statistical Package for the Social Sciences (SPSS). This tool makes it possible to compute and compile various descriptive statistics and to perform both parametric and non-parametric analyses. Its limitation is that it struggles with very wide-ranging data, so data cleaning and conditioning are necessary to improve the reliability and accuracy of the data being analyzed (Romero & Ventura, 2013). Another tool is R, a language and environment for statistical computing. Although widely preferred, it has its own limitations, especially in the size of data it can handle, as mentioned above. Other statistical tools include Microsoft Excel and MATLAB, which are equally capable of detailed analysis of any type of data as long as it is numerical.
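
The descriptive statistics that such tools compile can also be computed directly. The sketch below uses Python’s standard `statistics` module rather than SPSS or R; the `describe` helper and the sample scores are hypothetical and serve only to show what a basic descriptive summary contains.

```python
from statistics import mean, median, stdev

def describe(values):
    """Compile a small descriptive summary of numerical data."""
    ordered = sorted(values)
    return {
        "n": len(ordered),
        "mean": mean(ordered),
        "median": median(ordered),
        "stdev": stdev(ordered),  # sample standard deviation
        "min": ordered[0],
        "max": ordered[-1],
    }

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # invented sample data
summary = describe(scores)
for name, value in summary.items():
    print(name, value)
```

As the paragraph notes, a summary like this is only meaningful for numerical data that has already been cleaned; a single mistyped value would distort the mean and the standard deviation.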

The results obtained indicate that statistical analyses depend largely on the type of data being scrutinized and the way the data has been classified. The data finally generated was properly structured and organized compared with the original raw data. There are ways in which data can be subtly misrepresented, especially when cleaning and analysis are performed on the same data. This could result from negligence or from the wrong techniques being used in the analysis; otherwise, misrepresentation should not be expected.

In conclusion, data science is vital today, whether in business, technology, or medical research, since quality data helps in drawing sound conclusions and deciding what steps to take to improve the situation at hand. All of this is possible only if the approaches used in the analysis are appropriate and suited to the context. Further, the role of business intelligence in shaping the future of businesses and of various sectors of the economy cannot be overstated. Although data analytics is not new, its continued evolution, especially with the advent of Big Data, has proven very helpful in analyzing vast amounts of data and making it possible to extract information from them.