HEALTH-COP COMPANY
Network and Workflow Description
Data mining is a complex process that involves several activities undertaken sequentially for the entire process to be successful. As such, there are specific protocols that must be followed in data mining. The desired goals and objectives are the guiding principles upon which the type of data to be analyzed is identified. The main goal for Health-Cop is to establish links between diet composition and health issues. More specifically, the company will focus on analysis of data from various health facilities, websites, databases and health journals. The analysis is intended to provide new forms of data that can be interpreted to give meaningful patterns. To facilitate the process of data mining, there are several aspects that must be considered such as: statistics, clustering of data, rules of association, data classification, visualization and the decision tree.
Network Description
Health Cop company will set up is network system using both the windows and Linux based operating system. The company will have 10 desktop computers and 5 portable computers. The 10 desktop computers will be connected together via a metered Wi-Fi service. The desktops will be the main engine of the company. All the desktops will be configured with an algorithm that constantly searches for specific keywords from various databases. The portables computers will be connected to the internet via modems. A modem is much safer since it limits the connectivity to only the device being used. Internet connectivity via modem is facilitated through local area networks (LAN), through to the service providers, (Cui, et.al., 2016). Multiple firewalls are set up within the company networks to sort out undesired data traffic from the local network on the computer devices. Comment by Mark O'Connell: ? Comment by Mark O'Connell: Think you mean Wifi??
The most suitable firewall for the network would be a layer 3 open systems interconnection (OSI) model, which guarantees maximum security to the local network systems, (Greenberg, et.al., 2016). This type of models is well designed to suit communication within several computers in a standard network system. The network router will generate IP addresses whose packets will be used to launch communication between the computers used. All the data mined through configured search engine will be relayed to the devices whose IP addresses are saved in the network. Since all computers will be assignsed specific IP addresses, resource sharing and data transfer will be effectively done via the network systems. Comment by Mark O'Connell: ? routers don’t “generate” IP addresses. Not sure what you mean. Comment by Mark O'Connell: No. The Windows OS has the TCP/IP that generates packets Comment by Mark O'Connell: What devices? What is transferring data to what? Comment by Mark O'Connell: I’m getting lost here
The network architecture will be designed with two firewalls configured into a two firewall demilitarized zone (DMZ). The DMZ is located on a neutral level that serves as the linkage and contact point between Health-Cop network systems and the internet. This is very crucial for maintaining maximum network security. This kind of security protocol also ensures that company networks are not less exposed vulnerable to any threat that may be launched via internet. Health-cop’s domain name servers will remain secured and thus the process of translating IP addresses will be much effective. Interpretation and translation of IP addresses by the company DNS servers facilitates retrieval of data from the internet. Comment by Mark O'Connell: vulnerable Comment by Mark O'Connell: If you only have laptops you are probably using regional DNS servers outside your network from Windows on your PCs
The switch devices have been incorporated in the network architecture since Health-cop networks will be configured on the Windows server system. Since the system is configured with layer 3 OSI, enterprise level uses must be used for the process of packet routing to be successfully executed, (Shatri, et.al., 2017). This facilitates the transfer of data packets to different computers in the company system. Switches are much better compared to hubs in a network because they only relay data packets to a specific MAC address destination. Windows server system also requires routers to be installed and configured in the system to facilitate the application of virtual private network (VPN) devices. Comment by Mark O'Connell: ? Comment by Mark O'Connell: What? Enterprise uses must be used… Comment by Mark O'Connell: Not sure what you’re getting at with this Layer 3 OSI issue. Layer 3 devices use focused IP routing whereas Layer 2 device are in broadcast mode to all devices on the network. Your network is tiny.
Image result for describing network architecture diagram example
Figure 1 Network Architecture
The network incorporates a backbone to host the two switches that have been designed in the network system. The backbone also facilitates effective communication between multiple devices operating within the Health-Cop network systems.
Work flow
The first stage of the data mining process will be data collection. The search algorithms will be configured to detect certain key words from the databases analyzed. The collected data will then be stored on the network and into the computer drives. Once the data is stored on the data, the data will be subjected to screening and cleansing procedures on the network systems. Afterwards, the data will be classified according to different clusters and patterns identified through the analysis, (Adamuz-Hinojosa, et.al., 2018). Comment by Mark O'Connell: ? I assume when a Keyword is found – some volume of data gets extracted from the databases being searched Comment by Mark O'Connell: What? “Once the data is stored on the data…”
Image result for big data mining network diagram
Figure 2Network diagram
The analytics will also involve regression predictions and outlier identification procedures. Finally, the data will be sorted and the relevant data sent to the network optimization unit while the irrelevant data will be directed back to the regression n analysis. The relevant data will be interpreted in the network optimization units and used to create patterns that explain the links between a certain dietary behavior with a specific lifestyle disease. Comment by Mark O'Connell: Regression analysis in order to make predictions Comment by Mark O'Connell: What? Makes no sense Comment by Mark O'Connell: excellent
In conclusion, the company’s network is designed to source for voluptuous data from various internet sources, databases and the cloud, and come up with relevant patterns that would be used to facilitate data analysis and reporting. The entire process must follow the outlined protocol for success to be achieved. Comment by Mark O'Connell: WRONG WORD
References
Adamuz-Hinojosa, O., Ordonez-Luciana, J., Ameigeiras, P., Ramos-Munoz, J. J., Lopez, D., & Folgueira, J. (2018). Automated network service scaling in nfv: Concepts, mechanisms and scaling workflow. IEEE Communications Magazine, 56(7), 162-169.
Cui, L., Yu, F. R., & Yan, Q. (2016). When big data meets software-defined networking: SDN for big data and big data for SDN. IEEE network, 30(1), 58-65.
Greenberg, A., Lahiri, P., Maltz, D. A., Patel, P. K., Sengupta, S., Jain, N., & Kim, C. (2016). U.S. Patent No. 9,497,039. Washington, DC: U.S. Patent and Trademark Office.
Shatri, V., Kurtaj, L., & Limani, I. (2017, May). Hardware-in-the-loop architecture with MATLAB/Simulink and QuaRC for rapid prototyping of CMAC neural network controller for ball-and-beam plant. In 2017 40th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO) (pp. 1201-1206). IEEE.