week7 discussion

binghanem

Week 7 DISCUSSION 3

Alina:

If there is one thing the past decade has afforded us, it is the ability to collect, share, and save data in large quantities. You only have to look at how technology has evolved in recent years to see how far it has come. Most businesses originated in times when the only means of collecting and storing data involved paper invoices and bins. Ninety percent of the data generated today is unstructured data (Marr, 2019). Unstructured data is simply data that does not have a recognizable structure, leaving it raw and unorganized; it can be either textual or non-textual. An example of unstructured textual data would be an email. Structured and unstructured data are stored and handled differently. A data lake is a pool that contains both structured and unstructured data that is generally stored as is without any specific purpose in mind, and it can be built on several technologies, one of them being Hadoop (Dennis, 2018). The opposite of unstructured data is structured data, which is stored within a data warehouse rather than the data lake that unstructured data is thrown into. The data warehouse architect is responsible for designing data warehouse solutions and working with conventional data warehouse technologies to come up with plans that best support a business or organization (Dennis, 2018). Analytical tools for analyzing data are a necessary cost when running an organization or business. Because the bulk of the data generated today is unstructured, organizations must find ways to analyze that data to make crucial decisions (Marr, 2019). These decisions are driven by the possibilities data has to offer, whether it comes from customer behavior or from financial data pertinent to the business.

Many scenarios could describe unstructured data. Structured data is far easier to work with than unstructured data. Unstructured data files include text and multimedia content: things like email messages, Word documents, videos, photos, audio files, presentations, webpages, and other highly used business documents. A business that has invested in analytical tools could easily view that unstructured data. A company receives a great deal of feedback, and none of it fits into a nice little box that can be quickly searched, so there needs to be another way. Unstructured data is increasingly significant in technology; a lot of it is human-generated, and that information becomes useful to businesses trying to sell products to those people. Businesses look at email messages and social media responses that include video and audio; sometimes this can be thrown into a data lake and collected for further analysis. Technology is essential, and today it seems like we use it all day, every day. In order to succeed, businesses need to be able to collect and analyze this data.

Dennis, A. (2018). Data Lakes 101: An Overview. Retrieved June 23, 2020, from https://www.dataversity.net/data-lakes-101-overview/

Marr, B. (2019). What Is Unstructured Data And Why Is It So Important To Businesses? An Easy Explanation For Anyone. Retrieved June 23, 2020, from https://www.forbes.com/sites/bernardmarr/2019/10/16/what-is-unstructured-data-and-why-is-it-so-important-to-businesses-an-easy-explanation-for-anyone/

Madison:

There is a difference between structured data and unstructured data in analytics, and many jobs use both of them together. An article stated, “Structured data is highly-organized and formatted in a way so it's easily searchable in relational databases” (Pickell, 2018). It is easier to recognize than unstructured data. The article also mentioned, “Unstructured data has no pre-defined format or organization, making it much more difficult to collect, process, and analyze” (Pickell, 2018). Unstructured data is messier and harder to search. It can come from pictures, videos, audio transmissions, sensors, social media feeds, and disorganized data from the web (Shacklett, 2017). To determine whether data is unstructured in the first place, there must be programs and analytics tools, such as data mining, that can identify the data and its sources. The point of a multiplatform data architecture (MDA) is to give users options for capturing, storing, integrating, and processing rapidly diversifying data (Russom, 2018). A data warehouse can also be used to evaluate unstructured data, with the structured sources helping to inform decisions. Also, data lakes help store data for applications: “Data Lakes allow you to store relational data like operational databases and data from line of business applications, and non-relational data like mobile apps, IoT devices, and social media” (Fontichiaro, 2018). Data lakes are better suited to data from social media and applications on devices. Finally, Hadoop is also involved with unstructured data. Hadoop is an open-source software framework that stores data and runs applications on clusters of commodity hardware (“What is Hadoop?,” n.d.).
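To make the difference in searchability concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `orders` rows, the `emails` list, and the helper names `total_spent` and `emails_mentioning` are all made up for this example and do not come from any of the cited articles. It uses Python's built-in `sqlite3` module as a stand-in for a relational database.

```python
import sqlite3

# Hypothetical structured data: every row has the same named, typed columns.
orders = [
    (1, "Avery", 42.50),
    (2, "Blake", 19.99),
    (3, "Avery", 7.25),
]

# Hypothetical unstructured data: free-form text with no predefined fields.
emails = [
    "Hi, my order arrived late and the box was damaged.",
    "Thanks for the quick delivery, everything looks great!",
    "Can I get a refund? The item was damaged.",
]


def total_spent(customer):
    """Structured search: one relational query over named columns."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
    row = conn.execute(
        "SELECT SUM(amount) FROM orders WHERE customer = ?", (customer,)
    ).fetchone()
    conn.close()
    return row[0]  # None if the customer has no orders


def emails_mentioning(keyword):
    """Unstructured search: scan the full text of every message."""
    return [text for text in emails if keyword.lower() in text.lower()]


if __name__ == "__main__":
    print(total_spent("Avery"))
    print(emails_mentioning("damaged"))
```

The structured query can ask a precise question because the columns are known in advance. The email search can only look for matching words, which is part of why unstructured data is described as harder to process and analyze.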

There are many scenarios where companies may need unstructured data for a specific purpose. One scenario that I came up with and researched is healthcare, where workers rely on unstructured data in the workplace. Healthcare uses unstructured data quite frequently in the form of photos, audio, videos, and medical images. An example would be that a technician can describe an image using codes or keywords, and that metadata could be entered into the computing system, adding information and making the image searchable (“The Challenges,” 2019). As this example shows, technicians can use structured data to help make sense of unstructured data.
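The tagging idea in this example can be sketched in a few lines of Python. This is a rough illustration, not how any real hospital system works: the `ImageRecord` class, the `tag_image` and `search` helpers, and the file names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ImageRecord:
    """An image file (unstructured) plus keyword metadata (structured)."""
    filename: str
    keywords: set = field(default_factory=set)


def tag_image(record, *keywords):
    """Attach lowercase keywords, as a technician might when describing an image."""
    record.keywords.update(k.lower() for k in keywords)


def search(records, keyword):
    """Return the file names of all images tagged with the given keyword."""
    return [r.filename for r in records if keyword.lower() in r.keywords]


# Hypothetical image files; the pixels themselves are never inspected.
records = [ImageRecord("scan_001.png"), ImageRecord("scan_002.png")]
tag_image(records[0], "chest", "x-ray")
tag_image(records[1], "knee", "MRI")

if __name__ == "__main__":
    print(search(records, "x-ray"))
```

The search never looks inside the image. It only works because someone added structured keywords first, which is the point of the healthcare example.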

 

References

Fontichiaro, K. (2018). Big data. Retrieved from https://aws.amazon.com/big-data/datalakes-and-analytics/what-is-a-data-lake/

Pickell, D. (2018, November 16). Structured vs Unstructured Data – What's the Difference? Retrieved from https://learn.g2.com/structured-vs-unstructured-data

Russom, P. (2018, October 10). Multiplatform Data Architectures. Retrieved from https://tdwi.org/webcasts/2018/10/diq-all-multiplatform-data-architectures.aspx

Shacklett, M. (2017, July 15). Unstructured data: A cheat sheet. Retrieved from https://www.techrepublic.com/article/unstructured-data-the-smart-persons-guide/

The Challenges of Unstructured Healthcare Data. (2019, May 3). Retrieved from https://www.carevoyance.com/blog/unstructured-data-healthcare

What is Hadoop? (n.d.). Retrieved June 23, 2020, from https://www.sas.com/en_us/insights/big-data/hadoop.html