Running head: MEASURES TO REDUCE DATA DUMP IN MINIMAL TIME 1
DOD-ETL: distributed on-demand ETL for near real-time business intelligence
Vamsi Krishna Sanneboina
Monroe College
KG604: Graduate Research & Critical Analysis
Professor Manya Bouteneff
9 Nov 2022
Measures to Reduce Data Dump in Minimal Time
DOD-ETL: distributed on-demand ETL for near real-time business intelligence
1. Researcher(s)
The article was written by four authors whose surnames are Machado, Cunha, Pereira, and
Oliveira. One of the researchers is a reliable source affiliated with the Department of
Computer Science, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil.
2. Purpose
The purpose was to establish and understand a framework with which Business Intelligence (BI)
can operate in near real time. The study reviewed BI and the Extract, Transform, Load (ETL)
process. The researchers wanted to develop a near real-time ETL solution and implement it as
distributed on-demand (DOD) ETL. The study proposed DOD-ETL as a technology that can achieve
near real-time ETL through a combination of multiple strategies. Finally, the study compares
DOD-ETL with related works.
3. Date of Data Collection
Data for the study was collected in 2019, and the article was published open access.
4. Place of Data Collection
The research was conducted at a higher learning institution in Brazil, the Universidade
Federal de Minas Gerais, Belo Horizonte.
5. Method of Data Collection
The data was collected from secondary sources to solve the problems of the research
including integration of data sources, mastering data overheads, degradation of performance, and
backing up data. The study also drew on several other publications covering Stream
Processing frameworks that help to solve real-time ETL.
6. Findings
The experiments in the study revealed that DOD-ETL significantly increases the speed of Spark.
DOD-ETL includes the In-Memory Table Updater data dump from the Message Queue but is still
able to process data at much higher rates than the baseline. The study also showed that DOD-ETL
customizations have no negative impact on the fault tolerance and scalability of Spark
Streaming. In other words, DOD-ETL techniques and strategies help to reduce the run time of
ETL. This model outperforms a modern framework for Stream Processing.
Objective Summary 3
The study was conducted by Oliveira, a researcher from the Universidade Federal de Minas
Gerais, Belo Horizonte, Brazil, alongside Machado, Cunha, and Pereira, and was published
open access in 2019. The research was conducted to discover how near real-time BI can be
established.
The study was conducted using secondary sources of data that present various solutions
for technological issues. Some of the problems included the integration of data sources, backup,
and mastering data overheads. The researchers also considered publications that contain
information on Stream Processing.
DOD-ETL has a higher rate of data processing than the baseline, and it performs this well
even though it also encompasses an In-Memory Table Updater data dump originating from the
Message Queue. The study also showed that DOD-ETL customizations have no negative effect on
either the fault tolerance or the scalability of Spark Streaming. DOD-ETL techniques and
strategies help to reduce the run time of ETL. This model outperforms a modern framework for
Stream Processing.
Reference
Machado, G. V., Cunha, Í., Pereira, A., & Oliveira, L. B. (2019). DOD-ETL: distributed on-
demand ETL for near real-time business intelligence. Journal of Internet Services and
Applications, 10(1), 1-15. https://link.springer.com/article/10.1186/s13174-019-0121-z