Intro To Data Research Paper

1. Web Mining – Web mining is the application of data mining to discover patterns in data from the web. It falls into three categories: content mining, structure mining, and usage mining. Content mining extracts patterns from the content of web pages, including the data collected by search engines. Structure mining examines the link structure within and between websites, while usage mining examines records of user activity, such as browsing sessions and server logs. The data collected through web mining is evaluated and analyzed using techniques like clustering, classification, and association. It is a good thesis topic in data mining.

2. Predictive Analytics – Predictive analytics is a set of statistical techniques that analyze current and historical data to predict future events. The techniques include predictive modeling, machine learning, and data mining. In large organizations, predictive analytics helps identify business risks and opportunities. Both structured and unstructured data are analyzed to detect patterns. Predictive analytics is a lengthy process consisting of seven stages: project definition, data collection, data analysis, statistics, modeling, deployment, and monitoring. It is an excellent choice for research and a thesis.
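
As a minimal sketch of the modeling stage, the Python below fits a straight line to historical values with ordinary least squares and extrapolates one step ahead. The monthly sales figures and the names fit_line and predict are hypothetical, chosen only for illustration; a real project would normally rely on a statistics or machine learning library and far richer data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Evaluate the fitted line at a new x value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical historical data: month index and units sold.
months = [1, 2, 3, 4, 5]
sales = [10.0, 12.0, 14.0, 16.0, 18.0]

model = fit_line(months, sales)
forecast = predict(model, 6)  # extrapolate to the next month
print(f"slope={model[0]:.2f} intercept={model[1]:.2f} forecast={forecast:.2f}")
```

The monitoring stage would then compare such forecasts against the values that actually arrive and refit the model when the error grows.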

3. Oracle Data Mining – Oracle Data Mining, also referred to as ODM, is a component of the Oracle Advanced Analytics option for Oracle Database. It provides data mining algorithms that help data analysts gain insights from data and make predictions. It helps in predicting customer behavior, which in turn supports targeting the best customers and cross-selling. The algorithms are exposed through SQL functions that mine data tables and views. It is also a good choice for a thesis or research in data mining and databases.

4. Clustering – Clustering is a process in which data objects are divided into meaningful sub-classes known as clusters. Objects with similar characteristics are grouped together in a cluster. There are distinct models of clustering, such as centroid-based and distribution-based. In centroid-based clustering, each cluster is represented by a central vector (its centroid). Clustering has various applications in data mining, such as market research, image processing, and data analysis. It is also used in credit card fraud detection.
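
Centroid-based clustering can be sketched with k-means (Lloyd's algorithm), which alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its members. The sample points, the starting centroids, and the function name kmeans are assumptions made for illustration; this is a small teaching sketch, not a production implementation.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = [tuple(c) for c in centroids]
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members)))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, clusters


# Hypothetical 2-D data with two visibly separate groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(1, 1), (8, 8)])
for centre, members in zip(centroids, clusters):
    print(centre, members)
```

The result depends on the starting centroids, so in practice the algorithm is usually run several times from different starting points.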

5. Text Mining – Text mining, or text data mining, is the process of extracting high-quality information from text. It works by identifying patterns and trends through statistical pattern learning. First, the input text is structured. Patterns are then derived from the structured data, and finally the output is evaluated and interpreted. The main applications of text mining include competitive intelligence, e-discovery, national security, and social media monitoring. It is a trending thesis topic in data mining.
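
The structure-then-derive-patterns flow can be sketched in a few lines of Python: raw text is tokenized into word counts (the structuring step), and the most frequent terms across documents are reported (a very simple pattern). The sample documents, the tiny stopword list, and the function names are hypothetical; real text mining would use richer preprocessing and statistical models.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stopword list for the example.
STOPWORDS = {"the", "a", "an", "is", "was", "and", "but", "of", "to", "in", "it"}


def tokenize(text):
    """Structure raw text: lowercase it, split into words, drop stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def term_frequencies(documents):
    """One word-count table per document."""
    return [Counter(tokenize(d)) for d in documents]


def top_terms(documents, n=3):
    """Derive a simple pattern: the n most frequent terms overall."""
    total = Counter()
    for tf in term_frequencies(documents):
        total.update(tf)
    return total.most_common(n)


# Hypothetical customer comments.
docs = [
    "The delivery was late and the package was damaged.",
    "Late delivery again, and support was slow to respond.",
    "Great product, but the delivery was late.",
]
top = top_terms(docs, 2)
print(top)
```

Evaluating and interpreting the output is the final step: here an analyst would read the dominant terms as a sign that delivery delays are a recurring theme.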

6. Fraud Detection – Fraud is increasing in sectors like banking, finance, and government, and detecting it accurately is a challenge. Data mining techniques help in anticipating and detecting fraud. Data mining tools can be used to spot patterns and detect fraudulent transactions. Through data mining, the factors leading to fraud can also be determined.
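
One very simple way to spot unusual transactions is to flag amounts that lie far from the mean, measured in standard deviations (a z-score). The sketch below assumes that approach purely for illustration; the transaction amounts, the threshold of 2.0, and the name flag_outliers are hypothetical, and real fraud detection combines many more features and techniques.

```python
import statistics


def flag_outliers(amounts, threshold=2.0):
    """Return (index, amount) pairs whose z-score exceeds the threshold."""
    mean = statistics.mean(amounts)
    stdev = statistics.pstdev(amounts)  # population standard deviation
    if stdev == 0:
        return []  # all amounts identical, nothing stands out
    return [(i, a) for i, a in enumerate(amounts)
            if abs(a - mean) / stdev > threshold]


# Hypothetical card transactions; the last one is far larger than the rest.
amounts = [20, 25, 22, 19, 24, 21, 23, 20, 22, 950]
flagged = flag_outliers(amounts, threshold=2.0)
print(flagged)
```

A flagged transaction is only a candidate for review. A large outlier also inflates the mean and standard deviation, so on small samples a modest threshold or a median-based measure tends to work better.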

7. Data Mining as a Service (DMaaS) – DMaaS is a service for mining data in the cloud. Data can be analyzed interactively in the cloud, and the results can be shared for scientific research. It leverages existing interfaces.

8. Graph Mining – Graph mining is an application of data mining that extracts useful patterns from graphs. The underlying data can be used for classification and clustering. Frequent-subgraph mining algorithms such as GASTON and gSpan are commonly used. Applications of graph mining include biological networks, web data, cheminformatics, and many more. It is a good topic in data mining for a thesis or research.

9. Fuzzy Clustering – Fuzzy clustering is a type of clustering in which a single data point can belong to more than one cluster, each with a degree of membership. In non-fuzzy clustering, a data point belongs to exactly one cluster. Fuzzy clustering finds application in bioinformatics, image analysis, and marketing. It commonly employs the fuzzy c-means algorithm, a soft counterpart of k-means. It is a very challenging thesis topic in data mining.
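
The idea of partial membership can be sketched with the fuzzy counterpart of k-means, usually called fuzzy c-means. Each point receives a membership degree for every cluster, the degrees for one point sum to 1, and cluster centres are weighted means of all points. The one-dimensional sample data, the fuzzifier m = 2, and the function names are assumptions for illustration only.

```python
import math


def memberships(points, centers, m=2.0):
    """u[i][j] is the degree to which point i belongs to cluster j."""
    exponent = 2.0 / (m - 1.0)
    u = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if 0.0 in dists:
            # The point sits exactly on a centre: give it full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            row = [h / sum(hits) for h in hits]
        else:
            row = [1.0 / sum((dj / dk) ** exponent for dk in dists)
                   for dj in dists]
        u.append(row)
    return u


def update_centers(points, u, m=2.0):
    """Each centre is the mean of all points weighted by membership ** m."""
    dim = len(points[0])
    centers = []
    for j in range(len(u[0])):
        weights = [row[j] ** m for row in u]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)))
    return centers


def fuzzy_c_means(points, centers, m=2.0, iterations=20):
    for _ in range(iterations):
        u = memberships(points, centers, m)
        centers = update_centers(points, u, m)
    return centers, memberships(points, centers, m)


# Hypothetical 1-D data; the middle point lies halfway between the two groups.
points = [(1.0,), (2.0,), (5.0,), (8.0,), (9.0,)]
centers, u = fuzzy_c_means(points, centers=[(1.0,), (9.0,)])
for p, row in zip(points, u):
    print(p, [round(x, 3) for x in row])
```

The middle point ends up with roughly equal membership in both clusters, which is exactly the case a hard clustering would have to resolve arbitrarily.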

10. Domain-Driven Data Mining – Domain-driven data mining is a methodology for discovering actionable knowledge and insight from complex data in a composite environment. Data-driven pattern mining faces challenges in discovering actionable knowledge from databases. To tackle this issue, domain-driven data mining has been proposed, promoting a paradigm shift from data-driven pattern mining to domain-driven data mining. This is another good thesis topic in data mining.

11. Decision Support System – A decision support system is a type of information system that supports businesses and organizations in decision making. It helps people make better decisions about problems that may be unstructured or semi-structured. Data mining techniques are used in decision support systems to find hidden patterns and relations in the data. Developing a decision support system requires time, cost, and effort.

12. Opinion Mining – Opinion mining, also known as sentiment analysis, is a natural language processing method for analyzing customers' sentiments about a particular product. It is widely used in areas like surveys, public reviews, social media, healthcare systems, and marketing. Automated opinion mining employs machine learning algorithms to analyze the sentiments.
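
To make the idea concrete, here is a minimal sketch that scores a review with a hand-written word list. This is a lexicon-based stand-in and not the machine learning approach mentioned above, which would learn word weights from labelled reviews. The word lists, the negation rule, and the function names are hypothetical.

```python
import re

# Assumed, deliberately tiny sentiment lexicons for the example.
POSITIVE = {"good", "great", "excellent", "love", "reliable"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "broken"}
NEGATIONS = {"not", "never", "no"}


def sentiment_score(text):
    """Add +1 per positive word and -1 per negative word.

    A negation word flips the polarity of the word that immediately follows it.
    """
    score = 0
    negate = False
    for word in re.findall(r"[a-z']+", text.lower()):
        if word in NEGATIONS:
            negate = True
            continue
        polarity = (word in POSITIVE) - (word in NEGATIVE)
        score += -polarity if negate else polarity
        negate = False
    return score


def label(text):
    """Map the numeric score to a coarse sentiment label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Hypothetical product reviews.
reviews = [
    "The battery is great and reliable",
    "Not good, the screen arrived broken",
    "It arrived on Tuesday",
]
labels = [label(r) for r in reviews]
print(labels)
```

A learned classifier replaces the fixed word lists with weights estimated from data, which lets it handle vocabulary and phrasing that a hand-written lexicon misses.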

13. Super Computer Data Mining

The aim of this project is to produce a super-computing data mining resource for use by the UK academic community which utilizes a number of advanced machine learning and statistical algorithms for large datasets. In particular, a number of evolutionary computing-based algorithms and the ensemble machine approach will be used to exploit the large-scale parallelism possible in super-computing. This purpose is embodied in the following objectives:

· to develop a massively parallel approach for commonly used statistical and machine learning techniques for exploratory data analysis

· to develop a massively parallel approach to the use of evolutionary computing techniques for feature creation and selection

· to develop a massively parallel approach to the use of evolutionary computing techniques for data modelling

· to develop a massively parallel approach to the use of ensemble machines for data modelling consisting of many well-known machine learning algorithms

· to develop an appropriate super-computing infrastructure to support the use of such advanced machine learning techniques with large datasets

14. Time Series Data Mining

· Time Series Classification

· Time Series Data Mining of Electricity Usage Patterns

15. Genetic Programming for Constructive Induction

16. Machine Learning Ensemble Methodology

17. Effective Metrics for Clustering

18. Clustering Rules

19. Clustering Ensemble

20. Machine Learning