History of Data Mining

Information is now proliferating so rapidly that it is difficult to distinguish what will be useful from what will not. Data mining is the method used to extract the useful parts from this mass of information.

Throughout history, data have been interpreted, information has been sought, and hardware has been built for this purpose. In this way, information has been carried from the past to the present.

In the 1950s, the first computers were used for counting.
In the 1960s, the concepts of the database and data storage took their place in the technology world. In addition, in the late 1960s scientists were able to develop computers capable of simple learning.

In the 1970s, relational database management systems came into use. Computer experts, meanwhile, developed expert systems based on simple rules and achieved a simple form of machine learning.
In the 1980s, database management systems became widespread and began to be applied in science, engineering, and other fields. In these years, companies built databases about their customers, competitors, and products. These databases contain large amounts of data and can be accessed using SQL or similar query languages.

In the 1990s, attention turned to how to find useful information in databases whose volume of data had grown exponentially. The process accelerated with the introduction of definitions and concepts at the Knowledge Discovery in Databases (KDD) workshop held at IJCAI-89 in 1989 and in its concluding report, "Knowledge Discovery in Real Databases: A Report on the IJCAI-89 Workshop", published in 1991. In 1992, the first software for data mining was developed.

In the 2000s, data mining developed continuously and was applied to almost every field. As the benefits of its results became apparent, research and practice in the field kept increasing.
The 2010s went down in history as the period in which the concept of Big Data became widespread. The biggest factor in this was the spread of social media sites and mobile life.
In the 2020s, data processing algorithms have diversified in parallel with the growing diversity of data brought by fields such as the Internet of Things, cloud computing, and deep learning.

It is possible to use data mining wherever there is a large volume of data. Today, data mining applications are widely used in many areas where decision-making is needed. Successful applications are seen in many fields, for example education, biology, finance, the stock market, genetics, health, insurance, industry, and intelligence. Successful data mining applications have been reported in all of these sectors over the last 30 years.

The fields and applications that use data mining can be specified as follows:

  • Determining the purchasing patterns of customers,
  • Finding links between demographic characteristics of customers,
  • Increasing the response rate in mail campaigns,
  • Retaining existing customers, gaining new customers,
  • Market basket analysis,
  • Customer relations management,
  • Customer evaluation,
  • Sales forecast,
  • Customer distribution,
  • Various marketing campaigns,
  • Creating marketing strategies,
  • Cross-selling analysis [2].
  • Hidden correlations between different financial indicators,
  • Detection of credit card fraud,
  • Determining customer groups according to credit card expenditures,
  • Evaluation of loan requests,
  • Customer distribution,
  • Detection of irregularity,
  • Risk analysis,
  • Risk management,
  • Estimating customers who will request new policies,
  • Detection of insurance fraud,
  • Stock price prediction,
  • Detecting fraudulent accounts and fraud [3].
  • Point of sale data analysis,
  • Shopping cart analysis,
  • Supply and store layout optimization,
  • General market analysis,
  • Optimization of trading strategies.
  • Quality and improvement analysis,
  • The density estimates of the lines,
  • Profile analysis of website visitors.
  • Making sense of the large amounts of scientific data produced during the simulation and analysis of systems in laboratory or computer environments [4].
  • Gene research,
  • Discovery and classification of new virus types,
  • Facilitating diagnoses by determining the characteristics of diseases,
  • Investigating the side effects of drugs used together,
  • Estimation of test results,
  • Product development,
  • Determination of the treatment process,
  • Medical diagnosis.
  • Identifying criminal tendencies,
  • Intelligence units.

The benefits that data mining provides to researchers, company managers, and administrators in the areas listed above are as follows [3]:

  • Medicine is one of the fields where the most data is kept. In recent years especially, diseases have begun to be classified using the gene maps produced by the remarkably rapid progress of genetics. It is now possible to study which genes are likely to lead to which diseases.
  • In collaboration with genetics, a great deal of criminological information can be obtained, ranging from preventing crimes before they occur by predicting which individuals are prone to commit them, to estimating many probabilities from the writing characteristics of users.
  • Using data from simulation environments, predictions and solutions can be generated for engineering, production, and problem solving.
  • Evaluations made with data mining algorithms can give reliable results in less time, without the need for long-running experiments and test cases.
  • In banking, marketing strategies can be developed around sales packages created in cooperation with distributors that sell machinery and equipment to small businesses.
  • Managers can get to know existing customers better.
  • In the financial sector especially, existing customers can be divided into segments and credit risk behavior models can be built to minimize the risk posed by new customers.
  • New risk management policies can be created for all customers with similar characteristics by examining the payment performance of existing customers and determining the common characteristics of customers with poor payment performance.
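
The last two items can be illustrated with a very small sketch. It assumes a made-up list of customer records, each holding a segment label and whether the customer paid on time, and computes the share of late payers per segment. The data, the segment names, and the function name `late_payment_rate_by_segment` are all hypothetical and chosen only for illustration; a real credit risk model would use many more attributes and a proper learning algorithm.

```python
from collections import defaultdict

# Hypothetical customer records: (segment, paid_on_time).
# These values are invented for illustration only.
customers = [
    ("small business", True),
    ("small business", False),
    ("small business", False),
    ("salaried", True),
    ("salaried", True),
    ("salaried", False),
    ("retired", True),
    ("retired", True),
]


def late_payment_rate_by_segment(records):
    """Return the share of late payers in each customer segment."""
    totals = defaultdict(int)
    late = defaultdict(int)
    for segment, paid_on_time in records:
        totals[segment] += 1
        if not paid_on_time:
            late[segment] += 1
    return {segment: late[segment] / totals[segment] for segment in totals}


rates = late_payment_rate_by_segment(customers)

# Segments ordered from highest to lowest late-payment rate.
riskiest_first = sorted(rates, key=rates.get, reverse=True)
print(riskiest_first)
```

A new customer could then be given the late-payment rate of the segment they fall into as a first, rough risk estimate.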

In Turkey, data mining is used most in banking, insurance, and the stock exchange. A review of the literature shows that the fields where data mining is used most are medicine, biology, and genetics.

[1] Aldana, W.A., “Data mining industry: emerging trends and new opportunities”, Master's thesis, Massachusetts Institute of Technology, Massachusetts, 11 (2000).

[2] İnan, O., “Veri madenciliği”, Master's thesis, Selçuk Üniversitesi Fen Bilimleri Enstitüsü, Konya, 1–50 (2003).

[3] Albayrak, M., “EEG sinyallerindeki epileptiform aktivitenin veri madenciliği süreci ile tespiti”, Doctoral dissertation, Sakarya Üniversitesi Fen Bilimleri Enstitüsü, Sakarya, 56–70 (2008).

[4] Akgöbek, Ö. and Çakır, F., “Veri madenciliğinde bir uzman sistem tasarımı”, Akademik Bilişim 09, Harran Üniversitesi, Şanlıurfa, 801–806 (2009).
