Unsupervised Learning

K-means clustering: K-means clustering [1] is a widely used method for automatically partitioning a data set into k groups. Initial cluster centers are selected first and then refined iteratively as follows [2]:

  1. Each sample 𝑑𝑖 is assigned to its nearest cluster center.
  2. Each cluster center 𝐶𝑗 is updated to the mean of the samples assigned to it.

The algorithm converges when the assignment of samples to clusters no longer changes. The nearest cluster center is determined using Euclidean distance [3].
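
The two steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: the function name `kmeans`, the `max_iter` safety limit, the choice to pass the initial centers in by hand, and the toy points are all assumptions made for the example.

```python
import math


def kmeans(samples, centers, max_iter=100):
    """Minimal k-means sketch on a list of equal-length numeric tuples.

    `centers` holds the initial cluster centers (an assumption of this
    sketch: the caller picks them, e.g. by taking k of the samples).
    """
    centers = list(centers)
    k = len(centers)
    assignment = None
    for _ in range(max_iter):
        # Step 1: assign each sample to its nearest center (Euclidean distance).
        new_assignment = [
            min(range(k), key=lambda j: math.dist(d, centers[j]))
            for d in samples
        ]
        # Converged: no sample changed its cluster.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 2: move each center to the mean of the samples assigned to it.
        for j in range(k):
            members = [d for d, a in zip(samples, assignment) if a == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, assignment


# Toy data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centers, labels = kmeans(points, centers=[points[0], points[3]])
print(labels)
print(centers)
```

With these starting centers, the first three points end up in one cluster and the last three in the other. On less well-separated data the result can depend on the initial centers, which is why practical implementations usually try several initializations.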



Association rules: Association rules are used to find relationships between items in large data sets. The technique was introduced by Agrawal, Imieliński and Swami in 1993 [4]:

𝐼 = {𝑖1, 𝑖2,…, 𝑖𝑛} → the set of n binary attributes, called items (for example, products),
𝐷 = {𝑡1, 𝑡2,…, 𝑡𝑚} → the set of transactions, called the database.

Each transaction in D has a unique transaction ID and contains a subset of the items in I. A rule is defined as 𝑋 ⇒ 𝑌, where 𝑋, 𝑌 ⊆ 𝐼. Intuitively, an association rule describes a relationship between two objects in the same shopping basket. There is a positive association between two objects when the presence of one object, known as the antecedent, increases the likelihood that the other object, known as the consequent, is also in the basket. Conversely, when the presence of the antecedent increases the probability that the consequent is not in the same basket or transaction, the rule is called a negative association between the antecedent and the consequent. Association rules are not symmetric: a rule from X to Y does not imply that the rule from Y to X also holds [5]. Several algorithms are used to mine association rules, and Apriori is the best known among them:

  • AIS
  • Apriori
  • CHARM
  • FP-Growth
  • Partition
  • RARM (Rapid Association Rule Mining)
  • SETM
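
To make positive and negative association concrete, here is a small Python sketch. It uses the standard rule measures support, confidence and lift, which the text above does not name explicitly, and the helper functions and toy baskets are made up for illustration. A lift above 1 corresponds to a positive association and a lift below 1 to a negative one.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(x, y, transactions):
    """Estimated probability of Y in a basket, given that X is in it."""
    return support(set(x) | set(y), transactions) / support(x, transactions)


def lift(x, y, transactions):
    """How much the presence of X changes the likelihood of Y (1 = no effect)."""
    return confidence(x, y, transactions) / support(y, transactions)


# Hypothetical database D: each transaction is a set of items from I.
D = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
    {"bread", "butter"},
]

# The rule is not symmetric: the two directions have different confidence.
print(confidence({"bread"}, {"butter"}, D))
print(confidence({"butter"}, {"bread"}, D))

# Lift above 1: bread makes butter more likely (positive association).
print(lift({"bread"}, {"butter"}, D))
# Lift below 1: butter makes milk less likely (negative association).
print(lift({"butter"}, {"milk"}, D))
```

Algorithms such as Apriori do not change these definitions. They are ways of finding the itemsets whose support is high enough without checking every possible combination of items.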

In the next post, I will talk about semi-supervised learning …

[1] MacQueen, J. B. (1967). Some methods for classification and analysis of multivariate observations. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 281–297. Berkeley, CA: University of California Press.

[2] Wagstaff, K., Cardie, C., Rogers, S., and Schroedl, S. (2001). Constrained K-means Clustering with Background Knowledge. Proceedings of the Eighteenth International Conference on Machine Learning, 577–584. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc.

[3] Dinçer, E. (2006). Veri Madenciliğinde K-Means Algoritması ve Tıp Alanında Uygulanması. Yüksek Lisans Tezi, Kocaeli Üniversitesi, Fen Bilimleri Enstitüsü, Kocaeli.

[4] Agrawal, R., Imieliński, T., and Swami, A. (1993). Mining association rules between sets of items in large databases. Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data (SIGMOD ’93). Washington, D.C., USA: ACM. doi: 10.1145/170035.170072

[5] Malik, Z. M., Al-Shehabi, S., and Dökeroğlu, T. (2018). Gözetimsiz Makine Öğrenme Teknikleri ile Miktara Dayalı Negatif Birliktelik Kural Madenciliği. Düzce Üniversitesi Bilim ve Teknoloji Dergisi, 6, 1119–1138.

[6] Savaş, S. (2019), Karotis Arter Intima Media Kalınlığının Derin Öğrenme ile Sınıflandırılması, Gazi Üniversitesi Fen Bilimleri Enstitüsü Bilgisayar Mühendisliği Ana Bilim Dalı, Doktora Tezi, Ankara.
