The success achieved by Krizhevsky and his colleagues with the deep convolutional neural network named AlexNet [1] in the 2012 ImageNet object-recognition competition has been deep learning's greatest impact on the world literature, and deep learning has been on the rise ever since. Deep learning was first introduced in the literature in 2006 with a method called deep belief networks [2]. Deep Belief Nets (DBN) were tested on the widely used MNIST dataset (70,000 images of 28 x 28-pixel handwritten digits from 0 to 9) to measure and compare the accuracy of image recognition methods. The development of deep learning over time is illustrated in [3].
In recent years, techniques developed in deep learning research have influenced a wide range of information processing studies, in both traditional and new forms, including the most important aspects of machine learning and artificial intelligence.
Although deep learning is a sub-field of machine learning, it is the application of deep neural networks that is becoming more widespread by the day. Instead of algorithms specialized for each task, the aim is for solutions that learn from data to cover a wider range of data sets. Deep learning is a promising approach to solving artificial intelligence problems within machine learning.
There are various definitions of deep learning [4]:
Definition 1: A class of machine learning techniques that uses multiple nonlinear computing layers for supervised or unsupervised feature extraction and transformation, and for pattern analysis and classification.
Definition 2: A subfield within machine learning that relies on algorithms to learn multiple levels of representation in order to model complex relationships among data. High-level features and concepts are thus defined in terms of lower-level ones, and such a hierarchy of features is called a deep architecture. Most of these models are based on unsupervised learning of representations.
Definition 3: A subfield of machine learning based on learning several levels of representation, corresponding to a hierarchy of features, factors, or concepts, in which higher-level concepts are defined from lower-level ones, and the same low-level concepts can help define many higher-level concepts. Deep learning is part of a wider family of machine learning methods based on learning representations. An observation (e.g., an image) can be represented in many ways (e.g., as a vector of pixels), but some representations make it easier to learn interesting tasks (e.g., "is this the image of a human face?") from examples. Research in this field tries to determine which representations are better and how to learn them.
Definition 4: Deep learning is a set of algorithms in machine learning that attempt to learn at multiple levels, corresponding to different levels of abstraction. Artificial neural networks are generally used. In these learned statistical models, the levels correspond to different levels of concepts, where higher-level concepts are defined from lower-level ones, and the same low-level concepts can help define higher-level concepts.
Definition 5: Deep learning is a new area of machine learning research introduced with the aim of moving machine learning closer to one of its original goals: artificial intelligence. Deep learning is about learning multiple levels of representation and abstraction that help make sense of data such as images, sound, and text.
Deep learning is a machine learning technique that uses deep neural networks: multi-layered neural networks containing two or more hidden layers [5].
In deep learning, the structure is based on learning multiple levels of features or representations of the data. High-level features form a hierarchical representation derived from lower-level features [6]. An image, for example, can be represented as a vector of per-pixel intensity values, or by features such as sets of edges or particular shapes; some of these representations capture the data better than others. Deep learning methods use effective algorithms for hierarchical extraction of the features that best represent the data, instead of manually engineered features [7].
There are two main aspects common to the various high-level definitions of deep learning [4]:
- Models consisting of multiple layers or stages of nonlinear computation,
- Methods for supervised or unsupervised learning of feature representations at successively higher, more abstract layers.
Running deep learning algorithms and solving problems with them requires high-capacity machines (especially GPUs) and large amounts of data. Unlike standard machine learning algorithms, which break a problem into parts and solve them individually, deep learning solves the problem end to end. More importantly, the more data a deep learning algorithm is fed, the better it performs. Time is also a factor: studies without tight time constraints can produce better results when fed with big data.
Three major reasons for the popularity of deep learning today are the greatly increased capabilities of processors (e.g., graphics processing units, GPUs), the massive increase in the amount of data available for training, and recent advances in machine learning and signal/information processing research. These developments have enabled deep learning methods to effectively exploit complex, compositional nonlinear functions, to learn distributed and hierarchical feature representations, and to make effective use of both labeled and unlabeled data [4].
In machine learning, an algorithm distinguishes a square from a triangle based on information provided by humans. In deep learning, the program does not start with pre-supplied information. Instead, it uses an algorithm to determine how many corners the shapes have, whether those corners are connected, and whether they are perpendicular; it can then determine whether a newly added circle fits the pattern of squares and triangles.
Solving problems such as image and/or sound recognition, which humans do easily, is difficult for artificial intelligence methods. These intuitive problems can be solved by computers that learn to understand and experience the world hierarchically, building up from the simplest concepts. With knowledge gained through experience, there is no need for humans to supply formulas and calculations specific to each problem. When this hierarchical structure is drawn as a graph, a deep, multi-layered structure emerges, with each layer built on top of the previous one. For this reason, artificial intelligence methods based on such hierarchical structures are called deep learning [8].
The research areas of deep neural networks lie at the intersection of artificial intelligence, graphical modeling, optimization, pattern recognition, and signal processing [4]. These algorithms have begun to appear in many applications, such as driverless vehicles, health services, movie recommendations, translation services, chatbots, page suggestions, and advertising services.
The factors that make deep learning architectures such a popular field of study are as follows:
- The availability of common text, image, and sound datasets for research around the world.
- The production of graphics cards (GPUs) with high processing power.
- The introduction of deep architectures such as AlexNet, ZFNet, ResNet, GoogLeNet, VGG16–19, and Inception.
- The availability of deep learning platforms and libraries such as Keras, TensorFlow, Theano, Caffe, PyTorch, and MatConvNet.
- The development of activation functions, training and data augmentation methods, and effective optimizers by researchers.
Deep neural networks have two or more hidden layers. They build increasingly comprehensive relationships in the data, from simple to complex: each layer establishes a relationship between the previous layer and itself. In this way the inputs are examined in more detail and a more accurate decision is made. The figure shows a deep neural network structure with three hidden layers.
Figure: Deep neural network structure
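As a minimal sketch of such a structure, the forward pass of a network with three hidden layers can be written in NumPy. The layer sizes, random initialization, and ReLU activations below are illustrative assumptions, not choices prescribed by the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # ReLU activation: max(0, x) applied element-wise
    return np.maximum(0.0, x)

def init_layer(n_in, n_out):
    # Small random weights and zero biases (illustrative initialization)
    return rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out)

# Four input features, three hidden layers, one output: with two or
# more hidden layers, the network counts as "deep".
layer_sizes = [4, 8, 8, 8, 1]
layers = [init_layer(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x):
    # Each layer transforms the previous layer's output, building
    # progressively more complex representations of the input.
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:  # hidden layers apply a nonlinearity
            x = relu(x)
    return x

y = forward(rng.normal(size=(5, 4)))  # a batch of 5 examples
print(y.shape)  # (5, 1)
```

Real implementations would add training (backpropagation and an optimizer); this sketch only shows how the layers are stacked.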
Different activation functions can be used when building deep neural networks. The choice can vary with the type, structure, and size of the data, and with the person creating the model. The activation function determines the output a cell produces in response to its input; usually a nonlinear function is chosen. The major activation functions are Sigmoid, TanH, and ReLU.
Figure: Activation functions
Figure: Activation function formulas
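The three activation functions just named can be implemented directly from their standard formulas; this NumPy sketch is illustrative:

```python
import numpy as np

def sigmoid(x):
    # sigmoid(x) = 1 / (1 + e^(-x)); squashes input into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x)); output in (-1, 1)
    return np.tanh(x)

def relu(x):
    # ReLU(x) = max(0, x); cheap to compute, no saturation for x > 0
    return np.maximum(0.0, x)

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))  # values in (0, 1); sigmoid(0) = 0.5
print(tanh(x))     # values in (-1, 1); tanh(0) = 0.0
print(relu(x))     # [0. 0. 2.]
```

In practice ReLU is the common default for hidden layers, while sigmoid and tanh appear mainly in output layers and older architectures.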
REFERENCES
[1] Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). ImageNet Classification with Deep Convolutional Neural Networks. In Proceedings of the 25th International Conference on Neural Information Processing Systems (NIPS'12), 1097–1105. Lake Tahoe, Nevada.
[2] Hinton, G. E., Osindero, S., and Teh, Y.-W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18, 1527–1554.
[3] Ay Karakuş, B. (2018). Derin Öğrenme ve Büyük Veri Yaklaşımları ile Metin Analizi [Text Analysis with Deep Learning and Big Data Approaches]. PhD Thesis, Fırat Üniversitesi Fen Bilimleri Enstitüsü, Elazığ.
[4] Deng, L., and Yu, D. (2013). Deep Learning: Methods and Applications. Foundations and Trends in Signal Processing, 7, 197–387. doi: 10.1561/2000000039
[5] Kim, P. (2017). MATLAB Deep Learning: With Machine Learning, Neural Networks and Artificial Intelligence. Seoul, Korea: Apress.
[6] Bengio, Y. (2009). Learning Deep Architectures for AI. Foundations and Trends in Machine Learning, 2, 1–127. doi: 10.1561/2200000006
[7] Song, H. A., and Lee, S.-Y. (2013). Hierarchical Representation Using NMF. In International Conference on Neural Information Processing, 466–473. Daegu, South Korea.
[8] Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning. MIT Press. http://www.deeplearningbook.org
[9] Savaş, S. (2019). Karotis Arter Intima Media Kalınlığının Derin Öğrenme ile Sınıflandırılması [Classification of Carotid Artery Intima-Media Thickness with Deep Learning]. PhD Thesis, Gazi Üniversitesi Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği Ana Bilim Dalı, Ankara.