Classification of Leaves Based on the Shape of Leaves Using Convolutional Neural Network Methods

One part of a tree, the leaves, which grow on the branches, come in several shapes, ranging from circular to elongated, and some even finger-like. We often mistake these leaf shapes for one another. This study discusses the classification of leaves based on the shape of the leaf bones using a Convolutional Neural Network, a supervised-learning method used to classify labeled data into a set of predefined classes. The goal is to implement a Convolutional Neural Network model for leaf classification based on bone shape and to measure the accuracy it produces. Accuracy values were obtained from experiments at the training and validation stages. Using 30 epochs, a batch size of 128, and the ReLU and Softmax activations, the resulting training accuracy is 98.52%, while the validation accuracy is 89.06%.


Introduction
Plants play a vital role in our lives, as they provide us with food and oxygen. We need a good understanding of plants, and the ability to identify new and rare species, to increase agricultural productivity and to support the drug industry [1]. Plants are living things with many parts, from the roots to the fruit. One of the attractions of studying plants is their leaves. Leaves have many shapes, ranging from circular to elongated, and some even have a finger-like shape. We often mistake these leaf shapes for one another. To assist in identifying leaves, an application that classifies leaves by the shape of their bones was created using deep learning methods.
Deep learning is a scientific field that is currently developing rapidly. Being able to learn like the neural network of the human brain is an advantage of deep learning itself. With deep learning, problems such as identifying plants by the shape of their leaf bones become easier to solve. One important method for identifying objects is the CNN (Convolutional Neural Network). Therefore, a study was conducted to classify leaves based on the shape of the leaf bones and to determine the accuracy obtained with the CNN technique. This is what underlies "Classification of Leaves Based on the Shape of Leaves Using Convolutional Neural Network Methods".

B. Preprocessing Database
Before building the CNN model architecture, preprocessing is carried out so that the data conforms to the architectural requirements of the CNN model and is of better quality when the model processes it. Preprocessing is needed to improve image quality, preserve the edges within the image, and enhance the image [2]. The preprocessing steps here were scaling and splitting the dataset: in the scaling stage, all images were resized to 128x128 pixels; in the splitting stage, the data was divided into a training dataset (80%) and a validation dataset (20%) for the training stage.
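The resize-and-split step above can be sketched as follows. This is a minimal illustration with a placeholder array standing in for the real leaf images; the dataset size (1600 images) and the 80/20 ratio come from the paper, while the loading details are assumptions.

```python
import numpy as np

# Hypothetical stand-in for the 1600 leaf images after resizing; in
# practice each file would be loaded and resized to 128x128 first
# (e.g. with Pillow's Image.resize((128, 128))).
images = np.zeros((1600, 128, 128, 3), dtype=np.uint8)
labels = np.random.randint(0, 4, size=1600)  # 4 leaf-bone classes

# Shuffle, then split into 80% training and 20% validation data.
rng = np.random.default_rng(42)
idx = rng.permutation(len(images))
split = int(0.8 * len(images))  # 1280
x_train, y_train = images[idx[:split]], labels[idx[:split]]
x_val, y_val = images[idx[split:]], labels[idx[split:]]

print(x_train.shape, x_val.shape)  # (1280, 128, 128, 3) (320, 128, 128, 3)
```

This yields the 1280/320 split used for the training and validation stages.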

C. CNN Model Formation
After the preprocessing stage is complete, the next step is to create a Convolutional Neural Network model to classify leaves based on the shape of the leaf bones.
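A minimal Keras sketch of such a model is shown below, following the layer counts given in the conclusion (1 input layer, 2 convolution layers, 2 pooling layers, 1 flatten layer, 1 fully connected layer, with ReLU and Softmax activations). The filter counts (32, 64) and the 3x3 kernel size are assumptions; the paper does not state them.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Sketch of the CNN architecture: conv -> pool -> conv -> pool -> flatten
# -> fully connected output over the 4 leaf-bone classes.
model = models.Sequential([
    layers.Input(shape=(128, 128, 3)),
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(4, activation="softmax"),  # 4 leaf-bone classes
])

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```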

D. CNN Model Training
After the model has been created, the next step is the model training phase. This training uses the dataset that was split earlier into a training dataset and a validation dataset.
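The training call can be sketched as below. The batch size of 128 comes from the paper; the tiny random stand-in dataset, the reduced model, and the single epoch are assumptions made only so the example runs quickly and self-contained.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Reduced stand-in model; the real architecture is described elsewhere.
model = models.Sequential([
    layers.Input(shape=(128, 128, 3)),
    layers.Conv2D(8, (3, 3), activation="relu"),
    layers.MaxPooling2D((4, 4)),
    layers.Flatten(),
    layers.Dense(4, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])

# Random stand-in data; in practice these are the split leaf datasets.
x_train = np.random.rand(16, 128, 128, 3).astype("float32")
y_train = tf.keras.utils.to_categorical(np.random.randint(0, 4, 16), 4)
x_val = np.random.rand(4, 128, 128, 3).astype("float32")
y_val = tf.keras.utils.to_categorical(np.random.randint(0, 4, 4), 4)

# The paper uses batch_size=128 and many epochs; 1 epoch here for brevity.
history = model.fit(x_train, y_train, epochs=1, batch_size=128,
                    validation_data=(x_val, y_val), verbose=0)
print(sorted(history.history.keys()))
```

The `history` object records the per-epoch training and validation loss and accuracy that are plotted in the results section.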

A. Leaves
Leaves are one of the most important organs of plants in maintaining life, because plants are obligate autotrophic organisms that must supply their own energy needs through the conversion of sunlight into chemical energy [3]. Based on the shape of the leaf bones, leaves have four venation patterns: curved, parallel, pinnate, and palmate [4].

B. Deep Learning
According to Yanming Guo [5], deep learning is a subfield of machine learning that attempts to learn high-level abstractions in data by utilizing hierarchical architectures. It is an emerging approach and has been widely applied in traditional artificial intelligence domains, such as semantic parsing, transfer learning, natural language processing, computer vision and many more. The computer vision field has studied deep learning intensively in recent years, and as a consequence, related approaches have emerged in large numbers. In general, deep learning methods fall into four categories according to their basic approach. The "deep" in deep learning comes from the many layers built into a deep learning model, which is usually a neural network. A Convolutional Neural Network (CNN) can consist of many layers, where each layer takes input from the previous layer, processes it, and passes its output to the next layer [7].

C. Convolutional Neural Network (CNN)
CNN is a multi-layer supervised-learning neural network. Because features are extracted based on local correlations, neurons in a CNN only need to extract local features; the outputs of all higher-level neurons are then combined to obtain global features, which avoids the complex feature extraction and data reconstruction of traditional algorithms. CNNs have achieved better predictive results in various fields, for example image processing, natural language processing and medical informatics. In image processing, for instance, CNNs have reached an accuracy close to human-eye recognition thanks to their unique feature-extraction method [8].
In general, a CNN has three kinds of layers: convolution layers, pooling layers, and fully connected layers. The convolution layer shares weights; the pooling layer subsamples the output produced by the convolution layer, reducing the data rate from the layer below it; and the fully connected layers take the pooling layer's output as their input [9].
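To illustrate how the convolution and pooling layers reduce the data rate, the standard output-size formulas can be computed directly. The specific kernel and window sizes below (3x3 convolution, 2x2 pooling) are assumptions used only for illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width/height of a convolution: floor((W - F + 2P) / S) + 1."""
    return (size - kernel + 2 * padding) // stride + 1

def pool_out(size, window, stride=None):
    """Output size of an unpadded pooling layer; stride defaults to window."""
    stride = stride or window
    return (size - window) // stride + 1

# A 128x128 input through a 3x3 convolution, then 2x2 max pooling:
after_conv = conv_out(128, 3)          # 126
after_pool = pool_out(after_conv, 2)   # 63
print(after_conv, after_pool)
```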

D. Dropout Regularization
Dropout is a way to avoid overfitting and also to make the learning process faster. Dropout corresponds to eliminating neurons in the network, in both hidden and visible layers; deleting a neuron means temporarily removing it from the network, and the removed neurons are selected at random [10].
Dropout can also be seen as regularizing a neural network by adding noise to its hidden units. Applying dropout to a neural network amounts to sampling a "thinned" network consisting of all the units that survived the dropout [11].
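A minimal NumPy sketch of (inverted) dropout: survivors are randomly selected and rescaled so that the expected activation is unchanged. The dropout rate of 0.5 is an illustrative assumption.

```python
import numpy as np

def dropout(activations, rate, rng):
    """Zero a random fraction `rate` of units and rescale the survivors,
    effectively sampling a 'thinned' network."""
    mask = rng.random(activations.shape) >= rate
    return activations * mask / (1.0 - rate)

rng = np.random.default_rng(0)
h = np.ones(10)
print(dropout(h, 0.5, rng))  # roughly half the units are zeroed
```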

E. Activation Function
The activation function is a non-linear function in Artificial Neural Networks that can transform the input data into a higher dimension, so that a simple hyperplane cut can perform the classification [12].
a. ReLU (Rectified Linear Unit). ReLU is an activation function with a strong biological and mathematical basis; in 2011 it was shown to improve the training of neural networks. It thresholds values at 0, i.e. f(x) = max(0, x): simply put, it outputs 0 when x < 0, and conversely outputs a linear function when x ≥ 0 [13].
b. Softmax. Softmax gives a more responsive result and a better probabilistic interpretation than other classification outputs, computing the probability of every possible label. Softmax takes a real-valued vector and transforms it into a vector of values between zero and one that sum to one [14].
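Both activations can be written in a few lines of NumPy, as a sketch:

```python
import numpy as np

def relu(x):
    """f(x) = max(0, x): passes positive values, zeroes negatives."""
    return np.maximum(0, x)

def softmax(x):
    """Map a real vector to probabilities in (0, 1) that sum to one."""
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()

print(relu(np.array([-2.0, 0.0, 3.0])))    # [0. 0. 3.]
print(softmax(np.array([1.0, 2.0, 3.0])))  # three probabilities summing to 1
```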

F. Optimizer Function
The optimizer shapes the model into its most accurate form by adjusting the weights; the loss function acts as a guide, telling the optimizer whether it is moving in the right or the wrong direction [15].

IAIC Transactions on Sustainable Digital Innovation (ITSDI)
In this paper, the model uses categorical cross-entropy as its loss function. Categorical cross-entropy is the loss function for multi-class classification tasks in which each example can belong to only one of the possible categories, and the model must decide which one [16].
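Categorical cross-entropy on a one-hot label reduces to the negative log of the probability the model assigned to the true class. A small NumPy sketch (the probabilities below are made-up illustrative values):

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Loss for multi-class problems where each example belongs to exactly
    one class; y_true is one-hot, y_pred holds predicted probabilities."""
    return -np.sum(y_true * np.log(y_pred + eps))

y_true = np.array([0.0, 1.0, 0.0, 0.0])  # true class: index 1
y_pred = np.array([0.1, 0.7, 0.1, 0.1])  # the model's softmax output
print(categorical_cross_entropy(y_true, y_pred))  # -log(0.7) ≈ 0.357
```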

H. Confusion Matrix
The confusion matrix contains information about the actual and predicted classifications performed by a classification system. System performance is commonly evaluated using the data in this matrix. For a two-class classifier, the confusion matrix is shown in Table 1 [17].

I. Python
Python is an open-source programming language intended to be general purpose. It is optimized for software quality, developer productivity, program portability, and component integration. At least hundreds of thousands of developers around the world use Python in various fields [18].
User-friendly high-level languages are often slower than compiled low-level languages, but Python is agile enough for most tasks; for performance-critical pieces of code, combining Python with specially written compiled code yields almost optimal speed in most cases [19].
Python has a popular and excellent package manager called pip. Using pip, you can install Python libraries, or remove those that are no longer used [20].
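Returning to the two-class confusion matrix described above, it can be computed in plain Python in a few lines (the label sequences here are made-up illustrative values):

```python
# Rows are actual classes, columns are predicted classes:
# matrix[0][0]=TN, matrix[0][1]=FP, matrix[1][0]=FN, matrix[1][1]=TP.
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

matrix = [[0, 0], [0, 0]]
for a, p in zip(actual, predicted):
    matrix[a][p] += 1

print(matrix)  # [[3, 1], [1, 3]]
```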

J. Tensorflow
TensorFlow is used to experiment with deep learning models, train models on large datasets, and make them fit for production. In addition, TensorFlow also supports large-scale training and inference using hundreds of servers that use a Graphic Processing Unit (GPU) for efficient training [21].

A. Results of the CNN Model Training
The following are the accuracy and loss results from training on the training dataset.

Figure 1 Graph of Training Loss Results
The loss results in Figure 1 show a significant decrease for both the training and validation datasets. The training loss drops sharply at epochs 0-5, from about 3.0 to 0.5, and then continues to decline, almost reaching 0.0. The validation loss decreases from about 1.5 to 0.5 over epochs 0-5, then runs steadily before rising slightly between epochs 20 and 30.

Figure 2 Graph of Accuracy Training Results
The accuracy results can be seen in Figure 2, where both the training and validation datasets increase in accuracy. The training accuracy rises from about 0.3 to 0.8 over epochs 0-5, then continues to increase until it almost reaches 1.0. The validation accuracy likewise rises from about 0.3 to 0.8 between epochs 0 and 5.

Figure 3 Results of the Confusion Matrix
Prediction visualization uses the test dataset described earlier. Each displayed image shows a leaf with two lines of text above it: 'True' and 'Predicted'. 'True' is the actual leaf-shape label of the image in the test dataset, while 'Predicted' is the shape predicted by the model. Figure 4 shows correct and incorrect predictions for 4 of the 800 test images: two of the four are predicted wrongly, while the other two are predicted correctly. This happens because of ambiguity in the images; an image with defects can still be predicted wrongly. In one example in Figure 4, 'True' reads pinnate while 'Predicted' reads parallel, meaning the system predicted incorrectly.

C. Prediction Visualization
In Figure 4 Displays the correct image and the incorrect image, which is shown 4 of the 800 test datasets. In this picture, some of them have predicted wrong, because there is ambiguity in the image. The image still has defects so it is still wrong when predicting.

Conclusion
This study succeeded in building a leaf classifier based on the shape of the leaf bones using the Convolutional Neural Network (CNN) method. The CNN model has an architecture consisting of 1 input layer, 2 convolution layers, 2 pooling layers, 1 flatten layer and 1 fully connected layer. The activation functions used are ReLU and Softmax.
The Convolutional Neural Network was trained on the dataset split described earlier: of the 1600 images, 20% were held out, giving 1280 images for the training dataset and 320 for the validation dataset. The CNN model was trained for 20 epochs with a batch size of 128, which took 7 minutes. The training dataset produced an accuracy of 98.52% and a loss of 5.54%, while the validation dataset produced an accuracy of 89.06% and a loss of 39.49%.