Deep learning algorithms work with almost any kind of data, but they require large amounts of computing power and data to solve complicated problems. Now let us take a deep dive into one of the most famous deep learning algorithms: the Convolutional Neural Network (CNN).
If you are looking for unsupervised learning algorithms instead, see our earlier article on that topic.
What is Deep Learning?
Deep learning is a subfield of machine learning in which artificial neural networks are used to analyze and solve complex problems. Deep learning neural networks are built with multiple layers of interconnected nodes that can learn and extract features from input data. Deep learning models are trained on large datasets, allowing them to detect patterns and correlations that a human would find difficult or impossible to spot.
Deep learning has brought about a significant transformation in the area of artificial intelligence, paving the way for the creation of intelligent systems capable of independent learning, adaptation, and decision-making. Various applications such as image and speech recognition, natural language processing, and autonomous driving have witnessed remarkable achievements through the use of these models.
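To make the "multiple layers of interconnected nodes" idea concrete, here is a minimal sketch of a fully connected network in Keras. The layer sizes (64, 32) and the 784-pixel input are illustrative assumptions, not a recommendation:

from keras.models import Sequential
from keras.layers import Dense

# A tiny feed-forward network: each Dense layer is a set of interconnected
# nodes, and stacking several of them is what makes the network "deep".
# The sizes and input shape below are illustrative assumptions.
model = Sequential()
model.add(Dense(64, activation='relu', input_shape=(784,)))  # hidden layer 1
model.add(Dense(32, activation='relu'))                      # hidden layer 2
model.add(Dense(10, activation='softmax'))                   # output layer (10 classes)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()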
Why Python for Deep Learning?
Python has gained widespread popularity as a programming language due to its versatility and ease of use in diverse domains of computer science, especially in the field of deep learning. Thanks to its extensive range of libraries and frameworks specially tailored for deep learning, Python has emerged as a top choice among many machine learning professionals.
Python has emerged as the language of choice for deep learning, and here are some of the reasons why:
1. Simple to learn and use:
Python is a high-level programming language that is easy to learn and use, even for those who are new to programming. Its concise and uncomplicated syntax makes it easy to write and understand. This allows developers to concentrate on solving problems without worrying about the details of the language.
2. Abundant libraries and frameworks:
Python has a vast ecosystem of libraries and frameworks that cater specifically to deep learning, including TensorFlow, PyTorch, Keras, and Theano. These libraries provide pre-built functions and modules that simplify the development process, reducing the need to write complex code from scratch (see the short example after this list).
3. Strong community support:
Python has a large and active community of developers contributing to its development, maintenance, and improvement. This community offers support and guidance to beginners, making it easier to learn and use Python for deep learning.
4. Platform independence:
Python is platform-independent, which means that code written on one platform can be easily executed on another platform without any modification. This makes it easier to deploy deep learning models on different platforms and devices.
5. Easy integration with other languages:
Python can be easily integrated with other programming languages, such as Delphi, C++, and Java, making it ideal for building complex systems that require integrating different technologies.
Overall, Python's ease of use, abundance of libraries and frameworks, strong community support, platform independence, and easy integration with other languages make it an indispensable tool for machine learning practitioners. Its popularity continues to soar as a result.
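As a small example of point 2 above, a task you could hand-roll with loops, such as one-hot encoding class labels, is a single pre-built Keras call:

import numpy as np
from keras.utils import to_categorical

# Three class labels, one-hot encoded in a single pre-built call.
labels = np.array([0, 2, 1])
print(to_categorical(labels, num_classes=3))
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]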
What is Convolutional Neural Network (CNN)?
Convolutional Neural Networks (CNNs – not to be confused with the US news channel of the same name) are specialized artificial neural networks designed for processing and analyzing images and other multidimensional data. CNNs consist of multiple layers of interconnected nodes, each performing a specific task, such as feature extraction, classification, or prediction. Yann LeCun developed the first CNN, called LeNet, in the late 1980s; it was used for recognizing handwritten characters such as ZIP code digits.
One of the key features of CNNs is the use of convolutional layers, which employ a mathematical operation known as convolution to filter and extract features from the input data. This allows CNNs to learn and recognize patterns and features in images with greater accuracy and efficiency than traditional machine learning algorithms.
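To make the convolution operation concrete, here is a minimal NumPy sketch that slides a 3x3 edge-detection kernel over a small grayscale image. Both the image and the kernel values are illustrative:

import numpy as np

def conv2d(image, kernel):
    # Naive 2D convolution (strictly, cross-correlation, which is what most
    # deep learning libraries implement): slide the kernel over the image
    # and take the element-wise product sum at each position.
    kh, kw = kernel.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# An illustrative 6x6 image with a vertical edge down the middle.
image = np.array([[0, 0, 0, 10, 10, 10]] * 6, dtype=float)
# A Prewitt-style kernel that responds strongly to vertical edges.
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)
print(conv2d(image, kernel))  # large values mark where the edge is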
Below is an example of an image processed by a CNN:
CNNs have shown remarkable success in a wide range of applications, including image classification, object recognition, face detection, and natural language processing. The ability of CNNs to learn and extract complex features from images has led to breakthroughs in many fields, such as medical imaging, autonomous driving, and robotics. Moreover, CNNs have proven to be highly effective in solving many real-world problems, such as identifying fraudulent transactions in finance or detecting defects in manufacturing processes.
The architecture and design of CNNs are still evolving, and researchers continue to explore new ways to optimize their performance and improve their capabilities. Recent advancements in CNNs include techniques such as transfer learning, which allows pre-trained CNN models to be adapted to new tasks, and ensemble methods, which combine multiple CNN models to achieve even greater accuracy and robustness. As deep learning continues to advance, CNNs are likely to remain a critical tool for processing and analyzing complex data, particularly in image and signal processing.
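As a hedged sketch of transfer learning, the snippet below adapts a CNN pre-trained on ImageNet to a new ten-class task. The choice of MobileNetV2, the input size, and the new head are illustrative assumptions, not a prescription:

from keras.applications import MobileNetV2
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D

# Load a CNN pre-trained on ImageNet, without its classification head.
base = MobileNetV2(weights='imagenet', include_top=False, input_shape=(96, 96, 3))
base.trainable = False  # freeze the pre-trained convolutional features

# Attach a new head for a hypothetical ten-class task.
x = GlobalAveragePooling2D()(base.output)
outputs = Dense(10, activation='softmax')(x)
model = Model(inputs=base.input, outputs=outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])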
Prerequisites for performing CNNs with Python
To perform CNNs with Python, you need to have some prerequisites in terms of programming knowledge, tools, and libraries. The following are some of the prerequisites for performing CNNs with Python:
1. Python programming
You should understand Python programming concepts well, including data types, control structures, functions, and object-oriented programming (OOP) concepts. You should also be comfortable with using Python libraries and packages.
2. NumPy
NumPy is a Python library for numerical computing, which is essential for performing mathematical operations involved in deep learning. It supports multi-dimensional arrays, linear algebra, and other mathematical functions used in machine learning.
See our past article about Python GUI for NumPy.
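If you want a quick refresher, here is a minimal sketch of the NumPy operations this article relies on (the array values are illustrative):

import numpy as np

a = np.array([[1, 2], [3, 4]], dtype='float32')  # a 2-D array (matrix)
print(a.shape)               # (2, 2)
print(a / 255.)              # element-wise rescaling, as used for pixel values
print(a.reshape(-1, 4))      # reshaping, as used to reformat image batches
print(np.unique([1, 1, 2]))  # unique values, as used to count output classes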
3. matplotlib
Matplotlib is a Python library for data visualization, which is useful for visualizing the data used in training and evaluating CNNs. It provides a wide range of visualization tools, including line plots, scatter plots, histograms, and heatmaps.
See our past article about Python GUI for matplotlib.
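As a minimal sketch, here is the kind of line plot we will use later to visualize training progress; the accuracy values are made up purely for illustration:

import numpy as np
import matplotlib.pyplot as plt

# Plot a toy "training curve" - the same kind of line plot used later in
# this article to compare training and validation accuracy.
epochs = np.arange(1, 11)
accuracy = 1 - 1 / epochs  # illustrative values only
plt.plot(epochs, accuracy, 'bo-', label='Training accuracy (illustrative)')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()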
4. Keras
Keras provides a user-friendly interface for building and training deep learning models, including CNNs.
See our past article about Python GUI for Keras.
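To give a taste of that interface, a complete (if deliberately tiny) CNN can be defined, compiled, and summarized in just a few lines; the layer sizes here are illustrative assumptions:

from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense

# A deliberately tiny CNN: one convolutional layer, then a classifier.
model = Sequential([
    Conv2D(16, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    Flatten(),
    Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()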
5. TensorFlow
TensorFlow is a more flexible and scalable framework that can be used to build and deploy more complex deep learning models.
See our past article about Python GUI for TensorFlow.
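As a small illustration of that flexibility, TensorFlow lets you drop below the Keras API and compute gradients yourself with tf.GradientTape, something more complex models rely on:

import tensorflow as tf

# Compute dy/dx for y = x^2 + 2x at x = 3 without any Keras machinery.
x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x ** 2 + 2 * x
print(tape.gradient(y, x).numpy())  # 2*3 + 2 = 8.0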
6. Image data
To train and test CNNs, you will need a dataset of images in the required format. The images should be labeled or annotated for classification, object detection, or segmentation tasks.
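Conveniently, Keras bundles several labeled image datasets, including the Fashion-MNIST dataset used in the hands-on section below. A quick sketch of loading and inspecting it:

from keras.datasets import fashion_mnist

# 60,000 labeled 28x28 grayscale training images and 10,000 test images.
(train_X, train_Y), (test_X, test_Y) = fashion_mnist.load_data()
print(train_X.shape, train_Y.shape)  # (60000, 28, 28) (60000,)
print(sorted(set(train_Y)))          # the ten class labels, 0-9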
7. GPU hardware (optional, depending on your tasks)
Deep learning models require a lot of computing power, and training CNNs on large datasets can take a long time on a CPU. To speed up the training process, you can use a GPU (Graphics Processing Unit) to accelerate the computations.
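You can check whether TensorFlow can see a GPU with the short diagnostic below (the output depends on your machine):

import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
if gpus:
    print('GPU(s) available for training:', gpus)
else:
    print('No GPU found - training will fall back to the CPU.')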
In summary, to perform CNNs with Python, you should understand Python programming, NumPy, and Matplotlib. You should also be familiar with a deep learning framework such as Keras or TensorFlow, have access to labeled image data, and, optionally, have GPU hardware available to accelerate the training process.
How do I build and train a Convolutional Neural Network from scratch?
Finally, here is what we have all been waiting for: the hands-on section, where you will build and train your own CNN from scratch.
The following is the complete code example for our CNN:
# Load the data: Fashion MNIST dataset
from keras.datasets import fashion_mnist
(train_X, train_Y), (test_X, test_Y) = fashion_mnist.load_data()

# Analyze the data
import numpy as np
from keras.utils import to_categorical
import matplotlib.pyplot as plt

print('Training data shape : ', train_X.shape, train_Y.shape)
print('Testing data shape : ', test_X.shape, test_Y.shape)

## Find the unique numbers from the train labels
classes = np.unique(train_Y)
nClasses = len(classes)
print('Total number of outputs : ', nClasses)
print('Output classes : ', classes)

## Take a look at the images in your dataset
plt.figure(figsize=[5, 5])

## Display the first image in training data
plt.subplot(121)
plt.imshow(train_X[0, :, :], cmap='gray')
plt.title("Ground Truth : {}".format(train_Y[0]))

## Display the first image in testing data
plt.subplot(122)
plt.imshow(test_X[0, :, :], cmap='gray')
plt.title("Ground Truth : {}".format(test_Y[0]))
plt.show()

# Data preprocessing
## Convert each 28 x 28 image of the train and test set into a matrix of
## size 28 x 28 x 1, which is fed into the network
train_X = train_X.reshape(-1, 28, 28, 1)
test_X = test_X.reshape(-1, 28, 28, 1)
print(train_X.shape, test_X.shape)

## Convert the data from int8 to float32 format and rescale pixel values to the range 0-1
train_X = train_X.astype('float32')
test_X = test_X.astype('float32')
train_X = train_X / 255.
test_X = test_X / 255.

## Convert the class labels from categorical into one-hot encoding vectors
train_Y_one_hot = to_categorical(train_Y)
test_Y_one_hot = to_categorical(test_Y)

### Display the change for a category label using one-hot encoding
print('Original label:', train_Y[0])
print('After conversion to one-hot:', train_Y_one_hot[0])

## Split the training data into train (80%) and validation (20%) sets
from sklearn.model_selection import train_test_split
train_X, valid_X, train_label, valid_label = train_test_split(
    train_X, train_Y_one_hot, test_size=0.2, random_state=13)
print(train_X.shape, valid_X.shape, train_label.shape, valid_label.shape)

# Model the data
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D, LeakyReLU

## Use a batch size of 64 to keep memory requirements modest
batch_size = 64
epochs = 20
num_classes = 10

# NN architecture
fashion_model = Sequential()
fashion_model.add(Conv2D(32, kernel_size=(3, 3), activation='linear',
                         input_shape=(28, 28, 1), padding='same'))
fashion_model.add(LeakyReLU(alpha=0.1))
fashion_model.add(MaxPooling2D((2, 2), padding='same'))
fashion_model.add(Conv2D(64, (3, 3), activation='linear', padding='same'))
fashion_model.add(LeakyReLU(alpha=0.1))
fashion_model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
fashion_model.add(Conv2D(128, (3, 3), activation='linear', padding='same'))
fashion_model.add(LeakyReLU(alpha=0.1))
fashion_model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
fashion_model.add(Flatten())
fashion_model.add(Dense(128, activation='linear'))
fashion_model.add(LeakyReLU(alpha=0.1))
fashion_model.add(Dense(num_classes, activation='softmax'))

# Compile the model
fashion_model.compile(loss=keras.losses.categorical_crossentropy,
                      optimizer=keras.optimizers.Adam(),
                      metrics=['accuracy'])
print(fashion_model.summary())

# Train the model
fashion_train = fashion_model.fit(train_X, train_label, batch_size=batch_size,
                                  epochs=epochs, verbose=1,
                                  validation_data=(valid_X, valid_label))

# Evaluate the model on the test set
test_eval = fashion_model.evaluate(test_X, test_Y_one_hot, verbose=0)
print('Test loss:', test_eval[0])
print('Test accuracy:', test_eval[1])

## Plot the accuracy and loss curves for training and validation data
accuracy = fashion_train.history['accuracy']
val_accuracy = fashion_train.history['val_accuracy']
loss = fashion_train.history['loss']
val_loss = fashion_train.history['val_loss']
epochs = range(len(accuracy))
plt.plot(epochs, accuracy, 'bo', label='Training accuracy')
plt.plot(epochs, val_accuracy, 'b', label='Validation accuracy')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()

# Add dropout layers to overcome the problem of overfitting
batch_size = 64
epochs = 20
num_classes = 10

## NN architecture
fashion_model = Sequential()
fashion_model.add(Conv2D(32, kernel_size=(3, 3), activation='linear',
                         padding='same', input_shape=(28, 28, 1)))
fashion_model.add(LeakyReLU(alpha=0.1))
fashion_model.add(MaxPooling2D((2, 2), padding='same'))
fashion_model.add(Dropout(0.25))
fashion_model.add(Conv2D(64, (3, 3), activation='linear', padding='same'))
fashion_model.add(LeakyReLU(alpha=0.1))
fashion_model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
fashion_model.add(Dropout(0.25))
fashion_model.add(Conv2D(128, (3, 3), activation='linear', padding='same'))
fashion_model.add(LeakyReLU(alpha=0.1))
fashion_model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
fashion_model.add(Dropout(0.4))
fashion_model.add(Flatten())
fashion_model.add(Dense(128, activation='linear'))
fashion_model.add(LeakyReLU(alpha=0.1))
fashion_model.add(Dropout(0.3))
fashion_model.add(Dense(num_classes, activation='softmax'))

## Model summary
fashion_model.summary()

## Compile the model
fashion_model.compile(loss=keras.losses.categorical_crossentropy,
                      optimizer=keras.optimizers.Adam(),
                      metrics=['accuracy'])

## Train the dropout model
fashion_train_dropout = fashion_model.fit(train_X, train_label,
                                          batch_size=batch_size, epochs=epochs,
                                          verbose=1,
                                          validation_data=(valid_X, valid_label))

## Save the model in HDF5 format
fashion_model.save("fashion_model_dropout.h5")

# Evaluate the model on the test set
test_eval = fashion_model.evaluate(test_X, test_Y_one_hot, verbose=1)
print('Test loss:', test_eval[0])
print('Test accuracy:', test_eval[1])

## Plot the accuracy and loss curves for training and validation data one last time
accuracy = fashion_train_dropout.history['accuracy']
val_accuracy = fashion_train_dropout.history['val_accuracy']
loss = fashion_train_dropout.history['loss']
val_loss = fashion_train_dropout.history['val_loss']
epochs = range(len(accuracy))
plt.plot(epochs, accuracy, 'bo', label='Training accuracy')
plt.plot(epochs, val_accuracy, 'b', label='Validation accuracy')
plt.title('Training and validation accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()

# Predict labels
predicted_classes = fashion_model.predict(test_X)
## Convert the predicted probabilities into class labels
predicted_classes = np.argmax(predicted_classes, axis=1)
print(predicted_classes.shape, test_Y.shape)

## Show examples of correctly classified images
correct = np.where(predicted_classes == test_Y)[0]
print("Found %d correct labels" % len(correct))
for i, idx in enumerate(correct[:9]):
    plt.subplot(3, 3, i + 1)
    plt.imshow(test_X[idx].reshape(28, 28), cmap='gray', interpolation='none')
    plt.title("Predicted {}, Class {}".format(predicted_classes[idx], test_Y[idx]))
    plt.tight_layout()
plt.show()

## Show examples of incorrectly classified images
incorrect = np.where(predicted_classes != test_Y)[0]
print("Found %d incorrect labels" % len(incorrect))
for i, idx in enumerate(incorrect[:9]):
    plt.subplot(3, 3, i + 1)
    plt.imshow(test_X[idx].reshape(28, 28), cmap='gray', interpolation='none')
    plt.title("Predicted {}, Class {}".format(predicted_classes[idx], test_Y[idx]))
    plt.tight_layout()
plt.show()

# Classification report
from sklearn.metrics import classification_report
target_names = ["Class {}".format(i) for i in range(num_classes)]
print(classification_report(test_Y, predicted_classes, target_names=target_names))
The code above does the following operations (for more details, read the comments inside the code snippet):
1. Introduce you to convolutional neural networks with Keras.
2. Load, explore, and analyze the data.
3. Preprocess your data
You'll reshape and rescale the images, convert your labels into one-hot encoding vectors, and split your data into training and validation sets.
4. Construct the neural network model
You’ll model the data and form the network. You’ll compile, train and evaluate the model, visualizing the accuracy and loss plots.
5. Learn about the concept of overfitting and how you can overcome it by adding a dropout layer.
6. Revisit your original model with the dropout layers added and re-train it. You'll also re-evaluate the new model and compare the results of both models.
7. Make predictions on the test data, convert the probabilities into class labels, and plot a few test samples that your model correctly classified and incorrectly classified.
8. Finally, visualize the classification report, which will give you more in-depth intuition about which class was (in)correctly classified by your model.
Let's run the above code in the PyScripter IDE. The following are some selected outputs:
1. Take a look at the images in your dataset:
2. Model summary:
3. Train the model:
4. Plot the accuracy and loss plots between training and validation data:
5. Plot the accuracy and loss plots between training and validation data after adding the Dropout layer to our model:
6. Show correct labels:
7. Show incorrect labels:
8. Classification report:
Congratulations! You have now learned how to build and train a Convolutional Neural Network (CNN) from scratch and run it successfully inside the PyScripter IDE with excellent speed and performance.
Visit our other AI-related articles here.
Click here to get started with PyScripter, a free, feature-rich, and lightweight Python IDE.
Download RAD Studio to create more powerful Python GUI Windows Apps in 5x less time.
Check out Python4Delphi, which makes it simple to create Python GUIs for Windows using Delphi.
Also, look into DelphiVCL, which makes creating Windows GUIs with Python simple.
References & further readings
[1] Biswal, A. (2023). Top 10 Deep Learning Algorithms You Should Know in 2023. Simplilearn. simplilearn.com/tutorials/deep-learning-tutorial/deep-learning-algorithm
[2] Cmglee. (2021). Comparison of the LeNet and AlexNet convolution, pooling and dense layers. Wikimedia. en.wikipedia.org/wiki/Convolutional_neural_network#/media/File:Comparison_image_neural_networks.svg
[3] Schulz, H., & Behnke, S. (2012). Deep learning: Layer-wise learning of feature hierarchies. KI-Künstliche Intelligenz, 26, 357-363.
[4] Sharma, A. (2017). Convolutional Neural Networks in Python with Keras. DataCamp Blog. datacamp.com/tutorial/convolutional-neural-networks-python
[5] Zhang, A., Lipton, Z. C., Li, M., & Smola, A. J. (2021). Dive into Deep Learning. arXiv preprint arXiv:2106.11342.