
What are convolutional neural networks (CNNs)?

Convolutional neural networks (CNNs) are a type of neural network that is particularly good at processing data with a grid-like topology, such as images. They are inspired by the structure of the visual cortex, where each neuron responds only to a small region of the visual field, called its “receptive field”.

The key idea behind convolutional neural networks is that the same weights and operations are applied at each receptive field, so that the network can learn to recognize patterns or features anywhere in the input. This is done using a mathematical operation called convolution, which is where the name “convolutional neural network” comes from.
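To make the weight-sharing idea concrete, here is a rough parameter count. The 28×28 grayscale image and the 3×3 kernel are assumptions chosen only for illustration.

```python
# Assumed sizes, chosen only for illustration
height, width = 28, 28      # a small grayscale image
kernel_size = 3             # a 3x3 kernel

# A fully connected layer that maps every input pixel to every output pixel
# needs one weight per (input pixel, output pixel) pair.
dense_weights = (height * width) * (height * width)

# A convolutional layer with a single kernel reuses the same 3x3 weights
# at every position, so the count does not depend on the image size.
conv_weights = kernel_size * kernel_size

print(dense_weights)  # 614656
print(conv_weights)   # 9
```

Because the same nine weights are reused everywhere, a pattern learned in one corner of the image can be recognised in any other part of it.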

Convolution is a mathematical operation that takes two functions and produces a third function that represents the overlap of the two functions as one is shifted relative to the other. In the context of convolutional neural networks, the two functions are the input image and a small matrix of weights called a “kernel” or “filter”, and the output is a new image, often called a “feature map”, that has been transformed by the kernel. The process involves sliding the kernel over the input image, multiplying each element of the kernel by the input pixel it overlaps, and summing the products to produce a single output value for each position. This is repeated for every position in the input to produce the output image. (Strictly speaking, mathematical convolution flips the kernel before sliding it; most deep learning libraries skip the flip, which makes no practical difference because the weights are learned.)
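The sliding procedure described above can be sketched in a few lines of NumPy. This is a minimal, unoptimised illustration, not how libraries implement it; the function name `slide_kernel` and the toy values are made up for this example, and the kernel is not flipped.

```python
import numpy as np

def slide_kernel(image, kernel):
    """Slide `kernel` over `image` and return the sum of products at each position.

    No padding is used, so the output is smaller than the input.
    """
    image_h, image_w = image.shape
    kernel_h, kernel_w = kernel.shape
    out_h = image_h - kernel_h + 1
    out_w = image_w - kernel_w + 1
    output = np.zeros((out_h, out_w))
    for row in range(out_h):
        for col in range(out_w):
            # The region of the image currently under the kernel
            patch = image[row:row + kernel_h, col:col + kernel_w]
            output[row, col] = np.sum(patch * kernel)
    return output

# Toy 4x4 image holding the values 0..15, and a 2x2 kernel
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])

result = slide_kernel(image, kernel)
print(result.shape)  # (3, 3)
print(result)        # every entry is -5.0
```

Each output value here is a pixel minus its lower-right neighbour, which is always -5 for this particular image.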

Convolution is a powerful operation that can be used to extract features from images, such as edges, corners, and patterns. It can also be used to blur, sharpen, or otherwise modify images in various ways. In a convolutional neural network, multiple convolutional layers are typically stacked on top of each other, with each layer learning to detect more complex features based on the features detected by the previous layer.
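As a small illustration of feature extraction, the sketch below applies a fixed Sobel kernel, a classic hand-designed vertical-edge detector, to a toy image whose left half is dark and right half is bright. The helper `slide_kernel` and the image are illustrative assumptions; in a CNN the kernel weights would be learned rather than written by hand.

```python
import numpy as np

def slide_kernel(image, kernel):
    """Sum of element-wise products at every position where the kernel fits."""
    kernel_h, kernel_w = kernel.shape
    out_h = image.shape[0] - kernel_h + 1
    out_w = image.shape[1] - kernel_w + 1
    output = np.zeros((out_h, out_w))
    for row in range(out_h):
        for col in range(out_w):
            patch = image[row:row + kernel_h, col:col + kernel_w]
            output[row, col] = np.sum(patch * kernel)
    return output

# Toy image: columns 0-2 are dark (0), columns 3-5 are bright (1)
edge_image = np.zeros((5, 6))
edge_image[:, 3:] = 1.0

# Sobel kernel that responds to left-to-right changes in brightness
sobel_x = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])

edges = slide_kernel(edge_image, sobel_x)
print(edges)  # each row is [0. 4. 4. 0.]
```

The response is non-zero only where the kernel straddles the boundary between the dark and bright halves.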

Convolution in 2D

Suppose we have an input image that is 4×4 pixels, and a kernel or filter that is 3×3 pixels. The kernel is a matrix of weights that will be used to transform the input image.

Input image:
[a b c d]
[e f g h]
[i j k l]
[m n o p]

Kernel:
[w1 w2 w3]
[w4 w5 w6]
[w7 w8 w9]

To compute the output of the convolution, we slide the kernel over the input image, element-wise multiplying and summing the elements at each position. For example, to compute the first output value, we place the kernel over the top-left 3×3 region of the image:

output[0][0] = (a*w1) + (b*w2) + (c*w3) + (e*w4) + (f*w5) + (g*w6) + (i*w7) + (j*w8) + (k*w9)

We would then move the kernel one position to the right and compute the next output value:

output[0][1] = (b*w1) + (c*w2) + (d*w3) + (f*w4) + (g*w5) + (h*w6) + (j*w7) + (k*w8) + (l*w9)

We would repeat this process until the kernel has been applied at every position where it fits inside the input image. Without padding, this gives a 2×2 output. If the input is first padded with a border of zeros, the output is the same size as the input.
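To see how the output size depends on padding, here is a minimal NumPy sketch. It assumes a 4×4 input holding the numbers 1 to 16 (standing in for a..p) and a 3×3 kernel of ones (standing in for the weights); both choices and the helper name `slide_kernel` are for illustration only.

```python
import numpy as np

def slide_kernel(image, kernel):
    """Sum of element-wise products at every position where the kernel fits."""
    kernel_h, kernel_w = kernel.shape
    out_h = image.shape[0] - kernel_h + 1
    out_w = image.shape[1] - kernel_w + 1
    output = np.zeros((out_h, out_w))
    for row in range(out_h):
        for col in range(out_w):
            patch = image[row:row + kernel_h, col:col + kernel_w]
            output[row, col] = np.sum(patch * kernel)
    return output

image = np.arange(1, 17, dtype=float).reshape(4, 4)  # stands in for a..p
kernel = np.ones((3, 3))                              # stands in for the weights

# No padding: the kernel only fits at 2x2 positions
valid = slide_kernel(image, kernel)
print(valid)          # [[54. 63.] [90. 99.]]

# One pixel of zero padding on every side: the output is 4x4 again
padded = slide_kernel(np.pad(image, 1), kernel)
print(padded.shape)   # (4, 4)
```

The centre of the padded result equals the unpadded result; padding only adds the border positions where the kernel hangs over the edge.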

Here’s an example of how this looks when applied to a small binary image, with the input padded with zeros so that the output keeps the same size:

Input image:

[0 0 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 0 0 0] [0 0 0 1 1 1 0 0 0 0] [0 0 0 1 1 1 0 0 0 0] [0 0 0 1 1 1 0 0 0 0] [0 0 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 0 0 0] [0 0 0 0 0 0 0 0 0 0]

Kernel:

[0 0 1]
[0 1 0]
[1 0 0]

Output image:

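[0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 1 1 1 0 0 0]
[0 0 0 1 2 2 1 0 0 0]
[0 0 1 2 3 2 1 0 0 0]
[0 0 1 2 2 1 0 0 0 0]
[0 0 1 1 1 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0]

In this example, the kernel has ones along one diagonal (top-right to bottom-left), so it responds most strongly where bright pixels line up along that direction. The response peaks at 3 in the centre of the square, where all three kernel weights overlap bright pixels, and the square is smeared along the diagonal. The output here is the same size as the input because the input was padded with zeros. Without padding, the kernel cannot be centred on the border pixels, so a 3×3 kernel would produce an 8×8 output; both choices are common in convolutional neural networks.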
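A hand-worked example like this can be checked with a few lines of NumPy. This is a minimal sketch that assumes one pixel of zero padding on each side, so the output keeps the 10×10 size of the input; the variable names are illustrative.

```python
import numpy as np

# 10x10 image with a 3x3 square of ones in rows 3-5, columns 3-5
image = np.zeros((10, 10), dtype=int)
image[3:6, 3:6] = 1

kernel = np.array([[0, 0, 1],
                   [0, 1, 0],
                   [1, 0, 0]])

# Pad with a one-pixel border of zeros so the kernel can be centred on every pixel
padded = np.pad(image, 1)

output = np.zeros_like(image)
for row in range(10):
    for col in range(10):
        output[row, col] = np.sum(padded[row:row + 3, col:col + 3] * kernel)

print(output)
```

With these values, the largest response (3) lands at the centre of the square, where three bright pixels line up along the kernel's diagonal.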

Another real example

Here is how to use convolution to blur an image using Python and the popular machine learning library TensorFlow:

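import tensorflow as tf

# Load the input image as a uint8 tensor of shape [height, width, 3]
input_image = tf.io.read_file('input.jpg')
input_image = tf.image.decode_jpeg(input_image, channels=3)

# conv2d expects a float tensor with a batch dimension: [1, height, width, 3]
image = tf.cast(input_image, tf.float32)[tf.newaxis, ...]

# Create the blur kernel: a 5x5 box filter applied to each channel separately.
# Kernel shape is [height, width, in_channels, out_channels]; the identity
# matrix over the channel axes keeps the colour channels from mixing.
kernel = tf.ones([5, 5, 1, 1]) / 25.0 * tf.reshape(tf.eye(3), [1, 1, 3, 3])

# Use convolution to apply the blur kernel to the input
blurred = tf.nn.conv2d(image, kernel, strides=1, padding='SAME')

# Drop the batch dimension, convert back to uint8, and save the output image
output_image = tf.cast(tf.clip_by_value(tf.round(blurred[0]), 0, 255), tf.uint8)
tf.io.write_file('output.jpg', tf.image.encode_jpeg(output_image))

This code loads an image, converts it to a float tensor with a batch dimension (which conv2d requires), creates a 5×5 blur kernel whose weights sum to 1 for each colour channel, and applies it with the conv2d function. With strides=1 the kernel moves one pixel at a time, and padding='SAME' pads the borders with zeros so that the output has the same size as the input. The result is then rounded, converted back to 8-bit integers, and saved to a file.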
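How strides and padding affect the output size can be sketched with a small helper. The function name `conv_output_size` is made up for this example; the formulas follow the 'SAME' and 'VALID' padding rules described in the TensorFlow documentation, applied to one dimension at a time.

```python
import math

def conv_output_size(input_size, kernel_size, stride=1, padding="SAME"):
    """Output length along one dimension for TensorFlow-style padding modes."""
    if padding == "SAME":
        # Zero padding is added so that only the stride shrinks the output
        return math.ceil(input_size / stride)
    if padding == "VALID":
        # No padding: the kernel must fit entirely inside the input
        return math.ceil((input_size - kernel_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

print(conv_output_size(10, 3, stride=1, padding="SAME"))   # 10
print(conv_output_size(10, 3, stride=1, padding="VALID"))  # 8
print(conv_output_size(10, 5, stride=2, padding="SAME"))   # 5
print(conv_output_size(10, 5, stride=2, padding="VALID"))  # 3
```

For a 2D image the same calculation is applied separately to the height and the width.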

Applications of convolution and CNNs

Convolution is a powerful mathematical operation that is widely used in a variety of fields, including image and signal processing, machine learning, and data analysis. Some examples of where convolution is used include:

  • Image processing: Convolution can be used to apply various image filters, such as blur, sharpening, edge detection, and noise reduction. The interpolation filters used when resizing images are also applied by convolution.
  • Signal processing: Convolution can be used to smooth, filter, or otherwise modify signals in various ways. It is commonly used in audio processing, where it can be used to apply effects such as echo, reverb, and noise reduction.
  • Machine learning: Convolution is a key building block of convolutional neural networks, which are widely used for tasks such as image classification, object detection, and natural language processing.
  • Data analysis: Convolution can be used to smooth or filter data, or to analyze the relationships between different data sets. It is often used in finance, where it can be used to analyze time series data such as stock prices.
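As a small illustration of the data-analysis case, a moving average is a 1D convolution with a kernel of equal weights. The sketch below uses NumPy's `np.convolve`; the price values are made up for illustration.

```python
import numpy as np

# Made-up daily closing prices, for illustration only
prices = np.array([10.0, 11.0, 13.0, 12.0, 15.0, 16.0, 14.0])

# A 3-day moving average is a convolution with three equal weights
window = 3
kernel = np.ones(window) / window

# mode="valid" keeps only positions where the window fits entirely inside the data
smoothed = np.convolve(prices, kernel, mode="valid")
print(smoothed)  # 5 values, one per full 3-day window
```

Each output value is the mean of three consecutive prices, so short-term jumps are smoothed out.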

I hope you liked this simple explanation. I will cover more details, along with a project, in upcoming posts.
