Using a Convolutional Neural Network (CNN) to Diagnose Malaria

Ishaana Misra
Published in Analytics Vidhya · 6 min read · Feb 2, 2021

You’ve probably heard of computer vision, a fairly self-explanatory term: giving a computer the ability to “see”. A more technical definition would be: giving a computer the ability to analyze an image and correctly classify it into its respective category.

In this article, I will be using some technical terms which I won’t be covering here, so if you’re unfamiliar with neural networks, I’d recommend reading this article I wrote about them before diving into this one:

Data Preprocessing

In machine learning, we spend around 80% of our time dealing with data and getting it ready (also known as preprocessing). So before we get into convolutional neural networks, training, and accuracy, we need to deal with our data. This includes gathering the data, making sure the images are consistent, and making any other changes we find useful.

Gathering data can take a very long time, but thankfully this step can be skipped since there is already an existing data set of 27,558 images of cells, both infected and uninfected with malaria.

A few of the cell images; uninfected (top) and infected (bottom)

Now that we have the data, we can work on preprocessing it. As you can see, the images above all have different dimensions, making this aspect of the images inconsistent.

This becomes a problem since a) it is challenging to build a CNN whose input images all have different dimensions, and b) we want our CNN to detect the differences between images that are significant for interpreting a cell as parasitized or uninfected, and differing dimensions aren’t one of them. So, let’s resize the images to 50 x 50 pixels.

We also need to convert these images to grayscale since, in this situation, color isn’t relevant. Besides, a grayscale image is much easier to convert to numerical values: a computer sees it as an array of pixel values between 0 (black) and 255 (white), with all values in between representing shades of gray.

Resizing an image and then converting it to grayscale
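As a rough sketch of this preprocessing step, here is the idea in plain NumPy. In practice you would use an image library such as Pillow or OpenCV for loading and resizing; the random array below just stands in for one real cell image:

```python
import numpy as np

# Stand-in for one RGB cell image (real code would load it from the
# dataset); shape is (height, width, channels) and varies per image.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(120, 98, 3), dtype=np.uint8)

def resize_nearest(image, size=50):
    """Naive nearest-neighbor resize to size x size. Libraries like
    Pillow or OpenCV do this better; this only shows the idea."""
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return image[rows][:, cols]

def to_grayscale(image):
    """Standard luminosity weighting of the R, G, B channels."""
    return (image @ np.array([0.299, 0.587, 0.114])).astype(np.uint8)

processed = to_grayscale(resize_nearest(img))
print(processed.shape)  # (50, 50), values from 0 (black) to 255 (white)
```

After this step, every image in the data set has the same 50 x 50 shape, ready to feed into the network.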

Now we can get into how computers see using CNNs, but first of all…

What Is a Convolutional Neural Network?

A convolutional neural network is a kind of deep neural network which is typically used for analyzing images, though in recent years it has been used to analyze other types of data as well.

For our purposes, we will input an image to the CNN and expect it to classify the image, meaning determine which of the two possible classes the image belongs to: uninfected or infected. Since we convert the image into an array of pixel values, that array can also be seen as the input to our CNN.

How Do CNNs Work?

What makes a CNN distinct from other types of neural networks is its ability to pick up on an image’s distinct features, such as edges and textures. But what makes a CNN unique in this way?

Typical neural networks take in multiple numerical values as inputs and each node in the next layer (refer to the article I mentioned earlier) looks at each of these inputs. We could put our array of pixel values through this neural network, with each pixel value as an input, but that would become redundant and unnecessary.

If you think of how humans look at images, or anything for that matter, we don’t look at them pixel by pixel and then try to piece everything together to make sense of it. We look at edges and textures, which is a much faster way of making sense of images. The question is, how do we replicate this with a computer?

With a CNN, instead of looking at each pixel by itself, we look at a group of pixels at a time. It can be difficult to visualize, so check out the diagram below.

Above is an oversimplified example. We’ve split up this image into a 4 x 4 grid. We look at specific sections of this image using something called a kernel. The kernel convolves over the image, looking at certain sections, also known as tiles. Each of the nodes above corresponds to a certain section of the image: the red node corresponds to the red tile, the blue node to the blue tile, and so on.

This helps us remove the redundancy we would have had if we used a deep neural network instead since here each node is only responsible for a certain section of the image as opposed to the entire image.
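The sliding-kernel idea can be sketched in a few lines of NumPy. The 2 x 2 kernel values below are made up purely for illustration; a real CNN learns its kernel values during training:

```python
import numpy as np

# Toy 4 x 4 "image", like the grid in the diagram above.
image = np.arange(16, dtype=float).reshape(4, 4)

# A hypothetical 2 x 2 kernel (values chosen only for illustration).
kernel = np.array([[1.0, -1.0],
                   [1.0, -1.0]])

# Slide the kernel over the image with stride 1: each output value
# summarizes one tile, just as each node corresponds to one tile.
out = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        tile = image[i:i + 2, j:j + 2]
        out[i, j] = np.sum(tile * kernel)

print(out.shape)  # (3, 3): one value per tile the kernel visited
```

Notice that the output is smaller than the input: each node only ever sees the pixels inside its own tile, which is exactly where the redundancy savings come from.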

Now, check out the slightly more complex diagram below.

Image: Van Hiep Phung, et. al

We have the input (our image) over which the kernel convolves to identify features, and then we compare these features using a fully connected neural network. A fully connected neural network is basically just a neural network in which each node in one layer takes into account every node in the previous layer, as shown above in the classification section.
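To make “fully connected” concrete, here is a minimal NumPy sketch in which every output node has a weight for every input node (the layer sizes here are arbitrary, not the ones from the diagram):

```python
import numpy as np

rng = np.random.default_rng(0)

# Flattened feature vector coming out of the convolutional stage
# (the length 8 is just for illustration).
features = rng.normal(size=8)

# In a fully connected layer, every output node weights every input
# node, so the weight matrix has shape (outputs x inputs).
weights = rng.normal(size=(4, 8))
bias = rng.normal(size=4)

outputs = weights @ features + bias  # each of the 4 nodes sees all 8 inputs
print(outputs.shape)  # (4,)
```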

Finally, the output layer classifies the data into its respective class.

Making Our CNN

When making our CNN, we don’t need to limit ourselves to just one convolutional layer. We can have multiple, which means that we go through the process of having a kernel convolve over an image multiple times.

In fact, having more convolutional layers means that we can better detect features within our image. In our CNN, we’ll have 3 convolutional layers and 2 fully connected ones.
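As a sketch of how our 50 x 50 input shrinks as it moves through those three convolutional layers, here is a small shape trace. The kernel size (3) and pooling size (2) are assumptions chosen for illustration, not necessarily the values in my actual model:

```python
def conv_size(size, kernel=3, stride=1):
    # Spatial size after a "valid" convolution (no padding).
    return (size - kernel) // stride + 1

def pool_size(size, pool=2):
    # Spatial size after 2 x 2 pooling.
    return size // pool

# Trace a 50 x 50 grayscale input through three conv + pool blocks.
size = 50
for block in range(1, 4):
    size = pool_size(conv_size(size))
    print(f"after conv block {block}: {size} x {size}")
```

The final feature maps are then flattened and passed through the two fully connected layers, ending in the two-class output.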

So, our neural network is now complete! We’re not done just yet though, since we still need to train our model, using a process called backpropagation.

Backpropagation, in simple terms, means updating our model’s weights and bias each time an image passes through it, based on how wrong or right the model was in classifying. Through this process, we can increase the accuracy of our model.
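Here is a toy, single-weight illustration of that idea (all numbers made up): compute a prediction, measure how wrong it was, and nudge the weight and bias in the direction that reduces the error. Real backpropagation does this for every weight in the network at once:

```python
# One weight w and bias b, one training example (x, y), squared-error loss.
w, b = 0.0, 0.0
x, y = 2.0, 1.0   # y = 1 could mean "infected"
lr = 0.05          # learning rate: how big each update step is

for step in range(50):
    pred = w * x + b          # forward pass: the model's guess
    error = pred - y          # how wrong the model was
    # Gradients of the loss (error ** 2) with respect to w and b.
    grad_w = 2 * error * x
    grad_b = 2 * error
    w -= lr * grad_w          # update toward a smaller error
    b -= lr * grad_b

print(round(w * x + b, 4))   # prediction is now very close to the target 1.0
```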

So, once we have trained our model, we need a way to test how accurate it is at classifying images. We test the model by giving it many images and checking whether the class the model sorted each image into matches the image’s real class.

Through this process, I tested the model and found out that it had an accuracy of 87.1%.
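Accuracy here just means the fraction of test images classified correctly. With some made-up labels (1 = infected, 0 = uninfected) the calculation looks like this:

```python
# True classes of a toy batch of test images, and the model's predictions.
true_labels      = [1, 0, 1, 1, 0, 0, 1, 0]
predicted_labels = [1, 0, 0, 1, 0, 1, 1, 0]

# Count how many predictions match, then divide by the batch size.
correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
accuracy = correct / len(true_labels)
print(f"accuracy: {accuracy:.1%}")  # 75.0% on this toy batch
```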

You can see my code here:

You can also check out this video I made on the subject:

Ishaana Misra is an 8th grader interested in AI and healthcare, especially the intersection of the two. She is also an Innovator at The Knowledge Society.

LinkedIn: https://www.linkedin.com/in/ishaana-misra/

Check out my newsletters: https://ishaana.substack.com

