WomanLife: Deep Learning for the detection and classification of breast cancer

La Paz. Deep Learning. 2021

Today, thanks to deep learning algorithms of artificial intelligence, we have the possibility to automate the classification of images, so this tool can help medical personnel in the classification and early detection of breast cancer. In this way, women suffering from this disease could be diagnosed automatically, in time to start treatment.

Breast cancer is the most common type of cancer in women and is also one of the main causes of death according to the WHO (WHO, 2020). Early detection is the single most important factor in lowering cancer treatment costs and mortality. To make it possible it is necessary to have medical ultrasound images and specialists who can explain them. However, the lack of these creates a gap in access to early treatment in countries with little or not enough access to specialized diagnostic services and whose population receives low and middle income.


Description of the problem

Our project consists of the detection and classification of breast cancer in women between 25 and 75 years old. This is possible from the development of an deep learning model trained with images obtained using ultrasound scanners that result in the segmentation of the type of cancer that could be suffered.


Objective

Allow women suffering from breast cancer to be automatically diagnosed using a deep learning model so that they can start treatment early and safely, reducing costs and the mortality rate. To meet this objective, we have proposed a tool that uses artificial intelligence to provide greater agility to the process through self-diagnosis with ultrasound images.


Model selection

The breast cancer detection and classification project works with ultrasound images of three types, labeled as benignmalignant and neutral, so the deep learning model selected for its execution is convolutional networks with TensorFlow Keras.


Datasets

The dataset was collected from Baheya Hospital for Early Detection and Treatment of Women’s Cancer, Cairo, Egypt. It contains 780 breast ultrasound images, in women between 25 and 75 years old (133 normal, 437 benign and 210 malignant) with an average image size of 500 x 500 pixels, some of which are seen below,

Fig. 1. Samples of images

The images from the original dataset contain mask images that do not provide meaningful information to the model we developed, for this reason Shell statements were used to remove them from the dataset we are using.


Implemented techniques

We must emphasize that until now there is a shortage of public data sets of breast cancer ultrasound images and it prevents the good performance of the algorithms. Because of this, the authors who made public the dataset we used, recommend augmenting data using GANs.

Our project developed GAN networks for each class in order to obtain more accurate results and 150 epochs were used.

However, it failed to create usable images, for this reason we declined the use of this technique. The challenge is to develop the GAN with a greater number of epochs and with a better neural network configuration to obtain more realistic images.

Fig. 2. MALIGNANT
Source: Compiled by authors using Matplotlib
Fig. 3. BENIGN
Source: Compiled by authors using Matplotlib
Fig.4. NORMAL
Source: Compiled by authors using Matplotlib


Network definition

Within the possible design patterns in Keras, subclassing has been implemented to use the low-level APIs of Keras. You can consult more information about this in the following article:

https://towardsdatascience.com/3-keras-design-patterns-every-ml-engineer-should-know-cae87618c7e3

The structure of the network consists of:

  • Preprocessing layer: Resizing, Rescaling and Normalization
  • Conv2D: 32 filters, 4 strides, ‘same’ padding and ReLU activation
  • MaxPooling2D: pool_size of (3,3), ‘same’ padding and 2 strides
  • Flatten
  • Dense: 512 neurons and ReLU activation
  • Dropout (0.4)
  • Dense: 3 neurons and SoftMax activation

We are based on AlexNet architecture, on which we made some adjustments like number of neurons, fully connected layers and dropout values.

We use Adam optimizer with learning rate of 0.0001, the Sparse Categorical Crossentropy loss function and Sparse Categorical Accuracy function.

Fig. 5. Model summary — Source: Compiled by authors


Training

TensorBoard was used to observe the real-time behavior of the accuracy and loss values, which provides useful graphs to analyze results and many controls for their manipulation.

Fig. 6. Dashboard TensorBoard — Source: Compiled by authors

Earlystopping

We use EarlyStopping as a form of regularization to avoid overfitting when training the model. For example, if the loss value stops decreasing, the training will stop even though all iterations have not been completed.


Conclusions and future works

WomanLife is intended to be an easy-to-access, low-cost medical diagnostic tool.

This AI is not only beneficial for women who use it but also has the potential to become a medical assistant. We want to clarify that WomanLife does not intend to replace medical specialists but to provide a tool that facilitates their work.

From now on we intend to optimize the model using a GAN network to obtain greater precision and use techniques that find the correct parameters for training the model (Hyperparameter tuning).

Our project also developed an application that, given an image scanned with the camera or selected from the gallery, goes through the developed network and returns a series of probabilities related to the type of cancer suffered.

The model was developed in pure TensorFlow, converted, saved and exported to TensorFlow Lite.

Fig. 7. Sample of the operation of the application prototype — Source: Own elaboration
Fig. 8. Conversion from TensorFlow to TensorFlow Lite architecture — Source: Own elaboration


Sources

You can access to notebook and mobile application through my GitHub repositories bellow:

https://github.com/edcalderin/BreastCancerDetection_CNN

https://github.com/edcalderin/BreastCancerDetection_app

Here, you will can find more projects related to Data Science and Machine Learning. In summary, it contains all my work so far. Any reply or comment is always welcome.


About the authors

Erick Calderin Morales

Systems engineer with experience in software development, master’s student in systems engineering and master’s degree in data science with an affinity for artificial intelligence.

Linkedin: https://www.linkedin.com/in/erick-calderin-5bb6963b/

Sharon Maygua Mendiola

Mechatronics engineering student with a degree in physics.

Linkedin: https://www.linkedin.com/in/sharon-sarai-maygua-mendiola-22288019a/

References

Saturdays.AI

Presentación del proyecto: DemoDay

Repositorio

En el siguiente repositorio se encuentra el código usuado para desarrollar esta aplicación:

https://github.com/SaturdaysAI/Projects/tree/master/Lapaz/2021.DL/BreastCancerDetection_CNN-master/BreastCancerDetection_CNN-master

WRITTEN BY

Erick Calderin

Systems Engineer passionated to Deep Learning and Artificial Inteligence

Saturdays.AI

¡Más inteligencia artificial!

Saturdays.AI is an impact-focused organization on a mission to empower diverse individuals to learn Artificial Intelligence in a collaborative and project-based way, beyond the conventional path of rogramas intensivos donde se realizan proyectos para el bien (#ai4good).

Infórmate de nuestro master sobre inteligencia artifical en https://saturdays.ai/master-ia-online/

Si quieres aprender más inteligencia artificial únete a nuestra comunidad en community.saturdays.ai o visítanos en nuestra web www.saturdays.ai ¡te esperamos!