“Hello World!” Neural Networks with TensorFlow and the Keras API (DL1)

Felipe Gonzalez
4 min read · Jun 24, 2021

In this short tutorial we are going to build our first neural network with TensorFlow and its Keras API. It will be able to identify handwritten digits; for that purpose we will use a dataset called MNIST, which Keras itself provides.

To do this we will install the TensorFlow library, which comes with Keras; we will also need NumPy and Matplotlib.

pip install tensorflow
pip install numpy
pip install matplotlib

First of all, we must import the TensorFlow library.

import tensorflow as tf
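To confirm that the installation worked, a quick sanity check (optional, not part of the original notebook) is to print the installed version; the exact number you see will depend on your environment.

print(tf.__version__)  # e.g. 2.x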

Then we must load the MNIST dataset, which, as mentioned above, contains 28x28-pixel images of handwritten digits.

mnist = tf.keras.datasets.mnist

Then we unpack the dataset into training and test sets.

(x_train, y_train), (x_test, y_test) = mnist.load_data()
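If you want to verify what load_data() returned, printing the shapes is a quick check; the sizes below are the standard MNIST split.

print(x_train.shape)  # (60000, 28, 28): 60,000 training images of 28x28 pixels
print(x_test.shape)   # (10000, 28, 28): 10,000 test images
print(y_train[:5])    # integer labels of the first five images: [5 0 4 1 9]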

For a better look at the digits, we will use Matplotlib to plot the first few of them.

import matplotlib.pyplot as plt

# Plot twelve training images in a 3x4 grid
fig, ax = plt.subplots(3, 4)
cont = 1
for i in range(3):
    for j in range(4):
        ax[i, j].imshow(x_train[cont], cmap=plt.cm.binary)
        cont = cont + 1
plt.show()
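As a small variation (not in the original notebook), we can also show each image's true label as the title of its subplot:

fig, ax = plt.subplots(3, 4)
cont = 1
for i in range(3):
    for j in range(4):
        ax[i, j].imshow(x_train[cont], cmap=plt.cm.binary)
        ax[i, j].set_title(str(y_train[cont]))  # true label above each digit
        ax[i, j].axis('off')                    # hide the pixel-coordinate axes
        cont = cont + 1
plt.show()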

To see how our program sees each of these digits, we can simply print the raw array with the print function; in this case we are going to print the image at position 7 of our dataset.

print(x_train[7])

Visually we can clearly see that it is the number 3: the zeros are the background, and the non-zero values (shades up to 255) are the pixels that make up the digit.

It is always good practice to normalize our data, so that the values do not span such a wide range (they go from 0 to 255) but are instead more uniform and can be processed more easily by our neural network.

x_train = tf.keras.utils.normalize(x_train, axis=1)
x_test = tf.keras.utils.normalize(x_test, axis=1)
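Note that tf.keras.utils.normalize applies an L2 normalization along the given axis. An equally common approach for image data, shown here only as an alternative sketch, is simply scaling the raw pixel values into the [0, 1] range (this assumes the arrays straight from mnist.load_data(), before any other normalization):

# Alternative: scale raw 0-255 pixel values into the [0, 1] range
x_train_scaled = x_train / 255.0
x_test_scaled = x_test / 255.0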

If we now want to see the normalized image, we just visualize it as we did previously.

plt.imshow(x_train[7], cmap=plt.cm.binary);
print(x_train[7])

Now that we have our data pre-processed, we can start to design our neural network. In this case, since our dataset comes “clean” and ready out of the box, the pre-processing was short and easy; however, in most real-life cases, approximately 70–80% of the time is spent pre-processing the data to make it suitable for our neural network.

First we will design the architecture of our neural network.

Each 28x28 image is converted into a flat vector of 784 values. These vectors pass through the dense hidden layers, which process and learn from them, and then reach a final layer whose activation function is softmax; this activation function is in charge of returning each class as a probability.

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),   # 28x28 image -> vector of 784
    tf.keras.layers.Dense(128, activation=tf.nn.relu),
    tf.keras.layers.Dense(128, activation=tf.nn.relu),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax),  # one probability per digit
])
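With input_shape given, the model is built immediately, so we can inspect the architecture with model.summary(); the parameter counts below follow from the layer sizes (inputs x units + biases).

model.summary()
# Flatten:    0 parameters (just reshapes 28x28 -> 784)
# Dense(128): 784 * 128 + 128 = 100,480
# Dense(128): 128 * 128 + 128 = 16,512
# Dense(10):  128 * 10  + 10  = 1,290
# Total:      118,282 trainable parameters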

After designing our neural network, we have to specify the settings (optimizer, loss, and evaluation metric) with which it will be trained; in TensorFlow this is called compiling. For this case the Adam optimizer will be used; for the loss, since this is a classification problem with more than two classes and integer labels, sparse categorical crossentropy will be used, with accuracy as the evaluation metric.

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
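The “sparse” variant of the loss expects plain integer labels, which is exactly what y_train contains; if the labels were one-hot encoded, plain categorical_crossentropy would be used instead. A minimal sketch of the difference:

print(y_train[:3])  # integer labels: [5 0 4]

# One-hot encoding the same labels; these would need 'categorical_crossentropy'
y_onehot = tf.keras.utils.to_categorical(y_train[:3], num_classes=10)
print(y_onehot)     # each row has a single 1 at the label's position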

Next, we will train the neural network, specifying x_train, y_train and epochs, the number of full passes over the training data, which must be a positive integer.

model.fit(x_train, y_train, epochs=10)
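model.fit also returns a History object. If you want to track how accuracy evolves per epoch, and optionally hold out part of the training data for validation, a small variation (validation_split is a standard fit argument) looks like this:

history = model.fit(x_train, y_train, epochs=10, validation_split=0.1)
print(history.history['accuracy'])      # training accuracy per epoch
print(history.history['val_accuracy'])  # validation accuracy per epoch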

When the neural network has finished training, it can be evaluated.

_, val_acc = model.evaluate(x_test, y_test)
print(val_acc)
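model.evaluate also returns the loss, which the line above discards; keeping both values makes the output easier to read:

test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"test loss: {test_loss:.4f} | test accuracy: {test_acc:.4f}")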

Now a prediction will be made. As mentioned above, the softmax activation of our last layer returns a probability for every class, and the result is the class with the highest probability; in this case, looking at prediction number zero, the class with the highest probability is class 7.

predict = model.predict(x_test)
predict[0]
array([1.8378154e-15, 7.6408641e-10, 2.2498982e-10, 1.2333948e-07,
       1.0962036e-17, 8.6380547e-15, 1.3369861e-20, 9.9999988e-01,
       6.1377384e-16, 3.1727512e-13], dtype=float32)
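Since softmax produces a probability distribution, the ten values of each prediction sum to (approximately) 1, which we can verify:

print(predict[0].sum())  # ~1.0, up to floating-point rounding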

And to confirm:

import numpy as np
print(np.argmax(predict[0]))
7
plt.imshow(x_test[0], cmap=plt.cm.binary);
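If you want to predict a single image rather than the whole test set, note that predict expects a batch dimension; a minimal sketch:

single = np.expand_dims(x_test[0], axis=0)  # shape (1, 28, 28): a batch of one
pred = model.predict(single)
print(np.argmax(pred[0]))                   # 7 again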

You can access this example in the Jupyter Notebook on Google Colab at the following link.


Felipe Gonzalez

Python Back-End Developer, Machine Learning with Python, and AWS (Amazon Web Services)