Artificial intelligence, or data-driven modeling, is a hot topic these days. It seemed so magical to me that I decided to learn the basics and give it a try.

In this post, I will discuss the implementation of a fully connected artificial neural network (FCANN). The `ReLU` function is used for activation. For the loss function, either the sum-of-squares or the cross entropy with Softmax is used. The neural network (NN) is trained by the online backpropagation (BP) method. To demonstrate the FCANN, I first trained a NN on data obtained from the sine function. Then, another NN was trained to recognize handwritten digits from an available labeled data set.
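As a concrete reference for the pieces named above, here is a minimal NumPy sketch of the `ReLU` activation (with its derivative, which BP will need) and the sum-of-squares loss; the function names are my own illustrative choices, not part of the post's code:

```python
import numpy as np

def relu(z):
    # ReLU activation: elementwise max(0, z)
    return np.maximum(0.0, z)

def relu_grad(z):
    # Derivative of ReLU (1 where z > 0, else 0), used during backpropagation
    return (z > 0).astype(z.dtype)

def sum_of_squares(y, Y):
    # Sum-of-squares loss between prediction y and training target Y
    return 0.5 * np.sum((y - Y) ** 2)

print(relu(np.array([-1.0, 2.0])))   # [0. 2.]
print(sum_of_squares(np.array([1.0, 0.0]), np.array([0.0, 0.0])))  # 0.5
```

The factor of 1/2 in the loss is a common convention that cancels the 2 produced by differentiation.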

The concept of an FCANN is straightforward enough that it needs no in-depth explanation, but some notation must be defined for this particular post. The input is a vector **X**, the output of the NN is a vector **y**, and the training data are expressed by **Y**. For classification, **y** and **Y** are transformed by the Softmax function.
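To make the classification setup concrete, the following NumPy sketch shows the Softmax transform and a one-hot encoding of a class label into a target vector **Y** (both function names are illustrative assumptions, not from the original code):

```python
import numpy as np

def softmax(z):
    # Numerically stable Softmax: subtract max(z) before exponentiating
    e = np.exp(z - np.max(z))
    return e / np.sum(e)

def one_hot(label, n_classes):
    # Encode an integer class label as a target vector Y
    Y = np.zeros(n_classes)
    Y[label] = 1.0
    return Y

y = softmax(np.array([2.0, 1.0, 0.1]))
print(y)                # probabilities summing to 1, largest for the first entry
print(one_hot(1, 3))    # [0. 1. 0.]
```

Subtracting the maximum before exponentiating leaves the result unchanged mathematically but avoids overflow for large inputs.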

For the BP method, the most important step is obtaining the partial derivatives of the loss function with respect to the model parameters (\(w_{mn}^j\) and \(b_{n}^j\)). These derivatives are worked out in the following short article.
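As an illustration of what these partial derivatives look like in the simplest case, here is a NumPy sketch for a single dense layer with `ReLU` and the sum-of-squares loss, verified against a central finite difference; the layer shape and names are my own assumptions for the example:

```python
import numpy as np

def forward(x, W, b):
    # One dense layer with ReLU: a = relu(W x + b); also return z for the gradient
    z = W @ x + b
    return np.maximum(0.0, z), z

def grads(x, W, b, Y):
    # Partial derivatives of L = 0.5 * ||a - Y||^2 with respect to W and b.
    a, z = forward(x, W, b)
    delta = (a - Y) * (z > 0)          # dL/dz via the chain rule through ReLU
    return np.outer(delta, x), delta   # dL/dW and dL/db

# Check one analytic weight gradient against a central finite difference
rng = np.random.default_rng(0)
x, Y = rng.normal(size=3), rng.normal(size=2)
W, b = rng.normal(size=(2, 3)), rng.normal(size=2)
dW, db = grads(x, W, b, Y)

loss = lambda W_: 0.5 * np.sum((forward(x, W_, b)[0] - Y) ** 2)
eps = 1e-6
Wp, Wm = W.copy(), W.copy()
Wp[0, 1] += eps
Wm[0, 1] -= eps
numeric = (loss(Wp) - loss(Wm)) / (2 * eps)
print(np.isclose(numeric, dW[0, 1]))   # True when the analytic gradient matches
```

Such a finite-difference check is a cheap way to catch sign or indexing mistakes in hand-derived BP formulas before trusting them in training.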