From the course: Deep Learning: Getting Started

Setup and initialization

- [Instructor] How does a deep learning model get trained? We will explore this process in detail in this chapter. We will start with setup and initialization for the training process in this video.

Before we start training the neural network, the input data needs to be prepared. This includes applying a number of processing techniques to convert samples into numeric vectors. These vectors are then optionally transposed to create the input vectors. The target variables may also undergo similar transformations.

To help with training, the input data is usually split into training, validation, and test sets. The training data set is run through the neural network to fit parameters like weights and biases. Once a model is created, the validation data set is used to check its accuracy and error rates. The results from this validation are then used to refine the model and recheck it. When a final model is obtained, it is used to predict on the test set to measure the final model performance. The usual split of input data between the training, validation, and test sets is 80 to 10 to 10.

In order to create the initial model, a set of values needs to be selected for various parameters and hyperparameters. This includes the number of layers and the number of nodes in each layer. We also need to select the activation function for each layer. Then, there are hyperparameters like the number of epochs, the batch size, and the error function that need to be selected. How do we make the initial selection? It may be based on our own intuition and experience. It can also be based on references and best practices, and on the suitability of techniques for the specific problem. Whatever values are selected, they are then refined as the model is trained. If the final results of the model are not acceptable, then we will go back, adjust the parameters, and retrain the model.

Finally, we also need to initialize the weights and biases for each of the nodes in the neural network.
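
Before going further into initialization, here is a minimal sketch of the 80/10/10 split described above, written in Python with NumPy. The function name `split_dataset`, its parameters, and the sample data are hypothetical and chosen only for illustration. It assumes the samples have already been converted into numeric arrays, and it is one simple way to do the split, not the course's own code.

```python
import numpy as np


def split_dataset(features, targets, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle the samples and split them into training, validation, and test sets."""
    n_samples = features.shape[0]
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)  # shuffled row indices

    n_train = int(train_frac * n_samples)
    n_val = int(val_frac * n_samples)

    train_idx = order[:n_train]
    val_idx = order[n_train:n_train + n_val]
    test_idx = order[n_train + n_val:]  # whatever remains goes to the test set

    return (
        (features[train_idx], targets[train_idx]),
        (features[val_idx], targets[val_idx]),
        (features[test_idx], targets[test_idx]),
    )


# Hypothetical data: 1,000 samples with 4 numeric features each.
features = np.arange(4000, dtype=float).reshape(1000, 4)
targets = np.arange(1000) % 3

train, val, test = split_dataset(features, targets)
print(len(train[0]), len(val[0]), len(test[0]))  # 800 100 100
```

Shuffling before slicing keeps each set a random sample of the data. The fixed seed is only there so that the split is repeatable between runs.
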
We will start with some values, and then the neural network will learn the right values for these based on the error rates obtained during the training process. Multiple techniques for initialization are available. In zero initialization, we initialize all values to zero. The preferred technique, though, is random initialization. In random initialization, we initialize the weights and biases to random values obtained from a standard normal distribution, whose mean is zero and standard deviation is one. Once we are done with setup and initialization, we are ready to do some training.
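
As a rough sketch of the two initialization techniques just described, the following Python snippet builds the weights and biases for each layer using NumPy. The function name `initialize_parameters`, the dictionary layout, and the layer sizes are assumptions made for illustration and are not taken from the course.

```python
import numpy as np


def initialize_parameters(layer_sizes, method="random", seed=0):
    """Create initial weights and biases for each pair of adjacent layers."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        if method == "zero":
            # Zero initialization: every weight and bias starts at 0.
            weights = np.zeros((n_in, n_out))
            biases = np.zeros(n_out)
        elif method == "random":
            # Random initialization: draw from a standard normal
            # distribution (mean 0, standard deviation 1).
            weights = rng.standard_normal((n_in, n_out))
            biases = rng.standard_normal(n_out)
        else:
            raise ValueError(f"unknown initialization method: {method}")
        params.append({"weights": weights, "biases": biases})
    return params


# Hypothetical network: 4 inputs, two hidden layers of 8 nodes, 3 outputs.
layer_sizes = [4, 8, 8, 3]
params = initialize_parameters(layer_sizes, method="random")

for number, layer in enumerate(params, start=1):
    print(number, layer["weights"].shape, layer["biases"].shape)
```

Calling the same function with `method="zero"` produces arrays of the same shapes filled with zeros, which makes it easy to compare the two techniques side by side.
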
