Updated on May 26, 2019

Top Deep Learning Interview Questions You Must Know

Deep Learning is one of the hottest topics of 2018-19, and for good reason. The industry has advanced to the point where machines and computer programs are actually replacing humans. Artificial Intelligence is expected to create 2.3 million jobs by 2020, and to help you crack those job interviews, I have put together a set of Deep Learning interview questions.


Differentiate between AI, Machine Learning and Deep Learning.

Artificial Intelligence is a technique which enables machines to mimic human behavior.

Machine Learning is a subset of AI that uses statistical methods to enable machines to improve with experience.

Deep Learning is a subset of ML that makes the computation of multi-layer neural networks feasible. It uses neural networks to simulate human-like decision making.


Do you think Deep Learning is Better than Machine Learning? If so, why?

Traditional ML algorithms solve many problems, but they struggle with high-dimensional data, that is, data with a large number of inputs and outputs. In handwriting recognition, for example, the input is large and varies with every style of handwriting.

The second major challenge is telling the computer which features to look for, that is, which features matter most for predicting the outcome accurately. Deep learning helps with both challenges: deep networks cope with high-dimensional inputs and learn the relevant features from the data themselves.


What is a Perceptron, and how does it work?

A biological neuron has dendrites, which receive inputs. These inputs are summed in the cell body, and the result is passed along the axon to the next neuron:

Dendrite: Receives signals from other neurons
Cell Body: Sums all the inputs
Axon: Transmits signals to other cells

Similarly, a perceptron receives multiple inputs, applies various transformations and functions and provides an output. A Perceptron is a linear model used for binary classification. It models a neuron which has a set of inputs, each of which is given a specific weight. The neuron computes some function on these weighted inputs and gives the output.
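As a rough illustration of the description above, here is a minimal sketch of a perceptron's output computation. It assumes a step function applied to the weighted sum; the name perceptron_output and the AND-gate weights and threshold are made up for this example and are not from the original.

```python
def perceptron_output(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= threshold else 0

# Hand-picked (illustrative) weights and threshold that behave like a logical AND.
and_weights = [1.0, 1.0]
and_threshold = 1.5

# Evaluate the perceptron on all four binary input pairs.
and_outputs = [perceptron_output([a, b], and_weights, and_threshold)
               for a in (0, 1) for b in (0, 1)]
print(and_outputs)
```

Only the input pair (1, 1) pushes the weighted sum past the threshold, which is what makes this a binary classifier for AND.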


What is the role of weights and bias?

A perceptron can have one more input, called the bias. While the weights determine the slope of the classifier line, the bias lets us shift the line left or right. The bias is normally treated as another weighted input whose input value x0 is fixed at 1.
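The sketch below illustrates both points with a one-input perceptron. It assumes the common convention that x0 is the constant 1, and the function names (fire, fire_augmented, boundary) and numbers are hypothetical, chosen only for this example.

```python
def fire(weights, inputs, bias):
    """Step perceptron with an explicit bias term."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) + bias >= 0 else 0

def fire_augmented(weights_with_bias, inputs):
    """Same perceptron, with the bias stored as weight w0 on a constant input x0 = 1."""
    augmented_inputs = [1.0] + list(inputs)
    total = sum(w * x for w, x in zip(weights_with_bias, augmented_inputs))
    return 1 if total >= 0 else 0

def boundary(weight, bias):
    """For a single input x, the output flips where weight * x + bias = 0."""
    return -bias / weight

xs = (0.0, 1.0, 2.0, 3.0)
explicit = [fire([2.0], [x], -3.0) for x in xs]
augmented = [fire_augmented([-3.0, 2.0], [x]) for x in xs]
print(explicit, augmented)

# Changing only the bias moves the decision boundary along the x axis.
print(boundary(2.0, -3.0), boundary(2.0, -1.0))
```

Both formulations give the same outputs, and raising the bias from -3 to -1 shifts the boundary to the left without changing the weight.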


What are the activation functions?

An activation function translates a neuron's inputs into its output. It takes the weighted sum of the inputs plus the bias and decides whether the neuron should be activated or not. Its purpose is to introduce non-linearity into the output of a neuron.

There are many activation functions, such as:

Linear or Identity
Unit or Binary Step
Sigmoid or Logistic
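The three functions listed above can be sketched in a few lines. This is only an illustration: the choice that the binary step outputs 1 at exactly z = 0 is an assumption, since conventions differ.

```python
import math

def identity(z):
    """Linear / identity: passes the weighted sum through unchanged."""
    return z

def binary_step(z):
    """Unit / binary step: 1 when z is non-negative, else 0 (assumed convention at z = 0)."""
    return 1.0 if z >= 0 else 0.0

def sigmoid(z):
    """Sigmoid / logistic: squashes any real z into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

for z in (-2.0, 0.0, 2.0):
    print(z, identity(z), binary_step(z), round(sigmoid(z), 3))
```

Here z stands for the weighted sum plus bias that the neuron has already computed.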


Explain Learning of a Perceptron.

1. Initialize the weights and threshold.
2. Provide the input and calculate the output.
3. Update the weights.
4. Repeat steps 2 and 3.

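The weights are updated with the perceptron learning rule, Wj(t+1) = Wj(t) + η (d - y) x, where:

Wj(t+1) – Updated weight
Wj(t) – Old weight
η – Learning rate
d – Desired output
y – Actual output
x – Input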
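The learning steps can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes the standard perceptron update with a learning rate (called lr here), folds the threshold into a bias term, and trains on a made-up OR-gate dataset.

```python
def predict(weights, bias, x):
    """Step activation applied to the weighted sum plus bias."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias >= 0 else 0

def train_perceptron(samples, lr=0.1, epochs=20):
    """Apply W_j(t+1) = W_j(t) + lr * (d - y) * x_j after every sample."""
    weights = [0.0] * len(samples[0][0])   # initialize the weights
    bias = 0.0                             # bias stands in for the (negated) threshold
    for _ in range(epochs):                # repeat the two steps below
        for x, d in samples:
            y = predict(weights, bias, x)  # provide the input, calculate the output
            error = d - y                  # desired output minus actual output
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error             # the bias sees a constant input of 1
    return weights, bias

# Illustrative training data: the logical OR of two binary inputs.
or_gate = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train_perceptron(or_gate)
learned = [predict(weights, bias, x) for x, _ in or_gate]
print(learned)
```

When the prediction is already correct, d - y is zero and the weights stay unchanged, so updates happen only on mistakes.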


What is the significance of a Cost/Loss function?

A cost function measures how far the neural network's output is from the expected output for a given training sample. It summarizes the performance of the neural network as a whole. In deep learning, the goal is to minimize the cost function, and for that we use gradient descent.
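As a small example, mean squared error is one commonly used cost function. The original does not name a specific cost, so this choice, along with the sample targets and predictions, is an assumption made for illustration.

```python
def mean_squared_error(targets, predictions):
    """Average of the squared differences between expected and predicted outputs."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)

targets = [1.0, 0.0, 1.0]
close_predictions = [0.9, 0.1, 0.8]   # illustrative output of a well-trained network
poor_predictions = [0.2, 0.7, 0.4]    # illustrative output of a poorly trained network

close_cost = mean_squared_error(targets, close_predictions)
poor_cost = mean_squared_error(targets, poor_predictions)
print(close_cost, poor_cost)
```

The better the predictions match the targets, the lower the cost, which is why training tries to minimize it.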


What is gradient descent?

Gradient descent is an optimization algorithm used to minimize some function by iteratively moving in the direction of steepest descent as defined by the negative of the gradient.

Stochastic Gradient Descent: Uses only a single training example to calculate the gradient and update parameters.

Batch Gradient Descent: Calculates the gradient over the whole dataset and performs just one update per iteration.

Mini-batch Gradient Descent: A variation of stochastic gradient descent in which a mini-batch of samples is used instead of a single training example. It is one of the most popular optimization algorithms.
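The three variants differ only in how many samples feed each update, which the sketch below makes explicit through a batch_size parameter. It is an illustration under stated assumptions: the model (a line y = w * x + b), the mean squared error cost, the noise-free data, and all names and hyperparameters are invented for this example.

```python
import random

def gradient_descent(data, batch_size, lr=0.05, epochs=500, seed=0):
    """Fit y = w * x + b by minimizing mean squared error.

    batch_size = 1          -> stochastic gradient descent
    batch_size = len(data)  -> batch gradient descent
    anything in between     -> mini-batch gradient descent
    """
    rng = random.Random(seed)
    data = list(data)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Gradient of the mean squared error over this batch
            grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            # Step in the direction of the negative gradient
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

# Illustrative, noise-free samples of the line y = 2x + 1
points = [(x / 10, 2 * (x / 10) + 1) for x in range(-10, 11)]

for name, size in [("stochastic", 1), ("mini-batch", 4), ("batch", len(points))]:
    w, b = gradient_descent(points, batch_size=size)
    print(name, round(w, 3), round(b, 3))
```

On this toy problem all three settings recover a slope near 2 and an intercept near 1; they differ in how many updates they perform per pass over the data.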


What are the benefits of mini-batch gradient descent?

  • It is more efficient than stochastic gradient descent.
  • It tends to generalize better by finding flat minima.
  • Mini-batches approximate the gradient of the entire training set, which helps to avoid local minima.

What are the steps for using a gradient descent algorithm?

  • Initialize the weights and bias randomly.
  • Pass an input through the network and get values from the output layer.
  • Calculate the error between the actual value and the predicted value.
  • Go to each neuron that contributes to the error and adjust its weights to reduce the error.
  • Repeat until you find the best weights for the network.

Implement gradient descent in Python.

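import theano.tensor as T

# weights_hidden, weights_output, bias_hidden and bias_output are assumed to be
# Theano shared variables, and cost a symbolic scalar, all defined elsewhere.
params = [weights_hidden, weights_output, bias_hidden, bias_output]

def sgd(cost, params, lr=0.05):
    # Symbolic gradient of the cost with respect to each parameter
    grads = T.grad(cost=cost, wrt=params)
    updates = []
    for p, g in zip(params, grads):
        # Move each parameter against its gradient, scaled by the learning rate
        updates.append([p, p - g * lr])
    return updates

updates = sgd(cost, params)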


What are the shortcomings of a single layer perceptron?

Well, there are two major problems:

Single-layer perceptrons cannot classify non-linearly separable data points.
Complex problems that involve a lot of parameters cannot be solved by single-layer perceptrons.
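The first limitation can be checked with a small experiment. The sketch below assumes the standard perceptron update rule and uses hypothetical helper names; it trains the same single-layer perceptron on AND (linearly separable) and on XOR (not linearly separable).

```python
def predict(weights, bias, x):
    """Step activation applied to the weighted sum plus bias."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias >= 0 else 0

def train(samples, lr=1.0, epochs=100):
    """Standard perceptron learning: adjust weights only on misclassified samples."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, d in samples:
            error = d - predict(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

def accuracy(samples):
    """Train on the samples, then report the fraction classified correctly."""
    weights, bias = train(samples)
    return sum(predict(weights, bias, x) == d for x, d in samples) / len(samples)

and_gate = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
xor_gate = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

print("AND accuracy:", accuracy(and_gate))
print("XOR accuracy:", accuracy(xor_gate))
```

No single straight line separates the XOR classes, so the perceptron cannot get all four XOR points right however long it trains, while AND is learned without errors.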


What is a Multi-Layer Perceptron?

A multilayer perceptron (MLP) is a deep artificial neural network composed of more than one perceptron. It consists of an input layer that receives the signal, an output layer that makes a decision or prediction about the input, and, between those two, an arbitrary number of hidden layers that are the true computational engine of the MLP.


What are the different parts of a multi-layer perceptron?

Input Nodes: The Input nodes provide information from the outside world to the network and are together referred to as the “Input Layer”. No computation is performed in any of the Input nodes – they just pass on the information to the hidden nodes.

Hidden Nodes: The Hidden nodes perform computations and transfer information from the input nodes to the output nodes. A collection of hidden nodes forms a “Hidden Layer”. While a network will only have a single input layer and a single output layer, it can have zero or multiple Hidden Layers.

Output Nodes: The Output nodes are collectively referred to as the “Output Layer” and are responsible for computations and transferring information from the network to the outside world.
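To make the three kinds of nodes concrete, here is a minimal sketch of a forward pass through a network with two input nodes, two hidden nodes, and one output node. The weights are hand-picked rather than learned, and the step activation and all names are assumptions for this example; with these values the network computes XOR.

```python
def step(z):
    """Step activation: 1 when z is non-negative, else 0."""
    return 1 if z >= 0 else 0

def layer(inputs, weights, biases):
    """One fully connected layer: each node takes a weighted sum plus bias, then a step."""
    return [step(sum(w * x for w, x in zip(node_weights, inputs)) + b)
            for node_weights, b in zip(weights, biases)]

def mlp_xor(x):
    # Input nodes: pass the values on unchanged, with no computation.
    # Hidden nodes: the first behaves like OR, the second like AND (hand-picked weights).
    hidden = layer(x, weights=[[1, 1], [1, 1]], biases=[-0.5, -1.5])
    # Output node: fires when the OR node is on and the AND node is off.
    return layer(hidden, weights=[[1, -1]], biases=[-0.5])[0]

xor_outputs = [mlp_xor([a, b]) for a in (0, 1) for b in (0, 1)]
print(xor_outputs)
```

The hidden layer is what lets the network represent a function that a single perceptron cannot.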


What Is Data Normalization And Why Do We Need It?

Data normalization is a very important preprocessing step that rescales values to fit a specific range, which helps convergence during backpropagation. In general, it boils down to subtracting each feature's mean from every data point and dividing by that feature's standard deviation.
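A minimal sketch of this kind of normalization (often called standardization) is shown below. It assumes each row is one data point and each column one feature, uses the population standard deviation, and does not handle constant features, which would cause a division by zero. The function name and sample values are illustrative.

```python
import statistics

def standardize(rows):
    """Rescale each feature (column) to zero mean and unit standard deviation."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) for col in columns]   # population standard deviation
    return [[(value - m) / s for value, m, s in zip(row, means, stds)]
            for row in rows]

# Two features on very different scales (illustrative values)
rows = [[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]]
scaled = standardize(rows)
for row in scaled:
    print([round(v, 3) for v in row])
```

After rescaling, both features span the same range, so neither dominates the weight updates simply because of its units.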

These were some basic Deep Learning Interview Questions. Now, let’s move on to some advanced ones.
