# Deep Learning

## October 01, 2019

This is a learning note about deep learning.

## Definition

The term Deep Learning refers to training Neural Networks, sometimes very large ones.

## Logistic Regression

Logistic regression is an algorithm for binary classification problems.

### Given x, want

\begin{align*} \hat{y} = P(y = 1 | x) \end{align*}

### Sigmoid function

\begin{align*} \sigma(z) = \frac{1}{1 + {e^{-z}}} \end{align*}
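As a minimal sketch of the formula above, the sigmoid can be written in plain Python. The function name `sigmoid` and the branch on the sign of `z` (added only to avoid overflow for large negative inputs) are illustrative choices, not part of the original note.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid 1 / (1 + e^{-z}), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, e^{z} is small, so this equivalent form is safe.
    ez = math.exp(z)
    return ez / (1.0 + ez)


print(sigmoid(0.0))  # 0.5
```

The output always lies between 0 and 1, which is what allows it to be read as a probability.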

### Output

\begin{align*} \hat{y} = \sigma(w^T x + b) \end{align*}
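The prediction step can be sketched as follows: take the dot product of the weights `w` and the input `x`, add the bias `b`, and pass the result through the sigmoid. The helper name `predict` and the use of plain Python lists for vectors are assumptions made for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid 1 / (1 + e^{-z})."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict(w: list[float], b: float, x: list[float]) -> float:
    """Return y_hat = sigmoid(w . x + b), the estimated P(y = 1 | x)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)


# With all-zero parameters the model is maximally uncertain.
print(predict([0.0, 0.0], 0.0, [1.0, 2.0]))  # 0.5
```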

### Loss (error) function

\begin{align*} L(\hat{y}, y) = -(y\log \hat{y} + (1-y)\log (1-\hat{y})) \end{align*}
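A small sketch of this loss for a single example is shown below. The name `log_loss` and the clipping constant `eps` (which keeps the prediction away from exactly 0 or 1 so the logarithm stays finite) are assumptions added for illustration.

```python
import math


def log_loss(y_hat: float, y: int, eps: float = 1e-12) -> float:
    """Cross-entropy loss -(y log y_hat + (1 - y) log(1 - y_hat))."""
    # Clip the prediction so that log() never receives 0.
    p = min(max(y_hat, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


# A confident correct prediction costs little; a confident wrong one costs a lot.
print(log_loss(0.9, 1))
print(log_loss(0.1, 1))
```

When `y = 1` only the first term is active, so the loss is small when `y_hat` is close to 1; when `y = 0` only the second term is active, so the loss is small when `y_hat` is close to 0.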

### Cost function

\begin{align*}
J(w, b) &= \frac{1}{m} \sum_{i=1}^m L(\hat{y}^{(i)}, y^{(i)}) \\
&= -\frac{1}{m} \sum_{i=1}^m \left( y^{(i)}\log \hat{y}^{(i)} + (1-y^{(i)})\log (1-\hat{y}^{(i)}) \right)
\end{align*}
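The cost is simply the average of the per-example losses over the m training examples. A minimal sketch, assuming the predictions have already been computed and using the hypothetical names `cost` and `eps`:

```python
import math


def cost(y_hats: list[float], ys: list[int], eps: float = 1e-12) -> float:
    """Average cross-entropy loss over m examples."""
    m = len(ys)
    total = 0.0
    for y_hat, y in zip(y_hats, ys):
        # Clip so that log() never receives 0.
        p = min(max(y_hat, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / m


# Better predictions give a lower cost on the same labels.
print(cost([0.9, 0.1], [1, 0]))
print(cost([0.6, 0.4], [1, 0]))
```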

Modify the parameters w and b to reduce the cost J, using derivatives computed with a Computation Graph.
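The parameter update can be sketched as batch gradient descent. Working backward through the computation graph for the cross-entropy loss with a sigmoid output gives dz = ŷ − y, dw = x · dz and db = dz for each example. The learning rate `lr`, the step count, the function name `train`, and the toy dataset below are all assumptions chosen for illustration, not values from the original note.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid 1 / (1 + e^{-z})."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train(X, Y, lr=0.5, steps=2000):
    """Fit w and b by batch gradient descent on the average loss J."""
    m, n = len(X), len(X[0])
    w, b = [0.0] * n, 0.0
    for _ in range(steps):
        dw, db = [0.0] * n, 0.0
        for x, y in zip(X, Y):
            y_hat = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
            dz = y_hat - y              # dL/dz for this example
            for j in range(n):
                dw[j] += x[j] * dz / m  # accumulate dJ/dw_j
            db += dz / m                # accumulate dJ/db
        # Step against the gradient to reduce J.
        w = [wj - lr * dwj for wj, dwj in zip(w, dw)]
        b -= lr * db
    return w, b


# Hypothetical toy data: one feature, label is 1 for the larger inputs.
X = [[0.0], [1.0], [2.0], [3.0]]
Y = [0, 0, 1, 1]
w, b = train(X, Y)
print(w, b)
```

Each pass computes the predictions, accumulates the averaged gradients, and then moves w and b a small step in the direction that lowers J.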

## Neural Network

### Neural Network Representation

Input layer, hidden layer(s), output layer

### Activation Function (binary classification)

If the output is either 0 or 1, use the sigmoid activation function for the output layer; if the output is either -1 or 1, use the tanh activation function.

For all other cases, use ReLU, the rectified linear unit activation function.
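For comparison, the three activation functions mentioned above can be sketched in plain Python. The function names are illustrative, and `tanh` simply wraps the standard library's `math.tanh`.

```python
import math


def sigmoid(z: float) -> float:
    """Maps z into (0, 1); suits an output that is 0 or 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def tanh(z: float) -> float:
    """Maps z into (-1, 1); suits an output that is -1 or 1."""
    return math.tanh(z)


def relu(z: float) -> float:
    """Rectified linear unit: passes positive z through, zeroes the rest."""
    return max(0.0, z)


for z in (-2.0, 0.0, 2.0):
    print(z, sigmoid(z), tanh(z), relu(z))
```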

### Gradient descent for Neural Networks

• Forward propagation
