Neural Networks and Deep Learning

Neural networks are computing systems loosely inspired by the biological neural networks of the brain. Deep learning uses neural networks with many layers to learn hierarchical representations from data, and it has produced large performance gains in computer vision, natural language processing, and speech recognition.

Artificial Neurons

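An artificial neuron computes a weighted sum of its inputs, adds a bias, and applies an activation function to the result. Common activation functions are sigmoid (output between 0 and 1), ReLU (max(0, x), the most common choice in hidden layers), tanh (output between -1 and 1), and softmax (turns a vector of scores into class probabilities for multi-class output). Neurons are organised in layers.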
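As an illustration of the computation described above, here is a minimal sketch of a single neuron in plain Python. The function names (`neuron`, `sigmoid`, `relu`, `softmax`) and the example weights are chosen for this sketch only and do not come from any particular library.

```python
import math

def sigmoid(z):
    """Squash z into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    """Pass positive values through, clamp negatives to zero."""
    return max(0.0, z)

def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    largest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def neuron(inputs, weights, bias, activation):
    """Weighted sum of inputs, plus bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# Illustrative values (assumed): z = 0.5*1.0 + (-0.25)*2.0 + 0.1 = 0.1
inputs = [1.0, 2.0]
weights = [0.5, -0.25]
bias = 0.1

relu_out = neuron(inputs, weights, bias, relu)          # 0.1
sigmoid_out = neuron(inputs, weights, bias, sigmoid)    # about 0.525
tanh_out = neuron(inputs, weights, bias, math.tanh)     # about 0.0997
class_probs = softmax([1.0, 2.0, 3.0])                  # largest score gets the largest probability
```

Unlike the other three functions, softmax operates on the outputs of a whole layer at once, which is why it takes a list rather than a single value.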

Network Architecture

A feedforward network has an input layer, one or more hidden layers, and an output layer. Deep networks have multiple hidden layers, and additional layers let the network learn progressively more abstract features. Width (neurons per layer) and depth (number of layers) together determine network capacity.
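The layered structure can be sketched as a forward pass in which each layer's output becomes the next layer's input. This is a simplified illustration with hand-picked weights, not a trained model. The helper names `dense`, `forward`, and `count_parameters` are assumptions of this sketch.

```python
def relu(z):
    return max(0.0, z)

def dense(inputs, weights, biases, activation=None):
    """One fully connected layer; weights[j] holds the weights of neuron j."""
    outputs = []
    for neuron_weights, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs

def forward(x, layers):
    """Feed x through each (weights, biases, activation) layer in order."""
    for weights, biases, activation in layers:
        x = dense(x, weights, biases, activation)
    return x

def count_parameters(layer_sizes):
    """Weights plus biases for a fully connected network of the given widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# 2 inputs -> 2 hidden neurons (ReLU) -> 1 linear output; weights are made up.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], relu),
    ([[1.0, 2.0]], [0.5], None),
]
output = forward([2.0, 1.0], layers)        # hidden = [1.0, 1.5], output = [4.5]
small_net = count_parameters([2, 2, 1])     # (2*2 + 2) + (2*1 + 1) = 9
wider_net = count_parameters([784, 128, 10])
```

The parameter count gives a rough sense of how width and depth affect capacity: widening a layer increases the weights on both sides of it, while adding a layer adds a whole new weight matrix.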

Training and Backpropagation

Training minimises a loss function, typically cross-entropy for classification and mean squared error (MSE) for regression. Backpropagation computes the gradient of the loss with respect to every weight using the chain rule. A gradient-based optimiser (SGD, or adaptive variants such as Adam and RMSprop) then updates the weights iteratively. Learning rate, batch size, and number of epochs are key hyperparameters.
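To make the training loop concrete, the following sketch fits a single linear neuron to a toy dataset with full-batch gradient descent on the MSE loss. It is deliberately minimal: the gradients are derived by hand with the chain rule rather than by a general backpropagation routine, and the data, learning rate, and epoch count are assumed values chosen so the example converges.

```python
def mse(w, b, xs, ys):
    """Mean squared error of the predictions w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train_linear_neuron(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w*x + b by full-batch gradient descent on the MSE loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = (w * x + b) - y
            # Chain rule: dL/dw = dL/dpred * dpred/dw = (2 * error / n) * x
            grad_w += 2.0 * error * x / n
            # Chain rule: dL/db = dL/dpred * dpred/db = (2 * error / n) * 1
            grad_b += 2.0 * error / n
        # Step against the gradient, scaled by the learning rate.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Toy data generated from y = 2x + 1 (an assumption of this example).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

initial_loss = mse(0.0, 0.0, xs, ys)
w, b = train_linear_neuron(xs, ys)
final_loss = mse(w, b, xs, ys)
```

Here the batch is the whole dataset. Stochastic gradient descent would instead estimate the gradient from a small random subset of examples at each step.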

Convolutional Neural Networks

CNNs specialise in image processing. Convolutional layers slide small learned filters over the input to detect local patterns, and pooling layers reduce the spatial dimensions of the resulting feature maps. Well-known architectures include LeNet, AlexNet, VGG, ResNet, and EfficientNet. CNNs power image classification, object detection, and medical imaging.
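The two core operations can be sketched in a few lines of plain Python. This is an illustrative, unoptimised version assuming a single-channel image, no padding, and a stride of 1. The edge-detecting kernel is hand-written here, whereas a real CNN learns its filters during training.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    Like most deep learning code, this computes cross-correlation:
    the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; leftover rows and columns are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + di][j * size + dj]
                for di in range(size) for dj in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# 5x5 image that is dark on the left and bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-written filter that responds where a pixel is brighter than its left neighbour.
kernel = [[-1, 1],
          [-1, 1]]

features = conv2d(image, kernel)   # 4x4 map, each row is [0, 2, 0, 0]
pooled = max_pool2d(features)      # 2x2 map: [[2, 0], [2, 0]]
```

The filter fires only at the column where the brightness changes, and pooling halves each spatial dimension while keeping the strongest response in each window.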

Recurrent Neural Networks

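RNNs process sequential data by carrying a hidden state from one time step to the next. LSTM and GRU cells use gating to mitigate the vanishing gradient problem that affects plain RNNs on long sequences. Transformers, which are built on the attention mechanism, have largely replaced RNNs for NLP tasks.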
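The recurrence and the vanishing gradient problem can both be seen in a scalar toy model. The sketch below assumes a plain tanh RNN with a single hidden unit, h_t = tanh(w_x * x_t + w_h * h_(t-1) + b). The function names and weight values are hypothetical and chosen only to make the effect visible.

```python
import math

def rnn_forward(xs, w_x, w_h, b, h0=0.0):
    """Run a scalar tanh RNN over a sequence and return every hidden state."""
    hidden_states = []
    h = h0
    for x in xs:
        # The new state depends on the current input and the previous state.
        h = math.tanh(w_x * x + w_h * h + b)
        hidden_states.append(h)
    return hidden_states

def grad_last_wrt_initial(hidden_states, w_h):
    """Derivative of the final state with respect to the initial state.

    By the chain rule this is the product over time steps of
    dh_t/dh_(t-1) = w_h * (1 - h_t**2).
    """
    grad = 1.0
    for h in hidden_states:
        grad *= w_h * (1.0 - h * h)
    return grad

# Assumed illustrative weights; with |w_h| < 1 every factor shrinks the gradient.
w_x, w_h, b = 0.5, 0.5, 0.0

short_states = rnn_forward([1.0] * 5, w_x, w_h, b)
long_states = rnn_forward([1.0] * 20, w_x, w_h, b)

short_grad = grad_last_wrt_initial(short_states, w_h)
long_grad = grad_last_wrt_initial(long_states, w_h)   # far smaller than short_grad
```

Because each factor is at most 0.5 in magnitude here, the gradient shrinks at least geometrically with sequence length, so early inputs contribute almost nothing to the weight updates. Gated cells such as LSTM and GRU add paths through which this signal decays more slowly.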

Applications

Deep learning is applied in computer vision, NLP, speech recognition, generative AI (GANs, diffusion models), autonomous driving, drug discovery, and game playing.

Summary

Neural networks and deep learning make it possible to learn complex patterns from data. CNNs handle images, while RNNs and Transformers handle sequences.
