Neural Networks and Deep Learning
Neural networks are computing systems inspired by biological neural networks. Deep learning uses neural networks with many layers to learn hierarchical representations from data, and has achieved breakthrough performance in computer vision, natural language processing, and speech recognition.
Artificial Neurons
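An artificial neuron computes a weighted sum of its inputs, adds a bias, and passes the result through an activation function. Common activation functions are sigmoid (output between 0 and 1), tanh (output between -1 and 1), ReLU (max(0, x), a common default for hidden layers), and softmax (turns a vector of scores into a probability distribution for multi-class output). Neurons are organised in layers.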
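The neuron computation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a library API: the function names (neuron, sigmoid, relu, softmax) and the example weights are assumptions chosen for this sketch.

```python
import math


def sigmoid(z):
    # Squashes any real number into the open interval (0, 1).
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    # Passes positive values through and clips negatives to zero.
    return max(0.0, z)


def softmax(scores):
    # Subtracting the maximum first keeps exp() from overflowing.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def neuron(inputs, weights, bias, activation):
    # Weighted sum of inputs plus bias, followed by the activation.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Hypothetical example values: two inputs, two weights, one bias.
output = neuron([1.0, 2.0], [0.5, -0.25], 0.1, sigmoid)
probabilities = softmax([1.0, 2.0, 3.0])
```

math.tanh from the standard library can be passed as the activation in the same way. Note that softmax operates on the scores of a whole layer rather than on a single neuron's output.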
Network Architecture
A feedforward network has an input layer, one or more hidden layers, and an output layer. Deep networks have multiple hidden layers, and additional layers can learn progressively more abstract features. Width (neurons per layer) and depth (number of layers) together determine network capacity.
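As a rough sketch of how layers compose, the following plain-Python code runs a forward pass through a small feedforward network and counts its trainable parameters. The helper names (dense, forward, count_parameters) and all weights are hypothetical values picked for illustration.

```python
def relu(z):
    return max(0.0, z)


def identity(z):
    return z


def dense(inputs, weights, biases, activation):
    # weights holds one row per output neuron; each row has one
    # entry per input value.
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(x, layers):
    # Feed the output of each layer into the next one.
    for weights, biases, activation in layers:
        x = dense(x, weights, biases, activation)
    return x


def count_parameters(layer_sizes):
    # Each fully connected layer has n_in * n_out weights and n_out biases.
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical network: 2 inputs -> 2 hidden neurons (ReLU) -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], relu),
    ([[1.0, 2.0]], [0.5], identity),
]
result = forward([2.0, 1.0], layers)
print(result)                              # [4.5]
print(count_parameters([2, 2, 1]))         # 9
print(count_parameters([784, 128, 10]))    # 101770
```

The parameter count gives a concrete handle on capacity: widening a layer or adding another one both increase the number of weights the network can fit.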
Training and Backpropagation
Training minimises a loss function, typically cross-entropy for classification and mean squared error (MSE) for regression. Backpropagation computes the gradient of the loss with respect to every weight using the chain rule. A gradient-based optimiser (SGD, or adaptive variants such as Adam and RMSprop) then updates the weights iteratively. Learning rate, batch size, and number of epochs are key hyperparameters.
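The training loop can be illustrated on the smallest possible model: a single linear neuron fitted with an MSE loss. The sketch below uses full-batch gradient descent (not SGD or Adam) and writes out the chain-rule gradients by hand. The function names, the toy data, and the hyperparameter values are assumptions made for this example.

```python
def mse(xs, ys, w, b):
    # Mean squared error of the predictions w * x + b.
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train_linear(xs, ys, learning_rate=0.05, epochs=1000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = (w * x + b) - y
            # Chain rule: d(error^2)/dw = 2 * error * x,
            #             d(error^2)/db = 2 * error.
            grad_w += 2.0 * error * x / n
            grad_b += 2.0 * error / n
        # Gradient descent step: move against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical example).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = train_linear(xs, ys)
final_loss = mse(xs, ys, w, b)
```

After training, w and b should be close to 2 and 1. In a multi-layer network, backpropagation applies the same chain rule layer by layer, from the output back to the input.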
Convolutional Neural Networks
CNNs specialise in image processing. Convolutional layers slide small learned filters across the input to detect local patterns. Pooling layers reduce the spatial dimensions of the resulting feature maps. Well-known architectures include LeNet, AlexNet, VGG, ResNet, and EfficientNet. CNNs power image classification, object detection, and medical imaging.
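To make the two layer types concrete, here is a minimal plain-Python sketch of a 2D convolution (computed as cross-correlation with no padding and stride 1, as is conventional in deep learning) and of non-overlapping max pooling. The function names, the toy image, and the hand-picked edge filter are illustrative assumptions. In a real CNN the filter values are learned.

```python
def conv2d(image, kernel):
    # Slide the kernel over every position where it fits entirely
    # inside the image and take the elementwise product sum.
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    # Keep the maximum of each non-overlapping size x size window.
    rows = range(0, len(feature_map) - size + 1, size)
    cols = range(0, len(feature_map[0]) - size + 1, size)
    return [
        [
            max(
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            )
            for j in cols
        ]
        for i in rows
    ]


# Toy 4x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge_filter = [[-1, 1]]  # responds where brightness increases left to right
edges = conv2d(image, edge_filter)
print(edges)              # [[0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0]]
print(max_pool2d(image))  # [[0, 1], [0, 1]]
```

The filter responds only at the column where the edge sits, which is the sense in which a convolutional layer detects a local pattern. Pooling halves each spatial dimension of the 4x4 input.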
Recurrent Neural Networks
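RNNs process sequential data by carrying a hidden state from one time step to the next. LSTM and GRU architectures use gating to mitigate the vanishing gradient problem that affects plain RNNs on long sequences. Transformers, built on the attention mechanism, have largely replaced RNNs for NLP tasks.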
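The recurrence and the vanishing gradient problem can both be seen in a scalar toy model. The sketch below assumes a plain (Elman-style) RNN with a single hidden unit and a tanh activation. The names rnn_forward and gradient_through_time and all parameter values are hypothetical.

```python
import math


def rnn_forward(inputs, w_x, w_h, b, h0=0.0):
    # The same weights are reused at every time step; the hidden
    # state h carries information forward through the sequence.
    h = h0
    states = []
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states


def gradient_through_time(states, w_h):
    # Derivative of the last hidden state with respect to the initial
    # one: a product of per-step factors w_h * (1 - h_t^2), where
    # (1 - h_t^2) is the derivative of tanh at step t.
    factor = 1.0
    for h in states:
        factor *= w_h * (1.0 - h * h)
    return factor


# With all-zero inputs every state is 0, so the gradient is w_h ** steps.
short_states = rnn_forward([0.0, 0.0, 0.0], w_x=1.0, w_h=0.5, b=0.0)
print(gradient_through_time(short_states, 0.5))  # 0.125

# Over 20 steps the same product shrinks towards zero.
long_states = rnn_forward([1.0] * 20, w_x=1.0, w_h=0.5, b=0.0)
long_gradient = gradient_through_time(long_states, 0.5)
```

Because each per-step factor here has magnitude at most 0.5, the gradient shrinks geometrically with sequence length, so early inputs barely influence the weight updates. Gated architectures such as LSTM and GRU are designed to reduce this effect.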
Applications
Applications of deep learning include computer vision, NLP, speech recognition, generative AI (GANs, diffusion models), autonomous driving, drug discovery, and game playing.
Summary
Neural networks and deep learning make it possible to learn complex patterns from data. CNNs handle images, while RNNs and Transformers handle sequences.