Learn how computers mimic the human brain to recognize images, understand speech, and make predictions!
Your brain has roughly 86 billion neurons connected together. 🧠
When you see a cat, neurons fire in patterns: "Fur? Check. Whiskers? Check. Pointy ears? Check. IT'S A CAT!" 🐱
Neural Networks copy this idea! They're computer "brains" with artificial neurons that learn patterns.
| Feature | Machine Learning | Deep Learning |
|---|---|---|
| Learns from | Features YOU define | Raw data (learns features itself!) |
| Data needed | Less data OK | Needs LOTS of data |
| Best for | Structured data (tables) | Images, text, audio |
A neural network is a stack of layers of "neurons": each neuron multiplies its inputs by learned weights, adds a bias, and passes the result through an activation function (e.g. ReLU). The network is trained by adjusting these weights using the training data (forward pass → loss → backpropagation → gradient descent) so that the output layer's predictions get closer to the true labels.
A neuron is like a tiny decision maker:
```
INPUTS          WEIGHTS              SUM + ACTIVATION
------          -------              ----------------
x1 ------------ × w1 (0.7)  --+
                              |
x2 ------------ × w2 (0.3)  --+--->  Σ (sum) ---> f(sum) ---> OUTPUT
                              |
x3 ------------ × w3 (-0.2) --+
                              |
                           + bias
```
Example:

```
Inputs:  [1, 2, 3]
Weights: [0.7, 0.3, -0.2]
Bias:    0.1

Sum = (1 × 0.7) + (2 × 0.3) + (3 × -0.2) + 0.1
    = 0.7 + 0.6 - 0.6 + 0.1 = 0.8

If sum > 0: the neuron fires! ✅
```
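To make that arithmetic concrete, here's a tiny NumPy sketch of a single neuron; the `neuron` helper and the ReLU-style "fire only if positive" rule are illustrative choices, not a fixed API:

```python
import numpy as np

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum + bias, then a ReLU-style activation."""
    s = float(np.dot(inputs, weights)) + bias  # sum of (input × weight) plus bias
    return max(s, 0.0)                         # fires (passes the value through) only if positive

# Same numbers as the worked example above
print(neuron([1, 2, 3], [0.7, 0.3, -0.2], 0.1))  # ≈ 0.8, so the neuron fires
```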
The activation function decides whether the neuron should "fire" or not.
Think of it as a volume knob: it controls the output!
| Function | What It Does | When to Use |
|---|---|---|
| ReLU | If input < 0, output 0. Otherwise, pass through. | Most common! Use in hidden layers. |
| Sigmoid | Squishes output between 0 and 1 | Binary classification (yes/no) |
| Softmax | Outputs probabilities that sum to 1 | Multi-class (cat/dog/bird) |
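To see what each of those functions actually computes, here's a small NumPy sketch (the helpers are written out by hand for illustration; Keras ships its own implementations):

```python
import numpy as np

def relu(x):
    # Negative values become 0, positive values pass through unchanged
    return np.maximum(0, x)

def sigmoid(x):
    # Squishes any number into the range (0, 1)
    return 1 / (1 + np.exp(-x))

def softmax(x):
    # Turns a vector of scores into probabilities that sum to 1
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()

scores = np.array([-2.0, 0.5, 3.0])
print(relu(scores))     # [0.  0.5 3. ]
print(sigmoid(scores))  # three values between 0 and 1
print(softmax(scores))  # three probabilities that sum to 1
```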
```
INPUT LAYER         HIDDEN LAYERS         OUTPUT LAYER
-----------         -------------         ------------
    o ----------------> o    o ----------------> o
       \              /        \              /
    o ----------------> o    o ----------------> o
       /              \        /              \
    o ----------------> o    o ----------------> o

(Features)          (Learn patterns)      (Prediction)
```
Image Example:
- Input: 784 pixels (28×28 image)
- Hidden: 128 neurons, then 64 neurons
- Output: 10 neurons (digits 0-9)
**Input layer:** This is where your data enters the network.
Example: for a 28×28 pixel image, you have 784 input neurons (one per pixel).

**Hidden layers:** These are the magic layers that learn patterns!
More layers = "deeper" network = can learn more complex patterns!

**Output layer:** This gives you the final prediction!
Example: for digit recognition, 10 neurons output probabilities for 0-9 (see the Keras sketch below).
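Putting the three layers together, here's a sketch of what the digit-recognition architecture above could look like in Keras. The 784 → 128 → 64 → 10 sizes come from the image example; the ReLU/softmax activations are common choices rather than requirements, and `digit_model` is just an illustrative name:

```python
from tensorflow.keras import layers, models

# Sketch of the 784 -> 128 -> 64 -> 10 network described above
digit_model = models.Sequential([
    layers.Dense(128, activation='relu', input_shape=(784,)),  # hidden layer 1 (input: 784 pixels)
    layers.Dense(64, activation='relu'),                       # hidden layer 2
    layers.Dense(10, activation='softmax')                     # output: probabilities for digits 0-9
])
digit_model.summary()
```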
EPOCH 1:   Prediction: 2, Actual: 7 → Error: HIGH 😢
EPOCH 100: Prediction: 7, Actual: 7 → Error: LOW 🎉
Error (Loss) Over Time:

```
  ^
  | \
  |  \
  |   \___
  |       \____
  |            \________
  +---------------------------> Epochs
```
The goal: Get that error as LOW as possible!
| Term | Simple Explanation |
|---|---|
| Epoch | One complete pass through ALL training data |
| Batch Size | How many samples to process before updating weights |
| Learning Rate | How big the weight updates are (too high = overshoots, too low = slow) |
| Loss Function | Measures how wrong the predictions are |
| Backpropagation | The math that figures out how to adjust each weight |
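The only knob from this table that the example below doesn't set explicitly is the learning rate; in Keras it lives on the optimizer object, which you can pass to compile() instead of the string 'adam'. The 0.001 here is Adam's default, shown purely for illustration:

```python
from tensorflow.keras import optimizers

# Explicit learning rate: larger values update weights in bigger steps (may overshoot),
# smaller values are more stable but slower. 0.001 is Adam's default.
adam = optimizers.Adam(learning_rate=0.001)
```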
```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_moons
from sklearn.preprocessing import StandardScaler

# Set seeds for reproducibility
np.random.seed(42)
tf.random.set_seed(42)
```
# Create a "moon" shaped dataset (hard for simple models!) X, y = make_moons(n_samples=2000, noise=0.25, random_state=42) # Split into train and test X_train, X_test, y_train, y_test = train_test_split( X, y, test_size=0.2, random_state=42 ) # Scale the data (very important for neural networks!) scaler = StandardScaler() X_train = scaler.fit_transform(X_train) X_test = scaler.transform(X_test) print(f"Training samples: {len(X_train)}") print(f"Test samples: {len(X_test)}") # Training samples: 1600 # Test samples: 400
```python
# Create a Sequential model (layers stacked one after another)
model = models.Sequential([
    # Input layer + first hidden layer
    layers.Dense(64, activation='relu', input_shape=(2,)),  # 2 input features
    layers.Dropout(0.2),  # Prevents overfitting

    # Second hidden layer
    layers.Dense(32, activation='relu'),
    layers.Dropout(0.2),

    # Output layer (1 neuron for binary classification)
    layers.Dense(1, activation='sigmoid')  # Outputs 0-1 probability
])

# Show the model architecture
model.summary()
# Model: "sequential"
# _________________________________________________________________
#  Layer (type)                Output Shape              Param #
# =================================================================
#  dense (Dense)               (None, 64)                192
#  dropout (Dropout)           (None, 64)                0
#  dense_1 (Dense)             (None, 32)                2080
#  dropout_1 (Dropout)         (None, 32)                0
#  dense_2 (Dense)             (None, 1)                 33
# =================================================================
# Total params: 2,305
```
```python
# Compile: Tell the model how to learn
model.compile(
    optimizer='adam',              # The learning algorithm
    loss='binary_crossentropy',    # Error measurement
    metrics=['accuracy']           # What to track
)

# Train: Let the network learn!
history = model.fit(
    X_train, y_train,
    epochs=50,              # 50 passes through the data
    batch_size=32,          # 32 samples at a time
    validation_split=0.2,   # Use 20% for validation
    verbose=1
)
# Epoch 1/50  - accuracy: 0.55 - val_accuracy: 0.60
# Epoch 25/50 - accuracy: 0.89 - val_accuracy: 0.88
# Epoch 50/50 - accuracy: 0.93 - val_accuracy: 0.91
```
```python
# Test on unseen data
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print(f"\n✅ Test Accuracy: {test_accuracy:.2%}")
# ✅ Test Accuracy: 91.25%

# Make predictions
predictions = model.predict(X_test[:5])
print("Predictions:", predictions.flatten())
print("Actual:     ", y_test[:5])
# Predictions: [0.92 0.08 0.87 0.03 0.95]
# Actual:      [1 0 1 0 1]
```
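If you want to see the loss-over-time curve from earlier for your own run, the History object returned by model.fit() stores the per-epoch numbers. A minimal matplotlib sketch (assuming matplotlib is installed):

```python
import matplotlib.pyplot as plt

# 'history' is the object returned by model.fit() above
plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()
```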
When your model memorizes the training data instead of learning patterns!
Like a student who memorizes test answers but can't solve new problems.
| Technique | What It Does | How to Use |
|---|---|---|
| Dropout | Randomly "turns off" neurons during training | layers.Dropout(0.2) |
| Early Stopping | Stops training when validation loss stops improving | callbacks.EarlyStopping(patience=5) |
| L2 Regularization | Penalizes large weights | kernel_regularizer='l2' |
| More Data | More examples = harder to memorize | Data augmentation, collect more |
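Here's a sketch of how those techniques could be combined on the moon-dataset model from above; the layer sizes, the 0.01 L2 strength, and the patience of 5 are illustrative values, not tuned ones:

```python
from tensorflow.keras import layers, models, regularizers, callbacks

regularized_model = models.Sequential([
    layers.Dense(64, activation='relu', input_shape=(2,),
                 kernel_regularizer=regularizers.l2(0.01)),  # L2: penalize large weights
    layers.Dropout(0.2),                                     # Dropout: randomly silence 20% of neurons
    layers.Dense(32, activation='relu'),
    layers.Dense(1, activation='sigmoid')
])
regularized_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Early stopping: halt training once validation loss stops improving for 5 epochs
early_stop = callbacks.EarlyStopping(patience=5, restore_best_weights=True)
regularized_model.fit(X_train, y_train, validation_split=0.2,
                      epochs=200, batch_size=32, callbacks=[early_stop])
```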
In one sentence: why does a neural network with many layers need more data and careful regularization than a shallow one?
| Concept | Simple Explanation |
|---|---|
| Neural Network | Computer "brain" made of connected neurons that learns patterns |
| Neuron | Takes inputs, multiplies by weights, outputs if sum is big enough |
| Layer | Group of neurons that process data together |
| Deep Learning | Neural networks with many hidden layers |
| Training | Adjusting weights repeatedly to minimize prediction errors |
| Epoch | One complete pass through all training data |
| Overfitting | Model memorizes training data, fails on new data |
You've learned the foundations of Deep Learning!
Next steps: Try CNNs for images, RNNs for sequences!