Introduction to Neural Networks
Explore the fundamentals of neural networks, Keras, and TensorFlow.
Topic 1: What is a Neural Network?
An Artificial Neural Network (ANN) is a Machine Learning model inspired by the structure and function of the human brain. It uses interconnected nodes (or "neurons") in a layered structure to find complex patterns in large datasets.
While a Decision Tree asks explicit "yes/no" questions, a Neural Network learns "fuzzy" patterns by passing data through a series of interconnected layers.

Real-World Analogy: The Human Brain
Your brain is made of billions of neurons. When you see a cat, a specific set of neurons fire. Some recognize the "furry texture," others "sharp ears," and others "whiskers." Your brain combines these signals to conclude "Cat."
A Neural Network does the same. Each "neuron" in the network is a small mathematical function that learns to recognize a tiny part of a pattern. By stacking them in layers, the network can combine simple patterns (like pixels) into complex ideas (like "Cat").
Key Parts of a Neural Network
- Neuron (or Node): A small unit that receives inputs, computes a weighted sum of them (plus a bias), and passes the result through an "activation function" (see the sketch after this list).
- Input Layer: The "senses" of the network. It has one neuron for each feature in your data (e.g., for Iris, 4 neurons: sepal length, sepal width, petal length, petal width).
- Hidden Layer(s): The "brain" of the network. This is where the magic happens. "Deep" learning simply means there is more than one hidden layer.
- Output Layer: The final decision. For classification, it has one neuron for each class (e.g., for Iris, 3 neurons: Setosa, Versicolor, Virginica).
- Weights: These are the "knowledge" of the network. Each connection between neurons has a weight, which is just a number. The "learning" process is all about tuning these weights.
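To make the neuron concrete, here is a minimal sketch of the calculation a single neuron performs. The input values, weights, and bias below are made-up numbers, purely for illustration:

# A single neuron: a weighted sum of the inputs plus a bias,
# passed through an activation function (ReLU here)
inputs = [5.1, 3.5, 1.4, 0.2]    # e.g., one Iris flower's four measurements
weights = [0.4, -0.2, 0.1, 0.7]  # the "knowledge" tuned during training
bias = 0.5

# Weighted sum: multiply each input by its weight, then add the bias
z = sum(x * w for x, w in zip(inputs, weights)) + bias

# ReLU activation: pass positive values through, clip negatives to zero
output = max(0.0, z)
print(output)  # this neuron's signal to the next layer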
Topic 2: How Does a Network "Learn"?
A network "learns" by adjusting its internal weights to minimize its mistakes. This process is a loop with four main steps.
Analogy: Target Practice
Imagine you're blindfolded and trying to hit a bullseye (the correct answer). You take a shot (a guess), and a friend tells you "You missed 5 inches to the left." You use that feedback (the "error") to adjust your aim and shoot again. You repeat this until you hit the bullseye.
1. Forward Propagation (The Guess): You feed data (e.g., a flower's measurements) into the input layer. It flows through the hidden layers, where the weights are applied, and the output layer makes a prediction (e.g., "70% Virginica, 20% Versicolor, 10% Setosa").
2. Loss Function (The Error): The network compares its prediction to the *actual* label (e.g., "100% Virginica"). It uses a **Loss Function** (like "Cross-Entropy") to calculate a single number, the "loss" or "error," which represents how wrong the guess was.
3. Backpropagation (The Blame): This is the most brilliant part. The network takes the error at the end and works *backwards* through the layers to see which weights were most "responsible" for the mistake.
4. Optimization (The Adjustment): The network uses an **Optimizer** (like "Adam" or "SGD," Stochastic Gradient Descent) to slightly adjust the "blameworthy" weights in the correct direction to reduce the error.
The network repeats this loop thousands of times (over "epochs") with all the training data, getting slightly better with every single adjustment.
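You can see the whole loop in miniature with a single weight and plain gradient descent. This toy sketch is not how Keras works internally, and it uses a squared-error loss instead of cross-entropy to keep the math simple; it exists only to show the four steps in code:

# Toy problem: learn a weight w so that prediction = w * x matches target y
x, y = 2.0, 10.0       # one made-up training example
w = 0.0                # initial weight (the network's starting "knowledge")
learning_rate = 0.1

for step in range(20):
    prediction = w * x                   # 1. Forward propagation (the guess)
    loss = (prediction - y) ** 2         # 2. Loss function (the error)
    gradient = 2 * (prediction - y) * x  # 3. Backpropagation (assign blame)
    w -= learning_rate * gradient        # 4. Optimization (adjust the weight)

print(w)  # converges toward 5.0, since 5.0 * 2.0 == 10.0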
Topic 3: Building a Model with Keras
We don't have to build this from scratch. We use high-level libraries like Keras (which runs on top of TensorFlow) to build networks easily.
The most common way is with the Sequential API, which lets you build a model layer-by-layer like stacking bricks.
Key Layer Types:
- Dense: A "fully-connected" layer. Every neuron in this layer is connected to every neuron in the previous layer. This is the workhorse of most basic NNs.
- Flatten: Used to unroll multi-dimensional data (like an image) into a 1D list so a Dense layer can read it.
- Dropout: A regularization technique to prevent overfitting. It randomly "drops" a fraction of neurons during training, forcing the network to be more robust.
Example: A Simple Keras Model
Here is how you build a simple network for the Iris dataset (4 features, 3 classes):
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
# 1. Initialize the model
model = Sequential()
# 2. Add the Input & First Hidden Layer
# We need to specify the input shape for the *first layer only*.
# units=10: 10 neurons in this layer
# activation='relu': The "Rectified Linear Unit" is the most common and effective activation.
# input_shape=[4]: Our Iris data has 4 features
model.add(Dense(units=10, activation='relu', input_shape=[4]))
# 3. Add a second hidden layer (optional)
model.add(Dense(units=10, activation='relu'))
# 4. Add the Output Layer
# units=3: 3 neurons, one for each class (Setosa, Versicolor, Virginica)
# activation='softmax': This special activation converts the output into probabilities that sum to 1.
model.add(Dense(units=3, activation='softmax'))
# You can now see your model's architecture
model.summary()
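The Iris model only needs Dense layers, but here is a hedged sketch of how the Flatten and Dropout layers from the list above might appear in a model for 28x28 grayscale images (the image size and layer sizes are assumptions for illustration, not part of the Iris example):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten

image_model = Sequential()
# Flatten: unroll each 28x28 image into a 1D list of 784 values
image_model.add(Flatten(input_shape=(28, 28)))
image_model.add(Dense(units=128, activation='relu'))
# Dropout: randomly silence 20% of these neurons during training
image_model.add(Dropout(0.2))
# 10 output classes (e.g., digits 0-9) as probabilities summing to 1
image_model.add(Dense(units=10, activation='softmax'))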
Topic 4: Compiling and Training
After building the model architecture, you must compile it. This is where you tell Keras which tools to use for the learning process (Steps 2, 3, and 4 from Topic 2).
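The code below assumes the Iris data has already been split into X_train, y_train, X_test, and y_test. If you are following along from scratch, one common way to produce that split is with scikit-learn (this loading step is an assumption, not part of the lesson's model code):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load the 4 features and the integer labels (0, 1, 2) for the 3 species
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)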
# 1. Compile the model
model.compile(
# 'adam' is an efficient and popular optimizer. It's a great default.
optimizer='adam',
# This loss function is perfect for multi-class classification
# when your labels are simple integers (0, 1, 2...).
loss='sparse_categorical_crossentropy',
# We want to track 'accuracy' as the model trains.
metrics=['accuracy']
)
# 2. Train the model
# We "fit" the model to the training data.
# epochs=50: The model will see the *entire* training dataset 50 times.
# validation_data: The model will check its performance on held-out data
# after each epoch. (We reuse the test set here for simplicity; a separate
# validation split is better practice.)
history = model.fit(
X_train, y_train,
epochs=50,
validation_data=(X_test, y_test)
)
# 3. Evaluate the final model
loss, accuracy = model.evaluate(X_test, y_test)
print(f"Final Test Accuracy: {accuracy*100:.2f}%")