2026-06-23 | Machine Learning
# Basic Structure of Neural Networks

In the current wave of artificial intelligence, deep learning is the brightest star, and at the core of deep learning is the **neural network**. Loosely modeled on how neurons in the human brain work, it gives machines the ability to learn and perceive through layered connections and computations.

For beginners, understanding the basic structure of neural networks is the first key to unlocking deep learning.

This article starts from scratch and breaks down, step by step, what a neural network is made of, using analogies and clear code to explain how it works.

* * *

## What is a Neural Network? A Vivid Analogy

Imagine teaching a child who has never seen a cat or a dog to tell them apart. What would you do?

1. You might first show them lots of pictures of cats and dogs.
2. You would point out features: a cat's ears are usually pointy and its face is relatively round; a dog's ears may droop and its face is longer.
3. The child's brain (the neural network) receives these pictures (the input data) and your guidance (the labels).
4. The neurons in their brain start working, trying to find the key patterns (such as ear shape and face shape) that separate cats from dogs.
5. After repeated corrections, a judgment model forms in their brain. The next time they see a new animal picture, they can say with confidence whether it is a cat or a dog.

**A neural network is a simplified mathematical model of this child's brain.** It is a system of many interconnected **artificial neurons** that can automatically learn features and patterns from input data and use them for prediction or decision-making.

* * *

## Basic Building Block of Neural Networks: The Neuron

The neuron is the most basic computational unit of a neural network.
It simulates what a biological neuron does: receive signals, process them, and transmit the result.

### Working Process of a Neuron

A typical artificial neuron mainly does three things: it computes a weighted sum of its inputs, applies an activation function to that sum, and passes the result on.

```python
# Pseudocode-style logic for a neuron's computation, to help you understand the process

def activation_function(z):
    """Placeholder so the example runs (ReLU is an assumption here); activation functions are covered below."""
    return max(0.0, z)

def artificial_neuron(inputs, weights, bias):
    """
    Simulate the computation process of an artificial neuron.

    Parameters:
        inputs: List of input signals, e.g., [x1, x2, x3]
        weights: List of weights corresponding to each input, e.g., [w1, w2, w3]
        bias: Bias term, a constant

    Returns:
        output: Output of the neuron
    """
    # 1. Weighted sum: multiply each input by its corresponding weight, then add the bias
    weighted_sum = 0
    for i in range(len(inputs)):
        weighted_sum += inputs[i] * weights[i]
    weighted_sum += bias

    # 2. Activation: pass the sum through a nonlinear function that decides
    #    whether the neuron "activates" and what signal it outputs
    output = activation_function(weighted_sum)
    return output
```

The table below breaks this down component by component:

**What each component of a neuron does:**

| Component | Analogy | Mathematical Expression | Function |
| --- | --- | --- | --- |
| **Input $x$** | Signals from other neurons | $x_1, x_2, \dots, x_n$ | Receives external information or the output of neurons from the previous layer. |
| **Weight $w$** | Importance of the signal | $w_1, w_2, \dots, w_n$ | Determines how strongly each input influences the neuron's output. **Learning is the process of continually adjusting these weights (and biases).** |
| **Bias $b$** | Neuron's activation threshold | $b$ | A constant that adjusts how easily the neuron is activated. It can be understood as shifting the weighted sum up or down as a whole. |
| **Weighted Sum $z$** | Total signal strength | $z = (x_1 w_1 + x_2 w_2 + \dots + x_n w_n) + b$ | Combines all input signals into one value. |
| **Activation Function $f$** | Switch and processor | $a = f(z)$ | Introduces **non-linearity**. Without it, stacked layers collapse into a single linear transformation, so the network cannot learn complex patterns. |

### Common Activation Functions

Activation functions give neural networks their non-linear capabilities. Here are three of the most commonly used ones:

1. **Sigmoid**

 * **Formula**: $f(z) = \frac{1}{1 + e^{-z}}$
 * **Characteristics**: Squashes any input into the range (0, 1). Often used in the output layer of binary classification problems. Prone to the vanishing gradient problem.
 * **Graph**: A smooth S-shaped curve.

2. **ReLU (Rectified Linear Unit)**

 * **Formula**: $f(z) = \max(0, z)$
 * **Characteristics**: Cheap to compute, mitigates the vanishing gradient problem, and is currently the most commonly used activation function for hidden layers.
 * **Graph**: A piecewise-linear curve with a kink at the origin: it outputs 0 for negative inputs and passes positive inputs through unchanged.

3. **Softmax**

 * **Formula**: $f(z_i) = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}$
 * **Characteristics**: Converts the outputs of multiple neurons into a probability distribution (all outputs sum to 1). **Typically used in the output layer of multi-class classification problems.**

* * *

## Layered Structure of Neural Networks

A single neuron has limited capabilities, just as a single brain cell cannot think.
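
To make the single neuron concrete before stacking many of them, here is a minimal sketch that evaluates $a = f(z)$ for one neuron with each of the activation functions described above. It uses only the standard library; the helper names (`neuron`, `softmax`) and the sample inputs, weights, and bias are illustrative assumptions, not values from the article.

```python
import math

def sigmoid(z):
    # Squashes z into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    # Outputs 0 for negative z, passes positive z through unchanged
    return max(0.0, z)

def softmax(zs):
    # Subtracting the maximum leaves the result unchanged
    # but keeps exp() from overflowing on large inputs
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def neuron(inputs, weights, bias, activation):
    # Weighted sum z = x1*w1 + ... + xn*wn + b, then a = f(z)
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)

# Assumed sample values, chosen so that z = 0.5 - 2.0 + 0.75 + 0.75 = 0.0
x = [1.0, 2.0, 3.0]
w = [0.5, -1.0, 0.25]
b = 0.75

out_sigmoid = neuron(x, w, b, sigmoid)  # sigmoid(0) = 0.5
out_relu = neuron(x, w, b, relu)        # relu(0) = 0.0
probs = softmax([1.0, 2.0, 3.0])        # three probabilities that sum to 1
```

Sigmoid and ReLU act on one neuron's weighted sum, while softmax takes the weighted sums of a whole group of neurons at once, which is why it takes a list here.
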
When we organize a large number of neurons into layers, we form a powerful neural network. A typical neural network contains the following three types of layers:

### 1. Input Layer

* **Role**: The senses of the network, responsible for receiving raw data.
* **Characteristics**: The number of neurons in this layer usually equals the number of features in the input data. For example, a 28x28-pixel grayscale image flattens to 784 features, corresponding to 784 input neurons. **The input layer does no computation**; it only passes data along.

### 2. Hidden Layer

**Role**: The brain of the network, responsible for complex feature extraction and transformation.

**Characteristics**:

* Sits between the input and output layers. There can be one or more hidden layers (this is where the "deep" in deep learning comes from).
* Each neuron receives the outputs of all neurons in the previous layer as input, computes its own output, and passes it to the next layer (this is called fully connected).
* Neurons in the hidden layer use activation functions such as ReLU to introduce non-linearity.

### 3. Output Layer

**Role**: The decision-maker of the network, outputting the final prediction.

**Characteristics**: The number of neurons is determined by the task.

* **Binary classification**: 1 neuron (using Sigmoid) or 2 neurons (using Softmax).
* **Multi-class classification (K classes)**: K neurons (using Softmax).
* **Regression** (predicting a continuous value): 1 neuron (usually no activation function).

Demo diagram: Network Structure (Understanding Neural Networks)

A neural network consists of multiple layers of neurons.
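
One way to make "fully connected" concrete is to count parameters: between two adjacent layers there is one weight per connection, plus one bias per neuron in the receiving layer. The sketch below assumes a hypothetical 784-128-10 layout. Only the 784 comes from the image example above; the hidden size of 128, the 10 output classes, and the helper name `count_parameters` are illustrative.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of neurons in each layer, input layer first.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = n_in * n_out  # one weight per connection between the two layers
        biases = n_out          # one bias per neuron in the receiving layer
        total += weights + biases
    return total

# 784 inputs (28x28 image); 128 hidden neurons and 10 outputs are assumed sizes
print(count_parameters([784, 128, 10]))  # 101770
```

The input layer contributes no parameters of its own, which matches the point that it does no computation.
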
Data enters from the input layer on the left, undergoes feature extraction in the hidden layers, and finally the output layer on the right provides the predicted answer.

Legend: input layer, hidden layer, output layer, weight connections, activation signals.

* * *

## Practice: Building a Neural Network with Python

No amount of theory beats hands-on practice. Below, we use the `NumPy` library to build the simplest possible three-layer neural network (one hidden layer) from scratch and run a forward propagation calculation.

```python
import numpy as np

# Define the activation functions
def sigmoid(x):
    """Sigmoid activation function."""
    return 1 / (1 + np.exp(-x))

def relu(x):
    """ReLU activation function."""
    return np.maximum(0, x)

# Initialize a simple neural network
def initialize_network(input_size, hidden_size, output_size):
    """
    Initialize network weights and biases.

    Parameters:
        input_size: Number of neurons in the input layer
        hidden_size: Number of neurons in the hidden layer
        output_size: Number of neurons in the output layer

    Returns:
        network: Contains parameters for each layer
```