Scalars and vectors are foundational mathematical concepts used in machine learning. Complex machine learning algorithms, such as neural networks, are built around scalars and vectors. Therefore, a solid understanding of these concepts is essential for navigating the many complex neural-network-based architectures you will encounter.
The Scalar
A scalar is a single real number representing magnitude or quantity, such as 5, 2, 7.6 or even -3.
Scalars can also be complex or boolean:
While we have defined a scalar as a real number, in some settings a scalar can also be a complex number or a boolean value. For the scope of this post, and for most work with standard machine learning packages, focusing on real numbers is sufficient.
The Vector
A vector is a sequence of real numbers (scalars) representing magnitude and direction, such as [5, 2], [7.6, -3], or [1, 0, 4].
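We represent a vector as a column of elements. When a vector has $n$ real-valued components, we say it exists in an $n$-dimensional space:
$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \in \mathbb{R}^n$$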
In machine learning, we use a vector to represent a single data point in a dataset. Each individual number inside the vector is called a feature.
In the representation above, $\mathbf{x}$ is a single data point, and a data point can have many features $x_1, x_2, \dots, x_n$. These features can be anything, depending on your data. Take a housing price prediction dataset as an example: in order to predict the price, we need specific features such as the area of the house, the number of bedrooms, and the age of the house.
Each feature occupies one position in the vector; for example, the area can be represented as $x_1$.
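As a minimal sketch of this idea, a house can be stored as a NumPy array with one entry per feature. The specific values (4500, 8, 55) and the ordering (area, bedrooms, age) are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical feature values, in the assumed order: area, bedrooms, age
house = np.array([4500, 8, 55])

# The number of features is the length of the vector
n_features = house.shape[0]

# x_1 in the math corresponds to index 0 in Python (zero-based indexing)
area = house[0]

print(house, n_features, area)
```

Note that the mathematical notation counts features from 1, while Python indexes from 0.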
Vector Operations
We have seen how to represent a data point as a vector. The next step is to understand how to perform mathematical operations on vectors. By combining these operations, we can build the mathematical models that power machine learning algorithms.
Scalar Multiplication
The simplest operation is multiplying a vector by a scalar. This is often used to "weight" our features. When you multiply a vector by a scalar, you multiply every individual feature inside that vector by that number.
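For example, if we want to double the values in our house vector $\mathbf{x}$, we multiply it by the scalar 2:
$$2\mathbf{x} = 2\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 2x_1 \\ 2x_2 \\ 2x_3 \end{bmatrix}$$
Let's see this in code: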
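A minimal NumPy sketch of scalar multiplication, assuming a hypothetical house vector of [4500, 8, 55] (area, bedrooms, age):

```python
import numpy as np

# Hypothetical house vector: area, bedrooms, age
house = np.array([4500, 8, 55])

# Multiplying by a scalar scales every feature by that number
doubled = 2 * house

print(doubled)  # elements: 9000, 16, 110
```

NumPy applies the scalar to each element, which matches the mathematical definition.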
Vector Addition
To add two vectors together, both vectors must have the same number of features (the same dimension). For example, if a vector represents a house with 3 features, you can only add it to another vector that also has 3 features.
We perform this addition element-wise. This means we simply add the numbers that are in the same position.
To represent this in code, we will use NumPy to perform the vector addition:
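A minimal sketch of element-wise addition with two 3-feature vectors. The variable names `house_a` and `house_b` are illustrative.

```python
import numpy as np

# Two house vectors with the same number of features
house_a = np.array([4500, 8, 55])
house_b = np.array([2300, 10, 20])

# The + operator adds NumPy arrays position by position
total = house_a + house_b

print(total)  # elements: 6800, 18, 75
```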
Notice how the result is a new vector: [4500+2300, 8+10, 55+20]
Transposition
Transposing a vector is simply the act of flipping its orientation. If you have a column vector (a vertical list), transposing it turns it into a row vector (a horizontal list).
In machine learning mathematics, transposition is important because many vector operations require specific dimensions. A feature vector is usually written as a column vector:
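The column vector:
$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$$
The transposed vector ($\mathbf{x}^\top$):
$$\mathbf{x}^\top = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}$$
Let's represent this in code: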
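A minimal sketch of transposition in NumPy, assuming the same hypothetical house values as before. A 1-D NumPy array has no row or column orientation, so the column vector is built explicitly as a 2-D array with shape (3, 1).

```python
import numpy as np

# Column vector: 3 rows, 1 column (values are hypothetical)
x_col = np.array([[4500], [8], [55]])

# .T flips the orientation, giving a row vector: 1 row, 3 columns
x_row = x_col.T

print(x_col.shape)  # (3, 1)
print(x_row.shape)  # (1, 3)
```

Calling `.T` on a 1-D array such as `np.array([4500, 8, 55])` returns the array unchanged, which is why the 2-D shape is used here.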
The Dot Product
To calculate the dot product, you multiply the corresponding elements of two vectors and then add all the results together. Because we add everything up at the end, the result of the dot product is always a scalar.
Consider a house price prediction problem where a data point is represented by features such as area, number of bedrooms, and age of the house. We can represent these features as a column vector:
$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$$
Our model also has a set of parameters, often called weights, which determine how much each feature contributes to the prediction:
$$\mathbf{w} = \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix}$$
To make a prediction, we need to multiply the feature values by their corresponding weights and add the results together. This operation is known as the dot product. For the multiplication to be mathematically valid, one vector must be a row vector and the other must be a column vector. This is where the transpose becomes important.
By transposing the feature vector, we obtain:
$$\mathbf{x}^\top = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}$$
We can now multiply the row vector by the weight vector:
$$\mathbf{x}^\top \mathbf{w} = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix}$$
The result is a single number:
$$\mathbf{x}^\top \mathbf{w} = x_1 w_1 + x_2 w_2 + x_3 w_3$$
Let's see this in code:
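A minimal sketch of the dot product in NumPy. The feature values and especially the weights below are hypothetical numbers chosen for illustration, not learned parameters.

```python
import numpy as np

# Hypothetical features: area, bedrooms, age
x = np.array([4500.0, 8.0, 55.0])

# Hypothetical weights, one per feature
w = np.array([100.0, 5000.0, -200.0])

# Dot product: multiply matching elements, then sum
# 4500*100 + 8*5000 + 55*(-200) = 479000
prediction = np.dot(x, w)

# The same computation in row-times-column form
x_col = x.reshape(-1, 1)   # shape (3, 1)
w_col = w.reshape(-1, 1)   # shape (3, 1)
prediction_matrix = x_col.T @ w_col  # shape (1, 1)

print(prediction)
print(prediction_matrix.shape)
```

With 1-D arrays, `np.dot` returns a scalar directly. With explicit column vectors, the transpose is needed and the result is a 1-by-1 array holding the same value.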
Summary
- Scalars are single numbers that represent magnitude, serving as the simplest building blocks of data.
- Vectors are lists of scalars that represent a single data point, where every individual number is a specific feature.
- Basic Operations like addition and scalar multiplication allow us to combine data points or scale their features element-wise.
- Transposition is the process of flipping a vector's orientation from a column to a row, which is essential for mathematical alignment.
- The Dot Product is one of the most widely used operations in machine learning; it is how a model multiplies features by weights and sums the results to calculate a prediction or output.
