NumPy.diff() in Python: Calculating Array Differences

Python’s NumPy library offers a ‘diff’ function that calculates the differences between consecutive elements of an array. In this article, we will discuss this function in detail, along with its parameters and its implementation. We will also look at a plain-Python alternative. So let’s begin.

Introduction to NumPy.diff()

NumPy’s diff function in Python calculates differences between consecutive elements in an array.

Say you have a list of numbers and you’re interested in the difference between each number and the one before it. For instance, with [5, 10, 15], you’d like to know there’s a 5-unit increase between 5 and 10, and another 5-unit increase between 10 and 15. NumPy’s diff function handles this in a single call and returns [5, 5], a result that is one element shorter than the input.
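As a quick sketch of the idea, the snippet below applies np.diff to the list from the paragraph above. The variable names values and steps are only illustrative.

```python
import numpy as np

values = [5, 10, 15]      # a plain Python list; np.diff converts it to an array
steps = np.diff(values)   # steps[i] = values[i + 1] - values[i]
print(steps)              # [5 5]
```

The result has two elements because three numbers have only two neighbouring pairs.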

Syntax and Parameters

The syntax of this function is:

numpy.diff(a, n=1, axis=-1)

Now let’s learn what each parameter means. (The function also accepts optional prepend and append arguments, which this article does not cover.)

  • a: The input array whose consecutive elements are differenced. It can be either 1-D or multi-dimensional.
  • n: The number of times the difference is applied. By default, it’s 1, meaning differences are calculated once. With n=2, for example, the function takes the differences of the differences. Each pass shortens the result by one element along the chosen axis.
  • axis: The axis along which differences are computed. By default, it’s -1, indicating the last axis. If your array has multiple dimensions, you can set it to compute differences along a different axis.
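To make the effect of n and axis concrete, here is a minimal sketch that looks only at the shape of the result. The 3x4 array built with np.arange is an assumption chosen for illustration and does not appear elsewhere in this article.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)          # 3 rows, 4 columns

shape_last = np.diff(a).shape            # default axis=-1: one fewer column
shape_rows = np.diff(a, axis=0).shape    # axis=0: one fewer row
shape_twice = np.diff(a, n=2).shape      # n=2: two fewer columns

print(shape_last, shape_rows, shape_twice)   # (3, 3) (2, 4) (3, 2)
```

In each case the result shrinks by n along the chosen axis, and the other axis keeps its length.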

Examples of NumPy.diff()

Now that we know about the function and its parameters, let’s try to implement the concept we learned and calculate differences in data. Here are some examples:

Example 1

First, let’s see a simple implementation on 1-D array data.

import numpy as np

arr = np.array([0, 5, 10, 15, 20])
differences = np.diff(arr)
print("Original array:", arr)
print("1-Dimensional Differences array:", differences)

This code finds the differences between consecutive elements in the array using NumPy’s diff function. It then prints both the original array and the resulting differences.

Output:

Original array: [ 0  5 10 15 20]
1-Dimensional Differences array: [5 5 5 5]

Example 2

Now let’s see if we can use the same function with multidimensional data.

import numpy as np

arr = np.array([[1, 2, 3],
                [4, 4, 6],
                [7, 8, 10]])
diff_r = np.diff(arr, axis=0)
diff_c = np.diff(arr, axis=1)
print("Original 2D array:")
print(arr)
print("\nDifferences along rows (axis=0):")
print(diff_r)
print("\nDifferences along columns (axis=1):")
print(diff_c)

When axis=0, it means we are calculating differences along the vertical axis, which implies differences between consecutive rows. For example, the difference between the first row [1, 2, 3] and the second row [4, 4, 6] would be [4-1, 4-2, 6-3], which is [3, 2, 3]. Likewise, the difference between the second row and the third row would be [7-4, 8-4, 10-6], resulting in [3, 4, 4].

And when axis=1, differences are calculated horizontally, meaning between consecutive columns. For example, the difference between the first column [1, 4, 7] and the second column [2, 4, 8] would be [2-1, 4-4, 8-7], resulting in [1, 0, 1]. Similarly, the difference between the second column and the third column would be [3-2, 6-4, 10-8], resulting in [1, 2, 2].

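Output:

Original 2D array:
[[ 1  2  3]
 [ 4  4  6]
 [ 7  8 10]]

Differences along rows (axis=0):
[[3 2 3]
 [3 4 4]]

Differences along columns (axis=1):
[[1 1]
 [0 2]
 [1 2]]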
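One way to check your understanding of the axis parameter is to reproduce both results with plain slicing. The sketch below is only an illustration of the idea, not how NumPy implements diff internally, and the names rows_manual and cols_manual are made up for this example.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 4, 6],
                [7, 8, 10]])

# axis=0: each row minus the row above it
rows_manual = arr[1:, :] - arr[:-1, :]
# axis=1: each column minus the column to its left
cols_manual = arr[:, 1:] - arr[:, :-1]

print(np.array_equal(rows_manual, np.diff(arr, axis=0)))  # True
print(np.array_equal(cols_manual, np.diff(arr, axis=1)))  # True
```

Both comparisons agree with np.diff, which shows that the axis simply selects the direction in which neighbouring elements are paired.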

Example 3

We can also calculate higher-order differences, where the function is applied repeatedly to its own result. In this example, let’s calculate the first-, second-, and third-order differences. We will achieve this using the n parameter.

import numpy as np

arr = np.array([5, 7, 13, 18, 21])
diff_first_order = np.diff(arr)
diff_second_order = np.diff(arr, n=2)
diff_third_order = np.diff(arr, n=3)

print("Original array:", arr)
print("First-order differences array:", diff_first_order)
print("Second-order differences array:", diff_second_order)
print("Third-order differences array:", diff_third_order)

We’re finding the differences between neighbouring numbers in the array arr, and diff_first_order stores these differences: [2, 6, 5, 3]. Then, diff_second_order finds the differences between those differences, giving [4, -1, -2], and diff_third_order repeats the step once more, giving [-5, -1]. Each order shortens the result by one element. Higher-order differences show how the rate of change itself varies across the array.

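Output:

Original array: [ 5  7 13 18 21]
First-order differences array: [2 6 5 3]
Second-order differences array: [ 4 -1 -2]
Third-order differences array: [-5 -1]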
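The recursive meaning of n can be verified by chaining single calls by hand. This is a small sketch using the same array as Example 3. The names once, twice, and thrice are illustrative.

```python
import numpy as np

arr = np.array([5, 7, 13, 18, 21])

once = np.diff(arr)       # first-order differences
twice = np.diff(once)     # differences of the first-order differences
thrice = np.diff(twice)   # one more pass

print(np.array_equal(twice, np.diff(arr, n=2)))   # True
print(np.array_equal(thrice, np.diff(arr, n=3)))  # True
```

Passing n=2 or n=3 therefore gives the same result as calling the function two or three times in a row.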

Simple Alternatives to NumPy.diff()

Now, if you don’t wish to use the NumPy diff function to calculate the differences, there is an easy alternative. You can calculate them in plain Python with a list comprehension. Observe the given code closely:

arr = [2, 5, 7, 13, 20]
differences = [arr[i+1] - arr[i] for i in range(len(arr)-1)]
print("Original array:", arr)
print("Differences array:", differences)

Here we find the differences between consecutive elements in the list arr by subtracting each element from the next one. This generates a new list named differences, which stores the calculated values [3, 2, 6, 7]. Finally, we print both the original list and the differences list.
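If you prefer to avoid index arithmetic, the same result can be sketched with zip, which pairs each element with its successor. This is just one possible variant. The loop variable names cur and nxt are arbitrary.

```python
arr = [2, 5, 7, 13, 20]

# zip(arr, arr[1:]) yields (2, 5), (5, 7), (7, 13), (13, 20)
differences = [nxt - cur for cur, nxt in zip(arr, arr[1:])]
print(differences)  # [3, 2, 6, 7]
```

Like the index-based version, this works on any Python list and needs no imports, but it handles only one dimension and a single pass.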

Conclusion

So that’s it for this article. I hope you now have a good understanding of the topics discussed. We covered the ‘diff’ function in detail, including its parameters and implementations. The examples were kept simple, and I hope you found them helpful. Additionally, we discussed an alternative to the ‘diff’ function. Now it’s your turn to try them out yourself.

Reference

https://numpy.org/doc/stable/reference/generated/numpy.diff.html

Snigdha Keshariya