Describe a NumPy Array in Python
NumPy is a Python library for numerical computing. It provides multidimensional arrays and many mathematical functions to efficiently perform operations on them. In this article, we will perform a descriptive analysis of a NumPy array to understand its key statistics.
Initializing a NumPy Array
Initializing a NumPy array means creating a new array with some starting values, here by passing a Python list to the np.array() function.
import numpy as np
arr = np.array([4, 5, 8, 5, 6, 4, 9, 2, 4, 3, 6])
print(arr)
Output
[4 5 8 5 6 4 9 2 4 3 6]
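Before computing statistics, it can help to check the array's basic structure. The following is a minimal sketch that reuses the array above and reads the standard ndarray attributes shape, ndim, size and dtype. The exact integer dtype depends on the platform, so no specific value is assumed for it.

```python
import numpy as np

arr = np.array([4, 5, 8, 5, 6, 4, 9, 2, 4, 3, 6])

print("Shape:", arr.shape)      # length along each dimension, as a tuple
print("Dimensions:", arr.ndim)  # number of axes
print("Size:", arr.size)        # total number of elements
print("Dtype:", arr.dtype)      # element type (an integer type, platform dependent)
```

For this one-dimensional array of 11 values, shape is (11,), ndim is 1 and size is 11.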
To analyze a NumPy array effectively, we focus on two key types of statistics:
- Central Tendency
- Dispersion
Measures of Central Tendency
Measures of central tendency summarize a dataset by identifying a typical or central value, such as the mean or median, that represents the overall trend of the data.
1. mean(): takes a NumPy array as an argument and returns the arithmetic mean of the data.
np.mean(arr)
2. median(): takes a NumPy array as an argument and returns the median of the data.
np.median(arr)
Example: The following code illustrates the mean() and median() functions.
import numpy as np
arr = np.array([12, 5, 7, 2, 61, 1, 1, 5])
mean = np.mean(arr)
median = np.median(arr)
print("Mean:", mean)
print("Median:", median)
Output
Mean: 11.75
Median: 5.0
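To make clear what these two functions compute, here is a small sketch that reproduces both values by hand on the same array. The names manual_mean, manual_median and sorted_arr are illustrative only and are not part of NumPy.

```python
import numpy as np

arr = np.array([12, 5, 7, 2, 61, 1, 1, 5])

# Mean: sum of the values divided by the number of values.
manual_mean = arr.sum() / arr.size

# Median: the middle of the sorted values. With an even count,
# it is the average of the two middle values.
sorted_arr = np.sort(arr)
n = sorted_arr.size
if n % 2 == 0:
    manual_median = (sorted_arr[n // 2 - 1] + sorted_arr[n // 2]) / 2
else:
    manual_median = float(sorted_arr[n // 2])

print("Manual mean:", manual_mean)      # 11.75, same as np.mean(arr)
print("Manual median:", manual_median)  # 5.0, same as np.median(arr)
```

The single large value 61 pulls the mean up to 11.75, while the median stays at 5.0. This is why the median is often preferred when the data contains outliers.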
Measures of Dispersion
Measures of dispersion describe how spread out or varied the values in a dataset are, showing whether the data points are close to the average or widely scattered.
1. amin(): takes a NumPy array as an argument and returns the minimum value.
np.amin(arr)
2. amax(): takes a NumPy array as an argument and returns the maximum value.
np.amax(arr)
3. ptp(): takes a NumPy array as an argument and returns the range of the data (maximum minus minimum, "peak to peak").
np.ptp(arr)
4. var(): takes a NumPy array as an argument and returns the variance of the data. By default this is the population variance, which divides by the number of elements N.
np.var(arr)
5. std(): takes a NumPy array as an argument and returns the standard deviation of the data, which is the square root of the variance.
np.std(arr)
Example: The following code illustrates the amin(), amax(), ptp(), var() and std() functions.
import numpy as np
arr = np.array([12, 5, 7, 2, 61, 1, 1, 5])
min_val = np.amin(arr)
max_val = np.amax(arr)
rng = np.ptp(arr)
var = np.var(arr)
std = np.std(arr)
print("Min:", min_val)
print("Max:", max_val)
print("Range:", rng)
print("Variance:", var)
print("Std Dev:", std)
Output
Min: 1
Max: 61
Range: 60
Variance: 358.1875
Std Dev: 18.925842121290138
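The dispersion measures above are related to one another, and the variance printed here is the population variance. The sketch below checks those relationships on the same array and shows the ddof parameter of np.var() and np.std(), which switches the divisor from N to N - ddof. The variable names pop_var, sample_var and sample_std are illustrative.

```python
import numpy as np

arr = np.array([12, 5, 7, 2, 61, 1, 1, 5])

# The range is the distance between the largest and smallest value.
print(np.ptp(arr) == np.amax(arr) - np.amin(arr))     # True

# The standard deviation is the square root of the variance.
print(np.isclose(np.std(arr), np.sqrt(np.var(arr))))  # True

# Default (ddof=0): divide by N, giving the population variance.
pop_var = np.var(arr)

# ddof=1: divide by N - 1, giving the sample variance and sample std.
sample_var = np.var(arr, ddof=1)
sample_std = np.std(arr, ddof=1)

print("Population variance:", pop_var)
print("Sample variance:", sample_var)
print("Sample std dev:", sample_std)
```

If the array is a sample drawn from a larger population, ddof=1 is usually the more appropriate choice. With ddof=0 the array is treated as the whole population.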