A violin plot visualizes the distribution and probability density of data across different categories. A box plot summarizes a distribution using its median and quartiles; a violin plot instead draws a kernel density estimate mirrored on both sides of a central axis, so the full shape of the distribution is visible. This reveals spread, skewness, and multimodality that a box plot can hide. Violin plots are particularly useful for comparing multiple datasets side by side, because they combine the compactness of a box plot with the detail of a density plot. In Matplotlib, violin plots are created with the plt.violinplot() function, which has options for the width of each violin (widths), the bandwidth of the Gaussian kernel density estimate (bw_method), and the summary lines to draw (showmeans, showmedians, showextrema). Colors are changed through the artists that the function returns.
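To make the difference from a box plot concrete, here is a minimal sketch that draws both for the same sample. The two-cluster sample, the seed, and the variable names are assumptions chosen only for illustration; they are not part of the example that follows.

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumption: a fixed seed and a two-cluster (bimodal) sample, used only for illustration.
rng = np.random.default_rng(0)
bimodal = np.concatenate([rng.normal(-3, 1, 300), rng.normal(3, 1, 300)])

# Two axes side by side that share the same y-axis scale.
fig, (ax_box, ax_violin) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

# The box plot reduces the sample to a median, quartiles, and whiskers.
box_parts = ax_box.boxplot(bimodal)
ax_box.set_title("Box plot")

# The violin plot draws a mirrored kernel density estimate of the same sample.
violin_parts = ax_violin.violinplot(bimodal, showmedians=True)
ax_violin.set_title("Violin plot")
```

Calling plt.show() afterwards displays the figure. The violin should show two bulges, one per cluster, while the box plot gives no hint that the sample has two modes.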
Example - Creating the Violin Plot
In this example we will create a simple violin plot with one violin for each dataset. Two libraries are required: NumPy to generate the data, and matplotlib.pyplot, imported as plt, to create the violin plot.
import numpy as np
import matplotlib.pyplot as plt
To generate the data we will use np.random.normal(). A list comprehension creates a list of 8 NumPy arrays, each containing 200 samples drawn from a normal distribution with mean 0. The standard deviation increases from 2 for the first array to 9 for the last, so each successive array is spread more widely.
data = [np.random.normal(0, i, 200) for i in range(2,10)]
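As a quick sanity check on what this line produces, the sketch below prints the size and sample standard deviation of each array. In np.random.normal(0, i, 200) the second argument is the standard deviation, not an upper bound. The seeded generator from np.random.default_rng() is an assumption added here so the numbers are reproducible; the original line uses the unseeded global generator.

```python
import numpy as np

# Assumption: a seeded generator, so repeated runs give the same samples.
rng = np.random.default_rng(42)

# Same structure as the tutorial's list: mean 0, standard deviation i, 200 samples.
data = [rng.normal(0, i, 200) for i in range(2, 10)]

# Compare the requested standard deviation with the one measured on the sample.
for sigma, arr in zip(range(2, 10), data):
    print(f"requested std={sigma}, samples={arr.size}, sample std={arr.std(ddof=1):.2f}")
```

With only 200 samples per array, the measured values should land close to the requested ones but will not match them exactly.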
The next step is to create an empty figure 12 inches wide and 8 inches tall. Then plt.violinplot() is called with data as its argument.
plt.figure(figsize=(12,8))
plt.violinplot(data)
Optionally, you can add a title and axis labels. The title "Violin Plot Example" is set with the plt.title() function, the x-axis label "Dataset" with plt.xlabel(), and the y-axis label "Value" with plt.ylabel(). A grid is added with plt.grid(True). Finally, the generated plot is displayed with the plt.show() function.
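import numpy as np
import matplotlib.pyplot as plt

# Eight arrays of 200 samples each: mean 0, standard deviation 2 through 9
data = [np.random.normal(0, i, 200) for i in range(2, 10)]

# Draw one violin per array on a 12 by 8 inch figure
plt.figure(figsize=(12, 8))
plt.violinplot(data)

# Title, axis labels, and grid
plt.title("Violin Plot Example")
plt.xlabel("Dataset")
plt.ylabel("Value")
plt.grid(True)
plt.show()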
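When you execute the previous code, the result shown in Figure 1 is obtained. Because the data is random and no seed is set, the exact shape of each violin differs slightly from run to run.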
Figure 1 - Violin plot from synthetically generated "data" variable
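The appearance of a plot like Figure 1 can be adjusted through the arguments of violinplot() and through the dictionary of artists it returns. The sketch below is one possible variation, not part of the original example: the seed, the colors, the tick label text, and the choice of widths=0.8 and bw_method="scott" are all assumptions made for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumption: seeded generator so the figure is reproducible.
rng = np.random.default_rng(42)
sigmas = list(range(2, 10))
data = [rng.normal(0, s, 200) for s in sigmas]

fig, ax = plt.subplots(figsize=(12, 8))

# widths sets the maximum width of each violin, showmedians adds a median line,
# and bw_method selects the bandwidth rule for the Gaussian kernel density estimate.
parts = ax.violinplot(data, widths=0.8, showmedians=True, bw_method="scott")

# The violin bodies are returned under the "bodies" key and can be restyled.
for body in parts["bodies"]:
    body.set_facecolor("tab:orange")
    body.set_edgecolor("black")
    body.set_alpha(0.6)

# By default the violins sit at x positions 1..N; label them with their standard deviation.
positions = list(range(1, len(data) + 1))
ax.set_xticks(positions)
ax.set_xticklabels([f"std={s}" for s in sigmas])

ax.set_title("Violin Plot Example (customized)")
ax.set_xlabel("Dataset")
ax.set_ylabel("Value")
ax.grid(True)
```

Calling plt.show() afterwards displays the figure. Using the object-oriented interface (fig, ax) here rather than the plt functions is a design choice that keeps the returned artists and the axes they belong to in one place.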