I've been working with matplotlib for the past few weeks, learning how to make different kinds of graphs, and I am currently at a standstill. I am working with gene expression data, so I have a CSV file that contains 3 columns of data (mutated, frameshift and nonmutated). However, when I try to make a violin plot, I keep getting an error:
Traceback (most recent call last):
  File "/home/fmohamed/Documents/violinplot_script.py", line 39, in <module>
    axes.violinplot(all_data,
AttributeError: 'numpy.ndarray' object has no attribute 'violinplot'
I am not sure what I am doing wrong, but my code is located below:
import matplotlib.pyplot as plt
import numpy as np
import csv
#import data:
with open('/home/fmohamed/Documents/oc_data.csv') as csvfile:
    spamreader = csv.reader(csvfile, delimiter = ' ')
    array = []
    for row in spamreader:
        array.append(row)
# sort out bad data:
splitArray = [row[0].split(',') for row in array]
splitArray = [row for row in splitArray if row[0] != '' and row[1] != '' and
              row[2] != '']
descr = splitArray[0]
splitArray.pop(0) #remove the column descriptions from array
all_data = []
for row in splitArray:
    all_data.append(float(row[0]))
    all_data.append(float(row[1]))
    all_data.append(float(row[2]))
all_data = np.array(all_data)
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(12, 5))
# plot violin plot
axes.violinplot(all_data,
                showmeans=False,
                showmedians=False)
axes.set_title('violin plot')
# adding horizontal grid lines
axes.yaxis.grid(True)
axes.set_xticks([y+1 for y in range(len(all_data))])
axes.set_xlabel('xlabel')
axes.set_ylabel('ylabel')
# add x-tick labels
plt.setp(axes, xticks=[y+1 for y in range(len(all_data))],
         xticklabels=['mutated', 'frameshift', 'nonmutated'])
plt.show()
Thank you all in advance for any help that you can provide.
axes is a NumPy array of 3 Axes objects. You can do axes[0].violinplot(first_dataset) (so you'd need a loop to plot all three datasets). As @chthonicdaemon suggests, look into Seaborn and its FacetGrid objects if you want a way to make that violin plot for all three datasets in a single command.
The all_data array appears to be a single 1D array. Is that what you intended? If so, what's the purpose of your three separate columns in the subplots call?
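To make the loop suggestion concrete, here is a minimal sketch of one way it could look. It assumes the file has a header row followed by three numeric columns; the in-memory sample_csv and its values are made up for illustration and only stand in for oc_data.csv. Each column is kept as its own dataset instead of being flattened into one 1D array, and each of the three Axes gets one violin.

```python
import csv
import io

import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for oc_data.csv: a header row, then three numeric columns.
sample_csv = io.StringIO(
    "mutated,frameshift,nonmutated\n"
    "1.2,0.8,2.1\n"
    "0.9,,1.7\n"  # incomplete row, skipped below
    "1.5,1.1,2.4\n"
    "1.1,0.7,1.9\n"
)

reader = csv.reader(sample_csv)  # default delimiter is ','
labels = next(reader)            # first row holds the column descriptions
# Keep only rows where all three fields are present, and convert them to floats.
rows = [[float(v) for v in row] for row in reader if len(row) == 3 and all(row)]
data = np.array(rows)            # shape (n_rows, 3)
columns = [data[:, i] for i in range(data.shape[1])]

# With ncols=3, plt.subplots returns a NumPy array holding three Axes objects,
# so violinplot has to be called on each element, not on the array itself.
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(12, 5))
for ax, label, values in zip(axes, labels, columns):
    ax.violinplot(values, showmeans=False, showmedians=False)
    ax.set_title(label)
    ax.yaxis.grid(True)
    ax.set_xticks([1])
    ax.set_xticklabels([label])

# Alternative: a single Axes with the three violins side by side.
fig2, ax2 = plt.subplots(figsize=(6, 5))
ax2.violinplot(columns, showmeans=False, showmedians=False)
ax2.set_xticks([1, 2, 3])
ax2.set_xticklabels(labels)
ax2.yaxis.grid(True)
```

Call plt.show() afterwards to display the figures. If one plot with three violins is what you actually want, the second figure is probably the simpler route, since a single plt.subplots() call without nrows/ncols returns one Axes object and not an array.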