I am using a dataframe that includes the following columns:

Country, GNI, CarSalesPerCap. I am using k-means to create clusters, passing the algorithm a dataframe with only the two numeric columns: 'GNI' and 'CarSalesPerCap'.

I then use Plotly to create a scatter plot, where the x-axis is CarSalesPerCap and the y-axis is GNI. My question is: how can I show the corresponding country for each point plotted on the graph?

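import pandas as pd
import plotly as py
import plotly.graph_objs as go
from sklearn.cluster import KMeans

# query and conn are defined elsewhere
df = pd.read_sql_query(query, conn)
df = df.dropna()

# Cluster on the two numeric columns only
df1 = df[['GNI', 'CarSalesPerCap']]
kmeans = KMeans(n_clusters=6, random_state=0).fit(df1)
labels = kmeans.labels_

# Glue the labels back onto the original data
df['clusters'] = labels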


# Let's analyze the clusters
print(df)
cluster0=df.loc[df['clusters'] == 0]
cluster1=df.loc[df['clusters'] == 1]
cluster2=df.loc[df['clusters'] == 2]
cluster3=df.loc[df['clusters'] == 3]
cluster4=df.loc[df['clusters'] == 4]
cluster5=df.loc[df['clusters'] == 5]

p0 = go.Scatter(x=cluster0['CarSalesPerCap'],
                y= cluster0['GNI'],
                mode='markers',
                marker=dict(color='black')
                )

p1 = go.Scatter(x=cluster1['CarSalesPerCap'],
                y= cluster1['GNI'],
                mode='markers',
                marker=dict(color='teal')
                )

p2 = go.Scatter(x=cluster2['CarSalesPerCap'],
                y= cluster2['GNI'],
                mode='markers',
                marker=dict(color='grey')
                )
p3 = go.Scatter(x=cluster3['CarSalesPerCap'],
                y= cluster3['GNI'],
                mode='markers',
                marker=dict(color='pink')
                )
p4 = go.Scatter(x=cluster4['CarSalesPerCap'],
                y= cluster4['GNI'],
                mode='markers',
                marker=dict(color='purple')
                )
p5 = go.Scatter(x=cluster5['CarSalesPerCap'],
                y= cluster5['GNI'],
                mode='markers',
                marker=dict(color='orange')
                )

layout = go.Layout(xaxis=dict(ticks='',
                              showticklabels=True,
                              zeroline=True,
                              title = 'CarSalesPerCap'),

                   yaxis=dict(ticks='',
                              showticklabels=True,
                              zeroline=True,
                              title='GNI'),
                   showlegend=False, hovermode='closest')

fig = go.Figure(data=[p0,p1,p2,p3,p4,p5], layout=layout)

py.offline.plot(fig)
  • You could color-code the countries or use different marker types for different countries. Commented Jun 18, 2018 at 15:50

1 Answer

Add a text attribute to your trace and Plotly will attach that text to each point. If you pass your country column as text, the country name is displayed on hover. If you want a permanent label instead, you can add annotations to the layout.

import plotly.graph_objs as go
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
init_notebook_mode(connected=True)
import pandas as pd
df = pd.DataFrame({'country':["USA", "MEXICO", "CANADA"], 'x':[1, 2, 4], 'y':[5, 6, 7]})
p0 = go.Scatter(
    x=df.x,
    y= df.y,
    mode='markers',
    marker=dict(
        color='#E90',
        size=15
    ),
    text = df.country,    
)

data = [p0]

iplot(data)

