
The data source on the web is: https://www.newyorkfed.org/medialibrary/media/survey/empire/data/esms_seasonallyadjusted_diffusion.csv. Because I had trouble connecting, I saved the file locally as 'esms_seasonallyadjusted_diffusion.csv' (which is faster anyway) and also uploaded it to GitHub: https://github.com/me50/hlar65/blob/master/ESMS_SeasonallyAdjusted_Diffusion.csv

2 questions:

  1. When I try to access the web link from code (even though clicking on it downloads the file), I get a connection error: "URLError: <urlopen error Tunnel connection failed: 403 Forbidden>"
  2. My code (I am a beginner!) looks clunky. Is there a cleaner, better way to write it?

Thanks everyone for your help.

import pandas as pd
import numpy as np
from pandas.plotting import scatter_matrix
import scipy as sp
from scipy import stats
import matplotlib.pyplot as plt
import seaborn as sn


df = pd.read_csv('https://www.newyorkfed.org/medialibrary/media/survey/empire/data/esms_seasonallyadjusted_diffusion.csv')
 
df = df.rename(columns={'surveyDate':'Date',
                        'GACDISA': 'IndexAll', 
                        'NECDISA': 'NumberofEmployees',
                        'NOCDISA': 'NewOrders',
                        'PPCDISA': 'PricesPaid',
                        'PRCDISA': 'PricesReceived'})
headers = df.columns

df['Date'] = pd.to_datetime(df['Date'])
df.set_index('Date', inplace=True)

IndexAll = df['IndexAll']
NumberofEmployees = df['NumberofEmployees']
NewOrders = df['NewOrders']
PricesReceived = df['PricesReceived']

data = df[['IndexAll', 'NumberofEmployees', 'NewOrders', 'PricesReceived']]
data2 = data.copy()

ds = data2
FS_A = 14
FS_L = 16
FS_T = 20
FS_MT = 25

fig, ((ax0, ax1), (ax2,ax3)) = plt.subplots(nrows=2, ncols=2, figsize=(20,15))

# density=True : probability density i.e. prb of an outcome; False = actual # of frequency

# The Date index is drawn on the x-axis; the series values are on the y-axis
ds['IndexAll'].plot(ax=ax0, color='red')
ax0.set_title('New York Empire Manufacturing Index', fontsize = FS_T)
ax0.set_xlabel('Date', fontsize = FS_L)
ax0.set_ylabel('Empire Index', fontsize = FS_L)
ax0.tick_params(labelsize=FS_A)

ds['NumberofEmployees'].plot(ax=ax1, color='blue')
ax1.set_title('Empire: Number of Employees', fontsize = FS_T)
ax1.set_xlabel('Date', fontsize = FS_L)
ax1.set_ylabel('Number of Employees', fontsize = FS_L)
ax1.tick_params(labelsize=FS_A)

ds['NewOrders'].plot(ax=ax2, color='green')
ax2.set_title('Empire: New Orders', fontsize = FS_T)
ax2.set_xlabel('Date', fontsize = FS_L)
ax2.set_ylabel('New Orders', fontsize = FS_L)
ax2.tick_params(labelsize=FS_A)

ds['PricesReceived'].plot(ax=ax3, color='black')
ax3.set_title('Empire: Prices Received', fontsize = FS_T)
ax3.set_xlabel('Date', fontsize = FS_L)
ax3.set_ylabel('Prices Received', fontsize = FS_L)
ax3.tick_params(labelsize=FS_A)

fig.tight_layout()
fig.suptitle('New York Manufacturing Index Main Components - Showing the Depths of COVID-19 in 2020', fontsize = FS_MT)
fig.tight_layout()
fig.subplots_adjust(top=0.88)
fig.subplots_adjust(bottom = -0.2) 
fig.savefig("Empire.png")
plt.show()


  • The first question is not reproducible: df = pd.read_csv('https://www.newyorkfed.org/medialibrary/media/survey/empire/data/esms_seasonallyadjusted_diffusion.csv') accesses the CSV file without issue. The second question is a duplicate of your previous question. SO questions should focus on one question, not several. Also, do not repeatedly ask the same question. See: How to ask a good question. Commented Oct 12, 2020 at 21:13
  • Does this answer your question? How to plot columns from a dataframe, as subplots. Commented Oct 12, 2020 at 21:13

1 Answer

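For the first question: have you configured a proxy? I think the error comes from the proxy settings, since "Tunnel connection failed" is what Python reports when a proxy refuses the request to open a tunnel to the target site.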
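
If a proxy is the cause, one way to check is to fetch the file with urllib while controlling the proxy settings explicitly. The following is only a minimal sketch, not a confirmed fix: the helper name fetch_csv_text and the User-Agent value are my own assumptions. Passing an empty dict to ProxyHandler tells urllib to ignore any proxy configured in the environment.

```python
import urllib.request


def fetch_csv_text(url, proxies=None, timeout=30):
    """Download url and return its body as text (hypothetical helper).

    proxies=None bypasses any proxy set in the environment; pass a dict such
    as {"https": "http://proxy.example:8080"} (placeholder) to use one.
    """
    # An empty dict disables proxies; ProxyHandler(None) would auto-detect them.
    handler = urllib.request.ProxyHandler(proxies or {})
    opener = urllib.request.build_opener(handler)
    # Assumption: some servers reject the default Python User-Agent.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with opener.open(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Offline demonstration with a data: URL (made-up values, no network needed).
sample = fetch_csv_text("data:text/csv,Date,Value%0A2020-01-31,1.5")
print(sample)
```

If the direct fetch works on your machine, the returned text can be handed to pandas with pd.read_csv(io.StringIO(text), parse_dates=['surveyDate']). If your network only allows traffic through a proxy, pass its address in the proxies argument instead.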

For the second one, I can do some cleanup, but much of it depends on coding style; the same script can be written in several different ways.

Note that:

  • You can parse the dates in the read_csv call (parse_dates)
  • You can do all the preparation of df in one step by chaining the methods inside parentheses
  • You can put the per-plot parameters in lists and draw all the plots in a for loop

import pandas as pd
import matplotlib.pyplot as plt

# Read, rename, and index the data in one chained expression
df = (
    pd
    .read_csv('https://www.newyorkfed.org/medialibrary/media/survey/empire/data/esms_seasonallyadjusted_diffusion.csv',
              parse_dates=['surveyDate'])
    .rename(columns={'surveyDate': 'Date',
                     'GACDISA': 'IndexAll',
                     'NECDISA': 'NumberofEmployees',
                     'NOCDISA': 'NewOrders',
                     'PPCDISA': 'PricesPaid',
                     'PRCDISA': 'PricesReceived'})
    .set_index('Date')
)

# Font sizes: tick labels, axis labels, subplot titles, main title
FS_A = 14
FS_L = 16
FS_T = 20
FS_MT = 25

# One entry per subplot, in plotting order
titles = ['New York Empire Manufacturing Index', 'Empire: Number of Employees', 'Empire: New Orders', 'Empire: Prices Received']
ylabels = ['Empire Index', 'Number of Employees', 'New Orders', 'Prices Received']
colors = ['red', 'blue', 'green', 'black']
columns = ['IndexAll', 'NumberofEmployees', 'NewOrders', 'PricesReceived']
ds = df[columns]
k = 0  # position in the parameter lists
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(20, 15))
for i in range(2):
    for j in range(2):
        ds[columns[k]].plot(ax=axes[i][j], color=colors[k])
        axes[i][j].set_title(titles[k], fontsize=FS_T)
        axes[i][j].set_xlabel('Date', fontsize=FS_L)  # the Date index is on the x-axis
        axes[i][j].set_ylabel(ylabels[k], fontsize=FS_L)
        axes[i][j].tick_params(labelsize=FS_A)
        k += 1
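
As a variation on the loop above, the nested loops and the manual counter k can probably be replaced by zipping the parameter lists with axes.flat, which walks the 2x2 grid row by row. This is a sketch under assumptions: it uses randomly generated stand-in data instead of the real survey file so that it runs offline, and it labels the y-axis with the column name for brevity.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic stand-in data (assumption): random monthly values, not the real survey.
columns = ["IndexAll", "NumberofEmployees", "NewOrders", "PricesReceived"]
dates = pd.date_range("2019-01-01", periods=24, freq="MS")
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(24, 4)), index=dates, columns=columns)
df.index.name = "Date"

titles = ["New York Empire Manufacturing Index", "Empire: Number of Employees",
          "Empire: New Orders", "Empire: Prices Received"]
colors = ["red", "blue", "green", "black"]

fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(20, 15))
# zip pairs each panel with its column, title, and color; no index bookkeeping needed.
for ax, col, title, color in zip(axes.flat, columns, titles, colors):
    df[col].plot(ax=ax, color=color)
    ax.set_title(title, fontsize=20)
    ax.set_xlabel("Date", fontsize=16)
    ax.set_ylabel(col, fontsize=16)
    ax.tick_params(labelsize=14)
fig.tight_layout()
```

To use it with the real data, drop the synthetic df and keep the loop; call plt.show() or fig.savefig(...) afterwards as in the question.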