
I have downloaded the data and would like to convert the columns USD and EUR to numeric and treat the column date as a Date. I would also like to remove the missing values from the data frame named result3.

library(dplyr)
library(ggplot2)
library(reshape2) 

getNBPRates <- function(year) {
  url1 <- sprintf(
    paste0("https://www.nbp.pl/kursy/Archiwum/archiwum_tab_a_", year, ".csv"), 
    year)
  url1 <- read.csv2(url1, header=TRUE, sep=";", dec=",") %>% 
    select(data, X1USD, X1EUR) %>% 
    rename(usd=X1USD, eur=X1EUR, date=data) %>%
    slice(-1)
  transform(url1, date = as.Date(as.character(date), "%Y%m%d"))
}

a <- getNBPRates(year=2015)

head(as.data.frame(a))

years<- c(2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020)

result <- lapply(years, getNBPRates)

result3 <- Reduce(rbind, result)
Error: object 'getNBPRates' not found. Also, Reduce(rbind, result) is very inefficient and will get slow with more data; use do.call(rbind, result) instead. The difference is that with Reduce, rbind is called n-1 times (where n is length(years) here), and each call makes a complete copy of the accumulated data in memory, which eventually gets big and slow. With do.call, rbind is called only once, so there is no repeated copying. Commented Jan 24, 2022 at 18:55
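
To make the comment above concrete, here is a minimal sketch on toy data. The data frames in `pieces` are made-up stand-ins for what getNBPRates() would return, so nothing is downloaded; only the way the pieces are combined matters.

```r
# Toy stand-ins for the per-year data frames (values are invented)
pieces <- lapply(2013:2015, function(y) {
  data.frame(year = y, usd = c(3.1, 3.2), eur = c(4.1, 4.2))
})

# do.call passes every list element to a single rbind() call
combined <- do.call(rbind, pieces)

# Reduce calls rbind() pairwise, copying the growing result each time
combined_reduce <- Reduce(rbind, pieces)

# Both approaches should produce the same rows
all.equal(combined, combined_reduce)
```

With eight yearly files the speed difference is unlikely to be noticeable, but the do.call form scales better as the list grows.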

2 Answers

library(dplyr)

getNBPRates <- function(year) {
  # Build the URL of the yearly archive file
  url1 <- paste0("https://www.nbp.pl/kursy/Archiwum/archiwum_tab_a_", year, ".csv")
  rates <- read.csv2(url1, header=TRUE, sep=";", dec=",", fileEncoding = "Windows-1250")
  rates |>
    select(data, X1USD, X1EUR) |>
    slice(-1) |>                          # drop the first row
    filter(row_number() <= n() - 3) |>    # drop the last three rows
    mutate(
      date = as.Date(as.character(data), format = "%Y%m%d"),
      usd = as.numeric(gsub(",", ".", X1USD)),   # decimal comma -> decimal point
      eur = as.numeric(gsub(",", ".", X1EUR))
    ) |>
    select(date, usd, eur)
}

years <- c(2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020)
result <- lapply(years, getNBPRates)
# A single rbind call instead of Reduce(rbind, result)
result3 <- do.call(rbind, result)

And what do you mean by "to get rid of the missing values in the dataframe named result3"? If you mean the missing dates, then you have to fill them in according to some rule. If I'm not mistaken, when there is no NBP rate for a particular day, the last published one has to be taken.
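
If "missing values" does mean missing dates, one possible way to apply the "take the last published rate" rule is sketched below in base R. The `rates` data frame and its values are invented for illustration, and `locf` is a hypothetical helper name, not something from the question.

```r
# Invented example: no rates for 2020-01-04 .. 2020-01-06
rates <- data.frame(
  date = as.Date(c("2020-01-02", "2020-01-03", "2020-01-07")),
  usd  = c(3.80, 3.82, 3.79),
  eur  = c(4.25, 4.26, 4.24)
)

# Every calendar day between the first and last observation
all_days <- data.frame(date = seq(min(rates$date), max(rates$date), by = "day"))

# Left join: days without a rate get NA in usd and eur
filled <- merge(all_days, rates, by = "date", all.x = TRUE)

# Hypothetical helper: last observation carried forward
locf <- function(x) {
  idx <- cummax(ifelse(is.na(x), 0L, seq_along(x)))
  idx[idx == 0L] <- NA   # leading NAs have nothing to carry forward
  x[idx]
}

filled$usd <- locf(filled$usd)
filled$eur <- locf(filled$eur)
filled
```

Packages such as zoo or tidyr offer ready-made fill functions for the same purpose; the base R version is used here only to avoid extra dependencies.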


To change a column to numeric you can use as.numeric(column_name)

Based on the date format in the archiwum_tab_a_2015.csv file, you can change the date column with as.Date(column_name, format = "%Y%m%d")
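
As a rough illustration of both conversions, here is a sketch on a small made-up data frame. It assumes the rates were read in as text with a decimal comma, in which case as.numeric() alone would return NA, so the comma is replaced with a point first. `raw` and `to_number` are hypothetical names used only for this example.

```r
# Made-up values in the shape of the downloaded file (all columns are text)
raw <- data.frame(
  date = c("20150102", "20150105", "20150107"),
  usd  = c("3,5725", "3,5990", "3,6150"),
  eur  = c("4,3078", "4,2960", "4,3000"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: swap the decimal comma for a point, then convert
to_number <- function(x) as.numeric(sub(",", ".", x, fixed = TRUE))

clean <- transform(
  raw,
  date = as.Date(date, format = "%Y%m%d"),
  usd  = to_number(usd),
  eur  = to_number(eur)
)

str(clean)
```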

To remove all missing values you can use complete.cases(data):

mydata[complete.cases(mydata),]
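
A short sketch of what that indexing does, using an invented `mydata`: complete.cases() only returns a logical vector, and the rows are dropped when that vector is used as the row index.

```r
# Invented data with one NA in usd and one NA in eur
mydata <- data.frame(
  date = as.Date(c("2015-01-02", "2015-01-05", "2015-01-06")),
  usd  = c(3.57, NA, 3.60),
  eur  = c(4.30, 4.29, NA)
)

# TRUE for rows with no missing value in any column
keep <- complete.cases(mydata)

# Use the logical vector as the row index to keep only complete rows
cleaned <- mydata[keep, ]

# na.omit() is a base R shortcut that keeps the same rows
cleaned_alt <- na.omit(mydata)
```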

2 Comments

When I use as.numeric(column_name), almost the whole column becomes missing values. Also, complete.cases(data) only returns TRUE or FALSE values and doesn't remove the missing values.
You just need to index the whole data frame using the result of complete.cases, so something like mydata[complete.cases(mydata),]. As for as.numeric, if you have missing values there isn't much you can do. I would suggest removing the missing values first, although if values that should convert to numeric become NA after as.numeric, the data probably needs further pre-processing.
