
Just to preface, I'm very new to R, and sorry for the "special characters". I'm currently trying to read this CSV file I'm working with. Here is my code:

X17_01_24_Rawdata_SSB_fish_2021 <- read_delim("17-01-24-Rawdata-SSB-fish-2021.csv", 
    delim = ";", escape_double = FALSE, col_names = FALSE, 
    locale = locale(encoding = "latin1"), 
    trim_ws = TRUE, skip = 2)

Here are the first few lines of the CSV:

"09283: Eksport av fisk, etter land, statistikkvariabel, ?r og varegruppe",
,
"land;""Verdi (mill. kr) 2021 Fisk"," krepsdyr og bl?tdyr i alt"";""Verdi (mill. kr) 2021 Laks"";""Verdi (mill. kr) 2021 Torsk"";""Verdi (mill. kr) 2021 Sild"";""Verdi (mill. kr) 2021 Makrell"";""Verdi (mill. kr) 2021 Sei"";""Verdi (mill. kr) 2021 ?rret"";""Verdi (mill. kr) 2021 Hyse"";""Verdi (mill. kr) 2021 Lange"";""Verdi (mill. kr) 2021 Brosme"";""Verdi (mill. kr) 2021 Uer"";""Verdi (mill. kr) 2021 Kveite"";""Verdi (mill. kr) 2021 Annen fisk"";""Verdi (mill. kr) 2021 Reker"";""Verdi (mill. kr) 2021 Andre skalldyr/bl?tdyr"""
Albania;4;0;0;1;0;0;0;0;0;0;0;0;0;:;3,
Andorra;0;0;0;0;0;0;0;0;0;0;0;0;0;:;0,
Belarus;1135;179;0;170;58;0;701;0;0;0;1;0;25;:;0,

[Image: the table from which the CSV was generated]

My goal is for the data to be split at the delimiter ; into separate columns. If I skip line 3 as well, it works and I just get default column names. When I don't skip it, nothing is split. I could just rename the columns manually, but that seems like brute force. I'm also aware that even once the columns are split, they will need cleaning.

Is read_delim the wrong tool? How does it work, and what is stopping it from doing what it's "supposed to"?

  • Are you willing to edit your question to include the CSV example in a code block (triple-` delimited)? It's very hard to figure out where the line breaks are, and whether the quotation characters are literal (in the file) or not. Commented Jan 17, 2024 at 14:18
  • Of course, my apologies, I'll move it into a code chunk right away. The quotation marks are exactly as in the file. Commented Jan 17, 2024 at 14:21
  • A simple delimited file has only values separated by delimiters, e.g., a;b;c. This is a problem if the delimiter can occur inside a value: if the first value should be a;b and ; is the delimiter, we use quotes for grouping, "a;b";c or "a;b";"c". Your 3rd line is very confusing. My guess is that the first column name is supposed to be land, in which case the line should start with "land";. Because the first ; is inside the quotes, it is syntactically ignored, so R is probably thinking the first column name is land;""Verdi (mill. kr) 2021 Fisk"," krepsdyr og bl?tdyr i alt". Commented Jan 17, 2024 at 14:33
  • It would help if you could show us what you want the column names to be, so I can confirm my guess. Commented Jan 17, 2024 at 14:33
  • I added a screenshot of the system that generated the CSV, which was a bit easier than describing the split in detail. It is indeed confusing and I'm unsure how the clutter was generated on that line. Commented Jan 17, 2024 at 21:40
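
The quoting behaviour described in the comments above can be checked on a one-line sample. This is only a sketch: the header string is a hypothetical, shortened stand-in for the file's third line, and it assumes readr 2.0 or later so that I() can pass literal text to read_delim.

```r
library(readr)

# Hypothetical one-line header in the same style as the file's third line:
# the whole line is wrapped in quotes and the inner quotes are doubled.
header <- '"land;""A"";""B"""\n'

# Default quoting: semicolons inside the quotes are not treated as delimiters.
quoted <- read_delim(I(header), delim = ";", col_names = FALSE,
                     show_col_types = FALSE)

# quote = "": quotation marks are ordinary characters, so every ';' splits.
unquoted <- read_delim(I(header), delim = ";", col_names = FALSE,
                       quote = "", show_col_types = FALSE)

# Compare how many fields each call found.
ncol(quoted)
ncol(unquoted)
```

With quoting switched off, the line splits at every semicolon, at the cost of leaving the quotation marks in the values.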

1 Answer


Changing col_names to TRUE and quote to "" (an empty string) seems to do what you want, once you remove all the extra quotation marks from the column names. I think the main problem is that the semicolon delimiters in your header line sit inside quotation marks, so they are not treated as delimiters.

library(readr)
library(dplyr)

read_delim("tmp.dat",
    delim = ";", escape_double = FALSE, col_names = TRUE, 
    locale = locale(encoding = "latin1"), 
    trim_ws = TRUE, skip = 2,
    quote = "") |>
  # strip the literal quotation marks left in the column names
  rename_with(~ stringr::str_remove_all(., '"'))
I would probably follow this with a

rename_with(~ stringr::str_remove(., stringr::fixed("Verdi (mill. kr) 2021 ")))

...
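
For anyone who wants to try the whole pipeline without the original file, here is a minimal sketch on a shortened, hypothetical copy of the data. It goes slightly beyond the answer above and rests on a few assumptions: the trailing comma on each line is an export artifact that can be stripped, ":" marks a missing value, and readr 2.0 or later is installed so that I() can pass literal text. The locale argument is dropped because the sample is plain ASCII.

```r
library(readr)
library(dplyr)
library(stringr)

# Hypothetical, shortened copy of the file: two preamble lines, the
# over-quoted header, then data rows that each end in a stray comma.
raw_lines <- c(
  '"09283: Eksport av fisk, etter land",',
  ',',
  '"land;""Verdi (mill. kr) 2021 Fisk"," krepsdyr i alt"";""Verdi (mill. kr) 2021 Laks"";""Verdi (mill. kr) 2021 Reker"";""Verdi (mill. kr) 2021 Andre skalldyr"""',
  'Albania;4;0;:;3,',
  'Belarus;1135;179;:;0,'
)

# Drop the two preamble lines, then strip the trailing comma from each line.
body <- str_remove(raw_lines[-(1:2)], ",$")

fish <- read_delim(
  I(paste(body, collapse = "\n")),
  delim = ";",
  quote = "",        # treat quotation marks as ordinary characters
  na = c("", ":"),   # assumption: ":" marks a missing value
  trim_ws = TRUE,
  show_col_types = FALSE
) |>
  # remove the literal quotation marks from the column names
  rename_with(~ str_remove_all(.x, '"')) |>
  # remove the repeated prefix; fixed() stops "(" and "." acting as regex
  rename_with(~ str_remove(.x, fixed("Verdi (mill. kr) 2021 ")))

names(fish)
```

Dropping the preamble in R rather than with skip = 2 keeps the comma stripping away from the lines that are discarded anyway. With the real file, readLines() (with the appropriate encoding) could supply raw_lines.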


2 Comments

Thank you. This solved the problem and did what I wanted.
You are encouraged to click the check-mark to accept the answer, if it solved your problem ...
