Using Windows 10, R 4.0.3 and RStudio 1.4.1103

I have a script (written by a developer) whose output is a kind of tree diagram in a .txt file. The relevant piece of code is:

for (index in seq_len(nrow(file))) {
  write(paste0(path, if (index == nrow(file)) '└' else '├', '──', file[index, 'name']), tree_filename, append = TRUE)

  newPath = if (index == nrow(file)) paste0(path, '    ') else paste0(path, '│   ')
  treefunction(file[index, id_column_header], newPath)
}

The characters │ and └ appear correctly in RStudio when typed into the code. However, when the output of the function is saved to a .txt file, these characters become +'s and -'s for me, while for the developer everything works perfectly (please see the image below with both outputs).

What I have tried so far: I have set UTF-8 in .Rprofile, and the .txt file is encoded in UTF-8 (I have checked).

The developer is using Linux (I'm not sure which version). Could someone please help with what I should do so that characters like └ display as they should? Thank you very much.

  • Sometimes the RStudio text editor can't render UTF-8 characters correctly. What happens when you open the file in Notepad++ or a better editor? I'm trying to figure out whether there is an actual encoding problem, or just a rendering problem with the editor. Commented Feb 13, 2021 at 0:59
  • Thank you very much, @DavidJ.Bosak, for taking the time to respond. If I open my output in Notepad++, it looks the same as the right-hand-side picture, i.e. not with the correct characters. And the developer's version looks like the left-hand-side picture even in simple Notepad. Commented Feb 13, 2021 at 22:23

1 Answer

First of all, I sympathize with you. Encoding on Windows is a nightmare. There is a guy named Tomas Kalibera on the R Core team who is working to fix this. Probably in the next year or so it will be greatly improved. Here is a link that explains how he is going to fix it.

Second, I think you can solve your problem now by making a few changes to the way you are writing the strings:

  1. Use Unicode character codes instead of direct strings. These codes are known as "box drawing codes". A complete list and further information can be found here.

  2. Open your file with encoding = 'native.enc'

  3. Use writeLines with the useBytes = TRUE option instead of write.

Here is an example:

f <- file("test.txt", open = "w", encoding = "native.enc")
writeLines("\U251C\U2500\U2500 Herr Dvorek Frank von Lakatos", f, useBytes = TRUE)
writeLines("\U2502   \U2514\U2500\U2500 Dr Maria Lakatos", f, useBytes = TRUE)
close(f)

The result in Notepad++ looks like this:

Box codes rendered in Notepad++

I'm working in the same environment as you, so I think this should work.
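Applied to a recursive tree writer like the one in the question, the same three changes might look roughly like the sketch below. The real `treefunction` and its data are not shown in the question, so `write_tree`, the `nodes` data frame and its `id`/`parent`/`name` columns are hypothetical stand-ins; only `tree_filename` is borrowed from the original code. I also added a space before each name, as in my example above.

```r
# Box-drawing characters as Unicode escapes, so the script's own file
# encoding no longer matters.
TEE   <- "\u251C"  # BOX DRAWINGS LIGHT VERTICAL AND RIGHT
ELBOW <- "\u2514"  # BOX DRAWINGS LIGHT UP AND RIGHT
DASH  <- "\u2500"  # BOX DRAWINGS LIGHT HORIZONTAL
PIPE  <- "\u2502"  # BOX DRAWINGS LIGHT VERTICAL

# Hypothetical stand-in for treefunction(): writes the children of
# `parent_id` to the open connection `con`, then recurses into each child.
write_tree <- function(con, nodes, parent_id, prefix = "") {
  children <- nodes[which(nodes$parent == parent_id), , drop = FALSE]
  for (i in seq_len(nrow(children))) {
    is_last <- i == nrow(children)
    branch <- if (is_last) ELBOW else TEE
    line <- paste0(prefix, branch, DASH, DASH, " ", children$name[i])
    # useBytes = TRUE writes the UTF-8 bytes as-is, without re-encoding
    writeLines(enc2utf8(line), con, useBytes = TRUE)
    next_prefix <- paste0(prefix, if (is_last) "    " else paste0(PIPE, "   "))
    write_tree(con, nodes, children$id[i], next_prefix)
  }
}

# Made-up example data: parent 0 marks the top level.
nodes <- data.frame(
  id     = c(1, 2, 3),
  parent = c(0, 1, 0),
  name   = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

tree_filename <- file.path(tempdir(), "tree.txt")
con <- file(tree_filename, open = "w", encoding = "native.enc")
write_tree(con, nodes, parent_id = 0)
close(con)
```

Opening the connection once and passing it down replaces the repeated `write(..., append = TRUE)` calls, so every line goes through the same `writeLines(..., useBytes = TRUE)` path.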

If you need to read the file back in, use this:

mylines <- readLines("test.txt", encoding = "UTF-8")
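If you want to confirm that the file really contains UTF-8 box characters, and not '+' and '-' substitutes that merely look wrong in an editor, one option is to look at the raw bytes. This is only a sketch: the temporary file and the names `tmp`, `bytes` and `utf8_ok` are made up for the illustration.

```r
# Write one line the same way as in the example above.
tmp <- tempfile(fileext = ".txt")
con <- file(tmp, open = "w", encoding = "native.enc")
writeLines("\u251C\u2500\u2500 node", con, useBytes = TRUE)
close(con)

# Read the file back as raw bytes, bypassing any encoding conversion.
bytes <- readBin(tmp, what = "raw", n = file.size(tmp))

# In UTF-8, U+251C is the three bytes e2 94 9c.
# A substituted '+' would show up as the single byte 2b instead.
utf8_ok <- identical(bytes[1:3], as.raw(c(0xE2, 0x94, 0x9C)))
print(bytes[1:3])
print(utf8_ok)
```

If `utf8_ok` is FALSE for your own output file, the characters were already replaced at write time, which points at the writing code and not at the editor.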

1 Comment

Thank you very much for this, David! I greatly appreciate that you tried this, worked out a solution and took the time to explain.
