I have a large dataset that I want to read with fread, using the select argument to pick only the columns I need for this analysis. Is there a way to also filter rows on a text column while reading, something like dt %>% filter(str_detect(name, "Mary")), but inside the fread call? Thanks.

  • That is not possible with fread, but it is possible with the sqldf package. It accepts SQL statements, which makes it very flexible but also slow, sometimes very slow. For large files (how large is yours?) it can save a lot of memory, because it discards what is not needed and keeps only what the SELECT statement returns. Commented Nov 23, 2020 at 22:47
  • Thanks. My dataset is about 1.4 GB, and fread takes about 4 minutes to load it. Since I work with several datasets like this, speed really matters to me. Commented Nov 23, 2020 at 22:56
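
To make the trade-off in the comments concrete, here is a minimal sketch of both routes on a tiny stand-in file. The file, its columns (name, age, city) and the object names are hypothetical, made up for illustration. The first route reads only the selected columns with fread and filters afterwards in memory; the second uses sqldf's read.csv.sql so the WHERE clause is applied while the file is being read. It assumes the data.table and sqldf packages are installed.

```r
library(data.table)
library(sqldf)

# Hypothetical sample file standing in for the real 1.4 GB dataset
path <- tempfile(fileext = ".csv")
fwrite(
  data.table(
    name = c("Mary Smith", "John Doe", "Anne Mary Lee"),
    age  = c(30L, 41L, 25L),
    city = c("Boston", "Austin", "Denver")
  ),
  path
)

# Route 1: fread with select, then filter the rows in memory.
# All rows of the selected columns are loaded before filtering.
dt <- fread(path, select = c("name", "age"))
mary_dt <- dt[grepl("Mary", name, fixed = TRUE)]

# Route 2: sqldf filters while reading. Inside read.csv.sql the
# file is referred to as the table `file`. Note that SQLite's LIKE
# is case-insensitive for ASCII, unlike grepl() above.
mary_sql <- read.csv.sql(
  path,
  sql = "select name, age from file where name like '%Mary%'"
)
```

Route 1 keeps fread's speed but holds every row of the selected columns in memory before filtering. Route 2 returns only the matching rows, at the cost of the slower import the first comment warns about. Which one wins on a 1.4 GB file is worth timing on the actual data.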
