1

I have a text file that contains image pixel values as below:

#1 nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,1.225475199490155542e-01,1.044848965437222138e-01,1.237544502265838786e-01,1.715363333404669177e-01,1.922596029233400172e-01,1.809632738682011854e-01,1.797130234316194342e-01,1.738541208375123936e-01,1.444294554581726231e-01,1.321258390981746855e-01,1.344635498234532101e-01,1.436132527743466947e-01,1.395290556225499690e-01,1.374780604935658956e-01,1.346506483347080507e-01,1.280550646990075425e-01,1.248504215497622527e-01,1.178248061901537996e-01,1.298443201619972898e-01,1.553180115989083732e-01,1.580724143044860419e-01,1.784962367422186780e-01,1.907025124594779186e-01,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
#2 nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,1.029349857154064074e-01,9.448919637788849579e-02,1.059611529861169132e-01,1.123315418475298866e-01,1.044427274454799576e-01,1.201363996329007783e-01,1.282688456719490722e-01,1.251468493081038524e-01,1.305904505950917782e-01,1.166948019212366294e-01,1.099250506785318382e-01,1.136641770357243175e-01,1.130515076243375772e-01,1.184654413023679964e-01,1.445082878208643895e-01,1.663965434098903795e-01,1.663395733842590318e-01,1.752476275152526075e-01,1.685796922638230499e-01,1.482366311004082449e-01,1.309908022384465853e-01,1.261424559469170870e-01,1.268358150633545067e-01,1.255352810594060065e-01,1.259829554332418666e-01,1.289792505226832475e-01,1.297540150693830274e-01,1.209480533761810861e-01,1.285694058734546119e-01,1.369298058593048373e-01,1.461700389952401702e-01,1.431042116739904002e-01,1.712214395634834019e-01,1.818925300859868255e-01,2.010257021882600748e-01,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
#3 nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,9.861446120242163549e-02,9.304676318969960780e-02,9.864122376278822157e-02,1.075597393605647739e-01,1.131419975961711483e-01,1.146375133556569031e-01,1.204342658911874697e-01,1.228412754806565282e-01,1.240924670494341492e-01,1.163476394083799020e-01,1.073797480686657368e-01,1.017817224886293226e-01,1.131027905414023760e-01,1.114406335131803705e-01,1.227824308916071611e-01,1.329011478552513670e-01,1.441114715371090704e-01,1.604792748573601047e-01,1.527513461191236099e-01,1.380147589010027598e-01,1.288032806310404343e-01,1.338005227090968141e-01,1.255554854466473802e-01,1.173452604805394900e-01,1.143985402480809654e-01,1.202454679138123678e-01,1.267178125929230847e-01,1.241315491837501339e-01,1.347653795894559747e-01,1.349437732217280139e-01,1.301418957926175068e-01,1.313508293861232468e-01,1.742619338497762571e-01,1.858488867892321983e-01,1.877861224975270471e-01,1.803044688712685528e-01,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
#4 nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,8.736886296542736852e-02,8.908654375958220684e-02,8.620668521597033007e-02,9.500020858506200150e-02,1.126136404574935440e-01,1.187951696788534656e-01,1.168147013779436694e-01,1.109278058355442492e-01,1.128276541805584010e-01,1.173942164407098532e-01,1.152133179543410046e-01,1.111828410014303326e-01,1.192855572113103724e-01,1.157219419285210882e-01,1.051462987022579870e-01,1.042841664852307976e-01,1.263179021208208075e-01,1.543027512926945510e-01,1.531517647661817527e-01,1.370377223529097022e-01,1.217984978313198102e-01,1.340931752979427627e-01,1.274053299614930634e-01,1.206931794950223541e-01,1.149389700113669505e-01,1.083743218115938711e-01,1.135429261076744967e-01,1.224571336189042570e-01,1.316256830092336905e-01,1.296892050846524258e-01,1.220541991422918193e-01,1.251462726710364792e-01,1.475487955738131740e-01,1.8
.
.
.
.

It has a matrix of values in the txt file, one value corresponding to a pixel. Each row is split as above. (scroll to the right)

When I read the file in R as:

txt <- read.table("ndvi_20180102_081439_1005_3B.txt")

It produces a data.frame as below:

   V1
#1 nan,nan,nan,nan,-0.131231,nan,nan,nan,....
#2 nan,nan,nan,1.2323,nan,nan,-1,2313,nan,....
.
.
.
#187 nan,nan,nan,1.12323,nan,nan,...
#188 nan,nan,0.2323,nan,nan,...

I want it in this form to calculate the mean of pixel values:

#1 nan
#2 nan
#3 -1,23232
#4 nan
.
.
.
.

I tried to separate with tidyverse::separate but I don't want to calculate the number of variables since I need to do it for about 439 files in a loop.

Finally, I want this form:

  #file1 #file2 #file3 ...... #file439
#1 nan     nan   nan
#2 nan     nan   nan
#3 nan    -1,32  nan
#4 -1,3    0,22  nan
.
.
.

How can I convert the text in the desired form?

8
  • 3
    I don't understand the connection between the input file and the expected output. Maybe you should explain that to us. Commented Feb 13, 2019 at 7:16
  • It has a matrix of values in the txt file, one value corresponding to a pixel. Each row is split as above. I edited the values of the text. Commented Feb 13, 2019 at 7:25
  • So right now you get a dataframe when using read.table(), but you want all the columns of the dataframe concatenated to one vector? Commented Feb 13, 2019 at 7:27
  • @LAP it produces only one column. I want to separate them into multiple columns without defining the number of columns since I don't know the number of variables in each file. Commented Feb 13, 2019 at 7:29
  • What do you know? Do you know the number of rows each file should produce? Commented Feb 13, 2019 at 7:29

3 Answers 3

1

An easy workaround for this would be to use the fread function from the data.table package.

txt <- fread(file = "ndvi_20180102_081439_1005_3B.txt",sep = ",")

To get the mean along each column you can use

txt[,lapply(X = .SD,FUN = mean),.SDcols = colnames(txt)]

Hope that helps

Sign up to request clarification or add additional context in comments.

3 Comments

You might however want to treat the "nan" present in the data before calculating the mean or just use txt[,lapply(X = .SD,FUN = mean,na.rm = T),.SDcols = colnames(txt)] instead
Hi, I am going to calculate the mean for the total, not for each column. It really helped me with the first code!! I am going to melt it and omit NA's then calculate the mean for the column
Glad it helped.
0

data.table's fread mentioned by @Rage is a good choice. We need to take a little effort to deal with your first column or rather header "#1 nan" which is separated by spaces:

library(data.table)

x <- "#1 nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,1.225475199490155542e-01,1.044848965437222138e-01,1.237544502265838786e-01,1.715363333404669177e-01,1.922596029233400172e-01,1.809632738682011854e-01,1.797130234316194342e-01,1.738541208375123936e-01,1.444294554581726231e-01,1.321258390981746855e-01,1.344635498234532101e-01,1.436132527743466947e-01,1.395290556225499690e-01,1.374780604935658956e-01,1.346506483347080507e-01,1.280550646990075425e-01,1.248504215497622527e-01,1.178248061901537996e-01,1.298443201619972898e-01,1.553180115989083732e-01,1.580724143044860419e-01,1.784962367422186780e-01,1.907025124594779186e-01,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
#2 nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,1.029349857154064074e-01,9.448919637788849579e-02,1.059611529861169132e-01,1.123315418475298866e-01,1.044427274454799576e-01,1.201363996329007783e-01,1.282688456719490722e-01,1.251468493081038524e-01,1.305904505950917782e-01,1.166948019212366294e-01,1.099250506785318382e-01,1.136641770357243175e-01,1.130515076243375772e-01,1.184654413023679964e-01,1.445082878208643895e-01,1.663965434098903795e-01,1.663395733842590318e-01,1.752476275152526075e-01,1.685796922638230499e-01,1.482366311004082449e-01,1.309908022384465853e-01,1.261424559469170870e-01,1.268358150633545067e-01,1.255352810594060065e-01,1.259829554332418666e-01,1.289792505226832475e-01,1.297540150693830274e-01,1.209480533761810861e-01,1.285694058734546119e-01,1.369298058593048373e-01,1.461700389952401702e-01,1.431042116739904002e-01,1.712214395634834019e-01,1.818925300859868255e-01,2.010257021882600748e-01,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
#3 nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,9.861446120242163549e-02,9.304676318969960780e-02,9.864122376278822157e-02,1.075597393605647739e-01,1.131419975961711483e-01,1.146375133556569031e-01,1.204342658911874697e-01,1.228412754806565282e-01,1.240924670494341492e-01,1.163476394083799020e-01,1.073797480686657368e-01,1.017817224886293226e-01,1.131027905414023760e-01,1.114406335131803705e-01,1.227824308916071611e-01,1.329011478552513670e-01,1.441114715371090704e-01,1.604792748573601047e-01,1.527513461191236099e-01,1.380147589010027598e-01,1.288032806310404343e-01,1.338005227090968141e-01,1.255554854466473802e-01,1.173452604805394900e-01,1.143985402480809654e-01,1.202454679138123678e-01,1.267178125929230847e-01,1.241315491837501339e-01,1.347653795894559747e-01,1.349437732217280139e-01,1.301418957926175068e-01,1.313508293861232468e-01,1.742619338497762571e-01,1.858488867892321983e-01,1.877861224975270471e-01,1.803044688712685528e-01,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan
#4 nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,8.736886296542736852e-02,8.908654375958220684e-02,8.620668521597033007e-02,9.500020858506200150e-02,1.126136404574935440e-01,1.187951696788534656e-01,1.168147013779436694e-01,1.109278058355442492e-01,1.128276541805584010e-01,1.173942164407098532e-01,1.152133179543410046e-01,1.111828410014303326e-01,1.192855572113103724e-01,1.157219419285210882e-01,1.051462987022579870e-01,1.042841664852307976e-01,1.263179021208208075e-01,1.543027512926945510e-01,1.531517647661817527e-01,1.370377223529097022e-01,1.217984978313198102e-01,1.340931752979427627e-01,1.274053299614930634e-01,1.206931794950223541e-01,1.149389700113669505e-01,1.083743218115938711e-01,1.135429261076744967e-01,1.224571336189042570e-01,1.316256830092336905e-01,1.296892050846524258e-01,1.220541991422918193e-01,1.251462726710364792e-01,1.475487955738131740e-01,1.8"

DT <- fread(x, fill=TRUE, na.strings="nan")

DT[, c("V0", "V1") := tstrsplit(V1, " ", fixed=TRUE)]
set(DT, which(DT[["V1"]]=="nan"),"V1", NA)
DT[, V1 := as.numeric(V1)]
cnames <- DT$V0
DT[, V0 := NULL]
DT <- transpose(DT)
DT <- na.omit(DT)
setnames(DT, names(DT), cnames)

print(head(DT))

DTmean <- DT[, lapply(.SD, mean)]
print(DTmean)

Results:

> print(head(DT))
          #1        #2        #3        #4
1: 0.1225475 0.1136642 0.1017817 0.1111828
2: 0.1044849 0.1130515 0.1131028 0.1192856
3: 0.1237545 0.1184654 0.1114406 0.1157219
4: 0.1715363 0.1445083 0.1227824 0.1051463
5: 0.1922596 0.1663965 0.1329011 0.1042842
6: 0.1809633 0.1663396 0.1441115 0.1263179

> print(DTmean)
          #1        #2        #3        #4
1: 0.1477638 0.1407628 0.1330294 0.1976434

Comments

0

This is how I solve it from the answer that @Rage provides:

library(data.table)
library(tidyverse)

txt <- fread(file = "ndvi_20180102_081439_1005_3B.txt",sep = ",")

proc_txt <- function(f) {
  txt <- fread(file = f, sep = ",")

  txt <- gather(txt)
  txt <- na.omit(txt)

  mean <- mean(txt$value)

  return(mean)
}

txt_files <- list.files(path=".", pattern=".txt")

df_list <- lapply(xml_files, proc_txt)
final_df <- do.call(rbind, df_list)

The final output is a table that has one column and has mean of all values of pixel in a single file for example:


n1 <- "nan, nan, 2, 3, 4, nan"
n2 <- "nan, 1, 2, 3, nan"
n3 <- "nan, 3, 4, 5, nan"

The code above produces a table such as:

n1 3
n2 2
n3 4

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.