Select data frame elements from multiple lists inside a list

Question

I have a list (IPCs) containing multiple data frames.

Here is a sample from my list:

$ http://www.sumobrain.com/patents/us/Measured-object-support-mechanism-for-unbalance-measuring-apparatus/4981043.html:List of 1
..$ :'data.frame':  3 obs. of  5 variables:
.. ..$ X1: chr [1:3] "2001826A" "2857764A" "3452604A"
.. ..$ X2: chr [1:3] "1935-05-21" "1958-10-28" "1969-07-01"
.. ..$ X3: chr [1:3] "Russell et al." "Frank" "Schaub"
.. ..$ X4: chr [1:3] "73/478" "73/477" "73/475"
.. ..$ X5: chr [1:3] "Machine for balancing heavy bodies" "Rotor balance testing machine" "BALANCE TESTING APPARATUS HEAD"
$ http://www.sumobrain.com/patents/us/Encoder-with-wide-index/4982189.html:List of 1
..$ :'data.frame':  8 obs. of  5 variables:
.. ..$ X1: chr [1:8] "3500449A" "4212000A" "4233592A" "4524347A" ...
.. ..$ X2: chr [1:8] "1970-03-10" "1980-07-08" "1980-11-11" "1985-06-18" ...
.. ..$ X3: chr [1:8] "Lenz" "Yamada" "Leichle" "Rogers" ...
.. ..$ X4: chr [1:8] "341/6" "341/16" "341/6" "341/3" ...
.. ..$ X5: chr [1:8] "ELECTRONIC ENCODER INDEX" "Position-to-digital encoder" "Method for detection of the angular position of a part driven in rotation and instrumentation using it" "Position encoder" ...
$ http://www.sumobrain.com/patents/us/Device-for-detecting-at-least-one-variable-relating-to-the-movement-of-a-movable-body/4982106.html:List of 1
..$ :'data.frame':  2 obs. of  5 variables:
.. ..$ X1: chr [1:2] "3956973A" "4797564A"
.. ..$ X2: chr [1:2] "1976-05-18" "1989-01-10"
.. ..$ X3: chr [1:2] "Pomplas" "Ramunas"
.. ..$ X4: chr [1:2] "92/5R" "307/119"
.. ..$ X5: chr [1:2] "Die casting machine with piston positioning control" "Robot overload detection mechanism"

I would like to select only the first and fifth elements (X1 and X5) from all data frames, to later construct a further dataset with only these two elements.

I have tried to grab X1 with this:

citations_IPC <- sapply(IPCs, function(x){
y<-x[,1]
return(y)
})

and X5 with:

citations_titles <- sapply(IPCs[[1]], function(z){
e<-z[,5]
return(e)
})

Then I convert citations_IPC and citations_titles into a single data frame with:

citation_list <-  data.frame(IPC = unlist(lapply(citations_IPC, paste)), title = unlist(lapply(citations_titles, paste)) )

Problem

If I write the sapply function on an individual list (e.g. IPCs[[1]]) I get the result I want:

citations_IPC <- sapply(IPCs[[1]], function(x){
y<-x[,1]
return(y)
})

result:

> citations_IPC
      [,1]      
 [1,] "3415985A"
 [2,] "3916190A"
 [3,] "4088895A"
 [4,] "4633084A"
 [5,] "4670651A"
 [6,] "4860224A"

However, this function doesn't work on the whole list (IPCs). The error I get is: "Error in x[, 1] : incorrect number of dimensions"

I am guessing the problem might be due to a few elements of my list that contain no data frame (no observations and no variables). In that case I would need a function that lets me use sapply() on the whole list despite the elements without a data frame.

Any suggestions would be really appreciated.

Many thanks

str(IPCs)

> str(IPCs)
 List of 19
 $ http://www.sumobrain.com/patents/us/Method-and-apparatus-for-the-quantitative,-depth-differential-analysis-of-solid-samples-with-the-use-of-two-ion-beams/4982090.html       :List of 1
  ..$ :'data.frame':    6 obs. of  5 variables:
  .. ..$ X1: chr [1:6] "3415985A" "3916190A" "4088895A" "4633084A" ...
  .. ..$ X2: chr [1:6] "1968-12-10" "1975-10-28" "1978-05-09" "1986-12-30" ...
  .. ..$ X3: chr [1:6] "Castaing et al." "Valentine et al." "Martin" "Gruen et al." ...
  .. ..$ X4: chr [1:6] "250/309" "250/309" "250/309" "250/309" ...
  .. ..$ X5: chr [1:6] "Ionic microanalyzer wherein secondary ions are emitted from a sample surface upon bombardment by neutral atoms" "Depth profile analysis apparatus" "Memory device utilizing ion beam readout" "High efficiency direct detection of ions from resonance ionization of sputtered atoms" ...
 $ http://www.sumobrain.com/patents/us/Set-on-oscillator/4982165.html:List of 1
  ..$ :'data.frame':    2 obs. of  5 variables:
  .. ..$ X1: chr [1:2] "4437066A" "4558282A"
  .. ..$ X2: chr [1:2] "1984-03-13" "1985-12-10"
  .. ..$ X3: chr [1:2] "Gordon" "Lowenschuss"
  .. ..$ X4: chr [1:2] "328/14" "307/523"
  .. ..$ X5: chr [1:2] "Apparatus for synthesizing a signal by producing samples of such signal at a rate less than the Nyquist sampling rate" "Digital frequency synthesizer"
 $ http://www.sumobrain.com/patents/us/Voltage-measuring-apparatus/4982151.html:List of 1
  ..$ :'data.frame':    7 obs. of  5 variables:
  .. ..$ X1: chr [1:7] "3419802A" "3419803A" "4446425A" "4603293A" ...
  .. ..$ X2: chr [1:7] "1968-12-31" "1968-12-31" "1984-05-01" "1986-07-29" ...
  .. ..$ X3: chr [1:7] "Pelenc et al." "Pelenc et al." "Valdmanis et al." "Mourou et al." ...
  .. ..$ X4: chr [1:7] "324/96" "324/96" "" "" ...
  .. ..$ X5: chr [1:7] "Apparatus for current measurement by means of the faraday effect" "Apparatus for current measurement by means of the faraday effect" "Measurement of electrical signals with picosecond resolution" "Measurement of electrical signals with subpicosecond resolution" ...

missuse · Accepted Answer · 2017-10-14 15:07:14Z

3

Here is an example:

First, let's make a list with some iris columns:

data(iris)
lis = list(iris[1:3], iris[2:4])

Use lapply with a custom function to extract columns 1 and 2 from each data frame. If the columns are not named the same across data frames, rename them so that rbind works in the next step:

b = lapply(lis, function(x){
  z = x[,c(1,2)]
  colnames(z) = c("z1", "z2")
  return(z)
}
)

Now b is a list of only the columns you wish.

rbind the data frames in b:

do.call(rbind, b)

done
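
In the question's str(IPCs) output, each top-level element is itself a list holding one data frame, so x[, 1] is applied to a plain list rather than a data frame, which is the likely source of "incorrect number of dimensions". A minimal sketch of the same lapply/rbind idea for that nested shape follows. IPCs_toy and make_df are hypothetical stand-ins, and the empty element b is only an assumption about what an entry without a data frame might look like:

```r
# Hypothetical helper: build an n-row character data frame with columns X1..X5
make_df <- function(n) {
  df <- as.data.frame(matrix(as.character(seq_len(n * 5)), nrow = n),
                      stringsAsFactors = FALSE)
  names(df) <- paste0("X", 1:5)
  df
}

# Toy stand-in for IPCs: each element wraps one data frame; "b" is empty
IPCs_toy <- list(a = list(make_df(3)), b = list(), c = list(make_df(2)))

# Keep only the elements whose first entry really is a data frame
has_df <- vapply(IPCs_toy,
                 function(x) length(x) > 0 && is.data.frame(x[[1]]),
                 logical(1))

# Unwrap the inner data frame with [[1]] and pick columns X1 and X5
picked <- lapply(IPCs_toy[has_df], function(x) x[[1]][, c("X1", "X5")])

# Stack everything into one data frame
citation_list <- do.call(rbind, picked)
names(citation_list) <- c("IPC", "title")
```

With the real data, replacing IPCs_toy by IPCs should be enough, provided every non-empty element has the columns X1 and X5.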


Rui Barradas · Accepted Answer · 2017-10-16 15:48:38Z

1

Here is a way to do what I understand of your question.
First some fake data.

op <- options(stringsAsFactors = FALSE)  # to make sure we have characters not factors
set.seed(9506)

nr <- c(6, 2, 7)
IPCs <- lapply(1:3, function(n){
        res <- as.data.frame(replicate(5, sample(LETTERS, nr[n], TRUE)))
        names(res) <- paste0("X", 1:5)
        res
})
names(IPCs) <- paste0("df", seq_along(IPCs))
str(IPCs)
options(op)   # put it back as it was

Now the code to extract the 1st and 5th columns of each data.frame and paste them together in order to form a df.

result <- list(
    sapply(IPCs, `[[`, 1),
    sapply(IPCs, function(x) x[[ncol(x)]])
)
result <- as.data.frame(lapply(result, function(x) sapply(x, paste, collapse = "")))
names(result) <- c("citations_IPC", "citations_titles")
result
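
The code above collapses each data frame's column into a single string, giving one row per data frame. If one row per citation is wanted instead, a possible variant is sketched below. It assumes the nested shape shown in the question's str(IPCs) output (each element is a one-element list wrapping a data frame). IPCs_toy and its values are made up for illustration, and the toy data frames carry only the two columns of interest:

```r
# Hypothetical stand-in shaped like the question's str(IPCs) output
IPCs_toy <- list(
  url_a = list(data.frame(X1 = c("p1", "p2"), X5 = c("title 1", "title 2"),
                          stringsAsFactors = FALSE)),
  url_b = list(data.frame(X1 = "p3", X5 = "title 3",
                          stringsAsFactors = FALSE))
)

# Unwrap the inner data frame from each one-element list
inner <- lapply(IPCs_toy, `[[`, 1)

# One row per citation, tagged with the name of the element it came from
result_long <- data.frame(
  source           = rep(names(inner), vapply(inner, nrow, integer(1))),
  citations_IPC    = unlist(lapply(inner, `[[`, "X1"), use.names = FALSE),
  citations_titles = unlist(lapply(inner, `[[`, "X5"), use.names = FALSE),
  stringsAsFactors = FALSE
)
result_long
```

The source column keeps track of which list element (the URL, in the real data) each citation came from, which the collapsed version loses.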

edited Oct 16, 2017 at 15:48

answered Oct 14, 2017 at 15:55


11 Comments

Amleto Over a year ago

This looks like a really good solution, but unfortunately it does not work for my dataset. I get the error: "Error in FUN(X[[i]], ...) : subscript out of bounds". Might it be that within my dataset there are lists with no variables, and therefore I get this error?

Rui Barradas Over a year ago

@Amleto So in your dataset there are lists with no variables? Can you update the question with the output of str(IPCs)? The dataset I've made up has the same structure as your post.

Amleto Over a year ago

I think the problem with my dataset is that I have a number of lists inside a list. I have added the output of str(IPCs) above.

Rui Barradas Over a year ago

@Amleto No, that was not the problem. The problem was that in your first post of str your df's all had the same number of rows and now they don't. Give me just a few minutes to think about this

Amleto Over a year ago

The df's variables are always 5 while the observations change, but that is not the problem. Sorry
