I am trying to sample 50 taxa at random from my original dataframe, which contains 100 taxa, to build a new dataframe. For the 50 randomly selected taxa I want to keep the information in all 4 columns. A subset of my original dataframe (high.diversity) looks like this:

                           Taxon              C     N    func.group
1         Curculionidae.Ischapterapion.sp. -29.06  2.19  herbivore
2         Curculionidae.Ischapterapion.sp. -29.27  1.60  herbivore
3              Curculionidae.Protapion.sp. -28.45  1.91  herbivore
4              Curculionidae.Protapion.sp. -25.99  0.55  herbivore
5              Curculionidae.Protapion.sp. -28.27  1.52  herbivore
6              Curculionidae.Hypera.meles  -25.41  3.38  herbivore
7                Curculionidae.Sitona.sp.  -27.05  2.01  herbivore
8                Curculionidae.Sitona.sp.  -26.70  3.07  herbivore
.....
230

Each taxon has between 1 and 5 replicates, so I have 100 taxa but 230 data points (e.g. Curculionidae.Ischapterapion.sp. has 2 replicates in the table above).

I have successfully sampled 50 rows at random using the following code:

new.df<-high.diversity[sample(nrow(high.diversity),50),]

However, that code gives 50 rows, whereas what I actually want is to select 50 taxa at random and keep all replicates for each of them (50 taxa, each with multiple replicates, might give nearer to 100 rows). So I need to change the code above to select 50 random taxa and include all replicates within those taxa.

Could anyone suggest how I might achieve this?

Thanks very much,

M

1 Answer

Sample from the unique values of Taxon, then subset your data.frame to the rows belonging to those taxa:

df <- read.table(header = TRUE, stringsAsFactors=FALSE, text = '                          Taxon              C     N    func.group
1         Curculionidae.Ischapterapion.sp. -29.06  2.19  herbivore
2         Curculionidae.Ischapterapion.sp. -29.27  1.60  herbivore
3              Curculionidae.Protapion.sp. -28.45  1.91  herbivore
4              Curculionidae.Protapion.sp. -25.99  0.55  herbivore
5              Curculionidae.Protapion.sp. -28.27  1.52  herbivore
6              Curculionidae.Hypera.meles  -25.41  3.38  herbivore
7                Curculionidae.Sitona.sp.  -27.05  2.01  herbivore
8                Curculionidae.Sitona.sp.  -26.70  3.07  herbivore')

set.seed(1234)
take <- sample(unique(df$Taxon), 2)
df[df$Taxon %in% take, ]
                             Taxon      C    N func.group
1 Curculionidae.Ischapterapion.sp. -29.06 2.19  herbivore
2 Curculionidae.Ischapterapion.sp. -29.27 1.60  herbivore
3      Curculionidae.Protapion.sp. -28.45 1.91  herbivore
4      Curculionidae.Protapion.sp. -25.99 0.55  herbivore
5      Curculionidae.Protapion.sp. -28.27 1.52  herbivore
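
Scaled up to the question's case (50 of 100 taxa), the same idea might look like the sketch below. The data frame here is a made-up stand-in for high.diversity, since the full data are not shown: the taxon names, the C and N values, and the seed are all assumptions for illustration only.

```r
# Hypothetical stand-in for high.diversity: 100 taxa, each with 1-5 replicates
set.seed(42)
taxa <- paste0("Taxon.", 1:100)              # made-up taxon names
reps <- sample(1:5, 100, replace = TRUE)     # replicates per taxon
high.diversity <- data.frame(
  Taxon      = rep(taxa, times = reps),
  C          = round(rnorm(sum(reps), mean = -27, sd = 1.5), 2),
  N          = round(runif(sum(reps), min = 0, max = 4), 2),
  func.group = "herbivore",
  stringsAsFactors = FALSE
)

# Draw 50 taxa at random (without replacement), then keep all of their rows
take   <- sample(unique(high.diversity$Taxon), 50)
new.df <- high.diversity[high.diversity$Taxon %in% take, ]

length(unique(new.df$Taxon))  # 50 taxa
nrow(new.df)                  # at least 50 rows, depending on replicate counts
```

Because the sampling is done on unique(high.diversity$Taxon) rather than on row indices, every taxon has the same chance of being chosen regardless of how many replicates it has, and all four columns are kept for every selected row.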