I am new to R and to programming in general, but I have two pieces of code that have run into a similar problem. Here we go...

(A)

I currently have a function that returns record(s) of a patient, trial number, and other information. It looks like this:

     ID trial     start   finish     mark     mean    number
903 A34    19     90910 18775077     8236  -0.0197  1.972876
904 A34    19  18782377 23089165     2343   0.0374  2.052525
905 A34    19  23093018 43203507    10267  -0.0162  1.977668
906 A34    19  43203990 43447468       93   0.2138  2.319478
907 A34    19  43447802 43663369      112  -0.0355  1.951387
908 A34    19  43663624 43834506       80  -0.5385  1.376973
909 A34    19  43834848 59097854     8655  -0.0095  1.986873

Below is the code I have written for it.

getRS <- function(CNA, samples = NULL, trial = NULL){
  race <- racing.summary(subset(CNA, samplelist = samples, triallist = trial))
  race$number <- (2^race$mean)*2
  return(race)
}

I am wondering whether it is possible to use this output in a new function to do some simple arithmetic. I would like to subtract 'start' from 'finish' to get a 'length' for each record and report the largest one as 'max.length', create a new 'mean' by averaging all the means above, and extract the largest 'number' as 'max.number', without displaying 'mark' at all.

An output similar to this:

ID    trial     max.length          mean    max.number
A34       19       20110489   -0.05260000     2.3194777

AND/OR

(B)

I have an alternative function that creates a data frame of ALL the patients with the already calculated data. I used this code:

getSum <- function (){
  # One summary per ID/trial group; 'df' is read from the calling environment.
  # Column names follow the output shown in (A): 'start' and 'finish'.
  race_mean <- as.data.frame(df %>% group_by(ID, trial) %>% summarise(mean = mean(mean)))
  race_length <- as.data.frame(df %>% group_by(ID,trial) %>% summarise(max.length = max(finish - start)))
  race_number <- as.data.frame(df %>% group_by(ID,trial) %>% summarise(max.number = max(number)))
  # Merge the three summaries on their shared columns (ID and trial)
  race_m_l_merge <- as.data.frame(merge(x = race_length, y = race_mean))
  race_m_l_n_merge <- as.data.frame(merge(x = race_m_l_merge, y = race_number))
  # Sort by trial and open the result in the data viewer
  ordered_summary <- as.data.frame(race_m_l_n_merge[order(race_m_l_n_merge$trial),])
  View(ordered_summary)
}

Which gives an output like this:

      ID trial    max.length         mean       max.number
1    A22     1      96637812   -1.648909e-01     2.6989533
25   A23     1     101363101   -6.275455e-02     2.2468441
49   A24     1      72598875   -5.878000e-02     2.8204004
73   A25     1     112628591   -3.346917e-01     2.0675182
97   A26     1      55490417    7.621429e-02     2.4512200
121  A28     1     130879821   -4.218571e-02     2.0679481
145  A29     1      72590096   -3.093417e-01     2.3450196
169  A30     1      32642030    4.242500e-02     2.6375528
193  A32     1      34350731   -8.188372e-02     2.1149155
217  A33     1      77537981   -1.305833e-01     2.1125713

With this, I would like to create a function that lets me specify which ID and which trial to look up, like so: Function("A22",1).

I'm hoping my R script will be general enough to reuse in future work, so any help on question A, B, or both would be much appreciated! Links to helpful websites are welcome too. :)

1 Answer

If you have already defined your functions getRS and getSum, then you can call them inside a new function.

You just have to make two changes to getSum: let it take the data frame as an argument, getSum <- function(df), and change the line that contains View(ordered_summary) to return(ordered_summary), or simply ordered_summary, so that it returns an object you can manipulate further.

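lookup_function <- function(data_lookup, id_lookup, trial_lookup) {
  # Build the per-record table, then the per-ID/trial summary
  data_df <- getRS(CNA = data_lookup)
  summary_df <- getSum(df = data_df)
  # Keep only the row(s) for the requested ID and trial
  subset(x = summary_df, subset = (ID == id_lookup & trial == trial_lookup))
}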

You can write this function in a concise way, if you feel inclined to do so.

lookup_function <- function(data_lookup, id_lookup, trial_lookup) {
  subset(x = getSum(getRS(data_lookup)), subset = (ID == id_lookup & trial == trial_lookup))
}
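
To see the lookup step in isolation, here is a minimal sketch that skips getRS and getSum and filters a small hand-built summary table instead. The three rows are copied from the output in the question; the names summary_df and lookup_summary are made up for this illustration, and only base R is used.

```r
# Toy summary table: three rows copied from the output shown in the question.
# In real use this would be the data frame returned by getSum().
summary_df <- data.frame(
  ID         = c("A22", "A23", "A24"),
  trial      = c(1, 1, 1),
  max.length = c(96637812, 101363101, 72598875),
  mean       = c(-1.648909e-01, -6.275455e-02, -5.878000e-02),
  max.number = c(2.6989533, 2.2468441, 2.8204004),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the row(s) matching one ID and one trial.
lookup_summary <- function(summary_data, id_lookup, trial_lookup) {
  keep <- summary_data$ID == id_lookup & summary_data$trial == trial_lookup
  summary_data[keep, , drop = FALSE]
}

result <- lookup_summary(summary_df, "A22", 1)
print(result)
```

Plain logical indexing is used here rather than subset() because subset() evaluates its condition inside the data frame first, so an argument that happens to share a name with a column (for example trial) would silently be compared with itself.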

Or, if you don't want to have three different functions, you can create a function that has getRS and getSum defined inside itself.

library(dplyr)  # provides %>%, group_by() and summarise()

lookup_function <- function(data_lookup, id_lookup, trial_lookup) {
  getRS <- function(CNA, samples = NULL, trial = NULL){
    race <- 
      racing.summary(subset(CNA, samplelist = samples, triallist = trial))
    race$number <- 
      (2 ^ race$mean) * 2

    race
  }

  getSum <- function(df) {
    unordered_summary <- 
      df %>% 
      group_by(ID, trial) %>% 
      summarise(mean = mean(mean),
                max.length = max(finish - start),
                max.number = max(number)) %>% 
      data.frame()

    ordered_summary <- 
      data.frame(unordered_summary[order(unordered_summary$trial), ])

    ordered_summary
  }

  data_df <- getRS(CNA = data_lookup)

  summary_df <- getSum(df = data_df)

  subset(x = summary_df, subset = (ID == id_lookup & trial == trial_lookup))
}

I have edited the code for getSum, as I didn't see a reason to call summarise three times instead of once. You can use your own function, of course, as I don't know the particulars of your task at hand.
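
For part (A), the same three summaries can also be computed directly on the data frame that getRS returns for a single patient and trial. The following is only a sketch in base R: the data are retyped from the example output in the question, summarise_race is a made-up name, and it assumes the input holds exactly one ID/trial combination.

```r
# Rows copied from the getRS() output shown in the question (patient A34, trial 19).
race <- data.frame(
  ID     = "A34",
  trial  = 19,
  start  = c(90910, 18782377, 23093018, 43203990, 43447802, 43663624, 43834848),
  finish = c(18775077, 23089165, 43203507, 43447468, 43663369, 43834506, 59097854),
  mark   = c(8236, 2343, 10267, 93, 112, 80, 8655),
  mean   = c(-0.0197, 0.0374, -0.0162, 0.2138, -0.0355, -0.5385, -0.0095),
  number = c(1.972876, 2.052525, 1.977668, 2.319478, 1.951387, 1.376973, 1.986873),
  stringsAsFactors = FALSE
)

# Hypothetical helper: collapse the records of ONE patient/trial into one row.
# 'mark' is simply not carried over into the result.
summarise_race <- function(race) {
  data.frame(
    ID         = race$ID[1],
    trial      = race$trial[1],
    max.length = max(race$finish - race$start),  # longest finish - start
    mean       = mean(race$mean),                # mean of the per-record means
    max.number = max(race$number),               # largest 'number'
    stringsAsFactors = FALSE
  )
}

race_summary <- summarise_race(race)
print(race_summary)
```

With the real data you would pass in the result of getRS(CNA, samples, trial) instead of the hand-typed race data frame.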

1 Comment

Thank you so much! I appreciate you taking the time to do this and really like the suggestions of editing the current code I have!
