I want to see how a model performs when I make the variable 'year' piecewise linear. I know there are automatic methods to define this within the model and to search for the best cut-point. Still, I prefer building the piecewise variables myself, as that is more transparent to me, and I think the solution to this problem can help on other occasions as well.

So I want to make variables defined like

  year1997up<-0
  year1997up[year>1997]<-year[year>1997]-1997
  year1997up[year<=1997]<-rep(0,sum(year<=1997))
  year1997down<-0
  year1997down[year<1997]<-year[year<1997]-1995
  year1997down[year>=1997]<-rep(2,sum(year>=1997))

This splits year into two piecewise-linear terms with cut-point 1997.

I want to do this for every year from 1997 to 2011. To automate the process, I wrote a function:

piece.var.fun<-function(up,down,i,data){

  within(data,{

    up<-0
    up[year>=i]<-year[year>=i]-i
    up[year<i]<-rep(0,sum(year<i))


    down<-0
    down[year<=i]<-year[year<=i]-1995 
    down[year>i]<-rep(i-1995,sum(year>i))
})
}


test.dataset<-piece.var.fun(up="year2000up",down="year2000down",data=StartM,i=2000)

The idea was to use this function with mapply on vectors containing the names I want. However, the new variables are simply called up and down instead of year2000up and year2000down. Since they all get the same names, I can't use the function to create the variables for different years.

So, how can I use a function like this and make the names of the variables include the year?

1 Answer

Use assign:

yr <- 1995
varname <- sprintf('year%idown', yr)
# ... compute `down` as before, then store it under that name:
assign(varname, down)
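
As a minimal, self-contained sketch of this idea: the toy `year` vector and the cut-point 1997 below are assumptions for illustration, not the asker's data, and `down` follows the rule used in the question's function.

```r
# Toy data standing in for the real `year` column (an assumption).
year <- 1995:2000
yr <- 1997

# Build the variable name as a string, here "year1997down".
varname <- sprintf("year%idown", yr)

# Rising part up to the cut-point, constant afterwards (rule from the question).
down <- ifelse(year <= yr, year - 1995, yr - 1995)

# Create a variable whose name is the string stored in `varname`.
assign(varname, down)

# Look the variable up again by its name.
get(varname)
```

Called at top level, assign creates the variable in the global environment; get(varname) retrieves it when only the name is known as a string.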

If year is sorted with one row per consecutive year, you can create your up a little more easily, with something like

up <- cumsum(year > i)
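
A sketch that does not depend on row order or on one row per year, assuming year is numeric, uses pmax and pmin instead. The toy vector is an assumption, and 1995 is the scaling offset from the question.

```r
# Unsorted toy data with a repeated year (an assumption for illustration).
year <- c(1999, 1995, 2001, 1997, 1999)
i <- 1997

# 0 up to the cut-point, then the number of years past it.
up <- pmax(year - i, 0)

# Years since 1995 up to the cut-point, then held constant at i - 1995.
down <- pmin(year, i) - 1995

up
down
```

On data like this, cumsum(year > i) would give a running count of rows rather than year - i, which is why the elementwise version may be the safer default.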

Aside: your down doesn't make much sense to me - why the hard-coded 1995? and why do you stick the '2' on the end? I imagine you might be able to construct it similarly to up depending on what you want.

Another aside: if you already construct your up and down inside piece.var.fun using i, there is no need to pass the variable names "year2000up" and so on into the function. But anyway, that is peripheral to your question.

But in any case, to answer your question, to include the year in the variable name you create a string with the variable name and use assign.
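
Putting the pieces together, one possible sketch is shown here. It swaps assign for `[[<-` on the data frame, which adds a column under a name held in a string. The function name add_piecewise, its base argument, and the toy StartM are hypothetical stand-ins, not part of the original question.

```r
# Add piecewise-linear versions of `year` for one cut-point `i`.
# `base` (1995 in the question) only shifts the scale of the "down" term.
add_piecewise <- function(data, i, base = 1995) {
  data[[sprintf("year%dup", i)]]   <- pmax(data$year - i, 0)
  data[[sprintf("year%ddown", i)]] <- pmin(data$year, i) - base
  data
}

# Toy stand-in for the asker's StartM data frame (an assumption).
StartM <- data.frame(year = 1995:2011)

# Add the pair of columns for every cut-point from 1997 to 2011.
result <- StartM
for (i in 1997:2011) {
  result <- add_piecewise(result, i)
}

names(result)
```

Because the names are derived from i inside the function, nothing but the cut-point has to be passed in, and each year gets its own pair of columns such as year2000up and year2000down.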

1 Comment

The 1995 is just to scale it (if I want to use it in an interaction term, that's necessary to keep the multicollinearity under control). About the '2': you're right, that's just plain silly, I can't believe I wrote that. Thanks for pointing that out.
