3

I have a dataframe df of soccer team information by game (MATCHID) with these initial values

 TEAMID Venue LEAGUEPOS MATCHID
 WHU     A         5       1
 COV     H        12       1
 EVE     H        15       2
 MNU     A         2       2
 ARS     A         3       3
 LEI     H         4       3

I wish to create just one row for each game so that it would end up looking like

MATCHID HomeTeam AwayTeam HomePos AwayPos
   1       COV      WHU     12      5      etc.

so I want to create some new columns , delete others and remove duplicated rows.

I am having trouble with first stage trying

df$HomeTeam <- df$TEAMID[df$Venue == "H"]

as this produces

 TEAMID Venue LEAGUEPOS MATCHID HomeTeam
   WHU     A         5       1      COV
   COV     H        12       1      EVE
   EVE     H        15       2      LEI
   MNU     A         2       2      STH
   ARS     A         3       3      TOT
   LEI     H         4       3      WIM

With the HomeTeam just showing the sequential TEAMID for each record with a Venue = H

2 Answers 2

5

This can be easily achieved using the function reshape which is a part of base R.

# READ DATA
mydf = read.table(textConnection("
TEAMID Venue LEAGUEPOS MATCHID
 WHU     A         5       1
 COV     H        12       1
 EVE     H        15       2
 MNU     A         2       2
 ARS     A         3       3
 LEI     H         4       3"), 
 sep = "", header = T, colClasses = rep('character', 4))

# RESHAPE DATA
reshape(mydf, idvar = 'MATCHID', timevar = 'Venue', direction = 'wide')

Here is the output produced

  MATCHID TEAMID.A LEAGUEPOS.A TEAMID.H LEAGUEPOS.H
1       1      WHU           5      COV          12
3       2      MNU           2      EVE          15
5       3      ARS           3      LEI           4

NOTE: An alternate way to do this is to use cast and melt functions from reshape package.

require(reshape)
mydf_m = melt(mydf, id = c('MATCHID', 'Venue'))
cast(mydf_m, MATCHID ~ Venue + variable)
Sign up to request clarification or add additional context in comments.

Comments

1

reshape() in base R does what you want, if a little clunkily. Here is your data:

con <- textConnection(" TEAMID Venue LEAGUEPOS MATCHID
 WHU     A         5       1
 COV     H        12       1
 EVE     H        15       2
 MNU     A         2       2
 ARS     A         3       3
 LEI     H         4       3
")
dat <- read.table(con, header = TRUE, stringsAsFactors = FALSE)
close(con)

We reshape() this, get the columns in the requested order, and update the columns names:

newdat <- reshape(dat, direction = "wide", timevar = "Venue", idvar = "MATCHID")
## reorder
newdat <- newdat[, c(1,4,2,5,3)]
names(newdat) <- c("MatchID","HomeTeam","AwayTeam","HomePos","AwayPos")

This gives us:

> newdat
  MatchID HomeTeam AwayTeam HomePos AwayPos
1       1      COV      WHU      12       5
3       2      EVE      MNU      15       2
5       3      LEI      ARS       4       3

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.