I have a data frame read from a .csv file in which one of the columns is a ZIP code. The ZIP code column is a factor. Here is an example:

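Country <- c("US", "US", "US", "CAN", "CAN")
# ZIP codes must be quoted strings: bare 00210 loses its leading zeros and H3P is not a valid number
ZIP <- c("00210", "01210", "65483.0", "H3P", "H3P3C")
data <- data.frame(Country, ZIP)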

I did the following but the output is not what I want:

data$ZIP<-round(as.numeric(as.character(data$ZIP)), 0) 

Although it removed the decimals, the zip codes 00210 and 01210 became 210 and 1210, and the Canadian zip codes became NA. I want to keep the US zip codes at 5 digits and preserve the Canadian zip codes as they are.

How can I do that?

Thank you.

1 Answer

Try this

data$ZIP <- sub("\\.\\d+$", "", data$ZIP)

#   Country   ZIP
# 1      US 00210
# 2      US 01210
# 3      US 65483
# 4     CAN   H3P
# 5     CAN H3P3C
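Since the data comes from a .csv file, the leading zeros can also be protected at read time. This is only a sketch: the inline `csv_text` is a hypothetical stand-in for the real file, and `colClasses` is used to ask `read.csv` to keep the ZIP column as character instead of guessing a type.

```r
# Hypothetical stand-in for the real .csv file
csv_text <- "Country,ZIP
US,00210
US,01210
US,65483.0
CAN,H3P
CAN,H3P3C"

# Read ZIP as character so it is never converted to a number or a factor
data <- read.csv(text = csv_text, colClasses = c(ZIP = "character"))

# Drop a trailing decimal part such as ".0"; other values are left untouched
data$ZIP <- sub("\\.\\d+$", "", data$ZIP)
```

With a real file, `text = csv_text` would be replaced by the file path.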

Explanation

From the help page, a typical usage of sub is

sub(pattern, replacement, x)

x is a character vector where matches are sought...

In our case x is the ZIP column (the values of the ZIP column, to be specific).

The pattern is ("\\.\\d+$"):

\\. matches a literal dot (an unescaped . would match any character)

\\d+ matches one or more digits

$ matches the end of the input string.

The replacement is "" (an empty string). So the dot and the digits that follow it, up to the end of the string, are replaced with nothing, which removes them.
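To see how the pieces of the pattern interact, here is a small sketch on a made-up vector (the values in `zips` are illustrative, not from the real data). Only strings that end in a dot followed by at least one digit are changed.

```r
# Illustrative inputs: plain codes, a code with a decimal part,
# Canadian codes, and a code ending in a bare dot
zips <- c("00210", "01210", "65483.0", "H3P", "H3P3C", "65483.")

# Remove a dot plus one or more digits, but only at the end of the string
cleaned <- sub("\\.\\d+$", "", zips)

print(cleaned)
```

The last value, "65483.", is left as is because \\d+ needs at least one digit after the dot.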

For example

sub("\\.\\d+$", "", 21358.222)
# "21358"

Hope that helps.


2 Comments

Yes, your edited code works. I also converted my zip variable from factor to character. For future understanding, can you briefly explain, or point to some source on, how the pattern "\\.\\d+$" resolves this issue?
The help page, which can be accessed using ?sub, can give you more info. I'll update the answer if you want to understand how the pattern replaces the decimal part with an empty string.
