I'm trying to insert a "+" symbol into the middle of a postcode. The postcodes follow the pattern AA111AA or AA11AA. I want the "+" inserted before the final digit, giving an output of either AA11+1AA or AA1+1AA. I've found a way to do this using stringr, but it feels like there should be an easier way than how I'm currently doing it. Below is my code.

library(stringr)

pc <- "bt43xx"

pc <- str_c(
      str_sub(pc, start = 1L, end = -4L), 
      "+", 
      str_sub(pc, start = -3L, end = -1L)
      )

pc
[1] "bt4+3xx"

3 Answers

Here are some alternatives. All of them work whether pc is a scalar or a vector, and none needs a package. Of these, (3) seems particularly short and simple.

1) Match everything (.*) up to the last digit (\\d), then replace the match with the first capture (i.e. whatever matched the first set of parens), a plus, and the second capture (i.e. the last digit).

sub("(.*)(\\d)", "\\1+\\2", pc)
## [1] "bt4+3xx"

2) An even shorter alternative is to match the first digit that is immediately followed by a non-digit and replace it with a plus followed by the match:

sub("(\\d\\D)", "+\\1", pc)
## [1] "bt4+3xx"

3) This one is even shorter than (2). It matches the last 3 characters replacing the match with a plus followed by the match:

sub("(...)$", "+\\1", pc)
## [1] "bt4+3xx"
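As a quick check of the claim that these work on vectors, here is a minimal sketch that runs (1) to (3) over both postcode lengths from the question. The vector pcs and the names r1, r2, r3 are made up for illustration. Note that (2) assumes only letters follow the final digit and (3) assumes exactly two characters follow it, which holds for the patterns in the question.

```r
# Hypothetical test vector covering both postcode lengths from the question
pcs <- c("bt43xx", "bt433xx")

r1 <- sub("(.*)(\\d)", "\\1+\\2", pcs)  # (1) greedy prefix, then the last digit
r2 <- sub("(\\d\\D)", "+\\1", pcs)      # (2) first digit followed by a non-digit
r3 <- sub("(...)$", "+\\1", pcs)        # (3) last three characters

print(r1)
## [1] "bt4+3xx"  "bt43+3xx"

# All three approaches agree on this input
identical(r1, r2) && identical(r2, r3)
## [1] TRUE
```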

4) This one splits the string into individual characters, inserts a plus in the appropriate position using append and puts the characters back together.

sapply(Map(append, strsplit(pc, ""), after = nchar(pc) - 3, "+"), paste, collapse = "")
## [1] "bt4+3xx"

If pc were known to be a scalar (as is the case in the question) it could be simplified to:

paste(append(strsplit(pc, "")[[1]], "+", nchar(pc) - 3), collapse = "")
## [1] "bt4+3xx"
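If you prefer the split-and-append idea in (4) but want it reusable, one possible sketch is to wrap it in a small function. The name insert_plus is hypothetical, and vapply is swapped in for sapply/Map only so the return type is always a character vector.

```r
# insert_plus is a hypothetical helper name, not part of the original answer
insert_plus <- function(pc) {
  chars <- strsplit(pc, "")  # one character vector per input string
  vapply(chars, function(ch) {
    # place "+" before the last three characters
    paste(append(ch, "+", after = length(ch) - 3), collapse = "")
  }, character(1))
}

insert_plus(c("bt43xx", "bt433xx"))
## [1] "bt4+3xx"  "bt43+3xx"
```

Like (3), this assumes the final digit is always the third character from the end.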

This regular expression with sub and two back references should work.

sub("(\\d?)(\\d\\D*)$", "\\1+\\2", pc)
[1] "bt4+3xx"
  • \\d? matches zero or one digit (0-9) and is captured by the first (). It is non-empty when the final digit is preceded by another digit.
  • \\d\\D* matches a digit followed by any run of non-digit characters, and is captured by the second ()
  • $ anchors the regular expression to the end of the string
  • "\\1+\\2" replaces the matched elements in the first two points with themselves and a "+" in the middle.
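To see what each group captures, here is a small sketch using regexec and regmatches from base R. It assumes the pattern is written with \\D for the non-digit class; the names pat and m are made up for illustration.

```r
# Pattern written with \\D for "non-digit" (an assumption of this sketch)
pat <- "(\\d?)(\\d\\D*)$"

# regmatches() on a regexec() result returns the full match followed by each capture
m <- regmatches("bt433xx", regexec(pat, "bt433xx"))[[1]]
m
## [1] "33xx" "3"    "3xx"

# Rebuilding the string from the captures gives the same result as sub()
sub(pat, "\\1+\\2", c("bt43xx", "bt433xx"))
## [1] "bt4+3xx"  "bt43+3xx"
```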

2 Comments

This doesn't seem to fit OP's need as the + is after the digits. I would do sub("(.*)(\\d\\D{2})", "\\1+\\2", pc)
Ugh. I missed that. Thank you for the catch.
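# Perl-style lookahead: match the digit that is followed only by non-digits up to the end of the string
sub('(\\d)(?=\\D+$)', '+\\1', pc, perl = TRUE)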
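For what it's worth, here is a minimal sketch of the lookahead version applied to both postcode lengths. The vector pcs and the name res are hypothetical. The lookahead (?=\\D+$) only checks what follows the digit without consuming it, so just the digit itself is replaced; perl = TRUE is required because the default engine does not support lookaheads.

```r
# Hypothetical test vector covering both postcode lengths from the question
pcs <- c("bt43xx", "bt433xx")

# Put "+" in front of the one digit that has only non-digits after it
res <- sub("(\\d)(?=\\D+$)", "+\\1", pcs, perl = TRUE)
res
## [1] "bt4+3xx"  "bt43+3xx"
```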
