1

I want the output of $var below to be John D.

my $var = "John Doe";

I have tried $var =~ s/(.+\b.).+\z],'\1.'//g;

  • And what do you want if it is "Lester del Rey"? Commented Jun 13, 2013 at 15:48
  • Not really an issue here; it could simply be Lester D then. Commented Jun 13, 2013 at 17:51

6 Answers

3

Here's a general solution (feel free to swap in '\w' where I used '.', and a single \s where I used \s+):

my $var = "John Doe";
my ($fname, $linitial) = $var =~ /(.*)\s+(.).*/;

Then you have the values

$fname = 'John';
$linitial = 'D';

and you can do:

print "$fname $linitial";

to get

"John D"

EDIT: Each pair of capture parentheses also sets a variable ($1 and $2, respectively), which stays available until your next successful match, so the whole thing can be shortened a bit as follows:

my $var = "John Doe";
$var =~ /(.*)\s+(.).*/;
print "$1 $2";
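
One caveat with the shortened form: $1 and $2 keep whatever they held from the last successful match, so if the match fails you print stale or undefined values. A minimal sketch of guarding against that, assuming a hypothetical helper named first_and_initial (not part of the answer above):

```perl
use strict;
use warnings;

# Hypothetical helper: returns "First L", or undef when the input
# does not contain two whitespace-separated parts.
sub first_and_initial {
    my ($name) = @_;
    # In list context a successful match returns the captures;
    # a failed match returns an empty list, so the condition is false.
    if (my ($fname, $linitial) = $name =~ /(.*)\s+(.).*/) {
        return "$fname $linitial";
    }
    return;    # no match: $1 and $2 must not be trusted here
}

for my $name ("John Doe", "Madonna") {
    my $short = first_and_initial($name);
    print defined $short ? "$short\n" : "no match for '$name'\n";
}
```

Because (.*) is greedy, a three-part name such as "Lester del Rey" would come out as "Lester del R" with this pattern.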

2 Comments

Is there any way to do this in one statement without creating new variables?
Yep! Perl has a (mostly) awesome penchant for auto-creating variables, which are perfectly suited to this sort of situation. I've updated my answer to include this shortened alternative.
1

To replace the last sequence of non-whitespace characters with just the initial character, you could write this

use strict;
use warnings;

my $var = "John Doe";

$var =~ s/(\S)\S*\s*$/$1/;

print $var;

output

John D
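
To see how this substitution behaves on a few other inputs, here is a small sketch. The helper name shorten_last_word is made up for illustration; the regex is the one from the answer.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the substitution shown above.
sub shorten_last_word {
    my ($name) = @_;
    # Keep the first character of the last run of non-whitespace,
    # drop the rest of that run and any trailing whitespace.
    $name =~ s/(\S)\S*\s*$/$1/;
    return $name;
}

for my $name ("John Doe", "John Doe  ", "Lester del Rey") {
    printf "%-18s => '%s'\n", "'$name'", shorten_last_word($name);
}
```

Note that only the last word is shortened, so "Lester del Rey" becomes "Lester del R" rather than "Lester D".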


0

Assuming your string contains only ASCII names, this will work:

$var =~ s/([a-zA-Z]+)\s([a-zA-Z]+)/$1." ".substr($2,0,1)/ge;

1 Comment

For some reason this just returns 1 instead of John D.
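
A possible explanation for the 1, assuming the result of the substitution itself was printed or assigned: s/// modifies $var in place and returns the number of substitutions made, not the new string. A short sketch (the variable names $count, $orig and $short are just for illustration):

```perl
use strict;
use warnings;

my $var = "John Doe";

# s/// returns the number of substitutions, and changes $var in place.
my $count = ($var =~ s/([a-zA-Z]+)\s([a-zA-Z]+)/$1." ".substr($2,0,1)/ge);
print "count: $count\n";    # the substitution count
print "var:   $var\n";      # the modified string

# With the /r modifier (Perl 5.14 or later) the modified copy is
# returned instead, and the original string is left untouched.
my $orig  = "John Doe";
my $short = $orig =~ s/([a-zA-Z]+)\s([a-zA-Z]+)/$1." ".substr($2,0,1)/ger;
print "short: $short\n";
```

So printing $var after the substitution, or using /r, should give the string rather than the count.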
0
$var = "John Doe";
s/^(\w+)\s+(\w)/$1 \u$2/ for $var;

2 Comments

This won't work; it just title-cases the second word, i.e. John van Doe → John Van Doe.
I just wrote an answer with a similar regex that explains how it works. You should then be able to understand why your regex doesn't show the requested behaviour.
0

A simple regex that solves this problem is the substitution

s/^\w+\s+\K(\w).*/\U$1/s

What does this do?

  • ^ \w+ \s+ matches a word at the beginning of the string, plus whitespace towards the next word
  • \K is the keep escape. It excludes everything matched so far from the substring that the regex engine considers “matched”, so that part is left untouched by the substitution. This avoids an extra capture group, and is practically a look-behind.
  • (\w) matches and captures one “word” character. This is the leading character of the second word in the string.
  • .* matches the rest of the string. I do this to overwrite any other names that may come: you stated that Lester del Ray should be transformed to Lester D, not Lester D Ray as a solution with \w* instead of the .* part would have done. The /s modifier is relevant for this, as it enables . to match every character including newlines (who knows what's inside the string?).
  • The substitution uses the \U escape to uppercase the rest of the replacement string, which consists only of the captured character.

Test:

$ perl -E'$_ = shift; s/^\w+\s+\K(\w).*/\U$1/s; say' "Lester del Ray"
Lester D
$ perl -E'$_ = shift; s/^\w+\s+\K(\w).*/\U$1/s; say' "John Doe"
John D
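
The difference between ending the pattern with .* and with \w* can be checked side by side. This is only a sketch; the two helper names are hypothetical, and \K needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: the pattern from this answer, ending in .*
sub initial_dot_star {
    my ($name) = @_;
    $name =~ s/^\w+\s+\K(\w).*/\U$1/s;
    return $name;
}

# Hypothetical helper: the variant ending in \w*, which only
# consumes the rest of the second word.
sub initial_word_only {
    my ($name) = @_;
    $name =~ s/^\w+\s+\K(\w)\w*/\U$1/;
    return $name;
}

for my $name ("John Doe", "Lester del Ray") {
    print "$name: .* gives '", initial_dot_star($name),
          "', \\w* gives '", initial_word_only($name), "'\n";
}
```

For a two-word name both variants agree; for "Lester del Ray" the \w* variant leaves the trailing " Ray" in place.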


-1

Something like this might be a little more usable/reusable in the long run.

$initial = sub { return substr shift, 0, 1 ; };

This makes a function that returns the initial of its argument.

$var =~ s/(\w)\s+(\w)/&$initial($1) &$initial($2)/sge;

Then replace the first and second captures, using the /e modifier to evaluate the replacement as code.

1 Comment

I can't use a subroutine for this case.
