I need to replace 3 Hebrew Unicode characters with 3 other Hebrew Unicode characters. I looked into the PHP syntax and searched around, but this is the best I could write. It works and does the job.

Before I turn this into a tiny function, I would like to know whether this is the best way of replacing one Unicode character with another in PHP.

Is there a better syntax in PHP for this?

$re1 = '/[\x{05B1}]/u';
$re2 = '/[\x{05B2}]/u';
$re3 = '/[\x{05B3}]/u';

$subst1 = json_decode('"\u05B6"');
$subst2 = json_decode('"\u05B0"');
$subst3 = json_decode('"\u05B8"');

//Replace (Niqqud with Cantillation) with (just Niqqud)
$bible_content = preg_replace($re1, $subst1, $bible_content);
$bible_content = preg_replace($re2, $subst2, $bible_content);
$bible_content = preg_replace($re3, $subst3, $bible_content);

Starting input for $bible_content:

וַ/יִּקְרָא אֱלֹהִים לָ/אוֹר יוֹם וְ/לַ/חֹשֶׁךְ קָרָא לָיְלָה וַ/יְהִי עֶרֶב וַ/יְהִי בֹקֶר יוֹם אֶחָד׃ אַשְׁרֵי הָ/אִישׁ אֲשֶׁר לֹא הָלַךְ בַּ/עֲצַת רְשָׁעִים וּ/בְ/דֶרֶךְ חַטָּאִים לֹא עָמָד וּ/בְ/מוֹשַׁב לֵצִים לֹא יָשָׁב׃ חֳ

Expected output for $bible_content:

וַ/יִּקְרָא אֶלֹהִים לָ/אוֹר יוֹם וְ/לַ/חֹשֶׁךְ קָרָא לָיְלָה וַ/יְהִי עֶרֶב וַ/יְהִי בֹקֶר יוֹם אֶחָד׃ אַשְׁרֵי הָ/אִישׁ אְשֶׁר לֹא הָלַךְ בַּ/עְצַת רְשָׁעִים וּ/בְ/דֶרֶךְ חַטָּאִים לֹא עָמָד וּ/בְ/מוֹשַׁב לֵצִים לֹא יָשָׁב׃ חָ

  • It seems the answer to this can be found in another question at stackoverflow.com/questions/3140734/… The solution there also looks cleaner. Commented Jun 13, 2017 at 0:10
  • Your code doesn't seem to work when I test it online. Do you have any other charset declarations (or similar) prior to this code block? Online test: sandbox.onlinephpfunctions.com/code/… Commented Jun 13, 2017 at 0:48
  • I edited the input and output. The forward slashes are OK to be there. Commented Jun 13, 2017 at 0:56
  • The third one was too difficult to find, so I just put one letter as an example at the very end. Now it should have all 3 characters. Commented Jun 13, 2017 at 1:27

1 Answer

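PHP 7.0 introduced the "\u{...}" escape syntax for Unicode code points in double-quoted string literals. Furthermore, you can use the strtr function to handle the replacements. Pass it an array of pairs rather than two strings: the two-string form translates single bytes, and each of these characters is two bytes long in UTF-8.

// Source characters and their replacements, matched up by position
$from = ["\u{05B1}", "\u{05B2}", "\u{05B3}"];
$to = ["\u{05B6}", "\u{05B0}", "\u{05B8}"];

// The array form of strtr replaces whole substrings, not individual bytes
echo strtr($bible_content, array_combine($from, $to)) . "\n";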

Now, I can't read Hebrew (or even make it flow properly RTL, apparently :P ), so you'll have to judge whether it did the right thing or not.
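
Since you mentioned wrapping this in a tiny function, here is a minimal sketch of what that could look like. The function name replace_hataf_vowels is made up for illustration, and the sketch assumes PHP 7.0+ and UTF-8 input. It uses the array form of strtr, which matches whole substrings, so characters outside the map are left alone (for example the geresh, U+05F3, whose second UTF-8 byte is the same as that of U+05B3).

```php
<?php
// Hypothetical helper: the name is an assumption, not an existing API.
// Assumes $text is UTF-8 and PHP >= 7.0 (for the "\u{...}" escapes).
function replace_hataf_vowels(string $text): string
{
    // Keys are whole UTF-8 substrings, so each two-byte character
    // is matched as a unit rather than byte by byte.
    $map = [
        "\u{05B1}" => "\u{05B6}", // hataf segol  -> segol
        "\u{05B2}" => "\u{05B0}", // hataf patah  -> sheva
        "\u{05B3}" => "\u{05B8}", // hataf qamats -> qamats
    ];

    return strtr($text, $map);
}

// Usage: alef + hataf patah becomes alef + sheva.
$sample = "\u{05D0}\u{05B2}";
echo replace_hataf_vowels($sample), "\n";

// The geresh (U+05F3) is not in the map, so it passes through unchanged.
var_dump(replace_hataf_vowels("\u{05F3}") === "\u{05F3}");
```

Building the map inside the function keeps it self-contained; if you call it on a lot of text, you could just as well define the array once and pass it in.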

2 Comments

Just noticed.. 06B0 should be 05B0 for $to, in case other people want to take the code and try..
Fixed the typo, thanks. (Obviously, I can't tell the difference :P )
