
How is it possible that shell_exec gives a different result than the same command run manually? I'm executing

shell_exec('/usr/bin/wc --words ' . $filepath)

My PHP version is 7.4.3, and I have also tried wc -w. The output differs from what I get when I run the command manually from a shell. I tested it from a shell as the webmaster user, and the web process also runs as webmaster (I confirmed this by running whoami). It just doesn't make any sense. There are no errors; the command simply returns a different number. When I run it as webmaster from a shell, I get the expected output, i.e. the exact number of words. Does anyone have an idea where to look, given that I have already checked users and permissions?

  • Can you add an example of your filepath, the current output, and the expected output so that we can reproduce it? Commented Feb 28, 2020 at 15:59
  • @ChristophKluge Sure, it's an absolute file path, i.e. /path/to/file/name.txt, owned by webmaster from /to/ downwards, including the file. The response is in the correct format but has an incorrect value, e.g. it returns 1675 instead of 16022 as the number of words. Commented Feb 28, 2020 at 20:32
  • If you use md5sum instead of /usr/bin/wc --words, does it show a different checksum? Commented Feb 28, 2020 at 20:40
  • Which number do you see in shell_exec and which do you see in a shell for the same command with LC_ALL=C in front? Commented Feb 28, 2020 at 21:22
  • Then you can go to wherever it's correct and run locale, which will show some variables. Copy the value for LC_CTYPE and put it in front of your command, for example LC_CTYPE="en_US.UTF-8" /usr/bin/wc --words arabian.txt. This should make the number match the correct one. Different languages and encodings have different ideas of what constitutes a word. Commented Feb 28, 2020 at 21:30

1 Answer


Make sure to use the same locale in both places, especially for LC_CTYPE, the character classification and encoding setting. Different languages and encodings have different ideas of which characters are word and non-word characters:

gnu/linux$ echo 'الدروس المستفادة من' | LC_CTYPE=C wc -w
0
gnu/linux$ echo 'الدروس المستفادة من' | LC_CTYPE="en_US.UTF-8" wc -w
3

The result can also differ between versions of wc. Here it is on macOS:

macos$ echo 'الدروس المستفادة من' | LC_CTYPE=C wc -w
   3
macos$ echo 'الدروس المستفادة من' | LC_CTYPE="en_US.UTF-8" wc -w
   5

It can sometimes be hard to predict which locale ends up being used, because it can be set by individual applications, by the user configuration, by system defaults, or even by settings on the system the user SSH'd from.

To see the locale currently in effect, run locale in the environment that gives the correct count. Then set the same values for the command you pass to shell_exec so that the numbers match.
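
As a rough illustration of the advice above, here is a minimal shell sketch. The function names effective_ctype and count_words_with are hypothetical helpers, not part of wc or locale. The first one reports which locale name would govern LC_CTYPE using the usual POSIX precedence (LC_ALL, then LC_CTYPE, then LANG). The second one runs wc with LC_CTYPE pinned explicitly, so the result no longer depends on whatever environment the caller inherited.

```shell
#!/bin/sh
# Hypothetical helper: print the locale name that governs LC_CTYPE.
# POSIX precedence: a non-empty LC_ALL wins, then LC_CTYPE, then LANG.
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        printf 'C\n'   # nothing set: the POSIX default locale applies
    fi
}

# Hypothetical helper: count words in a file with LC_CTYPE pinned.
#   $1 = locale name (assumed to be installed, see `locale -a`)
#   $2 = path to the file
count_words_with() {
    # LC_ALL is emptied for this one command so it cannot override LC_CTYPE.
    # tr strips the padding some wc implementations put before the number.
    LC_ALL= LC_CTYPE="$1" wc -w < "$2" | tr -d ' '
}
```

Running effective_ctype in the interactive shell, and running the same logic from PHP (or simply comparing the output of locale in both places), should show whether the two environments disagree. Assuming the locale from the working shell is en_US.UTF-8 and is installed on the server, the same prefix can go into the PHP call, for example shell_exec('LC_CTYPE=en_US.UTF-8 /usr/bin/wc --words ' . escapeshellarg($filepath)).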
