Newest 'unicode' Questions

0 votes

1 answer

121 views

Is there a list of every scancode that Linux uses?

I am making another remapper like xkb, sxhkd, xmodmap, etc., because I don't like the existing ones. In this one I want a simpler, terser syntax that I find nice to use, and to make an API that ...

knowledge seeker

9

asked Sep 18 at 6:12

0 votes

1 answer

85 views

Nonstandard subnational flag emoji: What part of the system is responsible?

So, I'm using Linux Mint 21.3 with MATE 1.26.0. I've noticed that my system supports a number of nonstandard flag emoji. I'm wondering what part of the system is responsible for this, if this is ...

Harry Altman

101

asked Jul 24 at 19:51

1 vote

1 answer

174 views

Gibberish characters in EFI variables

Do gibberish characters found in EFI variables serve any purpose? Out of curiosity, I am trying to read out EFI variables, specifically the ones related to the boot mechanism. Under /sys/firmware/efi/...

Magikarp

11

asked Apr 7 at 23:03

0 votes

2 answers

262 views

To have or not to have a Byte Order Mark (BOM) in UTF-8 text files? (Linux)

Is it advisable to have a Byte Order Mark (BOM) in UTF-8 text files on Linux, or not? Is it correct to say that byte order (even for multi-byte characters) is already strictly defined/fixed in the UTF-8 standard? ...

strider

113

asked Feb 28 at 23:43

6 votes

3 answers

573 views

How to make Perl half/full width-insensitive regular expressions?

In Perl, /a/i matches both A and a, so I don't have to write /A|a/. What is the easy way to write /４|4/ ? Yes, I'm talking about $ unicode ４ 4|grep U+ U+FF14 FULLWIDTH DIGIT FOUR U+0034 DIGIT FOUR ...

Dan Jacobson

573

asked Feb 26 at 2:16

5 votes

2 answers

892 views

iconv fails to detect valid utf-8 character as utf-8

My input data is as follows (as generated by hexdump): 000000f0 69 61 6e e2 80 99 73 20 65 79 65 73 20 61 62 72 |ian...s eyes abr| When I open this html file in Firefox, it displays these ...

AlastairG

233

asked Jan 6 at 15:43

0 votes

2 answers

216 views

How to insert text before the first line of a UTF-8 file with BOM

This question is closely related to: How to insert text before the first line of a file?. I deliberately made the title similar to that question to highlight this. Except the target file is UTF-8 with ...

Avenger

151

asked Dec 6, 2024 at 19:19

1 vote

0 answers

104 views

Inconsistent display of Unicode chars between software and Ubuntu versions

I have two computers installed slightly differently: A: KUbuntu based on 22.04.3 LTS B: Ubuntu 24.04.1 LTS + KDE somehow added after. I noticed between the 2 that some (not all) Unicode chars were ...

Pascal

11

asked Nov 19, 2024 at 9:13

0 votes

0 answers

76 views

Cross-platform method of checking if using terminal emulator or tty

I am looking for a cross platform way to check if I am using a terminal emulator (with support for unicode characters) or a TTY session (with only support for ASCII chars). I initially tried to use if ...

Sarp User

21

asked Oct 17, 2024 at 1:10

2 votes

1 answer

88 views

Cannot insert the mapsto character ↦ in groff

I am trying to learn how to insert the mapsto (↦, U+21A6) character in groff. I am trying to use this code to insert the character \[u21A6] But I get the following error message and nothing is ...

andrei-n

23

asked Sep 20, 2024 at 8:10

2 votes

1 answer

735 views

Which interpreter for "Unicode text, UTF-8 text executable"

I'm trying to set up a keybinding for an executable which is in my home. For this, I set the command: sh -c '\"/path/to/the/executable\" --options' But, it does not work, and, when I'm ...

Phantom

503

asked Jul 9, 2024 at 15:53

11 votes

3 answers

2k views

UTF-8 characters in POSIX shell script comments - anything against it?

I would like to include a couple of non-ASCII characters in my POSIX shell script comments. Note this is in no way a duplicate of e.g. "Which character encodings are supported by posix?" as ...

Vlastimil Burián

31.3k

asked Jun 18, 2024 at 22:21

1 vote

0 answers

60 views

Ignore Accent Differences in Zsh Autocomplete

Suppose I have a directory named cálculo in the current directory. How can I autocomplete its name after typing the starting characters without the accent? $ cd calc<tab> $ cd cálculo/ I failed ...

sidyll

207

asked May 12, 2024 at 18:19

1 vote

1 answer

300 views

Fontawesome icons are not pasted correctly

I am using Fedora and installed fontawesome via sudo dnf install fontawesome fonts. Later, because it didn't work, I also installed the font manually by downloading the zip from the GitHub ...

Sinthoras

23

asked Apr 2, 2024 at 17:51

2 votes

0 answers

176 views

How do I disable UTF-8 in an xterm (or X, really)?

I have a system running Debian unstable where I don't want to have UTF-8 in my xterms (or at all). But I recently discovered that somehow I now have UTF-8 in my xterms and other windows. It might have ...

ftpltl

21

asked Mar 16, 2024 at 12:03

0 votes

0 answers

103 views

ls: single-column vs. multi-column layout, non-Unicode characters in filenames

Create a directory ~/test with abcdefghijklmnopqrstuvwxyz and zyxwvutsrqponmlkjihgfedcba files in it. ls ~/test will list them using multi-column layout: abcdefghijklmnopqrstuvwxyz ...

jsx97

1,387

asked Mar 13, 2024 at 7:27

2 votes

2 answers

365 views

Search and replace composed Unicode characters

I have a deep folder structure on a Debian machine where the directory names and the filenames contain some "special" characters (ä,ö,ü). However, these are not in "ISO-8859-1" ...

Stubenhocker.tech

799

asked Mar 11, 2024 at 13:03

3 votes

1 answer

156 views

'ls name' and 'ls | grep name' give different results for names with accents

I am on XigmaNAS (a FreeBSD-based NAS). I'll explain the situation as simply as possible: :; set | egrep 'LC_A|LANG' GDM_LANG=fr_FR.UTF-8 LANG=fr_FR.UTF-8 LC_ALL=fr_FR.UTF-8 SLIM_LANG=fr_FR.UTF-8 :; ls -i ...

Dhénin Jean-Jacques

31

asked Mar 9, 2024 at 16:47

0 votes

0 answers

91 views

Terminal: Help understanding behavior with UTF-8 text

I am trying to understand the following behavior I am observing on my Ubuntu system. Consider the following two files: $ hexdump -C 1.txt 00000000 d9 82 d8 a8 d8 a7 d9 86 d9 8a 5e d9 84 d9 86 d8 |.....

malat

3,469

asked Feb 20, 2024 at 14:57

1 vote

0 answers

56 views

XQuartz xterm UTF-8 resource name

I was using UTF-8 resource names like these ones: wengé*Background: #321 wengé*Foreground: #ffb and this was working with XQuartz 2.8.1 through this direct call like from within the ...

athena

1,095

asked Feb 18, 2024 at 16:51

1 vote

1 answer

83 views

Crossmark symbol (\u274c) doesn't work in Debian 12

I have moved from Ubuntu 22.04 to Debian 12. I have a bash function that outputs a crossmark if a command failed and a checkmark if it succeeded. The checkmark works, but the crossmark doesn't. Here is ...

Liso

131

asked Jan 25, 2024 at 14:09

1 vote

1 answer

477 views

What puts the terminal in Unicode mode?

I have a Debian server which is not properly displaying Unicode characters when logged in locally, without starting X11. Unicode works after running unicode_start (until the terminal is closed). It ...

Fadeway

183

asked Jan 12, 2024 at 14:26

1 vote

1 answer

78 views

How to use Unix `mv` to rename files with Unicode spaces (not U+20)?

$ ls cn* cn blah blah.txt $ ls cn\ * ls: cannot access 'cn *': No such file or directory $ ls cn*|hexdump -C 00000000 63 6e e2 80 85 62 6c 61 68 c2 a0 62 6c 61 68 2e |cn...blah..blah.| 00000010 74 ...

Remi Arntzen

13

asked Dec 3, 2023 at 19:34

5 votes

2 answers

254 views

Why is ls sorting Chinese filenames by length?

I've run into a bit of a weird behaviour that I don't fully understand with ls and Chinese filenames. I'm running macOS 13.6.1 with SIP enabled (no core OS modifications), MacPorts installed, and US ...

nneonneo

1,198

asked Nov 25, 2023 at 11:05

2 votes

1 answer

724 views

Can awk be told to count the character string length rather than byte string length for '%10s' printf formats?

Try this for an output of |Ü| X|: echo 'Ü X' | awk '{printf("|% 2s|% 2s|\n", $1, $2)}' Obviously awk counts the byte length, not the character length of the Ü, so the count is 2 and no left ...

Harald

1,040

asked Nov 9, 2023 at 9:49
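
The byte-versus-character distinction in the awk excerpt above can be illustrated outside awk. The following is only a minimal Python sketch, not the asker's or any answerer's solution; the helper name `pad_left` is hypothetical, and it assumes the Ü is the precomposed code point U+00DC.

```python
def pad_left(text: str, width: int) -> str:
    """Right-align text in a field of `width` code points (not bytes)."""
    return text.rjust(width)


word = "\u00dc"  # "Ü" as a single precomposed code point (assumption)

byte_len = len(word.encode("utf-8"))  # length a byte-counting tool sees
char_len = len(word)                  # length in code points

# Re-create the two-field layout from the excerpt, padding by code points.
fields = (word + " X").split()
line = "|" + "|".join(pad_left(f, 2) for f in fields) + "|"

print(byte_len, char_len)
print(line)
```

Padding by code points is still an approximation: combining sequences and double-width characters would need a display-width calculation rather than a plain length.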

0 votes

2 answers

157 views

groff -mandoc creating "ESC[1m" versus overstriking with backspace for bold text

I found that groff uses different ways to indicate bold text for the utf8 output format. On FreeBSD 14, groff emits escape codes for a terminal (ESC, [1m): $ printf ".Dd today\n.Sh NAME\n" | ...

Jens

1,894

asked Nov 6, 2023 at 19:10

0 votes

1 answer

175 views

Why is MB_CUR_MAX 6 instead of 4 for UTF-8? (Linux, glibc)

MB_CUR_MAX is defined by glibc as 'a positive integer expression that is the maximum number of bytes in a multibyte character in the current locale.' If I print the value I get 6. I assume that this ...

Sebastian Carlos

262

asked Oct 30, 2023 at 9:51

0 votes

2 answers

549 views

I need to create a pipe to convert string from UTF-8 to UTF-7-IMAP

To automate the command line creation of hundreds of directories in IMAP maildirs, I would need to be able to convert UTF-8 strings to UTF-7-IMAP on the fly. In php, I found a way to do it with a ...

Chris972

43

asked Sep 28, 2023 at 4:31

0 votes

2 answers

371 views

Listing filenames with special characters

I have a zsh shell (with oh-my-zsh default config). When I ls filenames with special characters, they are printed as: ''$'\316\262''=0.35-L=32-m=10.jld2' This should be: β=0.35-L=32-m=10.jld2 but the ...

a06e

1,837

asked Sep 23, 2023 at 14:33

1 vote

0 answers

50 views

Debian terminal not displaying correct Unicode half-block characters [duplicate]

I have a program that prints Unicode half-block characters (U+2580, U+2584), but on Debian 10 terminals (just the fullscreen terminal, no X), it's printing diamonds instead of half-blocks. The two ...

Jason C

1,947

asked Sep 20, 2023 at 13:31

1 vote

0 answers

119 views

Ctrl-Shift-U requires extra U in Ubuntu 23.04 Cinnamon?

I'm running a new install of Ubuntu 23.04 with Cinnamon desktop 5.6.7. Typing Ctrl-Shift-u in a terminal does nothing unless the next character is another u; then the underlined u appears and I can ...

jimav

131

asked Aug 17, 2023 at 21:22

1 vote

2 answers

1k views

Entering special characters the same way on Windows and Linux

Ctrl+Shift+U followed by the hex value of a Unicode character enters that character. For example, Ctrl+Shift+U 41 enters 'A', whose value is 0x41 in hex and 65 in decimal. There's also the compose key, ...

glibg10b

428

asked Jul 5, 2023 at 9:52

1 vote

1 answer

107 views

Expand tabs in file with utf8 characters

I use expand to expand tabs to spaces. For utf8 files expand doesn't work correctly. E.g. in ć\ta tab is expanded to 6 spaces while in a\ta to 7 spaces. How do I make it work for utf8 files?

Marcin Król

253

asked Jun 6, 2023 at 16:09
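
The 6-versus-7 spaces described in the expand excerpt above are consistent with counting columns in bytes rather than characters. As a rough Python sketch only (it does not reproduce or patch expand itself, and it assumes a tab stop of 8 and that ć is the precomposed code point U+0107):

```python
text = "\u0107\ta"  # "ć", a tab, then "a"

# Column counted per byte: "ć" is two bytes in UTF-8, so the tab
# only has to fill the remaining six columns up to the tab stop.
by_bytes = text.encode("utf-8").expandtabs(8).decode("utf-8")

# Column counted per code point: "ć" occupies one column,
# so the tab fills seven.
by_chars = text.expandtabs(8)

print(by_bytes.count(" "), by_chars.count(" "))
```

Counting code points is itself a simplification; wide or combining characters would need a real display-width lookup.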

-1 votes

2 answers

127 views

Is ∞ allowed in UTF-8 Encoded files?

Are lemniscates, ∞, allowed in UTF-8 Encoded files? I am hoping that students with less than six months of computer programming experience can use a search engine to type something like "is ...

Samuel Muldoon

99

asked May 28, 2023 at 22:59

2 votes

2 answers

234 views

Unicode Supplementary Multilingual Plane (Plane 1) glyphs in xterm

I'm trying to display Unicode Supplementary Multilingual Plane (Plane 1) glyphs in xterm. Those glyphs are in the U+010000..U+01FFFF range (https://unifoundry.com/pub/unifont/unifont-15.0.01/...

e___e

55

asked May 19, 2023 at 11:36

7 votes

1 answer

2k views

How should I interpret the fact that a Unicode code point is shown in two completely different ways in two different terminal emulators?

This is kind of a spin-off from an older question I asked. Here's the screenshot from that question: In the bottom left is URxvt, and you can see a lightning bolt-like icon at the beginning of the ...

Enlico

2,362

asked May 15, 2023 at 14:56

4 votes

4 answers

556 views

Collect chars from strings and print their unicode

Context (skip, if you don't care; read, if you suspect I'm totally on the wrong track) For an embedded system with small memory, I want to generate fonts which contain only those glyphs actually ...

Philippos

13.8k

asked Apr 19, 2023 at 11:28

1 vote

0 answers

164 views

Pasting non-ascii (utf8) into remote urxvt terminal

For pasting text, in urxvt/rxvt-unicode one can use middle button to paste PRIMARY selection. I can do such Mouse-Middle-Click paste in my local urxvt terminal and even a remote server, in Chinese/...

xpt

1,924

asked Mar 24, 2023 at 18:08

0 votes

0 answers

278 views

Script for awscli check not working with crontab schedule

I have written a small code snippet to check the aws cli version #!/usr/bin/env bash if [ -e "/usr/local/bin/aws" ]; then myAWS="/usr/local/bin/aws" else ...

AashkaTe

1

asked Mar 11, 2023 at 13:10

9 votes

3 answers

4k views

Box character doesn't display properly in Linux terminal

I was just writing a C++ program that uses the box characters to display information. I ran the program on macOS and used the terminal app and it worked fine. When I switched to Debian Linux using ...

sherbit fish

147

asked Mar 6, 2023 at 9:43

2 votes

2 answers

1k views

How to combine settings from multiple locales in Linux?

When I installed Linux I set my locale to en_US.UTF-8. However I want to override some but not all of the settings in that locale. Specifically, I would like the Measurement to be Metric instead of ...

bch6595

21

asked Feb 17, 2023 at 2:24

1 vote

1 answer

112 views

Command similar to ascii for ascii extended and/or for unicode?

ascii command in Linux is fast and great. It allows us to search for a character or for a code point and returns all relevant results for a given search. Is there something similar for ASCII extended (...

demacj

13

asked Feb 14, 2023 at 11:03

4 votes

1 answer

319 views

How do I create a zip that preserves unicode character composition on linux?

I'm on Debian. I have a file called Sóanr.jpg. According to https://emojidissector.com/, this is made of the following code points: S 0053 LATIN CAPITAL LETTER S o 006F LATIN SMALL LETTER O ...

bennlich

143

asked Feb 2, 2023 at 9:59

1 vote

3 answers

114 views

Writing bash arguments with truncation

I want to print the first two arguments of a bash function, with the unicode character \u2263 on each side using a two space separation. The thing is that the final unicode must display at column 70. ...

Vera

1,373

asked Jan 27, 2023 at 14:50

3 votes

1 answer

320 views

Different encoding/Unicode interpretation using terminal vs using shell script

I was working on a keymap script (mapping keys from one language's keyboard layout to another). After a hard time trying to get everything working, I found out that different characters are ...

Andrew15_5

291

asked Jan 24, 2023 at 22:37

3 votes

0 answers

253 views

Unnormalized UTF-8 directory names

I noticed something interesting in one of my directories: $ ls -li total 36 2625309 drwxrwxr-x 2 dotancohen dotancohen 4096 Jul 4 2022 Español 2625385 drwxrwxr-x 2 dotancohen dotancohen 4096 Jul ...

dotancohen

16.5k

asked Jan 19, 2023 at 13:44

0 votes

0 answers

167 views

Is there a way to remove specific emoji from being rendered in any application while using Cinnamon desktop?

I am slightly annoyed with some emojis. So I was wondering, how could I remove/prevent some emojis from being rendered at all? Replacing them with some other emoji like cute cat face could work too. ...

user556471

asked Jan 12, 2023 at 15:41

0 votes

1 answer

120 views

testdisk utility reports nonexistent files from an exFAT drive used with Windows - why?

I tried to recover lost files from an exFAT thumb drive with the testdisk package on Linux. It was very good at finding deleted files. However, as I went through the entries, I saw weird entries. The ...

ero47543

1

asked Jan 5, 2023 at 14:07

1 vote

0 answers

46 views

Cannot use Unicode shortcut on non-English layouts

I’m using US and RU layouts. I can use Ctrl+Shift+u when the US layout is selected, but when I try it with the RU layout selected, it just doesn’t work. Didn’t find anything related to it in ...

Sebekerga

19

asked Jan 4, 2023 at 3:18

1 vote

3 answers

915 views

Looking up and inputting arbitrary Unicode characters in console/terminal

I'm looking for a simple, generic way to input arbitrary Unicode characters in a text document on the terminal (e.g. in a terminal editor). A basic method I can imagine is having a simple text (UTF-8) ...

Charles Langlois

201

asked Dec 9, 2022 at 2:06
