Questions tagged [text-encoding]

5 votes
9 answers
3k views

I have been working on launching high-altitude balloons (HABs, or weather balloons) and I have been using LoRa to enable long-range communication with my balloons. It's been great and pretty reliable, ...
Lv_InSaNe_vL
-2 votes
1 answer
4k views

I guess most of you already met them. You get them from your data sources, see them in your logs, or in the output from your legacy systems. Some strings you can't really read. To derive any useful ...
Martin Grey
2 votes
2 answers
571 views

Today I came across a weird case for which I have no explanation, so here I am. I have two files with identical content, but one is encoded in UTF-8 and the other one is in IBM EBCDIC. Both of them ...
rodripf • 137
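
As a small illustration of how identical text can differ byte for byte between the two encodings, the sketch below encodes one string both ways. It assumes Python's built-in `cp037` codec (EBCDIC US/Canada) as a stand-in for "IBM EBCDIC"; the excerpt does not say which EBCDIC code page is involved, and the sample text is made up.

```python
# Hypothetical sample text; cp037 is an assumed EBCDIC code page.
text = "Hello 123"

utf8_bytes = text.encode("utf-8")
ebcdic_bytes = text.encode("cp037")

# Print both byte sequences as space-separated hex for comparison.
print(utf8_bytes.hex(" "))
print(ebcdic_bytes.hex(" "))

# Both decode back to the same characters, yet the stored bytes differ.
same_text = utf8_bytes.decode("utf-8") == ebcdic_bytes.decode("cp037")
same_bytes = utf8_bytes == ebcdic_bytes
print(same_text, same_bytes)
```

Any tool that compares or compresses the two files works on these bytes, not on the characters they represent.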
5 votes
1 answer
428 views

When you encode a code point to code units in UTF-8, if the code point fits in 7 bits, the most significant bit is set to zero, which tells you it is a character stored on 1 ...
codepersonnel49
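
The role of the leading bits can be sketched as follows. `utf8_sequence_length` is a hypothetical helper written for this illustration; it only inspects the bit pattern of the first byte and does not validate overlong or otherwise malformed sequences.

```python
def utf8_sequence_length(lead_byte: int) -> int:
    """Return the sequence length implied by a UTF-8 lead byte (illustrative helper)."""
    if lead_byte & 0b1000_0000 == 0:
        return 1  # 0xxxxxxx: code point fits in 7 bits, single byte
    if lead_byte & 0b1110_0000 == 0b1100_0000:
        return 2  # 110xxxxx
    if lead_byte & 0b1111_0000 == 0b1110_0000:
        return 3  # 1110xxxx
    if lead_byte & 0b1111_1000 == 0b1111_0000:
        return 4  # 11110xxx
    raise ValueError("continuation byte (10xxxxxx) or invalid lead byte")


# Compare the length implied by the lead byte with the real encoded length.
for ch in ["A", "é", "€", "😀"]:
    encoded = ch.encode("utf-8")
    implied = utf8_sequence_length(encoded[0])
    print(ch, [f"{b:08b}" for b in encoded], implied == len(encoded))
```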
3 votes
4 answers
3k views

We have an app that receives a web service request, processes it and sends it back to our client by another web service call. There is a unique field in the request, a tracking Id, which currently ...
Suraj Muraleedharan
0 votes
1 answer
91 views

A few times in my career I've found myself writing decoders for responses from IoT products or weird APIs that, instead of using JSON or XML as a response, reply with something like ...
Héctor Salazar
1 vote
1 answer
91 views

Let's say we have a generic table like below:
id, name, price, quantity
20 product_x 5,00 100
20 product_y 5,00 100
20 ...
4 votes
4 answers
4k views

A follow-up to Difference between '\n' and '\r\n'. It's been a few decades since the schism was introduced. Nowadays, when documents are being exchanged over the internet, typically ...
Ondra Žižka
10 votes
1 answer
3k views

Git can generate patches/diffs for binary files as well as for text files. I'm trying to figure out what encoding it uses for its binary patches. Here is an example: diff --git a/www/images/...
Dan Lenski
-1 votes
2 answers
317 views

I have a printer and an SDK to work with it in Java. The printer works well with English letters and digits but doesn't correctly print special symbols like 'ä' or 'ê'. I suppose that I need to convert ...
BArtWell • 107
-1 votes
2 answers
9k views

I'm working on a project that requires a TCP connection between a client and server. The current protocol encodes the data into hex and then sends it. However, hex increases the length of the payload ...
Awn • 155
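
The size overhead of hex can be checked with a short sketch. The 300-byte random payload is an arbitrary choice for illustration, and Base64 is included only as a common point of comparison; the excerpt does not say which alternative the asker considered.

```python
import base64
import os

# Arbitrary binary payload; the size is chosen only for illustration.
payload = os.urandom(300)

hex_encoded = payload.hex().encode("ascii")  # 2 ASCII characters per byte
b64_encoded = base64.b64encode(payload)      # 4 ASCII characters per 3 bytes

print(len(payload), len(hex_encoded), len(b64_encoded))

# Both text encodings are lossless and decode back to the original bytes.
round_trip_ok = (
    bytes.fromhex(hex_encoded.decode("ascii")) == payload
    and base64.b64decode(b64_encoded) == payload
)
```

Sending the raw bytes over TCP avoids the overhead entirely, provided the protocol frames messages by length rather than by a text delimiter.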
2 votes
1 answer
1k views

I have the Wikipedia data dump and am trying to decode special characters in the page titles, except a lot of characters don't match up with the "standard" ASCII encoding (referencing from here.) As an ...
AltusVultur
4 votes
2 answers
417 views

ISO 8859-1 contains a few letter-free diacritics: The diaeresis (¨), the acute accent (´), the cedilla (¸) and the macron (¯).¹ Why were they included? As far as I know (please correct me if I am ...
Heinzi • 9,868
7 votes
2 answers
10k views

Suppose a program A opens a text file A using encoding A to decode the file, and a program B opens a text file B using encoding B. When we copy some text from file B in program B to file A in ...
Tim • 5,565
1 vote
2 answers
77 views

I am not sure whether this question is a good fit for this site, but if it is not, please let me know and I will take it down. If it is off-topic, some general info on where I can look for these ...
Luke • 273
21 votes
4 answers
4k views

According to the Wikipedia article, UTF-8 has this format:
First code point | Last code point | Bytes used | Byte 1 | Byte 2 | Byte 3 | Byte 4
U+0000 | U+007F | 1 | 0xxxxxxx
U+0080 | U+...
qbt937 • 321
0 votes
1 answer
5k views

I am trying to create a basic licensing system where I take a unique ID from the client computer, and I get this Hexadecimal string (hyphens removed e.g. "84-18-CE-...."): "...
SolaGratia
2 votes
2 answers
6k views

When is it beneficial to use encodings other than UTF-8? Aside from dealing with pre-Unicode documents, that is. And more importantly, why isn't UTF-8 the default in most languages? That is, why do I ...
Electric Coffee
4 votes
4 answers
19k views

Please can you answer a couple of questions based on the code below (excludes the try/catch blocks), which transforms input XML and XSL files into an output XSL-FO file: File xslFile = new File("...
Helen Reeves
8 votes
2 answers
16k views

I was wondering if ffmpeg supports GPU acceleration. I was reading their website and came across contradictory information. http://www.ffmpeg.org/general.html#Video-Codecs -H.264 / AVC / MPEG-4 ...
Jason123 • 143
3 votes
2 answers
1k views

I’m having some problems debugging an encoded JavaScript script. The script I’m referring to is given in this link. The encoding here is simple, and it works by shifting the Unicode values to ...
miles away
8 votes
2 answers
2k views

I recently implemented incoming emails for an application and boy, did I open the gates of hell? Since then every other day an email arrives that makes the app fail in a different way. One of those ...
Pablo Fernandez
2 votes
2 answers
895 views

http://php.net/manual/en/function.mb-convert-encoding.php Say I do: $encoded = mb_convert_encoding ($original); That looks simple enough. What I am imagining is the following: $original has a ...
user4951 • 739
4 votes
3 answers
3k views

At the moment, non-sensitive or semi-sensitive information, such as a user ID or the requested page number, is sent from one page to another via GET on our web application. Sometimes slightly more sensitive ...
hozza • 313
21 votes
4 answers
61k views

I am interested in encoding a string I have and I am curious if there is a type of encoding that can be used that will only include alpha and numeric characters and would preferably shorten the number ...
Abe Miessler
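
One way to get output made only of letters and digits is a base-62 encoding, sketched below. `base62_encode` and `base62_decode` are hypothetical helpers written for this illustration, not a standard library API, and the sample string is made up. Note that this kind of encoding makes the data longer, not shorter; shrinking it would need compression before encoding.

```python
import string

# 62 symbols: 0-9, A-Z, a-z
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def base62_encode(data: bytes) -> str:
    # Prefix a 0x01 sentinel byte so leading zero bytes survive the integer round trip.
    number = int.from_bytes(b"\x01" + data, "big")
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def base62_decode(text: str) -> bytes:
    number = 0
    for symbol in text:
        number = number * 62 + ALPHABET.index(symbol)
    raw = number.to_bytes((number.bit_length() + 7) // 8, "big")
    return raw[1:]  # drop the 0x01 sentinel


original = "héllo wörld".encode("utf-8")
encoded = base62_encode(original)
print(encoded, len(original), len(encoded))
```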
41 votes
8 answers
5k views

I thought Unicode was designed to get around the whole issue of having lots of different encodings due to a small address space (8 bits) in most of the prior attempts (ASCII, etc.). Why then are there ...
Matthew Scharley