5

How can I replace the string in Java?

E.g.,

String a = "adf�sdf";

How can I replace and avoid special characters?

2
  • 3
    Welcome to SO, zahir! Where are you getting your strings from? Random users? A web service? Are you trying to replace something with that string, or use that string to replace something else? Commented Apr 9, 2010 at 14:24
  • It looks like Mojibake - "...the garbled text that is the result of text being decoded using an unintended character encoding." Commented Feb 1, 2023 at 19:03

4 Answers

14

You can get rid of all characters outside the printable ASCII range using String#replaceAll() by replacing the pattern [^\\x20-\\x7e] with an empty string:

a = a.replaceAll("[^\\x20-\\x7e]", "");
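As a minimal, self-contained sketch of that call (the class and method names here are made up for illustration, and the input assumes the odd character in the question is U+FFFD, the Unicode replacement character):

```java
class PrintableAsciiDemo {
    // Removes every character outside the printable ASCII range 0x20-0x7E.
    static String stripNonPrintableAscii(String input) {
        return input.replaceAll("[^\\x20-\\x7e]", "");
    }

    public static void main(String[] args) {
        // Assumed input: the question's string with U+FFFD in the middle.
        String a = "adf\uFFFDsdf";
        System.out.println(stripNonPrintableAscii(a)); // prints "adfsdf"

        // Legitimate non-ASCII letters are removed as well.
        System.out.println(stripNonPrintableAscii("caf\u00e9")); // prints "caf"
    }
}
```

Note that this pattern also drops valid accented letters and tabs or newlines, which may or may not be what you want.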

But this doesn't solve your actual problem; it's more of a workaround. With the given information it's hard to nail down the root cause, but reading either of these articles should help a lot:


6 Comments

Hmm, there seems to be a markdown bug (link 2 isn't correctly parsed), but I can't seem to locate/fix it?
@BalusC: Happens to me all the time (since I link to the Java6 docs a lot), you want to replace the space near the end with %20.
@T.J. yes, that was it, thanks :) BTW: Firefox normally escapes them before pasting, but it didn't happen correctly for some odd reason. I re-created the link and the problem went away.
@BalusC: I find it very ironic that you point to a Joel article... His first article on Unicode was full of errors and misunderstandings: I remember him posting it and thinking "WTF!?". It was a memorable "ah ah, I got it" moment from Joel that was full of errors. It's actually since he posted his first article on Unicode that I started taking everything he ever said, and keeps saying, with a huge grain of salt ;)
@Wiz: That was also one of the reasons I wrote another one myself, to clarify things further in simple terms and with practical examples and solutions. But are there really as many errors in Joel's article as you seem to insinuate?
2

It is hard to answer the question without knowing more of the context.

In general you might have an encoding problem. See The Absolute Minimum Every Software Developer (...) Must Know About Unicode and Character Sets for an overview about character encodings.
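To illustrate how such an encoding problem can arise, here is a small sketch. It assumes, purely as an example, that the text was written as ISO-8859-1 bytes and then decoded as UTF-8; the question does not say where the string actually comes from, and the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    // Encodes text with one charset and decodes the bytes with another.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        String original = "adf\u00e9sdf"; // contains an e with acute accent

        // Mismatched charsets: the byte 0xE9 is not valid UTF-8 here,
        // so the decoder substitutes U+FFFD (the replacement character).
        String garbled = roundTrip(original,
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(garbled.contains("\uFFFD")); // prints "true"

        // Matching charsets: the text survives unchanged.
        String intact = roundTrip(original,
                StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1);
        System.out.println(intact.equals(original)); // prints "true"
    }
}
```

If this is what is happening, the fix is to decode with the charset the data was actually written in, rather than stripping characters afterwards.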


2

Assuming that you want to remove all special characters, you can use the character class \p{Cntrl}. Then you only need to use the following code:

String cleaned = stringWithSpecialCharacters.replaceAll("\\p{Cntrl}", replacement);
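A short sketch of what this class does and does not match may help. In Java's regex engine, without the UNICODE_CHARACTER_CLASS flag, \p{Cntrl} covers only the ASCII control characters and \p{Punct} covers ASCII punctuation. The class and method names below are hypothetical, and the last input assumes the character in the question is U+FFFD.

```java
class CharacterClassDemo {
    // Replaces ASCII control characters (0x00-0x1F and 0x7F).
    static String replaceControl(String input, String replacement) {
        return input.replaceAll("\\p{Cntrl}", replacement);
    }

    // Replaces ASCII punctuation such as ',' and '!'.
    static String replacePunctuation(String input, String replacement) {
        return input.replaceAll("\\p{Punct}", replacement);
    }

    public static void main(String[] args) {
        // Tab and BEL are control characters, so both are removed.
        System.out.println(replaceControl("a\tb\u0007c", "")); // prints "abc"

        // Punctuation needs a different class.
        System.out.println(replacePunctuation("a,b!c", "")); // prints "abc"

        // U+FFFD is not a control character, so the string is unchanged.
        System.out.println(
                replaceControl("adf\uFFFDsdf", "").equals("adf\uFFFDsdf")); // prints "true"
    }
}
```

Since `replaceAll` interprets `$` and `\` in the replacement string, wrap a literal replacement in `java.util.regex.Matcher.quoteReplacement` if it may contain those characters.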

1 Comment

That works if you assume "special characters" means ASCII control characters. In my experience it usually means punctuation, but in this case it's anyone's guess.
0

You can use Unicode escape sequences (such as \u201c [an opening curly quote]) to "avoid" characters that can't be directly used in your source file encoding (which defaults to the default encoding for your platform, but you can change it with the -encoding parameter to javac).
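As a small sketch of that idea (the class and method names are made up for illustration), the source below contains only ASCII characters, yet the resulting strings hold the curly quotes U+201C and U+201D:

```java
class UnicodeEscapeDemo {
    // Wraps text in curly quotes written as Unicode escapes,
    // so the source file itself stays pure ASCII.
    static String quoted(String text) {
        return "\u201c" + text + "\u201d";
    }

    public static void main(String[] args) {
        String s = quoted("hi");
        // Print code points rather than the characters themselves,
        // because console output depends on the terminal's encoding.
        System.out.println(Integer.toHexString(s.codePointAt(0)));              // prints "201c"
        System.out.println(Integer.toHexString(s.codePointAt(s.length() - 1))); // prints "201d"
        System.out.println(s.length());                                         // prints "4"
    }
}
```

The compiler translates the escapes before it parses the string literal, so the result is the same regardless of which -encoding value javac uses.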

4 Comments

source file encoding defaults to the platform default encoding, i.e. usually not UTF-8.
@Michael: Thanks, fixed. I wasn't just inventing that, I wonder what language/environment it actually related to? ;-) Or was it true in 1996 or something...
I doubt that, since UTF-8 wasn't specified until 1993, and Java instead used to have the recommendation to use native2ascii before distributing source code. I'd expect UTF-8 to be the default in some newer systems, though.
@Michael: 1993 is earlier than 1996, and I remember it being all nifty and cool that Java supported these weird Unicode things, so it's possible, though not likely. ;-) (native2ascii, crikey, that's a blast from the past) Thanks, though, the info pre-edit was clearly wrong in 2010 regardless!
