How to do URL decoding in Java?

Question

In Java, I want to convert this:

https%3A%2F%2Fmywebsite%2Fdocs%2Fenglish%2Fsite%2Fmybook.do%3Frequest_type

To this:

https://mywebsite/docs/english/site/mybook.do?request_type

This is what I have so far:

class StringUTF 
{
    public static void main(String[] args) 
    {
        try{
            String url = 
               "https%3A%2F%2Fmywebsite%2Fdocs%2Fenglish%2Fsite%2Fmybook.do" +
               "%3Frequest_type%3D%26type%3Dprivate";

            System.out.println(url+"Hello World!------->" +
                new String(url.getBytes("UTF-8"),"ASCII"));
        }
        catch(Exception E){
        }
    }
}

But it doesn't work right. What are these %3A and %2F formats called and how do I convert them?

Just because the URL can be UTF-8 doesn't mean the question has anything to do with UTF-8. I've edited the question suitably.
– C. K. Young, Commented May 26, 2011 at 12:19
It could be (in theory), but the string in your example is not a UTF-8 encoded String. It is a URL-encoded ASCII string. Hence the title is misleading.
– Stephen C, Commented May 26, 2011 at 12:20
It is also worth noting that all the characters in the url string are ASCII, and this is also true after the string has been URL decoded. '%' is an ASCII char and %xx represents an ASCII char if xx is less than (hexadecimal) 80.
– Stephen C, Commented May 26, 2011 at 12:34

kryger · Accepted Answer · 2019-02-04 13:07:11Z

775

This does not have anything to do with character encodings such as UTF-8 or ASCII. The string you have there is URL encoded. This kind of encoding is something entirely different than character encoding.

Try something like this:

try {
    String result = java.net.URLDecoder.decode(url, StandardCharsets.UTF_8.name());
} catch (UnsupportedEncodingException e) {
    // not going to happen - value came from JDK's own StandardCharsets
}

Java 10 added direct support for Charset to the API, meaning there's no need to catch UnsupportedEncodingException:

String result = java.net.URLDecoder.decode(url, StandardCharsets.UTF_8);
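For reference, here is a minimal self-contained sketch of the Java 10+ call applied to the string from the question. The class name `UrlDecodeExample` and the helper `decode` are made up for this illustration; only `URLDecoder.decode(String, Charset)` comes from the JDK.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this illustration.
class UrlDecodeExample {

    // Decodes %xx escapes (and '+' as a space) using UTF-8. Requires Java 10+.
    static String decode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String url =
            "https%3A%2F%2Fmywebsite%2Fdocs%2Fenglish%2Fsite%2Fmybook.do"
            + "%3Frequest_type%3D%26type%3Dprivate";

        // prints https://mywebsite/docs/english/site/mybook.do?request_type=&type=private
        System.out.println(decode(url));
    }
}
```

On Java 9 and earlier, the two-argument `decode(String, String)` form with a try/catch shown above is the equivalent.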

Note that a character encoding (such as UTF-8 or ASCII) is what determines the mapping of characters to raw bytes. For a good intro to character encodings, see this article.
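The charset argument still matters once the escaped bytes are outside the ASCII range, because it decides how those bytes are turned back into characters. A small sketch, assuming Java 10+ and an illustrative input `caf%C3%A9` that is not from the question (the class name is hypothetical):

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this illustration.
class CharsetMattersExample {

    // Same URL decoding step, parameterised by the charset used to
    // interpret the decoded bytes.
    static String decode(String encoded, Charset charset) {
        return URLDecoder.decode(encoded, charset);
    }

    public static void main(String[] args) {
        // %C3%A9 is the two-byte UTF-8 encoding of U+00E9 (e with acute accent).
        String encoded = "caf%C3%A9";

        String asUtf8 = decode(encoded, StandardCharsets.UTF_8);
        String asLatin1 = decode(encoded, StandardCharsets.ISO_8859_1);

        System.out.println(asUtf8.equals("caf\u00e9"));         // true: one character
        System.out.println(asLatin1.equals("caf\u00c3\u00a9")); // true: two characters
        System.out.println(asUtf8.length() + " vs " + asLatin1.length()); // 4 vs 5
    }
}
```

For the all-ASCII string in the question, both charsets give the same result.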

edited Feb 4, 2019 at 13:07

kryger

13.2k8 gold badges47 silver badges68 bronze badges

answered May 26, 2011 at 12:04

Jesper

208k47 gold badges325 silver badges361 bronze badges


10 Comments

laz Over a year ago

The methods on URLDecoder are static so you don't have to create a new instance of it.

Jesper Over a year ago

@Trismegistos Only the version where you don't specify the character encoding (the second parameter, "UTF-8") is deprecated according to the Java 7 API documentation. Use the version with two parameters.

Shahar Over a year ago

If using java 1.7+ you can use the static version of the "UTF-8" string: StandardCharsets.UTF_8.name() from this package: java.nio.charset.StandardCharsets. Relevant to this: link

Michal Over a year ago

Be careful with this. As noted here: blog.lunatech.com/2009/02/03/… This is not about URLs, but for HTML form encoding.

Evgeny Bovykin Over a year ago

Doesn't work if there is a '+' in url. See bugs.openjdk.java.net/browse/JDK-8179507


Alexander Pogrebnyak · Accepted Answer · 2011-05-26 12:02:41Z

82

The string you've got is in application/x-www-form-urlencoded encoding.

Use URLDecoder to convert it to a Java String:

URLDecoder.decode( url, "UTF-8" );

answered May 26, 2011 at 12:02

Alexander Pogrebnyak

45.8k10 gold badges111 silver badges123 bronze badges

Comments

Ilya Serbis · Accepted Answer · 2020-01-19 13:18:46Z

59

This has been answered before (although this question was first!):

"You should use java.net.URI to do this, as the URLDecoder class does x-www-form-urlencoded decoding which is wrong (despite the name, it's for form data)."

As URL class documentation states:

The recommended way to manage the encoding and decoding of URLs is to use URI, and to convert between these two classes using toURI() and URI.toURL().

The URLEncoder and URLDecoder classes can also be used, but only for HTML form encoding, which is not the same as the encoding scheme defined in RFC2396.

Basically:

String url = "https%3A%2F%2Fmywebsite%2Fdocs%2Fenglish%2Fsite%2Fmybook.do%3Frequest_type";
System.out.println(new java.net.URI(url).getPath());

will give you:

https://mywebsite/docs/english/site/mybook.do?request_type
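One place where the difference between form decoding and URI decoding shows up is the plus sign. The sketch below uses an illustrative input that is not from the question, and uses `URI.create` instead of the constructor only to avoid the checked `URISyntaxException`. The class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this illustration.
class PlusHandlingExample {

    // application/x-www-form-urlencoded decoding: '+' becomes a space.
    static String formDecode(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    // URI path decoding: only %xx escapes are decoded, '+' is left alone.
    static String uriPathDecode(String s) {
        return URI.create(s).getPath();
    }

    public static void main(String[] args) {
        String encoded = "a+b%2Bc%20d";

        System.out.println(formDecode(encoded));    // a b+c d
        System.out.println(uriPathDecode(encoded)); // a+b+c d
    }
}
```

Which result is correct depends on where the string came from: form data and query strings built by HTML forms use `+` for spaces, while a URL path does not.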

edited Jan 19, 2020 at 13:18

Ilya Serbis

22.5k8 gold badges92 silver badges78 bronze badges

answered May 9, 2013 at 3:07

Nick Grealy

26.2k9 gold badges112 silver badges127 bronze badges

8 Comments

Aaron Over a year ago

In Java 1.7 the URLDecoder.decode(String, String) overload is not deprecated. You must be referring to the URLDecoder.decode(String) overload without the encoding. You might want to update your post for clarification.

Emerson Farrugia Over a year ago

This answer is misleading; that block quote has nothing to do with the deprecation. The Javadoc of the deprecated method states, and I actually quote

@deprecated The resulting string may vary depending on the platform's default encoding. Instead, use the decode(String,String) method to specify the encoding.

Pelpotronic Over a year ago

getPath() for URIs only returns the path part of the URI, as noted above.

Pelpotronic Over a year ago

Unless I'm mistaken, the "path" is known to be that part of a URI after the authority part (see: en.wikipedia.org/wiki/Uniform_Resource_Identifier for definition of path) - it seems to me the behaviour I am seeing is the standard/correct behaviour. I'm using java 1.8.0_101 (on Android Studio). I'd be curious to see what you get when getAuthority() is called. Even this article/example seems to indicate that path is only the /public/manual/appliances part of their URI: quepublishing.com/articles/article.aspx?p=26566&seqNum=3

crow Over a year ago

@Pelpotronic The code in the post actually does print the output that it shows (at least for me). I think the reason for this is that, because of the URL encoding, the URI constructor is actually treating the entire string, (https%3A%2F...), as just the path of a URI; there is no authority, or query, etc. This can be tested by calling the respective get methods on the URI object. If you pass the decoded text to the URI constructor: new URI("https://mywebsite/do....."), then calling getPath() and other methods will give correct results.
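The behaviour described in this comment can be checked with a short sketch that prints the components of both forms of the URL. The class name and the `describe` helper are hypothetical, and `URI.create` is used only to avoid the checked exception.

```java
import java.net.URI;

// Hypothetical class name, used only for this illustration.
class UriComponentsExample {

    // Summarises the decoded components that java.net.URI parsed out of s.
    static String describe(String s) {
        URI uri = URI.create(s);
        return "scheme=" + uri.getScheme()
            + ", authority=" + uri.getAuthority()
            + ", path=" + uri.getPath()
            + ", query=" + uri.getQuery();
    }

    public static void main(String[] args) {
        // Fully percent-encoded: no literal ':', '/' or '?', so everything is path.
        System.out.println(describe(
            "https%3A%2F%2Fmywebsite%2Fdocs%2Fenglish%2Fsite%2Fmybook.do%3Frequest_type"));
        // scheme=null, authority=null, path=https://mywebsite/docs/english/site/mybook.do?request_type, query=null

        // Already decoded: the usual components are recognised.
        System.out.println(describe(
            "https://mywebsite/docs/english/site/mybook.do?request_type"));
        // scheme=https, authority=mywebsite, path=/docs/english/site/mybook.do, query=request_type
    }
}
```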


Eric Leschinski · Accepted Answer · 2013-04-09 00:33:53Z

19

%3A and %2F are URL-encoded characters. Use this Java code to convert them back into : and /

String decoded = java.net.URLDecoder.decode(url, "UTF-8");

edited Apr 9, 2013 at 0:33

Eric Leschinski

155k96 gold badges423 silver badges337 bronze badges

answered May 26, 2011 at 12:03

laz

28.7k6 gold badges56 silver badges50 bronze badges

2 Comments

vuhung3990 Over a year ago

It doesn't convert %2C either; that's a comma (,).

dNurb Over a year ago

this needs to be wrapped in a try/catch block.. read more about checked exceptions (this one) vs unchecked stackoverflow.com/questions/6115896/…

Ronak Poriya · Accepted Answer · 2015-06-16 07:12:51Z

8

public String decodeString(String url) {
    String urlString = "";
    try {
        urlString = URLDecoder.decode(url, "UTF-8");
    } catch (UnsupportedEncodingException e) {
        // Cannot happen: every Java platform is required to support UTF-8.
    }
    return urlString;
}

answered Jun 16, 2015 at 7:12

Ronak Poriya

2,5193 gold badges21 silver badges20 bronze badges

1 Comment

abarisone Over a year ago

Could you please elaborate on your answer by adding a little more description of the solution you provide?

Hsm · Accepted Answer · 2014-11-14 17:44:50Z

7

try {
    String result = URLDecoder.decode(urlString, "UTF-8");
} catch (UnsupportedEncodingException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}

answered Nov 14, 2014 at 17:44

Hsm

1,54017 silver badges16 bronze badges

Comments

Sorter · Accepted Answer · 2014-08-10 12:31:02Z

6

I use Apache Commons Codec (org.apache.commons.codec.net.URLCodec):

String decodedUrl = new URLCodec().decode(url);

The default charset is UTF-8.

answered Aug 10, 2014 at 12:31

Sorter

10.3k6 gold badges71 silver badges75 bronze badges

Comments

rinuthomaz · Accepted Answer · 2014-10-15 04:29:28Z

import java.io.UnsupportedEncodingException;
import java.net.URISyntaxException;

public class URLDecoding {

    // "You should use java.net.URI to do this, as the URLDecoder class does
    // x-www-form-urlencoded decoding which is wrong (despite the name, it's for form data)."
    public String decodeMethod(String url) throws UnsupportedEncodingException {
        return java.net.URLDecoder.decode(url, "UTF-8");
    }

    public String getPathMethod(String url) throws URISyntaxException {
        return new java.net.URI(url).getPath();
    }

    public static void main(String[] args) throws UnsupportedEncodingException, URISyntaxException {
        System.out.println("Here is your decoded url with decode method: "
            + new URLDecoding().decodeMethod("https%3A%2F%2Fmywebsite%2Fdocs%2Fenglish%2Fsite%2Fmybook.do%3Frequest_type"));
        System.out.println("Here is your decoded url with getPath method: "
            + new URLDecoding().getPathMethod("https%3A%2F%2Fmywebsite%2Fdocs%2Fenglish%2Fsite%2Fmybook.do%3Frequest"));
    }
}

You can select your method wisely :)

Selva R · Accepted Answer · 2021-05-30 12:42:35Z

2

If the value is an integer, we also have to catch NumberFormatException.

try {
    Integer result = Integer.valueOf(URLDecoder.decode(urlNumber, "UTF-8"));
} catch (NumberFormatException | UnsupportedEncodingException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}

answered May 30, 2021 at 12:42

Selva R

413 bronze badges

Comments

x7BiT · Accepted Answer · 2020-04-14 13:20:26Z

1

Using java.net.URI class:

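public String getDecodedURL(String encodedUrl) {
    try {
        URI uri = new URI(encodedUrl);
        String scheme = uri.getScheme();
        String rest = uri.getSchemeSpecificPart();
        // A fully percent-encoded input (https%3A%2F%2F...) has no literal ':',
        // so getScheme() returns null and the decoded text is all in 'rest'.
        return scheme == null ? rest : scheme + ":" + rest;
    } catch (Exception e) {
        return "";
    }
}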

Please note that the exception handling could be better, but that's not very relevant for this example.

answered Apr 14, 2020 at 13:20

x7BiT

4964 silver badges5 bronze badges

Comments

salam_verdim_alana_panyatkasi · Accepted Answer · 2022-09-30 11:42:13Z

-1

I had this problem too and came here looking for an answer. The code from the accepted answer didn't work for me, but decoding twice did, so I'm sharing the following line in case it helps.

URLDecoder.decode(URLDecoder.decode(url, StandardCharsets.UTF_8), StandardCharsets.UTF_8)
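A likely reason a second pass is needed is that the input was encoded twice, so each `%` was itself escaped as `%25`. The sketch below illustrates that case with a made-up double-encoded input; the class and method names are hypothetical and Java 10+ is assumed.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this illustration.
class DoubleDecodeExample {

    static String decodeOnce(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    // Only appropriate when the input is known to have been encoded twice.
    static String decodeTwice(String s) {
        return decodeOnce(decodeOnce(s));
    }

    public static void main(String[] args) {
        // "https://mywebsite" encoded twice: ':' -> %3A -> %253A, '/' -> %2F -> %252F.
        String doubleEncoded = "https%253A%252F%252Fmywebsite";

        System.out.println(decodeOnce(doubleEncoded));  // https%3A%2F%2Fmywebsite
        System.out.println(decodeTwice(doubleEncoded)); // https://mywebsite
    }
}
```

Decoding twice is not a general fix: on input that was encoded only once, the second pass can alter data that legitimately contains a percent or plus sign, so it is safer to find out where the extra encoding happens.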

answered Sep 30, 2022 at 11:42

salam_verdim_alana_panyatkasi

3834 silver badges12 bronze badges
