1

I am using HtmlUnit in Java to connect to a remote URL and extract some information from the returned web page.

I am using the following code:

final WebClient webClient = new WebClient(BrowserVersion.INTERNET_EXPLORER_6_0, "companyproxy.server", 8080);
final DefaultCredentialsProvider scp = new DefaultCredentialsProvider();
scp.addProxyCredentials("username", "password","companyproxy.server",8080);
webClient.setCredentialsProvider(scp);

final URL url = new URL("http://htmlunit.sourceforge.net");
final HtmlPage page = (HtmlPage)webClient.getPage(url);
System.out.println(page.asXml());

After providing details for the proxy server I am getting this error message:

SEVERE: Credentials cannot be used for NTLM authentication:
org.apache.commons.httpclient.UsernamePasswordCredentials
org.apache.commons.httpclient.auth.InvalidCredentialsException: Credentials cannot be used for NTLM authentication: org.apache.commons.httpclient.UsernamePasswordCredentials
    at org.apache.commons.httpclient.auth.NTLMScheme.authenticate(NTLMScheme.java:332)
    at org.apache.commons.httpclient.HttpMethodDirector.authenticateProxy(HttpMethodDirector.java:320)
    at org.apache.commons.httpclient.HttpMethodDirector.authenticate(HttpMethodDirector.java:232)
    at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(HttpMethodDirector.java:170)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:397)
    at org.apache.commons.httpclient.HttpClient.executeMethod(HttpClient.java:346)
    at com.gargoylesoftware.htmlunit.HttpWebConnection.getResponse(HttpWebConnection.java:97)
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1477)
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1435)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:327)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:388)
    at com.test.Test.main(Test.java:25)
Jun 5, 2009 9:28:35 AM org.apache.commons.httpclient.HttpMethodDirector processProxyAuthChallenge
INFO: Failure authenticating with NTLM <any realm>@companyproxy.server:8080
Jun 5, 2009 9:28:35 AM com.gargoylesoftware.htmlunit.WebClient printContentIfNecessary
INFO: statusCode=[407] contentType=[text/html]
Jun 5, 2009 9:28:35 AM com.gargoylesoftware.htmlunit.WebClient printContentIfNecessary
INFO: <HTML><HEAD>
<TITLE>Access Denied</TITLE>
</HEAD>

....

Exception in thread "main" com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException:

407 Proxy Authentication Required for http://htmlunit.sourceforge.net/
    at com.gargoylesoftware.htmlunit.WebClient.throwFailingHttpStatusCodeExceptionIfNecessary(WebClient.java:535)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:332)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:388)
    at com.test.Test.main(Test.java:25)

Can you please provide some information on what is causing this?

5 Answers 5

5

I had the same problem and found a solution on the web. Forget setCredentialsProvider() and use this instead:

String userAndPassword = username + ":" + password;
String userAndPasswordBase64 = Base64.encodeBase64String(userAndPassword.getBytes());
webClient.addRequestHeader("Proxy-Authorization", "Basic "+userAndPasswordBase64);

This Base64 class is from Apache Commons Codec.

I used the following to set the proxy host and port, but your way is probably fine too.

webClient.getProxyConfig().setProxyHost(proxyHost);
webClient.getProxyConfig().setProxyPort(proxyPort);
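
For readers without Commons Codec on the classpath, here is a minimal sketch of the same header construction using only the JDK. It assumes Java 8 or later (for java.util.Base64); the class and method names are hypothetical and chosen only for illustration. Note that this can only help if the proxy accepts the Basic scheme.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class, not part of HtmlUnit or Commons Codec.
class ProxyAuthHeader {

    // Builds the value of a "Proxy-Authorization" header for the Basic scheme:
    // the literal "Basic " followed by base64("username:password").
    static String basicProxyAuth(String username, String password) {
        String userAndPassword = username + ":" + password;
        // An explicit charset avoids depending on the platform default.
        String encoded = Base64.getEncoder()
                .encodeToString(userAndPassword.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Prints: Basic dXNlcjpwYXNz
        System.out.println(basicProxyAuth("user", "pass"));
    }
}
```

The returned string would then be passed as the second argument to webClient.addRequestHeader("Proxy-Authorization", ...), exactly as in the snippet above.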

1 Comment

This solution seems to work for HTTP proxies. Does the header also get forwarded to the websites the WebClient visits? I'll test it later, but if you have already tested it, let me know.
2

Although you have not put the full stack trace in, I am guessing that the error is being thrown on the line:

final HtmlPage page = (HtmlPage)webClient.getPage(url);

This is because the getPage call is returning an UnexpectedPage rather than an HtmlPage. According to the documentation for UnexpectedPage, this happens when the response comes back with a Content-Type other than text/html, so HtmlUnit does not know how to handle it. You should turn up the logging level and look at what is actually coming back to track down the error.

1

I was not able to get HtmlUnit to do NTLM authentication against the proxy server. When I used HttpClient directly (HtmlUnit is built on top of it) and configured the proxy with NTLM authentication, it worked. Here is the code:

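// Commons HttpClient 3.x (org.apache.commons.httpclient.*)
HttpClient client = new HttpClient();
client.getHostConfiguration().setProxy("companyproxy.server", 8080);

// Prefer NTLM when the proxy offers several authentication schemes.
List<String> authPrefs = new ArrayList<String>();
authPrefs.add(AuthPolicy.NTLM);

// NTLM requires NTCredentials (user, password, client host, domain)
// rather than UsernamePasswordCredentials.
client.getState().setProxyCredentials(
    new AuthScope(null, 8080, null),
    new NTCredentials("username", "pwd", "", "DOMAIN"));

client.getParams().setParameter(AuthPolicy.AUTH_SCHEME_PRIORITY, authPrefs);

// GetMethod takes the target address as a String.
GetMethod method = new GetMethod(url);

method.getParams().setParameter(HttpMethodParams.RETRY_HANDLER,
        new DefaultHttpMethodRetryHandler(3, false));

// Send the request through the proxy, then release the connection.
try {
    int status = client.executeMethod(method);
    System.out.println(status + ": " + method.getResponseBodyAsString());
} finally {
    method.releaseConnection();
}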

0

As Rob said, HtmlUnit is not able to detect that the response is an HTML page.

Please post a sample to the user list so we can investigate further.

0

With HtmlUnit 2.14, this works for me:

    DefaultCredentialsProvider cp = (DefaultCredentialsProvider) client.getCredentialsProvider();
    cp.addNTLMCredentials(proxyUser, proxyPassword, proxyHost, proxyPort, null, domain);
