I am trying to download a file from a website using HtmlUnit 2.11. However, I am getting an UnknownHostException. Below are the code and the complete stack trace:

Code:

final WebClient webClient = new WebClient(BrowserVersion.INTERNET_EXPLORER_8);

URL url = new URL("https://340bopais.hrsa.gov/reports");

HtmlPage page = webClient.getPage(url);
HtmlSubmitInput button = page.getElementByName("ContentPlaceHolder1_lnkCEDailyReport");

final HtmlPage page2 = button.click();

Exception Trace:

Exception in thread "main" java.net.UnknownHostException: 340bopais.hrsa.gov
    at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
    at java.net.InetAddress$1.lookupAllHostAddr(Unknown Source)
    at java.net.InetAddress.getAddressesFromNameService(Unknown Source)
    at java.net.InetAddress.getAllByName0(Unknown Source)
    at java.net.InetAddress.getAllByName(Unknown Source)
    at java.net.InetAddress.getAllByName(Unknown Source)
    at org.apache.http.impl.conn.SystemDefaultDnsResolver.resolve(SystemDefaultDnsResolver.java:45)
    at org.apache.http.impl.conn.DefaultClientConnectionOperator.resolveHostname(DefaultClientConnectionOperator.java:278)
    at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:162)
    at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294)
    at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:640)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:479)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
    at com.gargoylesoftware.htmlunit.HttpWebConnection.getResponse(HttpWebConnection.java:171)
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1484)
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1402)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:304)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:373)
    at src.main.java.DataDownloader.main(DataDownloader.java:30)
  • Are you able to ping that URL from your command prompt? Commented Dec 13, 2017 at 13:18
  • It is not able to determine the IP address of the URL: https://340bopais.hrsa.gov/reports Commented Dec 13, 2017 at 13:22
  • @khAn, I tried the following command: ping 340bopais.hrsa.gov. The response was: "Ping request could not find host 340bopais.hrsa.gov. Please check the name and try again." Also, "tracert 340bopais.hrsa.gov" gave the following result: "Unable to resolve target system name 340bopais.hrsa.gov." Commented Dec 13, 2017 at 13:25
  • Why do you use INTERNET_EXPLORER_8? Commented Dec 13, 2017 at 13:34
  • There is a problem with this website's security certificate when the URL is opened in a browser. Commented Dec 13, 2017 at 13:35

1 Answer

Ping (sometimes expanded as "Packet Internet Groper") is a utility that uses ICMP (Internet Control Message Protocol).

HTTPS is an application-layer protocol that runs over TCP (secured with TLS), not over ICMP.

Many network providers and service managers restrict access to their resources to only the necessary protocols and ports.

It is quite likely that the organisation that hosts 340bopais.hrsa.gov has configured firewalls and other network infrastructure to permit only TCP traffic on ports 80 and 443 to their server.
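
Since ICMP and TCP can be filtered independently, a ping result alone says little about whether an HTTPS request can succeed. Here is a minimal sketch, using only the Java standard library, that checks name resolution and a TCP connection to port 443 as two separate steps. The class and method names are hypothetical and not part of HtmlUnit or Selenium; it is written without Java 7 features because Java 6 is mentioned in the comments. Note that the stack trace in the question fails inside InetAddress.getAllByName, which is the call the first check exercises.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.UnknownHostException;

// Hypothetical helper for illustration; not part of HtmlUnit or Selenium.
public class ReachabilityCheck {

    // Returns true if the host name resolves to at least one IP address.
    public static boolean canResolve(String host) {
        try {
            InetAddress.getAllByName(host);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    public static boolean canConnect(String host, int port, int timeoutMillis) {
        Socket socket = new Socket();
        try {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolved host names.
            return false;
        } finally {
            try {
                socket.close();
            } catch (IOException ignored) {
                // Nothing useful to do if closing fails.
            }
        }
    }

    public static void main(String[] args) {
        // Host defaults to the one from the question; pass another as the first argument.
        String host = args.length > 0 ? args[0] : "340bopais.hrsa.gov";
        System.out.println("DNS resolves: " + canResolve(host));
        System.out.println("TCP 443 reachable: " + canConnect(host, 443, 5000));
    }
}
```

If the first line prints false, the failure happens before any firewall rule for port 443 comes into play, so the DNS or proxy configuration of the machine is the place to look.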


Update:

I successfully downloaded the file using Java and Selenium. I put the whole code into a repository that you can download, but here is how to set it up:

  1. Use Eclipse to create a Maven project.

  2. Add a folder called driver to the resources folder.

  3. Download the ChromeDriver executable (chromedriver.exe) and put it into the driver folder.

  4. Add this dependency to your pom.xml:

        <dependency>
            <groupId>org.seleniumhq.selenium</groupId>
            <artifactId>selenium-java</artifactId>
            <version>3.4.0</version>
        </dependency>
    
  5. In the main method, type:

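    // Requires imports: java.io.File, org.openqa.selenium.By,
    // org.openqa.selenium.WebDriver, org.openqa.selenium.chrome.ChromeDriver
    public static void main(String[] args) {
        // Locate chromedriver.exe in the driver folder on the classpath.
        File file = new File(StackApplication.class.getClassLoader()
                .getResource("driver/chromedriver.exe").getFile());
        String driverPath = file.getAbsolutePath();
        System.out.println("Webdriver is in path: " + driverPath);

        // Tell Selenium where the ChromeDriver binary lives.
        System.setProperty("webdriver.chrome.driver", driverPath);

        WebDriver driver = new ChromeDriver();
        driver.navigate().to("https://340bopais.hrsa.gov/reports");

        // Click the headingTwo link first, then the daily report link.
        driver.findElement(By.xpath("//*[@id=\"headingTwo\"]/h4/a")).click();
        driver.findElement(By.xpath("//*[@id=\"ContentPlaceHolder1_lnkCEDailyReport\"]")).click();
    }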
    

And it works like a charm.

8 Comments

That's the next part, @Salman. The code is not able to connect to the website itself.
Hi @Salman, please tell me how I should connect to this website using Java.
Hi @Salman, I haven't worked with Selenium, but you can still post your answer. I will try to implement it.
Thanks @Salman. Looks like you are a Java nerd. I will try your solution tomorrow. I hope it's compatible with Java 6, as that is the latest version available on our server.
Wonderful, @Salman! Can't thank you enough. It worked very well for me. I just had to use a lower version of Selenium. Also, I had to explicitly add the jar for org.springframework.boot.autoconfigure to the Maven dependencies. It's working very well on my machine. I have to run it on the Unix server; I will work on that separately if I face issues there. But yes, you have given me the right solution.