
I have a website from which I want to fetch a video link using Jsoup, but I'm unable to do so: my program throws an error. Can somebody please help me?

Here is the code:

import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
public class MovMaker {
  public static void main(String[] args) {
    try {
      String url = "http://www.tamilyogi.tv/7aum-arivu-2011-hd-720p-tamil-movie-watch-online/";
      Document doc = Jsoup.connect(url).get();
      Element vid = doc.getElementsByTag("video").get(0);
      System.out.println("\nlink: " + vid.attr("src"));
      System.out.println("text: " + vid.text());
    } catch (IOException e) {
      e.printStackTrace();
    }
  }
}

My Error:

Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
    at java.util.ArrayList.rangeCheck(Unknown Source)
    at java.util.ArrayList.get(Unknown Source)
    at MovMaker.main(MovMaker.java:16)

The page source where I want to fetch the data from is: Here

I'm completely new to Java and Jsoup. I would be thankful if someone could give me the code.

Regards, Bhuvanesh

1
  • Please consider accepting my answer if your problem is solved. If not, please specify what exactly the problem is now. Commented Feb 8, 2016 at 9:51

2 Answers

1

There is no <video> tag in the directly loaded HTML of the link you have given. The tag is instead created by JavaScript in the browser. Since Jsoup does not run any JavaScript, you are out of luck here: getElementsByTag("video") finds nothing, so get(0) throws the IndexOutOfBoundsException you see.

What you can do is either use a tool that does execute JavaScript, or analyze the contents of the HTML, and maybe the network traffic that happens in the browser when you load the site, to find out whether you can construct the link from that information by hand. In your case I took a quick look at the HTML and found that the video tag is generated within an iframe. In the source of the iframe you find this part:

<script type="text/javascript">  jwplayer("vplayer").setup({
    sources: [{file:"http://cdn7.vidmad.tv/h7todtdxamlbu3tf6rutlihpzoz4di2fcsaje74hlrcqda7qibjmlb4vblxq/v.mp4",label:"720p"},{file:"http://cdn7.vidmad.tv/h7todtdxamlbu3tf6rutlihpzoz4di2fcsaje74hljcqda7qibjjd3opruyq/v.mp4",label:"360p","default": "true"},{file:"http://cdn7.vidmad.tv/h7todtdxamlbu3tf6rutlihpzoz4di2fcsaje74hlbcqda7qibjgcvfli2eq/v.mp4",label:"240p"}],
    image: "http://cdn7.vidmad.tv/i/01/00000/cjwf05thn2vm.jpg",
    duration:"9607",
    width: "100%",
    height: "350",
    aspectratio: "16:9",
    preload: "none",
    androidhls: "true",
    startparam: "start"

    ,tracks: []
    ,skin: "glow",abouttext:"VidMAD", aboutlink:"http://vidmad.tv"
  });

...

</script>

So the URL is part of a <script> tag. You can use regular expressions to get it:

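import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// The statements below belong inside a method that handles IOException.
Document doc = Jsoup.connect("http://www.tamilyogi.tv/7aum-arivu-2011-hd-720p-tamil-movie-watch-online/")
        .userAgent("Mozilla/5.0")
        .get();

// The player is embedded through an iframe whose src contains "embed".
Element iframeEl = doc.select("iframe[src*=embed]").first();
if (iframeEl != null) {
    Document frameDoc = Jsoup.connect(iframeEl.attr("src"))
            .userAgent("Mozilla/5.0")
            .get();
    // Captures the first file URL listed in the jwplayer "sources" array.
    Pattern p = Pattern.compile("sources:\\s*\\[\\{file:\"([^\"]+)");
    Elements scriptEls = frameDoc.select("script");
    for (Element scriptEl : scriptEls) {
        Matcher m = p.matcher(scriptEl.html());
        if (m.find()) {
            String link = m.group(1);
            System.out.println(link);
            break;
        }
    }
}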

Of course my solution above only works for this site and link. You may need to adapt the approach to fit your needs, but the general idea should be clear now.
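If you want more than the first link, for example to pick a specific quality, the same regular-expression idea can be extended to collect every entry of the sources array. This is only a sketch using the plain JDK: the class name SourceExtractor, the method extractSources and the cdn.example URLs are made up for illustration, and the pattern assumes each entry looks like {file:"...",label:"..."} as in the script shown above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jsoup or jwplayer.
class SourceExtractor {
    // Matches one {file:"...",label:"..."} entry; assumes file comes directly before label.
    private static final Pattern SOURCE =
            Pattern.compile("\\{file:\"([^\"]+)\",label:\"([^\"]+)\"");

    /** Returns a label -> URL map for every source entry found in the script text. */
    static Map<String, String> extractSources(String script) {
        Map<String, String> byLabel = new LinkedHashMap<>();
        Matcher m = SOURCE.matcher(script);
        while (m.find()) {
            byLabel.put(m.group(2), m.group(1));
        }
        return byLabel;
    }

    public static void main(String[] args) {
        // Shortened, made-up stand-in for the text returned by scriptEl.html().
        String script = "jwplayer(\"vplayer\").setup({ sources: ["
                + "{file:\"http://cdn.example/a/v.mp4\",label:\"720p\"},"
                + "{file:\"http://cdn.example/b/v.mp4\",label:\"360p\",\"default\": \"true\"},"
                + "{file:\"http://cdn.example/c/v.mp4\",label:\"240p\"}] });";
        extractSources(script)
                .forEach((label, url) -> System.out.println(label + " -> " + url));
    }
}
```

In the loop above you would pass scriptEl.html() to extractSources and then look up the label you want, for instance "360p". If the site changes the order of the keys inside an entry, the pattern has to be adapted.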


0

Document doc = Jsoup.connect("http://www.tamilyogi.tv/7aum-arivu-2011-hd-720p-tamil-movie-watch-online/")
        .userAgent("Mozilla/5.0")
        .get();
Element iframeEl = doc.select("iframe").first();
if (iframeEl != null) {
    // absUrl resolves the src attribute against the page URL
    System.out.println(iframeEl.absUrl("src"));
}

Hope this works for you.
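The reason for absUrl("src") instead of attr("src") is that attr returns the attribute exactly as written in the HTML, which may be a relative or protocol-relative URL, while absUrl resolves it against the document's base URI. Roughly the same resolution can be sketched with java.net.URI from the JDK. The class name AbsUrlSketch and the embed paths below are hypothetical examples, not values taken from the real page.

```java
import java.net.URI;

// Hypothetical illustration of URL resolution; Jsoup is not used here.
class AbsUrlSketch {
    /** Resolves a possibly relative src value against the URL of the page it came from. */
    static String resolve(String pageUrl, String src) {
        return URI.create(pageUrl).resolve(src).toString();
    }

    public static void main(String[] args) {
        String page = "http://www.tamilyogi.tv/7aum-arivu-2011-hd-720p-tamil-movie-watch-online/";
        // Made-up src values covering the three common forms.
        System.out.println(resolve(page, "/embed-abc123.html"));                 // root-relative
        System.out.println(resolve(page, "//vidmad.tv/embed-abc123.html"));      // protocol-relative
        System.out.println(resolve(page, "http://vidmad.tv/embed-abc123.html")); // already absolute
    }
}
```

An absolute URL is what you need if you pass the iframe address on to another Jsoup.connect call.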
