I'm using iTextSharp v5.4.2 in an MVC4 web app. When I try to add the rendered view to the PDF and the page loads a few JavaScript blocks, iTextSharp's HTML parser fails to parse the HTML string.

Is there an alternative way to parse the web page so that it can be converted to PDF with iTextSharp? Please correct me if my approach is wrong.

<script type="text/javascript">

$(document).ready(function(){});

</script> 

<html><table>adsfasdf..</table> some table elements.........</html>

C# code:

PdfWriter writer = PdfWriter.GetInstance(doc, new FileStream(pdfpath + "/abcdtest.pdf", FileMode.Create));

doc.Open();
// Parsing fails here when decodedHtmlElement contains <script> blocks
var parsedHtmlElement = HTMLWorker.ParseToList(new StringReader(decodedHtmlElement), null);
  • <script> tags do not work when converting HTML to PDF. Please don't include JavaScript in the HTML you convert. Commented Jul 22, 2013 at 10:26
  • So is there no other way to parse that page? Please let me know how to ignore the script tags in the HTML string that is passed to the PDF conversion. Commented Jul 22, 2013 at 10:28
  • So you want only the HTML tag output in your PDF, am I right? Commented Jul 22, 2013 at 10:37
  • Yes. It could also be done in C#, filtering the HTML out of the page I get before parsing it. Please let me know, it would be helpful. Thanks in advance. :) Commented Jul 22, 2013 at 10:48
  • I have added the code. Check it. Commented Jul 22, 2013 at 11:39

2 Answers

Use this function: pass your HTML string as HTMLCode and the path where the file should be saved as filePath.

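 // Requires: using System.Collections.Generic; using System.IO;
 // using System.Text.RegularExpressions; using iTextSharp.text;
 // using iTextSharp.text.pdf; using iTextSharp.text.html.simpleparser;
 public void converttopdf(string HTMLCode, string filePath)
 {
        // Strip every <script>...</script> block. The lazy .*? combined with
        // Singleline stops at the first closing tag, so markup that sits
        // between two script blocks is kept.
        HTMLCode = Regex.Replace(HTMLCode, @"<script\b[^>]*>.*?</script\s*>", "",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        Document document = new Document();

        // No empty catch block: a parse or I/O failure should reach the caller.
        using (FileStream stream = new FileStream(filePath, FileMode.Create))
        {
            PdfWriter.GetInstance(document, stream);
            document.Open();

            List<IElement> htmlarraylist = HTMLWorker.ParseToList(new StringReader(HTMLCode), null);
            foreach (IElement element in htmlarraylist)
            {
                document.Add(element);
            }

            document.Close();
        }
 }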
1 Comment

I got one more idea and posted it below. :)

Another way to solve this is to strip the script tags in JavaScript on the client, so that only the HTML is sent to C#, instead of sending everything to C# and replacing the script tags there.

Like this:

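function IgnoreScripts(htmlString)
{
    // Parse the string into a detached element; it is never added to the page
    var div = document.createElement('div');
    div.innerHTML = htmlString;

    // getElementsByTagName returns a live list, so remove from the end
    var scripts = div.getElementsByTagName('script');
    var i = scripts.length;
    while (i--) {
        scripts[i].parentNode.removeChild(scripts[i]);
    }
    return div.innerHTML;
}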
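
If the HTML has to be cleaned somewhere without a DOM (for example in Node.js), a regex version is one option. This is only a rough sketch: stripScriptTags and the sample string are names I made up for illustration, not part of iTextSharp or any library. Regex-based stripping can also be fooled by a literal </script> inside a script string, so the DOM approach above is probably the safer one in a browser.

```javascript
// Hypothetical helper (not from any library): removes <script>...</script>
// blocks from an HTML string without needing a DOM.
function stripScriptTags(htmlString) {
    // <script\b[^>]*>  opening tag with optional attributes
    // [\s\S]*?         script body, matched lazily and across newlines
    // <\/script\s*>    closing tag
    var scriptBlock = /<script\b[^>]*>[\s\S]*?<\/script\s*>/gi;
    return htmlString.replace(scriptBlock, '');
}

// Made-up sample: two script blocks with a table between them
var sample =
    '<script type="text/javascript">var area = 2 * 3; /* note */</script>' +
    '<table><tr><td>kept</td></tr></table>' +
    '<SCRIPT>alert("x");</SCRIPT>';

console.log(stripScriptTags(sample));
// prints: <table><tr><td>kept</td></tr></table>
```

The lazy quantifier is what keeps the table here: a greedy match would run from the first opening tag to the last closing tag and take the markup in between with it.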
