Issue while converting HTML to PDF using iTextSharp for the < symbol

Question

I am trying to convert HTML content to PDF using iTextSharp in a .NET application written in C#. While doing so, my content gets truncated after the '<' symbol. For the conversion I am using the following line:

HTMLWorker.ParseToList(new StringReader(htmlContent), null);

The following code snippet reproduces the problem; it needs a reference to 'itextsharp.dll':

using System;
using System.Collections.Generic;
using System.Linq;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.text.html.simpleparser;
using iTextSharp.text.pdf.draw;

namespace ConsoleApplication2
{
    class Program
    {
        static void Main(string[] args)
        {
            CreatePDF();
        }

        static void CreatePDF()
        {
            string fileName = string.Empty;

            DateTime fileCreationDatetime = DateTime.Now;

            fileName = string.Format("{0}.pdf", fileCreationDatetime.ToString(@"yyyyMMdd") + "_New" + fileCreationDatetime.ToString(@"HHmmss"));

            string pdfPath = "D:\\" + fileName;

            using (FileStream msReport = new FileStream(pdfPath, FileMode.Create))
            {
                //step 1
                using (Document pdfDoc = new Document(PageSize.A4, 10f, 10f, 140f, 10f))
                {
                    try
                    {
                        PdfWriter pdfWriter = PdfWriter.GetInstance(pdfDoc, msReport);
                        //pdfWriter.PageEvent = new ConsoleApplication1.ITextEvents();

                        //open the stream
                        pdfDoc.Open();

                        {
                            Paragraph para = new Paragraph("Hello world. Checking Header Footer", new Font(Font.FontFamily.HELVETICA, 16));
                            para.Alignment = Element.ALIGN_CENTER;
                            pdfDoc.NewPage();
                            string str = "<b>Try<</b>";

                            StringReader TheStream = new StringReader(str.ToString());

                            List<IElement> htmlElementsh = HTMLWorker.ParseToList(TheStream, null);

                            IElement htmlElementh = (IElement)htmlElementsh[0];
                            pdfDoc.Add((Paragraph)htmlElementh);
                        }

                        pdfDoc.Close();

                    }
                    catch (Exception ex)
                    {
                        //handle exception
                    }

                    finally
                    {

                    }
                }
            }
        }
    }
}

Parsing HTML content directly to PDF can cause unexpected behavior; some tags are not supported and cause inappropriate results or a corrupted file. Why is there an extra < after Try?
– Codeek, Commented Dec 19, 2014 at 4:57
I can't understand why you are using IElement to insert a normal string, and why have you created a paragraph 'para' if you are not adding it to pdfDoc? How much data are you getting in your PDF, and what is getting truncated?
– Codeek, Commented Dec 19, 2014 at 5:03
Put your HTML content within a panel: HtmlTextWriter hw = new HtmlTextWriter(sw); panel.RenderControl(hw);
– Manish Goswami, Commented Dec 19, 2014 at 5:08
@Codeek: I am using IElement because it is a requirement of my application. The PDF I am trying to generate has 3-4 pages containing various HTML tables, and you can ignore those first two lines; they are in the actual code. The line gets truncated after '<'. For example, for the line "Line 1 < Testing" I get just "Line 1" in the PDF.
– Vrishali, Commented Dec 19, 2014 at 5:31

mkl · Accepted Answer · 2020-02-04 10:05:23Z

3

Kindly use &lt; instead of < and use &gt; instead of >.
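
As a rough sketch of the advice above: if the text comes from data rather than being typed by hand, one option is to escape it before wrapping it in tags, so that only the literal text is encoded and the markup itself stays intact. The helper names EscapeText and Bold below are hypothetical and not part of iTextSharp. WebUtility.HtmlEncode comes from System.Net (.NET 4.0 and later). This assumes, as the answer implies, that HTMLWorker turns the entities back into the original characters.

```csharp
using System;
using System.Net;

static class HtmlTextEscaper
{
    // Hypothetical helper: encodes characters such as '<', '>' and '&'
    // so the text can be placed between HTML tags without being read as markup.
    public static string EscapeText(string text)
    {
        return WebUtility.HtmlEncode(text);
    }

    // Hypothetical helper: escapes the text first, then adds the real tags around it.
    public static string Bold(string text)
    {
        return "<b>" + EscapeText(text) + "</b>";
    }
}

class Program
{
    static void Main()
    {
        // The string from the question, with the stray '<' escaped.
        Console.WriteLine(HtmlTextEscaper.Bold("Try<"));                   // <b>Try&lt;</b>
        // The example line from the comments.
        Console.WriteLine(HtmlTextEscaper.EscapeText("Line 1 < Testing")); // Line 1 &lt; Testing
    }
}
```

The resulting string can then be passed to HTMLWorker.ParseToList through a StringReader exactly as in the question. Escaping the whole HTML document instead of just the text fragments would also encode the tags, so the encoding has to happen before the markup is assembled.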

edited Feb 4, 2020 at 10:05

mkl


answered Dec 19, 2014 at 7:10

Ravi Kanasagra

