Convert HTML content to PDF

How do you convert HTML content to PDF? This question once seemed very intricate to me, but after finding the solution it turned out to be a piece of cake. Here I provide a solution with C# code so that you don’t have to waste time searching here and there.

I have used iTextSharp.dll (version 4.0.6.0) in my solution, which is very useful indeed. You can get it from here.

Review the code below, in which I convert the content of a GridView to HTML and then convert that HTML into a PDF file.

I have added comments to the code to explain each step.
//Include the following namespaces to use the iTextSharp library
using System;
using System.Collections;
using System.IO;
using System.Text;
using System.Web.UI;
using iTextSharp.text;
using iTextSharp.text.pdf;

//Document is a class provided by iTextSharp; the numbers are the left, right, top and bottom margins
Document document = new Document(PageSize.A4, 80, 50, 30, 65);
//Physical path of the HTML file which will be generated from the GridView content
string strHTMLpath = Server.MapPath("MyHTML.html");
//Physical path of the PDF file which will be generated from the HTML content
string strPDFpath = Server.MapPath("MyPDF.pdf");
try
{
    StringWriter sw = new StringWriter();
    sw.WriteLine(Environment.NewLine);
    sw.WriteLine(Environment.NewLine);
    sw.WriteLine(Environment.NewLine);
    sw.WriteLine(Environment.NewLine);
    HtmlTextWriter htw = new HtmlTextWriter(sw);
    //gvSerchResult is the GridView whose content is converted to HTML and then to the final PDF file
    /*AllowPaging and AllowSorting are set to false and the GridView is bound again, so that its whole content is rendered as a single table without paging and sorting.*/
    gvSerchResult.AllowPaging = false;
    gvSerchResult.AllowSorting = false;
    BindGridView();
    //Render the GridView into the HtmlTextWriter
    gvSerchResult.RenderControl(htw);
    /*The whole content of the GridView is now in the HtmlTextWriter, so AllowPaging and AllowSorting are set back to true and the GridView is bound again to restore its original form.*/
    gvSerchResult.AllowPaging = true;
    gvSerchResult.AllowSorting = true;
    BindGridView();
    //Write the HTML file; the using block closes the writer and releases the file
    using (StreamWriter strWriter = new StreamWriter(strHTMLpath, false, Encoding.UTF8))
    {
        strWriter.Write("<html><head><link href=\"Style.css\" rel=\"stylesheet\" type=\"text/css\" /></head><body>" + htw.InnerWriter.ToString() + "</body></html>");
    }
    iTextSharp.text.html.simpleparser.StyleSheet styles = new iTextSharp.text.html.simpleparser.StyleSheet();
    styles.LoadTagStyle("ol", "leading", "16,0");
    PdfWriter.GetInstance(document, new FileStream(strPDFpath, FileMode.Create));
    document.Add(new Header(iTextSharp.text.html.Markup.HTML_ATTR_STYLESHEET, "Style.css"));
    document.Open();
    ArrayList objects;
    styles.LoadTagStyle("li", "face", "garamond");
    styles.LoadTagStyle("span", "size", "8px");
    styles.LoadTagStyle("body", "font-family", "times new roman");
    styles.LoadTagStyle("body", "font-size", "10px");
    document.NewPage();
    //Read the HTML file in the encoding it was written in (UTF-8); the using block closes the reader so that the file is not left in use
    using (StreamReader reader = new StreamReader(strHTMLpath, Encoding.UTF8))
    {
        objects = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(reader, styles);
    }
    for (int k = 0; k < objects.Count; k++)
    {
        document.Add((IElement)objects[k]);
    }
}
catch (Exception)
{
    //"throw;" rethrows the exception without resetting its stack trace
    throw;
}
finally
{
    //Close the document so that the PDF file is completely written
    document.Close();
}
//Send the PDF only when it was generated without an exception.
//strPDFpath is already a physical path, so it must not be passed to Server.MapPath again.
Response.ClearContent();
Response.ClearHeaders();
Response.AddHeader("Content-Disposition", "attachment; filename=" + Path.GetFileName(strPDFpath));
Response.ContentType = "application/octet-stream";
Response.WriteFile(strPDFpath);
Response.Flush();
Response.Close();
//Delete the generated file once it has been sent
if (File.Exists(strPDFpath))
{
    File.Delete(strPDFpath);
}
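
The HTML file is written with Encoding.UTF8, so it is safest to read it back with the same encoding; otherwise non-ASCII text may come out garbled in the PDF. Here is a minimal sketch of that round trip, using only System.IO. The class and method names (HtmlFileRoundTrip, WriteThenRead) are my own for illustration and are not part of iTextSharp.

```csharp
using System;
using System.IO;
using System.Text;

public static class HtmlFileRoundTrip
{
    // Hypothetical helper: writes html to path as UTF-8, then reads it back
    // with the same encoding. Both using blocks release the file when done.
    public static string WriteThenRead(string path, string html)
    {
        using (StreamWriter writer = new StreamWriter(path, false, Encoding.UTF8))
        {
            writer.Write(html);
        }
        using (StreamReader reader = new StreamReader(path, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

In the conversion code, the StreamReader opened this way would be handed to HTMLWorker.ParseToList instead of calling ReadToEnd.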

Did you find this article useful? Please leave your valuable comments.

43 Responses to “Convert HTML content to PDF”

  1. Great article, it helped me a lot. Thank you very much, and keep mailing me about your good work. Thanks a lot once again.

  2. Fantastic Article !!!!,
    I have checked so many articles on this topic & this certainly is the best.

    Thanks a lot & keep up the good work….

  3. Hi!
    I used your code and didn’t change even one letter. The build succeeded, but when I browse the page I get an error:
    “‘~/C:\Inetpub\wwwroot\FinfoBase\Admin\FINFOBASE_Form\MyPDF.pdf’ is not a valid virtual path”. C:\Inetpub\wwwroot\FinfoBase is where I placed my project.

    Source error at line: Response.Write(Server.MapPath("~/" + strPDFpath));

    Please help me!

  4. Hi Nam,

    You need to pass only a virtual path to the Server.MapPath() method, for example "~/FinfoBase/Admin/…../MyPDF.pdf".

    Therefore, use code like
    Server.MapPath("~/FinfoBase/Admin/…../MyPDF.pdf")

    I hope this helps to solve your problem.
    Also search the web for how to provide a virtual path to this method.
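
    A minimal sketch of the idea, assuming every path you handle is either an application-relative virtual path starting with "~/" or an already mapped physical path. PathHelper and ToPhysicalPath are hypothetical names, not part of ASP.NET; the mapPath argument stands in for Server.MapPath so that the helper can be tried outside a web page.

    ```csharp
    using System;

    public static class PathHelper
    {
        // Hypothetical helper: maps only application-relative virtual paths ("~/...").
        // Any other value is assumed to be a physical path already and is returned
        // unchanged, so it is never passed through the mapping function a second time.
        public static string ToPhysicalPath(string path, Func<string, string> mapPath)
        {
            if (path.StartsWith("~/", StringComparison.Ordinal))
            {
                return mapPath(path);
            }
            return path;
        }
    }
    ```

    Inside a page it could be called as PathHelper.ToPhysicalPath("~/FinfoBase/Admin/…../MyPDF.pdf", Server.MapPath).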

  5. Thanks hspinfo for the response, but now I get another error: “Could not find file ‘C:\Inetpub\wwwroot\FinfoBase\Admin\FINFOBASE_Form\MyPDF.pdf’”.
    I guess that I failed to generate the “MyPDF.pdf” or “MyHTML.html” file in the first place, but I don’t know how to fix it.
    Could you send me the example you have made? That would be easier for me and would not waste your time.
    Thanks again.

  6. Hi,

    How do I do this with standalone Java? Which jar file do I need to include to get the following classes for converting HTML content to PDF?

    string strHTMLpath = Server.MapPath("MyHTML.html");
    Environment.NewLine
    HtmlTextWriter htw = new HtmlTextWriter(sw);

    Please share.

    Regards,
    Vijay.B

  7. Hi Nam,

    You can just copy and paste the code given here, and it will generate the HTML and PDF files in the same folder as your project…

  8. Hi Vijay,

    I am not familiar with Java, so I can’t say much about that…

    sorry dear…

  9. Hi hspinfo!

    Here is the code:

    using System;
    using System.Data;
    using System.Configuration;
    using System.Collections;
    using System.Web;
    using System.Web.Security;
    using System.Web.UI;
    using System.Web.UI.WebControls;
    using System.Web.UI.WebControls.WebParts;
    using System.Web.UI.HtmlControls;
    using iTextSharp.text;
    using iTextSharp.text.pdf;
    using System.Text;
    using System.IO;
    using FinfoBase.Engine;

    public partial class Admin_FINFOBASE_Form_HTMLToPDF : System.Web.UI.Page
    {
    protected void Page_Load(object sender, EventArgs e)
    {
    if (!IsPostBack)
    {
    BindGridView();
    }
    }
    protected void BindGridView()
    {
    DataTable dt = FINFOBASE_FormDB.GetAll();
    gvSerchResult.DataSource = dt;
    gvSerchResult.DataBind();
    }
    protected void btnConvert_Click(object sender, EventArgs e)
    {
    Document document = new Document(PageSize.A4, 80, 50, 30, 65);
    StringBuilder strData = new StringBuilder(string.Empty);
    //I have provided Path for the HTML which will be generated from GridView
    string strHTMLpath = Server.MapPath("MyHTML.html");
    //I have provided Path for the PDF file which will be generated from HTML content
    string strPDFpath = Server.MapPath("MyPDF.pdf");
    try
    {
    StringWriter sw = new StringWriter();
    sw.WriteLine(Environment.NewLine);
    sw.WriteLine(Environment.NewLine);
    sw.WriteLine(Environment.NewLine);
    sw.WriteLine(Environment.NewLine);
    HtmlTextWriter htw = new HtmlTextWriter(sw);
    //gvSearchResult is a GridView, I have converted its content to HTML and will acquire final PDF file
    /*Here i have set AllowPaging and AllowSorting property of GridView as false, As my aim is to get
    whole content of the GridView in a single table, by setting these properties and binding
    the GridView again will remove paging and sorting property from it. */
    gvSerchResult.AllowPaging = false;
    gvSerchResult.AllowSorting = false;
    BindGridView();
    //Rendering the HtmlTextWriter
    gvSerchResult.RenderControl(htw);
    /*Here i have set AllowPaging and AllowSorting property of GridView as true, As my aim is to get
    whole content of the GridView is now finished and I have acquire its content in
    HtmlTextWriter. Then by setting properties again I will get the original form of my GridView again*/
    gvSerchResult.AllowPaging = true;
    gvSerchResult.AllowSorting = true;
    BindGridView();
    StreamWriter strWriter = new StreamWriter(strHTMLpath, false, Encoding.UTF8);
    strWriter.Write("" + htw.InnerWriter.ToString() + "");
    strWriter.Close();
    strWriter.Dispose();
    iTextSharp.text.html.simpleparser.
    StyleSheet styles = new iTextSharp.text.html.simpleparser.StyleSheet();
    styles.LoadTagStyle("ol", "leading", "16,0");
    PdfWriter.GetInstance(document, new FileStream(@"C:\Inetpub\wwwroot\FinfoBase\Admin\FINFOBASE_Form\MyPDF.pdf", FileMode.Create));
    document.Add(new Header(iTextSharp.text.html.Markup.HTML_ATTR_STYLESHEET, "Style.css"));
    document.Open();
    ArrayList objects;
    styles.LoadTagStyle("li", "face", "garamond");
    styles.LoadTagStyle("span", "size", "8px");
    styles.LoadTagStyle("body", "font-family", "times new roman");
    styles.LoadTagStyle("body", "font-size", "10px");
    document.NewPage();
    objects = iTextSharp.text.html.simpleparser.
    HTMLWorker.ParseToList(new StreamReader(strHTMLpath, Encoding.Default), styles);
    for (int k = 0; k < objects.Count; k++)
    {
    document.Add((IElement)objects[k]);
    }
    }
    catch (Exception ex)
    {
    throw ex;
    }
    finally
    {
    document.Close();
    Response.Write(Server.MapPath("~/Admin/FINFOBASE_Form/MyPDF.pdf "));
    Response.ClearContent();
    Response.ClearHeaders();
    Response.AddHeader("Content-Disposition", "attachment; filename=" + strPDFpath);
    Response.ContentType = "application/octet-stream";
    Response.WriteFile(Server.MapPath("~/Admin/FINFOBASE_Form/MyPDF.pdf "));
    Response.Flush();
    Response.Close();
    if (File.Exists(Server.MapPath("~/Admin/FINFOBASE_Form/MyPDF.pdf ")))
    {
    File.Delete(Server.MapPath(Server.MapPath("~/Admin/FINFOBASE_Form/MyPDF.pdf ")));
    }
    }
    }
    }

  10. Hello Nam,

    The following line in your code is wrong:

    PdfWriter.GetInstance(document, new FileStream(@"C:\Inetpub\wwwroot\FinfoBase\Admin\FINFOBASE_Form\MyPDF.pdf", FileMode.Create));

    Please change it to:
    PdfWriter.GetInstance(document, new FileStream(strPDFpath, FileMode.Create));

    Please assign the paths to the variables strPDFpath and strHTMLpath when declaring them, as shown below, use those same variables throughout the process, and do not hard-code any path elsewhere:
    string strHTMLpath = Server.MapPath("~/Admin/FINFOBASE_Form/MyHTML.htm");
    string strPDFpath = Server.MapPath("~/Admin/FINFOBASE_Form/MyPDF.pdf");

  11. Hi… I have a problem with this code. It generates the PDF file correctly, but when I try to create another one, I get an error saying that the HTML file is still in use. What should I do to solve this?

    Thanks in advance and sorry for my bad english.

  12. silent Bob Says:

    Hi there Vijay,

    To do this using Java you should check out iText which is the original Java implementation of the iTextSharp library.

    Hope this Helps

  13. silent Bob Says:

    Also to Lucía,

    The reason the HTML file is still in use is that the HtmlTextWriter is never closed in the code. The easiest way to remedy this is to add the following line in the finally block:
    htw.Close();
    NOTE: you will also need to move the declaration of htw out of the try block so that it is available inside the finally block.
    This should release any resources held by the HtmlTextWriter object and let you reuse the file.

    :-)

  14. How can this be used to convert content in multiple languages?

    Thanks.

  15. Does anybody have an example of a working project in C#? If so, please send it as a RAR archive to my email, kristofer8@o2.pl. Thanks a lot in advance.

  16. Hi!
    Thank you very much for your information about converting HTML to PDF, but I can’t get it to work for other content. For example, can you help me convert FCKEditor1.value in ASP.NET to PDF, with HTML tags such as tables, MSN smileys or normal text?
    I need this very urgently.
    Thanks for your interest.
    I am waiting for your reply.
    Güncel Sarıman

  17. ''' Html to pdf….

    Dim document As Document
    Dim strData As StringBuilder
    Dim strHTMLpath As String = ""
    Dim strPDFpath As String = ""
    Dim sw As StringWriter
    Dim strHtml As String = ""
    Dim strWriter As StreamWriter
    Dim styles As iTextSharp.text.html.simpleparser.StyleSheet
    Dim HTMLWorker As iTextSharp.text.html.simpleparser.HTMLWorker
    Dim objects As ArrayList

    document = New Document(PageSize.A4, 80, 50, 30, 65)
    strData = New StringBuilder(String.Empty)

    strHTMLpath = Server.MapPath("MyHTML.html")
    strPDFpath = Server.MapPath("MyHTML.pdf")
    Try
    sw = New StringWriter
    sw.WriteLine(Environment.NewLine)

    Dim i As Integer
    strHtml += ""
    For i = 0 To 70
    strHtml += ""
    strHtml += "  ภาษาไทย 111"
    strHtml += " "
    strHtml += " "
    Next
    strHtml += ""
    strHtml += ""
    strHtml += " English 222"
    strHtml += "  "
    strHtml += " "
    strHtml += ""
    strHtml += ""
    strHtml += ""

    strWriter = New StreamWriter(strHTMLpath, False, Encoding.UTF8)
    strWriter.Write("" + strHtml + "")
    strWriter.Close()

    styles = New iTextSharp.text.html.simpleparser.StyleSheet
    styles.LoadTagStyle("ol", "leading", "16")

    PdfWriter.GetInstance(document, New FileStream(strPDFpath, FileMode.Create))
    document.Add(New Header(iTextSharp.text.html.Markup.HTML_ATTR_STYLESHEET, "Style.css"))

    ''footer
    Dim HeaderFooter As HeaderFooter
    HeaderFooter = New HeaderFooter(New Phrase("Page "), True)
    HeaderFooter.Border = 1
    document.Footer = HeaderFooter
    document.Open()

    document.NewPage()

    ''add picture
    'Dim gif1 = iTextSharp.text.Image.GetInstance(Server.MapPath("../images/sponser/CIES_logo.gif"))
    'document.Add(gif1)

    objects = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(New StreamReader(strHTMLpath, Encoding.Default), styles)

    HTMLWorker.ParseToList(New StreamReader(strHTMLpath, Encoding.Default), styles)
    Dim intK As Int32 = 0
    For intK = 0 To objects.Count - 1
    document.Add(objects(intK))
    Next

    Catch ex As Exception

    Finally
    document.Close()
    Response.Write(strPDFpath)
    Response.ClearContent()
    Response.ClearHeaders()
    Response.AddHeader("Content-Disposition", "attachment; filename=" + strPDFpath)
    Response.ContentType = "application/octet-stream"
    Response.WriteFile(strPDFpath)
    Response.Flush()
    Response.Close()

    End Try

  18. nothing was done by using this code….

  19. Hi, I tried this example. However, I couldn’t see the style sheet being applied. What could be the reason? I have checked the HTML file being generated, and it had the style sheet applied.

  20. oas’s code works well. I want to create an ASP web form using VB.NET in which clicking a button saves a table on the page to a PDF file. How can I change the code, for example to use the table ID instead of strHTMLpath? Thanks.

  21. Niall Little Says:

    Hello, i’m looking to find a list of the attributes that you can use with the StyleSheet.LoadTagStyle and their definition.
    Is that available anywhere? Specifically, other than controlling font and font size i want to control vertical whitespace and indentation. the “leading” tag almost does this, but does not quite do what i hoped for.

    thanks,
    Niall Little

  22. Hi, cool site, good writing ;)

  23. When i try to use the above code I get the following error:
    Control ‘gvSerchResult’ of type ‘GridView’ must be placed inside a form tag with runat=server.

    Can anybody suggest a solution for this.

  24. Thijs Vanbrabant Says:

    Rakesh,

    I had the same problem and you have to override the VerifyRenderingInServerForm Method. Like this:

    public override void VerifyRenderingInServerForm(Control control)
    {

    }

    Greetz,

    Thijs

  25. vishal parekh Says:

    Hi hspinfo,

    I tried your code, and it works fine for me .

    Thanks a lot.

    Excellent work done By You.

  26. Vishal Parekh Says:

    Hi, hspinfo

    In the PDF generated from the HTML file, the text

    “Untitled Page” is written at the top of the page, and then the content starts.

    Any idea how to solve this problem?

    Thanks

  27. hello everybody,

    I used your code to convert an HTML file into PDF in an application,
    but I noticed that I don’t have the Markup class in my iTextSharp. Is my copy corrupt, or should I consider something else?

  28. Minh Duong Says:

    Hi everybody!!!
    Thanks a lot!!! But I can’t find “using FinfoBase.Engine;”. Help me.
    Also, who can tell me how to convert a .doc file to .pdf in ASP.NET?
    Thank you so much!!!

  29. ArunBaskar Says:

    Hi, I tried this example. However, I couldn’t see the style sheet being applied. What could be the reason?

  30. ArunBaskar Says:

    Hi, I tried this example. However, I couldn’t see the style sheet being applied. What could be the reason? I have checked the HTML file being generated, and it had the style sheet applied.

  31. Hi,

    I am using the code you gave for converting my web page to PDF, but I have a problem binding my GridView because I am using a DataList inside the DataGrid to present my data. How can I do this? Please help me, my requirement is very urgent.

  32. manoj kumar Says:

    I am using a DataList.
    I want to convert the DataList to PDF format using iTextSharp.
    My requirement is very urgent.

  33. saarraah33 Says:

    Hi, how can I convert PDF to HTML, please?

  34. Amiya Rout Says:

    hi

    When I execute the code
    Response.WriteFile(Server.MapPath("~/" + strPDFpath));

    I get the error: The process cannot access the file “path..\outputfiles\914532.pdf” because it is being used by another process.

    please help

  35. Hi,

    Thank You for the Great Work.

    I have a report where the ASP.NET table uses rowspan to group the data. When I push this HTML to create a PDF, the PDF is created in an unstructured manner, i.e. no rowspan is applied.

    Is there any way to work around this? (C#.NET 2.0)

  36. Deepak Jindal Says:

    Thanks a lot!!!
    Very good site. But I am getting a problem: I am not able to add an image in my StringBuilder. Also, I closed the htw in the finally block, but it still gives me the error message “HTML file is used by another process”.

    Please help.

    Thanks a lot in Advance.

    • Hi Deepak.,

      This might be because the StreamReader is not closed.

      This line of the code might be causing the issue:
      objects = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(new StreamReader(strHTMLpath, Encoding.Default), styles);

      Rewrite it as:

      StreamReader objStreamReader;
      objStreamReader = new StreamReader(strHTMLpath, Encoding.Default);
      objects = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(objStreamReader, styles);
      objStreamReader.Close();
      objStreamReader.Dispose();

      Just give a try..

      Happy Coding !!
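
      The same fix can also be written with a using block, which closes the reader even when an exception is thrown while the file is being read. This is only a sketch: HtmlFileReader and ReadAll are hypothetical names, and ReadToEnd stands in for the HTMLWorker.ParseToList call so that the example runs without iTextSharp.

      ```csharp
      using System.IO;
      using System.Text;

      public static class HtmlFileReader
      {
          // Hypothetical helper: reads the whole file and releases it afterwards.
          public static string ReadAll(string path, Encoding encoding)
          {
              using (StreamReader reader = new StreamReader(path, encoding))
              {
                  // In the conversion code, the ParseToList(reader, styles) call would go here.
                  return reader.ReadToEnd();
              } // the reader is closed and disposed here, so the file is no longer in use
          }
      }
      ```

      After ReadAll returns, the HTML file can be overwritten or deleted by the next conversion.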

      • Deepak Jindal Says:

        Thanks Pradeep, it’s working fine. But I am facing two more issues:
        1. How can we merge PDF files?
        2. Can we add an image in our HTML string? It is not accepting the path; it gives me the exception “It is not a virtual path”.

        If you have any ideas, please send me the code.
        Thanks a lot, buddy!!!

  37. Superb……..
    Simply Superb….

    This helps me a lot….
    You done a marvelous work..

    Thanks my friend….Thanks

  38. In my output PDF, the font sizes and other HTML formatting come through, but not the font styles.

    Registering a certain font for a tag, such as Arial for the span tag, wouldn’t make sense in my case, since the user can enter different font families in style attributes; for instance, they may have the word Hello in Arial followed by the name in Courier. Any ideas?

    Thanks,

  39. Hi, I am getting an error:
    “Could not find a part of the path ‘C:\Registration_files\image001.gif’.”

    I am converting a Registration.html file whose Registration_files folder is in the same folder as the file itself. But the code still looks for the files in the fixed location “C:\Registration_files\..”. I never specified this as a default location.

    Here is my code

    ArrayList objects;

    iTextSharp.text.html.simpleparser.StyleSheet styles = new iTextSharp.text.html.simpleparser.StyleSheet();

    (It is giving exception on this line =>) objects = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(new StreamReader(dir + ConvertFileName + ".html"), styles);

    for (int k = 0; k < objects.Count; k++)
    {
    myDocument.Add((IElement)objects[k]);
    }

  40. Hi,

    I am converting an ASPX page to PDF using your first code. It generates the HTML and PDF files successfully. In the HTML the formatting and CSS are OK, but in the PDF the table alignment is not applied and it does not look good: 3 HTML pages become 14 PDF pages. Can you give me any suggestions?

    I use a Panel in place of the GridView, and the Panel contains multiple HTML tables.

    Please provide me a solution.
