Convert HTML content to PDF
How do you convert HTML content to PDF? This question once seemed very intricate to me, but after finding the solution it turned out to be a piece of cake. Here I provide the solution in C# so that you don't have to waste time hunting around for one.
I have used iTextSharp.dll (version 4.0.6.0) in my solution, and it has proved very useful. You can get it from here.
In the code below, I convert the content of a GridView to HTML and then convert that HTML into a PDF file.
I have commented the code to explain each step.
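//Include the following namespaces to use the iTextSharp library
using iTextSharp.text;
using iTextSharp.text.pdf;
//Also required: StringWriter/StreamWriter/File, StringBuilder/Encoding, ArrayList, HtmlTextWriter
using System.IO;
using System.Text;
using System.Collections;
using System.Web.UI;
//Document is a built-in class available in iTextSharp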
Document document = new Document(PageSize.A4, 80, 50, 30, 65);
StringBuilder strData = new StringBuilder(string.Empty);
//Path of the HTML file that will be generated from the GridView content
string strHTMLpath = Server.MapPath("MyHTML.html");
//Path of the PDF file that will be generated from the HTML content
string strPDFpath = Server.MapPath("MyPDF.pdf");
try
{
StringWriter sw = new StringWriter();
sw.WriteLine(Environment.NewLine);
sw.WriteLine(Environment.NewLine);
sw.WriteLine(Environment.NewLine);
sw.WriteLine(Environment.NewLine);
HtmlTextWriter htw = new HtmlTextWriter(sw);
//gvSerchResult is a GridView; its content is rendered to HTML, which is then converted to the final PDF file
/*AllowPaging and AllowSorting are set to false here because the aim is to get the whole content of the GridView in a single table. Setting these properties and re-binding the GridView removes paging and sorting from the rendered output.*/
gvSerchResult.AllowPaging = false;
gvSerchResult.AllowSorting = false;
BindGridView();
//Render the GridView into the HtmlTextWriter
gvSerchResult.RenderControl(htw);
/*The whole content of the GridView is now captured in the HtmlTextWriter, so AllowPaging and AllowSorting are set back to true and the GridView is re-bound to restore its original form.*/
gvSerchResult.AllowPaging = true;
gvSerchResult.AllowSorting = true;
BindGridView();
//Write the rendered HTML to disk; the using block closes the file even if Write throws
using (StreamWriter strWriter = new StreamWriter(strHTMLpath, false, Encoding.UTF8))
{
strWriter.Write("<html><head><link href=Style.css rel=stylesheet type=text/css /></head><body>" + htw.InnerWriter.ToString() + "</body></html>");
}
iTextSharp.text.html.simpleparser.StyleSheet styles = new iTextSharp.text.html.simpleparser.StyleSheet();
styles.LoadTagStyle("ol", "leading", "16,0");
PdfWriter.GetInstance(document, new FileStream(strPDFpath, FileMode.Create));
document.Add(new Header(iTextSharp.text.html.Markup.HTML_ATTR_STYLESHEET, "Style.css"));
document.Open();
ArrayList objects;
styles.LoadTagStyle("li", "face", "garamond");
styles.LoadTagStyle("span", "size", "8px");
styles.LoadTagStyle("body", "font-family", "times new roman");
styles.LoadTagStyle("body", "font-size", "10px");
document.NewPage();
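//Read the file back as UTF-8 (it was written as UTF-8) and close the reader once parsing is done
using (StreamReader reader = new StreamReader(strHTMLpath, Encoding.UTF8))
{
objects = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(reader, styles);
}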
for (int k = 0; k < objects.Count; k++)
{
document.Add((IElement)objects[k]);
}
}
catch (Exception)
{
//Rethrow with "throw;" so that the original stack trace is preserved
throw;
}
finally
{
document.Close();
//strPDFpath is already a physical path (it came from Server.MapPath above), so it must not be mapped again
Response.ClearContent();
Response.ClearHeaders();
//Send only the file name to the browser, not the full server path
Response.AddHeader("Content-Disposition", "attachment; filename=" + Path.GetFileName(strPDFpath));
Response.ContentType = "application/octet-stream";
Response.WriteFile(strPDFpath);
Response.Flush();
Response.Close();
if (File.Exists(strPDFpath))
{
File.Delete(strPDFpath);
}
}
Did you find this article useful? Please leave your valuable comments.
June 14, 2008 at 6:47 am
Great Article, it helped me a lot. Thank you very much, keep mailing me about ur good work. Thank a lot once again
October 8, 2008 at 1:38 pm
Fantastic Article !!!!,
I have checked so many articles on this topic & this certainly is the best.
Thanks a lot & keep up the good work….
October 16, 2008 at 9:40 am
Hi!
I used your code without changing even one letter. The build succeeds, but when I browse to the page I get this error:
"'~/C:\Inetpub\wwwroot\FinfoBase\Admin\FINFOBASE_Form\MyPDF.pdf' is not a valid virtual path". C:\Inetpub\wwwroot\FinfoBase is where I placed my project.
The source error is at the line: Response.Write(Server.MapPath("~/" + strPDFpath));
Please help me!
October 16, 2008 at 12:11 pm
Hi Nam,
Server.MapPath() accepts only a virtual path, for example "~/FinfoBase/Admin/…../MyPDF.pdf".
Therefore, use code like
Server.MapPath("~/FinfoBase/Admin/…../MyPDF.pdf")
I hope this helps to solve your problem.
You can also search the web for how to pass a virtual path to this method.
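The distinction between a virtual path and a physical path can be sketched outside ASP.NET. Everything in the block below is hypothetical and for illustration only: MapVirtualPath stands in for Server.MapPath (it is not the real method), and appRoot is an assumed application folder. The helper accepts only app-relative paths such as "~/Admin/MyPDF.pdf" and rejects a path that has already been mapped, which is the situation behind the "is not a valid virtual path" error.

```csharp
using System;
using System.IO;

public static class VirtualPathDemo
{
    // Hypothetical stand-in for Server.MapPath: accepts only app-relative
    // virtual paths ("~/...") and maps them to a folder under appRoot.
    public static string MapVirtualPath(string appRoot, string virtualPath)
    {
        if (virtualPath == null || !virtualPath.StartsWith("~/", StringComparison.Ordinal))
        {
            throw new ArgumentException("'" + virtualPath + "' is not an app-relative virtual path.");
        }

        string remainder = virtualPath.Substring(2);

        // "~/" followed by a rooted path means a physical path was mapped twice.
        if (Path.IsPathRooted(remainder))
        {
            throw new ArgumentException("'" + virtualPath + "' is not a valid virtual path.");
        }

        // Join the remaining segments to the application root one by one.
        string physical = appRoot;
        foreach (string segment in remainder.Split('/'))
        {
            physical = Path.Combine(physical, segment);
        }
        return physical;
    }

    public static void Main()
    {
        // Assumed application root, chosen only so the demo runs anywhere.
        string appRoot = Path.Combine(Path.GetTempPath(), "FinfoBase");

        // Map the virtual path once, then reuse the physical path everywhere.
        string pdfPath = MapVirtualPath(appRoot, "~/Admin/FINFOBASE_Form/MyPDF.pdf");
        Console.WriteLine(pdfPath);

        try
        {
            // Mapping the already-physical path a second time fails.
            MapVirtualPath(appRoot, "~/" + pdfPath);
        }
        catch (ArgumentException ex)
        {
            Console.WriteLine(ex.Message);
        }
    }
}
```

The point of the sketch is the order of operations: map the virtual path to a physical path once, keep the result in a variable, and pass that variable to file APIs without mapping it again.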
October 17, 2008 at 9:43 am
Thanks, hspinfo, for the response, but now I get another error: "Could not find file 'C:\Inetpub\wwwroot\FinfoBase\Admin\FINFOBASE_Form\MyPDF.pdf'".
I guess that the "MyPDF.pdf" or "MyHTML.html" file was never generated in the first place, but I don't know how to fix it.
Could you send me the example you have built? That would be easier for me and would not waste your time.
Thanks again.
October 18, 2008 at 8:09 am
Hi,
How do I do this with standalone Java? Which jar file do I need to include to get the following classes for converting HTML content to PDF?
string strHTMLpath = Server.MapPath("MyHTML.html");
Environment.NewLine
HtmlTextWriter htw = new HtmlTextWriter(sw);
Please share.
Regards,
Vijay.B
October 18, 2008 at 1:12 pm
Hi Nam,
You can just copy and paste the code given here, and it will generate the HTML and PDF files in the same folder as your project.
October 18, 2008 at 1:14 pm
Hi Vijay,
I am not familiar with Java, so I don't know much about that…
sorry dear…
October 20, 2008 at 2:06 am
Hi hspinfo!
Here is the code:
using System;
using System.Data;
using System.Configuration;
using System.Collections;
using System.Web;
using System.Web.Security;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.Web.UI.WebControls.WebParts;
using System.Web.UI.HtmlControls;
using iTextSharp.text;
using iTextSharp.text.pdf;
using System.Text;
using System.IO;
using FinfoBase.Engine;
public partial class Admin_FINFOBASE_Form_HTMLToPDF : System.Web.UI.Page
{
protected void Page_Load(object sender, EventArgs e)
{
if (!IsPostBack)
{
BindGridView();
}
}
protected void BindGridView()
{
DataTable dt = FINFOBASE_FormDB.GetAll();
gvSerchResult.DataSource = dt;
gvSerchResult.DataBind();
}
protected void btnConvert_Click(object sender, EventArgs e)
{
Document document = new Document(PageSize.A4, 80, 50, 30, 65);
StringBuilder strData = new StringBuilder(string.Empty);
//I have provided Path for the HTML which will be generated from GridView
string strHTMLpath = Server.MapPath("MyHTML.html");
//I have provided Path for the PDF file which will be generated from HTML content
string strPDFpath = Server.MapPath("MyPDF.pdf");
try
{
StringWriter sw = new StringWriter();
sw.WriteLine(Environment.NewLine);
sw.WriteLine(Environment.NewLine);
sw.WriteLine(Environment.NewLine);
sw.WriteLine(Environment.NewLine);
HtmlTextWriter htw = new HtmlTextWriter(sw);
//gvSearchResult is a GridView, I have converted its content to HTML and will acquire final PDF file
/*Here i have set AllowPaging and AllowSorting property of GridView as false, As my aim is to get
whole content of the GridView in a single table, by setting these properties and binding
the GridView again will remove paging and sorting property from it. */
gvSerchResult.AllowPaging = false;
gvSerchResult.AllowSorting = false;
BindGridView();
//Rendering the HtmlTextWriter
gvSerchResult.RenderControl(htw);
/*Here i have set AllowPaging and AllowSorting property of GridView as true, As my aim is to get
whole content of the GridView is now finished and I have acquire its content in
HtmlTextWriter. Then by setting properties again I will get the original form of my GridView again*/
gvSerchResult.AllowPaging = true;
gvSerchResult.AllowSorting = true;
BindGridView();
StreamWriter strWriter = new StreamWriter(strHTMLpath, false, Encoding.UTF8);
strWriter.Write("" + htw.InnerWriter.ToString() + "");
strWriter.Close();
strWriter.Dispose();
iTextSharp.text.html.simpleparser.StyleSheet styles = new iTextSharp.text.html.simpleparser.StyleSheet();
styles.LoadTagStyle("ol", "leading", "16,0");
PdfWriter.GetInstance(document, new FileStream(@"C:\Inetpub\wwwroot\FinfoBase\Admin\FINFOBASE_Form\MyPDF.pdf", FileMode.Create));
document.Add(new Header(iTextSharp.text.html.Markup.HTML_ATTR_STYLESHEET, "Style.css"));
document.Open();
ArrayList objects;
styles.LoadTagStyle("li", "face", "garamond");
styles.LoadTagStyle("span", "size", "8px");
styles.LoadTagStyle("body", "font-family", "times new roman");
styles.LoadTagStyle("body", "font-size", "10px");
document.NewPage();
objects = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(new StreamReader(strHTMLpath, Encoding.Default), styles);
for (int k = 0; k < objects.Count; k++)
{
document.Add((IElement)objects[k]);
}
}
catch (Exception ex)
{
throw ex;
}
finally
{
document.Close();
Response.Write(Server.MapPath("~/Admin/FINFOBASE_Form/MyPDF.pdf "));
Response.ClearContent();
Response.ClearHeaders();
Response.AddHeader("Content-Disposition", "attachment; filename=" + strPDFpath);
Response.ContentType = "application/octet-stream";
Response.WriteFile(Server.MapPath("~/Admin/FINFOBASE_Form/MyPDF.pdf "));
Response.Flush();
Response.Close();
if (File.Exists(Server.MapPath("~/Admin/FINFOBASE_Form/MyPDF.pdf ")))
{
File.Delete(Server.MapPath(Server.MapPath("~/Admin/FINFOBASE_Form/MyPDF.pdf ")));
}
}
}
}
October 20, 2008 at 7:45 am
Hello Nam,
The following line in your code is wrong:
PdfWriter.GetInstance(document, new FileStream(@"C:\Inetpub\wwwroot\FinfoBase\Admin\FINFOBASE_Form\MyPDF.pdf", FileMode.Create));
Please change it to:
PdfWriter.GetInstance(document, new FileStream(strPDFpath, FileMode.Create));
Please assign the paths to the variables strPDFpath and strHTMLpath when declaring them, as below, use the same variables throughout, and do not hard-code any static path during the process:
string strHTMLpath = Server.MapPath("~/Admin/FINFOBASE_Form/MyHTML.htm");
string strPDFpath = Server.MapPath("~/Admin/FINFOBASE_Form/MyPDF.pdf");
October 27, 2008 at 11:36 am
Hi… I have a problem with this code. It generates the PDF file correctly, but when I try to generate another one I get an error saying that the HTML file is still in use. What should I do to solve this?
Thanks in advance, and sorry for my bad English.
November 6, 2008 at 2:20 pm
Hi there Vijay,
To do this using Java you should check out iText which is the original Java implementation of the iTextSharp library.
Hope this Helps
November 6, 2008 at 2:26 pm
Also to Lucía,
The reason the HTML file is still in use is that the HtmlTextWriter is never closed in the code. The easiest way to remedy this is to add the following line in the finally block:
htw.Close();
NOTE: you will also need to move the declaration of htw out of the try block so that it is available inside the finally block.
This should release any resources held by the HtmlTextWriter object and let you reuse the file.
November 27, 2008 at 7:41 am
How can this conversion be made to work with multiple languages?
Thanks.
November 27, 2008 at 10:47 am
Does anybody have an example of working project in C# if so please send it rared to my email kristofer8@o2.pl. Thx a lot in advance.
November 27, 2008 at 8:09 pm
hi!
Thank you very much for your information about converting HTML to PDF, but I can't get it to work for other text. For example, can you help me convert FCKEditor1.value in ASP.NET to PDF, including HTML tags such as tables, MSN smileys, or normal text?
I need this very urgently.
Thank you for your interest.
I am waiting for your reply.
Güncel Sarıman
November 28, 2008 at 3:29 am
''' Html to pdf....
Dim document As Document
Dim strData As StringBuilder
Dim strHTMLpath As String = ""
Dim strPDFpath As String = ""
Dim sw As StringWriter
Dim strHtml As String = ""
Dim strWriter As StreamWriter
Dim styles As iTextSharp.text.html.simpleparser.StyleSheet
Dim HTMLWorker As iTextSharp.text.html.simpleparser.HTMLWorker
Dim objects As ArrayList
document = New Document(PageSize.A4, 80, 50, 30, 65)
strData = New StringBuilder(String.Empty)
strHTMLpath = Server.MapPath("MyHTML.html")
strPDFpath = Server.MapPath("MyHTML.pdf")
Try
sw = New StringWriter
sw.WriteLine(Environment.NewLine)
Dim i As Integer
strHtml += ""
For i = 0 To 70
strHtml += ""
strHtml += " ภาษาไทย 111"
strHtml += " "
strHtml += " "
Next
strHtml += ""
strHtml += ""
strHtml += " English 222"
strHtml += " "
strHtml += " "
strHtml += ""
strHtml += ""
strHtml += ""
strWriter = New StreamWriter(strHTMLpath, False, Encoding.UTF8)
strWriter.Write("" + strHtml + "")
strWriter.Close()
styles = New iTextSharp.text.html.simpleparser.StyleSheet
styles.LoadTagStyle("ol", "leading", "16")
PdfWriter.GetInstance(document, New FileStream(strPDFpath, FileMode.Create))
document.Add(New Header(iTextSharp.text.html.Markup.HTML_ATTR_STYLESHEET, "Style.css"))
''footer
Dim HeaderFooter As HeaderFooter
HeaderFooter = New HeaderFooter(New Phrase("Page "), True)
HeaderFooter.Border = 1
document.Footer = HeaderFooter
document.Open()
document.NewPage()
''add picture
'Dim gif1 = iTextSharp.text.Image.GetInstance(Server.MapPath("../images/sponser/CIES_logo.gif"))
'document.Add(gif1)
objects = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(New StreamReader(strHTMLpath, Encoding.Default), styles)
HTMLWorker.ParseToList(New StreamReader(strHTMLpath, Encoding.Default), styles)
Dim intK As Int32 = 0
For intK = 0 To objects.Count - 1
document.Add(objects(intK))
Next
Catch ex As Exception
Finally
document.Close()
Response.Write(strPDFpath)
Response.ClearContent()
Response.ClearHeaders()
Response.AddHeader("Content-Disposition", "attachment; filename=" + strPDFpath)
Response.ContentType = "application/octet-stream"
Response.WriteFile(strPDFpath)
Response.Flush()
Response.Close()
End Try
December 19, 2008 at 2:31 pm
nothing was done by using this code….
January 2, 2009 at 3:54 pm
Hi, I tried this example. However, I couldn't see the style sheet getting applied. What could be the reason? I have checked the HTML file being generated, and it had the style sheet applied.
January 22, 2009 at 3:53 am
oas's code works well. I want to create an ASP.NET web form using VB.NET where clicking a button on the page saves a table on the page to a PDF file. How can I change the code, for example to use the table's ID instead of strHTMLpath? Thanks.
February 5, 2009 at 9:53 pm
Hello, I'm looking for a list of the attributes that you can use with StyleSheet.LoadTagStyle, along with their definitions.
Is that available anywhere? Specifically, besides controlling font and font size, I want to control vertical whitespace and indentation. The "leading" attribute almost does this, but it does not quite do what I hoped for.
thanks,
Niall Little
February 10, 2009 at 12:40 am
Hi, cool site, good writing
February 12, 2009 at 4:13 am
When I try to use the above code, I get the following error:
Control 'gvSerchResult' of type 'GridView' must be placed inside a form tag with runat=server.
Can anybody suggest a solution for this?
February 19, 2009 at 7:07 pm
Rakesh,
I had the same problem; you have to override the VerifyRenderingInServerForm method, like this:
public override void VerifyRenderingInServerForm(Control control)
{
}
Greetz,
Thijs
February 20, 2009 at 6:14 pm
Hi hspinfo,
I tried your code, and it works fine for me .
Thanks a lot.
Excellent work done By You.
February 28, 2009 at 12:36 pm
Hi, hspinfo
In the PDF generated from the HTML file, "Untitled Page" is written at the top of the page before the content starts.
Any idea how to solve this problem?
Thanks
May 20, 2009 at 8:18 pm
hello everybody,
I used your code to convert an HTML file into PDF in an application,
but I notice that my iTextSharp does not have the Markup class. Is my copy corrupt, or should I consider something else?
June 13, 2009 at 11:31 pm
Hi every body !!!
Thanks a lot! But I can't find "using FinfoBase.Engine;". Help me.
Also, can anyone tell me how to convert a .doc file to .pdf in ASP.NET?
Thank you so much!
June 27, 2009 at 7:14 pm
Hi, I tried this example. However, I couldn't see the style sheet getting applied. What could be the reason? I have checked the HTML file being generated, and it had the style sheet applied.
July 2, 2009 at 9:28 am
Hi,
I am using the code you gave to convert my web page to PDF, but I have a problem binding my GridView because I use a DataList inside the DataGrid to present my data. How can I handle this? Please help me; my requirement is very urgent.
July 2, 2009 at 2:39 pm
i am using datalist..
i want to convert datalist to pdf format using itextsharp
My requirement is very urgent
July 3, 2009 at 3:57 pm
Hi, how can I convert PDF to HTML, please?
July 15, 2009 at 4:10 pm
hi
When I execute the line
Response.WriteFile(Server.MapPath("~/" + strPDFpath));
I get the error: The process cannot access the file "path..\outputfiles\914532.pdf" because it is being used by another process.
Please help.
August 18, 2009 at 10:33 pm
Hi,
Thank You for the Great Work.
I have a report where the ASP.NET table uses rowspan to group the data. When I pass this HTML in to create the PDF, the PDF comes out unstructured, i.e. no rowspan is applied.
Is there any way to handle this? (C#, .NET 2.0)
September 23, 2009 at 12:35 pm
Thanks a Lot !!!
Very good site, but I am having a problem: I am not able to add an image in my StringBuilder. Also, I closed htw in the finally block, but it still gives me the error message "HTML file is being used by another process".
Please help.
Thanks a lot in advance.
September 23, 2009 at 4:30 pm
Hi Deepak.,
This might be because the StreamReader is not closed.
This line might be causing the issue:
objects = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(new StreamReader(strHTMLpath, Encoding.Default), styles);
Rewrite it as:
StreamReader objStreamReader;
objStreamReader = new StreamReader(strHTMLpath, Encoding.Default);
objects = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(objStreamReader, styles);
objStreamReader.Close();
objStreamReader.Dispose();
Just give it a try.
Happy Coding !!
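The same idea can also be written with using blocks, which release the file even when an exception is thrown part-way through. The block below is a minimal stand-alone sketch, not the article's code: it assumes a temporary file in place of MyHTML.html, and ReadToEnd stands in for the HTMLWorker.ParseToList call so that it runs without iTextSharp. FileHandleDemo and WriteThenRead are names made up for the illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class FileHandleDemo
{
    // Writes the HTML file and then reads it back. Each handle is released
    // when its using block ends, even if an exception is thrown inside it.
    public static string WriteThenRead(string htmlPath, string html)
    {
        using (StreamWriter writer = new StreamWriter(htmlPath, false, Encoding.UTF8))
        {
            writer.Write(html);
        }

        // Read with the same encoding that was used for writing.
        using (StreamReader reader = new StreamReader(htmlPath, Encoding.UTF8))
        {
            // In the article's code, the ParseToList call would take the reader here.
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Assumed temporary file name, used only for this demo.
        string htmlPath = Path.Combine(Path.GetTempPath(), "MyHTML-demo.html");
        try
        {
            // Two runs in a row: the second write can only succeed because
            // the first run left no reader or writer open on the file.
            Console.WriteLine(WriteThenRead(htmlPath, "<html><body>first</body></html>"));
            Console.WriteLine(WriteThenRead(htmlPath, "<html><body>second</body></html>"));
        }
        finally
        {
            File.Delete(htmlPath);
        }
    }
}
```

Compared with calling Close() and Dispose() by hand, the using form does not depend on the code reaching those lines, so a parsing error does not leave the file locked for the next request.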
September 24, 2009 at 9:48 am
Thanks Pradeep, it's working fine. But I am facing two more issues:
1. How can we merge PDF files?
2. Can we add an image in our HTML string? It does not accept the path and gives me the exception "It is not a virtual path".
If you have any ideas, please send me the code.
Thanks a lot, buddy!
October 27, 2009 at 11:24 am
Superb……..
Simply Superb….
This helps me a lot….
You done a marvelous work..
Thanks my friend….Thanks
November 6, 2009 at 8:29 pm
In my output PDF, the font sizes and other HTML formatting come through, but not the font styles.
Registering a particular font for a tag, such as Arial for the span tag, wouldn't make sense in my case, since users can enter different font families in their style tags. For instance, they may have the word Hello in Arial followed by the name in Courier. Any ideas?
Thanks,
November 12, 2009 at 11:48 am
Hi, I am getting an error:
"Could not find a part of the path 'C:\Registration_files\image001.gif'."
I am converting Registration.html, whose Registration_files folder is in the same folder as the file itself. Still, the code looks for the files in the fixed location "C:\Registration_files\..", which I never set as a default location.
Here is my code:
ArrayList objects;
iTextSharp.text.html.simpleparser.StyleSheet styles = new iTextSharp.text.html.simpleparser.StyleSheet();
//The exception is thrown on the next line
objects = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(new StreamReader(dir + ConvertFileName + ".html"), styles);
for (int k = 0; k < objects.Count; k++)
{
myDocument.Add((IElement)objects[k]);
}
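One plausible explanation, which is an assumption and is not confirmed anywhere in this thread, is that the relative src values in Registration.html are resolved against the drive root or the working directory rather than against the HTML file's own folder. If that is the cause, rewriting relative src attributes to absolute paths before parsing may help. MakeImagePathsAbsolute below is a hypothetical helper written for this sketch; it is not part of iTextSharp.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

public static class ImagePathDemo
{
    // Hypothetical helper: rewrites relative src="..." values so that they
    // point into htmlFolder, the folder that contains the HTML file.
    public static string MakeImagePathsAbsolute(string html, string htmlFolder)
    {
        // Matches src="value" or src='value' and captures the quote and the value.
        string pattern = "src\\s*=\\s*([\"'])(.*?)\\1";

        return Regex.Replace(html, pattern, delegate(Match m)
        {
            string quote = m.Groups[1].Value;
            string src = m.Groups[2].Value;

            // Leave URLs and already-rooted paths untouched.
            if (src.Contains("://") || Path.IsPathRooted(src))
            {
                return m.Value;
            }

            string absolute = Path.GetFullPath(Path.Combine(htmlFolder, src));
            return "src=" + quote + absolute + quote;
        }, RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        // Assumed folder, chosen only so the demo runs anywhere.
        string folder = Path.Combine(Path.GetTempPath(), "Registration-demo");
        string html = "<img src=\"Registration_files/image001.gif\" />";

        Console.WriteLine(MakeImagePathsAbsolute(html, folder));
    }
}
```

The rewritten HTML would then be what gets saved and passed to the parser. Whether this removes the "Could not find a part of the path" error depends on how the installed iTextSharp version resolves image paths, so it is worth testing on a single image first.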
November 13, 2009 at 5:07 pm
hi
I am converting an ASPX page to PDF using your first code. It generates the HTML and PDF files successfully. In the HTML the format and CSS are OK, but in the PDF the table alignment is not set and does not look good, and 3 HTML pages become 14 PDF pages. Can you give me any suggestions?
I use a Panel in place of the GridView, and the Panel contains multiple HTML tables.
Please provide me with a solution.