Super easy html to pdf in Rails

After trying PDFKit, htmldoc (PDF::HTMLDoc) and PDFmyURL, the winner is PDFmyURL.

WHY not PDFKit?

I dislike PDFKit for its installation issues and its huge number of dependencies: you need a lot of X.Org and Qt libraries to make it work (it depends on wkhtmltopdf, an ugly and random-looking name, even though I know it comes from "WebKit HTML to PDF"), and I don't want to play with X server libraries on my server just for this. That said, PDFKit works and does an amazing job, so if you don't mind having X libraries on your server, give it a try.

WHY not htmldoc?

Ruby has an amazing wrapper for htmldoc, and it works really well, but htmldoc itself can't handle styles properly. If you plan to use it to 'print' one of your views, you must either rebuild the view without styles or settle for the unstyled version.

WHY PDFmyURL?

  1. Super easy to use.
  2. Fast and free web service.
  3. If you need customization, you can pay for it.
  4. I just need 3 lines in a helper to use it: https://gist.github.com/1230528
  5. It has a nice API.

HTML 2 PDF generation via WeasyPrint (Githubed)

WeasyPrint (GitHub) is a nice Python-based alternative to wkhtmltopdf.

Features are detailed here (CSS support, etc.):

Starting with version 0.11, WeasyPrint passes the Acid2 test

$ sudo apt-get install python-dev python-pip python-lxml libcairo2 libpango1.0-0 libgdk-pixbuf2.0-0 libffi-dev shared-mime-info virtualenv
$ cd /tmp
$ sudo pip install virtualenv
$ virtualenv --system-site-packages ./venv
$ . ./venv/bin/activate
$ pip install WeasyPrint
$ weasyprint http://weasyprint.org /Apps/weasyprint-website.pdf 

To leave the virtualenv (this deactivates it but does not uninstall WeasyPrint):

 $ deactivate
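If you would rather trigger the conversion from a script than type the command by hand, something like the following Python sketch might do. It only wraps the `weasyprint <input> <output>` command shown above; the function names `weasyprint_command` and `render_pdf` are made up for this example, and it assumes the `weasyprint` executable is on the PATH (for instance because the virtualenv is active).

```python
import shutil
import subprocess


def weasyprint_command(source, output_pdf):
    """Build the argument list for the `weasyprint <input> <output>` call."""
    return ["weasyprint", source, output_pdf]


def render_pdf(source, output_pdf):
    """Run the weasyprint CLI; return True on success, False otherwise."""
    if shutil.which("weasyprint") is None:
        # The executable is not on PATH (e.g. the virtualenv is not active).
        return False
    result = subprocess.run(weasyprint_command(source, output_pdf))
    return result.returncode == 0
```

Passing the arguments as a list avoids any shell quoting problems with URLs or file names that contain spaces.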

Convert HTML to PDF in C# using iTextSharp

You can use iTextSharp in a commercial application, and it does not require your app to be released as open source. However, it does not work with complex HTML pages; it gives fairly good results only with simple ones. In particular, tables are not yet supported.

iTextSharp: http://itextsharpsl.codeplex.com/

Code Snippet

// create converter
HtmlToPdfConverter html2pdf = new HtmlToPdfConverter();
// open new pdf file
html2pdf.Open(@"test");
// start a chapter
html2pdf.AddChapter(@"Dummy Chapter");
string html = …;
// convert string
html2pdf.Run(html);
// add a new chapter
html2pdf.AddChapter(@"Boost page");
// read web page
html2pdf.Run(new Uri(@"http://www.apple.com"));
// close and finish pdf file.
html2pdf.Close();

html2pdf server setup

I always forget to set up a working html2pdf binary on new servers, so here's my reminder:

sudo apt-get install libqtcore4
# source: http://code.google.com/p/wkhtmltopdf/downloads/list
wget http://wkhtmltopdf.googlecode.com/files/wkhtmltopdf-0.11.0_rc1-static-amd64.tar.bz2
bunzip2 wkhtmltopdf-0.11.0_rc1-static-amd64.tar.bz2
tar -xvf wkhtmltopdf-0.11.0_rc1-static-amd64.tar
cp wkhtmltopdf-amd64 /usr/bin/wkhtmltopdf
ln /usr/bin/wkhtmltopdf /usr/bin/html2pdf

My favorite HTML to PDF converter

Anyone who works in web development knows that there are dozens of HTML to PDF converters. In PHP, I think domPdf is the best known; I have tried my luck with it, but the results were not good.

In some projects I use PDF heavily for report generation. I remember once spending hours trying to find a program that met my requirements; after I discovered Html2Pdf, I stopped looking.

The page concept ("page", "page_header", "page_footer") leaves the structure well laid out, so you only have to drop in the data. There is an example of how it works on Bitbucket.

To make things easier for myself I created a module for the Kohana framework; in fact, creating a Kohana module is so easy that I can hardly even claim I made it.

Html2Pdf makes my life easier and saves hours of work.

Convert HTML to PDF in C# using Spire.PDF

As far as I know, this is the simplest high-quality solution for converting any HTML file to PDF in C#, and it is the one I finally chose. It can convert complex HTML pages with rich elements such as HTTPS, CSS3, HTML5 and JavaScript. It is not free, though: there is a one-month free trial, and to use it in a commercial application you need to purchase a license, which comes at an acceptable price.

https://pdfapi.codeplex.com/

Code Snippet

// Create a PDF document
PdfDocument doc = new PdfDocument();
String url = "Http://apple.com/";
doc.LoadFromHTML(url, false, true, true);
// Save the PDF file
doc.SaveToFile("webpageaspdf.pdf");
doc.Close();
// Open the result with the default PDF viewer
System.Diagnostics.Process.Start("webpageaspdf.pdf");

Convert HTML to PDF in C# using wkhtmltopdf

wkhtmltopdf is a simple, free command-line utility that converts HTML to PDF using the WebKit rendering engine; from C# you use it by launching the executable as a separate process. However, it produces garbled characters when converting HTML pages encoded in GB2312.

http://code.google.com/p/wkhtmltopdf/

Code Snippet

// Requires: using System; using System.Diagnostics; using System.Web;

/// <summary>
/// Convert html to pdf document.
/// </summary>
/// <param name="url">URL address</param>
/// <param name="path">PDF save path</param>
public static bool HtmlToPdf(string url, string path)
{
    try
    {
        if (string.IsNullOrEmpty(url) || string.IsNullOrEmpty(path))
            return false;

        // Resolve wkhtmltopdf.exe relative to the web application.
        string str = System.Web.HttpContext.Current.Server.MapPath("wkhtmltopdf.exe");
        if (!System.IO.File.Exists(str))
            return false;

        Process p = new Process();
        p.StartInfo.FileName = str;
        // Quote both arguments so spaces in the URL or path survive.
        p.StartInfo.Arguments = " \"" + url + "\" \"" + path + "\"";
        p.StartInfo.UseShellExecute = false;
        // The output streams are not redirected because nothing reads them.
        p.StartInfo.CreateNoWindow = true;
        p.Start();
        // Wait for the conversion to finish instead of sleeping a fixed 500 ms.
        p.WaitForExit();
        return p.ExitCode == 0;
    }
    catch (Exception ex)
    {
        HttpContext.Current.Response.Write(ex);
    }
    return false;
}
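
For comparison, here is the same launch-the-executable idea as a rough Python sketch. It is not part of the C# code above; the `binary` and `timeout` parameters are assumptions added for illustration, and it assumes `wkhtmltopdf` is installed and reachable under the given name. It waits for the process to exit and checks the exit code, rather than pausing for a fixed interval.

```python
import os
import subprocess


def html_to_pdf(url, path, binary="wkhtmltopdf", timeout=60):
    """Convert `url` to a PDF at `path`; return True only on success."""
    if not url or not path:
        return False
    try:
        # Arguments are passed as a list, so no manual quoting is needed.
        result = subprocess.run(
            [binary, url, path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # Binary missing or not executable, or the conversion took too long.
        return False
    # Success means a zero exit code and an output file on disk.
    return result.returncode == 0 and os.path.exists(path)
```

A call such as `html_to_pdf("http://www.apple.com", "apple.pdf")` returns False instead of raising when the binary is missing, which keeps the calling code simple.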

