HTML-to-RTF Pro DLL - Programmer's Guide

The component converts HTML files (strings) into RTF or TXT files (strings) with images, tables, CSS etc.
This guide lists all functions and properties of the DLL to help you choose conversion settings so that the output RTF looks the way you want.
The HTML-to-RTF Pro DLL is written in C/C++; it is completely standalone and doesn't require MS Word or any other word processor.

functions:
 htmltortf_string()
 htmltortf_file()
 flushlist()
 
struct ConvertSettings
{
  int PreserveTables;           //keep tables/transform to text
  int PreserveImages;           //keep images
  int PreserveHyperlinks;       //keep hyperlinks
  int PreserveFontFace;         //keep font
  int PreserveFontSize;         //keep font sizes
  int PreserveFontColor;        //keep font colors
  int PreserveBackgroundColor;  //keep background colors
  int PreserveAlignment;        //keep alignment
  int PreserveTableWidth;       //keep width of table's columns
  int PreserveNestedTables;     //keep nested tables/transform
  int PageMarginLeft;           //set left page margin
  int PageMarginRight;          //set right page margin
  int PageMarginTop;            //set top page margin
  int PageMarginBottom;         //set bottom page margin
  int BorderVisibility;         //specify table borders visibility
  int PageOrientation;          //set portrait or landscape page
  int PageSize;                 //specify page size (Letter, A4 ...)
  int FontFace;                 //set default font
  int FontSize;                 //set default font size
  int PageAlignment;            //set default text alignment
  int RtfLanguage;              //specify RTF spelling language
  int Encoding;                 //select encoding
  int OutputTextFormat;         //output file format RTF or Text
  int PreservePageBreaks;       //keep page breaks
  int ImageCompatible;          //image type (WordPad or Word)
  int PageNumbers;              //specify page numbers
  char PageHeader[150];         //header
  char PageFooter[150];         //footer
  char HtmlPath[650];           //path
  int PageNumbersAlignV;        //page numbers vertical alignment
  int PageNumbersAlignH;        //page numbers horiz alignment
  int PreserveHR;               //keep horiz rules <hr>
  int RtfParts;                 //complete rtf file or only rtf body
  int CreateTraceFile;          //creates trace (debug) file
  char TraceFilePath[650];      //trace file path
  int TableCellPadding;         //set default cell padding in pixels
  int PreserveHttpImages;       //download remote images
};

char * htmltortf_string(char * html, char * rtf, struct ConvertSettings cs) - this function takes an HTML string and returns an RTF string. When the conversion has finished and you have the RTF string, call flushlist() to release the memory.

See C++ and Delphi samples

return values:
pointer (char *) to the RTF buffer; the memory is allocated by htmltortf_string(). Copy this RTF buffer into your own memory/buffer and then call flushlist().
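
The copy-then-release order described above could be sketched as follows. This is only a sketch: copy_rtf_buffer is a hypothetical helper name that is not part of the DLL, and it assumes the returned buffer is a NUL-terminated string.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of the DLL): duplicates the RTF buffer
   returned by htmltortf_string() into memory owned by the caller.
   Assumes the buffer is a NUL-terminated string.
   Returns NULL if rtf is NULL or if malloc() fails. */
char *copy_rtf_buffer(const char *rtf)
{
    size_t size;
    char *copy;

    if (rtf == NULL)
        return NULL;

    size = strlen(rtf) + 1;              /* include the terminating NUL */
    copy = (char *)malloc(size);
    if (copy != NULL)
        memcpy(copy, rtf, size);
    return copy;
}

/* Intended call order, following the description above:
 *
 *   char *result = htmltortf_string(html, rtf, cs);
 *   char *mine   = copy_rtf_buffer(result);   (keep our own copy)
 *   flushlist();                              (release the DLL's memory)
 *   ... use mine ...
 *   free(mine);
 */
```

After flushlist() has been called, only the copy should be used; the pointer returned by htmltortf_string() refers to memory that has been released.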

int htmltortf_file(char *htmlfile, char *outfolder, struct ConvertSettings cs) - this function converts an HTML file into an RTF file. This function supports image conversion.

See C++ and Delphi samples

return values:
0 - conversion successful

2 - not enough memory
3 - can't create output rtf/text file
4 - can't open html file
5 - html file has zero length
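
For logging, the return codes listed above can be turned into messages. A minimal sketch: htmltortf_file_message is a hypothetical helper name, not a DLL function, and any code that is not listed above is reported as undocumented rather than guessed.

```c
/* Hypothetical helper (not part of the DLL): maps the documented return
   codes of htmltortf_file() to short messages. Codes that are not listed
   in this guide fall through to the default branch. */
const char *htmltortf_file_message(int code)
{
    switch (code) {
    case 0:  return "conversion successful";
    case 2:  return "not enough memory";
    case 3:  return "can't create output rtf/text file";
    case 4:  return "can't open html file";
    case 5:  return "html file has zero length";
    default: return "undocumented return code";
    }
}
```

A caller could pass the result of htmltortf_file() straight to this helper and print the message together with the file name.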

int flushlist() - releases memory (necessary only after the function htmltortf_string())

PageHeader - text to put in the RTF page header (150 symbols max)


Possible values: any string, 150 symbols max
Default value: ""
Example:
strcpy(cs.PageHeader,"Page Header Example");
Example:
strcpy(cs.PageHeader,"{\\b Bold Page Header Example}");
Example:
strcpy(cs.PageHeader,"{\\b\\i Bold Italic Page Header Example}" );
Example:
strcpy(cs.PageHeader, "{\\qc Centered Page Header Example}");
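
strcpy() does not check the 150-symbol limit, so a longer string would write past the end of the array. A bounded copy is one way to guard against that. In this sketch set_text_field is a hypothetical helper name (not part of the DLL), and it assumes the terminating NUL counts toward the array size, so a 150-byte field holds at most 149 characters.

```c
#include <string.h>

/* Hypothetical helper (not part of the DLL): copies text into a fixed-size
   settings field such as PageHeader[150] without writing past its end.
   Returns 0 if the whole text fit, -1 if it was truncated or an argument
   was invalid. */
int set_text_field(char *field, size_t field_size, const char *text)
{
    size_t len;

    if (field == NULL || field_size == 0 || text == NULL)
        return -1;

    len = strlen(text);
    if (len >= field_size) {
        /* Keep the first field_size - 1 bytes and terminate the string. */
        memcpy(field, text, field_size - 1);
        field[field_size - 1] = '\0';
        return -1;
    }

    memcpy(field, text, len + 1);        /* copy including the NUL */
    return 0;
}
```

Usage would look like set_text_field(cs.PageHeader, sizeof cs.PageHeader, "{\\b Bold Page Header Example}"). A return value of -1 is worth checking when the text contains RTF groups, because a truncated string may end without its closing brace.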

PageFooter - text to put in the RTF page footer (150 symbols max)


Possible values: any string, 150 symbols max
Default value: ""
Example:
strcpy(cs.PageFooter, "Page Footer Example" );


BorderVisibility - sets the visibility of table borders
Possible values: Hidden, SameAsOriginalHtml, Visible.
Integer values:

Name                  Integer equivalent
Visible               1
Hidden                0
SameAsOriginalHtml    2

Example:
cs.BorderVisibility = 1; //visible
cs.BorderVisibility = 0; //hidden

Encoding - default encoding of HTML page
Possible values: AutoSelect, ISO-8859-1, ISO-8859-5, KOI8-R, UTF-8, Windows-1250, Windows-1251, Windows-1252, Windows-1253, Windows-1254, Windows-1255, Windows-1256, Windows-1257, Windows-1258.
Integer values:

Name            Integer equivalent
AutoSelect      0
ISO-8859-1      1
ISO-8859-5      2
KOI8-R          3
Windows-1251    4
UTF-8           5
Windows-1254    6
Windows-1256    7
Windows-1250    8
Windows-1252    9
Windows-1253    10
Windows-1255    11
Windows-1257    12
Windows-1258    13

Example:
cs.Encoding = 0; //AutoSelect
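
When the charset name is only known at run time (for example, read from a META tag), a small lookup can translate it to the integer the DLL expects. This is a sketch built from the table above: encoding_from_name and its helpers are hypothetical names, the comparison is case-insensitive by choice, and unknown names fall back to AutoSelect.

```c
#include <ctype.h>
#include <stddef.h>

/* Name / integer pairs copied from the Encoding table in this guide. */
struct encoding_entry {
    const char *name;
    int value;
};

static const struct encoding_entry encoding_table[] = {
    { "AutoSelect",   0 }, { "ISO-8859-1",   1 }, { "ISO-8859-5",   2 },
    { "KOI8-R",       3 }, { "Windows-1251", 4 },
    { "UTF-8",        5 },   /* assumed spelling of the entry with value 5 */
    { "Windows-1254", 6 }, { "Windows-1256", 7 }, { "Windows-1250", 8 },
    { "Windows-1252", 9 }, { "Windows-1253", 10 }, { "Windows-1255", 11 },
    { "Windows-1257", 12 }, { "Windows-1258", 13 }
};

/* Case-insensitive string equality for ASCII names. */
static int names_equal(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;                     /* true only if both strings ended */
}

/* Hypothetical helper (not part of the DLL): returns the integer for a
   charset name, or 0 (AutoSelect) if the name is NULL or not in the table. */
int encoding_from_name(const char *name)
{
    size_t i;

    if (name != NULL) {
        for (i = 0; i < sizeof encoding_table / sizeof encoding_table[0]; i++) {
            if (names_equal(name, encoding_table[i].name))
                return encoding_table[i].value;
        }
    }
    return 0;
}
```

The result can be assigned directly, e.g. cs.Encoding = encoding_from_name("windows-1251");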

OutputTextFormat - type of output file (RTF or TXT)
Possible values: Rtf, Text.
Integer values:

Name    Integer equivalent
Rtf     0
Text    1

Example: cs.OutputTextFormat = 0; //Rtf

FontFace - default font face
Possible values: Arial, Times New Roman, Verdana, Helvetica, Courier, Courier_New,
Times, Georgia, MS Sans Serif, Futura, Arial Narrow, Garamond, Impact,
Lucida_Console, Tahoma, Inform, Symbol, WingDings, Traditional Arabic
Integer values:

Name                  Integer equivalent
Arial                 0
Times New Roman       1
Verdana               2
Helvetica             3
Courier               4
Courier_New           5
Times                 6
Georgia               7
MS Sans Serif         8
Futura                9
Arial Narrow          10
Garamond              11
Impact                12
Lucida Console        13
Tahoma                14
Inform                15
Symbol                16
WingDings             17
Traditional Arabic    18

Example: cs.FontFace = 3; //Helvetica

FontSize - default font size
Possible values: Any size

Example:
cs.FontSize = 12;

PageAlignment - default page alignment
Possible values: AlignLeft, AlignRight, AlignCenter, AlignJustify.
Integer values:

Name            Integer equivalent
AlignLeft       0
AlignCenter     1
AlignRight      2
AlignJustify    3

Example: cs.PageAlignment = 3; //justify

PreserveAlignment - keep alignment as in HTML
Possible values: 1 or 0
Default value: 1

PreserveTables - convert tables or transform to text
Possible values: 1 or 0
Default value: 1

PreserveNestedTables - preserve nested tables as in HTML or transform them to plain tables
Possible values: 1 or 0
Default value: 0

PreserveImages - convert images or skip them (this feature works only in the function htmltortf_file())
Possible values: 1 or 0
Default value: 0

PreserveHttpImages - download remote images or skip them (this feature works only in the function htmltortf_file())
Possible values: 1 or 0
Default value: 0

PreserveFontFace - keep font face as in HTML
Possible values: 1 or 0
Default value: 0

PreserveFontSize - keep font size as in HTML
Possible values: 1 or 0
Default value: 0

PreserveFontColor - keep font color as in HTML
Possible values: 1 or 0
Default value: 1

PreserveBackgroundColor - keep background color as in HTML (for table columns and text)

Possible values: 1 or 0
Default value: 1

PreserveHyperlinks - keep hyperlinks as in HTML
Possible values: 1 or 0
Default value: 0

PreserveTableWidth - keep width of table columns as in HTML
Possible values: 1 or 0
Default value: 0

PageSize - select page size for RTF

Possible values: A4, A3, A5, B5, Letter, Legal, Executive, Monarh
Default value: Letter
Integer values:

Name         Integer equivalent
A4           0
A3           1
A5           2
B5           3
Letter       4
Legal        5
Executive    6
Monarh       7

Example: cs.PageSize = 4; //Letter

PageOrientation - select page orientation: Portrait or Landscape

Possible values: Portrait or Landscape
Integer values:

Name         Integer equivalent
Portrait     0
Landscape    1

Example: cs.PageOrientation = 0; //Portrait

PageMarginLeft, PageMarginRight, PageMarginTop, PageMarginBottom - set page margins

Possible values: any value
Default values:
   PageMarginLeft      = margin_25mm
   PageMarginRight    = margin_10mm
   PageMarginTop      = margin_10mm
   PageMarginBottom = margin_10mm

Name           Integer equivalent
margin_0mm     0
margin_5mm     5
margin_10mm    10
margin_15mm    15
margin_20mm    20
margin_25mm    25
margin_30mm    30
margin_35mm    35
margin_40mm    40
margin_45mm    45
margin_50mm    50

RtfLanguage - language which will be used for spelling in Microsoft Word
Possible values: l_Albanian, l_English, l_Belgian, l_Bulgarian, l_Hungarian, l_Danish, l_Spanish,
l_Italian, l_Latvian, l_Lithuanian, l_German, l_Netherlands, l_Norwegian, l_Portuguese, l_Romanian, l_Russian,
l_Ukrainian, l_Finnish, l_French, l_Czech, l_Swedish, l_Turkish, l_Arabic, l_Japanese, l_SimplifiedChinese,
l_TraditionalChinese, l_Korean, l_Thai, l_Polish, l_Brazil.
Integer values:

Name                    Integer equivalent
l_Albanian              1052
l_English               1033
l_Belgian               2067
l_Bulgarian             1026
l_Hungarian             1038
l_Danish                1030
l_Spanish               3082
l_Latvian               1062
l_Lithuanian            1063
l_German                1031
l_Netherlands           1043
l_Norwegian             2068
l_Portuguese            2070
l_Romanian              1048
l_Russian               1049
l_Ukrainian             1058
l_Finnish               1035
l_French                1036
l_Czech                 1029
l_Swedish               1053
l_Arabic                1053
l_Turkish               1055
l_Japanese              932
l_SimplifiedChinese     936
l_TraditionalChinese    950
l_Korean                949
l_Thai                  874
l_Italian               1040
l_Polish                1045
l_Brazil                1046

Default value: l_English
Example:
cs.RtfLanguage = 1033; //l_English

ImageCompatible - select image compatibility: image_Word or image_WordPad

Possible values: image_Word or image_WordPad
Default value: image_Word
Integer values:

Name             Integer equivalent
image_Word       0
image_WordPad    1

When you select image_Word, the images are stored in JPG, PNG and WMF formats:
{\pict...\pngblip....89504e470d0a1a0a000......}
{\pict...\jpegblip....ffd8ffe070d0a1a0a000......}
{\pict...\wmetafile8....010009000003de000......}

When you select image_WordPad, the images are stored in BMP format:
{\pict...\dibitmap0....28000000690100......}

Example:
cs.ImageCompatible = 1; //image_WordPad

PreservePageBreaks - preserve page breaks as in HTML

If '1', the converter starts a new page in the RTF each time it meets 'page-break-before:' or 'page-break-after:'.
Possible values: 1 or 0
Default value: 0
Example of HTML with page breaks:

<HTML>
<HEAD>
<TITLE>page-break-after</TITLE>
<STYLE>
P.after {page-break-after: always}
P.before {page-break-before: always}
</STYLE>
</HEAD>
<BODY>
<p class="before">Some text</p>
<p class="after">Some text</p>
<p>Some text</p>
</BODY>
</HTML>

 

PageNumbers - put page numbers

Possible values: PageNumDisable, PageNumFirst and PageNumSecond
Default value: PageNumDisable
Integer values:

Name              Integer equivalent
PageNumDisable    0
PageNumFirst      1
PageNumSecond     2

Example: cs.PageNumbers = 1; //PageNumFirst

PageNumbersAlignH - horizontal alignment of page numbers
Possible values: AlignLeft, AlignRight, AlignCenter.
Integer values:

Name           Integer equivalent
AlignLeft      0
AlignCenter    1
AlignRight     2

Default value: AlignCenter
Example:
cs.PageNumbersAlignH = 0; //AlignLeft

PageNumbersAlignV - vertical alignment of page numbers
Possible values: AlignTop, AlignBottom
Integer values:

Name           Integer equivalent
AlignTop       4
AlignBottom    5

Default value: AlignBottom
Example:
cs.PageNumbersAlignV = 4; //AlignTop

PreserveHR - keep horizontal rule <HR> as in HTML

Possible values: 1 or 0
Default value: 1

RtfParts - create a complete RTF document or only the RTF body, which you can insert into another RTF file

Possible values: RtfCompletely, RtfBody
Default value: RtfCompletely

Name             Integer equivalent
RtfCompletely    0
RtfBody          1


Example:
cs.RtfParts = 1;

CreateTraceFile - makes the DLL create a trace text file; this file contains debug info and helps to find errors

Possible values: 0 or 1
Default value: 0
Example:
cs.CreateTraceFile = 1;
By default the trace file is created at C:\htmltortf-trace.txt; you can specify another path for this file with the following parameter:

TraceFilePath - path for tracing file

Possible values: any string
Default value: "C:\htmltortf-trace.txt"
Example:
strcpy(cs.TraceFilePath, "D:\\report.txt");

TableCellPadding - set default cell padding in pixels for all tables in the file being converted

Possible values: 0 to 10
Default value: 2
Example:
cs.TableCellPadding = 10;
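
The default values listed throughout this guide can be collected into one initializer, so that no field of the structure is left uninitialized before a conversion. The following is a sketch under stated assumptions: init_default_settings is a hypothetical helper (not part of the DLL), the structure is reproduced from the listing in this guide, and the fields for which no default is documented are given placeholder values that are marked in the comments.

```c
#include <string.h>

/* Layout reproduced from the ConvertSettings listing in this guide;
   prefer the declaration shipped with the DLL samples if it differs. */
struct ConvertSettings {
    int PreserveTables;
    int PreserveImages;
    int PreserveHyperlinks;
    int PreserveFontFace;
    int PreserveFontSize;
    int PreserveFontColor;
    int PreserveBackgroundColor;
    int PreserveAlignment;
    int PreserveTableWidth;
    int PreserveNestedTables;
    int PageMarginLeft;
    int PageMarginRight;
    int PageMarginTop;
    int PageMarginBottom;
    int BorderVisibility;
    int PageOrientation;
    int PageSize;
    int FontFace;
    int FontSize;
    int PageAlignment;
    int RtfLanguage;
    int Encoding;
    int OutputTextFormat;
    int PreservePageBreaks;
    int ImageCompatible;
    int PageNumbers;
    char PageHeader[150];
    char PageFooter[150];
    char HtmlPath[650];
    int PageNumbersAlignV;
    int PageNumbersAlignH;
    int PreserveHR;
    int RtfParts;
    int CreateTraceFile;
    char TraceFilePath[650];
    int TableCellPadding;
    int PreserveHttpImages;
};

/* Hypothetical helper (not part of the DLL): fills *cs with the defaults
   documented in this guide. */
void init_default_settings(struct ConvertSettings *cs)
{
    /* Covers every documented default of 0 and the empty header/footer. */
    memset(cs, 0, sizeof *cs);

    /* Documented non-zero defaults. */
    cs->PreserveTables          = 1;
    cs->PreserveFontColor       = 1;
    cs->PreserveBackgroundColor = 1;
    cs->PreserveAlignment       = 1;
    cs->PreserveHR              = 1;
    cs->PageMarginLeft          = 25;    /* margin_25mm */
    cs->PageMarginRight         = 10;    /* margin_10mm */
    cs->PageMarginTop           = 10;    /* margin_10mm */
    cs->PageMarginBottom        = 10;    /* margin_10mm */
    cs->PageSize                = 4;     /* Letter */
    cs->RtfLanguage             = 1033;  /* l_English */
    cs->PageNumbersAlignH       = 1;     /* AlignCenter */
    cs->PageNumbersAlignV       = 5;     /* AlignBottom */
    cs->TableCellPadding        = 2;
    strcpy(cs->TraceFilePath, "C:\\htmltortf-trace.txt");

    /* No default is documented for these two fields; the values below are
       placeholders chosen for this sketch. */
    cs->BorderVisibility = 2;            /* SameAsOriginalHtml */
    cs->FontSize         = 12;
}
```

The remaining fields without a documented default (PageOrientation, FontFace, PageAlignment, Encoding, OutputTextFormat, HtmlPath) are left at 0 or empty by the memset() call; set them explicitly if that is not what you need.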