OpenTBS - create OpenOffice and Ms Office documents with PHP

version 1.7.5, 2012-02-14, by Skrol29
help file modified on 2012-02-14
  1. Introduction
  2. Installing
  3. Understanding principles
  4. Synopsis and code examples
  5. Demo
  6. Debugging your template
  7. What to do if Zlib extension is not enabled with PHP?
  8. Changelog
  9. License

1. Introduction

OpenTBS is a plug-in for the TinyButStrong Template Engine.

TinyButStrong is a PHP Template Engine which has special template syntax and allows you to design templates in their natural editing tools. But it normally works only for Text files, including XML and HTML.

With TinyButStrong and its plug-in OpenTBS, you can use the template engine to merge OpenOffice documents and Ms Office documents with many features. All OpenDocument Format (ODF) and Office Open XML (OOXML) documents can be merged with OpenTBS, and also XPS files (XPS is a PDF competitor provided by Microsoft). In fact, any zip archive containing XML, HTML or text files can be merged with OpenTBS.

What is special to OpenTBS:

You should know Template Engines and more specifically TinyButStrong to use OpenTBS.

2. Installing

Requirements:
Installation:

Just put the file "tbs_plugin_opentbs.php" with your PHP scripts.

3. Understanding principles

It is important to understand that OpenOffice and Ms Office (since version 2007) documents are technically zip archives containing XML files, even if the extension of the document is not ".zip". Those zip archives can contain other file types like pictures or sounds, but the document structure and the text contents are saved as XML files. The XML Synopsis summarizes the key entities of XML sub-files contained in OpenOffice and Ms Office documents.

TinyButStrong can merge XML files, but cannot read zip archives by itself. The plug-in OpenTBS extends the TinyButStrong methods LoadTemplate() and Show() to make them work with zip archives. But you do not have to bother with it because OpenTBS manages archives in a way that is invisible to you.

When the OpenTBS plugin is installed, the LoadTemplate() method becomes able to first load a zip archive (an OpenOffice or Ms Office document), and then to load the contents of any XML or Text files stored in the archive. You can then merge the contents of XML or Text files with all features of the TinyButStrong template engine. At the end, the Show() method renders the entire zip archive including the modified stored files. The result can be rendered as an HTTP download, a new file on the server's disk, or a PHP string.
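The load, merge and render cycle described above can be sketched as follows. This is only a minimal sketch under stated assumptions, not code from the OpenTBS package: the helper name merge_document() and the field name 'customer' are hypothetical, the template is assumed to contain a [customer] field, and $TBS is assumed to be a TinyButStrong instance with the plug-in installed as shown in section 4.1.

```php
<?php
// Hypothetical helper: merges one value into a template and saves the result.
// $TBS must be a TinyButStrong instance with the OpenTBS plug-in installed.
function merge_document($TBS, $template, $output, $customer)
{
    // Load the archive; its main XML sub-file is loaded automatically.
    $TBS->LoadTemplate($template);
    // Merge every [customer] field found in the loaded sub-file.
    $TBS->MergeField('customer', $customer);
    // Render the whole archive, including the modified sub-file, to disk.
    $TBS->Show(OPENTBS_FILE, $output);
    return $output;
}
```

A call such as merge_document($TBS, 'invoice.odt', 'invoice_merged.odt', 'ACME') would leave the template untouched and write the merged copy next to it; both file names are placeholders.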

Since OpenTBS version 1.3, you can also add and delete files in the archive. Before this version you could only modify existing files in the archive.

OpenTBS has automatic extension recognition. When you load a document which has one of the following extensions { odt, odg, ods, odf, odp, odm, docx, xlsx, pptx }, then the main XML file of the archive is automatically loaded, and some special character conversions are preset. For example, for all OpenDocument files, the stored file "content.xml" is automatically loaded.
Since version 1.6.0, if the extension is not recognized then OpenTBS also tries to identify the document type by the presence of its sub-files. And if all fails, then you can force the document type using a special command.

4. Synopsis and code examples

4.1. Preparation of TinyButStrong Template Engine with the OpenTBS plug-in

include_once('tbs_class.php');
include_once('tbs_plugin_opentbs.php');

$TBS = new clsTinyButStrong;
$TBS->Plugin(TBS_INSTALL, OPENTBS_PLUGIN);

4.2. Method LoadTemplate()

• Load an archive with the automatic extension recognition (explained above):

$TBS->LoadTemplate('document.odt'); // Load the archive 'document.odt'.

• Load an archive without the automatic extension recognition:

(supported since OpenTBS version 1.1)
$TBS->LoadTemplate('document.odt#');

• Load an archive and one file stored in this archive:

$TBS->LoadTemplate('document.odt#content.xml');

• Load an archive and several files stored in this archive:

$TBS->LoadTemplate('document.odt#content.xml;settings.xml');

• Load a stored file from the current archive:

$TBS->LoadTemplate('#content.xml'); // Load the stored file 'content.xml' from the current archive.

The archive must be previously loaded.
If the file is stored in a subfolder, then indicate the full path. For example: 'word/document.xml'.

• Load an archive with special data conversion:

(supported since OpenTBS version 1.3.2)
$TBS->LoadTemplate('document.odt', OPENTBS_ALREADY_UTF8);

OpenTBS manages XML files that are UTF8 encoded. But by default, it assumes that all the data to merge (which can come from PHP or SQL) is Ascii encoded, and thus it performs conversions. If you want to define the data conversion, then you can use one of the following constants (introduced in version 1.3.2): OPENTBS_DEFAULT, OPENTBS_ALREADY_XML or OPENTBS_ALREADY_UTF8.

Please note that if you need to change the data conversion for one or few fields only in your template, then you can use parameter "htmlconv" (see the TBS documentation for more details).
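The different forms of the LoadTemplate() argument listed in this section all follow the same pattern: an optional archive name, then '#', then the stored files separated by ';'. The helper below is a hypothetical illustration of that pattern (the function name is not part of OpenTBS); it only builds the string and does not load anything.

```php
<?php
// Hypothetical helper: builds the first argument expected by LoadTemplate().
// $archive is the document path ('' means "the current archive"),
// $subFiles is a list of inner paths such as 'word/document.xml'.
function opentbs_template_path($archive, $subFiles = array())
{
    if (count($subFiles) === 0) {
        // No stored file given: the automatic extension recognition applies.
        return $archive;
    }
    // Stored files follow '#' and are separated by ';'.
    return $archive . '#' . implode(';', $subFiles);
}

echo opentbs_template_path('document.odt', array('content.xml', 'settings.xml')), "\n";
echo opentbs_template_path('', array('word/document.xml')), "\n";
```

The resulting string can be passed as is to $TBS->LoadTemplate(), optionally followed by one of the data conversion constants.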

4.3. Method Show()

Render options for OpenTBS:

• Render the merged archive as an HTTP download: ($file_name is optional)

$TBS->Show(OPENTBS_DOWNLOAD, $file_name);

• Render the merged archive as an HTTP output with your customized HTTP headers:

header(...); // your custom headers here
$TBS->Show(OPENTBS_NOHEADER); // output the binary file without header

• Render the merged archive as a new file saved on the server's disk:

$TBS->Show(OPENTBS_FILE, $file_name);

• Render the merged archive as a PHP string:

(supported since OpenTBS version 1.1)
$TBS->Show(OPENTBS_STRING);
$string = $TBS->Source;

When you use OPENTBS_STRING then there is no output for the client. But instead, the binary source of the archive is placed into property $TBS->Source. This feature can be useful, for example, when you want to place the merged document into an email as an attached file.
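As an illustration of the email use case, the sketch below renders the merged archive into a string and encodes it for a MIME attachment part. The helper name is hypothetical and the code stops at producing the encoded body; how the message itself is built and sent depends on your mailing code and is outside the scope of this sketch. $TBS is assumed to be prepared as in section 4.1, with a template already loaded and merged.

```php
<?php
// Hypothetical helper: renders the merged archive and returns it as a
// base64 body suitable for a MIME attachment part.
function merged_document_as_base64($TBS)
{
    // Nothing is sent to the client with this render option.
    $TBS->Show(OPENTBS_STRING);
    // Binary contents of the merged archive.
    $binary = $TBS->Source;
    // Base64 text split into lines of 76 characters ending with CRLF.
    return chunk_split(base64_encode($binary));
}
```

The same $TBS->Source string could also be written to disk with file_put_contents() if you need both a saved copy and an attachment.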

4.4. Change data of series in charts

• Change series in charts:

(supported since OpenTBS version 1.6.0, for Ms Word only)
Example: $TBS->PlugIn(OPENTBS_CHART, $ChartNameOrNum, $SeriesNameOrNum, $NewValues, $NewLegend=false)

This command changes the values of a series in a Chart of the document. The chart will be automatically actualized when the merged document is opened because OpenTBS also breaks the link between the chart and its cached view.
The result is true if the series is modified with success, otherwise the result is false.

Argument          Description
$ChartNameOrNum   Internal name of the XML file that contains the chart definition (with or without the extension), or the order number of the chart in the document (first is number 1).
                  You can use the command OPENTBS_DEBUG_CHART_LIST in order to view all chart internal names in the document.
                  $ChartNameOrNum is typically 'chart1' or 1.
$SeriesNameOrNum  Exact caption of the series in the chart, or its number (first is number 1). Typically 'Series 1' or 1.
$NewValues        The new data of the series. Must be an array, or value false if you want to delete the series.
                  The array can store data in 2 ways:
                  $NewValues = array( array('cat1', 'cat2', 'cat3', ...), array(val1, val2, val3, ...) );
                  or
                  $NewValues = array('cat1'=>val1, 'cat2'=>val2, 'cat3'=>val3, ...);
                  If the chart type is "X Y (Scatter)" then you must use only the first data store type.
$NewLegend        Optional. The new caption of the series.
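The two accepted shapes of $NewValues hold the same information, so one can be derived from the other. The helper below is a hypothetical sketch (its name is not part of OpenTBS) that converts the 'category'=>value shape into the two-list shape, which is the only one accepted for scatter charts. The category names and values are made up for the example.

```php
<?php
// Hypothetical helper: converts array('cat1'=>val1, ...) into
// array(array('cat1', ...), array(val1, ...)).
function chart_values_as_two_lists($assoc)
{
    return array(array_keys($assoc), array_values($assoc));
}

$NewValues = chart_values_as_two_lists(array('jan' => 12, 'feb' => 15, 'mar' => 9));
// $NewValues can then be passed as the fourth argument of OPENTBS_CHART, e.g.:
// $TBS->PlugIn(OPENTBS_CHART, 'chart1', 'Series 1', $NewValues);
print_r($NewValues);
```

Keep in mind that PHP truncates float array keys to integers, so for scatter charts with non-integer X values it is safer to build the two lists directly rather than to convert an associative array.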
Please note:

4.5. Change pictures in the document

• Change an internal picture with a new one:

(supported since OpenTBS version 1.4.0, for OpenOffice and MsOffice documents only)
Example: [onshow.x;ope=changepic;from='../pic/[val].gif';as='[val].gif';default=current]

In the example above, $x is a PHP global variable containing the name or the path for an external picture file. But the feature works similarly with data items merged with a TBS block.

When a TBS field having "ope=changepic" is merged in the template, then OpenTBS will search the first picture located before the field (to be more precise, the TBS field must be located somewhere inside or after the opening tag of the picture in the template source code), and then it will change the picture assuming that the value of the field is the path for a picture file on the server. You don't have to care about loading the picture file in the document, OpenTBS will manage this for you.

Note that parameter "ope=changepic" is a feature provided by the OpenTBS plug-in, which extends the "ope" parameter natively present with TBS.

In order to simplify your coding, there are other complementary parameters that are provided for the changepic parameter:

Parameter  Description
from       Reformulate the path of the picture to insert. The parameter's value can contain the [val] keyword or any [var] fields, they work the same way as with parameter "file". Parameter "from" is optional.
as         Reformulate the name that the picture will take inside the document. It is rarely needed, but it can help in some cases. Note that the external picture file is not renamed. The new name must be defined without path. The parameter's value can contain the [val] keyword or any [var] fields, they work the same way as with parameter "file". Parameter "as" is optional.
default    Define the picture that should be used when the expected one is not found. The parameter's value must be the path of a file on the server, or the keyword "current". If you've set "default=current" then OpenTBS will keep the picture of the template if the expected one is not found.
adjust     Adjust the size of the picture in the document. This parameter requires that PHP is configured with the GD extension, which is usually the case.
           Values can be one of the following:
           adjust (or adjust=inside): The picture is adjusted to fit inside the picture bounds of the template.
           adjust=samewidth: The picture is adjusted to have the same width as the picture of the template.
           adjust=sameheight: The picture is adjusted to have the same height as the picture of the template.
           adjust=100% (or another percentage): The picture is adjusted to be proportional to the original size.
           Parameter adjust is supported since OpenTBS version 1.7.0.
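On the PHP side, the example field [onshow.x;ope=changepic;...] only needs the global variable $x to be set before the document is rendered. The sketch below shows one possible way to do it. The helper name is hypothetical, and it assumes that $TBS is prepared as in section 4.1 and that the template contains the field placed after the picture to change.

```php
<?php
// Hypothetical helper: sets the global variable read by [onshow.x;ope=changepic;...]
// and renders the merged document to a file.
function merge_with_picture($TBS, $template, $output, $pictureName)
{
    // [onshow.x] is an automatic field: it takes the value of the PHP global
    // variable $x when the Show() method is called.
    $GLOBALS['x'] = $pictureName;
    $TBS->LoadTemplate($template);
    // With from='../pic/[val].gif', a value 'logo' makes OpenTBS look for
    // the file '../pic/logo.gif' on the server.
    $TBS->Show(OPENTBS_FILE, $output);
}
```

Because the picture path is resolved on the server, a relative "from" path such as '../pic/' depends on the current working directory of the PHP script.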

4.6. Manual modification in the contents

The following commands are supported since OpenTBS version 1.7.0:

Command Description
$TBS->PlugIn(OPENTBS_SELECT_MAIN) Select and load the main sub-file in the opened template. For example in a Writer document, or an Ms Word document, this command can bring you back from the merging of a header to the main body.
$TBS->PlugIn(OPENTBS_SELECT_SHEET, $Sheet)

Select and load the sub-file corresponding to $Sheet.
This command will raise an error if the opened template is not a Workbook (Ms Excel or OpenOffice Calc). This command is useless for an OpenOffice Calc workbook because all sheets are saved in a single sub-file. Nevertheless using it won't raise an error.
$Sheet must be a sheet identifier.
A sheet identifier can be either an integer corresponding to the index of the sheet, or a string corresponding to the name of the sheet.
Use command $TBS->PlugIn(OPENTBS_DEBUG_INFO) to list all id and name of sheets in current Workbook.

$TBS->PlugIn(OPENTBS_DISPLAY_SHEETS, $Sheets[, $Visible]) Make one or several sheets visible or hidden.
This command will raise an error if the opened template is not a Workbook.
$Sheets must be an array of sheet identifier, or even a single sheet identifier. See command OPENTBS_SELECT_SHEET for more details about sheet identifiers.
$Visible must be a boolean, default value is true.
$TBS->PlugIn(OPENTBS_DELETE_SHEETS, $Sheets[, $Delete]) Mark one or several sheets as deleted or not.
This command will raise an error if the opened template is not a Workbook.
$Sheets must be an array of sheet identifier, or even a single sheet identifier. See command OPENTBS_SELECT_SHEET for more details about sheet identifiers.
$Delete must be a boolean, default value is true.
Please note that for now, you must not delete a sheet that contains a Pivot Table because this will produce an error when the workbook is opened.
$TBS->PlugIn(OPENTBS_DELETE_COMMENTS) Delete all usual user comments in the opened template.
$TBS->PlugIn(OPENTBS_DELETE_ELEMENTS, $Elements) Delete XML elements in the current sub-file.
$Elements must be an array of strings. For example:
$Elements = array('w:bookmarkStart', 'w:bookmarkEnd')
This will delete all bookmarks in an Ms Word document.
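The sheet commands above can be combined in a single merge. The following is a minimal sketch under stated assumptions: the helper name, the sheet names 'Data' and 'Notes' and the block name 'r' are all hypothetical, and $TBS is assumed to be prepared as in section 4.1. It uses sheet names rather than indexes as sheet identifiers.

```php
<?php
// Hypothetical helper: merges rows into one sheet and hides another sheet.
// The sheet names 'Data' and 'Notes' and the block name 'r' are assumptions.
function build_workbook($TBS, $template, $output, $rows)
{
    $TBS->LoadTemplate($template);
    // Select and load the sub-file of the sheet named 'Data'.
    $TBS->PlugIn(OPENTBS_SELECT_SHEET, 'Data');
    // Merge the rows with the TBS block named 'r' defined in that sheet.
    $TBS->MergeBlock('r', $rows);
    // Hide the sheet named 'Notes' in the merged workbook.
    $TBS->PlugIn(OPENTBS_DISPLAY_SHEETS, 'Notes', false);
    $TBS->Show(OPENTBS_FILE, $output);
}
```

If you are unsure about the identifiers of the sheets in your own workbook, the command OPENTBS_DEBUG_INFO lists them.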

4.7. Manual modification of files in the archive

• Check if a file exists in the archive:

$TBS->Plugin(OPENTBS_FILEEXISTS, $Name)

Return true or false. $Name must include the inner path.
For example : $Name = 'META-INF/manifest.xml';

(supported since OpenTBS version 1.7.4)

• Add any new file in the archive:

// OpenTBS >= 1.6.0
$TBS->Plugin(OPENTBS_ADDFILE, $Name, $Data, $DataType=OPENTBS_STRING, $Compress=true);

// Deprecated since OpenTBS 1.6.0
$TBS->Plugin(OPENTBS_PLUGIN, OPENTBS_ADDFILE, $Name, $Data, $DataType=OPENTBS_STRING, $Compress=true);

If $Data is false then the file previously added with the given name, if any, is canceled.

$DataType must be OPENTBS_STRING if $Data is the content to add ; it must be OPENTBS_FILE if $Data is the path of the external file to insert.

$Compress can be true, false or an array with keys ('meth','len_u','crc32') which means that the data is already previously compressed.

(supported since OpenTBS version 1.3)

• Replace an existing file in the archive:

$TBS->Plugin(OPENTBS_REPLACEFILE, $Name, $Data, $DataType=OPENTBS_STRING, $Compress=true);

The arguments are the same as command OPENTBS_ADDFILE.
Please note that any TBS merge on a file in the archive will cancel previous or future replacements.

(supported since OpenTBS version 1.7.4)

• Delete an existing file in the archive:

// OpenTBS >= 1.6.0
$TBS->Plugin(OPENTBS_DELETEFILE, $Name);

// Deprecated since OpenTBS 1.6.0
$TBS->Plugin(OPENTBS_PLUGIN, OPENTBS_DELETEFILE, $Name);

Delete the existing file in the archive, or a file previously added using the OPENTBS_ADDFILE command.

(supported since OpenTBS version 1.3)

• Reset all modifications in the archive:

// OpenTBS >= 1.6.0
$TBS->Plugin(OPENTBS_RESET);

// Deprecated since OpenTBS 1.6.0
$TBS->Plugin(OPENTBS_PLUGIN, OPENTBS_RESET);

The automatic extension recognition is also applied as it was applied for the first load of the archive.
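The file commands of this section can be combined, for example to write a stored file whether or not it already exists in the archive. This is only a sketch: the helper name is hypothetical, $TBS is assumed to be prepared as in section 4.1 with an archive already loaded, and the command name OPENTBS_REPLACEFILE is the one listed in the changelog for version 1.7.4, so the code requires at least that version.

```php
<?php
// Hypothetical helper: adds the stored file $name, or replaces it if it is
// already in the archive. Returns which of the two actions was taken.
function add_or_replace_file($TBS, $name, $data)
{
    // $name must include the inner path, e.g. 'META-INF/manifest.xml'.
    if ($TBS->Plugin(OPENTBS_FILEEXISTS, $name)) {
        // $data is the content itself, hence OPENTBS_STRING.
        $TBS->Plugin(OPENTBS_REPLACEFILE, $name, $data, OPENTBS_STRING);
        return 'replaced';
    }
    $TBS->Plugin(OPENTBS_ADDFILE, $name, $data, OPENTBS_STRING);
    return 'added';
}
```

Remember that a later TBS merge on the same stored file cancels the replacement, so this kind of helper is best used on files that are not merged.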

4.8. Miscellaneous

• Dealing with apostrophes:

Both OpenOffice and Ms Office may automatically convert single quotes (') into typographic apostrophes (’), depending on the auto-correction options. This may be annoying when you need to code a TBS field that has a single quote. That's why OpenTBS by default automatically converts all (’) back to single quotes (') in documents.
If you want to stop this conversion, you can set $TBS->OtbsConvertApostrophes = false; and no apostrophes will be converted. Note that you can undo the auto-correction of single quotes (') in Ms Word using keys [ctrl]+[z], and in OpenOffice using the cancel button.

Property OtbsConvertApostrophes is supported since OpenTBS version 1.6.0.

• Forcing the document type recognition:

You can force the document type recognition using command OPENTBS_FORCE_DOCTYPE. Example:

$TBS->PlugIn(OPENTBS_FORCE_DOCTYPE, 'docx');

This command is supported since OpenTBS version 1.6.0.

• Retrieving the name of the current document:

Property $TBS->tbsCurrFile indicates the name of the current file loaded from the archive. The value is false if no file is loaded yet from the archive.

Other TinyButStrong methods and properties stay unchanged and are available for merging your template.

(supported since OpenTBS version 1.1)

5. Demo

The OpenTBS package includes a full set of runnable templates. Some templates can contain useful complementary information for designing.
Run the following demo under PHP: OpenTBS demo

6. Debugging your template

Since OpenTBS version 1.6.0, there are several commands for debugging. Please note that those commands do not exit the process.

Command Description
$TBS->PlugIn(OPENTBS_DEBUG_INFO [, $Exit]) Display technical information about the current loaded template, including sheet information if the template is a workbook, and chart information if the template has any.
$Exit must be a boolean, default value is true.
$TBS->PlugIn(OPENTBS_DEBUG_XML_CURRENT) Display XML contents of sub-files already opened and modified for merging. XML is indented in order to improve reading.
$TBS->PlugIn(OPENTBS_DEBUG_XML_SHOW) Ends the merge process as if the final document was created. But instead of creating the document, displays the XML contents of sub-files modified for merging. XML is indented in order to improve reading.
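A typical debugging session chains these commands in place of the final Show(). The sketch below is an assumption about how they can be combined, not a procedure taken from the OpenTBS package: the helper name is hypothetical, $TBS is assumed to be prepared as in section 4.1, and $Exit is passed explicitly as false so that the script is expected to continue to the XML display.

```php
<?php
// Hypothetical helper: displays information about a template, then the XML
// of the merged sub-files instead of producing the final document.
function debug_template($TBS, $template)
{
    $TBS->LoadTemplate($template);
    // Display sheet and chart information; $Exit is set to false explicitly.
    $TBS->PlugIn(OPENTBS_DEBUG_INFO, false);
    // Your MergeBlock() and MergeField() calls would normally go here.
    // End the merge and display the indented XML of the modified sub-files.
    $TBS->PlugIn(OPENTBS_DEBUG_XML_SHOW);
}
```

Once the XML looks correct, replace the call to OPENTBS_DEBUG_XML_SHOW with the Show() render option you need.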

There are also deprecated debug options:

Command Description
$TBS->PlugIn(OPENTBS_DEBUG_XML) Does the same as $TBS->PlugIn(OPENTBS_DEBUG_XML_SHOW);
Supported since OpenTBS version 1.3.2.
$TBS->PlugIn(OPENTBS_DEBUG_XML+OPENTBS_DEBUG_AVOIDAUTOFIELDS) Avoid merging of [onload], [onshow] and [var]. Supported since OpenTBS version 1.3.2.
$TBS->Render = OPENTBS_DEBUG_AVOIDAUTOFIELDS; Also works in property Render. Supported since OpenTBS version 1.3.2.
$TBS->PlugIn(OPENTBS_DEBUG_CHART_LIST) Does the same as $TBS->PlugIn(OPENTBS_DEBUG_INFO); Supported since OpenTBS version 1.6.0.

Otherwise, here are some indications that may help with the issues you can meet when merging:

a) The merged document is producing error messages when opened with its application (OpenOffice or Ms Office)

The most likely causes are:

• You've chosen the OPENTBS_DOWNLOAD render option but a php error message or any other unexpected content has been output before by PHP.

Activate the debug mode using the command OPENTBS_DEBUG_XML_SHOW. It helps to check for PHP error messages and other unexpected content.

• The merging has produced an invalid document or an invalid XML content in an XML file of the document.

Activate the debug mode using the command OPENTBS_DEBUG_XML_SHOW. It helps to check the XML contents of merged files.

See section (b) below for more information on the XML structure of the files.

b) The merged document is well opened by its application (OpenOffice or Ms Office) but the content is not designed as expected

First, you can have a look at the demo templates, they contain examples and advice for each type of document.
And to go further: even if you can edit your template directly with OpenOffice or Ms Office, you will probably need to understand the XML tags and attributes to complete your merge. The file xml_synopsis.txt is a small synopsis of the XML structure you can find in the inner source of those documents. Have a look at it if you feel lost.

c) Go deeper in the debugging

You can view the inner source of a document using a zip software like 7-Zip. It allows you to open an archive even if the extension is not ".zip".

7. What to do if Zlib extension is not enabled with PHP?

OpenTBS uses Zlib functions in order to automatically uncompress and recompress files stored in the zip archive. If Zlib is not enabled, then you have to use your own uncompress/compress tool, or to prepare the template to have files uncompressed in the zip archive.
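Before preparing an uncompressed template, it can be worth checking whether Zlib is really missing on the server. The check below relies only on the standard PHP function extension_loaded(); the helper name is hypothetical and is not part of OpenTBS.

```php
<?php
// Hypothetical helper: tells whether the Zlib extension is available to PHP.
function zlib_is_enabled()
{
    return extension_loaded('zlib');
}

if (zlib_is_enabled()) {
    echo "Zlib is enabled: compressed templates can be used.\n";
} else {
    echo "Zlib is not enabled: store the merged files uncompressed in the archive.\n";
}
```

Run the check through the same PHP setup (web server or command line) as the one that will run the merge, since the two can be configured with different extensions.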

Example to uncompress the "content.xml" file in an ODT document using 7-Zip:
1) open the ODT file with 7-Zip
2) extract the "content.xml" file from the ODT file in the same folder as the ODT file
3) close 7-Zip
4) open 7-Zip, and change current directory to be the same as the ODT file
5) select the "content.xml" file and click on button [Add], or menu [File][7-Zip][Add to archive...]
6) A new window named "Add to archive" is opened,
    - replace the archive name with the ODT file name,
    - set the Compression level to "None".
7) Click on [Ok]
If you re-open the ODT file with 7-Zip, you can notice that the size and the uncompressed size are the same.
If the file should be placed in a sub-folder of the archive, then open the archive and rename the file in order to move it into the folder. For example, renaming "manifest.xml" to "META-INF\manifest.xml" will move it into META-INF. But moving the file will not delete a file that has the same name in the target folder. You have to delete the old one yourself.

8. Changelog

version 1.7.5, on 2012-02-14
- Avoid erroneous Ms Word merged documents when duplicating objects such as drawings and shapes.
- Based on TbsZip version 2.11
- New coding shortcut $TBS->TbsZip.
- More examples of formulas for Xlsx and Ods spreadsheets.

version 1.7.4, on 2011-10-20
- fixed bug: parameter "default=current" did not work and could build invalid documents when the target image was missing.
- new command OPENTBS_REPLACEFILE
- new command OPENTBS_FILEEXISTS

version 1.7.3, on 2011-10-13
- fixed bug: in Ms Word documents, automatic fields (onload, onshow) placed in headers and footers with parameter "ope=changepic" are producing an erroneous merge. In Word 2010 the picture may be missing, in Word 2007 the docx file may be considered as corrupted.

version 1.7.2, on 2011-10-12
- fixed bug: error when using command OPENTBS_SELECT_SHEET with a sheet name: Notice: Undefined index: xxx in xxx on line 1986.

version 1.7.1, on 2011-10-07
- fixed bug: first non-empty cell of an Excel Spreadsheet is never merged if it contains a TBS field.
- minor internal improvements.

version 1.7.0, on 2011-08-21
- new parameter 'adjust' for changing picture size
- new command OPENTBS_DEBUG_INFO
- new command OPENTBS_SELECT_MAIN
- new command OPENTBS_SELECT_SHEET
- new command OPENTBS_DISPLAY_SHEETS
- new command OPENTBS_DELETE_SHEETS
- new command OPENTBS_DELETE_COMMENTS
- new command OPENTBS_DELETE_ELEMENTS
- parameter 'changepic' is optimized

version 1.6.2, on 2011-07-12
- fixed bug: Ms Excel cells could consider as error some formatted values such as '0.00000000000000'.

version 1.6.1, on 2011-06-08
- fixed bug: some documents may be corrupted when created using OPENTBS_DOWNLOAD because of a PHP error "supplied argument is not a valid stream resource" or "Undefined property: clsOpenTBS::$OutputHandle".
- fixed bug: using keyword "xlsxNum", "xlsxDate" or "xlsxBool" inside a cell that is not merged can make a corrupted XLSX spreadsheet.
- improvement: updated templates in the demo.
- based on a TbsZip v2.8

version 1.6.0, on 2011-06-07
- new feature: merge charts in Ms Word documents.
- new feature: merge rows and columns in Ms Excel workbooks.
- new feature: new "ope" parameters for forcing cells type in Ms Excel (Numeric, Date and Boolean).
- new feature: debug mode enhanced.
- new feature: force the type of document using command OPENTBS_FORCE_DOCTYPE.
- new property: deal with apostrophes using property OtbsConvertApostrophes.
- improvement: if the document extension is not recognized, then try to recognize document type by sub-file presence.
- improvement: can use the Direct Command feature of TBS 3.7.0.
- based on a TbsZip v2.6

version 1.5.0, on 2011-03-20
- new feature: headers and footers are automatically loaded for OpenOffice & MsOffice.
- new feature: automatically cleans up spelling and change tracking information in MsWord templates (such information may break up the TBS tags). This feature can be disabled.
- new constant OPENTBS_DEBUG_AVOIDAUTOFIELDS
- improvement: Debug doesn't stop if an OpenTBS alert occurs.
- improvement: OpenTBS alerts say if the process will be stopped.
- fixed bug: in debug mode: "warning function.str-repeat: Second argument has to be greater than or equal to 0"
- fixed bug: when using OPENTBS_RESET: "Warning: Missing argument 2 for clsOpenTBS::OnCommand() in ... on line 225"
- fixed bug: DML images were not found when using parameter "ope=changepic" in a DOCX document
- fixed bug: the script ends and displays the XML contents when using parameter "ope=changepic" with a new image type in a DOCX document

version 1.4.1, on 2010-10-28
- major bug fixed: due to TbsZip, some added or modified files could be saved in the document with a wrong CRC control code. This could make software consider the document as corrupted, but it was often easily fixed by OpenOffice and Ms Office. Only a few CRC codes were wrongly saved, thus the bug was rare and could seem to appear randomly on a few documents.

version 1.4.0, on 2010-10-05
- new parameters "changepic" and "default"

version 1.3.3, on 2010-08-05
- property Version of OpenTBS version 1.3.2 was saying 1.3.1

version 1.3.2, on 2010-07-23
- possibility to change the default data conversion using the new constants OPENTBS_DEFAULT, OPENTBS_ALREADY_XML or OPENTBS_ALREADY_UTF8
- enhanced debug mode: listing of added, deleted and modified files; and show XML formatted contents of files merged with OpenTBS.

version 1.3.1, on 2010-07-01
- based on TbsZip version 2.1: fixes a bug that saved a wrong modification time when a file was added, and now saves the modification time when a file content is replaced.
- the addpic operator now automatically updates the "manifest.xml" file on OpenOffice documents. Without this fix, an ODP merged document could be opened with an error message with OpenOffice >= 3.2

version 1.3, on 2010-06-01
- a new plugin command that add a new file in the archive
- a new plugin command that deletes a file in the archive
- a parameter 'ope=addpic' that add a new picture in the archive directly from the template
- based on TbsZip v2 (modify/delete/add files in a zip archive)

version 1.1, on 2009-11-19
- New render option : OPENTBS_STRING
- New feature: can reset changes in the current archive using $TBS->Plugin(OPENTBS_PLUGIN, OPENTBS_RESET);
- New behavior: extension of the archive is ignored by LoadTemplate() if the name is ended with '#'
- Bug fixed: in case of several files to take from the archive in one shot, then only the last one had [onload] fields merged.

9. License

OpenTBS is under LGPL (Lesser General Public License)