Synopsis of XML files stored in the archives supported by OpenTBS.
Version 2010-01-18
Extensions: odt, odg, ods, odf, odp, docx, xlsx and pptx
This file is incomplete; feel free to send your own comments to:
http://www.tinybutstrong.com/onlyyou.html
=================================
OpenOffice Documents (.ODT, .ODS, .ODG, .ODF, .ODP, .ODM)
=================================
All single quotes "'" in text are encoded as "&apos;", but the OpenTBS plugin replaces them automatically.
The main information is stored in the file 'content.xml'.
Pictures are stored in the directory 'Pictures' and must be registered in the file 'META-INF/manifest.xml'. (OpenTBS does this automatically when you use the parameter "addpic".)
Since OpenOffice 3.2, a picture that is not registered in the manifest file can produce an error message when the document is opened.
Video and sound cannot be stored in OpenOffice documents.
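The manifest rule above can be checked with a short script. The following is only a sketch, assuming the usual ODF manifest namespace and 'manifest:file-entry' elements carrying a 'manifest:full-path' attribute; the helper name 'unregistered_pictures' and the in-memory demo archive are hypothetical and are not part of OpenTBS.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumed namespace of META-INF/manifest.xml in OpenDocument archives.
MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"


def unregistered_pictures(archive):
    """Return the files under 'Pictures/' that have no entry in the manifest."""
    manifest = ET.fromstring(archive.read("META-INF/manifest.xml"))
    registered = {
        entry.get(f"{{{MANIFEST_NS}}}full-path")
        for entry in manifest.iter(f"{{{MANIFEST_NS}}}file-entry")
    }
    pictures = [
        name for name in archive.namelist()
        if name.startswith("Pictures/") and not name.endswith("/")
    ]
    return sorted(name for name in pictures if name not in registered)


# Hypothetical demo archive built in memory: one registered picture, one not.
demo_manifest = (
    '<manifest:manifest xmlns:manifest="urn:oasis:names:tc:opendocument:xmlns:manifest:1.0">'
    '<manifest:file-entry manifest:full-path="Pictures/logo.png" manifest:media-type="image/png"/>'
    '</manifest:manifest>'
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("META-INF/manifest.xml", demo_manifest)
    archive.writestr("Pictures/logo.png", b"")
    archive.writestr("Pictures/photo.jpg", b"")

with zipfile.ZipFile(buffer) as archive:
    missing = unregistered_pictures(archive)
```

In this demo, 'missing' lists the one picture that would need a manifest entry before the document is opened.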
Main file: 'content.xml':
-------------------------
Synopsis:
---------
...
Normal new lines are made with new paragraphs: <text:p> ... </text:p>
Simple new lines are made with <text:line-break/>
Page breaks are made with a new paragraph whose style has the attribute {fo:break-before="page"}.
Local styles (bold, color,...) are made with <text:span> ... </text:span>
Images: <draw:frame> ... <draw:image xlink:href="Pictures/..." /> ... </draw:frame>
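As an illustration of the paragraph and line-break rules above, here is a minimal sketch that builds one paragraph with Python's standard ElementTree module. It assumes the standard ODF 'text' namespace; the helper name 'make_paragraph' is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed ODF namespace for text elements in content.xml.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
ET.register_namespace("text", TEXT_NS)


def make_paragraph(lines, style_name=None):
    """Build one paragraph; every line after the first starts with a line break."""
    paragraph = ET.Element(f"{{{TEXT_NS}}}p")
    if style_name:
        # A style with fo:break-before="page" would turn this into a page break.
        paragraph.set(f"{{{TEXT_NS}}}style-name", style_name)
    paragraph.text = lines[0] if lines else ""
    for line in lines[1:]:
        line_break = ET.SubElement(paragraph, f"{{{TEXT_NS}}}line-break")
        line_break.tail = line
    return paragraph


paragraph_xml = ET.tostring(
    make_paragraph(["first line", "second line"]), encoding="unicode"
)
```

The resulting string holds a single paragraph element with one line-break element between the two pieces of text.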
Table in a document:
--------------------
...
...
...
Specific to .ODF (OpenOffice Math Formula):
-------------------------------------------
Any comment in the formula must be entered between text delimiters which are the double quotes (").
Newlines are made with the keyword 'newline' outside the text delimiter.
=================================
Ms Office Document (.DOCX, .XLSX, .PPTX)
=================================
Pictures are stored in the directory 'word/media/', 'ppt/media/' or 'xl/media/', depending on the type of document. (OpenTBS does this automatically when you use the parameter "addpic".)
Pictures also need to be registered in the Relationship file 'word/_rels/document.xml.rels' in order to be usable in a document. (OpenTBS does this automatically when you use the parameter "addpic".)
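The registration step described above can be sketched as follows. This is not how OpenTBS does it internally; it is an illustration that assumes the standard package relationships namespace and image relationship type. The helper name 'register_picture' and the sample content are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace and relationship type used by Office Open XML packages.
RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
IMAGE_TYPE = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/image"
ET.register_namespace("", RELS_NS)


def register_picture(rels_xml, target):
    """Append an image relationship and return (new id, updated XML)."""
    root = ET.fromstring(rels_xml)
    used_ids = {rel.get("Id") for rel in root}
    number = 1
    while f"rId{number}" in used_ids:
        number += 1
    new_id = f"rId{number}"
    ET.SubElement(
        root,
        f"{{{RELS_NS}}}Relationship",
        {"Id": new_id, "Type": IMAGE_TYPE, "Target": target},
    )
    return new_id, ET.tostring(root, encoding="unicode")


# Hypothetical content of 'word/_rels/document.xml.rels' before the picture is added.
sample_rels = (
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" '
    'Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles" '
    'Target="styles.xml"/>'
    '</Relationships>'
)
new_id, updated_rels = register_picture(sample_rels, "media/image1.png")
```

The returned id is the value a picture reference in the document would then point to.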
Picture synopsis in a Word document:
-------------------------------------
...
************************
Ms Word Document (.DOCX)
************************
Synopsis of the main file 'word/document.xml':
----------------------------------------------
<w:p>        New paragraph
  <w:pPr>    Parameters of the paragraph
    <w:rPr>  Set of parameters for a Run
  <w:r>      New run item. A run item is a set of content having common layout properties.
    <w:rPr>  Set of parameters for a Run. Examples: <w:i/> is italic, <w:b/> is bold.
    <w:t>Your text is here</w:t>
Simple new lines are made with <w:br/>
Page breaks are made with <w:br w:type="page" />
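To make the paragraph and run structure concrete, the sketch below reads the text back out of a small 'word/document.xml' fragment. It assumes the standard WordprocessingML namespace; the helper name 'paragraph_texts' and the sample document are hypothetical, and mapping page breaks to a form feed character is just a choice made for this illustration.

```python
import xml.etree.ElementTree as ET

# Assumed main namespace of word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_texts(document_xml):
    """Return one string per paragraph; breaks become '\\n' (line) or '\\f' (page)."""
    root = ET.fromstring(document_xml)
    texts = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        parts = []
        for node in paragraph.iter():
            if node.tag == f"{{{W_NS}}}t":
                parts.append(node.text or "")
            elif node.tag == f"{{{W_NS}}}br":
                is_page_break = node.get(f"{{{W_NS}}}type") == "page"
                parts.append("\f" if is_page_break else "\n")
        texts.append("".join(parts))
    return texts


# Hypothetical sample: a bold run with a simple line break, then a page break.
sample_document = (
    '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
    '<w:body>'
    '<w:p><w:r><w:rPr><w:b/></w:rPr><w:t>Hello</w:t><w:br/><w:t>world</w:t></w:r></w:p>'
    '<w:p><w:r><w:br w:type="page"/><w:t>Next page</w:t></w:r></w:p>'
    '</w:body>'
    '</w:document>'
)
texts = paragraph_texts(sample_document)
```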
What are attributes "w:rsidR" and "w:rsidRPr" for?
--------------------------------------------------
"w:rsidR" is a Revision ID. Each editing session on a document gets a new ID,
and each modification made during that session is marked with its RsID.
More info: http://blogs.msdn.com/brian_jones/archive/2006/12/11/what-s-up-with-all-those-rsids.aspx
Synopsis of a table inserted in a Word document:
------------------------------------------------
...
...
...
...
...
****************************
Ms Excel SpreadSheet (.XLSX)
****************************
An Excel workbook can have one or several worksheets. The contents of cells are saved in worksheets.
Worksheet files are named 'xl/worksheets/sheet1.xml', sheet2.xml, sheet3.xml, and so on.
These file names are internal names, not the names defined in Excel by the user. It seems
that there is always at least one worksheet file named 'sheet1.xml'.
All string values of cells are stored in the file 'xl/sharedStrings.xml'. The cells in fact
contain the index of the string in the sharedStrings.xml file. This separation will probably
make it difficult to merge an Excel sheet.
All sheets of the workbook are listed in the file 'xl/workbook.xml'.
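The difference between user-defined sheet names and internal file names can be seen by reading 'xl/workbook.xml'. The sketch below assumes the standard SpreadsheetML and relationships namespaces; the helper name 'list_sheets' and the sample workbook are hypothetical. Resolving each relationship id to an actual file would need the workbook's relationship file, which is not shown here.

```python
import xml.etree.ElementTree as ET

# Assumed namespaces used in xl/workbook.xml.
SHEET_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
REL_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"


def list_sheets(workbook_xml):
    """Return (user-visible name, relationship id) for each sheet, in order."""
    root = ET.fromstring(workbook_xml)
    return [
        (sheet.get("name"), sheet.get(f"{{{REL_NS}}}id"))
        for sheet in root.iter(f"{{{SHEET_NS}}}sheet")
    ]


# Hypothetical workbook with two sheets renamed by the user.
sample_workbook = (
    '<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" '
    'xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">'
    '<sheets>'
    '<sheet name="Budget" sheetId="1" r:id="rId1"/>'
    '<sheet name="Notes" sheetId="2" r:id="rId2"/>'
    '</sheets>'
    '</workbook>'
)
sheets = list_sheets(sample_workbook)
```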
Synopsis of a sheet file like 'xl/worksheets/sheet1.xml':
---------------------------------------------------------
...
<row> A range of one row in which several cells are defined
<c> Definition of a cell:
    Attribute r is the address of the cell in the sheet
    Attribute s is the style of the cell (the format). Styles are saved in the file 'xl/styles.xml' but I have not found the link yet.
    Attribute t is the type of data; by default it is numerical
    t="s" means that the displayed value is a string; the saved value is the index of the string in the file sharedStrings.xml.
<f>B13+B14</f> the formula. If there is no formula, this tag is absent.
<v>0</v> the inner value without formatting. If t="s" then the value is in fact the index of the string.
Synopsis of the Shared String file 'xl/sharedStrings.xml':
----------------------------------------------------------
<si><t>value or text</t></si>
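The link between a cell and the shared string file can be sketched as below. The code assumes the standard SpreadsheetML namespace and only distinguishes shared strings (t="s") from plain numbers; other cell types are returned as raw text. The helper names and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace shared by sheet files and sharedStrings.xml.
SHEET_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


def load_shared_strings(shared_xml):
    """Return the list of strings, in index order, from sharedStrings.xml."""
    root = ET.fromstring(shared_xml)
    return [
        "".join(t.text or "" for t in item.iter(f"{{{SHEET_NS}}}t"))
        for item in root.findall(f"{{{SHEET_NS}}}si")
    ]


def cell_values(sheet_xml, shared_strings):
    """Map each cell address to its value, resolving shared string indexes."""
    root = ET.fromstring(sheet_xml)
    values = {}
    for cell in root.iter(f"{{{SHEET_NS}}}c"):
        value = cell.find(f"{{{SHEET_NS}}}v")
        if value is None or value.text is None:
            continue
        cell_type = cell.get("t")
        if cell_type == "s":
            values[cell.get("r")] = shared_strings[int(value.text)]
        elif cell_type is None or cell_type == "n":
            values[cell.get("r")] = float(value.text)
        else:
            values[cell.get("r")] = value.text  # other types kept as raw text
    return values


# Hypothetical sample data: A1 points to shared string 1, B1 holds a formula result.
sample_shared = (
    '<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    '<si><t>Name</t></si><si><t>Total</t></si>'
    '</sst>'
)
sample_sheet = (
    '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    '<sheetData><row r="1">'
    '<c r="A1" t="s"><v>1</v></c>'
    '<c r="B1"><f>B13+B14</f><v>0</v></c>'
    '</row></sheetData>'
    '</worksheet>'
)
strings = load_shared_strings(sample_shared)
values = cell_values(sample_sheet, strings)
```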
**********************************
Ms PowerPoint Presentation (.PPTX)
**********************************
Remember to set all text to "Tools\Language\No check" when you edit the PowerPoint presentation; otherwise some TBS fields
can be split by XML tags related to language and spell checking.
Slides are listed in the file 'ppt/_rels/presentation.xml.rels', where an internal id is assigned to each of them.
The first slide almost always corresponds to the file 'ppt/slides/slide1.xml'.
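The slide listing mentioned above can be read with a few lines of Python. This is a sketch that assumes the standard package relationships namespace and slide relationship type; the helper name 'list_slides' and the sample content are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed relationship type that marks a slide in presentation.xml.rels.
SLIDE_TYPE = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/slide"


def list_slides(rels_xml):
    """Return {internal id: target path} for every slide relationship."""
    root = ET.fromstring(rels_xml)
    return {
        rel.get("Id"): rel.get("Target")
        for rel in root
        if rel.get("Type") == SLIDE_TYPE
    }


# Hypothetical content of 'ppt/_rels/presentation.xml.rels'.
sample_rels = (
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" '
    'Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/slideMaster" '
    'Target="slideMasters/slideMaster1.xml"/>'
    '<Relationship Id="rId2" '
    'Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/slide" '
    'Target="slides/slide1.xml"/>'
    '<Relationship Id="rId3" '
    'Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/slide" '
    'Target="slides/slide2.xml"/>'
    '</Relationships>'
)
slides = list_slides(sample_rels)
```

Targets are relative to the 'ppt/' directory, so 'slides/slide1.xml' corresponds to 'ppt/slides/slide1.xml'.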
Synopsis of a slide file like 'ppt/slides/slide1.xml':
------------------------------------------------------
<a:t>Some text here</a:t>
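The sketch below shows why a field can end up split when language or spell-check information is attached to the text: one visible string is spread over several runs. It assumes the standard DrawingML and PresentationML namespaces; the helper name 'slide_paragraphs', the sample slide and the field name '[onshow.name]' are only illustrative.

```python
import xml.etree.ElementTree as ET

# Assumed DrawingML namespace used for text inside slides.
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"


def slide_paragraphs(slide_xml):
    """Return the text of each paragraph, joining the text of all its runs."""
    root = ET.fromstring(slide_xml)
    return [
        "".join(t.text or "" for t in paragraph.iter(f"{{{A_NS}}}t"))
        for paragraph in root.iter(f"{{{A_NS}}}p")
    ]


# Hypothetical slide where spell checking has split one field across three runs.
sample_slide = (
    '<p:sld xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main" '
    'xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">'
    '<p:cSld><p:spTree><p:sp><p:txBody>'
    '<a:p>'
    '<a:r><a:rPr lang="en-US"/><a:t>[onshow.</a:t></a:r>'
    '<a:r><a:rPr lang="en-US" err="1"/><a:t>name</a:t></a:r>'
    '<a:r><a:rPr lang="en-US"/><a:t>]</a:t></a:r>'
    '</a:p>'
    '</p:txBody></p:sp></p:spTree></p:cSld>'
    '</p:sld>'
)
paragraphs = slide_paragraphs(sample_slide)
```

Joining the runs only recovers the visible text; in the XML source the field is still interrupted by tags.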