MS Word xml tags changed by OpenTBS

By: Jan
Date: 2013-08-26
Time: 14:26

MS Word xml tags changed by OpenTBS

In my PHP code I have a text string in which the str_replace function replaces <i> with the corresponding Word syntax: </w:t></w:r><w:r><w:rPr><w:i/></w:rPr><w:t>.
When I look at document.xml, I see that all the < characters have been replaced with &lt;.
I am puzzled as to why. Can anyone help me out?
By: Sarah Kemp
Date: 2013-08-26
Time: 17:31

Re: MS Word xml tags changed by OpenTBS

You can try strconv=no (http://www.tinybutstrong.com/manual.php#appendix_field) but I have not had much luck inserting WordML manually...
By: Jan
Date: 2013-08-27
Time: 09:17

Re: MS Word xml tags changed by OpenTBS

Hi Sarah,
Thanks for the reply, but I have already tried strconv with several values (yes, no, utf8), with no result.
By: Skrol29
Date: 2013-08-28
Time: 00:06

Re: MS Word xml tags changed by OpenTBS

Hi Jan,

What is your TBS field?

Did you have a look at this thread ?
http://www.tinybutstrong.com/forum.php?thr=2923
By: Jan
Date: 2013-08-28
Time: 08:42

Re: MS Word xml tags changed by OpenTBS

Hi Skrol,

Thanks for the reply.

What exactly do you mean by 'What is your TBS field?'? The syntax in my docx template is [onshow.overviewTrans], where 'overviewTrans' is a string of text, produced by an XSLT transform, that contains some HTML tags.
I would like to change these tags to docx XML. I tried with a str_replace, in which I also convert UTF-8 to ISO-8859-1 (for special characters).
Anyway, my str_replace replaces <i> with </w:t></w:r><w:r><w:rPr><w:color w:val="800000"/><w:i/></w:rPr><w:t>
and </i> with </w:t></w:r><w:r><w:rPr><w:color w:val="800000"/></w:rPr><w:t>.
If I do a PHP echo, the string is correct, but after loading the template and doing the merge, all the < and > become &lt; and &gt;: &lt;/w:t&gt;&lt;/w:r&gt;&lt;w:r&gt;&lt;w:rPr&gt;&lt;w:color w:val="800000"/&gt;&lt;w:i/&gt;&lt;/w:rPr&gt;&lt;w:t&gt;
So I am doing much the same as in the thread you mention.
If I manually change the characters in document.xml and zip it back into the docx, everything works as it should.

I also did what you recommend in the thread http://www.tinybutstrong.com/forum.php?thr=2957
meaning:
$TBS->LoadTemplate('my_template.docx', false);
This does the job as far as document.xml is concerned, but Word gives an error when it tries to open the docx.
The 'strconv=no' and 'protect=no' parameters in the template have no effect.

I am out of options here.
It would be great if you could get me on track!
By: Jan
Date: 2014-04-15
Time: 13:36

Re: MS Word xml tags changed by OpenTBS

Anyway, after months of other business I have had to come back to my original work.
Over the last few days I have spent several hours on my problem, and whatever I try, the outcome is either a docx with the wrong content or a corrupted docx.
I am pretty sure that before merging my string has the proper format (i.e. with the Word XML tags </w:t></w:r><w:r><w:rPr><w:color w:val="800000"/><w:i/></w:rPr><w:t>), but after merging, document.xml contains &lt; and &gt; instead of < and >.
So, anyone?
By: Skrol29
Date: 2014-04-16
Time: 02:05

Re: MS Word xml tags changed by OpenTBS

Did you use parameter "strconv=no" in this TBS field ?
By: Jan
Date: 2014-04-16
Time: 08:08

Re: MS Word xml tags changed by OpenTBS

Yes, I did. See my reply above on Date: 2013-08-27.
By: Skrol29
Date: 2014-04-16
Time: 22:19

Re: MS Word xml tags changed by OpenTBS

But in the post after that one, you said your TBS field is [onshow.overviewTrans].
By: Jan
Date: 2014-04-17
Time: 14:13

Re: MS Word xml tags changed by OpenTBS

Yep, I did, but I removed the parameters because they had either no impact or the wrong one, so why keep them in the code?
By: Skrol29
Date: 2014-04-17
Time: 23:07

Re: MS Word xml tags changed by OpenTBS

If you have TBS lower than 1.8.0 then it is not "strconv=no" but "htmlconv=no".

Otherwise, it may be a problem of split formatting (Word can store the text of one field across several XML runs).
Try this: in the template, write [onshow.overviewTrans;strconv=no]. Then cut it, and paste it back without formatting.

If nothing works, then I will need your template in order to check what is happening.
By: Jan
Date: 2014-04-18
Time: 08:55

Re: MS Word xml tags changed by OpenTBS

Thanks Skrol29,

I'll check the version and try again, and let you know what the outcome is.
By: Jan
Date: 2014-04-22
Time: 10:18

Re: MS Word xml tags changed by OpenTBS

Hi Skrol29,

I use TBS 1.8.0.
The cut and paste still results in a corrupt doc file (according to Word).
How do I send my Word template to you?
By: Skrol29
Date: 2014-04-23
Time: 01:04

Re: MS Word xml tags changed by OpenTBS

Hi Jan,

I have received your template. I understand your problem better now.
I thought you wanted to convert HTML to Word XML for only one field, but you seem to have plenty of them.

If you have only a few, localized fields that can contain strings to convert, then the solution should be simple (but delicate).
If you have plenty of them, then it can be quite difficult, because the replacement with "</w:t>..." can be wrong depending on where the TBS field is placed.

Do you have a lot of fields to convert, or only a few?
By: Jan
Date: 2014-04-23
Time: 09:06

Re: MS Word xml tags changed by OpenTBS

Hi Skrol29,

Only 2 fields should be converted:
1. 3rd page [onshow.overviewTrans;protect=no]
2. 17th page [onload.gravesTrans]
If you want, I could send the text that should be merged here.

The replacement of <i> and </i> is done as described above (Date: 2013-08-28).
By: Skrol29
Date: 2014-04-24
Time: 00:43

Re: MS Word xml tags changed by OpenTBS

Hi,

OK, then the solution is to use an "onformat" custom function.

DOCX:
[onshow.overviewTrans;onformat=f_html2docx]
[onshow.gravesTrans;onformat=f_html2docx]

PHP:
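function f_html2docx($FieldName, &$CurrVal, &$CurrPrm) {

    $el = 'i';
    $tag_open  = '<' . $el . '>';
    $tag_close = '</' . $el . '>';
    $nb = substr_count($CurrVal, $tag_open);
    // Convert only if there is at least one tag and opening and closing tags are balanced
    if ( ($nb > 0) && ($nb == substr_count($CurrVal, $tag_close)) ) {
        // Request no string conversion for this field, so the WordML is not escaped
        $CurrPrm['strconv'] = 'no';
        // <i> closes the current run and opens an italic run
        $CurrVal = str_replace($tag_open,  '</w:t></w:r><w:r><w:rPr><w:i/></w:rPr><w:t>', $CurrVal);
        // </i> closes the italic run and opens a plain run
        $CurrVal = str_replace($tag_close, '</w:t></w:r><w:r><w:t>', $CurrVal);
    }

}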

Interesting related topics :
http://www.tinybutstrong.com/forum.php?thr=2885
http://stackoverflow.com/questions/9315531/opentbs-convert-html-tags-to-ms-word-tags
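
To see what such an onformat callback does to a value, it can be called directly, outside of any template. The following is only a sketch: html2docx_demo is a hypothetical stand-alone name, the callback is invoked by hand rather than by TBS, and whether TBS honours the changed 'strconv' parameter at merge time is not verified here.

```php
<?php
// Hypothetical stand-alone variant of an onformat callback, for testing only.
function html2docx_demo($FieldName, &$CurrVal, &$CurrPrm) {
    $open  = '<i>';
    $close = '</i>';
    $nb = substr_count($CurrVal, $open);
    // Convert only when there is at least one <i> and every <i> has a </i>.
    if ($nb > 0 && $nb === substr_count($CurrVal, $close)) {
        $CurrPrm['strconv'] = 'no';
        $CurrVal = str_replace($open,  '</w:t></w:r><w:r><w:rPr><w:i/></w:rPr><w:t>', $CurrVal);
        $CurrVal = str_replace($close, '</w:t></w:r><w:r><w:t>', $CurrVal);
    }
}

// Balanced tags: the value is rewritten and the parameter is set.
$val = 'plain <i>italic</i> plain';
$prm = array();
html2docx_demo('overviewTrans', $val, $prm);
echo $val, "\n";

// Unbalanced tags: the value and the parameters are left untouched.
$bad = 'unbalanced <i>italic';
$badPrm = array();
html2docx_demo('overviewTrans', $bad, $badPrm);
echo $bad, "\n";
```

The balanced-tag check matters because a lone <i> would close a run without the matching markup for the rest of the text.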
By: Jan
Date: 2014-04-24
Time: 13:44

Re: MS Word xml tags changed by OpenTBS

Hi Skrol29,

I'll try your suggestion. In the meantime I made a workaround that works perfectly.
I found out that during the merge of the template, TinyButStrong or OpenTBS changes all the <, > and ' characters into HTML entities, i.e. < into &lt;, > into &gt; and ' into &quot;.
Word can't read that properly and gives a 'corrupt' message.
After solving that with a PHP replace, the result is perfect!
Anyway, thanks for thinking along and for the suggestions.
If you are interested in my workaround just let me know!
By: Skrol29
Date: 2014-04-24
Time: 14:09

Re: MS Word xml tags changed by OpenTBS

> If you are interested in my workaround just let me know!

The problem is that you've bypassed an OpenTBS protection.
If you open the template with the CharSet argument set to false
  $TBS->LoadTemplate('my_template.docx', false);
then any data merged into the template can corrupt the DOCX, since it can contain XML special characters.

So it is better to leave the protection on, and switch it off only for special data items such as those you have with HTML tags.
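
One way to keep ordinary text protected while still passing WordML through for a single item could look like the sketch below. It escapes the whole value first and then restores only the <i> tags as WordML. The function name html_i_to_wordml is hypothetical, and the result would still have to be merged with the conversion switched off for that one field (for example with the strconv=no parameter discussed in this thread).

```php
<?php
// Hypothetical helper: escape XML special characters in the text,
// but turn <i>...</i> into WordML runs.
function html_i_to_wordml($text) {
    // Escape everything first, so &, < and > in the data are safe for XML.
    $safe = htmlspecialchars($text, ENT_NOQUOTES, 'UTF-8');
    // The <i> tags are now &lt;i&gt; and &lt;/i&gt;; convert only those.
    $search  = array('&lt;i&gt;', '&lt;/i&gt;');
    $replace = array(
        '</w:t></w:r><w:r><w:rPr><w:i/></w:rPr><w:t>',
        '</w:t></w:r><w:r><w:t>',
    );
    return str_replace($search, $replace, $safe);
}

$out = html_i_to_wordml('Smith & Sons <i>a < b</i>');
echo $out, "\n";
```

With this order of operations, an ampersand or a stray < in the data stays escaped, and only the known tags become markup.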

By: Jan
Date: 2014-05-06
Time: 13:07

Re: MS Word xml tags changed by OpenTBS

I tried your suggestions, but
$TBS->LoadTemplate('my_template.docx', false);
again gives a corrupt docx, and the function f_html2docx has no effect; that is, the <, > and ' characters are still written as &lt;, &gt; and &quot; in the docx.
I have run out of options and will probably use my own workaround:
$TBS->LoadTemplate($docFile);
$TBS->MergeBlock('data', $data);

// create $id.docx
$file_name = $lastName . ' ' . $dateOfInvestigation . '.docx';
$TBS->Show(OPENTBS_FILE, $file_name);

$docXml = 'word/document.xml';
$zip = new ZipArchive;
if ($zip->open($file_name) === TRUE)
{
    $search = array('&lt;i&gt;', '&lt;/i&gt;', '&lt;b&gt;', '&lt;/b&gt;', '&quot;');
    // set replace var
    $replace = array('</w:t></w:r><w:r><w:rPr><w:color w:val="800000"/><w:i/></w:rPr><w:t>', '</w:t></w:r><w:r><w:rPr><w:color w:val="800000"/></w:rPr><w:t>', "", "", "'");
    // search and replace
    $newDocXml = str_replace($search, $replace, $zip->getFromName($docXml));
    // Delete the old...
    $zip->deleteName($docXml);
    // Write the new...
    $zip->addFromString($docXml, $newDocXml);
    // And write back to the filesystem.
    $zip->close();
}
else
{
    echo 'failed';
}
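
Before writing a replaced string back into the archive, it may help to check that it is at least well-formed XML once it sits inside a run. This is a minimal sketch, assuming PHP's DOM extension is available. The name is_wellformed_wordml is hypothetical, and the check covers well-formedness only, not whether Word accepts the markup.

```php
<?php
// Hypothetical helper: wrap a fragment in a minimal paragraph/run/text
// skeleton and test whether the result parses as XML.
function is_wellformed_wordml($fragment) {
    $ns  = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';
    $xml = '<w:p xmlns:w="' . $ns . '"><w:r><w:t>' . $fragment . '</w:t></w:r></w:p>';
    // Collect parser errors silently instead of emitting PHP warnings.
    $prev = libxml_use_internal_errors(true);
    $doc  = new DOMDocument();
    $ok   = $doc->loadXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($prev);
    return $ok;
}

$good = 'a </w:t></w:r><w:r><w:rPr><w:i/></w:rPr><w:t>b</w:t></w:r><w:r><w:t> c';
var_dump(is_wellformed_wordml($good));    // converted tags
var_dump(is_wellformed_wordml('a <i>b')); // leftover, unclosed HTML tag
var_dump(is_wellformed_wordml('a & b'));  // unescaped ampersand
```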
By: Vincent T
Date: 2015-08-18
Time: 21:49

Re: MS Word xml tags changed by OpenTBS

Hi, I know it has been a long time, but while running into a similar problem I think I found the solution to yours.

Instead of  $TBS->MergeBlock('data',$data); use  $TBS->MergeBlock('data','text', $data);
When I do it like that, the WordML is interpreted right away.
By: Jan
Date: 2015-08-19
Time: 08:28

Re: MS Word xml tags changed by OpenTBS

Hi Vincent T, thanks for the message.
If I have time I will look into your suggestion and let you know what the outcome is.