Compare commits

...

3 Commits

4 changed files with 121 additions and 110 deletions

@@ -5,11 +5,7 @@
 Filter to install as an export filter for LibreOffice Calc. -->
 <!-- Dominique Meeùs, modified 21-5-2020, version 0.92.
 Hardcoded languages of the columns are : nl-BE, fr-BE, de-DE, en-GB, es-ES. -->
-<!-- Philippe Tourigny, modified 12-9-2022, version 0.94
-Enable the filter to read the language code from the first column
-in each row. Also, add a SYSTEM DOCTYPE declaration to the output
-XML file.-->
-<!-- Copyright 2013, 2020 Dominique Meeùs, and 2022 Philippe Tourigny@.
+<!-- Copyright 2013, 2020 Dominique Meeùs.
 This program is free software: you can redistribute it and/or modify it
 under the terms of the GNU Lesser General Public License
 as published by the Free Software Foundation,
@@ -26,68 +22,57 @@
 xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
 exclude-result-prefixes="office table text">
 <!-- Namespaces needed to access parts of the document -->
-<!-- Version 0.93: Add a SYSTEM DOCTYPE to output -->
-<xsl:output method = "xml" indent = "yes" encoding = "UTF-8"
-doctype-system="tmx14.dtd" omit-xml-declaration = "no"/>
+<xsl:output method = "xml" indent = "yes" encoding = "UTF-8" omit-xml-declaration = "no"/>
 <xsl:template match="/">
 <tmx version="1.4">
-<!-- Define variables to make code easier to manage.
-Cells in the first row are language code headings.
-The first target language is selected as the
-administrative language for the output TMX.
-The first column identifies the source language. -->
-<xsl:variable name="headingCell"
-select="//table:table/table:table-row[1]/table:table-cell"/>
-<xsl:variable name="adminlang"
-select="$headingCell[2]/text:p"/>
-<xsl:variable name="srclang"
-select="$headingCell[1]/text:p"/>
-<!-- Define the TMX header
-The <xsl:attribute> element is used because variables
-are not recognized if entered directly in attributes. -->
-<header>
-<xsl:attribute name="creationtool">TMX-export for LibreOffice</xsl:attribute>
-<xsl:attribute name="creationtoolversion">0.95</xsl:attribute>
-<xsl:attribute name="segtype">sentence</xsl:attribute>
-<xsl:attribute name="o-tmf">application/vnd.oasis.opendocument.spreadsheet</xsl:attribute>
-<xsl:attribute name="adminlang">
-<xsl:value-of select="$adminlang"/>
-</xsl:attribute>
-<xsl:attribute name="srclang">
-<xsl:value-of select="$srclang"/>
-</xsl:attribute>
-<xsl:attribute name="datatype">plaintext</xsl:attribute>
+<header
+creationtool="TMX-export for LibreOffice"
+creationtoolversion="0.9"
+segtype="sentence"
+o-tmf="application/vnd.oasis.opendocument.spreadsheet"
+adminlang="fr-BE"
+srclang="nl-BE"
+datatype="plaintext">
 </header>
-<!-- Define the TMX body. -->
+<!-- Todo : get the language from Calc, if any, or from a dialog, if LibreOffice allows,
+or from the first row
+to set the srclang of the header and the xml:lang of the tuv. -->
 <body>
-<xsl:for-each select="//table:table-row[position()>1]">
+<xsl:for-each select="//table:table-row">
 <tu>
 <xsl:for-each select="table:table-cell">
-<xsl:variable name="currentLang"
-select="//table:table/table:table-row[1]/table:table-cell"/>
-<xsl:variable name="currentColumn"
-select="position()"/>
-<xsl:if test="normalize-space(text:p) != ''">
-<tuv>
-<xsl:attribute name="xml:lang">
-<xsl:value-of select="$currentLang[$currentColumn]/text:p"/>
-</xsl:attribute>
-<seg>
-<xsl:value-of select="text:p"/>
-</seg>
+<xsl:choose>
+<xsl:when test="position()=1">
+<tuv xml:lang="nl-BE">
+<seg><xsl:value-of select="text:p"/></seg>
 </tuv>
-</xsl:if>
-</xsl:for-each>
+</xsl:when>
+<xsl:when test="position()=2">
+<tuv xml:lang="fr-BE">
+<seg><xsl:value-of select="text:p"/></seg>
+</tuv>
+</xsl:when>
+<xsl:when test="position()=3">
+<tuv xml:lang="de-DE">
+<seg><xsl:value-of select="text:p"/></seg>
+</tuv>
+</xsl:when>
+<xsl:when test="position()=4">
+<tuv xml:lang="en-GB">
+<seg><xsl:value-of select="text:p"/></seg>
+</tuv>
+</xsl:when>
+<xsl:when test="position()=5">
+<tuv xml:lang="es-ES">
+<seg><xsl:value-of select="text:p"/></seg>
+</tuv>
+</xsl:when>
+</xsl:choose>
+</xsl:for-each>
 </tu>
 </xsl:for-each>
 </body>
 </tmx>
 </xsl:template>
 </xsl:stylesheet>

@@ -4,10 +4,7 @@
 XSLT transformation of a TMX translation memory exchange file
 into an Open Document Format spreadsheet in two columns.
 Filter to install as an import filter for LibreOffice Calc. -->
-<!-- Philippe Tourigny, modified 12-9-2022, version 0.99
-Allow the filter to retrieve the languages in the TMX from its
-first <tu> element, and create a column for each language. -->
-<!-- Copyright 2013 Dominique Meeùs, and 2022 Philippe Tourigny.
+<!-- Copyright 2013 Dominique Meeùs.
 This program is free software: you can redistribute it and/or modify it
 under the terms of the GNU Lesser General Public License
 as published by the Free Software Foundation,
@@ -45,76 +42,47 @@
 office:version="1.0">
 <office:automatic-styles>
-<!-- PTable properties -->
+<!-- Properties of the table -->
 <style:style style:name="ta1" style:family="table" style:master-page-name="Default">
 <style:table-properties table:display="true" style:writing-mode="lr-tb"/>
 </style:style>
-<!-- Column properties (for all languages) -->
-<style:style style:name="co1" style:family="table-column">
-<style:table-column-properties fo:break-before="auto" style:column-width="14.000cm"/>
+<!-- Properties of the columns -->
+<!-- I consider the case of a two-languages TMX -->
+<!-- Todo : pass the source and target languages as attributes to the text in Calc -->
+<style:style style:name="co1" style:family="table-column"><!-- source language -->
+<style:table-column-properties fo:break-before="auto" style:column-width="16.000cm"/>
+</style:style>
+<style:style style:name="co2" style:family="table-column"><!-- target language -->
+<style:table-column-properties fo:break-before="auto" style:column-width="16.000cm"/>
 </style:style>
-<!-- Row properties -->
-<!-- All rows are set to “optimal height” -->
+<!-- Properties of the rows -->
+<!-- The rows are “optimal height” but do not expand for the “wrap option” of the cells -->
 <style:style style:name="ro1" style:family="table-row">
 <style:table-row-properties fo:break-before="auto" style:use-optimal-row-height="true"/>
 </style:style>
-<!-- Cell properties -->
-<!-- Language code heading cells
-The language codes are centered and set in bold
-in the first column. -->
-<style:style style:name="heading" style:family="table-cell"
-style:parent-style-name="Default">
-<style:table-cell-properties style:text-align-source="fix"
-style:repeat-content="false" fo:wrap-option="wrap"/>
-<style:paragraph-properties fo:text-align="center"/>
-<style:text-properties fo:font-weight="bold"/>
+<!-- Properties of the cells -->
+<style:style style:name="ce1" style:family="table-cell" style:parent-style-name="Default">
+<style:table-cell-properties fo:wrap-option="wrap"/>
 </style:style>
-<!-- Style for cells with the segment text -->
 <style:style style:name="ce2" style:family="table-cell" style:parent-style-name="Default">
 <style:table-cell-properties fo:wrap-option="wrap"/>
 </style:style>
 </office:automatic-styles>
-<!-- Define variables used to identify the languages
-In a TMX with three or more languages. All translation unit
-(<tuv>) elements are assumed to contain the same number of
-languages, and the first <tu> is used to identify them. -->
-<!-- Todo: Identify the <tu> with the largest highest number of <tuv> elements to identify all languages in a TMX file with more languages in some <tu> elements than others. -->
-<xsl:variable name="firstTU" select="tmx/body/tu[1]"/>
-<xsl:variable name="numLangs" select="count($firstTU/tuv)"/>
 <office:body>
 <office:spreadsheet>
-<table:table table:style-name="ta1">
-<!-- Set the format for a number of columns equal to
-the number of languages in the imported TMX file -->
-<table:table-column table:style-name="co1" table:number-columns-repeated="{$numLangs}"
-table:default-cell-style-name="ce2"/>
-<!-- Fill in the language headers in the first row
-The use of the "local-name()" function enables
-the filter to handle older versions that use the
-"lang" attribute as well as recent versions that
-use the "xml:lang"` attributes -->
-<table:table-row table:style-name="ro1">
-<xsl:for-each select="$firstTU/tuv">
-<table:table-cell table:style-name="heading">
-<text:p>
-<xsl:value-of select="@*[local-name()='lang']"/>
-</text:p>
-</table:table-cell>
-</xsl:for-each>
-</table:table-row>
-<!-- Process the <tu> and <tuv> elements in the TMX file:
-One row per tu, one column per segment in each <tuv>. -->
+<table:table>
+<!-- Format of the columns -->
+<!-- How about a free number of columns ? -->
+<table:table-column table:style-name="co1" table:default-cell-style-name="ce1"/>
+<table:table-column table:style-name="co2" table:default-cell-style-name="ce2"/>
+<!-- Process XML of the input TMX file: one row for each tu, one cell for the segment in each tuv -->
 <xsl:for-each select="tmx/body/tu">
-<table:table-row table:style-name="ro1">
+<table:table-row>
 <xsl:for-each select="tuv">
 <table:table-cell>
 <text:p><xsl:value-of select="seg"/></text:p>

@@ -0,0 +1,58 @@
Dominique Meeùs (dominique@d-meeus.be, https://d-meeus.be),
created 30-9-2013, version 0.9.
(If my surname is not shown correctly, mind the fact that this file is encoded in UTF-8.)
(One would have written Mee&ugrave;s with HTML entities for older character encodings.)
Copyright 2013 Dominique Meeùs, see below.
Modified 28-11-2019 (this readme file) as 0.91 for the position of the command
to install in new versions of LibreOffice.
Modified 21-5-2020 (the XML export filter) as 0.92 to allow five columns nl-BE, fr-BE, de-DE, en-GB, es-ES.
(Originally two columns only.)
This TMX-filters software provides two XSLT filters
to import or export TMX translation memories
to and from LibreOffice (or OpenOffice.org) Calc.
Installation
============
The package TMX-filters-vx_y.jar has to be installed in LibreOffice
through the standard dialog in the menu Tools,
command Macros > XML Filter Settings…
[in older versions, the command XML Filter Settings… is directly in the menu Tools].
Click the button Open Package… and browse to the package.
Limitation
==========
Source language nl-BE (Dutch) and targets fr-BE (French), de-DE, en-GB, es-ES
are hardcoded in the export filter.
The segments in two or more of the above languages have to be put
in the corresponding columns A (nl), B (fr), C (de), D (en), E (es) of Calc.
Some columns may stay empty.
The source language is marked srclang="nl-BE" near the end of line 3.
It suffices to change this attribute value to change the source language.
The order A (nl), B (fr), C (de), D (en), E (es) is compulsory in Calc
to have the segments correctly marked for their language.
Each segment being so marked, the order has no effect on the translation memory. Only srclang matters.
For other languages, you have to search and replace these attribute values in the TMX
or edit the export filter.
Feel free to e-mail me if you think of a way to improve this.
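The search and replace mentioned above can also be scripted outside LibreOffice.
What follows is only a minimal Python sketch, not part of this package. It assumes
a TMX as written by the export filter: a header with srclang and adminlang, and
tuv elements carrying xml:lang. The function name relabel_languages and the
mapping argument are hypothetical.

```python
import xml.etree.ElementTree as ET

# Full name of the xml:lang attribute as ElementTree sees it.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def relabel_languages(tmx_text, mapping):
    """Return TMX text with language codes replaced according to mapping.

    mapping is a hypothetical dict such as {"nl-BE": "it-IT"}; codes that
    are not in the mapping are left unchanged.
    """
    root = ET.fromstring(tmx_text)

    # The header names the source and administrative languages.
    header = root.find("header")
    if header is not None:
        for name in ("srclang", "adminlang"):
            value = header.get(name)
            if value in mapping:
                header.set(name, mapping[value])

    # Each tuv carries the language of its segment in xml:lang.
    for tuv in root.iter("tuv"):
        lang = tuv.get(XML_LANG)
        if lang in mapping:
            tuv.set(XML_LANG, mapping[lang])

    return ET.tostring(root, encoding="unicode")
```

For example, relabel_languages(text, {"nl-BE": "it-IT"}) would mark column A as
Italian. ElementTree does not keep a DOCTYPE line and, as called here, writes no
XML declaration, so add them back if your translation tool needs them.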
Use
===
— To read a TMX in LibreOffice Calc: File, Open…
— To produce a TMX from sentences aligned in two or more columns in LibreOffice Calc:
File, Save as… (not Export…) and choose type TMX (likely all the way down a long list).
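As a rough illustration of what saving as TMX does with the columns, here is a
Python sketch (not the XSLT filter itself). Each row becomes one tu, and each
cell becomes a tuv marked with the language fixed for its column. The names
rows_to_tmx and COLUMN_LANGS and the creationtool value are hypothetical, and
skipping empty cells is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

# Full name of the xml:lang attribute as ElementTree sees it.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Column order A to E as documented in the Limitation section.
COLUMN_LANGS = ["nl-BE", "fr-BE", "de-DE", "en-GB", "es-ES"]


def rows_to_tmx(rows):
    """Build a TMX element tree from rows of up to five text cells."""
    tmx = ET.Element("tmx", {"version": "1.4"})
    ET.SubElement(tmx, "header", {
        "creationtool": "rows_to_tmx sketch",  # hypothetical tool name
        "creationtoolversion": "0",            # hypothetical version
        "segtype": "sentence",
        "o-tmf": "application/vnd.oasis.opendocument.spreadsheet",
        "adminlang": "fr-BE",
        "srclang": "nl-BE",
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for row in rows:
        tu = ET.SubElement(body, "tu")
        # zip stops after the fifth column, so extra cells are ignored.
        for lang, text in zip(COLUMN_LANGS, row):
            if not text.strip():
                continue  # assumption of this sketch: skip empty cells
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return tmx
```

Calling ET.tostring(rows_to_tmx(rows), encoding="unicode") gives the TMX text
for a list of rows such as [["hallo", "bonjour"]].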
License
=======
Copyright 2013, 2020 Dominique Meeùs.
This program is free software: you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License
as published by the Free Software Foundation,
either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY;
without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public License along with this program.
If not, see http://www.gnu.org/licenses/.