
Copy Only HTML From Mixed XML and HTML

We have a bunch of files that are HTML pages but that also contain additional XML elements (all prefixed with our company name, 'TLA') to provide data and structure for an older program.

Solution 1:

Specifically targeting HTML elements would be hard, but if you just want to exclude content from the TLA namespace (but still include any non-TLA elements that the TLA elements contain), then this should work:

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:mbl="http://www.tla.com"
                exclude-result-prefixes="mbl">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*" />

  <xsl:template match="@*|node()" priority="-2">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- This element-only identity template prevents the
       TLA namespace declaration from being copied to the output -->
  <xsl:template match="*">
    <xsl:element name="{name()}">
      <xsl:apply-templates select="@* | node()" />
    </xsl:element>
  </xsl:template>

  <!-- Pass processing on to child elements of TLA elements -->
  <xsl:template match="mbl:*">
    <xsl:apply-templates select="*" />
  </xsl:template>
</xsl:stylesheet>

You can also use this instead if you want to exclude anything that has any non-null namespace:

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:mbl="http://www.tla.com"
                exclude-result-prefixes="mbl">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*" />

  <xsl:template match="@*|node()" priority="-2">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="*">
    <xsl:element name="{name()}">
      <xsl:apply-templates select="@* | node()" />
    </xsl:element>
  </xsl:template>

  <xsl:template match="*[namespace-uri()]">
    <xsl:apply-templates select="*" />
  </xsl:template>
</xsl:stylesheet>

When either is run on your sample input, the result is:

<html>
  <head>
    <title>Highly Simplified Example Form</title>
  </head>
  <body>
    <table>
      <tr>
        <td>
          <input id="input_id_1" type="text" />
        </td>
      </tr>
    </table>
  </body>
</html>
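
If no XSLT processor is at hand, the same filtering rule can be approximated with Python's standard library. This is only a rough sketch, not the method used in the answer. It assumes the TLA elements live in the http://www.tla.com namespace declared in the stylesheets. The sample input, the `tla:` element names, and the helper names `is_tla`, `strip_tla`, and `html_only` are all hypothetical, since the original sample file is not shown here. As in the first stylesheet, TLA elements are unwrapped, their own text and attributes are dropped, and whitespace-only text is discarded.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the stylesheets; adjust it to the real TLA namespace.
TLA_NS = "http://www.tla.com"

# Hypothetical input: HTML interleaved with TLA-namespaced wrapper elements.
SAMPLE = """<html xmlns:tla="http://www.tla.com">
  <head><title>Highly Simplified Example Form</title></head>
  <body>
    <tla:section name="main">
      <table>
        <tla:row>
          <tr><td><tla:field ref="x"><input id="input_id_1" type="text"/></tla:field></td></tr>
        </tla:row>
      </table>
    </tla:section>
  </body>
</html>"""


def is_tla(elem):
    """True if the element is in the TLA namespace (ElementTree tags look like '{uri}local')."""
    return elem.tag.startswith("{" + TLA_NS + "}")


def strip_tla(elem):
    """Return the list of output elements that replace `elem`."""
    if is_tla(elem):
        # Mirror <xsl:apply-templates select="*"/>: keep only what the child
        # elements produce; the TLA element's own text and attributes are dropped.
        kept = []
        for child in elem:
            kept.extend(strip_tla(child))
        return kept

    # Non-TLA element: rebuild it with the same name and attributes.
    new = ET.Element(elem.tag, dict(elem.attrib))
    if elem.text and elem.text.strip():
        new.text = elem.text
    for child in elem:
        for replacement in strip_tla(child):
            new.append(replacement)
        # A child's tail is text that belongs to this (non-TLA) element, so keep
        # it unless it is whitespace only (as xsl:strip-space would discard it).
        if child.tail and child.tail.strip():
            if len(new):
                new[-1].tail = (new[-1].tail or "") + child.tail
            else:
                new.text = (new.text or "") + child.tail
    return [new]


def html_only(xml_text):
    """Parse mixed HTML/TLA markup and serialize it without the TLA elements."""
    root = ET.fromstring(xml_text)
    return "".join(ET.tostring(e, encoding="unicode") for e in strip_tla(root))


if __name__ == "__main__":
    print(html_only(SAMPLE))
```

To approximate the second stylesheet, which drops elements in any namespace, `is_tla` could simply test `elem.tag.startswith("{")`. Unlike the stylesheets, this sketch does not indent its output.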
