XSLT 2.0 Grouping techniques
On this page I am compiling some grouping
problems on XML data, together with their solutions in XSLT 2.0.
1. Grouping problem
The following question was asked on XSL-List.
I have the following source XML:
<All_Results>
  <Result><!-- other upto 100 elements --></Result>
  <Result><!-- other upto 100 elements --></Result>
  <Result><!-- other upto 100 elements --></Result>
  <Result><!-- other upto 100 elements --></Result>
</All_Results>
Each <Result> has the same list of sub-elements; some might not have a text value.
I want to aggregate and get something like this:
<Tag value="John" count="2" />
<Tag value="Thomas" count="1" />
<Tag value="UK" count="2" />
<Tag value="US" count="1" />
<Tag value="Estonia" count="1" />
<Tag value="Red" count="2" />
<Tag value="Green" count="1" />
<!-- other elements grouped by element name, sorted by total
of element values-->
The following is an XSLT 2.0 solution for this (the sorting is not implemented):
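<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
  <xsl:output method="xml" indent="yes" />
  <xsl:template match="/">
    <!-- take the element names from the first Result -->
    <xsl:for-each select="All_Results/Result[1]/*">
      <xsl:variable name="name" select="name()" />
      <!-- group all same-named elements by their value -->
      <xsl:for-each-group select="../../Result/*[name() = $name]" group-by=".">
        <xsl:if test="not(normalize-space(.) = '')">
          <Tag value="{.}" count="{count(current-group())}" />
        </xsl:if>
      </xsl:for-each-group>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>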
Andrew Welch suggested:
Here's another way which doesn't rely on all
elements being present in the first <Result>:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:for-each-group select="/All_Results/Result/*[normalize-space()]"
                        group-by="name()">
      <xsl:for-each-group select="current-group()" group-by=".">
        <Tag value="{current-grouping-key()}" count="{count(current-group())}"/>
      </xsl:for-each-group>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
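Neither stylesheet above sorts the tags. As a minimal sketch of how the missing sort could be added, assuming that "sorted by total" means the most frequent value first within each element name, an xsl:sort keyed on the group size can be placed inside the inner xsl:for-each-group. Everything else follows the nested grouping idea above. Values with equal counts should stay in order of first appearance, since xsl:sort is stable by default.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="/">
    <!-- outer grouping: one group per element name, empty elements skipped -->
    <xsl:for-each-group select="/All_Results/Result/*[normalize-space()]"
                        group-by="name()">
      <!-- inner grouping: one group per distinct value -->
      <xsl:for-each-group select="current-group()" group-by=".">
        <!-- assumption: largest groups first; inside xsl:sort,
             current-group() refers to the group being sorted -->
        <xsl:sort select="count(current-group())" order="descending"/>
        <Tag value="{current-grouping-key()}" count="{count(current-group())}"/>
      </xsl:for-each-group>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
```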
2. Eliminating duplicates
The following question was asked on XSL-List.
What's the best way of getting rid of duplicate nodes which contain more than
one attribute? Suppose I have the following XML:
<edge source="IGetter" target="CGetter" dependency="positive"/>
<edge source="IGetter" target="CGetter" dependency="positive"/>
<edge source="IGetter" target="CCount" dependency="positive"/>
<edge source="ICount" target="IGetter" dependency="positive"/>
<edge source="ICount" target="CGetter" dependency="positive"/>
<edge source="ICount" target="ICount" dependency="positive"/>
<edge source="ICount" target="CCount" dependency="positive"/>
<edge source="ICount" target="CCount" dependency="positive"/>
How do I get rid of one
<edge source="IGetter" target="CGetter" dependency="positive"/>
and one
<edge source="ICount" target="CCount" dependency="positive"/>
which appear twice?
The following is a solution for this, using some new XPath 2.0 constructs:
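<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
  <xsl:output method="xml" indent="yes" />
  <xsl:template match="x">
    <!-- keep an edge only if no preceding sibling is deep-equal to it -->
    <xsl:for-each select="edge[not(some $i
        in preceding-sibling::edge satisfies deep-equal($i, .))]">
      <xsl:copy-of select="." />
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>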
This stylesheet, when applied to the following XML:
<x>
  <edge source="IGetter" target="CGetter" dependency="positive"/>
  <edge source="IGetter" target="CGetter" dependency="positive"/>
  <edge source="IGetter" target="CCount" dependency="positive"/>
  <edge source="ICount" target="IGetter" dependency="positive"/>
  <edge source="ICount" target="CGetter" dependency="positive"/>
  <edge source="ICount" target="ICount" dependency="positive"/>
  <edge source="ICount" target="CCount" dependency="positive"/>
  <edge source="ICount" target="CCount" dependency="positive"/>
</x>
produces the following output:
<?xml version="1.0" encoding="UTF-8"?>
<edge source="IGetter" target="CGetter" dependency="positive"/>
<edge source="IGetter" target="CCount" dependency="positive"/>
<edge source="ICount" target="IGetter" dependency="positive"/>
<edge source="ICount" target="CGetter" dependency="positive"/>
<edge source="ICount" target="ICount" dependency="positive"/>
<edge source="ICount" target="CCount" dependency="positive"/>
(Thanks to Abel Braaksma for ideas.)
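The same duplicates can probably also be removed with xsl:for-each-group, which fits the theme of this page. The following is only a sketch under some assumptions: the edges are children of an <x> element (as in the stylesheet above), an edge is fully identified by its source, target and dependency attributes, and none of those values contains the "|" separator. Unlike deep-equal(), this compares only the three named attributes.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="x">
    <xsl:copy>
      <!-- one group per distinct (source, target, dependency) combination;
           "|" is an assumed separator that must not occur in the values -->
      <xsl:for-each-group select="edge"
          group-by="concat(@source, '|', @target, '|', @dependency)">
        <!-- keep only the first edge of each group -->
        <xsl:copy-of select="current-group()[1]"/>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Because groups are processed in order of first appearance, the surviving edges should come out in the same order as with the deep-equal() solution, here wrapped in a copy of <x>.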
3. Positional grouping problem
The following question was asked on
The input XML is as follows:
What I need to do is to select all nodes between a <StartOrderGroup> element
and a <EndOrderGroup> element, so that I get an output like:
Car - 2
Car - 3
Bus - 4
Truck - 9
Here's a solution to this problem from Michael Kay:
In XSLT 2.0, use
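<xsl:for-each-group select="*" group-starting-with="StartOrderGroup">
  <xsl:variable name="start" select="current-group()[self::StartOrderGroup]"/>
  <xsl:variable name="end" select="current-group()[self::EndOrderGroup]"/>
  <!-- nodes after the start marker and before the end marker, in document order -->
  <xsl:variable name="group" select="current-group()[. >>
      $start and . &lt;&lt; $end]"/>
  <xsl:for-each select="$group">
    <xsl:value-of select="name()"/> - <xsl:value-of select="Id"/>
  </xsl:for-each>
</xsl:for-each-group>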
plus some formatting as required.
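To see the fragment in context, here is a sketch of a complete stylesheet built around it. The Orders root element and the Id child are taken from the stylesheets on this page. The sample input in the comment is hypothetical and only meant to illustrate the shape of the data; the Bike element outside any group is my own addition to show what gets skipped. The sketch also assumes at most one <EndOrderGroup> per group, since the << operator needs a single node on each side.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical input, for illustration only:
     <Orders>
       <StartOrderGroup/>
       <Car><Id>2</Id></Car>
       <Car><Id>3</Id></Car>
       <EndOrderGroup/>
       <Bike><Id>7</Id></Bike>
       <StartOrderGroup/>
       <Bus><Id>4</Id></Bus>
       <Truck><Id>9</Id></Truck>
       <EndOrderGroup/>
     </Orders>
-->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="text"/>
  <xsl:template match="Orders">
    <!-- every StartOrderGroup element opens a new group -->
    <xsl:for-each-group select="*" group-starting-with="StartOrderGroup">
      <xsl:variable name="start" select="current-group()[self::StartOrderGroup]"/>
      <xsl:variable name="end" select="current-group()[self::EndOrderGroup]"/>
      <!-- keep the nodes strictly between the two markers -->
      <xsl:for-each select="current-group()[. &gt;&gt; $start and . &lt;&lt; $end]">
        <!-- one "name - Id" line per selected element -->
        <xsl:value-of select="concat(name(), ' - ', Id, '&#xA;')"/>
      </xsl:for-each>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
```

With the hypothetical input above, I would expect the four lines from the question (Car - 2, Car - 3, Bus - 4, Truck - 9), with the Bike element left out because it follows the <EndOrderGroup> of its group.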
I attempted to solve this as follows:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
  <xsl:output method="text" />
  <xsl:template match="Orders">
    <xsl:for-each-group select="*" group-starting-with="StartOrderGroup">
      <xsl:variable name="curr-group"
          select="current-group()" />
      <!-- position of EndOrderGroup within the current group -->
      <xsl:variable name="indx"
          select="index-of(for $x in $curr-group return $x/local-name(), 'EndOrderGroup')" />
      <!-- items after the start marker (position 1) and before the end marker -->
      <xsl:for-each select="$curr-group[position()
          &gt; 1 and position() &lt; $indx]">
        <xsl:value-of select="local-name()" /> - <xsl:value-of select="Id" /><xsl:text>&#xA;</xsl:text>
      </xsl:for-each>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
Last Updated: Jan 11, 2009