XSLT Grouping techniques
 

On this page, I'm compiling some XSLT grouping problems and their solutions. XSLT 1.0 has no built-in support for grouping, so grouping in XSLT 1.0 stylesheets relies on special techniques, mainly the Muenchian method. Recognizing the need for grouping solutions, XSLT 2.0 introduced an explicit instruction, xsl:for-each-group, to do grouping.
 

1. Muenchian grouping method

This grouping technique is quite popular in XSLT 1.0 because it applies to a wide range of grouping problems and is efficient. It is named after its inventor, Steve Muench of Oracle. Descriptions of the technique can be found at
http://www.jenitennison.com/xslt/grouping/index.html and http://www.dpawson.co.uk/xsl/sect2/N4486.html.

Here is the original post by Steve Muench introducing this method on the XSL-List:
http://www.biglist.com/lists/xsl-list/archives/200005/msg00270.html.
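
To give a feel for the method, here is a minimal sketch of my own (not taken from the posts above). It assumes input shaped like <booklist><book author="..." title="..."/>...</booklist>, the same shape as the book list used in the next section. The key name books-by-author is just an illustrative choice. The key indexes every book by author, and the generate-id() comparison keeps only the first book of each author, so each group is visited once.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <!-- Index every book by the value of its author attribute -->
 <xsl:key name="books-by-author" match="book" use="@author"/>

 <xsl:template match="/booklist">
  <authors>
   <!-- A book passes the test only if it is the first node in its key group -->
   <xsl:for-each select="book[generate-id() = generate-id(key('books-by-author', @author)[1])]">
    <!-- key() returns the whole group, so count() gives the group size -->
    <author name="{@author}" number-of-books="{count(key('books-by-author', @author))}"/>
   </xsl:for-each>
  </authors>
 </xsl:template>

</xsl:stylesheet>
```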
 

2. Recursive grouping method

This XSLT 1.0 grouping technique was suggested by Sergiu Ignat on the XSL-List. It is simple, and easy for a beginner to understand. Here is how Sergiu Ignat introduced the method in his original post.

Quote (Sergiu Ignat)...
The main idea is to have a named template that takes as a parameter the node list that should be grouped, processes the group defined by the first element and recursively calls itself for the rest of the list excluding that group.

The example below will group books in a book list and will compute how many books each author has (input XML is shown after it).

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:template match="/booklist">
  <authors>
   <xsl:call-template name="RecursiveGrouping">
    <xsl:with-param name="list" select="book"/>
   </xsl:call-template>
  </authors>
 </xsl:template>

 <xsl:template name="RecursiveGrouping">
  <xsl:param name="list"/>

  <!-- Selecting the first author name as group identifier and the group itself-->
  <xsl:variable name="group-identifier" select="$list[1]/@author"/>
  <xsl:variable name="group" select="$list[@author=$group-identifier]"/>

  <!-- Do some work for the group -->
  <author name="{$group-identifier}" number-of-books="{count($group)}"/>

  <!-- If there are other groups left, calls itself -->
  <xsl:if test="count($list)>count($group)">
  <xsl:call-template name="RecursiveGrouping">
    <xsl:with-param name="list" select="$list[not(@author=$group-identifier)]"/>
  </xsl:call-template>
  </xsl:if>
 </xsl:template>
</xsl:stylesheet>

The input XML for this example is shown below.

<booklist>
   <book author="Frank Herbert" title="Dune"/>
   <book author="Roberto Quaglia" title="Bread, Butter and Paradoxine"/>
   <book author="Kate Wilhelm" title="Where Late the Sweet Bird Sang"/>
   <book author="Anthony Burgess" title="A Clockwork Orange"/>
   <book author="Frank Herbert" title="Dragon in the Sea"/>
   <book author="Anthony Burgess" title="Earthly Powers"/>
   <book author="Isaak Asimov" title="The Foundation Trilogy"/>
   <book author="Frank Herbert" title="Children of Dune"/>
   <book author="Isaak Asimov" title="The Caves of Steel"/>
</booklist>

There was an interesting debate on the XSL-List comparing the performance of Sergiu's method with the Muenchian method. Michael Kay expressed the following opinion: "Assuming a competent implementation of xsl:key, Muenchian grouping has O(N log N) complexity. By contrast, your algorithm appears to perform N*G/2 comparisons, where N is the number of items and G the number of groups. This may perform well in the special case where G is small and independent of N, but in the more general case G is likely to increase as N increases, which means that your algorithm approaches O(N^2)".

Sergiu said:
> The advantage of recursive grouping is that it is possible to define grouping rules that are much more complex than the ones accepted by the use attribute of xsl:key or by the generate-id() function. For example, in a list of items that have price and quantity attributes, the grouping could be done by price*quantity.

Michael replied..
"Muenchian grouping makes it easy to group on the result of any path expression, e.g.

<xsl:key name="m" match="order-line" use="@price * @qty"/>

If there is a limitation, it is that Muenchian grouping is inconvenient when the node-set to be grouped is anything other than "all nodes in a single document that match pattern P": that is, it's inconvenient when the population spans multiple documents, or when it is scoped to a particular subtree within a document." 

The downside here is:

> <xsl:with-param name="list" select="$list[not(@author=$group-identifier)]"/>

which performs a serial search of the grouping population (actually, on average, half the population) once for each distinct value of the grouping key. The algorithm therefore has order (at least) O(P*G) where P is the size of the population and G the number of groups. In applications where the number of groups increases with the population (e.g. grouping employees by surname) this is effectively O(N^2).

I agree that the algorithm is viable in cases where the number of groups is small and almost fixed, e.g. grouping sales by continent.

But of course if the number of groups is completely fixed, the simplest approach is:

for-each continent
  for-each P[continent=current()]
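
As a sketch of what that nested for-each could look like in a real stylesheet, here is a small example of my own. The input shape is purely hypothetical: a <sales> root holding a fixed <continents> list of <continent> elements, followed by <sale continent="..."/> elements. Inside the predicate, current() refers to the continent selected by the outer xsl:for-each.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <xsl:output method="xml" indent="yes"/>

 <!-- Hypothetical input: <sales><continents><continent>..</continent>..</continents>
      <sale continent=".."/>..</sales> -->
 <xsl:template match="/sales">
  <report>
   <!-- Outer loop: one pass per entry in the fixed list of groups -->
   <xsl:for-each select="continents/continent">
    <group name="{.}">
     <!-- current() is the continent element chosen by the outer for-each -->
     <xsl:copy-of select="/sales/sale[@continent = current()]"/>
    </group>
   </xsl:for-each>
  </report>
 </xsl:template>

</xsl:stylesheet>
```

Every group costs one scan of the population here, which is why this only makes sense when the list of groups is short and known in advance.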



3. Converting flat XML structure to hierarchical one (Positional grouping)

The following question was asked on the XSL-List.

I need to convert a sequence of elements of the same name like this:

<a>content1</a>
<note>content2</note>
<note>content3</note>
<note>content4</note>
<b>content5</b>

into a nested structure like this:

<a>content1</a>
<note>content2
  <cont>content3</cont>
  <cont>content4</cont>
</note>
<b>content5</b>

In other words, if there are consecutive <note> elements, they should be combined into one <note> element with the 2nd, 3rd and subsequent lines expressed as <cont> (continuation line) elements within the <note> element.

The stylesheet for this problem is:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 
<xsl:output method="xml" indent="yes" />
 
<xsl:template match="/root">
  <root>
    <xsl:for-each select="*">
      <xsl:choose>
        <xsl:when test="name(.) != 'note'">
          <xsl:copy-of select="." />
        </xsl:when>
        <xsl:when test="name(preceding-sibling::*[1]) != 'note'">
          <note>
            <xsl:value-of select="." />
            <xsl:call-template name="create-group">
              <xsl:with-param name="node-set" select="following-sibling::*" />
            </xsl:call-template>
          </note>
        </xsl:when>
      </xsl:choose> 
    </xsl:for-each>
  </root>
</xsl:template>
 
<xsl:template name="create-group">
  <xsl:param name="node-set" />
   
  <xsl:choose>
    <xsl:when test="(name($node-set[1]) = 'note') and (name($node-set[2]) != 'note')">
      <cont><xsl:value-of select="$node-set[1]"/></cont>
    </xsl:when>
    <xsl:when test="(name($node-set[1]) = 'note') and (name($node-set[2]) = 'note')">
      <cont><xsl:value-of select="$node-set[1]"/></cont>
      <xsl:call-template name="create-group">
        <xsl:with-param name="node-set" select="$node-set[position() > 1]" />
      </xsl:call-template>
    </xsl:when>
  </xsl:choose> 
</xsl:template>
 
</xsl:stylesheet>

When this stylesheet is applied to XML:
<?xml version="1.0" encoding="UTF-8"?>
<root>
  <a>content1</a>
  <note>content2</note>
  <note>content3</note>
  <note>content4</note>
  <b>content5</b>  
</root>

It produces output:
<?xml version="1.0" encoding="UTF-8"?>
<root>
   <a>content1</a>
   <note>content2
     <cont>content3</cont>
     <cont>content4</cont>
   </note>
   <b>content5</b>
</root>

The idea behind this technique is:
1) Traverse the child elements linearly using xsl:for-each.
2) Output the element unchanged if its name is not 'note'.
3) If the element is the 1st 'note' element of a group, call a recursive named template, passing it the node set consisting of the elements after the current 'note' element (the 1st one of the group).
4) The named template outputs the remaining 'note' elements of the current group as 'cont' elements and stops at the first element that is not a 'note', so everything after the group is ignored.
 

4. Extracting unique values from a list of values. Or, eliminating duplicates from a list.

The following question was asked on the XSL-List.

Given this XML:
<gui type="alertBox">...</gui>
<gui type="tooltip">...</gui>
<gui type="help">...</gui>
<gui type="tooltip">...</gui>
<gui type="alertBox">...</gui>
<gui type="tooltip">...</gui>
<gui type="help">...</gui>

How can this be transformed to something like:

<alertBox/>
<tooltip/>
<help/>

The stylesheet for this problem is:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  
  <xsl:output method="xml" indent="yes" />
  
  <xsl:template match="/root">
    <root>
      <xsl:for-each select="gui">
        <xsl:if test="not(@type = preceding-sibling::gui/@type)">
          <xsl:element name="{@type}" />
        </xsl:if>
      </xsl:for-each>
    </root>
  </xsl:template>
  
</xsl:stylesheet>
(The input XML is assumed to be enclosed in a <root> tag.)

But suppose the input XML looks like this:

<root>
 <othertag>
   <gui type="x"></gui>
 </othertag>
 <gui type="tooltip"></gui>
 <gui type="help"></gui>
 <gui type="tooltip"></gui>
 <othertag>
   <gui type="alertBox">
     <gui type="alertBox"></gui>
   </gui>
 </othertag>
 <gui type="tooltip"></gui>
 <gui type="help"></gui>
</root>
(i.e., the <gui> tags can exist at any level of nesting.)

To handle generalized XML like the above, the stylesheet has to be modified to:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 
<xsl:output method="xml" indent="yes" />
 
<xsl:template match="/root">
  <root>
    <xsl:for-each select="//gui">
      <xsl:if test="not((@type = preceding::gui/@type) or (@type = ancestor::gui/@type))">
        <xsl:element name="{@type}" />
      </xsl:if>       
    </xsl:for-each>
  </root>
</xsl:template>
 
</xsl:stylesheet>
(Please note the use of the preceding and ancestor axes.)

Wendell Piez remarked on the above solution...
It might be called the "canonical" solution. It's very general, but commonly avoided simply because of its performance implications. In a naive implementation, the entire tree will be traversed backwards to the beginning for every @type attribute checked. That's pretty serious cycling.

Below is an efficient implementation for the above problem using keys (inspired by Wendell's comments):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes" />

<xsl:key name="by-type" match="gui" use="@type" />

<xsl:template match="/root">
  <root>
     <xsl:for-each select="//gui[generate-id() = generate-id(key('by-type', @type)[1])]">
       <xsl:element name="{@type}" />
     </xsl:for-each>
   </root>
</xsl:template>

</xsl:stylesheet>

The above stylesheet uses the Muenchian method of grouping. Groups of "gui" nodes are formed based on the value of the @type attribute. Only the 1st "gui" node of every group is used, which yields the unique values.

(Wendell is an XSLT Expert, and works for Mulberry Technologies, who have kindly hosted the XSL-List)
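
For comparison, here is a minimal XSLT 2.0 sketch of my own for the same problem, using the standard distinct-values() function instead of a key. It assumes the same <root> wrapper as above. Note that the XPath 2.0 specification leaves the order of the values returned by distinct-values() implementation-dependent, so the order of the output elements may vary between processors.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes" />

<xsl:template match="/root">
  <root>
     <!-- Iterate over the distinct @type values found at any depth -->
     <xsl:for-each select="distinct-values(//gui/@type)">
       <!-- The context item is now a string, used as the element name -->
       <xsl:element name="{.}" />
     </xsl:for-each>
   </root>
</xsl:template>

</xsl:stylesheet>
```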
 

5. Eliminating duplicate occurrences

The following question was asked on the XSL-List.

If I have a variable $thestring containing the following string: "Hello, Hello, Hello, test, dog, cat, cat". Is there a way to use the string-compare function to parse it and check for duplicates within the string, and then possibly remove those extra occurrences...resulting in the string "Hello, test, dog, cat".

The stylesheet for this problem is:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
 
<xsl:output method="text" /> 
 
<xsl:variable name="string" select="'Hello, Hello, Hello, test, dog, cat, cat'" />
 
<xsl:template match="/">
   <xsl:choose>
     <xsl:when test="not(contains($string, ','))">
       <xsl:value-of select="$string" />
     </xsl:when>
     <xsl:otherwise>
       <!-- call remove-duplicates template, only if the string contains at least 2 words;
            a trailing comma is appended so that the last word is processed too -->
       <xsl:call-template name="remove-duplicates">
          <xsl:with-param name="string" select="concat(translate($string, ' ', ''), ',')" />
          <xsl:with-param name="newstring" select="''" />
       </xsl:call-template>
     </xsl:otherwise>
   </xsl:choose>
</xsl:template>
  
<xsl:template name="remove-duplicates">
   <xsl:param name="string" />
   <xsl:param name="newstring" />
  
   <xsl:choose>
     <xsl:when test="$string = ''">
       <xsl:value-of select="$newstring" />
     </xsl:when>
     <xsl:otherwise>
       <xsl:if test="contains($newstring, substring-before($string, ','))">
         <xsl:call-template name="remove-duplicates">
             <xsl:with-param name="string" select="substring-after($string, ',')" />
             <xsl:with-param name="newstring" select="$newstring" />
         </xsl:call-template>
       </xsl:if>
       <xsl:if test="not(contains($newstring, substring-before($string, ',')))">
         <xsl:variable name="temp">
           <xsl:if test="$newstring = ''">
              <xsl:value-of select="substring-before($string, ',')" />
           </xsl:if>
           <xsl:if test="not($newstring = '')">
              <xsl:value-of select="concat($newstring,',', substring-before($string, ','))" />
           </xsl:if>
         </xsl:variable>
         <xsl:call-template name="remove-duplicates">
             <xsl:with-param name="string" select="substring-after($string, ',')" />
             <xsl:with-param name="newstring" select="$temp" />
         </xsl:call-template>
       </xsl:if>
     </xsl:otherwise>
   </xsl:choose>
  
</xsl:template>
 
</xsl:stylesheet>
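
If an XSLT 2.0 processor is available, the recursion can be avoided altogether. The following is a minimal sketch of my own using the standard tokenize(), distinct-values() and string-join() functions. The regular expression ',\s*' (a comma followed by optional whitespace) is my assumption about how the words are separated. The order of the values returned by distinct-values() is implementation-dependent according to the specification.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

<xsl:output method="text" />

<xsl:variable name="string" select="'Hello, Hello, Hello, test, dog, cat, cat'" />

<xsl:template match="/">
   <!-- Split on commas, drop repeated words, then join again with ', ' -->
   <xsl:value-of select="string-join(distinct-values(tokenize($string, ',\s*')), ', ')" />
</xsl:template>

</xsl:stylesheet>
```

Unlike a contains() test, distinct-values() compares whole words, so a word such as "cat" is not mistaken for a duplicate of a longer word that merely contains it.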


6. Positional grouping

The following question was asked on the XSL-List.

Suppose I have an input document:

<A>
  <B X="1"/>
  <B X="2"/>
  <B X="3"/>
  <C X="4"/>
  <B X="5"/>
  <B X="6"/>
  <B X="7"/>
</A>

Now, suppose I wish to group together consecutive B elements, giving a result document like this:

<A>
  <D>
    <B X="1"/>
    <B X="2"/>
    <B X="3"/>
  </D>
  <C X="4"/>
  <D>
    <B X="5"/>
    <B X="6"/>
    <B X="7"/>
  </D>
</A>

How can I do this?

The stylesheet for this problem is:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
 
 <xsl:output method="xml" indent="yes" />
 
 <xsl:template match="/A"> 
  <A>
   <xsl:for-each select="*">
     <xsl:choose>
       <xsl:when test="not(self::B)">
         <xsl:copy-of select="." />
       </xsl:when>
       <xsl:when test="name(preceding-sibling::*[1]) != 'B'">
         <D>
           <xsl:copy-of select="." />
           <xsl:call-template name="create-group">
             <xsl:with-param name="list" select="following-sibling::*" />
           </xsl:call-template>
         </D>
       </xsl:when>
     </xsl:choose>
   </xsl:for-each>
  </A>
 </xsl:template>
 
 <xsl:template name="create-group">
   <xsl:param name="list" />
     
   <xsl:if test="name($list[1]) = 'B'">
     <xsl:copy-of select="$list[1]" />
     <xsl:call-template name="create-group">
       <xsl:with-param name="list" select="$list[position() > 1]" />
     </xsl:call-template>
   </xsl:if>  
 </xsl:template>
 
</xsl:stylesheet>

Michael Kay suggested this solution:

Assuming you want a pure XSLT 1.0 solution, rather than one that relies on XSLT 2.0 for-each-group, or EXSLT extensions like set:leading(), a recursive template is probably the simplest approach. In fact, it's sometimes easier than using for-each-group. Remember that you can use apply-templates as well as call-template for such problems. Try this:

<xsl:template match="A">
  <xsl:copy>
    <xsl:apply-templates select="*"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="A/*" priority="8">
  <xsl:copy-of select="."/>
</xsl:template>

<xsl:template match="B[preceding-sibling::*[1][self::B]]" priority="15"/>

<xsl:template match="B" priority="10">
  <D>
    <xsl:apply-templates select="." mode="sequence"/>
  </D>
</xsl:template>

<xsl:template match="B" mode="sequence">
  <xsl:copy-of select="."/>
  <xsl:apply-templates select="following-sibling::*[1][self::B]" mode="sequence"/>
</xsl:template>

The way to think about this is that you want to have a template rule for every element in the result tree, and then you want to organize your apply-templates to select the nodes in the input tree that will trigger creation of a node in the result tree.

(Michael Kay is the editor of the XSLT 2.0 specification and co-editor of the XPath 2.0 specification. He is also the developer of the well-known Saxon XSLT and XQuery processor.)

Dimitre Novatchev offered this solution:

This transformation:

<xsl:stylesheet version="1.0"  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>
 
 <xsl:key name="kNextGroup" match="B" use="generate-id(following-sibling::*[not(self::B)][1])"/>
 
  <xsl:template match="node()|@*" name="identity">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>
 
  <xsl:template match="B[not(preceding-sibling::*[1][self::B])]">
    <D>
      <xsl:copy-of select="key('kNextGroup', generate-id(following-sibling::*[not(self::B)][1]))" />
    </D>
  </xsl:template>
 
  <xsl:template match="B"/>

</xsl:stylesheet>

when applied on your source xml document :

<A>
  <B X="1"/>
  <B X="2"/>
  <B X="3"/>
  <C X="4"/>
  <B X="5"/>
  <B X="6"/>
  <B X="7"/>
</A>

produces the wanted result:

<A>
   <D>
      <B X="1"/>
      <B X="2"/>
      <B X="3"/>
   </D>
   <C X="4"/>
   <D>
      <B X="5"/>
      <B X="6"/>
      <B X="7"/>
   </D>
</A>

(Dimitre Novatchev is an XSLT expert and the developer of the FXSL library.)
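
Michael Kay's reply mentions XSLT 2.0 for-each-group as an alternative. Here is a minimal sketch of my own showing how that could look for this problem, using the group-adjacent attribute. The grouping key boolean(self::B) is my choice: consecutive elements with the same key value (B or not B) fall into one group.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

 <xsl:output method="xml" indent="yes" />

 <xsl:template match="/A">
  <A>
   <!-- Adjacent children sharing the same key value form one group -->
   <xsl:for-each-group select="*" group-adjacent="boolean(self::B)">
     <xsl:choose>
       <!-- Key is true: a run of consecutive B elements, wrapped in D -->
       <xsl:when test="current-grouping-key()">
         <D>
           <xsl:copy-of select="current-group()" />
         </D>
       </xsl:when>
       <!-- Key is false: a run of other elements, copied unchanged -->
       <xsl:otherwise>
         <xsl:copy-of select="current-group()" />
       </xsl:otherwise>
     </xsl:choose>
   </xsl:for-each-group>
  </A>
 </xsl:template>

</xsl:stylesheet>
```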
 

7. Muenchian grouping

The following question was asked on XSL-List.

Given this XML file:

<a>
  <b>
     <desc>some text
     </desc>
     <bChild>
       <desc>some other text
       </desc>
     </bChild>
  </b>
  <b>
     <desc>some text
     </desc>
     <bChild>
       <desc>some other text
       </desc>
     </bChild>
     <bChild>
       <desc>maybe some more text
       </desc>
     </bChild>
  </b>
</a>

b elements can have description children (desc), and so can their bChild children. The goal is to group all the b elements, using the Muenchian method, on the concatenated value of their desc and the desc values of however many bChild children there are.

Michael Kay provided this hint:
Easy in 2.0

use="string-join(.//desc, '')"

The stylesheet for this problem is (using node-set extension function):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
                      xmlns:exslt="http://exslt.org/common"
                      exclude-result-prefixes="exslt">

<xsl:output method="xml" indent="yes" />

<xsl:key name="by-desc" match="b" use="concatenated-desc" />

<xsl:template match="/a">
  <xsl:variable name="rtf">
    <a>
      <xsl:for-each select="b">
        <b>
          <concatenated-desc>
            <xsl:call-template name="concatenate-desc">
              <xsl:with-param name="x" select=".//desc" />
              <xsl:with-param name="y" select="''" />
            </xsl:call-template>
          </concatenated-desc>
          <xsl:copy-of select="child::node()" />
        </b>
      </xsl:for-each>
    </a>
  </xsl:variable>
 
  <a>
    <xsl:for-each select="exslt:node-set($rtf)/a/b[generate-id(.) = generate-id(key('by-desc',concatenated-desc)[1])]">
      <group-of-b>       
        <xsl:for-each select="key('by-desc',concatenated-desc)">
          <b>
            <xsl:copy-of select="child::node()[not(self::concatenated-desc)]"/>
          </b>
        </xsl:for-each>
      </group-of-b>
    </xsl:for-each>
  </a>
</xsl:template>

<xsl:template name="concatenate-desc">
  <xsl:param name="x" />
  <xsl:param name="y" />
 
  <xsl:choose>
    <xsl:when test="count($x) &gt; 0">
      <xsl:call-template name="concatenate-desc">
        <xsl:with-param name="x" select="$x[position() &gt; 1]" />
        <xsl:with-param name="y" select="concat($y,normalize-space($x[1]))" />
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$y" />
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

</xsl:stylesheet>

For example, when the above XSL is applied to this XML:

<a>
  <b>
     <desc>some text
     </desc>
     <bChild>
        <desc>some other text
       </desc>
     </bChild>
  </b>
  <b>
     <desc>some text
     </desc>
     <bChild>
       <desc>some other text
       </desc>
     </bChild>
     <bChild>
       <desc>maybe some more text
       </desc>
     </bChild>
  </b>
  <b>
     <desc>some text
     </desc>
     <bChild>
       <desc>some other text
       </desc>
     </bChild>
     <test>xyz</test>
  </b>
</a>

The output received is:

<?xml version="1.0" encoding="UTF-8"?>
<a>
   <group-of-b>
      <b>
          <desc>some text
          </desc>
          <bChild>
            <desc>some other text
            </desc>
          </bChild>
      </b>
      <b>
           <desc>some text
           </desc>
           <bChild>
             <desc>some other text
             </desc>
           </bChild>
           <test>xyz</test>
      </b>
   </group-of-b>
   <group-of-b>
      <b>
          <desc>some text
          </desc>
          <bChild>
            <desc>some other text
            </desc>
          </bChild>
          <bChild>
             <desc>maybe some more text
             </desc>
          </bChild>
      </b>
   </group-of-b>
</a>

David Carlisle remarked:
>I feel its not possible with pure XSLT 1.0 . We need to take help of extension function (node-set)..
I think that's probably true, or to say the same thing another way, the pure XSLT1 solution would be to do two passes, first do a transform that concatenates all the text nodes into a single element or attribute, then do a second pass that sorts on that new node. a node-set extension is a convenience to allow that to be done in a single run of the processor.

Wendell Piez further said:
I'll fill in a bit by adding that two passes can be accomplished in a single run of a stylesheet, if you use either XSLT version 2.0, or a very popular extension function in XSLT 1.0, usually called node-set().
The node-set() extension is popular enough almost to be official, and for an extension, it's very portable: see http://www.exslt.org. The idea here is that you would bind the results of the first pass to a variable, use the extension function to convert it from a result tree (which, in XSLT 1.0, you cannot process further as such, merely copy it or process it as a string), and then process the node set you get back as input to your second pass. Putting two transforms together like this into a single stylesheet, it's probably wise to use modes to discriminate the passes.
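
Following Wendell's suggestion, here is a minimal sketch of my own showing how the two passes could be separated with modes. The mode names pass1 and pass2 and the concatenated-desc attribute are illustrative choices, not part of the original replies. The sketch assumes a processor that supports the EXSLT node-set() function.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0"
                xmlns:exslt="http://exslt.org/common"
                exclude-result-prefixes="exslt">

<xsl:output method="xml" indent="yes" />

<!-- The key only finds matches in the pass-1 tree, where the attribute exists -->
<xsl:key name="by-desc" match="b" use="@concatenated-desc" />

<xsl:template match="/a">
  <!-- Pass 1: build an annotated copy of the input as a result tree fragment -->
  <xsl:variable name="pass1">
    <a><xsl:apply-templates select="b" mode="pass1" /></a>
  </xsl:variable>
  <!-- Pass 2: convert the fragment to a node-set and group it -->
  <a><xsl:apply-templates select="exslt:node-set($pass1)/a/b" mode="pass2" /></a>
</xsl:template>

<!-- Pass 1: copy each b, adding the concatenated desc text as an attribute -->
<xsl:template match="b" mode="pass1">
  <b>
    <xsl:attribute name="concatenated-desc">
      <xsl:for-each select=".//desc"><xsl:value-of select="normalize-space(.)" /></xsl:for-each>
    </xsl:attribute>
    <xsl:copy-of select="node()" />
  </b>
</xsl:template>

<!-- Pass 2: Muenchian grouping; only the first b of each group emits output -->
<xsl:template match="b" mode="pass2">
  <xsl:if test="generate-id() = generate-id(key('by-desc', @concatenated-desc)[1])">
    <group-of-b>
      <xsl:for-each select="key('by-desc', @concatenated-desc)">
        <!-- node() selects children only, so the helper attribute is dropped -->
        <b><xsl:copy-of select="node()" /></b>
      </xsl:for-each>
    </group-of-b>
  </xsl:if>
</xsl:template>

</xsl:stylesheet>
```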


8. Grouping by two or any number

The following question was asked on microsoft.public.xsl Newsgroup.

I have the following XML:

<Parent>
   <node>a</node>
   <node>s</node>
   <node>d</node>
   <node>f</node>
   <node>g</node>
   <node>h</node>
   <node>j</node>
   <node>k</node>
   <node>l</node>
</Parent>

I need to print it in the following format (in groups of two):
 group 1 - a,s
 group 2 - d,f
 group 3 - g,h
 group 4 - j,k
 group 5 - l

The stylesheet for this problem is:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output method="text" />

<xsl:param name="group-size" select="2" />

<xsl:template match="/Parent">

<!-- calculate the no of "node" elements, which are left as a fraction;  which are to be displayed in the last group -->
<xsl:variable name="n" select="count(node) - (floor((count(node) div $group-size)) * $group-size)" />

<xsl:for-each select="node">
    <!-- determine group boundary; this if test stops at the last "node" element of the group -->
    <xsl:if test="(position() mod $group-size) = 0">
       group <xsl:value-of select="floor(position() div $group-size)" /><xsl:text> - </xsl:text>
       <!-- display group members -->
      <xsl:for-each select=". | preceding-sibling::node[position() &lt;= ($group-size - 1)]">
          <xsl:value-of select="." /><xsl:if test="(position() mod $group-size) != 0"><xsl:text>,</xsl:text></xsl:if>
      </xsl:for-each>
      <xsl:text>&#xa;</xsl:text>
    </xsl:if>

    <!-- this if test processes the last group; whose number of group members will be less than the group-size -->
    <xsl:if test="((position() = last()) and ((position() mod $group-size) != 0))">
        group <xsl:value-of select="floor(position() div $group-size) + 1" /><xsl:text> - </xsl:text>
        <xsl:for-each select=". | preceding-sibling::node[position() &lt; $n]">
           <xsl:value-of select="." /><xsl:if test="position() != last()"><xsl:text>,</xsl:text></xsl:if>
        </xsl:for-each>
        <xsl:text>&#xa;</xsl:text>
    </xsl:if>
</xsl:for-each>

</xsl:template>

</xsl:stylesheet>
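
The same output can also be produced by starting from the first element of each group instead of the last. Here is a minimal sketch of my own under that approach. It reuses the group-size parameter name from the stylesheet above and needs no special handling for a short final group, because the following-sibling axis simply runs out of nodes.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output method="text" />

<xsl:param name="group-size" select="2" />

<xsl:template match="/Parent">
  <!-- Select the 1st, (N+1)th, (2N+1)th ... "node" element: one per group -->
  <xsl:for-each select="node[(position() - 1) mod $group-size = 0]">
    <!-- position() here counts the group starters, i.e. the group number -->
    <xsl:text>group </xsl:text><xsl:value-of select="position()" /><xsl:text> - </xsl:text>
    <!-- The group is the starter plus up to (group-size - 1) following siblings -->
    <xsl:for-each select=". | following-sibling::node[position() &lt; $group-size]">
      <xsl:value-of select="." />
      <xsl:if test="position() != last()"><xsl:text>,</xsl:text></xsl:if>
    </xsl:for-each>
    <xsl:text>&#xa;</xsl:text>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```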


9. Multiple level grouping

The following question was asked on the XSL-List.

I have a set of colors and have a whole bunch of pictures and want to match on all pictures who have one or more of the given colors but  have to at least match on 2 of them (unique, so not red and red - so a picture could list red twice but that would not be a match).

Given this XML:

<data>
  <colors>
    <color>red</color>
    <color>blue</color>
    <color>fucia</color>
    <color>violet</color>
  </colors>
 <pictures>
    <picture sample="1">
      <color>black</color>
      <color>grey</color>
      <color>white</color>
    </picture>
    <picture sample="2">
      <color>red</color>
      <color>green</color>
      <color>brown</color>
      <color>blue</color>
    </picture>
    <picture sample="3">
      <color>purple</color>
      <color>orange</color>
    </picture>
    <picture sample="4">
      <color>blue</color>
      <color>green</color>
      <color>red</color>
    </picture>
    <picture sample="5">
      <color>fucia</color>
      <color>green</color>
      <color>violet</color>
    </picture>
    <picture sample="6">
      <color>red</color>
      <color>brown</color>
      <color>red</color>
    </picture>
 </pictures>
</data>

The desired output is:

picture sample #2
picture sample #4
picture sample #5

The stylesheet for this problem is (tested with Saxon 8.3 XSLT processor):

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
 
<xsl:output method="text" /> 
 
<xsl:template match="/data">
  <xsl:variable name="temp1" select="colors/color" />
 
  <xsl:for-each select="pictures/picture">
    <xsl:variable name="temp2" select="distinct-values(color)" />   
    <xsl:if test="count(distinct-values($temp1[.=$temp2])) &gt; 1">
      picture sample #<xsl:value-of select="@sample" /><xsl:text>&#xa;</xsl:text>
    </xsl:if>
  </xsl:for-each>
</xsl:template>
 
</xsl:stylesheet>

The expression distinct-values($temp1[.=$temp2]) finds the intersection of sequences $temp1 and $temp2.

There were other interesting replies.

Wendell Piez provided the following XSLT 1.0 stylesheet:

It turned out that keys weren't actually necessary: as posed (as I understand it) the colors problem could be solved with a simple (if not obvious) test. But using a key does make it slightly more efficient:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

   <xsl:key name="pictures-by-color" match="picture" use="color"/>

   <xsl:variable name="colors" select="/data/colors/color"/>

   <xsl:template match="data">
     <xsl:apply-templates select="colors"/>
     <!-- if we weren't going to select using the key, we could go straight to
          the pictures; but then we'd be testing each one -->
   </xsl:template>

   <xsl:template match="colors">
     <xsl:apply-templates select="key('pictures-by-color',color)"/>
     <!-- by using the key we select just the pictures that have at least
          one color matching a color in your set -->
   </xsl:template>

   <xsl:template match="picture">
     <xsl:if test="count(color[not(.=preceding-sibling::color)][.=$colors]) &gt;= 2">
       <xsl:text>&#xA;picture sample #</xsl:text>
       <xsl:value-of select="@sample"/>
     </xsl:if>
   </xsl:template>

</xsl:stylesheet>

Wendell provided the following explanation:
color[not(.=preceding-sibling::color)][.=$colors]

The first predicate deduplicates each color against its siblings; the second sees whether it's listed among the colors you want. This yields the set of unique colors (within each parent) listed among the colors of interest, which you can count.
Note that it's a brute-force deduplication; if you have lots and lots of siblings you could optimize that first predicate (using another key).

Dimitre Novatchev provided the following XSLT 1.0 answer:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 
 <xsl:output omit-xml-declaration="yes"/>
 
 <xsl:variable name="vColors" select="/*/colors"/>

 <xsl:template match="/">
    <xsl:for-each select="/*/*/picture">
      <xsl:if test="count($vColors/*[. = current()/color]) >= 2">
        <xsl:value-of select="concat('Picture Id=', @sample, '&#xA;')"/>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>


10. Muenchian grouping solution

The following question was asked on XSL-List.

Given this XML document:

<?xml version="1.0" encoding="UTF-8"?>
<root>
 <people>
  <person>
   <name>
      <fname>Jack</fname>
      <mname>Fred</mname>
      <lname>Smith</lname>
   </name>
   <age>22</age>
  </person>
  <person>
   <name>
      <fname>Jane</fname>
      <mname>Mary</mname>
      <lname>Smith</lname>
   </name>
   <age>23</age>
  </person>
  <person>
   <name>
      <fname>Frank</fname>
      <mname>Joseph</mname>
      <lname>Franks</lname>
   </name>
   <age>23</age>
  </person>
 </people>
</root>

I'd like to sort and group it by age, such that each person of the same age is placed in ascending order within each unique age value.

The desired output is:

<?xml version="1.0" encoding="UTF-8"?>
<people>
 <age value="22">
  <name>
     <fname>Jack</fname>
     <mname>Fred</mname>
     <lname>Smith</lname>
  </name>
 </age>
 <age value="23">
  <name>
     <fname>Frank</fname>
     <mname>Joseph</mname>
     <lname>Franks</lname>
  </name>
  <name>
     <fname>Jane</fname>
     <mname>Mary</mname>
     <lname>Smith</lname>
  </name>
 </age>
</people>

The stylesheet for this problem is (using Muenchian method):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes" />
   
<xsl:key name="byAge" match="person" use="age"/>
 
<xsl:template match="/root/people">
   <people>
     <xsl:for-each select="person[generate-id() = generate-id(key('byAge', age)[1])]">
       <xsl:sort select="age" order="ascending" data-type="number" />
       <age value="{age}">
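         <!-- list the people in ascending name order within each age;
              the sort keys (last name, then first name) are an assumption -->
         <xsl:for-each select="key('byAge', age)">
           <xsl:sort select="name/lname" />
           <xsl:sort select="name/fname" />
           <xsl:copy-of select="name" />
         </xsl:for-each>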
       </age>
     </xsl:for-each>
   </people>
</xsl:template>
   
</xsl:stylesheet>
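
For comparison, here is a minimal XSLT 2.0 sketch of my own for the same problem, using xsl:for-each-group with the group-by attribute. The ordering of names inside each group (last name, then first name) is my assumption, since the question does not say which name to sort on.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes" />

<xsl:template match="/root/people">
   <people>
     <!-- One group per distinct age value -->
     <xsl:for-each-group select="person" group-by="age">
       <!-- Order the groups by age; the sort key is read from the first person of each group -->
       <xsl:sort select="age" data-type="number" />
       <age value="{current-grouping-key()}">
         <!-- Assumed ordering inside a group: last name, then first name -->
         <xsl:for-each select="current-group()">
           <xsl:sort select="name/lname" />
           <xsl:sort select="name/fname" />
           <xsl:copy-of select="name" />
         </xsl:for-each>
       </age>
     </xsl:for-each-group>
   </people>
</xsl:template>

</xsl:stylesheet>
```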


11. Application of Muenchian grouping

The following question was asked on the comp.text.xml Newsgroup.

I get the xml message like:

<message>
    <docList>
        <docSet>1</docSet>
        <docTp>A1</docTp>
        <docSet>1</docSet>
        <docTp>B1</docTp>
        <docSet>2</docSet>
        <docTp>A2</docTp>
        <docSet>1</docSet>
        <docTp>C1</docTp>
        <docSet>2</docSet>
        <docTp>B2</docTp>
        .....
     </docList>
</message>

I need to make HTML like:

DocSet 1
Doc Tp = A1
Doc Tp = B1
Doc Tp = C1

DocSet2
DocTp = A2
.....

The stylesheet for this problem is:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
 
<xsl:output method="html" indent="yes" />

<xsl:key name="by-docSet" match="docSet" use="." />
 
<xsl:template match="/message">
   <html>
         <head>
               <title/>
         </head>
         <body>
            <table>
               <xsl:for-each select="docList/docSet[generate-id(.) = generate-id(key('by-docSet', .)[1])]">
                    <tr>
                         <td>
                               DocSet
                         </td>
                         <td>
                               <xsl:value-of select="." />    
                         </td>
                     </tr>
                     <xsl:for-each select="key('by-docSet', .)">
                         <tr>
                            <td>
                                 Doc Tp =
                           </td>
                           <td>
                                 <xsl:value-of select="following-sibling::docTp[1]" />
                           </td>
                         </tr>
                   </xsl:for-each>
               </xsl:for-each>          
          </table>
       </body>
   </html> 
</xsl:template>
 
</xsl:stylesheet>


12. Muenchian grouping problem

The following question was asked on the microsoft.public.xsl newsgroup.

Using code, I pull values from XML, put them into hashes, and then put them back into XML to get the unique set. However, I was wondering whether this is possible via XSL:

<foo>
       <bars type="bartype">
            <bar id="1" x="a1" y="a2" z="3"/>
            <bar id="2" x="a11" y="a22" z="33"/>
            <bar id="1" x="a11111" y="a22222" z="33333"/>
        </bars>
        <houses type="housetype">
            <house id="7" a="a1" b="a2" c="a3"/>
            <house id="8" a="a1" b="a2" c="a3"/>
            <house id="9" a="a1" b="a2" c="a3"/>
        </houses>
       <bars type="bartype">
            <bar id="2" x="a111" y="a222" z="a333"/>
            <bar id="1" x="a1111" y="a2222" z="a3333"/>
        </bars>
</foo>

to

<uniques>
    <unique id="1" type="bartype"/>
    <unique id="2" type="bartype"/>
    <unique id="7" type="housetype"/>
    <unique id="8" type="housetype"/>
    <unique id="9" type="housetype"/>
</uniques>

The stylesheet for this problem is:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" encoding="UTF-8" indent="yes"/>

<xsl:key name="by-id" match="*" use="@id" />

<xsl:template match="/">
   <uniques>
      <xsl:for-each select="//*[@id][generate-id(.) = generate-id(key('by-id', @id)[1])]">
          <unique id="{@id}" type="{../@type}" />
       </xsl:for-each>
   </uniques>
</xsl:template>

</xsl:stylesheet>
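
The key above indexes on @id alone, so a bar and a house that happened to share an id would be treated as one entry. The following is a sketch of a variant that keys on the parent's type and the id together. The key name by-type-and-id is my own, and the sketch assumes the separator character '|' never occurs in a type or id value.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" encoding="UTF-8" indent="yes"/>

<!-- composite key: the parent's type, a separator, then the element's own id -->
<xsl:key name="by-type-and-id" match="*[@id]" use="concat(../@type, '|', @id)" />

<xsl:template match="/">
   <uniques>
      <!-- keep only the first element for each (type, id) pair -->
      <xsl:for-each select="//*[@id][generate-id(.) = generate-id(key('by-type-and-id', concat(../@type, '|', @id))[1])]">
          <unique id="{@id}" type="{../@type}" />
      </xsl:for-each>
   </uniques>
</xsl:template>

</xsl:stylesheet>
```

For the sample input, where no id is shared between bars and houses, this selects the same five elements as the single-attribute key.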


13. Grouping problem

The following question was asked on XSL-List.

I have XML of the form:

<page>
  <entry date="2005-04-15">
    <title>foo</title>
  </entry>
  <entry date="2005-04-15">
    <title>bar</title>
  </entry>
  <entry date="2005-02-05">
    <title>baz</title>
  </entry>
  ...
</page>

Which I am trying to group by date, sort by title and then split into sets of 3, 3 being the number of columns in the HTML table element I am trying to produce as an end result.

I've got the grouping and sorting:
<xsl:for-each select="entry[key('days', @date) and count(.|key('days', @date)[1])= 1]">
  <xsl:sort select="title"/>

and I've even got the first item in each group of three from that grouped and sorted set:
<xsl:for-each select="key('days', @date)[position() mod 3 = 1]">

But I can't seem to find a way to display the following siblings of the above, making the 3-cell rows. Should I be doing this some other way?

The stylesheet for this problem is:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output method="html" indent="yes" />
 
<xsl:key name="by-date" match="entry" use="@date" />
 
<xsl:template match="/page">
    <html>
      <head>
        <title/>
      </head>
      <body>
        <table>
          <xsl:for-each select="entry[generate-id() = generate-id(key('by-date', @date)[1])]">
            <xsl:for-each select="key('by-date',@date)">
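              <xsl:sort select="title" />
              <!-- start a new row at positions 1, 4, 7, ...; note that the cells passed on
                   below are taken from the key in document order, not in sorted order -->
              <xsl:if test="position() mod 3 = 1">
                <xsl:variable name="pos" select="position()" />
                <xsl:call-template name="generateTRs">
                  <!-- both bounds must sit in one predicate: a second predicate would renumber positions from 1 -->
                  <xsl:with-param name="node-set" select="key('by-date', @date)[position() &gt;= $pos and position() &lt;= ($pos + 2)]" />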
                </xsl:call-template>               
              </xsl:if>
            </xsl:for-each>
            <!-- a dummy row -->
            <tr>
              <td>-</td><td>-</td><td>-</td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html> 
</xsl:template>
  
<xsl:template name="generateTRs">
    <xsl:param name="node-set" />
   
    <tr>
      <xsl:for-each select="$node-set">
        <td>
          <xsl:value-of select="title" />
        </td>
      </xsl:for-each>
      <xsl:call-template name="generateRemainingTDs">
         <xsl:with-param name="n" select="3 - count($node-set)" />
      </xsl:call-template>
    </tr> 
</xsl:template>
 
<xsl:template name="generateRemainingTDs">
    <xsl:param name="n" />
       
    <xsl:if test="$n &gt; 0">
      <td/>
      <xsl:call-template name="generateRemainingTDs">
        <xsl:with-param name="n" select="$n - 1" />
      </xsl:call-template>
    </xsl:if>
</xsl:template>
 
</xsl:stylesheet>

For example, when it is applied to this XML:

<page>
  <entry date="2005-04-15">
    <title>foo</title>
  </entry>
  <entry date="2005-04-15">
    <title>bar</title>
  </entry>
  <entry date="2005-04-15">
    <title>baz</title>
  </entry>
  <entry date="2004-04-15">
    <title>a</title>
  </entry>
  <entry date="2004-04-15">
    <title>b</title>
  </entry>
  <entry date="2004-02-05">
    <title>c</title>
  </entry>
  <entry date="2003-04-15">
    <title>d</title>
  </entry>
  <entry date="2003-04-15">
    <title>e</title>
  </entry>
  <entry date="2003-02-05">
    <title>f</title>
  </entry>
  <entry date="2002-02-05">
    <title>g</title>
  </entry>
</page>

The output produced is:

<html>
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
      <title></title>
   </head>
   <body>
      <table>
         <tr>
            <td>foo</td>
            <td>bar</td>
            <td>baz</td>
         </tr>
         <tr>
            <td>-</td>
            <td>-</td>
            <td>-</td>
         </tr>
         <tr>
            <td>a</td>
            <td>b</td>
            <td></td>
         </tr>
         <tr>
            <td>-</td>
            <td>-</td>
            <td>-</td>
         </tr>
         <tr>
            <td>c</td>
            <td></td>
            <td></td>
         </tr>
         <tr>
            <td>-</td>
            <td>-</td>
            <td>-</td>
         </tr>
         <tr>
            <td>d</td>
            <td>e</td>
            <td></td>
         </tr>
         <tr>
            <td>-</td>
            <td>-</td>
            <td>-</td>
         </tr>
         <tr>
            <td>f</td>
            <td></td>
            <td></td>
         </tr>
         <tr>
            <td>-</td>
            <td>-</td>
            <td>-</td>
         </tr>
         <tr>
            <td>g</td>
            <td></td>
            <td></td>
         </tr>
         <tr>
            <td>-</td>
            <td>-</td>
            <td>-</td>
         </tr>
      </table>
   </body>
</html>
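
In the stylesheet above, each row is filled from key('by-date', @date), which returns nodes in document order, so the title sort does not reach the table cells. The sketch below, which assumes an XSLT 2.0 processor, sorts each date group first and then cuts the sorted sequence into rows of three with group-adjacent. The variable name sorted is my own, and the dummy separator row is left out for brevity.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

<xsl:output method="html" indent="yes" />

<xsl:template match="/page">
  <table>
    <xsl:for-each-group select="entry" group-by="@date">
      <!-- the entries of this date, ordered by title -->
      <xsl:variable name="sorted" as="element(entry)*">
        <xsl:perform-sort select="current-group()">
          <xsl:sort select="title" />
        </xsl:perform-sort>
      </xsl:variable>
      <!-- items 1-3 get key 0, items 4-6 get key 1, and so on: one row per key -->
      <xsl:for-each-group select="$sorted" group-adjacent="(position() - 1) idiv 3">
        <tr>
          <xsl:for-each select="current-group()">
            <td><xsl:value-of select="title" /></td>
          </xsl:for-each>
          <!-- pad a short last row with empty cells -->
          <xsl:for-each select="1 to 3 - count(current-group())">
            <td/>
          </xsl:for-each>
        </tr>
      </xsl:for-each-group>
    </xsl:for-each-group>
  </table>
</xsl:template>

</xsl:stylesheet>
```

Because the rows are cut from the already sorted sequence, no recursive padding template is needed; the range expression yields an empty sequence when a row is full.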


14. Grouping problem

The following question was asked on XSL-List.

Given the following XML file:

<cars>
  <car>
    <model>J980384</model>
    <name>Ranger</name>
    <categ>Pick-up</categ>
    <color>blue</color>
    <stock>6</stock>
  </car>
  <car>
    <model>V667320</model>
    <name>Sportage</name>
    <categ>sport</categ>
    <color>green</color>
    <stock>8</stock>
  </car>
  <car>
    <model>M382932</model>
    <name>Silverado</name>
    <categ>pick-up</categ>
    <color>blue</color>
    <stock>3</stock>
  </car>
  <car>
    <model>L930389</model>
    <name>Jaguar</name>
    <categ>Sport</categ>
    <color>red</color>
    <stock>2</stock>
  </car>
  <car>
    <model>J980384</model>
    <name>Ranger</name>
    <categ>Pick-up</categ>
    <color>grey</color>
    <stock>3</stock>
  </car>
  <car>
    <model>L930389</model>
    <name>Jaguar</name>
    <categ>Sport</categ>
    <color>blue</color>
    <stock>1</stock>
  </car>
  <car>
    <model>J980384</model>
    <name>Ranger</name>
    <categ>Pick-up</categ>
    <color>black</color>
    <stock>5</stock>
  </car>
</cars>

Supposing that XML document has more than 100 models, I need to group only those that have more than one representation. I wish that the grouped elements appear this way

Car:   L930389  Jaguar  Sport
occurrence : 2
Total stock: 3

Car:   J980384  Ranger  Pick-up
occurrence : 3
Total stock: 14

The elements with one occurrence shouldn't be shown.

The stylesheet for this problem is:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
 
<xsl:output method="text" />
 
<xsl:key name="by-model" match="car" use="model" />
 
<xsl:template match="/cars">
   <xsl:for-each select="car[generate-id() = generate-id(key('by-model',model)[1])][count(key('by-model',model)) > 1]">
     <xsl:sort select="model" order="descending" />
     Car: <xsl:value-of select="model" /><xsl:text> </xsl:text><xsl:value-of select="name" /><xsl:text> </xsl:text><xsl:value-of select="categ" /><xsl:text>&#xa;</xsl:text>
     occurrence: <xsl:value-of select="count(key('by-model',model))" /><xsl:text>&#xa;</xsl:text>
     Total stock: <xsl:value-of select="sum(key('by-model',model)/stock)" /><xsl:text>&#xa;</xsl:text>
  </xsl:for-each>
</xsl:template>
 
</xsl:stylesheet>

(This is the usual Muenchian grouping solution, with one addition: the predicate [count(key('by-model',model)) > 1] keeps only those groups that have more than one member.)
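
As a hedged comparison, the same report can be sketched in XSLT 2.0 (assuming a 2.0 processor), where the group size is available directly as count(current-group()):

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

<xsl:output method="text" />

<xsl:template match="/cars">
   <xsl:for-each-group select="car" group-by="model">
     <xsl:sort select="current-grouping-key()" order="descending" />
     <!-- skip models that occur only once -->
     <xsl:if test="count(current-group()) &gt; 1">
       <!-- inside the group body, the context node is the first car of the group -->
       <xsl:value-of select="concat('Car: ', model, ' ', name, ' ', categ, '&#xa;')" />
       <xsl:value-of select="concat('occurrence: ', count(current-group()), '&#xa;')" />
       <xsl:value-of select="concat('Total stock: ', sum(current-group()/stock), '&#xa;&#xa;')" />
     </xsl:if>
   </xsl:for-each-group>
</xsl:template>

</xsl:stylesheet>
```

The name and category are taken from the first car of each group, as in the Muenchian version, so cars of one model are assumed to agree on them.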
 

15. Grouping problem

The following question was asked on XSL-List.

Given this XML document:

<Unit xmlns="http://www.xml.com/xml/unit" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.xml.com/xml/unit/unit.xsd">
<Objectives>
    <Objective domain="CognitiveDomainObjective">an understanding of the roles of XML in providing IT solutions to organization
    </Objective>
    <Objective domain="CognitiveDomainObjective">the knowledge of XML technologies components and their roles in providing XML solution to IT and business problems
    </Objective>
    <Objective domain="CognitiveDomainObjective">the knowledge of the specific issues and requirements related to the field of XML technologies,in particular XML document, DTD, XML Schema, XPath and XSLT
    </Objective>
    <Objective domain="CognitiveDomainObjective">an understanding of the different issues related to storing and retrieving textual data
    </Objective>
    <Objective domain="CognitiveDomainObjective">an understanding of the different design and implementation issues related to search engines
    </Objective>
    <Objective domain="AffectiveDomainObjective">an appreciation of the role of XML technologies as a solution to some of the distributed computing,WWW and database problems
    </Objective>
    <Objective domain="AffectiveDomainObjective">an acceptance that XML technology has its own limitation and it should be considered when developing a solution
    </Objective>
    <Objective domain="AffectiveDomainObjective">an appreciation of the role of DTD and XML Schema in ensuring the quality of the XML solution
    </Objective>
    <Objective domain="AffectiveDomainObjective">an appreciation of the complexity inherent by text retrieval systems
    </Objective>
    <Objective domain="PsychomotorDomainObjective">designing and creating a well-formed and valid XML document
    </Objective>
    <Objective domain="PsychomotorDomainObjective">retrieving and transforming XML document into a number of different presentation format
    </Objective>
    <Objective domain="PsychomotorDomainObjective">identifying different components of a text retrieval system
    </Objective>
    <Objective domain="PsychomotorDomainObjective">evaluating the different techniques used in building a text retrieval systems
    </Objective>
</Objectives>
</Unit>

It is desired to group the elements to produce output like:

<Unit xmlns="http://www.xml.com/xml/unit" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.xml.com/xml/unit/newUnit.xsd">
<Objectives>
    <CognitiveDomainObjective>
        <Objective>...
        </Objective>
        <Objective>....
        </Objective>
        <Objective>.....
        </Objective>
    </CognitiveDomainObjective>
    <AffectiveDomainObjective>
        <Objective>...
        </Objective>
    </AffectiveDomainObjective>
    <PsychomotorDomainObjective>
        <Objective>.....
        </Objective>
    </PsychomotorDomainObjective>
</Objectives>
</Unit>

The stylesheet for this problem is:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
 
<xsl:output method="xml" indent="yes" />

<xsl:template match="/">
   <Unit xmlns="http://www.xml.com/xml/unit"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.xml.com/xml/unit/newUnit.xsd">
     <Objectives>
       <xsl:for-each select="//*[local-name() = 'Objective'][not(@domain = preceding-sibling::*[local-name() = 'Objective']/@domain)]">
          <xsl:variable name="temp" select="@domain" />
          <xsl:element name="{@domain}">
            <xsl:for-each select="//*[local-name() = 'Objective'][@domain = $temp]">
              <xsl:copy>
                <xsl:value-of select="." />
              </xsl:copy>
            </xsl:for-each>
          </xsl:element>
       </xsl:for-each>
     </Objectives>
  </Unit>
</xsl:template>
 
</xsl:stylesheet>
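
The stylesheet above finds the first Objective of each domain by comparing against all preceding siblings. The same result can be sketched with the Muenchian method, provided the source namespace is bound to a prefix so that it can be used in the key's match pattern. The prefix u and the key name by-domain are my own choices.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:u="http://www.xml.com/xml/unit"
                exclude-result-prefixes="u"
                version="1.0">

<xsl:output method="xml" indent="yes" />

<!-- index the Objective elements (in the Unit namespace) by their domain attribute -->
<xsl:key name="by-domain" match="u:Objective" use="@domain" />

<xsl:template match="/">
   <Unit xmlns="http://www.xml.com/xml/unit"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.xml.com/xml/unit/newUnit.xsd">
     <Objectives>
       <!-- the first Objective of each domain stands for its group -->
       <xsl:for-each select="//u:Objective[generate-id() = generate-id(key('by-domain', @domain)[1])]">
          <xsl:element name="{@domain}">
            <xsl:for-each select="key('by-domain', @domain)">
              <xsl:copy>
                <xsl:value-of select="." />
              </xsl:copy>
            </xsl:for-each>
          </xsl:element>
       </xsl:for-each>
     </Objectives>
  </Unit>
</xsl:template>

</xsl:stylesheet>
```

Binding a prefix also removes the need for the local-name() tests, since u:Objective matches the namespaced element directly.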
 

16. Selecting unique entries from a list

The following question was asked on XSL-List.

I need to take the following XML and generate an initial web page displaying only one instance of each department name.

<xml>
    <List>
      <Entry>
       <Session>2004/5</Session>
       <Department>Accounting and Finance</Department>
      </Entry>
      <Entry>
       <Session>2004/5</Session>
       <Department>Accounting and Finance</Department>
      </Entry>
      <Entry>
       <Session>2004/5</Session>
       <Department>Maths</Department>
      </Entry>
      <Entry>
       <Session>2004/5</Session>
       <Department>Maths</Department>
      </Entry>
      <Entry>
       <Session>2004/5</Session>
       <Department>Economic History</Department>
      </Entry>
    </List>
</xml>

So in the html only unique departments would be displayed once:

Accounting and Finance
Maths
Economic History

The stylesheet for this problem is:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
 
<xsl:output method="html" />
 
<xsl:template match="/xml">
    <html>
      <head>
        <title/>
      </head>
      <body>
        <table>
          <xsl:for-each select="List/Entry/Department[not(. = preceding::Department)]">
            <tr>
              <td><xsl:value-of select="." /></td>
            </tr>
          </xsl:for-each>
        </table> 
      </body>
    </html>
</xsl:template>
 
</xsl:stylesheet>

This is tested with IE 6, and Saxon 6.5.3.


17. Grouping problem

The following question was asked on the XSL-List.

I wish to group the following XML into 2 periods. The periods are arbitrary, but for this example they happen to be:
Period 1:  1 - 12
Period 2:  14 - 30

Expected Result:
<result>
  <period begins="1" ends="12">
    <B period_begin="1" period_end="5"/>
    <B period_begin="2" period_end="7"/>
    <B period_begin="3" period_end="10"/>
    <B period_begin="4" period_end="12"/>
  </period>
  <period begins="14" ends="30">
    <B period_begin="14" period_end="16"/>
    <B period_begin="16" period_end="20"/>
    <B period_begin="16" period_end="30"/>
  </period>
</result>

Source XML (sorted)
<A>
  <B period_begin="1" period_end="5"/>
  <B period_begin="2" period_end="7"/>
  <B period_begin="3" period_end="10"/>
  <B period_begin="4" period_end="12"/>
  <B period_begin="14" period_end="16"/>
  <B period_begin="16" period_end="20"/>
  <B period_begin="16" period_end="30"/>
</A>

Source XML (un-sorted)
<A>
  <B period_begin="14" period_end="16"/>
  <B period_begin="2" period_end="7"/>
  <B period_begin="16" period_end="20"/>
  <B period_begin="1" period_end="5"/>
  <B period_begin="4" period_end="12"/>
  <B period_begin="16" period_end="30"/>
  <B period_begin="3" period_end="10"/>
</A>

The stylesheet for this problem is:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes" />

<xsl:variable name="start1" select="1" />
<xsl:variable name="end1" select="12" />
<xsl:variable name="start2" select="14" />
<xsl:variable name="end2" select="30" />
 
<xsl:template match="/A">
   <result>
     <period begins="{$start1}" ends="{$end1}">
       <xsl:for-each select="B[(@period_begin &gt;= $start1) and (@period_end &lt;= $end1)]">
         <xsl:sort select="@period_begin" data-type="number" />
         <xsl:copy-of select="." />
       </xsl:for-each>
     </period>
     <period begins="{$start2}" ends="{$end2}">
       <xsl:for-each select="B[(@period_begin &gt;= $start2) and (@period_end &lt;= $end2)]">
         <xsl:sort select="@period_begin" data-type="number" />
         <xsl:copy-of select="." />
       </xsl:for-each>
     </period>
   </result>
</xsl:template>
 
</xsl:stylesheet>

When the above XSLT stylesheet is given this XML as input:

<?xml version="1.0" encoding="UTF-8"?>
<A>
  <B period_begin="14" period_end="16"/>
  <B period_begin="2" period_end="7"/>
  <B period_begin="16" period_end="20"/>
  <B period_begin="1" period_end="5"/>
  <B period_begin="4" period_end="12"/>
  <B period_begin="16" period_end="30"/>
  <B period_begin="3" period_end="10"/>
</A>

The output received is:

<?xml version="1.0" encoding="utf-8"?>
<result>
  <period begins="1" ends="12">
    <B period_begin="1" period_end="5"/>
    <B period_begin="2" period_end="7"/>
    <B period_begin="3" period_end="10"/>
    <B period_begin="4" period_end="12"/>
 </period>
 <period begins="14" ends="30">
    <B period_begin="14" period_end="16"/>
    <B period_begin="16" period_end="20"/>
    <B period_begin="16" period_end="30"/>
 </period>
</result>

There were other interesting replies:

Wendell Piez..

I'm assuming your target periods are known at time of writing. (If only known at run-time, they can be parameterized. If not even that, more specification is called for.) I'm also assuming your data is known to be valid to the assumptions made for sorting (for example, there's no B whose beginning is in one period and end is in another) -- there's no exception-handling for any of that.

Here goes:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:local="data:hey-david" exclude-result-prefixes="local">

<xsl:output indent="yes"/>

<xsl:variable name="period-groups" xmlns="data:hey-david">
   <group start="1" end="12"/>
   <group start="14" end="30"/>
</xsl:variable>

<xsl:variable name="periods" select="//*[@period_begin]"/>

<xsl:template match="/">
   <result>
     <xsl:apply-templates select="document('')/*/*/local:group"/>
   </result>
</xsl:template>

<xsl:template match="local:group">
   <xsl:variable name="start" select="@start"/>
   <xsl:variable name="end" select="@end"/>
   <period begins="{@start}" ends="{@end}">
     <xsl:for-each select="$periods[@period_begin &gt;= $start and @period_end &lt;= $end]">
       <xsl:sort select="@period_begin" data-type="number"/>
       <xsl:sort select="@period_end" data-type="number"/>
       <xsl:copy-of select="."/>
     </xsl:for-each>
   </period>
</xsl:template>

</xsl:stylesheet>

Dimitre Novatchev..

This has an elegant solution using the f:foldl() function of FXSL.

Here, I'm giving a "first glance" XSLT 2.0 solution without using FXSL.

This transformation:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 
  <xsl:template match="A">
    <xsl:variable name="vStarting" select=
    "*[not(@period_begin/xs:integer(.)
         &lt;=
           preceding-sibling::*/@period_end/xs:integer(.)
           )]">
    </xsl:variable>
   
    <xsl:for-each select="$vStarting">
      <xsl:variable name="vPos" select="position()"/>
      <period start="{@period_begin}"
           end="{if ($vPos = last() )
                 then
                    max( (. | following-sibling::*)
                           /@period_end/xs:integer(.)
                        )
                 else
                    max( (. | following-sibling::*)
                       [. &lt;&lt; $vStarting[$vPos + 1]]
                           /@period_end/xs:integer(.)
                     )
                 }"
       />
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>


when applied to this source XML document (with one more group added to yours):

<A>
 <B period_begin="1" period_end="5"/>
 <B period_begin="2" period_end="7"/>
 <B period_begin="3" period_end="10"/>
 <B period_begin="4" period_end="12"/>
 <B period_begin="14" period_end="16"/>
 <B period_begin="16" period_end="20"/>
 <B period_begin="16" period_end="30"/>
 <B period_begin="32" period_end="33"/>
 <B period_begin="33" period_end="38"/>
</A>

produces the wanted result:

<period start="1" end="12"/>
<period start="14" end="30"/>
<period start="32" end="38"/>

Dimitre provided another solution, using FXSL f:foldl() function..

Still an XSLT 2.0 solution, but this time using f:foldl() as promised. This can be re-written 1:1 in XSLT 1.0 + FXSL for XSLT 1.0.

The reason I'm posting this second solution is that it very much resembles the "functional tokenizer" (see for example: http://www.biglist.com/lists/xsl-list/archives/200111/msg00905.html), and your problem can be called something like "interval tokenization".

I still cannot fully assimilate the meaning and implications of this striking similarity but it seems to prove that there's law, order and elegance in the world of functional programming.

Here's the transformation:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:foldl-func="foldl-func" xmlns:f="http://fxsl.sf.net/" exclude-result-prefixes="f foldl-func">

   <xsl:import href="../f/func-foldl.xsl"/>
  
   <xsl:output omit-xml-declaration="yes" indent="yes"/>

   <!--
      This transformation must be applied to: 
      ../data/periods.xml                 
   -->
   <xsl:variable name="vFoldlFun" as="element()">
    <foldl-func:foldl-func/>
   </xsl:variable>
  
   <xsl:variable name="vA0" as="element()+">
     <period start="0" end="0"/>
   </xsl:variable>

    <xsl:template match="/">
      <xsl:sequence select="f:foldl($vFoldlFun, $vA0, /*/* )[position() > 1]"/>
    </xsl:template>
   
    <xsl:template match="foldl-func:*" as="element()+" mode="f:FXSL">
       <xsl:param name="arg1"/>
       <xsl:param name="arg2"/>
      
       <xsl:variable name="vLastPeriod" select="$arg1[last()]"/>
        
       <xsl:choose>
         <xsl:when test="number($arg2/@period_begin) > number($vLastPeriod/@end)">
           <xsl:sequence select="$arg1"/>
           <period start="{$arg2/@period_begin}" end="{$arg2/@period_end}"/>
         </xsl:when>
         <xsl:otherwise>
           <xsl:sequence select="$arg1[not(. is $vLastPeriod)]"/>
           <xsl:choose>
             <xsl:when test="number($arg2/@period_end) > number($vLastPeriod/@end)">
               <period start="{$vLastPeriod/@start}" end="{$arg2/@period_end}"/>
             </xsl:when>
             <xsl:otherwise>
               <xsl:sequence select="$vLastPeriod"/>
             </xsl:otherwise>
           </xsl:choose>
         </xsl:otherwise>
       </xsl:choose>
    </xsl:template>

</xsl:stylesheet>

When applied on the same source xml document:

<A>
  <B period_begin="1" period_end="5"/>
  <B period_begin="2" period_end="7"/>
  <B period_begin="3" period_end="10"/>
  <B period_begin="4" period_end="12"/>
  <B period_begin="14" period_end="16"/>
  <B period_begin="16" period_end="20"/>
  <B period_begin="16" period_end="30"/>
  <B period_begin="32" period_end="33"/>
  <B period_begin="33" period_end="38"/>
</A>

it produces the wanted result:

<period start="1" end="12"/>
<period start="14" end="30"/>
<period start="32" end="38"/>

David Carlisle ..

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output indent="yes"/>

<xsl:template match="A">
  <result>
    <xsl:apply-templates select="B[1]"/>
  </result>
</xsl:template>

<xsl:template match="B">
   <xsl:param name="b" select="@period_begin"/>
   <xsl:param name="e" select="@period_end"/>
   <xsl:param name="g" select="/.."/>
  
   <xsl:variable name="e2" select="@period_end[. &gt; $e]|$e[. &gt;= current()/@period_end]"/>
   <xsl:choose>
     <xsl:when test="following-sibling::B[@period_begin &lt;=$e2 and @period_end &gt;= $e2]">
       <xsl:apply-templates select="following-sibling::B[1]">
         <xsl:with-param name="b" select="$b"/>
         <xsl:with-param name="e" select="$e2"/>
         <xsl:with-param name="g" select="$g|."/>
       </xsl:apply-templates>
     </xsl:when>
     <xsl:otherwise>
       <period begins="{$b}" ends="{$e2}">
         <xsl:copy-of select="$g|."/>
       </period>
       <xsl:apply-templates select="following-sibling::B[1]"/>
     </xsl:otherwise>
   </xsl:choose>
</xsl:template>

</xsl:stylesheet>

$ saxon period.xml  period.xsl
<?xml version="1.0" encoding="utf-8"?>
<result>
   <period begins="1" ends="12">
      <B period_begin="1" period_end="5"/>
      <B period_begin="2" period_end="7"/>
      <B period_begin="3" period_end="10"/>
      <B period_begin="4" period_end="12"/>
   </period>
   <period begins="14" ends="16">
      <B period_begin="14" period_end="16"/>
      <B period_begin="15" period_end="16"/>
   </period>
   <period begins="52" ends="62">
      <B period_begin="52" period_end="62"/>
   </period>
</result>
 

18. Sort by number of occurrences and remove duplicates

The following question was asked on the XSL-List.

> I have an xml doc that is the result of a keyword search, which lists the
> keyword, and all pages by id that have that instance of the keyword. The
> output looks like this.

> <?xml version="1.0" encoding="UTF-8"?>
> <docroot>
>  <token>
>  <pageid>1</pageid>
>  <pageid>3</pageid>
>  <pageid>84</pageid>
>  </token>
>  <token>
>  <pageid>3</pageid>
>  <pageid>5</pageid>
>  <pageid>84</pageid>
>  </token>
>  <token>financial aid
>  <pageid>5</pageid>
>  <pageid>84</pageid>
>  </token>
> </docroot>

> I need to transform this into a grouped and sorted list of page
> id's, so that the instances of pageid that occur the most frequently are
> first, and so that each pageid is listed only once. The above xml would
> then look like this.

> <docroot>
>  <pageid>84</pageid>
>  <pageid>3</pageid>
>  <pageid>5</pageid>
>  <pageid>1</pageid>
> </docroot>

The stylesheet for this problem is:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes" />

<xsl:key name="by-pageid" match="pageid" use="."/>

<xsl:template match="/docroot">
     <docroot>
         <xsl:for-each select="token/pageid[generate-id() = generate-id(key('by-pageid', .)[1])]">
             <xsl:sort select="count(key('by-pageid', .))" data-type="number" order="descending" />
             <pageid><xsl:value-of select="." /></pageid>
         </xsl:for-each>
     </docroot>
</xsl:template>

</xsl:stylesheet>
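
A minimal XSLT 2.0 sketch of the same idea, assuming a 2.0 processor: each group's size is available as count(current-group()), which serves directly as the sort key.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes" />

<xsl:template match="/docroot">
     <docroot>
         <xsl:for-each-group select="token/pageid" group-by=".">
             <!-- most frequent page ids first; ties keep order of first appearance -->
             <xsl:sort select="count(current-group())" order="descending" />
             <pageid><xsl:value-of select="current-grouping-key()" /></pageid>
         </xsl:for-each-group>
     </docroot>
</xsl:template>

</xsl:stylesheet>
```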
 

19. Applying Muenchian grouping with the help of result tree fragment (eliminating duplicates)

The following question was asked on the XSL-List.

Given this HTML file:

<TABLE>
    <TR>
       <TD>Checking existence of Wood</TD>
       <TR>
          <TD>Found values for Wood</TD>
          <TD> The values are x y z</TD>
          <TD>Found values for Wood</TD>
          <TD> The values are x y z</TD>
       </TR>
    </TR>
    <TR>
       <TD>Checking existence of Tree</TD>
       <TR>
          <TD>Found values for Tree</TD>
          <TD> The values are a b c</TD>
       </TR>
    </TR>
</TABLE>

The goal is to produce an identically structured output file, eliminating only the duplicate TD text within TABLE/TR/TR.

The stylesheet for this problem is:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:common="http://exslt.org/common"
                exclude-result-prefixes="common"
                version="1.0">
 
<xsl:output method="html" indent="yes" />
 
<xsl:key name="by-td" match="temp/TD" use="." />
 
<xsl:template match="node() | @*">
   <xsl:copy>
     <xsl:apply-templates select="node() | @*" />
   </xsl:copy>
</xsl:template>

<xsl:template match="TABLE/TR/TR">
   <xsl:variable name="rtf">
     <temp>
       <xsl:copy-of select="TD" />
     </temp> 
   </xsl:variable>
   <!-- copy the TR itself so that the output keeps the structure of the input -->
   <xsl:copy>
     <xsl:apply-templates select="@*" />
     <xsl:for-each select="common:node-set($rtf)/temp/TD[generate-id() = generate-id(key('by-td', .)[1])]">
       <xsl:copy-of select="." />
     </xsl:for-each>
   </xsl:copy>
</xsl:template>
 
</xsl:stylesheet>

The stylesheet uses the identity template, the node-set() extension function, and the Muenchian grouping technique (applied to a result tree fragment). It was tested with Saxon 8.4.
 

20. Muenchian grouping on XML fragments within a larger XML

The following question was asked on the microsoft.public.xsl newsgroup.

I have xml like this:

<root>
<A id="1">
 <B>
  <C>
   <D>1</D>
  </C>
  <C>
   <D id="3">2</D>
  </C>
  <C>
   <D id="4">2</D>
  </C>
 </B>
 <B>
  <C>
   <D>3</D>
  </C>
  <C>
   <D id="5">2</D>
  </C>
  <C>
   <D>4</D>
  </C>
 </B>
</A>
<A id="2">
 <B>
  <C>
   <D>4</D>
  </C>
  <C>
   <D>5</D>
  </C>
  <C>
   <D id="6">1</D>
  </C>
 </B>
 <B>
  <C>
   <D>5</D>
  </C>
  <C>
   <D>5</D>
  </C>
  <C>
   <D>4</D>
  </C>
 </B>
</A>
</root>

In scope A I need to select list of unique D

for A[id=1] should be: 1,2,3,4
for A[id=2] should be: 1,4,5

How to do this?

The stylesheet for this problem is:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                      xmlns:msxsl="urn:schemas-microsoft-com:xslt" version="1.0">
 
<xsl:output method="text" encoding="UTF-8" />
 
<xsl:key name="by-d" match="temp/D" use="." />
 
<xsl:template match="/root">
   <xsl:for-each select="A">
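     <!-- attribute value templates ({...}) are not expanded in text, so the id is written with xsl:value-of -->
     <xsl:text>A[id=</xsl:text><xsl:value-of select="@id" /><xsl:text>]: </xsl:text>
     <xsl:variable name="rtf">
       <temp>
         <xsl:copy-of select=".//D" />
       </temp>
     </xsl:variable>
     <xsl:for-each select="msxsl:node-set($rtf)/temp/D[generate-id() = generate-id(key('by-d', .)[1])]">
       <xsl:sort select="." order="ascending" data-type="number" />
       <xsl:value-of select="." /><xsl:if test="position() !=  last()"><xsl:text>,</xsl:text></xsl:if>
     </xsl:for-each>
     <!-- end each list with a newline -->
     <xsl:text>&#xa;</xsl:text>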
   </xsl:for-each>
</xsl:template>
 
</xsl:stylesheet>


21. Extracting unique list of attributes (Muenchian grouping)

The following question was asked on XSL-List.

<root>
  <students>
    <student grade="first">abc</student>
    <student grade="first">def</student>
    <student grade="second">ghi</student>
    <student grade="third">jkl</student>
    <student grade="third">mno</student>
  </students>
</root>

It is required to transform this into the following XML using XSLT.

<root>
  <grades>
    <grade level="first"/>
    <grade level="second"/>
    <grade level="third"/>
  </grades>
  <students>
     <student grade="first">abc</student>
     <student grade="first">def</student>
     <student grade="second">ghi</student>
     <student grade="third">jkl</student>
     <student grade="third">mno</student>
   </students>
</root>

The stylesheet for this problem is:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output method="xml" indent="yes" />

<xsl:key name="by-grade" match="student" use="@grade" />

<xsl:template match="/root">
 <root>
   <grades>
     <xsl:for-each select="students/student[generate-id() = generate-id(key('by-grade', @grade)[1])]">
       <grade level="{@grade}" />
     </xsl:for-each>
   </grades>
   <xsl:copy-of select="students" />
 </root>
</xsl:template>

</xsl:stylesheet>

Michael Kay suggested:
For all problems involving eliminating duplicate values, or grouping by
common values, see

http://www.jenitennison.com/xslt/grouping

or search on "Muenchian grouping".

If you're using XSLT 2.0, your problem can be solved easily using the distinct-values() function in XPath 2.0.
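
As a rough illustration of that remark, an XSLT 2.0 version of the stylesheet above might look like the sketch below. It assumes an XSLT 2.0 processor; note that distinct-values() returns its results in an implementation-dependent order, so an xsl:sort would be needed if a particular order of grades matters.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

<xsl:output method="xml" indent="yes" />

<xsl:template match="/root">
 <root>
   <grades>
     <!-- Each item is an atomic value, so "." is the grade string itself -->
     <xsl:for-each select="distinct-values(students/student/@grade)">
       <grade level="{.}" />
     </xsl:for-each>
   </grades>
   <xsl:copy-of select="students" />
 </root>
</xsl:template>

</xsl:stylesheet>
```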


22. Positional grouping solutions

Using Sibling Recursion (by Michael Kay):

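<xsl:template match="parent">
   <!-- Start one group at each STARTER element -->
   <xsl:apply-templates select="STARTER" mode="sr"/>
</xsl:template>

<xsl:template match="STARTER" mode="sr">
 <group>
     <!-- Process the starter itself in the default mode -->
     <xsl:apply-templates select="."/>
     <!-- Walk to the next sibling, stopping at the next STARTER -->
     <xsl:apply-templates select="following-sibling::*[1][not(self::STARTER)]" mode="sr"/>
 </group>
</xsl:template>

<xsl:template match="*" mode="sr">
    <xsl:apply-templates select="."/>
    <xsl:apply-templates select="following-sibling::*[1][not(self::STARTER)]" mode="sr"/>
</xsl:template>

Here STARTER is an element that starts a new group, and * matches any non-starter element. The unmoded apply-templates calls hand each element to the ordinary (default-mode) templates for output, while the "sr" mode carries the recursion along the siblings.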

By David Carlisle (using a key):

Given this XML file:

<?xml version="1.0"?>
<X>
    <Y>
        <Z>HI</Z>
        <Z>HI</Z>
        <Z>HI</Z>
        <Z>YES</Z>
        <Z>HI</Z>
        <Z>HI</Z>
    </Y>
</X>

The desired output is:

<X>
    <Y>
        <group>
            <Z>HI</Z>
            <Z>HI</Z>
            <Z>HI</Z>
        </group>
        <Z>YES</Z>
        <group>
            <Z>HI</Z>
            <Z>HI</Z>
        </group>
    </Y>
</X>

The code suggested by David is:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output method="xml" indent="yes" />

<xsl:key name="z" match="Z" use="string(preceding-sibling::Z[.='YES'][1])"/>

<xsl:template match="/X">
    <X>
        <xsl:apply-templates select="Y" />
    </X>
</xsl:template>

<xsl:template match="Y">
<Y>
    <xsl:for-each select="Z[generate-id()=generate-id(key('z',string(preceding-sibling::Z[.='YES'][1])))]">
        <group>
            <xsl:copy-of select="key('z',string(preceding-sibling::Z[.='YES'][1]))[not(.='YES')]"/>
        </group>
        <xsl:copy-of select="key('z',string(preceding-sibling::Z[.='YES'][1]))[(.='YES')]"/>
    </xsl:for-each>
</Y>
</xsl:template>

</xsl:stylesheet>

Here is my own solution (using a recursive named template).

This stylesheet works on the same XML as David's:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output method="xml" indent="yes" />

<xsl:template match="/X">
 <X>
  <Y>
    <xsl:apply-templates select="Y" />
  </Y>
 </X>
</xsl:template>

<xsl:template match="Y">
  <xsl:apply-templates select="Z" />
</xsl:template>

<xsl:template match="Z">
  <xsl:choose>
    <xsl:when test=". = 'YES'">
      <xsl:copy-of select="." />
    </xsl:when>
    <xsl:when test="((. = 'HI') and not(preceding-sibling::Z[1] = 'HI'))">
      <group>
        <xsl:copy-of select="." />
        <xsl:call-template name="generate-group-elements">
          <xsl:with-param name="list" select="following-sibling::Z" />
        </xsl:call-template>
      </group>
   </xsl:when>
 </xsl:choose>
</xsl:template>

<xsl:template name="generate-group-elements">
  <xsl:param name="list" />

 <xsl:if test="$list[1]='HI'">
    <xsl:copy-of select="$list[1]" />
    <xsl:call-template name="generate-group-elements">
      <xsl:with-param name="list" select="$list[position() > 1]" />
    </xsl:call-template>
  </xsl:if>
</xsl:template>

</xsl:stylesheet>
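
For comparison, the same grouping can probably be written more directly in XSLT 2.0 with group-adjacent. This is a sketch that assumes an XSLT 2.0 processor and the same input as above; it is not one of the solutions posted in the original thread.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

<xsl:output method="xml" indent="yes" />

<xsl:template match="/X">
 <X>
   <xsl:apply-templates select="Y" />
 </X>
</xsl:template>

<xsl:template match="Y">
 <Y>
   <!-- Adjacent Z elements with the same boolean key form one group -->
   <xsl:for-each-group select="Z" group-adjacent=". = 'HI'">
     <xsl:choose>
       <!-- A run of HI elements is wrapped in a group element -->
       <xsl:when test="current-grouping-key()">
         <group>
           <xsl:copy-of select="current-group()" />
         </group>
       </xsl:when>
       <!-- Anything else (here, YES) is copied unchanged -->
       <xsl:otherwise>
         <xsl:copy-of select="current-group()" />
       </xsl:otherwise>
     </xsl:choose>
   </xsl:for-each-group>
 </Y>
</xsl:template>

</xsl:stylesheet>
```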


23. Positional grouping

The following question was asked on XSL-List.

I am trying to retrieve data from a table structure (an XML table) where, after every few rows, a special row appears containing a piece of data that is relevant to the rows immediately after it (i.e. its next few siblings). I am having a hard time figuring out how to achieve this without a dynamically assigned variable in XSL.
 
  Test data looks like this:
 
  <schedule>
         <row type="header">
                 <col>January</col>
                 <col>Opponent</col>
         </row>
         <row type="data">
                 <col>10 at 6pm</col>
                 <col>Dallas</col>
         </row>
         <row type="data">
                 <col>21 at 8pm</col>
                 <col>New York</col>
         </row>
         <row type="data">
                 <col>31 at 8pm</col>
                 <col>Chicago</col>
         </row>
         <row type="header">
                 <col>March</col>
                 <col>Opponent</col>
         </row>
         <row type="data">
                 <col>16 at 9pm</col>
                 <col>Houston</col>
         </row>
         <row type="data">
                 <col>31 at 7pm</col>
                 <col>Sacramento</col>
         </row>
  </schedule>
 
  And the desired output is:

  <schedule>
         <date>January 10 at 6pm</date>
         <date>January 21 at 8pm</date>
         <date>January 31 at 8pm</date>
         <date>March 16 at 9pm</date>
         <date>March 31 at 7pm</date>
  </schedule>

The stylesheet for this problem is:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
 
<xsl:output method="xml" indent="yes" />
  
<xsl:template match="/schedule">
   <schedule>
     <xsl:apply-templates select="row" />        
   </schedule>  
</xsl:template>
 
<xsl:template match="row">
   <xsl:choose>
     <xsl:when test="col[2] = 'Opponent'">
       <xsl:call-template name="groupSiblings">
         <xsl:with-param name="month" select="col[1]" />
         <xsl:with-param name="list" select="following-sibling::row" />
       </xsl:call-template>
     </xsl:when>
     <xsl:otherwise />     
  </xsl:choose>  
</xsl:template>
 
<xsl:template name="groupSiblings">
   <xsl:param name="month" />
   <xsl:param name="list" />

   <xsl:if test="not($list[1]/col[2] = 'Opponent')">
     <date> <xsl:value-of select="$month" /><xsl:text> </xsl:text> <xsl:value-of select="$list[1]/col[1]" /></date>
     <xsl:if test="count($list) &gt;= 2">
       <xsl:call-template name="groupSiblings">
         <xsl:with-param name="month" select="$month" />
         <xsl:with-param name="list" select="$list[position() >  1]" />
       </xsl:call-template>
     </xsl:if>
   </xsl:if>   
</xsl:template>
 
</xsl:stylesheet>
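
Since each data row only needs the month from the nearest header row before it, a shorter XSLT 1.0 formulation seems possible by looking backwards along the preceding-sibling axis instead of recursing forwards. The sketch below assumes that header and data rows can be told apart by their type attribute, as in the test data.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output method="xml" indent="yes" />

<xsl:template match="/schedule">
   <schedule>
     <xsl:for-each select="row[@type = 'data']">
       <date>
         <!-- [1] on a reverse axis picks the closest preceding header row -->
         <xsl:value-of select="preceding-sibling::row[@type = 'header'][1]/col[1]" />
         <xsl:text> </xsl:text>
         <xsl:value-of select="col[1]" />
       </date>
     </xsl:for-each>
   </schedule>
</xsl:template>

</xsl:stylesheet>
```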
 

24. Positional grouping

The following question was asked on XSL-List.

Given the following XML file:

<root>
   <par>my first para</par>
   <par>second para</par>
   <li>first list item of first list</li>
   <li>second list item of first list</li>
   <li>third list item of first list</li>
   <par>third para</par>
   <li>first list item of second list</li>
   <par>fourth para</par>
   <li>first list item of third list</li>
   <li>second list item of third list</li>
</root>

It is desired to produce the following XML:

<root>
 <p>my first para</p>
 <p>second para</p>
 <ul>
   <li>first list item of first list</li>
   <li>second list item of first list</li>
   <li>third list item of first list</li>
 </ul>
 <p>third para</p>
 <ul>
   <li>first list item of second list</li>
 </ul>
 <p>fourth para</p>
 <ul>
   <li>first list item of third list</li>
   <li>second list item of third list</li>
 </ul>
</root>

The stylesheet for this problem is:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output method="xml" indent="yes" />

<xsl:template match="/root">
  <root>
   <xsl:for-each select="*">
     <xsl:choose>
       <xsl:when test="self::par">
        <p><xsl:value-of select="." /></p>
       </xsl:when>
       <xsl:otherwise>
         <xsl:if test="(name(preceding-sibling::*[1]) = 'par') or (position() = 1)">
           <ul>
             <xsl:call-template name="makegroup">
               <xsl:with-param name="nodeset" select="self::* | following-sibling::*" />
             </xsl:call-template>
           </ul>
         </xsl:if>
       </xsl:otherwise>
     </xsl:choose>
   </xsl:for-each>
  </root>
</xsl:template>

<xsl:template name="makegroup">
  <xsl:param name="nodeset" />

  <xsl:if test="name($nodeset[1]) = 'li'">
    <xsl:copy-of select="$nodeset[1]" />
    <xsl:call-template name="makegroup">
      <xsl:with-param name="nodeset" select="$nodeset[position() &gt; 1]" />
    </xsl:call-template>
  </xsl:if>

</xsl:template>

</xsl:stylesheet>
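
In XSLT 2.0 this kind of problem is usually a natural fit for group-adjacent. The following sketch assumes an XSLT 2.0 processor and the same input; it is offered as an illustration and was not part of the original thread.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

<xsl:output method="xml" indent="yes" />

<xsl:template match="/root">
  <root>
    <!-- Consecutive li elements share the key true(); everything else gets false() -->
    <xsl:for-each-group select="*" group-adjacent="boolean(self::li)">
      <xsl:choose>
        <xsl:when test="current-grouping-key()">
          <ul>
            <xsl:copy-of select="current-group()" />
          </ul>
        </xsl:when>
        <xsl:otherwise>
          <!-- Runs of non-li elements (the par elements here) become p elements -->
          <xsl:for-each select="current-group()">
            <p><xsl:value-of select="." /></p>
          </xsl:for-each>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </root>
</xsl:template>

</xsl:stylesheet>
```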


25. XSLT grouping problem

The following question was asked on XSL-List.

Given the following XML file:

<chapter>
  <para>This is a paragraph in the chapter</para>
  <line_first>This is first line of a stanza of poetry</line_first>
  <line>This is line of poetry</line>
  <line>This is line of poetry</line>
  <line>This is line of poetry</line>
  <line_last>This is last line of a stanza of poetry</line_last>
  <para>This is a paragraph in the chapter</para>
</chapter>

It is desired to produce the following output:

<chapter>
  <para>This is a paragraph in the chapter</para>
  <stanza>
    <line_first>This is first line of a stanza of poetry</line_first>
    <line>This is line of poetry</line>
    <line>This is line of poetry</line>
    <line>This is line of poetry</line>
    <line_last>This is last line of a stanza of poetry</line_last>
  </stanza>
  <para>This is a paragraph in the chapter</para>
</chapter>

(The consecutive elements from <line_first> through <line_last> need to be grouped in a <stanza> element. There can be multiple stanzas.)

This is a positional grouping problem.

The stylesheet for this problem is:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output method="xml" indent="yes" />

<xsl:template match="/chapter">
    <chapter>
        <xsl:for-each select="*">
            <xsl:choose>
                <xsl:when test="self::para">
                    <xsl:copy-of select="." />
                </xsl:when>
                <xsl:when test="self::line_first">
                    <stanza>
                        <xsl:call-template name="makegroup">
                            <xsl:with-param name="nodeset" select="self::line_first | following-sibling::*" />
                        </xsl:call-template>
                    </stanza>
                </xsl:when>
            </xsl:choose>
        </xsl:for-each>
    </chapter>
</xsl:template>

<xsl:template name="makegroup">
<xsl:param name="nodeset" />

<xsl:choose>
    <xsl:when test="$nodeset[1]/self::line_last">
        <xsl:copy-of select="$nodeset[1]" />
    </xsl:when>
    <xsl:otherwise>
        <xsl:copy-of select="$nodeset[1]" />
        <xsl:call-template name="makegroup">
            <xsl:with-param name="nodeset" select="$nodeset[position() &gt; 1]" />
        </xsl:call-template>
    </xsl:otherwise>
</xsl:choose>
</xsl:template>

</xsl:stylesheet>

David Carlisle suggested the following XSLT 2.0 solution:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

<xsl:output indent="yes"/>

<xsl:template match="*">
    <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates/>
    </xsl:copy>
</xsl:template>

<xsl:template match="*[line]">
    <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:for-each-group select="*" group-adjacent="exists(self::line|self::line_first|self::line_last)">
            <xsl:choose>
                <xsl:when test="current-grouping-key()">
                    <stanza>
                        <xsl:apply-templates select="current-group()"/>
                    </stanza>
                </xsl:when>
                <xsl:otherwise>
                    <xsl:apply-templates select="current-group()"/>
                </xsl:otherwise>
            </xsl:choose>
        </xsl:for-each-group>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>

Dimitre Novatchev suggested the following XSLT 1.0 solution:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output omit-xml-declaration="yes" indent="yes"/>

<xsl:strip-space elements="*"/>

<xsl:key name="kLLast" match="line_last | line" use="generate-id(preceding-sibling::line_first[1])"/>

<xsl:template match="node()|@*">
    <xsl:copy>
        <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="line_first">
    <stanza>
        <xsl:copy-of select=".|key('kLLast',generate-id())"/>
    </stanza>
</xsl:template>

<xsl:template match="line|line_last"/>

</xsl:stylesheet>

Michael Kay suggested this XSLT 2.0 approach:

<xsl:for-each-group select="*" group-adjacent="self::line or self::line_first or self::line_last">
    <xsl:choose>
        <xsl:when test="current-grouping-key()">
            <stanza><xsl:copy-of select="current-group()"/></stanza>
        </xsl:when>
        <xsl:otherwise>
            <xsl:copy-of select="current-group()"/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:for-each-group>
 

26. Muenchian grouping application

The following question was asked on the microsoft.public.xsl newsgroup.

Given this XML document:

<doc>
  <product>
    <prod>product one</prod>
    <prod>product two</prod>
    <prod>product three</prod>
    <prod>product four</prod>
  </product>
  <category>
    <cat>category one</cat>
    <cat>category two</cat>
    <cat>category one</cat>
    <cat>category two</cat>
  </category>
</doc>

Each product belongs to the category in the corresponding position, i.e. "product one" belongs to "category one", "product two" belongs to "category two", "product three" belongs to "category one", and so on.

What I would like to achieve is to present them in an HTML table like this:

category one      product one
                  product three

category two      product two
                  product four
 

The stylesheet for this problem is:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                       xmlns:msxsl="urn:schemas-microsoft-com:xslt"
                       version="1.0">
 
<xsl:output method="html" />
 
<xsl:key name="by-cat" match="cat" use="." />
 
<xsl:variable name="d" select="/" />
 
<xsl:template match="/doc">
   <html>
     <head>
       <title/>
     </head>
     <body>
       <xsl:variable name="rtf">
         <xsl:apply-templates select="category" />         
       </xsl:variable>
       <table>
         <xsl:for-each select="msxsl:node-set($rtf)/category/x/y">
           <xsl:variable name="n1" select="../z[1]" />
           <tr>
             <td><xsl:value-of select="." /></td>
             <td><xsl:value-of select="$d/doc/product/prod[position() = $n1]" /></td>
           </tr>
           <xsl:for-each select="../z[position() > 1]">
             <xsl:variable name="n2" select="." />
             <tr>
                 <td/>
                 <td><xsl:value-of select="$d/doc/product/prod[position() = $n2]" /></td>
             </tr>
           </xsl:for-each>
         </xsl:for-each>
       </table>
     </body> 
   </html>
</xsl:template>
 
<xsl:template match="category">
   <category>
     <xsl:for-each select="cat[generate-id() = generate-id(key('by-cat', .)[1])]">
       <x>
         <y><xsl:value-of select="." /></y>
         <xsl:for-each select="key('by-cat', .)">
           <z><xsl:value-of select="count(/doc/category/cat[generate-id() = generate-id(current())]/preceding-sibling::cat) + 1" /></z>
         </xsl:for-each>   
       </x>
     </xsl:for-each>
   </category>
</xsl:template>
 
</xsl:stylesheet>
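
The intermediate tree is not strictly required here. As a hedged alternative, the sketch below applies the same key directly to the source document and looks up each product by the position of its cat element, so it should run on any XSLT 1.0 processor without the msxsl extension. The variable name pos is an illustrative choice.

```xml
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output method="html" />

<xsl:key name="by-cat" match="cat" use="." />

<xsl:template match="/doc">
   <html>
     <head>
       <title/>
     </head>
     <body>
       <table>
         <!-- One iteration per distinct category (Muenchian grouping) -->
         <xsl:for-each select="category/cat[generate-id() = generate-id(key('by-cat', .)[1])]">
           <xsl:for-each select="key('by-cat', .)">
             <!-- Position of this cat among its siblings -->
             <xsl:variable name="pos" select="count(preceding-sibling::cat) + 1" />
             <tr>
               <!-- Print the category name only on the first row of its group -->
               <td><xsl:if test="position() = 1"><xsl:value-of select="." /></xsl:if></td>
               <td><xsl:value-of select="/doc/product/prod[position() = $pos]" /></td>
             </tr>
           </xsl:for-each>
         </xsl:for-each>
       </table>
     </body>
   </html>
</xsl:template>

</xsl:stylesheet>
```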


27. Positional grouping

The following question was asked on XSL-List.

Given this input XML:

<a>1</a>
<b>2</b>
<Attribute>
    <Name>xx</Name>
    <Value>xx</Value>
</Attribute>
<Attribute>
    <Name>yy </Name>
    <Value>yy</Value>
</Attribute>
...
<c>3</c>

I would like to generate:

<a>1</a>
<b>2</b>
<OtherAttributes>
    <Attribute>
        <Name>xx</Name>
        <Value>xx</Value>
    </Attribute>
    <Attribute>
        <Name>yy </Name>
        <Value>yy</Value>
    </Attribute>
...
</OtherAttributes>
<c>3</c>

There could be 0 to n Attribute elements, and I would like to wrap them in an OtherAttributes element.

George Cristian Bina suggested the following XSLT 1.0 stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="node() | @*">
    <xsl:copy>
        <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="Attribute">
    <OtherAttributes>
        <xsl:copy>
            <xsl:apply-templates select="node() | @*"/>
        </xsl:copy>
        <xsl:apply-templates select="following-sibling::*[1][self::Attribute]" mode="att"/>
    </OtherAttributes>
</xsl:template>

<xsl:template match="Attribute[preceding-sibling::*[1][self::Attribute]]"/>

<xsl:template match="Attribute" mode="att">
    <xsl:copy>
        <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
    <xsl:apply-templates select="following-sibling::*[1][self::Attribute]" mode="att"/>
</xsl:template>

<xsl:template match="node() | @*" mode="att">
    <xsl:copy>
        <xsl:apply-templates select="node() | @*"/>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>

I suggested the following XSLT 1.0 solution (it assumes that the elements are wrapped in a root element):

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output method="xml" indent="yes" />

<xsl:template match="/root">
    <root>
        <xsl:for-each select="*">
            <xsl:choose>
                <xsl:when test="not(self::Attribute)">
                    <xsl:copy-of select="." />
                </xsl:when>
                <xsl:when test="not(preceding-sibling::*[1]/self::Attribute)">
                    <OtherAttributes>
                        <xsl:copy-of select="." />
                        <xsl:call-template name="MakeGroup">
                            <xsl:with-param name="nodeset" select="following-sibling::*" />
                        </xsl:call-template>
                    </OtherAttributes>
                </xsl:when>
            </xsl:choose>
        </xsl:for-each>
    </root>
</xsl:template>

<xsl:template name="MakeGroup">
<xsl:param name="nodeset" />

<xsl:if test="$nodeset[1]/self::Attribute">
    <xsl:copy-of select="$nodeset[1]" />
    <xsl:call-template name="MakeGroup">
        <xsl:with-param name="nodeset" select="$nodeset[position() &gt; 1]" />
    </xsl:call-template>
</xsl:if>
</xsl:template>

</xsl:stylesheet>

David Carlisle suggested the following XSLT 2.0 solution:

With grouping problems like this, the XSLT2 solution is probably easier to follow, as for-each-group makes the grouping explicit.

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:strip-space elements="*"/>

<xsl:output indent="yes"/>

<xsl:template match="node()">
    <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates/>
    </xsl:copy>
</xsl:template>

<xsl:template match="*[Attribute]">
    <xsl:copy>
        <xsl:for-each-group select="*" group-adjacent="boolean(self::Attribute)">
            <xsl:choose>
                <xsl:when test="self::Attribute">
                    <OtherAttributes>
                        <xsl:apply-templates select="current-group()"/>
                    </OtherAttributes>
                </xsl:when>
                <xsl:otherwise>
                    <xsl:apply-templates select="current-group()"/>
                </xsl:otherwise>
            </xsl:choose>
        </xsl:for-each-group>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>
 

28. Muenchian grouping

The following question was asked on XSL-List.

Given this source XML:

<report>
    <column>name</column>
    <column>country</column>
    <column>group</column>
    <table>
        <date>07-13-2006
            <group type="AAA">111111
                <entry>
                    <name>Adel</name>
                    <country>USA</country>
                    <id>12345</id>
                </entry>
                <entry>
                    <name>Barry</name>
                    <country>USA</country>
                    <id>12346</id>
                </entry>
                <entry>
                    <name>Carl</name>
                    <country>USA</country>
                    <id>12347</id>
                </entry>
            </group>
            <group type="BBB">111111
                <entry>
                    <name>Dave</name>
                    <country>USA</country>
                    <id>12345</id>
                </entry>
                <entry>
                    <name>Ethel</name>
                    <country>USA</country>
                    <id>12346</id>
                </entry>
                <entry>
                    <name>Fred</name>
                    <country>USA</country>
                    <id>12347</id>
                </entry>
            </group>
            <group type="CCC">111111
                <entry>
                    <name>George</name>
                    <country>EUR</country>
                    <id>24567</id>
                </entry>
                <entry>
                    <name>Harold</name>
                    <country>EUR</country>
                    <id>23458</id>
                </entry>
                <entry>
                    <name>Jennifer</name>
                    <country>EUR</country>
                    <id>23459</id>
                </entry>
            </group>
        </date>
    </table>
</report>

The desired output is (should be an HTML table):

USA
Group Name id
AAA Adel 12345
AAA Barry 12346
AAA Carl 12347
BBB Dave 12345
BBB Ethel 12346
BBB Fred 12347

EUR
Group Name id
CCC George 24567
CCC Harold 23458
CCC Jennifer 23459

The XSLT 1.0 solution is (using Muenchian grouping method):

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="html" indent="yes" />

<xsl:key name="by-country" match="entry" use="country" />

<xsl:template match="/report">
    <html>
        <head>
            <title/>
        </head>
        <body>
            <table border="1">
                <xsl:for-each select="//entry[generate-id() = generate-id(key('by-country', country)[1])]">
                    <tr>
                        <td><xsl:value-of select="country" /></td>
                    </tr>
                    <tr>
                        <td>Group</td>
                        <td>Name</td>
                        <td>id</td>
                    </tr>
                    <xsl:for-each select="key('by-country', country)">
                        <tr>
                            <td><xsl:value-of select="../@type" /></td>
                            <td><xsl:value-of select="name" /></td>
                            <td><xsl:value-of select="id" /></td>
                        </tr>
                    </xsl:for-each>
                </xsl:for-each>
            </table>
        </body>
    </html>
</xsl:template>

</xsl:stylesheet>

Michael Kay suggested the following XSLT 2.0 solution:

<xsl:for-each-group select="//entry" group-adjacent="country">
    <h2><xsl:value-of select="current-grouping-key()"/></h2>
    <table>
        <thead>...</thead>
        <tbody>
            <xsl:for-each select="current-group()">
                <tr>
                    <td><xsl:value-of select="../@type"/></td>
                    <td><xsl:value-of select="name"/></td>
                    <td><xsl:value-of select="id"/></td>
                </tr>
            </xsl:for-each>
        </tbody>
    </table>
</xsl:for-each-group>


29. Grouping problem

The following question was asked on XSL-List.

I need to use XSLT 1.0.

I've a set of nodes, for which I want:
a. to replace some values (most will not change), then
b. eliminate the duplicates and sort them

Sample input XML:

<?xml version="1.0" encoding="UTF-8" ?>
<a>
    <e1>
        <b>abc</b>
        <b>abd</b>
        <b>abe</b>
        <b>abf</b>
        <b>abg</b>
        <b>abh</b>
        <b>abd</b>
    </e1>
    <e2>
        <f c="abe" b1="abc"/>
        <f c="abf" b1="abj"/>
        <f c="abg" b1="abi"/>
        <f c="abh" b1="abi"/>
    </e2>
</a>

The input table _e1_ will be transformed using _f_ to:

<e1>
    <b>abc</b>
    <b>abd</b>
    <b>abc</b>
    <b>abj</b>
    <b>abi</b>
    <b>abi</b>
    <b>abd</b>
</e1>

and finally should become:

<e3>
    <b2>abc</b2>
    <b2>abd</b2>
    <b2>abi</b2>
    <b2>abj</b2>
</e3>

Dimitre Novatchev suggested the following XSLT 1.0 solution:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output omit-xml-declaration="yes" indent="yes"/>

<xsl:key name="kb" match="b[not(. = ../../e2/f/@c)]" use="."/>
<xsl:key name="kb" match="@b1" use="../../../e1/b[. = current()/../@c]"/>
<xsl:key name="kDist" match="b | @b1" use="."/>

<xsl:template match="/">
    <e3>
        <xsl:for-each select="key('kb', */e1/b)[generate-id() = generate-id(key('kDist', .)[1])]">
            <xsl:sort/>
            <b2><xsl:value-of select="."/></b2>
        </xsl:for-each>
    </e3>
</xsl:template>

</xsl:stylesheet>

I provided the following answer (which uses node-set extension function):

<?xml version="1.0"?>

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:common="http://exslt.org/common" exclude-result-prefixes="common">

<xsl:output method="xml" indent="yes" />

<xsl:template match="/a">
    <e3>
        <xsl:variable name="rtf">
            <xsl:for-each select="e1/b">
                <b>
                    <xsl:choose>
                        <xsl:when test="../../e2/f/@c = .">
                            <xsl:value-of select="../../e2/f[@c = current()]/@b1" />
                        </xsl:when>
                        <xsl:otherwise>
                            <xsl:value-of select="." />
                        </xsl:otherwise>
                    </xsl:choose>
                </b>
            </xsl:for-each>
        </xsl:variable>
        <xsl:for-each select="common:node-set($rtf)/b[not(. = preceding-sibling::b)]">
            <xsl:sort select="." />
            <b2><xsl:value-of select="." /></b2>
        </xsl:for-each>
    </e3>
</xsl:template>

</xsl:stylesheet>
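
For readers who are not restricted to XSLT 1.0, the replace-then-deduplicate steps can probably be combined into a single XPath 2.0 expression. The sketch below assumes an XSLT 2.0 processor; if several f elements share the same c value, it simply takes the first b1.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes" />

<xsl:template match="/a">
    <e3>
        <!-- For each b, use the mapped b1 value if one exists, else b itself;
             then remove duplicates and sort -->
        <xsl:for-each select="distinct-values(for $b in e1/b return string((e2/f[@c = $b]/@b1, $b)[1]))">
            <xsl:sort select="." />
            <b2><xsl:value-of select="." /></b2>
        </xsl:for-each>
    </e3>
</xsl:template>

</xsl:stylesheet>
```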


30. Converting a flat XML to hierarchical one

The following question was asked on XSL-List.

If I were to have a RDBMS table along the following lines:

+----+---------------+------+
| id | name          | path |
+----+---------------+------+
| 1  | About         | 1    |
| 2  | Services      | 2    |
| 3  | Environmental | 2.1  |
| 4  | Landscaping   | 2.2  |
+----+---------------+------+

And this were converted to XML, e.g.:

<directories>
    <item>
        <id>1</id>
        <name>About</name>
        <path>1</path>
    </item>
    <item>
        <id>2</id>
        <name>Services</name>
        <path>2</path>
    </item>
    ... etc ...
</directories>

How can it be converted to a tree structure? e.g.:

<directories>
    <directory name="About"/>
    <directory name="Services">
        <directory name="Environmental" />
        <directory name="Landscaping" />
    </directory>
</directories>

The XSLT 1.0 solution for this is:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes"/>

<xsl:template match="/directories">
    <directories>
        <xsl:for-each select="item[(string-length(path) - string-length(translate(path, '.', ''))) = 0]">
            <directory name="{name}">
                <xsl:call-template name="re-arrange">
                    <xsl:with-param name="nodeset" select="../item" />
                    <xsl:with-param name="curr-path" select="path" />
                    <xsl:with-param name="no-of-dots" select="1" />
                </xsl:call-template>
            </directory>
        </xsl:for-each>
    </directories>
</xsl:template>

<xsl:template name="re-arrange">
    <xsl:param name="nodeset" />
    <xsl:param name="curr-path" />
    <xsl:param name="no-of-dots" />

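    <!-- Children have exactly one more dot and a path that starts with the parent's path followed by '.' -->
    <xsl:for-each select="$nodeset[((string-length(path) - string-length(translate(path, '.', ''))) = $no-of-dots) and starts-with(path, concat($curr-path, '.'))]">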
        <directory name="{name}">
            <xsl:call-template name="re-arrange">
                <xsl:with-param name="nodeset" select="$nodeset" />
                <xsl:with-param name="curr-path" select="path" />
                <xsl:with-param name="no-of-dots" select="$no-of-dots + 1" />
            </xsl:call-template>
        </directory>
    </xsl:for-each>

</xsl:template>

</xsl:stylesheet>

David Carlisle provided the following XSLT 2.0 solution:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="directories">
        <directories>
            <xsl:apply-templates select="item[not(contains(path,'.'))]"/>
        </directories>
    </xsl:template>

    <xsl:template match="item">
        <directory name="{name}">
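            <!-- Dots in the parent path are escaped; a child adds one more dot-free segment of any length -->
            <xsl:apply-templates select="../item[matches(path, concat('^', replace(current()/path, '\.', '\\.'), '\.[^.]+$'))]"/>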
        </directory>
    </xsl:template>
 
</xsl:stylesheet>
 

31. Extracting unique letters, and counting how many times they occur

The following question was asked on XSL-List.

I have the following XML file:

<list>
    <item1 ids="a,b,c,e" />
    <item2 ids="b,c,d" />
    <item3 ids="a,c,d,e" />
    <item4 ids="e,f" />
    <item5 ids="a,c,d,e,g" />
</list>

and I want the following output:

a: 3
b: 2
c: 4
d: 3
e: 4
f: 1
g: 1

The XSLT 1.0 solution for this is (using the node-set extension function):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:common="http://exslt.org/common">

<xsl:output method="text" />

<xsl:key name="x" match="*" use="." />

<xsl:template match="/list">
    <xsl:variable name="rtf">
        <xsl:for-each select="*">
            <xsl:call-template name="tokenize">
                <xsl:with-param name="string" select="@ids" />
                <xsl:with-param name="delim" select="','" />
            </xsl:call-template>
        </xsl:for-each>
    </xsl:variable>

    <xsl:for-each select="common:node-set($rtf)/token[generate-id() = generate-id(key('x', .)[1])]">
        <xsl:sort select="." />
        <xsl:value-of select="." />: <xsl:value-of select="count(key('x', .))" /><xsl:text>&#xa;</xsl:text>
    </xsl:for-each>
</xsl:template>

<xsl:template name="tokenize">
<xsl:param name="string" />
<xsl:param name="delim" />

<xsl:choose>
    <xsl:when test="contains($string, $delim)">
        <token><xsl:value-of select="substring-before($string, $delim)" /></token>
        <xsl:call-template name="tokenize">
            <xsl:with-param name="string" select="substring-after($string, $delim)" />
            <xsl:with-param name="delim" select="$delim" />
        </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
        <token><xsl:value-of select="$string" /></token>
    </xsl:otherwise>
</xsl:choose>
</xsl:template>

</xsl:stylesheet>

Michael Kay suggested the following XSLT 2.0 solution:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="text" />

<xsl:template match="/">
    <xsl:for-each-group select="list/*/@ids/tokenize(.,',')" group-by=".">
        <xsl:sort select="current-grouping-key()"/>
        <xsl:value-of select="current-grouping-key()"/>
        <xsl:text>: </xsl:text>
        <xsl:value-of select="count(current-group())"/><xsl:text>&#xa;</xsl:text>
    </xsl:for-each-group>
</xsl:template>

</xsl:stylesheet>
 

32. Positional grouping

The following question was asked on XSL-List.

I want to wrap a script from a play in an XML file:

<play>
  <scene>Scene 1</scene>
  <character>char 1</character>
  <line>blah blah blah</line>
  <line>blah blah blah</line>
  <line>blah blah blah</line>
  <character>char 2</character>
  <line>blah blah blah</line>
  <line>blah blah blah</line>
  <line>blah blah blah</line>
  <character>char 3</character>
  <line>blah blah blah</line>
  <line>blah blah blah</line>
  <line>blah blah blah</line>
  <scene>Scene 2</scene>
  <character>char 1</character>
  <line>blah blah blah</line>
  <line>blah blah blah</line>
  <line>blah blah blah</line>
</play>

The desired output is:

<play>
  <scene name="Scene 1">
    <character name="char 1">
      <line>blah blah blah</line>
      <line>blah blah blah</line>
      <line>blah blah blah</line>
    </character>
    <character name="char 2">
      <line>blah blah blah</line>
      <line>blah blah blah</line>
      <line>blah blah blah</line>
    </character>
    <character name="char 3">
      <line>blah blah blah</line>
      <line>blah blah blah</line>
      <line>blah blah blah</line>
    </character>
  </scene>
  <scene name="Scene 2">
    <character name="char 1">
      <line>blah blah blah</line>
      <line>blah blah blah</line>
      <line>blah blah blah</line>
    </character>
  </scene>
</play>

Michael Kay suggested:

In XSLT 2.0 it looks like this:

<xsl:template match="play">
  <xsl:for-each-group select="*" group-starting-with="scene">
    <scene name="{.}">
      <xsl:for-each-group select="current-group()" group-starting-with="character">
        <character name="{.}">
          <xsl:copy-of select="current-group()[self::line]"/>
        </character>
      </xsl:for-each-group>
    </scene>
  </xsl:for-each-group>
</xsl:template>

In 1.0 it's much more difficult: the two general approaches are (a) to treat it as a value-based grouping problem, which you can tackle with Muenchian grouping, using something like generate-id(preceding-sibling::scene[1]) as the grouping key, or (b) to do a recursive traversal over the siblings, using apply-templates select="following-sibling::*[1]" to achieve the recursion, and terminating each level of recursion when there are no more elements on the same logical level of the hierarchy.
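
Approach (a) is not spelled out in code above, so here is a minimal XSLT 1.0 sketch of how it might look for the play example. The key names chars-by-scene and lines-by-char are my own choice (they do not come from the thread), and the sketch assumes that every scene is followed directly by a character, so that no line sits between a scene and its first character:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes" />

<!-- hypothetical key names: index each character by the id of its nearest preceding scene -->
<xsl:key name="chars-by-scene" match="character" use="generate-id(preceding-sibling::scene[1])" />

<!-- index each line by the id of its nearest preceding character -->
<xsl:key name="lines-by-char" match="line" use="generate-id(preceding-sibling::character[1])" />

<xsl:template match="/play">
  <play>
    <xsl:for-each select="scene">
      <scene name="{.}">
        <!-- all characters whose nearest preceding scene is the current scene -->
        <xsl:for-each select="key('chars-by-scene', generate-id())">
          <character name="{.}">
            <!-- all lines whose nearest preceding character is the current character -->
            <xsl:copy-of select="key('lines-by-char', generate-id())" />
          </character>
        </xsl:for-each>
      </scene>
    </xsl:for-each>
  </play>
</xsl:template>

</xsl:stylesheet>

Because every scene and every character starts exactly one group, the usual Muenchian "first node in the group" test is not needed here; the keys are only used to fetch the members of each group.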

I provided the following XSLT 1.0 based solution:

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes" />

<xsl:template match="/play">
  <play>
    <xsl:apply-templates select="scene" />
  </play>
</xsl:template>

<xsl:template match="scene">
  <scene name="{.}">
    <xsl:apply-templates select="following-sibling::character">
      <xsl:with-param name="gen-id" select="generate-id()" />
    </xsl:apply-templates>
  </scene>
</xsl:template>

<xsl:template match="character">
  <xsl:param name="gen-id" />

  <xsl:if test="$gen-id = generate-id(preceding-sibling::scene[1])">
     <character name="{.}">
       <xsl:apply-templates select="following-sibling::line">
         <xsl:with-param name="gen-id" select="generate-id()" />
       </xsl:apply-templates>
     </character>
  </xsl:if>
</xsl:template>

<xsl:template match="line">
  <xsl:param name="gen-id" />

  <xsl:if test="$gen-id = generate-id(preceding-sibling::character[1])">
     <xsl:copy-of select="." />
  </xsl:if>
</xsl:template>

</xsl:stylesheet>

Michael expressed the following opinion on the above approach:

This solution is likely to be O(n^2) with respect to the number of lines in the play, which could be rather large. A recursive traversal that uses apply-templates select="following-sibling::*[1]" would be O(n).

Michael provided the following XSLT 1.0 based solution, illustrating his idea of "sibling recursion":

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes" />

<xsl:template match="/play">
  <play>
    <xsl:apply-templates select="scene" />
  </play>
</xsl:template>

<xsl:template match="scene">
  <scene name="{.}">
    <xsl:apply-templates select="following-sibling::character[1]"/>
  </scene>
</xsl:template>

<xsl:template match="character">
  <character name="{.}">
    <xsl:apply-templates select="following-sibling::*[1][self::line]"/>
  </character>
  <xsl:apply-templates select="following-sibling::*[self::character|self::scene][1][self::character]"/>
</xsl:template>

<xsl:template match="line">
   <xsl:copy-of select="."/>
   <xsl:apply-templates select="following-sibling::*[1][self::line]"/>
</xsl:template>

</xsl:stylesheet>

Michael further said:
I call the technique "sibling recursion". It's more concise than most recursive code, because it's self-terminating: you don't need to test explicitly whether there are more items to process, because apply-templates does that automatically for you.
 

33. Positional grouping methods

Positional grouping is a class of problems in which it is necessary to convert a flat sequence into a hierarchy by recognizing patterns in the sequence of items. The allocation of items to groups is based on positional relationships of the items in the sequence.

Here I describe both XSLT 1.0 and XSLT 2.0 solutions.

Let's say we have the following XML file:

<html>
  <h1>heading</h1>
  <p>text</p>
  <h1>heading..</h1>
  <p>text..</p>
  <p>text....</p>
  <h1>heading....</h1>
  <p>text....</p>
  <p>text......</p>
  <p>text........</p>
</html>

The desired output of the XSLT transformation is:

<result>
  <group label="heading">
    <p>text</p>
  </group>
  <group label="heading..">
    <p>text..</p>
    <p>text....</p>
  </group>
  <group label="heading....">
    <p>text....</p>
    <p>text......</p>
    <p>text........</p>
  </group>
</result>

Each group contains the <p> elements that follow an <h1> element, up to the next <h1>. The text of the <h1> becomes the label of the group.

Here are the possible solutions:

[1] The sibling recursion technique, suggested by Michael Kay (XSLT 1.0).

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes"/>

<xsl:template match="/html">
  <result>
     <xsl:apply-templates select="h1" />
  </result>
</xsl:template>

<xsl:template match="h1">
  <group label="{.}">
     <xsl:apply-templates select="following-sibling::*[1][self::p]" />
  </group>
</xsl:template>

<xsl:template match="p">
   <xsl:copy-of select="." />
   <xsl:apply-templates select="following-sibling::*[1][self::p]" />
</xsl:template>

</xsl:stylesheet>

[2] This technique is based on a recursive named template (XSLT 1.0).

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes"/>

<xsl:template match="/html">
   <result>
     <xsl:apply-templates select="h1" />
   </result>
</xsl:template>

<xsl:template match="h1">
   <group label="{.}">
     <xsl:call-template name="group">
       <xsl:with-param name="node-set" select="following-sibling::*" />
     </xsl:call-template>
   </group>
</xsl:template>

<xsl:template name="group">
<xsl:param name="node-set" />

<xsl:if test="$node-set[1][self::p]">
    <xsl:copy-of select="$node-set[1]" />
    <xsl:call-template name="group">
       <xsl:with-param name="node-set" select="$node-set[position() &gt; 1]" />
    </xsl:call-template>
 </xsl:if>
</xsl:template>

</xsl:stylesheet>

[3] This technique uses the generate-id() function to identify groups (XSLT 1.0).

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes"/>

<xsl:template match="/html">
  <result>
    <xsl:apply-templates select="h1" />
  </result>
</xsl:template>

<xsl:template match="h1">
  <xsl:variable name="gen-id" select="generate-id()" />
  <group label="{.}">
    <xsl:for-each select="following-sibling::p[$gen-id = generate-id(preceding-sibling::h1[1])]">
      <xsl:copy-of select="." />
    </xsl:for-each>
  </group>
</xsl:template>

</xsl:stylesheet>

[4] This technique uses the xsl:for-each-group instruction (XSLT 2.0).

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes"/>

<xsl:template match="/html">
  <result>
    <xsl:for-each-group select="*" group-starting-with="h1">
      <group label="{.}">
        <xsl:copy-of select="current-group()[position() &gt; 1]" />
      </group>
    </xsl:for-each-group>
  </result>
</xsl:template>

</xsl:stylesheet>

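Among the XSLT 1.0 solutions, the sibling recursion technique [1] is the most efficient, because xsl:apply-templates visits one element at a time and the recursion stops automatically when the group ends. Technique [3], by contrast, re-tests every following <p> sibling for each <h1>, so it is likely to be O(n^2) on large inputs.

With XSLT 2.0, technique [4] is the obvious choice.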
 

34. Muenchian grouping example

The following question was asked on XSL-List.

I have the following XML document for my input:

<?xml version="1.0" encoding="UTF-8"?>
<Root>
  <year>2006</year>
  <year>2007</year>
  <year>2008</year>
  <week>1</week>
  <day>01</day>
  <day>02</day>
</Root>

My expected output is:

<Root>
  <yearList>
    <year>2006</year>
    <year>2007</year>
    <year>2008</year>
  </yearList>
  <week>1</week>
  <dayList>
    <day>01</day>
    <day>02</day>
  </dayList>
</Root>

I want to group the elements using XSLT 1.0. The element names are not fixed; they can vary. When an element name occurs more than once, those elements need a parent element whose name is the element's name with the string "List" appended (e.g. yearList).

The XSLT 1.0 solution for this is:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes" />

<xsl:key name="x" match="Root/*" use="local-name()" />

<xsl:template match="/Root">
  <Root>
    <xsl:for-each select="*[generate-id() = generate-id(key('x', local-name())[1])]">
      <xsl:choose>
        <xsl:when test="count(key('x', local-name())) &gt; 1">
          <xsl:element name="{local-name()}List">
            <xsl:copy-of select="key('x', local-name())" />
          </xsl:element>
        </xsl:when>
        <xsl:otherwise>
          <xsl:copy-of select="key('x', local-name())" />
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
  </Root>
</xsl:template>

</xsl:stylesheet>
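
The question asked for XSLT 1.0, but for comparison here is a sketch of how the same grouping might be written in XSLT 2.0 with xsl:for-each-group. It is my own adaptation of the stylesheet above, not something posted in the thread, and it makes the same assumption that grouping is by local name only:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes" />

<xsl:template match="/Root">
  <Root>
    <!-- one group per distinct element name, in order of first appearance -->
    <xsl:for-each-group select="*" group-by="local-name()">
      <xsl:choose>
        <!-- more than one element with this name: wrap them in a "...List" parent -->
        <xsl:when test="count(current-group()) &gt; 1">
          <xsl:element name="{current-grouping-key()}List">
            <xsl:copy-of select="current-group()" />
          </xsl:element>
        </xsl:when>
        <!-- a single element is copied as it is -->
        <xsl:otherwise>
          <xsl:copy-of select="current-group()" />
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </Root>
</xsl:template>

</xsl:stylesheet>

The key declaration and the generate-id() comparison disappear; current-group() plays the role of key('x', local-name()).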
 

35. Muenchian grouping example (with multiple levels)

The following question was asked on XSL-List.

I need to turn this:

<test>
  <group>
     <player name="joe" position="pitcher" team="mets" state="ny">2</player>
     <player name="mark" position="outfielder" team="mets" state="ny">11</player>
     <player name="john" position="pitcher" team="mets" state="ny">23</player>
     <player name="pete" position="outfielder" team="mets" state="ny">27</player>
     <player name="roy" position="outfielder" team="mets" state="ny">13</player>
     <player name="carl" position="infielder" team="mets" state="ny">32</player>
  </group>
</test>

Into something like this:

<?xml version="1.0" encoding="UTF-8"?>
<group>
  <state name="ny">
    <team name="mets">
      <position name="pitcher">
        <player name="joe" number="2"/>
        <player name="john" number="23"/>
      </position>
      <position name="outfielder">
        <player name="mark" number="11"/>
        <player name="pete" number="27"/>
        <player name="roy" number="13"/>
      </position>
      <position name="infielder">
        <player name="carl" number="32"/>
      </position>
    </team>
  </state>
</group>

There can be many states, many teams and many positions.

The following XSLT 1.0 stylesheet solves this problem:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes" />

<xsl:key name="x" match="player" use="@state"/>
<xsl:key name="y" match="player" use="concat(@state, ':', @team)"/>
<xsl:key name="z" match="player" use="concat(@state, ':', @team, ':', @position)"/>

<xsl:template match="/test">
   <group>
      <xsl:for-each select="group/player[generate-id() = generate-id(key('x', @state)[1])]">
         <xsl:variable name="state" select="@state" />
         <state name="{@state}">
            <xsl:for-each select="../player[generate-id() = generate-id(key('y', concat($state, ':', @team))[1])]">
               <xsl:variable name="team" select="@team" />
               <team name="{@team}">
                  <xsl:for-each select="../player[generate-id() = generate-id(key('z', concat($state, ':', $team, ':', @position))[1])]">
                     <position name="{@position}">
                        <xsl:for-each select="key('z', concat($state, ':', $team, ':', @position))">
                           <player name="{@name}" number="{.}"/>
                        </xsl:for-each>
                     </position>
                  </xsl:for-each>
               </team>
            </xsl:for-each>
         </state>
      </xsl:for-each>
   </group>
</xsl:template>

</xsl:stylesheet>
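
For comparison, the three composite keys can be avoided in XSLT 2.0 by nesting xsl:for-each-group instructions. The following is only a sketch of that idea (my own rewrite of the stylesheet above, assuming the same input structure); each inner instruction groups the members of the enclosing group, so no concatenated key values are needed:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes" />

<xsl:template match="/test">
  <group>
    <!-- level 1: group all players by state -->
    <xsl:for-each-group select="group/player" group-by="@state">
      <state name="{current-grouping-key()}">
        <!-- level 2: within one state, group by team -->
        <xsl:for-each-group select="current-group()" group-by="@team">
          <team name="{current-grouping-key()}">
            <!-- level 3: within one team, group by position -->
            <xsl:for-each-group select="current-group()" group-by="@position">
              <position name="{current-grouping-key()}">
                <xsl:for-each select="current-group()">
                  <player name="{@name}" number="{.}"/>
                </xsl:for-each>
              </position>
            </xsl:for-each-group>
          </team>
        </xsl:for-each-group>
      </state>
    </xsl:for-each-group>
  </group>
</xsl:template>

</xsl:stylesheet>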
 

36. Positional grouping problem

The following question was asked on XSL-List.

I have the following input XML:

<root>
    <a id="1"/>
    <a id="2"/>
    <b/>
    <d/>
    <g/>
    <a id="3"/>
    <a id="4"/>
    <a id="5"/>
    <x/>
    <a id="6"/>
    <a id="7"/>
</root>

and I'm trying to create an XSLT 1.0 script that nests "uninterrupted" groups of sibling 'a' elements into 'a-block' elements, so that the output looks like:

<root>
    <a-block>
        <a id="1"/>
        <a id="2"/>
    </a-block>
    <b/>
    <d/>
    <g/>
    <a-block>
        <a id="3"/>
        <a id="4"/>
        <a id="5"/>
    </a-block>
    <x/>
    <a-block>
        <a id="6"/>
        <a id="7"/>
    </a-block>
</root>

Martin Honnen provided the following answer:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output method="xml" indent="yes"/>

<xsl:key name="a-group" match="a" use="generate-id(preceding-sibling::*[not(self::a)][1])"/>

<xsl:template match="root">
    <xsl:copy>
        <xsl:apply-templates select="*"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="a[preceding-sibling::*[1][not(self::a)] or not(preceding-sibling::*)]">
    <a-block>
        <xsl:apply-templates select="key('a-group', generate-id(preceding-sibling::*[not(self::a)][1]))" mode="copy"/>
    </a-block>
</xsl:template>

<xsl:template match="a[preceding-sibling::*[1][self::a]]"/>

<xsl:template match="a" mode="copy">
    <xsl:copy>
        <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="* | @*">
    <xsl:copy>
        <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>

G. Ken Holman provided the following answer:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output method="xml" indent="yes"/>

<!--identity transform for all nodes other than 'a'-->
<xsl:template match="node() | @*">
    <xsl:copy>
        <xsl:apply-templates select="node() | @*" />
    </xsl:copy>
</xsl:template>

<!--the first a-->
<xsl:template match="a[not(preceding-sibling::*[1][self::a])]">
    <a-block>
        <xsl:apply-templates select="." mode="nested"/>
    </a-block>
</xsl:template>

<!--other a's when nested-->
<xsl:template match="a" mode="nested">
    <xsl:copy-of select="."/>
    <xsl:apply-templates select="following-sibling::*[1][self::a]" mode="nested"/>
</xsl:template>

<!--other a's when not nested can be ignored-->
<xsl:template match="a"/>

</xsl:stylesheet>

(Ken actually helped me write this stylesheet.)
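
The question asked for XSLT 1.0, but if XSLT 2.0 is available, this kind of problem is a natural fit for the group-adjacent attribute of xsl:for-each-group. The following is a sketch of my own (not from the thread), assuming that root contains only element children that matter, as in the sample input:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

<xsl:output method="xml" indent="yes"/>

<xsl:template match="root">
    <xsl:copy>
        <!-- adjacent siblings fall into the same group while the key (is it an 'a'?) stays the same -->
        <xsl:for-each-group select="*" group-adjacent="boolean(self::a)">
            <xsl:choose>
                <!-- a run of 'a' elements: wrap it -->
                <xsl:when test="current-grouping-key()">
                    <a-block>
                        <xsl:copy-of select="current-group()"/>
                    </a-block>
                </xsl:when>
                <!-- a run of other elements: copy unchanged -->
                <xsl:otherwise>
                    <xsl:copy-of select="current-group()"/>
                </xsl:otherwise>
            </xsl:choose>
        </xsl:for-each-group>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>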


37. Getting unique values

The following question was asked on XSL-List.

I have to get the unique Products from the input below:

<Proposal>
    <Quote>
        <QuoteId>1</QuoteId>
        <Products>
            <Product>
                <ProdID>
                    1234
                </ProdID>
                <ProdID>
                    5678
                </ProdID>
            </Product>
        </Products>
    </Quote>
    <Quote>
        <QuoteId>2</QuoteId>
        <Products>
            <Product>
                <ProdID>
                    1234
                </ProdID>
                <ProdID>
                    5678
                </ProdID>
            </Product>
        </Products>
    </Quote>
</Proposal>

Desired output is:

<Proposal>
    <Products>
        <ProdId>1234</ProdId>
        <ProdId>5678</ProdId>
    </Products>
</Proposal>

The following is an XSLT 2.0 solution:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

<xsl:output method="xml" indent="yes" />

<xsl:template match="/">
    <Proposal>
        <Products>
            <xsl:for-each select="distinct-values(for $x in //ProdID return normalize-space($x))">
                <ProdId><xsl:value-of select="." /></ProdId>
            </xsl:for-each>
        </Products>
    </Proposal>
</xsl:template>

</xsl:stylesheet>
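
If only XSLT 1.0 is available, the same result can probably be obtained with the Muenchian method. Here is a minimal sketch; the key name prod-by-id is my own choice, and normalize-space() is used in the key so that the whitespace around the values in the input does not matter:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output method="xml" indent="yes" />

<!-- hypothetical key name: index every ProdID by its whitespace-normalized value -->
<xsl:key name="prod-by-id" match="ProdID" use="normalize-space(.)" />

<xsl:template match="/">
    <Proposal>
        <Products>
            <!-- keep only the first ProdID for each distinct value -->
            <xsl:for-each select="//ProdID[generate-id() = generate-id(key('prod-by-id', normalize-space(.))[1])]">
                <ProdId><xsl:value-of select="normalize-space(.)" /></ProdId>
            </xsl:for-each>
        </Products>
    </Proposal>
</xsl:template>

</xsl:stylesheet>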
 

38. Counting path occurrences

The following question was asked on XSL-List.

I'm trying to write a generic stylesheet to count occurrences of full paths to elements.

For example, I'd like input like this:

<a>
    <b>
        <x>
            <w>blah</w>
        </x>
        <y>bleh</y>
    </b>
    <b>
        <x>
            <w>blih</w>
        </x>
        <x>
            <w>bloh</w>
        </x>
    </b>
    <c>
        <w>blwh</w>
        <w>blyh</w>
    </c>
</a>

To generate this output:

/a - 1
/a/b - 2
/a/b/x - 3
/a/b/x/w - 3
/a/b/y - 1
/a/c - 1
/a/c/w - 2

Andrew Welch suggested the following XSLT 2.0 solution:

<xsl:for-each-group select="//*/string-join(ancestor-or-self::*/name(), '/')" group-by=".">
    <xsl:sequence select="concat('&#xa;', '/', ., ' - ', count(current-group()))"/>
</xsl:for-each-group>
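
Andrew's answer is a fragment. To try it out, it needs to be placed inside a template of a complete stylesheet. The wrapper below is a sketch of my own, assuming text output is wanted. I have used xsl:value-of with a trailing newline instead of xsl:sequence, because adjacent strings returned by xsl:sequence get separated by a space when the result is built, which may not be wanted in the output:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="text" />

<xsl:template match="/">
    <!-- build one path string per element, then group identical strings -->
    <xsl:for-each-group select="//*/string-join(ancestor-or-self::*/name(), '/')" group-by=".">
        <!-- one line per distinct path: the path, then the number of elements sharing it -->
        <xsl:value-of select="concat('/', ., ' - ', count(current-group()), '&#xa;')"/>
    </xsl:for-each-group>
</xsl:template>

</xsl:stylesheet>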




Last Updated: Dec 27, 2009