Retrieving Dynamic XML from SQL Server
Introduction
This article will
describe the different ways of retrieving XML documents, highlight my
experiences, and list the differences between them.
Three flavors of XML Output
To allow the retrieval
of data in the XML format from SQL Server, the FOR XML command has been
added to the T-SQL syntax. Using the FOR XML command with a sequel query
allows the results to be generated as XML.
The command allows three variants
1. RAW
2. AUTO
3. EXPLICIT
If we take the following
SQL Query (see Figure 1), which uses the Pubs sample database shipped
with the SQL Server, and apply it to each of the modes (or variants), we
gain an understanding of the various types of XML output that each mode
is capable of producing.
RAW Mode
Query
SELECT '<rows>'+(
SELECT S.State_Id,
S.State_Code,
S.State_Name,
C.City_Id,
C.City_Code,
C.City_Name
FROM tblStates S
INNER JOIN tblCities C
ON C.State_Id = S.State_Id
FOR XML RAW
)+'</rows>'
Output
Here is a snippet of the structure generated by the RAW mode.
<rows>
<row State_Id="1" State_Code="AP" State_Name="Andhra Pradesh" City_Id="1" City_Code="NLR"City_Name="Nellore" />
<row State_Id="1" State_Code="AP" State_Name="Andhra Pradesh" City_Id="2" City_Code="TPT"City_Name="Tirupati" />
<row State_Id="1" State_Code="AP" State_Name="Andhra Pradesh" City_Id="3" City_Code="HYD"City_Name="Hyderabad" />
<row State_Id="1" State_Code="AP" State_Name="Andhra Pradesh" City_Id="4" City_Code="KRN"City_Name="Karnool" />
<row State_Id="2" State_Code="TN" State_Name="Tamil Nadu" City_Id="5" City_Code="CHN" City_Name="Chennai" />
<row State_Id="2" State_Code="TN" State_Name="Tamil Nadu" City_Id="6" City_Code="MDR"City_Name="Madhurai" />
<row State_Id="2" State_Code="TN" State_Name="Tamil Nadu" City_Id="7" City_Code="KBR"City_Name="Koimbattore" />
<row State_Id="3" State_Code="KA" State_Name="Kerala" City_Id="8" City_Code="TVD" City_Name="Trivendram" />
<row State_Id="4" State_Code="KT" State_Name="Karnataka" City_Id="9" City_Code="BGR" City_Name="Banglore" />
<row State_Id="4" State_Code="KT" State_Name="Karnataka" City_Id="10" City_Code="MGR"City_Name="Manglore" />
<row State_Id="4" State_Code="KT" State_Name="Karnataka" City_Id="11" City_Code="MSR" City_Name="Mysore" />
<row State_Id="4" State_Code="KT" State_Name="Karnataka" City_Id="12" City_Code="BLR" City_Name="Belur" />
<row State_Id="4" State_Code="KT" State_Name="Karnataka" City_Id="13" City_Code="SRP" City_Name="Sri Ranga Patnam" />
</rows>
The XML document produced contains an element <Row>, which is fixed, for each record of the result set generated by Query. This is not very useful because we have no control over the element naming and document structure. The RAW mode falls short of the XML document I am looking to create, and really has limited use. This query forms all data into attributes format.
Query
SELECT '<rows>'+(
SELECT S.State_Id,
S.State_Code,
S.State_Name,
C.City_Id,
C.City_Code,
C.City_Name
FROM tblStates S
INNER JOIN tblCities C
ON C.State_Id = S.State_Id
FOR XML RAW ('City'), ELEMENTS
)+'</rows>'
Output
<rows>
<City>
<State_Id>1</State_Id>
<State_Code>AP</State_Code>
<State_Name>Andhra Pradesh</State_Name>
<City_Id>1</City_Id>
<City_Code>NLR</City_Code>
<City_Name>Nellore</City_Name>
</City>
<City>
<State_Id>1</State_Id>
<State_Code>AP</State_Code>
<State_Name>Andhra Pradesh</State_Name>
<City_Id>3</City_Id>
<City_Code>HYD</City_Code>
<City_Name>Hyderabad</City_Name>
</City>
<City>
<State_Id>1</State_Id>
<State_Code>AP</State_Code>
<State_Name>Andhra Pradesh</State_Name>
<City_Id>4</City_Id>
<City_Code>KRN</City_Code>
<City_Name>Karnool</City_Name>
</City>
<City>
<State_Id>2</State_Id>
<State_Code>TN</State_Code>
<State_Name>Tamil Nadu</State_Name>
<City_Id>7</City_Id>
<City_Code>KBR</City_Code>
<City_Name>Koimbattore</City_Name>
</City>
</rows>
The above query generates an xml with nodes for item in data.
AUTO Mode
Query
SELECT '<rows>'+(
SELECT [State].State_Id,
[State].State_Code,
[State].State_Name,
[City].City_Id,
[City].City_Code,
[City].City_Name
FROM tblStates [State]
INNER JOIN tblCities [City]
ON [City].State_Id = [State].State_Id
FOR XML AUTO
)+'</rows>'
Output
Below is a cross section of the output structure generated by the AUTO mode.
<rows>
<State State_Id="1" State_Code="AP" State_Name="Andhra Pradesh">
<City City_Id="1" City_Code="NLR" City_Name="Nellore" />
<City City_Id="2" City_Code="TPT" City_Name="Tirupati" />
<City City_Id="3" City_Code="HYD" City_Name="Hyderabad" />
<City City_Id="4" City_Code="KRN" City_Name="Karnool" />
</State>
<State State_Id="2" State_Code="TN" State_Name="Tamil Nadu">
<City City_Id="5" City_Code="CHN" City_Name="Chennai" />
<City City_Id="6" City_Code="MDR" City_Name="Madhurai" />
<City City_Id="7" City_Code="KBR" City_Name="Koimbattore" />
</State>
<State State_Id="3" State_Code="KA" State_Name="Kerala">
<City City_Id="8" City_Code="TVD" City_Name="Trivendram" />
</State>
<State State_Id="4" State_Code="KT" State_Name="Karnataka">
<City City_Id="9" City_Code="BGR" City_Name="Banglore" />
<City City_Id="10" City_Code="MGR" City_Name="Manglore" />
<City City_Id="11" City_Code="MSR" City_Name="Mysore" />
<City City_Id="12" City_Code="BLR" City_Name="Belur" />
<City City_Id="13" City_Code="SRP" City_Name="Sri Ranga Patnam" />
</State>
</rows>
As you can see the <State> and
<City> tags have a parent-child relationship, giving us the
hierarchical structure we require. This node relationship is determined
on the order in which the tables are declared within the query, with
each table declared in the sequence becoming a child of the previously
declared table.
Look back at Query and notice the aliases
given within the query affect the names generated within the XML
document. Using this, we can control the labeling of the XML elements
and attributes, and achieve the naming convention we require.
The AUTO mode allows us to create the XML document we want, but the following disadvantages become apparent with further use:
- We can create hierarchical structures, but only in a linear fashion, as a parent node can only have one child and vice versa.
- Using
aliases to create element and attribute names can become tedious and is
sometimes counterproductive to the readability of the query.
- We
cannot have both attributes and elements within the document. Its
either all elements, specified using the ELEMENTS keyword, or the
default, which is attributes.
These shortcomings are addressed within EXPLICIT mode.