Parsing XML from the Net - Using the SAXParser
What you learn: You will learn how to properly parse XML (here: from the net) using a SAXParser.

Difficulty: 1 of 5

Description:
0.) In this tutorial we are going to parse the following XML file, located at images/tut/basic/parsingxml/example.xml:
    <?xml version="1.0"?>
    <outertag>
        <innertag sampleattribute="innertagAttribute">
            <mytag>
                anddev.org rulez =)
            </mytag>
            <tagwithnumber thenumber="1337"/>
        </innertag>
    </outertag>
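To accomplish the parsing, we are going to use a SAX parser. SAX stands for "Simple API for XML": the parser reads the document as a stream and reports every tag and every piece of text to a handler as it goes, so it is perfect for us.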
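To make the event idea a bit more concrete, here is a minimal sketch in plain Java that runs without Android. It parses a shortened copy of the example XML from a String and simply records which callbacks fire, in order. The names SaxEventLog and eventsFor are made up for this illustration and are not part of any library.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper (not a library class): logs the SAX callbacks for an XML string. */
final class SaxEventLog {

    static List<String> eventsFor(String xml) {
        final List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // qName is used here because it is filled in even when the
                // parser is not namespace-aware (the JDK default).
                events.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return events;
    }

    public static void main(String[] args) {
        String xml = "<outertag><innertag><mytag>anddev.org rulez =)</mytag>"
                + "<tagwithnumber thenumber=\"1337\"/></innertag></outertag>";
        for (String event : eventsFor(xml)) {
            System.out.println(event);
        }
    }
}
```

The events come out in document order, and every "end" event closes the most recently opened tag. That is all the information a handler has to work with.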

1.) Let's take a look at the onCreate(...) method. It opens a URL, creates a SAXParser, attaches a ContentHandler to its XMLReader, parses the data from the URL and displays the results in a TextView.
    /** Called when the activity is first created. */
    @Override
    public void onCreate(Bundle icicle) {
        super.onCreate(icicle);
        /* Create a new TextView to display the parsing result later. */
        TextView tv = new TextView(this);
        try {
            /* Create a URL we want to load some XML data from. */
            URL url = new URL("http://www.anddev.org/images/tut/basic/parsingxml/example.xml");
            /* Get a SAXParser from the SAXParserFactory. */
            SAXParserFactory spf = SAXParserFactory.newInstance();
            SAXParser sp = spf.newSAXParser();
            /* Get the XMLReader of the SAXParser we created. */
            XMLReader xr = sp.getXMLReader();
            /* Create a new ContentHandler and apply it to the XMLReader. */
            ExampleHandler myExampleHandler = new ExampleHandler();
            xr.setContentHandler(myExampleHandler);
            /* Parse the XML data from our URL. */
            xr.parse(new InputSource(url.openStream()));
            /* Parsing has finished. */
            /* Our ExampleHandler now provides the parsed data to us. */
            ParsedExampleDataSet parsedExampleDataSet =
                    myExampleHandler.getParsedData();
            /* Set the result to be displayed in our GUI. */
            tv.setText(parsedExampleDataSet.toString());
        } catch (Exception e) {
            /* Display any error in the GUI. */
            tv.setText("Error: " + e.getMessage());
            Log.e(MY_DEBUG_TAG, "WeatherQueryError", e);
        }
        /* Display the TextView. */
        this.setContentView(tv);
    }
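The parsing part of onCreate(...) can also be pulled out into a small helper, which makes it easier to close the stream and to run the download somewhere other than the UI thread (newer Android versions generally refuse network access on the main thread). The following is only a sketch under that assumption: XmlStreams and its parse(...) method are hypothetical names of mine, not an Android or SAX API.

```java
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: parses a stream with the given handler and always closes the stream. */
final class XmlStreams {

    static <H extends DefaultHandler> H parse(InputStream in, H handler) {
        // try-with-resources closes the stream even if parsing fails.
        try (InputStream stream = in) {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(stream));
            // Hand the same handler back so the caller can read its results.
            return handler;
        } catch (IOException | SAXException | ParserConfigurationException e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
    }
}
```

In the Activity this could be called from a background thread as XmlStreams.parse(url.openStream(), new ExampleHandler()), with only the final tv.setText(...) happening back on the UI thread.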
2.) The next class to take a look at is the ExampleHandler, which extends org.xml.sax.helpers.DefaultHandler. A SAX handler is a really simple class; we just need to override a few callback methods.
The SAXParser 'walks' through the XML file from beginning to end, and whenever it reaches an opening tag like:
    <outertag>
the handler method:
    @Override
    public void startElement(String namespaceURI, String localName,
            String qName, Attributes atts) throws SAXException {
    }
gets called. In this case localName will be "outertag".
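One thing to watch out for if you try this code outside of Android: on a desktop JVM the SAXParserFactory is not namespace-aware by default, and then localName can arrive empty while qName carries the tag name. A small sketch of two ways around that, either switching namespace awareness on or falling back to qName, could look like this. ElementNames, nameOf and openingTags are hypothetical names chosen for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper showing how to get a usable tag name in both parser modes. */
final class ElementNames {

    /** Prefers localName and falls back to qName when localName is missing. */
    static String nameOf(String localName, String qName) {
        return (localName == null || localName.isEmpty()) ? qName : localName;
    }

    /** Collects the names of all opening tags, in document order. */
    static List<String> openingTags(String xml, boolean namespaceAware) {
        final List<String> names = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            // With 'true' the parser fills in localName for every element.
            factory.setNamespaceAware(namespaceAware);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String localName,
                                String qName, Attributes atts) {
                            names.add(nameOf(localName, qName));
                        }
                    });
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return names;
    }
}
```

With the fallback in place the handler sees "outertag" regardless of how the factory was configured.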
The same happens on closing tags, like:
    </outertag>
Here is the equivalent 'closing' method that gets called:
    @Override
    public void endElement(String namespaceURI, String localName, String qName)
            throws SAXException {
    }
In XML you can put text between an opening and a closing tag, like this:
    <mytag>
        anddev.org rulez =)
    </mytag>
When the parser reaches such text, the following method gets called, providing the characters between the opening and the closing tag:
    /** Gets called on the following structure:
     * <tag>characters</tag> */
    @Override
    public void characters(char[] ch, int start, int length) {
        String textBetween = new String(ch, start, length);
    }
Finally, at the start and the end of each document the following methods get called:
    @Override
    public void startDocument() throws SAXException {
        // Do some startup if needed
    }

    @Override
    public void endDocument() throws SAXException {
        // Do some finishing work if needed
    }
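A typical use for startDocument() is resetting the handler's state, so that one handler instance can be reused for several documents. Here is a small sketch of that idea, runnable in plain Java. ElementCounter and countElements are hypothetical names made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler that counts elements and starts from zero for every document. */
final class ElementCounter extends DefaultHandler {

    private int count;

    @Override
    public void startDocument() {
        // Reset here so that results of a previous parse do not leak into this one.
        count = 0;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        count++;
    }

    int getCount() {
        return count;
    }

    /** Parses the given XML with the given (possibly reused) counter. */
    static int countElements(ElementCounter counter, String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), counter);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return counter.getCount();
    }
}
```

Without the reset in startDocument(), a second parse with the same instance would keep adding to the old count.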
3.) What was shown up to here was just the basic structure of a SAX handler. Now I'll show you a common way to write a real-life handler implementation, which is also pretty simple. We probably want to know how 'deep' the parser has parsed so far, so we create some booleans indicating which tags are currently open:
    // ===========================================================
    // Fields
    // ===========================================================
    private boolean in_outertag = false;
    private boolean in_innertag = false;
    private boolean in_mytag = false;
As we know, when the parser reaches an opening tag the startElement(...) method gets called, so we simply check the localName and set the corresponding "in_xyz" boolean to true. For <tagwithnumber> we read the "thenumber" attribute and store it as an int:
    @Override
    public void startElement(String namespaceURI, String localName,
            String qName, Attributes atts) throws SAXException {
        if (localName.equals("outertag")) {
            this.in_outertag = true;
        } else if (localName.equals("innertag")) {
            this.in_innertag = true;
        } else if (localName.equals("mytag")) {
            this.in_mytag = true;
        } else if (localName.equals("tagwithnumber")) {
            // Extract an attribute
            String attrValue = atts.getValue("thenumber");
            int i = Integer.parseInt(attrValue);
            myParsedExampleDataSet.setExtractedInt(i);
        }
    }
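Note that atts.getValue(...) returns null when the attribute is missing, and Integer.parseInt(...) throws a NumberFormatException on anything that is not a number. If the XML comes from a server you don't control, a slightly more defensive version might be useful. This is just a sketch: AttributeReader and intAttribute are hypothetical names, and returning a fallback value is only one possible policy.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;

/** Hypothetical helper for reading numeric attributes without blowing up. */
final class AttributeReader {

    /** Returns the attribute as an int, or 'fallback' if it is missing or not a number. */
    static int intAttribute(Attributes atts, String qName, int fallback) {
        String raw = atts.getValue(qName);
        if (raw == null) {
            return fallback; // attribute not present on this tag
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return fallback; // present, but not a valid integer
        }
    }

    public static void main(String[] args) {
        // AttributesImpl lets us build an Attributes object by hand for a quick check.
        AttributesImpl atts = new AttributesImpl();
        atts.addAttribute("", "thenumber", "thenumber", "CDATA", "1337");
        System.out.println(intAttribute(atts, "thenumber", -1)); // prints 1337
        System.out.println(intAttribute(atts, "missing", -1));   // prints -1
    }
}
```

Inside startElement(...) the call would then read myParsedExampleDataSet.setExtractedInt(AttributeReader.intAttribute(atts, "thenumber", 0)).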
Similarly, on closing tags the endElement(...) method gets called and we just set the "in_xyz" boolean back to false:
    @Override
    public void endElement(String namespaceURI, String localName, String qName)
            throws SAXException {
        if (localName.equals("outertag")) {
            this.in_outertag = false;
        } else if (localName.equals("innertag")) {
            this.in_innertag = false;
        } else if (localName.equals("mytag")) {
            this.in_mytag = false;
        } else if (localName.equals("tagwithnumber")) {
            // Nothing to do here
        }
    }
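The booleans work fine for a handful of tags. For deeper or more varied documents, one alternative (not what this tutorial's handler does, just a different technique) is to keep a stack of the currently open tags, so the handler always knows the full path. A minimal sketch, with PathTrackingHandler and textPathsOf as hypothetical names:

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler that remembers at which tag paths non-blank text was found. */
final class PathTrackingHandler extends DefaultHandler {

    private final Deque<String> openElements = new ArrayDeque<>();
    private final Set<String> textPaths = new LinkedHashSet<>();

    @Override
    public void startDocument() {
        openElements.clear();
        textPaths.clear();
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        openElements.addLast(qName); // one level deeper
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        openElements.removeLast(); // back up one level
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (new String(ch, start, length).trim().isEmpty()) {
            return; // ignore indentation and line breaks between tags
        }
        // The deque iterates from the root to the innermost open tag.
        textPaths.add(String.join("/", openElements));
    }

    List<String> getTextPaths() {
        return new ArrayList<>(textPaths);
    }

    static List<String> textPathsOf(String xml) {
        PathTrackingHandler handler = new PathTrackingHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.getTextPaths();
    }
}
```

For the example file the only path with text in it would be outertag/innertag/mytag. The trade-off is a little more bookkeeping in exchange for not needing one boolean per tag.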
So, for example, when the parser reaches the following part:
    <mytag>
        anddev.org rulez =)
    </mytag>
our "in_xyz" booleans tell us in which tag these characters were 'detected', so we can easily extract them:
    /** Gets called on the following structure:
     * <tag>characters</tag> */
    @Override
    public void characters(char[] ch, int start, int length) {
        if (this.in_mytag) {
            myParsedExampleDataSet.setExtractedString(new String(ch, start, length));
        }
    }
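Two details about characters(...) are worth knowing. A SAX parser is allowed to deliver the text of one tag in several chunks (this typically happens around entities like &amp; or with long texts), and the text includes the line breaks and indentation around it. The version above would then keep only the last chunk. A more robust variant collects the chunks in a StringBuilder and trims the result when the closing tag arrives. The following is a sketch of that idea in plain Java; TextCollectingHandler and extract are hypothetical names, and it compares against qName so that it also runs with the JDK's default parser settings.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler that collects the complete, trimmed text of <mytag>. */
final class TextCollectingHandler extends DefaultHandler {

    private final StringBuilder buffer = new StringBuilder();
    private boolean inMytag = false;
    private String extractedString;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("mytag".equals(qName)) {
            inMytag = true;
            buffer.setLength(0); // start with an empty buffer for this tag
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inMytag) {
            buffer.append(ch, start, length); // may be called more than once per tag
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("mytag".equals(qName)) {
            inMytag = false;
            extractedString = buffer.toString().trim(); // drop surrounding whitespace
        }
    }

    String getExtractedString() {
        return extractedString;
    }

    static String extract(String xml) {
        TextCollectingHandler handler = new TextCollectingHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.getExtractedString();
    }
}
```

The value is only final once endElement(...) has been called for the tag, which is why the result is stored there and not inside characters(...).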
4.) I prefer to have the handler build up a result object during parsing, so that once parsing has finished I can simply grab the finished object:
    public ParsedExampleDataSet getParsedData() {
        return this.myParsedExampleDataSet;
    }
So now you know how to properly use the SAXParser together with a SAX handler.
If any questions remain, feel free to ask.

The Full Source:
"/src/your_package_structure/ParsingXML.java"
    package org.anddev.android.parsingxml;

    import java.net.URL;

    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;

    import org.xml.sax.InputSource;
    import org.xml.sax.XMLReader;

    import android.app.Activity;
    import android.os.Bundle;
    import android.util.Log;
    import android.widget.TextView;

    public class ParsingXML extends Activity {

        private final String MY_DEBUG_TAG = "WeatherForcaster";

        /** Called when the activity is first created. */
        @Override
        public void onCreate(Bundle icicle) {
            super.onCreate(icicle);
            /* Create a new TextView to display the parsing result later. */
            TextView tv = new TextView(this);
            try {
                /* Create a URL we want to load some XML data from. */
                URL url = new URL("http://www.anddev.org/images/tut/basic/parsingxml/example.xml");
                /* Get a SAXParser from the SAXParserFactory. */
                SAXParserFactory spf = SAXParserFactory.newInstance();
                SAXParser sp = spf.newSAXParser();
                /* Get the XMLReader of the SAXParser we created. */
                XMLReader xr = sp.getXMLReader();
                /* Create a new ContentHandler and apply it to the XMLReader. */
                ExampleHandler myExampleHandler = new ExampleHandler();
                xr.setContentHandler(myExampleHandler);
                /* Parse the XML data from our URL. */
                xr.parse(new InputSource(url.openStream()));
                /* Parsing has finished. */
                /* Our ExampleHandler now provides the parsed data to us. */
                ParsedExampleDataSet parsedExampleDataSet =
                        myExampleHandler.getParsedData();
                /* Set the result to be displayed in our GUI. */
                tv.setText(parsedExampleDataSet.toString());
            } catch (Exception e) {
                /* Display any error in the GUI. */
                tv.setText("Error: " + e.getMessage());
                Log.e(MY_DEBUG_TAG, "WeatherQueryError", e);
            }
            /* Display the TextView. */
            this.setContentView(tv);
        }
    }
"/src/your_package_structure/ExampleHandler.java"
    package org.anddev.android.parsingxml;

    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.DefaultHandler;

    public class ExampleHandler extends DefaultHandler {

        // ===========================================================
        // Fields
        // ===========================================================
        private boolean in_outertag = false;
        private boolean in_innertag = false;
        private boolean in_mytag = false;
        private ParsedExampleDataSet myParsedExampleDataSet = new ParsedExampleDataSet();

        // ===========================================================
        // Getter & Setter
        // ===========================================================
        public ParsedExampleDataSet getParsedData() {
            return this.myParsedExampleDataSet;
        }

        // ===========================================================
        // Methods
        // ===========================================================
        @Override
        public void startDocument() throws SAXException {
            this.myParsedExampleDataSet = new ParsedExampleDataSet();
        }

        @Override
        public void endDocument() throws SAXException {
            // Nothing to do
        }

        /** Gets called on opening tags like:
         * <tag>
         * Can provide attribute(s), when the XML looks like:
         * <tag attribute="attributeValue"> */
        @Override
        public void startElement(String namespaceURI, String localName,
                String qName, Attributes atts) throws SAXException {
            if (localName.equals("outertag")) {
                this.in_outertag = true;
            } else if (localName.equals("innertag")) {
                this.in_innertag = true;
            } else if (localName.equals("mytag")) {
                this.in_mytag = true;
            } else if (localName.equals("tagwithnumber")) {
                // Extract an attribute
                String attrValue = atts.getValue("thenumber");
                int i = Integer.parseInt(attrValue);
                myParsedExampleDataSet.setExtractedInt(i);
            }
        }

        /** Gets called on closing tags like:
         * </tag> */
        @Override
        public void endElement(String namespaceURI, String localName, String qName)
                throws SAXException {
            if (localName.equals("outertag")) {
                this.in_outertag = false;
            } else if (localName.equals("innertag")) {
                this.in_innertag = false;
            } else if (localName.equals("mytag")) {
                this.in_mytag = false;
            } else if (localName.equals("tagwithnumber")) {
                // Nothing to do here
            }
        }

        /** Gets called on the following structure:
         * <tag>characters</tag> */
        @Override
        public void characters(char[] ch, int start, int length) {
            if (this.in_mytag) {
                myParsedExampleDataSet.setExtractedString(new String(ch, start, length));
            }
        }
    }
"/src/your_package_structure/ParsedExampleDataSet.java"
    package org.anddev.android.parsingxml;

    public class ParsedExampleDataSet {

        private String extractedString = null;
        private int extractedInt = 0;

        public String getExtractedString() {
            return extractedString;
        }

        public void setExtractedString(String extractedString) {
            this.extractedString = extractedString;
        }

        public int getExtractedInt() {
            return extractedInt;
        }

        public void setExtractedInt(int extractedInt) {
            this.extractedInt = extractedInt;
        }

        @Override
        public String toString() {
            // "\n" puts the two values on separate lines in the TextView.
            return "ExtractedString = " + this.extractedString
                    + "\nExtractedInt = " + this.extractedInt;
        }
    }
That's it.

Regards,
plusminus