plusminus wrote:
Parsing XML from the Net - Using the SAXParser
What you learn: You will learn how to properly parse XML (here: from the net) using a SAXParser.
Problems/Questions: Write them right below...
Difficulty: 1 of 5
What it will look like:
Description:
0.) In this tutorial we are going to parse the following XML file, located at the following URL:
images/tut/basic/parsingxml/example.xml :
Using xml Syntax Highlighting
<?xml version="1.0"?>
<outertag>
<innertag sampleattribute="innertagAttribute">
<mytag>
anddev.org rulez =)
</mytag>
<tagwithnumber thenumber="1337"/>
</innertag>
</outertag>
To accomplish the parsing, we are going to use a SAX-Parser (Wiki-Info).
SAX stands for "Simple API for XML". It reads the document as a stream of events instead of building a tree in memory, which suits us well here.
1.) Let's take a look at the onCreate(...)-method. It will open a URL, create a SAXParser, add a ContentHandler to it, parse the URL and display the results in a TextView.
Using java Syntax Highlighting
/** Called when the activity is first created. */
@Override
public void onCreate(Bundle icicle) {
super.onCreate(icicle);
/* Create a new TextView to display the parsingresult later. */
TextView tv = new TextView(this);
try {
/* Create a URL we want to load some xml-data from. */
URL url = new URL("http://www.anddev.org/images/tut/basic/parsingxml/example.xml");
/* Get a SAXParser from the SAXParserFactory. */
SAXParserFactory spf = SAXParserFactory.newInstance();
SAXParser sp = spf.newSAXParser();
/* Get the XMLReader of the SAXParser we created. */
XMLReader xr = sp.getXMLReader();
/* Create a new ContentHandler and apply it to the XML-Reader*/
ExampleHandler myExampleHandler = new ExampleHandler();
xr.setContentHandler(myExampleHandler);
/* Parse the xml-data from our URL. */
xr.parse(new InputSource(url.openStream()));
/* Parsing has finished. */
/* Our ExampleHandler now provides the parsed data to us. */
ParsedExampleDataSet parsedExampleDataSet =
myExampleHandler.getParsedData();
/* Set the result to be displayed in our GUI. */
tv.setText(parsedExampleDataSet.toString());
} catch (Exception e) {
/* Display any Error to the GUI. */
tv.setText("Error: " + e.getMessage());
Log.e(MY_DEBUG_TAG, "WeatherQueryError", e);
}
/* Display the TextView. */
this.setContentView(tv);
}
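A note for current Android versions: the onCreate(...) listing above opens the URL on the main (UI) thread. Android 3.0 and later throw a NetworkOnMainThreadException in that case (for apps targeting those versions), so the download and the parse have to run on a worker thread. Below is a minimal plain-Java sketch of that idea; it is not part of the original tutorial. The class BackgroundParseSketch, its CountingHandler and all method names are made up for illustration, and CountingHandler merely stands in for the ExampleHandler.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper (not from the tutorial): runs a SAX parse on a worker thread. */
class BackgroundParseSketch {

    /** Stand-in for ExampleHandler: just counts opening tags. */
    static class CountingHandler extends DefaultHandler {
        int elementCount = 0;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            elementCount++;
        }
    }

    /** Blocking parse. On Android this is the part that must stay off the main thread. */
    static int countElements(InputStream in) throws Exception {
        CountingHandler handler = new CountingHandler();
        XMLReader xr = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        xr.setContentHandler(handler);
        xr.parse(new InputSource(in));
        return handler.elementCount;
    }

    /** Submits the blocking parse to the executor and returns immediately. */
    static Future<Integer> countElementsAsync(ExecutorService executor, final InputStream in) {
        Callable<Integer> task = () -> countElements(in);
        return executor.submit(task);
    }

    /** Demo: parses an in-memory copy of the XML instead of url.openStream(). */
    static int runDemo(String xml) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
            // get() blocks until the worker is done; fine for a demo,
            // but a real UI would receive the result through a callback instead.
            return countElementsAsync(executor, in).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            executor.shutdown();
        }
    }
}
```

In an Activity you would pass url.openStream() from inside the worker task and hand the result back to the UI, for example with runOnUiThread(...), before calling tv.setText(...).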
2.) The next class to take a look at is the ExampleHandler, which extends org.xml.sax.helpers.DefaultHandler. A SAX handler is a really easy class. We just need to implement some trivial functions.
The SAXParser will 'walk' through the XML file from beginning to end, and whenever it reaches an opening tag like:
Using xml Syntax Highlighting
<outertag>
the Handler-Function:
Using java Syntax Highlighting
@Override
public void startElement(String namespaceURI, String localName,
String qName, Attributes atts) throws SAXException {
}
gets called. In this case localName will be "outertag".
The same happens on closing tags, like:
Using xml Syntax Highlighting
</outertag>
Here is the equivalent 'closing' method that gets called:
Using java Syntax Highlighting
@Override
public void endElement(String namespaceURI, String localName, String qName)
throws SAXException {
}
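One caveat about the localName parameter of startElement(...) and endElement(...): whether it is filled depends on the parser and on namespace awareness. With the SAXParserFactory that ships with a desktop JDK, which is not namespace-aware by default, localName arrives as an empty string and the tag name is only available in qName. The sketch below shows two ways around that, assuming you want the same handler code to work in both environments. The class ElementNameSketch and its methods are hypothetical names, not part of the tutorial.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper (not from the tutorial) for picking a usable tag name. */
class ElementNameSketch {

    /** Prefers localName and falls back to qName when localName is empty. */
    static String elementName(String localName, String qName) {
        return (localName != null && !localName.isEmpty()) ? localName : qName;
    }

    /** Collects the resolved name of every opening tag in the given XML. */
    static List<String> openingTags(String xml, boolean namespaceAware) {
        final List<String> names = new ArrayList<>();
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            // Option 1: ask for a namespace-aware parser so localName gets filled.
            spf.setNamespaceAware(namespaceAware);
            XMLReader xr = spf.newSAXParser().getXMLReader();
            xr.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName,
                        Attributes atts) {
                    // Option 2: fall back to qName if localName is empty.
                    names.add(elementName(localName, qName));
                }
            });
            xr.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return names;
    }
}
```

Note that the qName fallback still carries a namespace prefix (for example "a:b") when the document uses one; the example document in this tutorial does not.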
In XML you can put character data between an opening and a closing tag, like this:
Using xml Syntax Highlighting
<mytag>
anddev.org rulez =)
</mytag>
When the parser reaches such a tag, the following method gets called, providing the characters between the opening and the closing tag:
Using java Syntax Highlighting
/** Gets called on the following structure:
* <tag>characters</tag> */
@Override
public void characters(char ch[], int start, int length) {
String textBetween = new String(ch, start, length);
}
Finally, at the start and end of each document, the following functions get called:
Using java Syntax Highlighting
@Override
public void startDocument() throws SAXException {
// Do some startup if needed
}
@Override
public void endDocument() throws SAXException {
// Do some finishing work if needed
}
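To see the order in which the callbacks described so far actually fire, here is a small plain-Java sketch that records every event in a list. It is only an illustration: the class SaxEventTrace and its trace(...) method are made-up names, it parses from a String rather than from a URL, and it uses qName for the tag name because that is filled regardless of namespace settings.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical demo class: records every SAX callback in the order it fires. */
class SaxEventTrace {

    static List<String> trace(String xml) {
        final List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                events.add("startDocument");
            }

            @Override
            public void endDocument() {
                events.add("endDocument");
            }

            @Override
            public void startElement(String uri, String localName, String qName,
                    Attributes atts) {
                events.add("start: " + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end: " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Whitespace between tags is reported too; skip chunks that are only whitespace.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("characters: " + text);
                }
            }
        };
        try {
            XMLReader xr = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            xr.setContentHandler(handler);
            xr.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return events;
    }
}
```

Calling SaxEventTrace.trace(...) with the example document and printing the list shows startDocument first, the start/end events nested like the tags themselves, and endDocument last.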
3.) What was shown up to here was just the basic structure of the SAX handler. Now I'll show you the standard way of a real-life handler implementation. This is also pretty simple. We probably want to know how 'deep' the parser has parsed so far, so we create some booleans indicating which tags are still open. This is done as follows:
Using java Syntax Highlighting
// ===========================================================
// Fields
// ===========================================================
private boolean in_outertag = false;
private boolean in_innertag = false;
private boolean in_mytag = false;
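The boolean fields work well for a handful of tags. As a different technique, not the one used in this tutorial, the open tags can also be kept on a stack, which gives you the current depth and the enclosing tags without one field per tag. The following is a minimal sketch under that assumption; PathTrackingHandler and its methods are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical alternative to the in_xyz booleans: a stack of currently open tags. */
class PathTrackingHandler extends DefaultHandler {

    private final Deque<String> openTags = new ArrayDeque<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Use localName when the parser fills it, otherwise qName.
        String name = (localName != null && !localName.isEmpty()) ? localName : qName;
        openTags.push(name);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        openTags.pop();
    }

    /** Number of tags that are currently open. */
    int depth() {
        return openTags.size();
    }

    /** Innermost open tag, or null outside the root element. */
    String currentTag() {
        return openTags.peek();
    }

    /** True if a tag with this name is open anywhere up the chain. */
    boolean isInside(String name) {
        return openTags.contains(name);
    }
}
```

With this, a check like isInside("mytag") plays the role of the in_mytag field.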
As we know, when the parser reaches an opening tag, the startElement(...)-method gets called.
So we simply check the localName and set the corresponding "in_xyz"-boolean to true. For tagwithnumber we also read the thenumber attribute and store it as an int.
Using java Syntax Highlighting
@Override
public void startElement(String namespaceURI, String localName,
String qName, Attributes atts) throws SAXException {
if (localName.equals("outertag")) {
this.in_outertag = true;
}else if (localName.equals("innertag")) {
this.in_innertag = true;
}else if (localName.equals("mytag")) {
this.in_mytag = true;
}else if (localName.equals("tagwithnumber")) {
String attrValue = atts.getValue("thenumber");
int i = Integer.parseInt(attrValue);
myParsedExampleDataSet.setExtractedInt(i);
}
}
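The attribute handling above assumes that thenumber is always present and numeric. atts.getValue(...) returns null for a missing attribute, and Integer.parseInt(...) throws a NumberFormatException for null or non-numeric input, which would abort the whole parse. If the XML comes from a source you do not control, a small guard like the following sketch may help. The helper name intAttribute and the idea of a fallback value are my assumptions, not part of the tutorial.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;

/** Hypothetical helper (not from the tutorial) for reading int attributes defensively. */
class AttributeParsingSketch {

    /** Returns the attribute as an int, or the fallback if it is missing or not a number. */
    static int intAttribute(Attributes atts, String name, int fallback) {
        String raw = atts.getValue(name); // null if the attribute does not exist
        if (raw == null) {
            return fallback;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    /** Builds a one-attribute list for trying the helper without a parser. */
    static Attributes single(String name, String value) {
        AttributesImpl atts = new AttributesImpl();
        atts.addAttribute("", name, name, "CDATA", value);
        return atts;
    }
}
```

Inside startElement(...) the call would then read something like intAttribute(atts, "thenumber", 0).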
Similarly, on closing tags the endElement(..)-method gets called and we just set the "in_xyz"-boolean back to false:
Using java Syntax Highlighting
@Override
public void endElement(String namespaceURI, String localName, String qName)
throws SAXException {
if (localName.equals("outertag")) {
this.in_outertag = false;
}else if (localName.equals("innertag")) {
this.in_innertag = false;
}else if (localName.equals("mytag")) {
this.in_mytag = false;
}else if (localName.equals("tagwithnumber")) {
// Nothing to do here
}
}
So, for example, when the parser reaches the following part:
Using xml Syntax Highlighting
<mytag>
anddev.org rulez =)
</mytag>
our "in_xyz"-booleans indicate in which tag these characters have been 'detected', so we can easily extract them:
Using java Syntax Highlighting
/** Gets called on the following structure:
* <tag>characters</tag> */
@Override
public void characters(char ch[], int start, int length) {
if(this.in_mytag){
myParsedExampleDataSet.setExtractedString(new String(ch, start, length));
}
}
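One thing to be aware of with the characters(...) method above: SAX allows the parser to deliver the text of a single tag in several chunks, so characters(...) may be called more than once between <mytag> and </mytag>. Calling a setter each time then keeps only the last chunk. A common way around this is to collect the chunks in a StringBuilder and store the result in endElement(...). The sketch below shows that pattern in isolation; AccumulatingHandler is a hypothetical class, and trimming the surrounding whitespace is my own choice, not something the tutorial does.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical variant of the handler that joins all text chunks of <mytag>. */
class AccumulatingHandler extends DefaultHandler {

    private final StringBuilder text = new StringBuilder();
    private boolean inMytag = false;
    private String extractedString = null;

    private static String name(String localName, String qName) {
        return (localName != null && !localName.isEmpty()) ? localName : qName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (name(localName, qName).equals("mytag")) {
            inMytag = true;
            text.setLength(0); // start with an empty buffer for this tag
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inMytag) {
            text.append(ch, start, length); // may be called several times per tag
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (name(localName, qName).equals("mytag")) {
            inMytag = false;
            extractedString = text.toString().trim(); // the complete text is known only here
        }
    }

    String getExtractedString() {
        return extractedString;
    }
}
```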
4.) What I prefer is to have the handler build up a nice object during parsing, so that when parsing has finished I can simply grab the created object:
Using java Syntax Highlighting
public ParsedExampleDataSet getParsedData() {
return this.myParsedExampleDataSet;
}
So now you know how to properly use the SAXParser together with the SAXHandler.
If any questions remain, feel free to ask.

The Full Source: "/src/your_package_structure/ParsingXML.java"
Using java Syntax Highlighting
package org.anddev.android.parsingxml;
import java.net.URL;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import android.app.Activity;
import android.os.Bundle;
import android.util.Log;
import android.widget.TextView;
public class ParsingXML extends Activity {
private final String MY_DEBUG_TAG = "WeatherForcaster";
/** Called when the activity is first created. */
@Override
public void onCreate(Bundle icicle) {
super.onCreate(icicle);
/* Create a new TextView to display the parsingresult later. */
TextView tv = new TextView(this);
try {
/* Create a URL we want to load some xml-data from. */
URL url = new URL("http://www.anddev.org/images/tut/basic/parsingxml/example.xml");
/* Get a SAXParser from the SAXParserFactory. */
SAXParserFactory spf = SAXParserFactory.newInstance();
SAXParser sp = spf.newSAXParser();
/* Get the XMLReader of the SAXParser we created. */
XMLReader xr = sp.getXMLReader();
/* Create a new ContentHandler and apply it to the XML-Reader*/
ExampleHandler myExampleHandler = new ExampleHandler();
xr.setContentHandler(myExampleHandler);
/* Parse the xml-data from our URL. */
xr.parse(new InputSource(url.openStream()));
/* Parsing has finished. */
/* Our ExampleHandler now provides the parsed data to us. */
ParsedExampleDataSet parsedExampleDataSet =
myExampleHandler.getParsedData();
/* Set the result to be displayed in our GUI. */
tv.setText(parsedExampleDataSet.toString());
} catch (Exception e) {
/* Display any Error to the GUI. */
tv.setText("Error: " + e.getMessage());
Log.e(MY_DEBUG_TAG, "WeatherQueryError", e);
}
/* Display the TextView. */
this.setContentView(tv);
}
}
"
/src/your_package_structure/ExampleHandler.java"
Using java Syntax Highlighting
package org.anddev.android.parsingxml;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class ExampleHandler extends DefaultHandler{
// ===========================================================
// Fields
// ===========================================================
private boolean in_outertag = false;
private boolean in_innertag = false;
private boolean in_mytag = false;
private ParsedExampleDataSet myParsedExampleDataSet = new ParsedExampleDataSet();
// ===========================================================
// Getter & Setter
// ===========================================================
public ParsedExampleDataSet getParsedData() {
return this.myParsedExampleDataSet;
}
// ===========================================================
// Methods
// ===========================================================
@Override
public void startDocument() throws SAXException {
this.myParsedExampleDataSet = new ParsedExampleDataSet();
}
@Override
public void endDocument() throws SAXException {
// Nothing to do
}
/** Gets called on opening tags like:
* <tag>
* Can provide attribute(s), when xml was like:
* <tag attribute="attributeValue">*/
@Override
public void startElement(String namespaceURI, String localName,
String qName, Attributes atts) throws SAXException {
if (localName.equals("outertag")) {
this.in_outertag = true;
}else if (localName.equals("innertag")) {
this.in_innertag = true;
}else if (localName.equals("mytag")) {
this.in_mytag = true;
}else if (localName.equals("tagwithnumber")) {
// Extract an Attribute
String attrValue = atts.getValue("thenumber");
int i = Integer.parseInt(attrValue);
myParsedExampleDataSet.setExtractedInt(i);
}
}
/** Gets called on closing tags like:
* </tag> */
@Override
public void endElement(String namespaceURI, String localName, String qName)
throws SAXException {
if (localName.equals("outertag")) {
this.in_outertag = false;
}else if (localName.equals("innertag")) {
this.in_innertag = false;
}else if (localName.equals("mytag")) {
this.in_mytag = false;
}else if (localName.equals("tagwithnumber")) {
// Nothing to do here
}
}
/** Gets called on the following structure:
* <tag>characters</tag> */
@Override
public void characters(char ch[], int start, int length) {
if(this.in_mytag){
myParsedExampleDataSet.setExtractedString(new String(ch, start, length));
}
}
}
"
/src/your_package_structure/ParsedExampleDataSet.java"
Using java Syntax Highlighting
package org.anddev.android.parsingxml;
public class ParsedExampleDataSet {
private String extractedString = null;
private int extractedInt = 0;
public String getExtractedString() {
return extractedString;
}
public void setExtractedString(String extractedString) {
this.extractedString = extractedString;
}
public int getExtractedInt() {
return extractedInt;
}
public void setExtractedInt(int extractedInt) {
this.extractedInt = extractedInt;
}
public String toString(){
return "ExtractedString = " + this.extractedString
+ "\nExtractedInt = " + this.extractedInt;
}
}
That's it

Regards,
plusminus