
Parse XML documents with JAXP

Tags: Guest Contributor


Takeaway: You can implement a JAXP parser using the Apache Xerces-2 parser. Here's how.

This article originally appeared as an XML e-newsletter.

By Brian Schaffner

There are many ways to parse XML documents with Java. The two standard parsing techniques are DOM and SAX, and the Java API for XML Processing (JAXP) gives you a standard way to use either one.

JAXP is a Java API that provides a standard, parser-independent approach to parsing XML documents. Let's look at how you can implement a JAXP parser using the Apache Xerces-2 parser.

Factory patterns

JAXP provides parsers for both the DOM and SAX approaches to processing XML documents. The factory class you use determines the approach you use. The factory is a standard design pattern that lets you create instances of a class as needed without naming the concrete implementation in your code.

With JAXP, you can use either the DocumentBuilderFactory to create DocumentBuilder objects or the SAXParserFactory to create SAXParser objects. The difference is that DOM parsers read the entire document into memory and let you traverse it in any order, while SAX parsers call handler methods as each piece of XML data is encountered in the document. We'll concentrate on DocumentBuilder for now.
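
To make the contrast concrete, here is a minimal sketch of the SAX side, which the rest of this article does not cover. It counts elements by reacting to start-tag events rather than building a tree. The class name SaxCountSketch, the countElements helper, and the sample XML are illustrative assumptions, not part of the original article.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxCountSketch {

    // Counts start tags by reacting to parser events; no document tree is built.
    static int countElements(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();
        final int[] count = {0};

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                count[0]++; // called once per opening tag, in document order
            }
        };

        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample input: one <catalog> element and two <book> elements.
        String xml = "<catalog><book>JAXP</book><book>DOM</book></catalog>";
        System.out.println(countElements(xml)); // prints 3
    }
}
```

Because the handler sees each element only as it goes by, this style suits large documents that you only need to read once from start to finish.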

DocumentBuilder

A DocumentBuilder instance is created by calling the newDocumentBuilder method of a DocumentBuilderFactory. You can create as many DocumentBuilderFactory instances as you want using the static newInstance method of the DocumentBuilderFactory class.

For example, to start you'll want to create a new DocumentBuilderFactory, like this:

DocumentBuilderFactory dbfactory = DocumentBuilderFactory.newInstance();

Once you have a handle to the factory, you can create an instance of the actual DOM parser using the following code:

DocumentBuilder builder = dbfactory.newDocumentBuilder();

This creates a new instance of the actual DocumentBuilder class. To parse a document, call the parse method of the DocumentBuilder, passing it the XML source (for example, a File or an InputStream). The parse method returns a Document object, which you can use to process the XML document.

Listing A shows a simple implementation using the DocumentBuilderFactory and DocumentBuilder classes.
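
Listing A itself is not reproduced in this text. As a stand-in, the following is a minimal sketch of the same three steps (factory, builder, parse) and should not be read as the author's listing. The class name DomParseSketch, the parse helper, and the in-memory sample XML are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class DomParseSketch {

    // Factory -> builder -> parse, exactly the sequence described above.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbfactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = dbfactory.newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document held in a string so the sketch needs no file.
        Document doc = parse("<catalog><book>JAXP</book></catalog>");
        System.out.println(doc.getDocumentElement().getTagName()); // prints catalog
    }
}
```

To read from disk instead, pass a java.io.File to builder.parse; the rest of the code stays the same.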

The DocumentBuilder class is really just a DOM parser. The advantage of using the JAXP DocumentBuilder class is portability to other underlying XML parser implementations. The DocumentBuilderFactory and DocumentBuilder classes provide an abstraction layer that removes the dependency on a specific parser from your code.
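
One way to see this abstraction at work is to ask the factory and the builder which concrete classes stand behind them. The sketch below is only an illustration; the class name WhichParserSketch and its helper methods are assumptions. What it prints depends on your classpath and JDK, so no expected output is shown. With Xerces-2 on the classpath you would typically see class names in the org.apache.xerces.jaxp package.

```java
import javax.xml.parsers.DocumentBuilderFactory;

public class WhichParserSketch {

    // Name of the concrete factory class that JAXP selected at run time.
    static String factoryImplementation() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    // Name of the concrete DOM parser class that the factory produced.
    static String builderImplementation() throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .getClass()
                .getName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Factory: " + factoryImplementation());
        System.out.println("Builder: " + builderImplementation());
    }
}
```

JAXP also consults the javax.xml.parsers.DocumentBuilderFactory system property when it looks up the factory, so a different parser can be selected at launch time without changing or recompiling the application code.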

Real document

When you use DOM via the DocumentBuilder class, the parser returns a Document object. This is important because the Document interface is defined by the W3C, which means you can work with the returned object exactly as you would if you were using any other DOM parser.

For example, you can retrieve an element's value using the following method:

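import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Returns the text of the first element named `name`, or null if none is found.
String getXMLValue(Document doc, String name) {
     NodeList nlist = doc.getElementsByTagName(name);
     if (nlist.getLength() == 0 || nlist.item(0).getFirstChild() == null) {
          return null;   // no matching element, or the element is empty
     }
     return nlist.item(0).getFirstChild().getNodeValue();
}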

This method finds every element in the document whose tag name matches the string passed in, then returns the value of the first child node (normally the element's text) of the first match.
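
As a usage sketch, the program below parses a small document and calls getXMLValue on it. The copy of getXMLValue included here guards against a missing or empty element by returning null; that guard, the class name GetValueSketch, the parse helper, and the sample XML are assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class GetValueSketch {

    // Text of the first element named `name`, or null if there is none.
    static String getXMLValue(Document doc, String name) {
        NodeList nlist = doc.getElementsByTagName(name);
        if (nlist.getLength() == 0 || nlist.item(0).getFirstChild() == null) {
            return null;
        }
        return nlist.item(0).getFirstChild().getNodeValue();
    }

    // Helper that builds a Document from a string through the JAXP factory.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample document.
        Document doc = parse("<book><title>Parsing XML</title><author>Brian</author></book>");
        System.out.println(getXMLValue(doc, "title")); // prints Parsing XML
        System.out.println(getXMLValue(doc, "price")); // prints null (no such element)
    }
}
```

Only org.w3c.dom types appear in getXMLValue, so the method works unchanged on a Document produced by any conforming DOM parser.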

Brian Schaffner is an associate director for Fujitsu Consulting. He provides architecture, design, and development support for Fujitsu's Technology Consulting practice.
