
Process and parse XML with ease using Jakarta Digester

Tags: Jakarta Project, Contributor Melonfire, XML


Takeaway: Get a brief introduction to processing XML with Jakarta Digester, including how it can be used to create pattern-matching rules for an XML document and to perform actions on the resulting collections.

This article is also available as a TechRepublic download, which includes all of the code listings in a separate text file.

Most of the time, parsing an XML document involves either programming a parser to sequentially traverse an XML document, taking different actions as it encounters different tags (SAX) or building a tree representation of the document in memory and using tree methods to navigate the tree's parent-child relationships (DOM). While both these methods work, they tend to be either complex to program or memory- and processor-inefficient.
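To make the SAX style described above concrete, here is a minimal sketch that uses only the JDK's built-in SAX parser to collect movie titles. The class name SaxTitleLister, the handler, and the inline sample XML are illustrative assumptions, not part of the original listings. Note how the handler has to track by hand whether it is currently inside a <title> element.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: lists <title> values with plain SAX.
public class SaxTitleLister {

    // Collects the text of every <title> element, in document order.
    public static class TitleHandler extends DefaultHandler {
        public final List<String> titles = new ArrayList<>();
        private StringBuilder text; // non-null only while inside <title>

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if ("title".equals(qName)) {
                text = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // SAX may deliver character data in several chunks, so append
            if (text != null) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("title".equals(qName)) {
                titles.add(text.toString().trim());
                text = null;
            }
        }
    }

    // Parses the given XML string and returns the titles it contains.
    public static List<String> listTitles(String xml) {
        try {
            TitleHandler handler = new TitleHandler();
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
            return handler.titles;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<library>"
            + "<movie><title>The Matrix</title><rating>5</rating></movie>"
            + "<movie><title>Minority Report</title><rating>5</rating></movie>"
            + "</library>";
        System.out.println(listTitles(xml));
    }
}
```

Even for a single element, the handler needs three callbacks and a piece of state; that bookkeeping grows quickly as more elements and nesting levels are involved.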

The Jakarta Project offers a third option, via its Digester component: writing rules to map XML elements into Java objects and defining complex actions for such objects. This is a brief introduction to processing XML with Jakarta Digester, showing you how it can be used to create pattern-matching rules for an XML document and perform actions on the resulting collections.

Note: This tutorial assumes that Digester is correctly installed and configured on your system. For download and installation instructions, visit the official Jakarta Digester Web site.

Basic Usage

Let's begin by setting up the XML file that will serve as the basis for examples in this tutorial. Pop open your favorite text editor, and create the XML document shown in Listing A.

Listing A

<?xml version='1.0'?>
<library>
    <movie>
        <title>The Matrix</title>
        <rating>5</rating>
    </movie>
    <movie>
        <title>Mission: Impossible III</title>
        <rating>3</rating>
    </movie>
    <movie>
        <title>Minority Report</title>
        <rating>5</rating>
    </movie>
</library>

Before you can begin using Digester, it's important to have a clear understanding of both the XML structure and the output you plan to generate, because the design of your classes and rulesets will depend on this. For the first example, therefore, let's assume that the task is to parse the XML file above and generate a list of movie titles with their associated rating.

With this goal in mind, create some Configuration classes that will serve as Java objects to store the XML data. (Listing B)

Listing B

import java.util.Collection;
import java.util.HashMap;
import java.util.Iterator;

public class LibraryDetailsConfiguration  {
    private HashMap movieDetailsMap;
    // movieDetailsMap would store the title as key and
    // MovieDetailsConfiguration object as value
     
    public LibraryDetailsConfiguration() {
        movieDetailsMap = new HashMap();
    }
   
    public void setMovieDetailsMap(HashMap movieDetailsMap) {
        this.movieDetailsMap = movieDetailsMap;
    }
   
    public HashMap getMovieDetailsMap() {
        return movieDetailsMap;
    }
   
    public void setMovie(MovieDetailsConfiguration movieDetails) {
        movieDetailsMap.put(movieDetails.getTitle(), movieDetails);
    }
   
    public void printLibraryDetails() {
        Collection movieDetailsColl = this.movieDetailsMap.values();
        // values() never returns null, so test for an empty collection instead
        if (!movieDetailsColl.isEmpty()) {
            Iterator itMovieDetails = movieDetailsColl.iterator();
            while (itMovieDetails.hasNext()) {
                MovieDetailsConfiguration movieDetails =
                    (MovieDetailsConfiguration) itMovieDetails.next();
                System.out.println(movieDetails.getTitle() + " -> " + movieDetails.getRating());
            }
        } else {
            System.out.println("Movie details do not exist");
        }
    }
}

The LibraryDetailsConfiguration class sets up a HashMap to hold information on each movie. For each element of the HashMap, the movie title serves as the key, and an instance of the MovieDetailsConfiguration class holds the corresponding movie metadata (in this case, the rating). The code for the MovieDetailsConfiguration class should be fairly self-explanatory: it simply contains standard methods to set and get movie information, such as the rating and title. (Listing C)

Listing C

public class MovieDetailsConfiguration  {

    private String title;
    private int rating;
 
    public MovieDetailsConfiguration() {
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getTitle() {
        return title;
    }

    public void setRating(int rating) {
        this.rating = rating;
    }

    public int getRating() {
        return rating;
    } 
}

Once the Configuration classes are created, the next step is to create the RuleSet class. This RuleSet class contains the set of rules that Digester will use to parse the XML file. These rules are simply the actions that you want Digester to take whenever it encounters a particular pattern during the parsing process. Take a look at the class code shown in Listing D.

Listing D

import java.io.File;

// Commons Digester 1.x package name
import org.apache.commons.digester.Digester;

public class LibraryRuleSet {

    public static LibraryDetailsConfiguration getLibraryDetailsConfig(String libraryConfigFilePath) {

        LibraryDetailsConfiguration libraryDetails = null;

        try {

            Digester digester = new Digester();
            digester.setValidating(false);

            // create an object of the given class,
            // push it to top of Digester object stack
            digester.addObjectCreate("library", LibraryDetailsConfiguration.class);
            digester.addObjectCreate("library/movie", MovieDetailsConfiguration.class);

            // call the setter method of the given bean property on the object
            digester.addBeanPropertySetter("library/movie/title", "title");
            digester.addBeanPropertySetter("library/movie/rating", "rating");

            // at the end of each <movie>, pass the object on top of the stack
            // to setMovie() on the object beneath it (the library)
            digester.addSetNext("library/movie", "setMovie");

            // begin parsing; parse() returns the root object of the stack
            File fileObj = new File(libraryConfigFilePath);
            libraryDetails = (LibraryDetailsConfiguration) digester.parse(fileObj);

        } catch (Exception e) {
            System.out.println("Exception in parsing data file: " + e);
            e.printStackTrace();
        }
        return libraryDetails;
    }
}

This class performs the hard work of parsing the XML data and mapping it onto the previously defined Java objects. First, the addObjectCreate() calls tell Digester to create an instance of each Configuration class when it encounters the matching element, and to push that instance onto its object stack. Next, the addBeanPropertySetter() calls tell Digester which bean properties to set from the character data of the <title> and <rating> elements. The addSetNext() call then passes each completed MovieDetailsConfiguration object to the setMovie() method of the object beneath it on the stack, which is the LibraryDetailsConfiguration instance. Finally, the parse() method parses the file and executes the defined actions. The end result is a properly configured instance of the LibraryDetailsConfiguration class, and all that is now required is to write a small stub that will call that instance's printLibraryDetails() method. (Listing E)

Listing E

public class TestDigester
{
    public static void main(String args[]) {
        // supply the path to the XML file from Listing A as the argument
        LibraryDetailsConfiguration libraryDetails = LibraryRuleSet.getLibraryDetailsConfig("");
        libraryDetails.printLibraryDetails();
    }
}

Here's the output:

Minority Report -> 5
Mission: Impossible III -> 3
The Matrix -> 5
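The rules in Listing D all operate on Digester's object stack, which can be hard to picture from the API calls alone. The following is a rough sketch of that idea using only the JDK; it does not use Digester itself, and the class ObjectStackSketch, its nested Movie and Library types, and the hard-coded data are assumptions made purely for illustration. Each comment marks the point at which the corresponding Digester rule would fire.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in that mimics, by hand, what the rules in Listing D do.
public class ObjectStackSketch {

    public static class Movie {
        public String title;
        public int rating;
    }

    public static class Library {
        public final List<Movie> movies = new ArrayList<>();

        public void setMovie(Movie movie) {
            movies.add(movie);
        }
    }

    public static Library simulate() {
        Deque<Object> stack = new ArrayDeque<>();

        // <library> opens: the object-create rule pushes a new Library
        stack.push(new Library());

        String[][] data = { {"The Matrix", "5"}, {"Minority Report", "5"} };
        for (String[] row : data) {
            // <movie> opens: the object-create rule pushes a new Movie
            stack.push(new Movie());

            // <title> and <rating>: the bean property setter rules act on
            // whatever object is currently on top of the stack
            Movie top = (Movie) stack.peek();
            top.title = row[0];
            top.rating = Integer.parseInt(row[1]); // text converted to the property type

            // </movie> closes: the set-next rule hands the top object to the
            // object beneath it, and the object-create rule removes it from
            // the stack (both steps are combined here for brevity)
            Movie finished = (Movie) stack.pop();
            ((Library) stack.peek()).setMovie(finished);
        }

        // </library> closes: the root object is popped and returned to the caller
        return (Library) stack.pop();
    }

    public static void main(String[] args) {
        for (Movie movie : simulate().movies) {
            System.out.println(movie.title + " -> " + movie.rating);
        }
    }
}
```

Seen this way, the element patterns such as library/movie decide when something happens, while the stack decides which object it happens to.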

Altering the ruleset

With Digester (as with any SAX-based API), a change in output requirements necessitates a change in the XML-to-Java-object mapping, and hence a change in the rulesets used. To illustrate this, let's assume a slightly different XML file as shown in Listing F.

Listing F

<?xml version='1.0'?>
<library>
    <movie>
        <title>The Matrix</title>
        <cast>
            <person>Keanu Reeves</person>
            <person>Laurence Fishburne</person>
            <person>Carrie-Anne Moss</person>
        </cast>
        <rating>5</rating>
    </movie>
    <movie>
        <title>Mission: Impossible III</title>
        <cast>
            <person>Tom Cruise</person>
            <person>Ving Rhames</person>
            <person>Laurence Fishburne</person>
        </cast>
        <rating>3</rating>
    </movie>
    <movie>
        <title>Minority Report</title>
        <cast>
            <person>Tom Cruise</person>
            <person>Max von Sydow</person>
        </cast>
        <rating>5</rating>
    </movie>
</library>

Let's also assume that you now wish to process this XML data in a different way, this time keying on actor name and listing all the movies for that actor. This requires a change in the MovieDetailsConfiguration class, which now needs an ArrayList to hold cast information for each movie. (Listing G)

Listing G

import java.util.ArrayList;

public class MovieDetailsConfiguration  {

    private String title;
    private ArrayList movieCastList;
    private int rating;
 
    public MovieDetailsConfiguration() {
        movieCastList = new ArrayList();
    }

    public void setTitle(String title) {
        this.title = title;
    }

    public String getTitle() {
        return title;
    }

    public void setMovieCastList(ArrayList movieCastList) {
        this.movieCastList = movieCastList;
    }

    public ArrayList getMovieCastList() {
        return movieCastList;
    }

    public void addMovieCast(String castName) {
        movieCastList.add(castName);
    }


    public void setRating(int rating) {
        this.rating = rating;
    }

    public int getRating() {
        return rating;
    } 
}

It would also necessitate a change in the setMovie() and printLibraryDetails() methods of the LibraryDetailsConfiguration class as shown in Listing H.

Listing H

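import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.Iterator;

public class LibraryDetailsConfiguration  {
    private HashMap movieDetailsMap;
    // movieDetailsMap now stores the actor name as key and an ArrayList
    // of MovieDetailsConfiguration objects (that actor's movies) as value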
     
    public LibraryDetailsConfiguration() {
        movieDetailsMap = new HashMap();
    }
   
    public void setMovieDetailsMap(HashMap movieDetailsMap) {
        this.movieDetailsMap = movieDetailsMap;
    }
   
    public HashMap getMovieDetailsMap() {
        return movieDetailsMap;
    }

    public void setMovie(MovieDetailsConfiguration movieDetails) {
        ArrayList movieDetailList = null;
        ArrayList movieCastList = movieDetails.getMovieCastList();
        Iterator itMovieCast = movieCastList.iterator();

        while (itMovieCast.hasNext()) {
            String movieCastName = (String)itMovieCast.next();
            if(movieDetailsMap.containsKey(movieCastName)) {
                movieDetailList = (ArrayList)
                movieDetailsMap.get(movieCastName);
            } else {
                movieDetailList = new ArrayList();
            }

            movieDetailList.add(movieDetails);
            movieDetailsMap.put(movieCastName, movieDetailList);
        }
    }

   
    public void printLibraryDetails() {
        Collection movieCastColl = this.movieDetailsMap.keySet();
        // keySet() never returns null, so test for an empty collection instead
        if (!movieCastColl.isEmpty()) {
            Iterator itMovieCast = movieCastColl.iterator();
            while(itMovieCast.hasNext()) {
                String movieCastName =  (String)itMovieCast.next();
                System.out.println("ACTOR: " + movieCastName);
                ArrayList movieDetailsList = (ArrayList)movieDetailsMap.get(movieCastName);
                Iterator itMovieDetailsList = movieDetailsList.iterator();
                while (itMovieDetailsList.hasNext()) {
                    MovieDetailsConfiguration movieDetails = (MovieDetailsConfiguration)itMovieDetailsList.next();
                    System.out.println("\t\tMOVIE TITLE: " + movieDetails.getTitle());
                }
            }
        } else {
            System.out.println("Movie details do not exist"); 
        }
    }
}

Finally, the ruleset also needs to change, so that it locates each cast member and adds the name to the movie's cast list. (Listing I)

Listing I

import java.io.File;

// Commons Digester 1.x package name
import org.apache.commons.digester.Digester;

public class LibraryRuleSet {

    public static LibraryDetailsConfiguration getLibraryDetailsConfig(String libraryConfigFilePath) {

        LibraryDetailsConfiguration libraryDetails = null;

        try {

            Digester digester = new Digester();
            digester.setValidating(false);

            // create an object of the given class, push it to top of Digester object stack
            digester.addObjectCreate("library", LibraryDetailsConfiguration.class);
            digester.addObjectCreate("library/movie", MovieDetailsConfiguration.class);

            // call the setter method of the given bean property on the object
            digester.addBeanPropertySetter("library/movie/title", "title");
            digester.addBeanPropertySetter("library/movie/rating", "rating");

            // call addMovieCast() on the movie object for each <person>,
            // passing the element's character data as parameter 0
            digester.addCallMethod("library/movie/cast/person", "addMovieCast", 1);
            digester.addCallParam("library/movie/cast/person", 0);

            // at the end of each <movie>, pass the object on top of the stack
            // to setMovie() on the object beneath it (the library)
            digester.addSetNext("library/movie", "setMovie");

            // begin parsing; parse() returns the root object of the stack
            File fileObj = new File(libraryConfigFilePath);
            libraryDetails = (LibraryDetailsConfiguration) digester.parse(fileObj);

        } catch (Exception e) {
            System.out.println("Exception in parsing movie data: " + e);
            e.printStackTrace();
        }
        return libraryDetails;
    }
}

And here's the revised output of the printLibraryDetails() method:

ACTOR: Carrie-Anne Moss
        MOVIE TITLE: The Matrix
ACTOR: Keanu Reeves
        MOVIE TITLE: The Matrix
ACTOR: Laurence Fishburne
        MOVIE TITLE: The Matrix
        MOVIE TITLE: Mission: Impossible III
ACTOR: Ving Rhames
        MOVIE TITLE: Mission: Impossible III
ACTOR: Tom Cruise
        MOVIE TITLE: Mission: Impossible III
        MOVIE TITLE: Minority Report
ACTOR: Max von Sydow
        MOVIE TITLE: Minority Report
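The grouping that setMovie() performs in Listing H, building a list of movies for each actor, can be written more compactly on newer Java versions. The sketch below assumes Java 8 or later and stores plain title strings rather than MovieDetailsConfiguration objects; the class ActorIndexSketch and its method names are hypothetical. It also uses a LinkedHashMap, so actors print in the order they were first seen, whereas a HashMap makes no ordering guarantee.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the actor-to-movies grouping from Listing H.
public class ActorIndexSketch {

    // actor name -> titles of the movies that actor appears in
    private final Map<String, List<String>> titlesByActor = new LinkedHashMap<>();

    // Registers one movie under every member of its cast.
    public void addMovie(String title, List<String> cast) {
        for (String actor : cast) {
            // create the actor's list on first sight, then append the title
            titlesByActor.computeIfAbsent(actor, key -> new ArrayList<>()).add(title);
        }
    }

    public Map<String, List<String>> getTitlesByActor() {
        return titlesByActor;
    }

    public static void main(String[] args) {
        ActorIndexSketch index = new ActorIndexSketch();
        index.addMovie("The Matrix",
            Arrays.asList("Keanu Reeves", "Laurence Fishburne", "Carrie-Anne Moss"));
        index.addMovie("Mission: Impossible III",
            Arrays.asList("Tom Cruise", "Ving Rhames", "Laurence Fishburne"));

        for (Map.Entry<String, List<String>> entry : index.getTitlesByActor().entrySet()) {
            System.out.println("ACTOR: " + entry.getKey());
            for (String title : entry.getValue()) {
                System.out.println("\t\tMOVIE TITLE: " + title);
            }
        }
    }
}
```

The generic types remove the casts needed in Listing H, and computeIfAbsent() replaces the containsKey()/get()/put() sequence with a single call.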

Storing rulesets as XML

In the previous examples, you've seen Digester rulesets written in Java. However, one of the coolest things about Digester is that this isn't the only way to define patterns for mapping XML elements to actions. Instead, it's also possible to specify the rulesets themselves in XML, thus making it possible for even non-Java developers to leverage Digester's capabilities.

To see how this works, begin by removing the LibraryRuleSet class defined in the previous example and replacing it with rulesets written in XML, as shown in Listing J.

Listing J

<?xml version='1.0'?>
<digester-rules>
  <pattern value="library">
    <object-create-rule classname="sample.config.LibraryDetailsConfiguration"/>
  </pattern>
  <pattern value="library/movie">
    <object-create-rule classname="sample.config.MovieDetailsConfiguration"/>
    <bean-property-setter-rule pattern="title"/>
    <call-method-rule pattern="cast/person" methodname="addMovieCast" paramcount="1" />
    <call-param-rule pattern="cast/person" paramnumber="0"/>
    <set-next-rule methodname="setMovie" />
  </pattern>
</digester-rules>

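Save this file as movieRules.xml. Note that the classname attributes in this file assume the Configuration classes live in a package named sample.config; adjust them to match the package your own classes use.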

Next, alter the stub such that it knows to read these XML rulesets. (Listing K)

Listing K

// Commons Digester 1.x package names
import org.apache.commons.digester.Digester;
import org.apache.commons.digester.xmlrules.DigesterLoader;

public class TestDigester {
  public static void main(String args[]) {
    TestDigester digesterClient = new TestDigester();
    digesterClient.digest();
  }

  public void digest() {
    try {
      // Both files are loaded from the classpath
      ClassLoader loader = this.getClass().getClassLoader();

      // Create Digester using rules defined in movieRules.xml
      Digester digester = DigesterLoader.createDigester(loader.getResource("movieRules.xml"));

      // Parse movie.xml using the Digester to get an instance of LibraryDetailsConfiguration
      LibraryDetailsConfiguration libraryConfig =
          (LibraryDetailsConfiguration) digester.parse(loader.getResourceAsStream("movie.xml"));
      libraryConfig.printLibraryDetails();
    } catch (Exception e) {
      System.out.println("Exception while parsing the data file : " + e);
      e.printStackTrace();
    }
  }
}

And now try running it -- the output should be the same as that of the previous example.

As these examples illustrate, Jakarta's Digester component provides an intuitive rules-based framework for parsing an XML file, one that's significantly easier to program for than the standard SAX-based API. The use of XML-based rulesets further improves usability, allowing even non-Java developers to get their hands dirty with the application. Try it out for yourself -- and happy coding!
