Part 2: Developing an object oriented database in less than 140 lines of C#
Takeaway: In Part 1 of this series we explored how an object-oriented database (OODB) could be implemented using built-in .NET Framework functionality. The basic database is set up using XML as a storage format, XPath and Predicates as query languages, and the XmlDocument object as a container for our data. In Part 2, Zach Smith explains in detail how the code in the sample project is implemented.
This article is also available as a TechRepublic download, which includes all of the sample code and files in a Visual Studio project file. Click here to view Part 1 of the series.
The architecture
Before going into the details of the code we need to take a look at the architecture behind our solution. The basic architecture is built with two classes and an interface:
Classes:
- XmlDBState – This is an abstract class that contains all functionality for the database. This includes searching, saving, deleting, and file management/creation functionality.
- XmlDBBase – This is a public class that is meant to be used as a base class for objects that will be saved into the database. Objects are not required to inherit from this class; however, inheriting from XmlDBBase automatically implements the IXmlDBSerializable interface and saves coding time.
Interface:
- IXmlDBSerializable – Any object that is to be saved in the database must implement this interface. As mentioned above, if an object inherits from XmlDBBase, it already implements this interface and no further action is required for the object to be saved in the database.
Now that we have the general architecture laid out we can get into the source code of how the database works.
Loading the database
For reference, the following XML (Listing A) is what the database looks like when it is written out to disk:
Listing A:
<Database>
<XmlDB.Order>
<Order xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Identity>76a0558b-a8c7-42e3-8f1d-c56319365787</Identity>
<CustomerIdentity>6f5e9a2b-b68f-4b6d-9298-fbe5f135dd25</CustomerIdentity>
<DatePlaced>2006-11-21T07:12:16.3176493-05:00</DatePlaced>
</Order>
<Order xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Identity>16d8f0b8-46c6-47c3-ac6b-a0b0e0852970</Identity>
<CustomerIdentity>61cf2db4-0071-4380-83df-65a102d82ff2</CustomerIdentity>
<DatePlaced>2006-11-21T07:12:26.0533326-05:00</DatePlaced>
</Order>
</XmlDB.Order>
<XmlDB.Customer>
<Customer xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Identity>6f5e9a2b-b68f-4b6d-9298-fbe5f135dd25</Identity>
<LastName>Cunningham</LastName>
<FirstName>Marty</FirstName>
</Customer>
<Customer xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Identity>61cf2db4-0071-4380-83df-65a102d82ff2</Identity>
<LastName>Smith</LastName>
<FirstName>Zach</FirstName>
</Customer>
</XmlDB.Customer>
</Database>
In this particular example there are two customers and two orders stored in the database. Each type of object stored in the database is contained within a node which is dedicated to that particular type. For example, the Database\XmlDB.Order (Database\[namespace].[type]) node contains all Order objects that have been saved.
Within each dedicated type node are object nodes which are serialized objects. Listing B shows an example of this.
Listing B:
<Order xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Identity>16d8f0b8-46c6-47c3-ac6b-a0b0e0852970</Identity>
<CustomerIdentity>61cf2db4-0071-4380-83df-65a102d82ff2</CustomerIdentity>
<DatePlaced>2006-11-21T07:12:26.0533326-05:00</DatePlaced>
</Order>
This is simply the serialized form of an Order object.
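To make the link between a class and its serialized form concrete, here is a minimal sketch of how XmlSerializer turns an object into XML like Listing B. The Order class below is a hypothetical stand-in with the same three properties as the listing, not the class from the sample project, and OrderSerializationSketch is a name invented for this illustration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for the sample project's Order class.
public class Order
{
    public Guid Identity { get; set; }
    public Guid CustomerIdentity { get; set; }
    public DateTime DatePlaced { get; set; }
}

public static class OrderSerializationSketch
{
    // Serializes an Order into an XML string using XmlSerializer.
    public static string ToXml(Order order)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Order));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, order);
            return writer.ToString();
        }
    }
}
```

Calling OrderSerializationSketch.ToXml on an Order yields an XML declaration followed by an Order element whose child elements are named after the public properties.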
To load the XML database we create an XmlDocument object and call its "Load" method to load the XML. This functionality is held in the XmlDBState class as the "OpenDatabase" function. (Listing C)
Listing C:
public static void OpenDatabase(string path){
//Set the path to the database file.
_path = path;
//Does the database file already exist?
if (File.Exists(path))
Database.Load(path); //If so, load it.
//If a main node already exists in the database
// use it. If not, create it.
MainNode = (Database.ChildNodes.Count > 0) ?
Database.SelectSingleNode(rootName) :
Database.CreateElement(rootName);
//If the main node doesn't exist, add it.
if (Database.SelectSingleNode(rootName) == null)
Database.AppendChild(MainNode);
}
An interesting feature of this function is that if the database file doesn't exist, it automatically creates an in-memory database to use. The first time an object is saved with persistence requested, the in-memory database is written to disk, which creates the file.
The "MainNode" referenced in the code above refers to the "Database" node of the XML. This is the node that all objects will be saved under, and is considered the "root node" of the XML document.
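The load-or-create idea can be sketched on its own, outside of XmlDBState. The following is a simplified illustration rather than the sample project's code: OpenDatabaseSketch and OpenOrCreate are hypothetical names, and the sketch returns the document instead of storing it in static fields.

```csharp
using System.IO;
using System.Xml;

public static class OpenDatabaseSketch
{
    // Loads the document from path when the file exists. When the
    // document has no root element (for example, because there was
    // no file to load), creates one named rootName in memory.
    // The root name of an existing file is not validated here.
    public static XmlDocument OpenOrCreate(string path, string rootName)
    {
        XmlDocument database = new XmlDocument();

        if (File.Exists(path))
            database.Load(path);

        if (database.DocumentElement == null)
            database.AppendChild(database.CreateElement(rootName));

        return database;
    }
}
```

Nothing is written to disk by this sketch; the file only appears once the caller invokes Save on the returned XmlDocument.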
Saving objects
After the database is loaded in the XmlDBState object we will want to save objects to it. This is also handled in XmlDBState in a method called "SaveObject". The code for SaveObject is shown in Listing D.
Listing D:
public static void SaveObject(IXmlDBSerializable data, bool persistData){
Type type = data.GetType();
string typeString = type.ToString();
PropertyInfo[] properties = type.GetProperties();
//Remove the object if it's currently in the database.
XmlNode typeNode = RemoveCurrentObject(typeString, data);
//Loop through each property in our object and see
// if we need to save them in the database.
foreach (PropertyInfo property in properties)
{
//Get the property's value.
object propertyValue = property.GetValue(data, null);
//Check to see if the property is IXmlDBSerializable,
// and if it is save it to the database.
if (propertyValue is IXmlDBSerializable)
((IXmlDBSerializable)propertyValue).Save(persistData);
else if (propertyValue is System.Collections.ICollection)
{
//This property is a collection of objects.
// We need to see if this collection contains
// IXmlDBSerializable objects, and serialize
// those objects if needed.
IList propertyList = propertyValue as IList;
//Does the collection contain IXmlDBSerializable
// objects?
if (propertyList != null &&
propertyList.Count > 0 &&
propertyList[0] is IXmlDBSerializable)
{
//It does contain IXmlDBSerializable objects
// so save each of them.
foreach (object listObject in propertyList)
((IXmlDBSerializable)listObject).Save(persistData);
}
}
}
//If the type which is being saved isn't currently
// represented in the database, create a place to
// hold that specific type.
if (typeNode == null)
{
typeNode = XmlDBState.Database.CreateElement(typeString);
XmlDBState.MainNode.AppendChild(typeNode);
}
//Prepare the objects we will need for serializing
// the object.
XmlSerializer serializer = new XmlSerializer(type);
StringWriter writer = new StringWriter();
XmlDocument objectDocument = new XmlDocument();
//Serialize the object into our StringWriter object.
serializer.Serialize(writer, data);
//Create an XmlDocument from our serialized object.
objectDocument.InnerXml = writer.ToString();
//If the serialized object had data, import it into
// the database.
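if (objectDocument.DocumentElement != null)
{
//Set the object's Node property to the serialized
// data. DocumentElement is the root element of the
// serialized object, whether or not an XML
// declaration precedes it.
data.Node =
XmlDBState.Database.ImportNode(objectDocument.DocumentElement,
true);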
//Append the serialized object to the type node.
typeNode.AppendChild(data.Node);
}
//If requested, persist these changes to the XML file
// held on disk. If this is not called, the change is
// made in memory only.
if (persistData)
XmlDBState.Database.Save(XmlDBState.Path);
}
This function is arguably where the most important functionality for the database is held. Below are the steps taken by this function from start to finish:
- Delete the object if it already exists in the database. We do this because it is easier to remove the serialized object and save it again than it is to update the serialized object in place. The RemoveCurrentObject function also returns the XmlNode that holds objects of the type we're saving. If you look at the example database in Listing A you will see the XmlDB.Order and XmlDB.Customer nodes – these are type nodes; one holds Order objects and the other holds Customer objects.
- After we remove the current object we need to use reflection to examine the object which is being saved. This is done so we can save any child objects/collections to the database along with the main object. If we find that the object has children that must be saved, the children are explicitly cast to IXmlDBSerializable and used to call the Save() method.
- Next we check to see if the typeNode exists and if it doesn't we create it.
- We then create the serialized form of the object. The XmlSerializer serializes the object into a StringWriter, which gives us the serialized object as a String that we load into a temporary XmlDocument. Going through a temporary XmlDocument lets us bring the node into the main database XmlDocument with the ImportNode method and then attach it to the type node with AppendChild.
- Lastly, if requested, we persist the data to disk by calling the XmlDocument.Save method with the database's path as the parameter. This causes the XmlDocument to overwrite whatever is currently on disk. Writing the database back out to disk is slow, which is why we give an option to not persist the data. For instance, if we were saving 10,000 objects it would be much faster to save the 10,000 objects in memory (Save(false)) and then call XmlDBState.Database.Save(XmlDBState.Path) once than it would be to call Save(true) on all 10,000 objects.
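The serialize, import, and append steps above can be sketched in isolation. This is a simplified illustration under stated assumptions, not the code from Listing D: SaveSketch and AppendSerialized are hypothetical names, the Customer class is a stand-in, and the type node name is passed in as a string rather than derived through reflection.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical stand-in for a business object.
public class Customer
{
    public Guid Identity { get; set; }
    public string LastName { get; set; }
    public string FirstName { get; set; }
}

public static class SaveSketch
{
    // Serializes data, imports the result into the database document,
    // and appends it under the type node (created on first use).
    // Works in memory only; the caller decides when to call Save.
    public static XmlNode AppendSerialized<T>(
        XmlDocument database, string typeNodeName, T data)
    {
        XmlNode typeNode =
            database.DocumentElement.SelectSingleNode(typeNodeName);
        if (typeNode == null)
        {
            typeNode = database.CreateElement(typeNodeName);
            database.DocumentElement.AppendChild(typeNode);
        }

        // Serialize into a string, then load it into a temporary document.
        XmlSerializer serializer = new XmlSerializer(typeof(T));
        XmlDocument objectDocument = new XmlDocument();
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, data);
            objectDocument.LoadXml(writer.ToString());
        }

        // A node belongs to the document that created it, so it has to
        // be imported before it can be appended to the database.
        XmlNode imported =
            database.ImportNode(objectDocument.DocumentElement, true);
        typeNode.AppendChild(imported);
        return imported;
    }
}
```

To batch many saves, a caller would invoke AppendSerialized in a loop and then call Save on the database document a single time at the end, which mirrors the Save(false) advice in the last step.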
Deleting objects
The deletion of objects is handled by two functions in XmlDBState – Delete and RemoveCurrentObject. The code for Delete is shown in Listing E.
Listing E:
public static void Delete(IXmlDBSerializable data, bool deep){
//If this is a "deep delete", we look through the object's
// properties and delete all child objects from the database.
if (deep)
{
PropertyInfo[] properties = data.GetType().GetProperties();
foreach (PropertyInfo property in properties)
{
object propertyValue = property.GetValue(data, null);
if (propertyValue is IXmlDBSerializable)
Delete((IXmlDBSerializable)propertyValue, true);
else if (propertyValue is System.Collections.ICollection)
{
IList propertyList = propertyValue as IList;
if (propertyList != null &&
propertyList.Count > 0 &&
propertyList[0] is IXmlDBSerializable)
foreach (object listObject in propertyList)
Delete((IXmlDBSerializable)listObject, true);
}
}
}
//Remove the object from the database.
XmlDBState.RemoveCurrentObject(data.GetType().ToString(), data);
//Persist the database to disk.
XmlDBState.Database.Save(XmlDBState.Path);
}
As you can see, Delete uses RemoveCurrentObject internally, but also provides the option to "deep delete". This means that each child object of the object that is being deleted will be deleted from the database as well. The code for RemoveCurrentObject is shown in Listing F.
Listing F:
public static XmlNode RemoveCurrentObject(string typeString, IXmlDBSerializable data){
//Find the node that holds this type's data.
XmlNode typeNode = XmlDBState.MainNode.SelectSingleNode(typeString);
//If the object has a node associated with it, remove
// the node from the database.
if (typeNode != null && data.Node != null &&
data.Node.ParentNode == typeNode)
typeNode.RemoveChild(data.Node);
//Return the node that is responsible for this type's
// data.
return typeNode;
}
RemoveCurrentObject finds the type node that holds the current object in the database, and uses the RemoveChild method of that XmlNode to remove the serialized object. This is an in-memory operation, which is why the Delete method takes the extra step of calling XmlDBState.Database.Save to persist the changes to disk.
Predicate queries
By providing the user with an option to search through the database using Predicate methods, we enable a type-safe query mechanism that is integrated with C#. This functionality is implemented in the database by an overload of the Search method (Listing G).
Listing G:
public static List<DynamicType> Search<DynamicType>(Predicate<DynamicType> searchFunction)
where DynamicType : IXmlDBSerializable
{
//Get the Type of the object we're searching for.
Type type = typeof(DynamicType);
//Get the nodes of those objects in our database.
XmlNodeList nodeList =
XmlDBState.Database.SelectNodes(String.Format(@"/Database/{0}/{1}",
type.FullName, type.Name));
//Get a collection of DynamicType objects via the
// ExtractObjectsFromNodes method.
List<DynamicType> matches = ExtractObjectsFromNodes<DynamicType>(nodeList);
//Use the List<T>.FindAll method to narrow our results
// to only what was searched for.
return matches.FindAll(searchFunction);
}
This function selects all nodes of the given type from the database, deserializes the nodes using ExtractObjectsFromNodes, and then filters the collection by using the List<T>.FindAll method provided by the .NET Framework.
The code for ExtractObjectsFromNodes is shown in Listing H.
Listing H:
private static List<DynamicType> ExtractObjectsFromNodes<DynamicType>(XmlNodeList nodeList)
{
XmlSerializer serializer = new XmlSerializer(typeof(DynamicType));
List<DynamicType> objects = new List<DynamicType>();
foreach (XmlNode node in nodeList)
{
StringReader reader = new StringReader(node.OuterXml);
DynamicType deserialized = (DynamicType)serializer.Deserialize(reader);
((IXmlDBSerializable)deserialized).Node = node;
objects.Add(deserialized);
}
return objects;
}
This method simply loops through each node in our XmlNodeList and deserializes each object into a live business object. It then adds the object to a List<T> collection and returns the collection at the end of the loop.
It is important to realize that this type of query requires that we deserialize all instances of the requested type that are held in the database. This means that if there are 10,000 Customer objects in the database, and we use a Predicate query to filter them, 10,000 objects must be deserialized before the database even begins to filter the results. This is obviously a time consuming process, which is why we provide an alternate query mechanism based on XPath.
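The filtering half of a Predicate query is plain List<T>.FindAll. The sketch below shows that step alone, with objects that are already in memory; the Customer class and the PredicateSketch helper are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a deserialized business object.
public class Customer
{
    public string LastName { get; set; }
    public string FirstName { get; set; }
}

public static class PredicateSketch
{
    // Filters an already-deserialized list with a Predicate<Customer>.
    // Every element is visited, which is why all objects of the type
    // have to be deserialized before this step can run.
    public static List<Customer> FindByLastName(
        List<Customer> customers, string lastName)
    {
        Predicate<Customer> matchesLastName =
            customer => customer.LastName == lastName;
        return customers.FindAll(matchesLastName);
    }
}
```

Because the predicate is ordinary C#, the compiler checks property names and types, which is the type safety the Predicate approach offers in exchange for the deserialization cost.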
XPath queries
As mentioned in Part 1 of this series, a major advantage of using XML to store our objects is that we can use XPath as an ad-hoc query mechanism. This mechanism is handled in the XmlDBState class by an overload of the Search method, as shown in Listing I.
Listing I:
public static List<DynamicType> Search<DynamicType>(string query)
where DynamicType : IXmlDBSerializable
{
//Get the Type of the object we're searching for.
Type type = typeof(DynamicType);
//Create a List<DynamicType> collection to hold the results.
List<DynamicType> matches = new List<DynamicType>();
//Change single quotes to double quotes in the query.
query = query.Replace("'", "\"");
//Build our XPath query.
string xpath = "Database/" + type.FullName + "/" +
type.Name + "[" + query + "]";
try
{
//Select all nodes which match our XPath query.
XmlNodeList nodes = XmlDBState.Database.SelectNodes(xpath);
//If we have results, extract objects from those nodes.
if (nodes != null)
matches = ExtractObjectsFromNodes<DynamicType>(nodes);
}
catch (Exception exception)
{
throw new Exception("Could not search. Possible bad query syntax?",
exception);
}
return matches;
}
Notice that when XPath queries are used, we only deserialize the objects which we know are matches for our query. This speeds up the query's execution time dramatically. In testing, Predicate queries took one second to return 10,000 objects, while XPath queries took only one hundredth of a second.
Also notice that in the beginning of the method we are replacing single quotes with double quotes. This is so users will not have to escape the double quotes in their queries. This can have side effects, however, due to the fact that single quotes are valid XPath characters and may be required in some instances. This was done as a compromise for usability.
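As an illustration of the selection step, the sketch below runs an XPath filter against a document shaped like Listing A. It is a simplified example, not the code from Listing I: XPathSketch and FindCustomers are hypothetical names, the path to the Customer nodes is hard-coded, and the quote replacement is deliberately left out.

```csharp
using System.Xml;

public static class XPathSketch
{
    // Returns the Customer nodes that satisfy the given XPath
    // predicate. Only these nodes would later need deserializing.
    // The filter is passed through unchanged (no quote replacement).
    public static XmlNodeList FindCustomers(
        XmlDocument database, string filter)
    {
        string xpath =
            "/Database/XmlDB.Customer/Customer[" + filter + "]";
        return database.SelectNodes(xpath);
    }
}
```

A call such as XPathSketch.FindCustomers(database, "LastName='Smith'") selects only the matching nodes. Because the sketch leaves quotes alone, a filter like LastName="O'Brien" stays valid; with the replacement used in Listing I it would become LastName="O"Brien", which is not valid XPath.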
The IXmlDBSerializable interface
The IXmlDBSerializable interface must be implemented by any object which is to be saved in the database. This allows the database to treat all objects the same and takes out any guesswork associated with figuring out if an object can be saved to the database. The code for IXmlDBSerializable is shown in Listing J.
Listing J:
namespace XmlDBLibrary{
public interface IXmlDBSerializable
{
System.Guid Identity { get; set; }
System.Xml.XmlNode Node { get; set; }
void Save(bool persistData);
}
}
The Identity property is needed so that each object saved in the database is uniquely identified. This property should be automatically generated for any object implementing IXmlDBSerializable.
The Node property is used to hold the XmlNode in the database which corresponds to the current object. This allows the database to remove the object quickly, without having to search for the object's identity. This property will be null for objects that haven't yet been saved to the database.
The Save method is required so that the XmlDBState.SaveObject method can save the child objects of the object being saved. This is done by casting the child object to IXmlDBSerializable and calling Save on the child object.
The XmlDBBase class
The XmlDBBase class is provided as a shortcut for developers to use when an object needs to be compliant with the IXmlDBSerializable interface. It is not a required class, as IXmlDBSerializable could be implemented manually. When used as a base class, XmlDBBase also provides save, search, and delete functionality. The code for XmlDBBase is shown in Listing K.
Listing K:
public class XmlDBBase : IXmlDBSerializable{
private XmlNode _node = null;
private System.Guid _identity = System.Guid.NewGuid();
public XmlDBBase()
{
}
public void Delete()
{
this.Delete(false);
}
public void Delete(bool deep)
{
XmlDBState.Delete(this, deep);
}
public void Save()
{
Save(true);
}
public void Save(bool persistData)
{
XmlDBState.SaveObject(this, persistData);
}
public static List<DynamicType> Search<DynamicType>(
System.Predicate<DynamicType> searchFunction)
where DynamicType : IXmlDBSerializable
{
return XmlDBState.Search<DynamicType>(searchFunction);
}
public static List<DynamicType> Search<DynamicType>(
string query)
where DynamicType : IXmlDBSerializable
{
return XmlDBState.Search<DynamicType>(query);
}
public static DynamicType GetSingle<DynamicType>(
System.Predicate<DynamicType> searchFunction)
where DynamicType : IXmlDBSerializable
{
List<DynamicType> results =
XmlDBState.Search<DynamicType>(searchFunction);
return (results.Count == 0) ? default(DynamicType) : results[0];
}
public System.Guid Identity
{
get { return _identity; }
set { _identity = value; }
}
[System.Xml.Serialization.XmlIgnore]
public XmlNode Node
{
get { return _node; }
set { _node = value; }
}
}
Notice that methods are provided for doing both XPath and Predicate queries. Support is also provided for Delete and Save functionality. Most of these methods are basically pass-through methods into the XmlDBState object.
One interesting thing to point out is the use of the XmlIgnore attribute on the Node property. This is used because we do not want the Node property serialized to the database. If we serialized this property we would basically be storing an exact copy of our serialized object within the object.
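The effect of XmlIgnore is easy to see in isolation. The sketch below uses a hypothetical TrackedItem class, not XmlDBBase itself, with one ordinary property and one XmlNode property marked with the attribute; XmlIgnoreSketch is likewise a name invented for this example.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical class with the same kind of Node property as XmlDBBase.
public class TrackedItem
{
    public string Name { get; set; }

    // Skipped by XmlSerializer because of the attribute.
    [XmlIgnore]
    public XmlNode Node { get; set; }
}

public static class XmlIgnoreSketch
{
    // Serializes a TrackedItem; the Node property is left out.
    public static string ToXml(TrackedItem item)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(TrackedItem));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, item);
            return writer.ToString();
        }
    }
}
```

The serialized output contains a Name element but no Node element, even when the Node property holds a value at the time of serialization.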
Final thoughts
While this is obviously not an enterprise level OODB, I believe it is a good example of the flexibility of the .NET Framework. Very little code was needed to get this working, and the code that is used basically just ties together different .NET Framework functionality. Every major function of this database was already provided by the .NET Framework – from Predicates/XPath for queries, to the XmlDocument object for data management.
If you have any questions or comments about the ideas represented in this article, please leave a comment and I'll be happy to respond.

