Archive for January, 2010

How do I – Write my own C# IEqualityComparer when System.Collections.Generic.List<System.Xml.Linq.XNode>.Contains() fails

Written by Cornelius J. van Dyk on . Posted in How Do I...

Unfortunately I ran into an issue with the C# List<T> class the other day.  I needed to check whether a given XML snippet exists in an XML document.  
I immediately jumped in and tried using LINQ to XML and List<XNode> to solve the problem.  The way I figured it, I could grab a quick list of XNodes from the XML 
document and then, using the .Contains() method, check whether the node in the XML snippet exists in the list.  Imagine my surprise when I ran through the code 
and the .Contains() method did NOT return true, even on an identical match.  Weird.  :S

After spending some time trying to figure out why it wasn’t working as advertised (as I’ve mentioned before, 80% of a developer’s time is spent 
figuring out why something isn’t working as advertised or published) I decided to go the easier route and leverage the second form of .Contains(), 
which takes an IEqualityComparer<T> (strictly speaking this is the LINQ extension method Enumerable.Contains(), not an overload on List<T> itself).  
All I’d have to do is write my own class that implements IEqualityComparer<XNode> and pass it to the .Contains() method.  I found that the fact that 
the snippet did not have preceding or following nodes caused it to fail the XML match in the comparison.  In order for it to work correctly, the 
comparison would have to be a text comparison.  In addition, a snippet may match but have its attributes in a different sequence than the compared 
document, and a different sequence would also cause it to fail the match.  We would have to allow for that too.  Let’s take a look at the XML 
document against which we are comparing the snippets.  The document is defined thus:

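[XML listing not preserved: an XML declaration, a root element, and a configSections element holding self-closing child entries, including the one for log4net.]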

Now for testing we have defined the following XML snippets.  The first snippet contains the exact matching node from the “log4net” node and 
is defined thus:

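[XML listing not preserved: an XML declaration, a root element, and a configSections element holding a single self-closing entry, the log4net node.]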

The second snippet contains the same XML, except that I’ve switched the “name” and “type” attributes around.  It is defined thus:

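[XML listing not preserved: the same structure as the first snippet, with the “name” and “type” attributes of the log4net entry swapped.]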

Now let’s look at the test code to show both the problem and test the solution.  The C# code loads each of the XML pieces for the test and looks like this:

// Requires: using System; using System.Collections.Generic; using System.Linq;
cjvandyk.core.Xml.XNodeComparor cmp = new cjvandyk.core.Xml.XNodeComparor();
System.Xml.Linq.XElement xmlConfig = System.Xml.Linq.XElement.Load(@"D:\CODE\cjvandyk.core\DemoEngine\XMLFile.xml");
System.Xml.Linq.XElement xmlSnippet = System.Xml.Linq.XElement.Load(@"D:\CODE\cjvandyk.core\DemoEngine\XMLSnippet.xml");
List<System.Xml.Linq.XNode> _nod = cmp.GetListOfElements(xmlConfig.Elements("configSections").DescendantNodes());
foreach (System.Xml.Linq.XNode nod in xmlSnippet.Elements("configSections").DescendantNodes())
{
    // Check 1: the default List<T>.Contains(), without a comparer.
    if (!(nod is System.Xml.Linq.XComment))
    {
        if (_nod.Contains(nod))
        {
            Console.WriteLine("The XML document contains the XML snippet.");
        }
    }
    // Check 2: Contains() with the custom comparer.
    if (!(nod is System.Xml.Linq.XComment))
    {
        if (_nod.Contains(nod, cmp))
        {
            Console.WriteLine("The XML document contains the XML snippet.");
        }
    }
}
xmlSnippet = System.Xml.Linq.XElement.Load(@"D:\CODE\cjvandyk.core\DemoEngine\XMLSnippet2.xml");
_nod = cmp.GetListOfElements(xmlConfig.Elements("configSections").DescendantNodes());
foreach (System.Xml.Linq.XNode nod in xmlSnippet.Elements("configSections").DescendantNodes())
{
    if (!(nod is System.Xml.Linq.XComment))
    {
        if (_nod.Contains(nod))
        {
            Console.WriteLine("The XML document contains the XML snippet.");
        }
    }
    if (!(nod is System.Xml.Linq.XComment))
    {
        if (_nod.Contains(nod, cmp))
        {
            Console.WriteLine("The XML document contains the XML snippet.");
        }
    }
}

Ignore the first line for the moment.  It’s just an instantiation of my custom IEqualityComparer class.  Next we load the XML document into an 
XElement called xmlConfig.  Then we load the first XML snippet into its own XElement called xmlSnippet.  Now we define a 
List<XNode> called _nod and assign to it the return value of my custom GetListOfElements() method.  The method is defined as follows:

public List<System.Xml.Linq.XNode> GetListOfElements(IEnumerable<System.Xml.Linq.XNode> xml)
{
    List<System.Xml.Linq.XNode> lst = new List<System.Xml.Linq.XNode>();
    foreach (System.Xml.Linq.XNode nod in xml)
    {
        if (!(nod is System.Xml.Linq.XComment))
        {
            lst.Add(nod);
        }
    }
    return lst;
}

As you can see, all it does is iterate the XNodes, ignore comments and return the list at the end.  Now we just iterate all the XNodes in the snippet, 
exactly the same way the custom method got the list of nodes, but then for each node, we simply check to see if it exists in the List<XNode>.  
We do two checks.  The first check is to show the failure and it does NOT use the IEqualityComparer.  The second check does use the 
IEqualityComparer and works as expected.
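
For what it’s worth, the same comment filter can be written as a one-liner with LINQ.  This is only a sketch and not the code from my project; the NodeFilter class name and the inline XML are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class NodeFilter
{
    // Same result as the GetListOfElements() loop: every node except comments.
    public static List<XNode> GetListOfElements(IEnumerable<XNode> xml)
    {
        return xml.Where(nod => !(nod is XComment)).ToList();
    }

    public static void Main()
    {
        // Hypothetical XML, not the file used in this post.
        XElement root = XElement.Parse(
            "<configuration><configSections><!-- a comment -->" +
            "<section name=\"demo\" type=\"Demo.Handler\" />" +
            "</configSections></configuration>");

        List<XNode> nodes = GetListOfElements(
            root.Elements("configSections").DescendantNodes());

        // The comment is skipped and only the section element remains.
        Console.WriteLine(nodes.Count); // prints 1
    }
}
```

Whether the loop or the LINQ version reads better is a matter of taste; both build a new list and leave the document untouched.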

Once we’re done with the first snippet, we load the second snippet to show that the IEqualityComparer also works when the XML attributes of the 
XNode are swapped around.

The full definition of the IEqualityComparer looks like this:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace cjvandyk.core.Xml
{
    public class XNodeComparor : IEqualityComparer<System.Xml.Linq.XNode>
    {
        public bool Equals(System.Xml.Linq.XNode source, System.Xml.Linq.XNode target)
        {
            List<string> _sourceAtt = GetListOfAttributesAsString((System.Xml.Linq.XElement)source);
            List<string> _targetAtt = GetListOfAttributesAsString((System.Xml.Linq.XElement)target);
            List<string> _toRemove = new List<string>();
            foreach (string att in _sourceAtt)
            {
                if (_targetAtt.Contains(att))
                {
                    _toRemove.Add(att);
                }
            }
            foreach (string att in _toRemove)
            {
                _targetAtt.Remove(att);
                _sourceAtt.Remove(att);
            }
            if ((_sourceAtt.Count() > 0) || (_targetAtt.Count() > 0))
            {
                return false;
            }
            return true;
        }

        public List<System.Xml.Linq.XNode> GetListOfElements(IEnumerable<System.Xml.Linq.XNode> xml)
        {
            List<System.Xml.Linq.XNode> lst = new List<System.Xml.Linq.XNode>();
            foreach (System.Xml.Linq.XNode nod in xml)
            {
                if (!(nod is System.Xml.Linq.XComment))
                {
                    lst.Add(nod);
                }
            }
            return lst;
        }

        private List<System.Xml.Linq.XAttribute> GetListOfAttributes(System.Xml.Linq.XElement xml)
        {
            List<System.Xml.Linq.XAttribute> lst = new List<System.Xml.Linq.XAttribute>();
            foreach (System.Xml.Linq.XAttribute att in xml.Attributes())
            {
                lst.Add(att);
            }
            return lst;
        }

        private List<string> GetListOfAttributesAsString(System.Xml.Linq.XElement xml)
        {
            List<string> lst = new List<string>();
            foreach (System.Xml.Linq.XAttribute att in xml.Attributes())
            {
                lst.Add(att.ToString().ToLower());
            }
            return lst;
        }

        public int GetHashCode(System.Xml.Linq.XNode source)
        {
            return source.GetHashCode();
        }
    }
}
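
One caveat on the GetHashCode() above: it returns the node’s default hash code, which as far as I know is reference based, so two nodes that Equals() treats as equal will usually hash differently.  Contains() only calls Equals(), so the code in this post is not affected, but hash-based operations such as Distinct() or HashSet<T> would be.  Below is a minimal sketch of a comparer whose hash agrees with its equality.  AttributeSetComparer is a hypothetical name, it swaps in HashSet<string>.SetEquals() for the remove-and-count approach, and like the class above it looks only at attributes, not at element names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical comparer, not the XNodeComparor class from this post.
public sealed class AttributeSetComparer : IEqualityComparer<XNode>
{
    // Lower-cased name="value" strings, collected into a set so order is irrelevant.
    private static HashSet<string> Keys(XElement element)
    {
        return new HashSet<string>(
            element.Attributes().Select(att => att.ToString().ToLowerInvariant()));
    }

    public bool Equals(XNode x, XNode y)
    {
        XElement a = x as XElement;
        XElement b = y as XElement;
        // Non-element nodes (text, comments) only equal themselves.
        if (a == null || b == null) return ReferenceEquals(x, y);
        return Keys(a).SetEquals(Keys(b));
    }

    public int GetHashCode(XNode node)
    {
        XElement element = node as XElement;
        if (element == null) return node == null ? 0 : node.GetHashCode();
        // XOR is order independent, so swapped attributes hash the same.
        int hash = 0;
        foreach (string key in Keys(element))
        {
            hash ^= StringComparer.Ordinal.GetHashCode(key);
        }
        return hash;
    }
}

public static class ComparerDemo
{
    public static void Main()
    {
        // Hypothetical elements with the same attributes in a different order.
        XNode first = XElement.Parse("<section name=\"demo\" type=\"Demo.Handler\" />");
        XNode second = XElement.Parse("<section type=\"Demo.Handler\" name=\"demo\" />");
        AttributeSetComparer cmp = new AttributeSetComparer();

        Console.WriteLine(cmp.Equals(first, second));                         // True
        Console.WriteLine(cmp.GetHashCode(first) == cmp.GetHashCode(second)); // True

        List<XNode> both = new List<XNode> { first, second };
        Console.WriteLine(both.Distinct(cmp).Count());                        // 1
    }
}
```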

The heart of the class is the Equals() method.  The method declares two List<string> variables, each filled with the output of our custom 
GetListOfAttributesAsString(XElement) method.  This method simply iterates through all the XAttributes of the XElement and returns them as a list 
of lower-cased strings.
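
To make the string form concrete, here is a small sketch of the same helper written with LINQ.  The AttributeText class name and the sample element are made up, and I’ve used ToLowerInvariant() where the class above uses ToLower(), so treat it as an illustration rather than a drop-in replacement.  Note that the whole name="value" text is lower-cased, so attribute values compare case-insensitively as well.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class AttributeText
{
    // Each attribute becomes its lower-cased name="value" text.
    public static List<string> GetListOfAttributesAsString(XElement xml)
    {
        return xml.Attributes()
                  .Select(att => att.ToString().ToLowerInvariant())
                  .ToList();
    }

    public static void Main()
    {
        // Hypothetical element, not taken from the XML files in this post.
        XElement section = XElement.Parse("<section name=\"Demo\" type=\"Demo.Handler\" />");

        foreach (string att in GetListOfAttributesAsString(section))
        {
            Console.WriteLine(att);
        }
        // prints:
        //   name="demo"
        //   type="demo.handler"
    }
}
```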

Now when we look at the Equals() method, you may notice what appears to be unneeded code.  We declare an additional List<string> called 
_toRemove and then add all the matching attributes between _sourceAtt and _targetAtt to the list.  Once we have the list, we iterate it and remove 
all its items from the _sourceAtt and _targetAtt lists.

Why don’t we just directly remove the items from these two lists during the first iteration?

Why not just write the method like this?

public bool Equals(System.Xml.Linq.XNode source, System.Xml.Linq.XNode target)
{
    List<string> _sourceAtt = GetListOfAttributesAsString((System.Xml.Linq.XElement)source);
    List<string> _targetAtt = GetListOfAttributesAsString((System.Xml.Linq.XElement)target);
    foreach (string att in _sourceAtt)
    {
        if (_targetAtt.Contains(att))
        {
            _targetAtt.Remove(att);
            _sourceAtt.Remove(att);
        }
    }
    if ((_sourceAtt.Count() > 0) || (_targetAtt.Count() > 0))
    {
        return false;
    }
    return true;
}

This would seem to be a better way to write the method, but C# doesn’t allow you to modify a List<T> while a foreach loop is iterating over it.  Attempting to do so throws an InvalidOperationException at runtime (“Collection was modified; enumeration operation may not execute.”).

As a result, we have to have the extra list to hold the matches and then iterate it, removing its elements from the other two lists.
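
The sketch below reproduces the failure on two plain string lists and shows one alternative to the extra list: iterating over a snapshot made with ToList(), which plays the same role as _toRemove.  The class and method names are made up for the example, and the strings merely stand in for attribute text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RemoveWhileIterating
{
    // Removes from the list that foreach is enumerating and reports
    // whether that raised InvalidOperationException.
    public static bool NaiveLoopThrows()
    {
        List<string> source = new List<string> { "name=\"a\"", "type=\"b\"" };
        List<string> target = new List<string> { "type=\"b\"", "name=\"a\"" };
        try
        {
            foreach (string att in source)
            {
                if (target.Contains(att))
                {
                    target.Remove(att);
                    source.Remove(att); // invalidates the running enumerator
                }
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // Enumerates a copy of source, so removing from the originals is safe.
    // Note that both argument lists are modified by this method.
    public static bool SameItems(List<string> source, List<string> target)
    {
        foreach (string att in source.ToList())
        {
            if (target.Contains(att))
            {
                target.Remove(att);
                source.Remove(att);
            }
        }
        return source.Count == 0 && target.Count == 0;
    }

    public static void Main()
    {
        Console.WriteLine(NaiveLoopThrows()); // True

        Console.WriteLine(SameItems(
            new List<string> { "a", "b" },
            new List<string> { "b", "a" })); // True

        Console.WriteLine(SameItems(
            new List<string> { "a" },
            new List<string> { "a", "c" })); // False
    }
}
```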

In this post we’ve shown how to create our very own IEqualityComparer class to use for our purposes, since the List<XNode>.Contains() 
method doesn’t function the way we needed it to.  I would venture to say that the .Contains() method without our IEqualityComparer is 
pretty much useless.

Of course, if anyone out there can explain to me why it works the way it does and how the way it works is actually useful, I’m all ears. 🙂
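
A likely explanation, offered as my best understanding rather than gospel: XNode does not override Equals(), so List<T>.Contains() falls back to reference equality and only finds the very same object instance.  Nodes loaded from two different files are always different instances, however identical their markup.  LINQ to XML does ship a value comparison, XNode.DeepEquals(), and a ready-made XNodeEqualityComparer built on it.  As far as I know DeepEquals() compares attributes in order, so it should cover the first snippet but not the reordered second one, which is where a custom comparer is still needed.  The sketch below uses made-up inline XML instead of the files from this post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ReferenceVersusValue
{
    // Hypothetical XML; parsed twice to get two separate object graphs.
    private const string Xml =
        "<configuration><configSections>" +
        "<section name=\"demo\" type=\"Demo.Handler\" />" +
        "</configSections></configuration>";

    private static List<XNode> Nodes()
    {
        return XElement.Parse(Xml).Elements("configSections").DescendantNodes().ToList();
    }

    // Default Contains(): the candidate is a different instance, so no match.
    public static bool DefaultContains()
    {
        return Nodes().Contains(Nodes()[0]);
    }

    // Value comparison of the two identical but separate nodes.
    public static bool DeepEqualsMatch()
    {
        return XNode.DeepEquals(Nodes()[0], Nodes()[0]);
    }

    // Contains() with the comparer that ships with LINQ to XML.
    public static bool BuiltInComparerContains()
    {
        return Nodes().Contains(Nodes()[0], new XNodeEqualityComparer());
    }

    public static void Main()
    {
        Console.WriteLine(DefaultContains());         // False
        Console.WriteLine(DeepEqualsMatch());         // True
        Console.WriteLine(BuiltInComparerContains()); // True
    }
}
```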



Cheers
C



