How Do I: Write my own C# IEqualityComparer when System.Collections.Generic.List<T>.Contains(System.Xml.Linq.XNode) fails

Written by Cornelius J. van Dyk on . Posted in How Do I...

Unfortunately, I ran into an issue with the C# List<T> class the other day.  I needed to check whether a given XML snippet exists in an XML document.  
I immediately jumped in and tried using LINQ to XML and List<T> to solve the problem.  The way I figured it, I could grab a quick list of XNodes from the XML 
document and then use the .Contains() method to check whether the node in the XML snippet exists in the list.  Imagine my surprise when I ran through the code 
and the .Contains() method did NOT return true, even on an identical match.  Weird.  :S
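
To make the symptom concrete, here is a minimal sketch.  The XML string and the class and method names are made up for illustration; they are not the files or the code from my project.  It parses the same XML text twice and asks a List<XNode> built from the first copy whether it contains a node from the second copy.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ContainsSymptomDemo
{
    // Hypothetical XML, invented for this sketch.
    private const string Xml =
        "<configSections><section name=\"demo\" type=\"Demo.Handler\" /></configSections>";

    public static bool ListContainsReparsedNode()
    {
        // Parse the same text twice to get two separate but identical-looking trees.
        XElement document = XElement.Parse(Xml);
        XElement snippet = XElement.Parse(Xml);

        List<XNode> nodes = document.DescendantNodes().ToList();
        XNode candidate = snippet.DescendantNodes().First();

        // Plain List<T>.Contains(), with no comparer supplied.
        return nodes.Contains(candidate);
    }
}
```

Calling ContainsSymptomDemo.ListContainsReparsedNode() returns false, even though the two nodes serialize to exactly the same text.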

After spending some time trying to figure out why it wasn’t working as advertised (as I’ve mentioned before, 80% of a developer’s time is spent 
figuring out why something isn’t working as advertised or published), I decided to go the easier route and use the .Contains() extension method 
from System.Linq, which takes an IEqualityComparer<T>, so all I’d have to do is write my own class that implements IEqualityComparer<T> and pass 
it to the .Contains() method.  I found that the fact that the snippet did not have pre or post nodes caused it to fail the XML match in the 
comparison.  In order for it to work correctly, the comparison would have to be a text comparison.  In addition, a snippet may match but 
have its attributes in a different sequence than the compared document, and a different sequence would also cause it to fail the match.  
We would have to allow for that too.  Let’s take a look at the XML document against which we are comparing the snippets.  The document is defined thus:

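[XML listing not preserved.  Per the surrounding text and code, the document has a “configSections” element whose child nodes include the “log4net” node.]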

For testing, we have defined the following XML snippets.  The first snippet contains a node that exactly matches the “log4net” node in the document and 
is defined thus:

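[XML listing not preserved.  Per the surrounding text, this snippet holds a single node identical to the document’s “log4net” node.]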

The second snippet contains the same XML, except that I’ve switched the “name” and “type” attributes around.  It is defined thus:

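[XML listing not preserved.  Per the surrounding text, this snippet holds the same node with its “name” and “type” attributes swapped.]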

Now let’s look at the test code to show both the problem and test the solution.  The C# code loads each of the XML pieces for the test and looks like this:

cjvandyk.core.Xml.XNodeComparor cmp = new cjvandyk.core.Xml.XNodeComparor();
System.Xml.Linq.XElement xmlConfig = System.Xml.Linq.XElement.Load(@"D:\CODE\cjvandyk.core\DemoEngine\XMLFile.xml");
System.Xml.Linq.XElement xmlSnippet = System.Xml.Linq.XElement.Load(@"D:\CODE\cjvandyk.core\DemoEngine\XMLSnippet.xml");
List<System.Xml.Linq.XNode> _nod = cmp.GetListOfElements(xmlConfig.Elements("configSections").DescendantNodes());
foreach (System.Xml.Linq.XNode nod in xmlSnippet.Elements("configSections").DescendantNodes())
{
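    if (!(nod is System.Xml.Linq.XComment))
    {
        // Check 1: List<T>.Contains() with the default equality; this is the one that fails.
        if (_nod.Contains(nod))
        {
            Console.WriteLine("The XML document contains the XML snippet.");
        }
    }
    if (!(nod is System.Xml.Linq.XComment))
    {
        // Check 2: the LINQ Contains() extension method (needs "using System.Linq;") with the custom comparer.
        if (_nod.Contains(nod, cmp))
        {
            Console.WriteLine("The XML document contains the XML snippet.");
        }
    }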
}
// Repeat both checks with the second snippet, whose attributes are swapped.
xmlSnippet = System.Xml.Linq.XElement.Load(@"D:\CODE\cjvandyk.core\DemoEngine\XMLSnippet2.xml");
_nod = cmp.GetListOfElements(xmlConfig.Elements("configSections").DescendantNodes());
foreach (System.Xml.Linq.XNode nod in xmlSnippet.Elements("configSections").DescendantNodes())
{
    if (!(nod is System.Xml.Linq.XComment))
    {
        if (_nod.Contains(nod))
        {
            Console.WriteLine("The XML document contains the XML snippet.");
        }
    }
    if (!(nod is System.Xml.Linq.XComment))
    {
        if (_nod.Contains(nod, cmp))
        {
            Console.WriteLine("The XML document contains the XML snippet.");
        }
    }
}

Ignore the first line for the moment.  It’s just an instantiation of my custom IEqualityComparer class.  Next we load the XML document into an 
XElement called xmlConfig.  Then we load the first XML snippet into its own XElement called xmlSnippet.  Now we define a 
List<XNode> called _nod and assign to it the return value of my custom GetListOfElements() method.  The method is defined as follows:

public List<System.Xml.Linq.XNode> GetListOfElements(IEnumerable<System.Xml.Linq.XNode> xml)
{
    List<System.Xml.Linq.XNode> lst = new List<System.Xml.Linq.XNode>();
    foreach (System.Xml.Linq.XNode nod in xml)
    {
        if (!(nod is System.Xml.Linq.XComment))
        {
            lst.Add(nod);
        }
    }
    return lst;
}

As you can see, all it does is iterate the XNodes, ignore comments and return the list at the end.  Now we just iterate all the XNodes in the snippet, 
exactly the same way the custom method got the list of nodes, and for each node we simply check whether it exists in the list.  
We do two checks.  The first check is there to show the failure and does NOT use the IEqualityComparer.  The second check does use the 
IEqualityComparer and works as expected.

Once we’re done with the first snippet, we load the second snippet to show that the IEqualityComparer also works when the XML attributes of the 
XNode are swapped around.

The full definition of the IEqualityComparer class looks like this:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace cjvandyk.core.Xml
{
    public class XNodeComparor : IEqualityComparer<System.Xml.Linq.XNode>
    {
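        public bool Equals(System.Xml.Linq.XNode source, System.Xml.Linq.XNode target)
        {
            // Assumes both nodes are XElements; any other node type makes the cast throw.
            List<string> _sourceAtt = GetListOfAttributesAsString((System.Xml.Linq.XElement)source);
            List<string> _targetAtt = GetListOfAttributesAsString((System.Xml.Linq.XElement)target);
            // Matches are collected first and removed afterwards.
            List<string> _toRemove = new List<string>();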
            foreach (string att in _sourceAtt)
            {
                if (_targetAtt.Contains(att))
                {
                    _toRemove.Add(att);
                }
            }
            foreach (string att in _toRemove)
            {
                _targetAtt.Remove(att);
                _sourceAtt.Remove(att);
            }
            if ((_sourceAtt.Count() > 0) || (_targetAtt.Count() > 0))
            {
                return false;
            }
            return true;
        }

        public List<System.Xml.Linq.XNode> GetListOfElements(IEnumerable<System.Xml.Linq.XNode> xml)
        {
            List<System.Xml.Linq.XNode> lst = new List<System.Xml.Linq.XNode>();
            foreach (System.Xml.Linq.XNode nod in xml)
            {
                if (!(nod is System.Xml.Linq.XComment))
                {
                    lst.Add(nod);
                }
            }
            return lst;
        }

        private List<System.Xml.Linq.XAttribute> GetListOfAttributes(System.Xml.Linq.XElement xml)
        {
            List<System.Xml.Linq.XAttribute> lst = new List<System.Xml.Linq.XAttribute>();
            foreach (System.Xml.Linq.XAttribute att in xml.Attributes())
            {
                lst.Add(att);
            }
            return lst;
        }

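        private List<string> GetListOfAttributesAsString(System.Xml.Linq.XElement xml)
        {
            // Each attribute becomes lower-cased name="value" text, so the match is case-insensitive.
            List<string> lst = new List<string>();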
            foreach (System.Xml.Linq.XAttribute att in xml.Attributes())
            {
                lst.Add(att.ToString().ToLower());
            }
            return lst;
        }

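        public int GetHashCode(System.Xml.Linq.XNode source)
        {
            // Caution: this is the node's own identity-based hash, so it is not consistent
            // with Equals() above.  That is enough for Contains(), which only needs Equals(),
            // but hash-based collections such as HashSet or Dictionary would not behave correctly.
            return source.GetHashCode();
        }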
    }
}

The heart of the class is the Equals() method.  The method declares two List<string> variables, each filled with the output of our custom 
GetListOfAttributesAsString(XElement) method.  This method simply iterates through all the XAttributes of the XElement and returns them as a list 
of lower-cased strings.

Now when we look at the Equals() method, you may notice what appears to be unneeded code.  We declare an additional List<string> called 
_toRemove and then add all the matching attributes between _sourceAtt and _targetAtt to that list.  Once we have the list, we iterate it and remove 
all its items from the _sourceAtt and _targetAtt lists.

Why don’t we just directly remove the items from these two lists during the first iteration?

Why not just write the method like this?

public bool Equals(System.Xml.Linq.XNode source, System.Xml.Linq.XNode target)
{
    List<string> _sourceAtt = GetListOfAttributesAsString((System.Xml.Linq.XElement)source);
    List<string> _targetAtt = GetListOfAttributesAsString((System.Xml.Linq.XElement)target);
    foreach (string att in _sourceAtt)
    {
        if (_targetAtt.Contains(att))
        {
            _targetAtt.Remove(att);
            _sourceAtt.Remove(att);
        }
    }
    if ((_sourceAtt.Count() > 0) || (_targetAtt.Count() > 0))
    {
        return false;
    }
    return true;
}

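This would seem to be a better way to write the method, but List<T> doesn’t allow you to modify the list while a foreach loop is iterating over it.  The code compiles, but as soon as a match is removed, the next step of the loop throws a System.InvalidOperationException with the following message:

Collection was modified; enumeration operation may not execute.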

As a result, we have to have the extra list to hold the matches and then iterate it, removing its elements from the other two lists.
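
For what it’s worth, the extra list can be avoided altogether by sorting both attribute lists and comparing them position by position.  The following is only a sketch of that idea, not the class I actually used: the SortedAttributeComparer name is invented here, and I am assuming that any node that is not an XElement should simply never match.  It also derives the hash code from the same sorted strings, so Equals() and GetHashCode() agree with each other.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical alternative to XNodeComparor; the name is invented for this sketch.
public sealed class SortedAttributeComparer : IEqualityComparer<XNode>
{
    // Same lower-cased attribute strings as GetListOfAttributesAsString(),
    // but sorted so that attribute order no longer matters.
    private static List<string> SortedAttributes(XNode node)
    {
        XElement element = node as XElement;
        if (element == null)
        {
            return null; // Comments, text nodes and so on carry no attributes.
        }
        return element.Attributes()
                      .Select(a => a.ToString().ToLower())
                      .OrderBy(s => s, StringComparer.Ordinal)
                      .ToList();
    }

    public bool Equals(XNode source, XNode target)
    {
        List<string> left = SortedAttributes(source);
        List<string> right = SortedAttributes(target);
        if (left == null || right == null)
        {
            return false; // Assumption: only elements can match.
        }
        return left.SequenceEqual(right);
    }

    public int GetHashCode(XNode source)
    {
        List<string> attributes = SortedAttributes(source);
        if (attributes == null)
        {
            return 0;
        }
        int hash = 17;
        foreach (string attribute in attributes)
        {
            // Combine in sorted order so equal nodes always hash alike.
            hash = unchecked(hash * 31 + attribute.GetHashCode());
        }
        return hash;
    }
}
```

It would be passed to .Contains() the same way as before, for example _nod.Contains(nod, new SortedAttributeComparer()).  Like the original, it looks only at attributes and ignores the element name.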

In this post we’ve shown how to create our very own IEqualityComparer class to use for our purposes, since the List<T>.Contains() 
method doesn’t function the way we needed it to.  I would venture to say that, for comparing XNodes at least, the .Contains() method without our IEqualityComparer is 
pretty much useless.

Of course, if anyone out there can explain to me why it works the way it does and how the way it works is actually useful, I’m all ears. 🙂
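
One explanation that seems to fit the behavior: XNode does not override Equals(), so the plain .Contains() falls back to reference equality, and two nodes loaded from two different files are never the same object.  The framework does provide value-based alternatives in XNode.DeepEquals() and XNode.EqualityComparer.  Here is a small sketch of both, again with made-up XML and invented class and method names.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XNodeEqualityDemo
{
    // Hypothetical XML, invented for this sketch.
    private const string Xml = "<section name=\"demo\" type=\"Demo.Handler\" />";

    public static bool EqualsByReference()
    {
        XNode first = XElement.Parse(Xml);
        XNode second = XElement.Parse(Xml);
        // XNode inherits Object.Equals(), so this compares object identity.
        return first.Equals(second);
    }

    public static bool EqualsByValue()
    {
        // DeepEquals() compares the nodes' content rather than their identity.
        return XNode.DeepEquals(XElement.Parse(Xml), XElement.Parse(Xml));
    }

    public static bool ContainsWithBuiltInComparer()
    {
        List<XNode> nodes = new List<XNode> { XElement.Parse(Xml) };
        XNode candidate = XElement.Parse(Xml);
        // XNode.EqualityComparer is the framework's value-based comparer for nodes.
        return nodes.Contains(candidate, XNode.EqualityComparer);
    }
}
```

EqualsByReference() returns false, while EqualsByValue() and ContainsWithBuiltInComparer() return true.  As far as I can tell, the built-in comparison still treats attribute order as significant, so it would not help with the second snippet; the custom comparer is still needed for that case.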



Cheers
C





Cornelius J. van Dyk

Born and raised in South Africa during the 70s, I got my start in computers when a game on my Sinclair ZX Spectrum crashed, revealing its BASIC source code. The ZX had a whopping 48K of memory, which was considered to be a lot in the Commodore Vic20 era, but more importantly, it had BASIC built into the soft touch keyboard. Teaching myself to program, I coded my first commercial program at age 15.

After graduating high school at 17, I joined the South African Air Force, graduating the Academy and becoming a Pilot with the rank of First Lieutenant by age 20. After serving my country for six years, I made my way back into computer software.

Continuing my education, I graduated Summa Cum Laude from the Computer Training Institute before joining First National Bank, where my work won the Smithsonian Award for Technological Innovation in the field of Banking and Insurance. Soon I met Will Coleman from Amdahl SA, who introduced me to a little-known programming language named Huron/ObjectStar. As fate would have it, this unknown language and Y2K brought me to the USA in 1998.

I got involved with SharePoint after playing around with the Beta for SharePoint Portal Server 2003. Leaving my career at Rexnord to become a consultant in 2004, I was first awarded the Microsoft Most Valuable Professional Award for SharePoint in 2005, becoming only the 9th MVP for WSS at the time. I fulfilled a lifelong dream by pledging allegiance to the Flag as a US citizen in 2006. I met the love of my life and became a private consultant in 2008. I was honored to receive my ninth MVP award for SharePoint Server in 2013.
