Comparing EMF Models

2008-07-6

At work, we needed a mechanism to compare EMF models. We are developing a system that uses ATL model to model transformations. We wanted to validate the transformations with JUnit unit tests. We needed a mechanism that let us compare expected models with transformed output models.

EMF Compare

I started evaluating the EMF Compare Framework. This framework targets model comparison by building a EMF model of the differences found during the comparison process. It let you evaluate differences pretty exhaustively. It evaluates source and target models by trying to match parallel elements. For related elements, matching is applied recursively.

The differences model is build of matched and unmatched elements. Matched elements let you examine the similarity of the matching (a number between 0 and 1). It seems as if two elements match just by having the same metaclass. Then, the similarity is set depending of how similar are their attributes. If the two models are identical, the root elements matching similarity is 1. If you change the value of two attributes, for example, then this precision was over 0.9. If you just change the order of two nodes, the precision was again under 1, even when the references in the metamodel weres unordered. This behavior wasn’t very good for our purposes, since models under comparison could have different orders in their references.

If you think in the semantic of the comparison, two models can be identical when their unordered references don’t have the same order, don’t they? I suspect that EMF Compare probably would let you configure different comparison strategies easily, but we found a simpler way to achieve what we wanted.

Modifying the EcoreUtil.EqualityHelper class to ignore order in references

A friend told me have a look at the equals() method provided by the org.eclipse.emf.ecore.util.EcoreUtil class. It receives two EObject elements and compare them recursively. The two models have to had exactly the same structure in order be equal (no matter wether references were ordered or unordered). Again stabbed with the same problem. A look at the source of the method revealed that it delegates completely the functionality in a helper inner class: EqualityHelper. This class code is well factorized and it can be understood easily. The comparison of two list of elements was properly contained in an equals(List, List) method. So we try to hack this method in order to ignore orders when comparing.

After spending some hours trying to make a very complex modification of the method, another friend proposed the simplest way to compare two lists: sort them both and then compare sorted lists expecting exactly the same order. The solution was as obvious as wonderfully easy to implement: we already had the exhaustive comparison so we only needed to center our efforts in sorting the lists.

The last thing we needed was to find an EObject comparison criteria in order to implement the proper java.util.Comparator. This wasn’t that easy and we finally ended up parsing the toString() method result so we can obtain the attributes list string (the attribute’s name is hard coded in the EMF generated code of each concrete toString() method).

Below the source code of the modified EqualityHelper is shown.

public class EMFComparator extends EcoreUtil.EqualityHelper {
 
  class EObjectComparator implements Comparator<EObject> {
    public int compare(EObject object1, EObject object2) {
      String targetString1 = extractComparisonString(object1);
      String targetString2 = extractComparisonString(object2);

      return targetString1.compareTo(targetString2);
    }

    private String extractComparisonString(EObject object) {
      return object.toString().replaceAll(
          object.getClass().getName(), "").replaceAll(
          Integer.toHexString(object.hashCode()), "");
    }
  }

  @Override
  public boolean equals(List list1, List list2) {
    Comparator comparator = new EObjectComparator();

    List<EObject> sortedList1 = new ArrayList<EObject>(list1);
    List<EObject> sortedList2 = new ArrayList<EObject>(list2);

    Collections.sort(sortedList1, comparator);
    Collections.sort(sortedList2, comparator);
   
    return super.equals(sortedList1, sortedList2);
  }

Conclusion

I was a bit surprised of not finding this problem solved when googling for it. I’m sure more people have had this need and have solved this problem before. When you are transforming models, you need a formal way to compare real output models and expected output models. I wouldn’t develop a complex transformation system with a lot of transformation rules without this system. Lateral effects when modifying transformation rules can break the system and being easily unnoticed. If transformations are an essential mechanism in a data loading system, like our case is, this danger is not acceptable.

Update (2008-11-19)

There was an error in the code originally posted. The method extractComparisonString() received an String object, when it should receive an EObject (it didn’t make sense). Thank you so much to Jim Showalter for warning me. I should have copy/pasted from what I coded at work, instead of rewriting at home.

You can also download an Eclipse sample project that compares two UML models. It includes the source of the comparator.

There are 11 comments in this article:

  1. 2008-11-19Code Reviewer say:

    This code doesn’t actually work. If you look at the chain of calls, you’re basically doing: replace o1.toString().hashCode() in o1.toString(), but o1.toString() contains the hashCode of o1, not of o1′s toString(). Also, there’s no reason to replace the class name, and even if there was you’re replacing o1.toString().getClass().getName(), which will always return java.lang.String, when what you want is to replace o1.getClass().getName(). And even if you get both replaces working, ordering will fail when comparing two models generated from the same diagram but with different seeds for the IDs, because the IDs won’t match, so the sorts won’t match.”

  2. 2009-02-8Errors in my post on Comparing EMF Models - Jorge Manrubia say:

    [...] was an error in the code originally posted in my article on Comparing EMF models. Thank you so much to Jim Showalter for warning me. The post has been updated with the right code. [...]

  3. 2010-11-3Stephen say:

    Jorge, I used your code as a base for my own comparator. Thanks.

    EqualityHelper flags two operations as mismatched if the same parameter has a superclass on one side, but not on the other (or a different superclass). In other words, they have different eTypes. I would consider the pramaters’ classes to be mismatched, but not their operations.

    The funny thing is that the parameters themselves do not get flagged as not matching, only the operations. Which is rather misleading.

  4. 2011-08-4Fabian say:

    Hi, the sorting didn’t worked for my EObjects, i think its sometimes not enough to take the Standard toString Method to sort by. So I have written my own EqualityHelper that really compares the two Lists without a care of order if the Ordered Attribute of the Reference is false. It works finde :)

    import java.util.*;
    import org.eclipse.emf.ecore.EObject;
    import org.eclipse.emf.ecore.EReference;
    import org.eclipse.emf.ecore.EStructuralFeature;
    import org.eclipse.emf.ecore.util.EcoreUtil;
    import org.eclipse.emf.ecore.util.EcoreUtil.EqualityHelper;

    public class CustomEqualityHelper extends EcoreUtil.EqualityHelper {

    private static final long serialVersionUID = 1L;

    public CustomEqualityHelper() {
        super();
    }

    private CustomEqualityHelper(CustomEqualityHelper emfComparator) {
        this.putAll(emfComparator);
    }

    @SuppressWarnings("unchecked")
    @Override
    protected boolean haveEqualReference(EObject eObject1, EObject eObject2, EReference reference)
    {
      Object value1 = eObject1.eGet(reference);
      Object value2 = eObject2.eGet(reference);

      if(reference.isMany())
      {
          // *** important change for Unordered Lists here:
          if(reference.isOrdered())
              return equals((List<EObject>)value1, (List<EObject>)value2);
          else
              return equalsUnOrdered((List<EObject>)value1, (List<EObject>)value2);
      }
      else
      {
          return equals((EObject)value1, (EObject)value2);
      }

    }

    public boolean equalsUnOrdered(List<EObject> list1, List<EObject> list2)
    {
        // if the Lists haven't the same size they can't be equal
        if(list1.size()!=list2.size()) return false;

        // save the Equal pairs: list1 Elements as Key and list2 Elements as Value
    Map<EObject,EObject> equalFound = new HashMap<EObject,EObject>();

        for(EObject list1El:list1)
    {
        // try to find an equal Element for list1El in list2
        for(EObject list2El:list2)
        {
            // skip Elements of list2 that already have an equal Element in list1
            if(!equalFound.containsValue(list2El))
                // if list list1El is equal to list2El save it the Map
                // create a copy of the CustomEqualityHelper because it's made only for one recursive Compare
                if(new CustomEqualityHelper(this).equals(list1El, list2El))
                {
                    equalFound.put(list1El, list2El);
                    break;
                }
        }

        // if no equal Element for listEl1 was found in list2, they couldn't be equal
        if(!equalFound.containsKey(list1El)) return false;

        }

    // For all Elements of list1 an own equal Element in list2 was found.
    return true;

  5. 2011-08-7Jorge Manrubia say:

    Thank you Fabian. Your approach seems much better. Thanks for sharing it!

  6. 2012-02-23Annamalai say:

    Thanks dude for the solution. It really works great. We have used the code above and it works fine for quiet a bit of EMF Models. But sometimes it just runs endlessly for very long, i suppose it gets into a recursion that is endless when there are bi directional relationships.

    I am not sure though. But did you guys get anything like that during your working on this.

  7. 2012-02-23Jorge Manrubia say:

    Hi. Sorry, when I was using this I didn’t have any recursion problem. This solution was implemented some time ago and I am no longer working with EMF models anymore, so I am afraid I can’t be very helpful here. If it won’t end I imagine it has do with some association that is being followed when browsing the model and enter in a loop. I hope you found the problem

  8. 2012-03-5Annamalai say:

    Thanks Mate ! I found the problem and as guessed it is due to bi directional references which are associations. Therefore i have altered to code a little bit to handle such scenarios.

    If you find a reference & if the reference is not of containment type then i handle it in a different manner instead of letting it into equalsUnOrdered method. When they are references without containment true, only their attributes need to be compared. Therefore i divert the call into a new method for comparing references without containment. This halts the implementation from getting into any sort of recursive calling due to bi directional relationship. Not sure how to attach the code otherwise would be happy to do so.

    Do let me know how to attach the code to this comment, i would do it.

  9. 2012-03-5Jorge Manrubia say:

    Hi,

    Great. You can share your code wrapping it with a <code> tag:

    <code lang="java">
    //your code here
    </code>

    But don’t worry, just post it and I will format it if there is any problem. Thanks a lot for sharing

  10. 2012-03-6Annamalai say:

    Thanks Jorge

    I am documenting only the 3 methods that i have changed or introduced. Please refer to the above implementation for the remaining part of the code.

    1. haveEqualReference
    2. equalUnOrderedReferences
    3. compareReferenceObject

    Its a simple improvement. Considering the fact that we may definitely have bi directional relationship in our models, during compare this may cause the compare algorithm to fail due to recursive running. Therefore we need a terminating condition.

    I have used references with containment false, as the terminating conditon wherein for these objects only Attribute Compare is used. Reference Compare is by passed.

    Please do feel free to write to me if you need any further information on the implementation.

    @Override
      protected boolean haveEqualReference(final EObject eObject1, final EObject eObject2, final EReference reference) {
        Object value1 = eObject1.eGet(reference);
        Object value2 = eObject2.eGet(reference);


        if (reference.isMany()) {

          // *** important change for Unordered Lists here:
          if (reference.isOrdered()) {
            return equals((List<EObject>) value1, (List<EObject>) value2);
          }

          /**
           * Handle Recursion or Bi Directional References. Only if they are containments, then we need to parse them
           * recursively. Otherwise is if it only association, then attribute level compare is enough.
           */

          boolean containment = reference.isContainment();
          if (containment) {
            return equalsUnOrdered((List<EObject>) value1, (List<EObject>) value2);
          }

          // In bidirectional references, if the object is the container of the current object being visited.
          // It can return true. You need not check on Containers.
          if (reference.isContainer()) {
            return true;
          }

          // Finally if we find out that they are not containments nor containers,
          // then they are treated as unorderedReferences.
          // These need to be compared only based on Attributes.
          return equalUnOrderedReferences((List<EObject>) value1, (List<EObject>) value2);
          // End of Compare.
        }

        if ((value1 != null) && (value2 != null)) {

          boolean containment = reference.isContainment();
          boolean equals = true;
          if (containment) {
            equals = equals((EObject) value1, (EObject) value2);
          }
          else if (reference.isContainer()) {
            equals = true;
          }
          else {
            equals = compareReferenceObject((EObject) value1, (EObject) value2);
          }
          return equals;
        }

        return true;
      }

      /**
       * @param value1
       * @param value2
       * @return
       */

      private boolean equalUnOrderedReferences(final List<EObject> list1, final List<EObject> list2) {
        if (list1.size() != list2.size()) {
          // return false;
          if (this.trace) {
            System.out.println("Reference Sizes is not Equal");
          }
          return false;
        }


        // save the Equal pairs: list1 Elements as Key and list2 Elements as Value
        Map<EObject, EObject> equalFound = new HashMap<EObject, EObject>();
        boolean result = true;
        for (EObject list1El : list1) {
          // try to find an equal Element for list1El in list2
          for (EObject list2El : list2) {
            // skip Elements of list2 that already have an equal Element in list1
            if (!equalFound.containsValue(list2El)) {
              // if list list1El is equal to list2El save it the Map
              // Being only a Reference, we need to compare only its Attributes and not its further references.
              // Therefore i get only the Attributes and validate them.
              result = compareReferenceObject(list1El, list2El);
              if (result) {
                equalFound.put(list1El, list2El);
                break;
              }
            }
          }

          // if no equal Element for listEl1 was found in list2, they couldn't be equal
          if (!equalFound.containsKey(list1El)) {
            return false;
          }

        }

        // For all Elements of list1 an own equal Element in list2 was found.
        return result;
      }

      /**
       * @param result
       * @param object1
       * @param object2
       * @return
       */

      private boolean compareReferenceObject(final EObject object1, final EObject object2) {
        boolean result = true;
        EList<EAttribute> eAllAttributes = object1.eClass().getEAllAttributes();
        for (EAttribute eAttribute : eAllAttributes) {
          if (!haveEqualAttribute(object1, object2, eAttribute)) {
            result = false;
            break;
          }
        }
        if (result) {
          compareMap.put(object1, object2);
        }
        else {
          compareMap.put(object1, null);
        }
        return result;
      }

  11. 2012-03-6Jorge Manrubia say:

    Great work!! Thanks a lot for sharing.

Write a comment: