Monday 31 March 2014

A better diff system than DiffUtils?

After looking around, and with some help, I was pointed to DiffMatchPatch (google-diff-match-patch), which, as the name says, is a diff system from Google.

It was chosen over DiffUtils because it compares two Strings directly instead of two List<String>, and the deltas can be stored as a single text String, which needs quite a bit less storage than the DiffUtils.Patch object.

So as a new example of how to use DiffMatchPatch, I've rewritten the code from the previous blog post to use it with the same JUnit test.

Note that the "firstVersion" and "secondVersion" Strings have the same values as in the previous blog post.

Example:


@Test
public void javaDiffTests()
{
 String firstVersion = getStringVersionOne();
 String secondVersion = getStringVersionTwo();

 String result = getDiffMatchPatch(firstVersion, secondVersion);
 if(!result.equals(secondVersion))
 {
  fail("diff test has failed, they are not the same. Check the getDiffMatchPatch() method");
 }
}

public String getDiffMatchPatch(String firstVersion, String secondVersion)
{
 DiffMatchPatch dmp = new DiffMatchPatch();
 LinkedList<DiffMatchPatch.Diff> result = dmp.diff_main(firstVersion, secondVersion);

 LinkedList<DiffMatchPatch.Patch> patches = dmp.patch_make(result);

 String patchesText = dmp.patch_toText(patches);//This is to prove that you can store your diffs as a single String
 LinkedList<DiffMatchPatch.Patch> revivedFromTextPatches = (LinkedList<DiffMatchPatch.Patch>) dmp.patch_fromText(patchesText);

 Object[] patchedResultObjects = dmp.patch_apply(revivedFromTextPatches, firstVersion);

 boolean[] passRate = (boolean[]) patchedResultObjects[1];//"...The second element is an array of true/false values indicating which of the patches were successfully applied."

 for(boolean b : passRate)
 {
  if(!b)//we only want to proceed if all patches were successfully applied
   return null;
 }

 String merge = (String) patchedResultObjects[0];//"...The first element of the return value is the newly patched text."

 return merge;
}


Links:

The site for DiffMatchPatch is here.
You can find the Maven dependency for DiffMatchPatch from its forked project DiffPatch here.

An example of how to use DiffUtils

You might want to look at the newer blog post about using DiffMatchPatch instead of DiffUtils.

Summary:

The code that follows is a JUnit test example of how to use the DiffUtils API.

This example has two variables and two goals. 

The first variable is the original String (firstVersion).
The second variable is how the original String has been edited (secondVersion).

The first goal is to print out the Deltas (differences) of the edited String compared to the original String.

The second goal is to run a "rebuilding process": take the original String and apply the newly found Deltas (changes, if you like) to it to produce a result String. If the "rebuilding process" (i.e. the Patch object methods) is successful, the result String should be identical to secondVersion (the edited String).

Code:


@Test
public void diffUtilsExample()
{
 String firstVersion = "This is a sentence with no change.\n" +
   "\n" +
   "No change here.\n" +
   "No change here. No change here. No change here. No change here.\n" +
   "No change here. No change here.\n" +
   "No change here.\n" +
   "No change here.\n" +
   "No change here. No change here. No change here.\n" +
   "No change here.\n" +
   "\n" +
   "No change here.\n" +
   "No change here. No change here. No change here.";

 String secondVersion = "This is a sentence with some change.\n" +
   "\n" +
   "No change here.\n" +
   "No change here. No change here. No change here. No change here.\n" +
   "Changed over here. No change here.\n" +
   "No change here.\n" +
   "No change here.\n" +
   "No change here. Also changed here. No change here.\n" +
   "No change here.\n" +
   "\n" +
   "No change here.\n" +
   "No change here. Some changes over here. No change here.";

 String result = getDiff(firstVersion, secondVersion, "\n");
 if(!result.equals(secondVersion))
 {
  fail("diff test has failed, they are not the same. Check the getDiff() method");
 }
}

public String getDiff(String firstVersion, String secondVersion, String splitValue)
{
 // Use the diamond operator rather than a raw ArrayList to avoid unchecked warnings
 List<String> original = new ArrayList<>(Arrays.asList(firstVersion.split(splitValue)));
 List<String> revised = new ArrayList<>(Arrays.asList(secondVersion.split(splitValue)));

 Patch patch = DiffUtils.diff(original, revised);

 for(Delta delta : patch.getDeltas())
 {
  System.out.println(delta);
 }

 try {
  List<String> result = (List<String>) patch.applyTo(original);

  if(!result.equals(revised))
  {
   fail("the patch.applyTo 'rebuild from diffs' method has not produced a result that matches the revised string-list");
   return null;
  }

  StringBuilder stringList = new StringBuilder();

  for(int i = 0; i < result.size(); i++)
  {
   String s = result.get(i);
   if(i != result.size() - 1)
    stringList.append(s + splitValue);
   else
    stringList.append(s);
  }

  String merge = String.valueOf(stringList);

  return merge;
 } catch (PatchFailedException e) {
  e.printStackTrace();
 }
 return null;
}
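
The split-and-rejoin step in getDiff() above can be sketched on its own with only the standard library. This is a minimal sketch, not part of the DiffUtils API: the class and method names (LineJoinDemo, toLines, fromLines) are made up for illustration, and it assumes Java 8 or later for String.join. It also passes a limit of -1 to split(), because the single-argument split("\n") drops trailing empty strings, so a text ending in a newline would not come back unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only.
class LineJoinDemo {

    // Split text into a mutable list of lines.
    // The -1 limit keeps trailing empty strings, so "a\nb\n" gives ["a", "b", ""].
    static List<String> toLines(String text) {
        return new ArrayList<>(Arrays.asList(text.split("\n", -1)));
    }

    // Join the lines back together with a newline between each pair.
    static String fromLines(List<String> lines) {
        return String.join("\n", lines);
    }

    public static void main(String[] args) {
        String text = "No change here.\nNo change here.\n";
        // With the -1 limit, splitting and rejoining returns the original text.
        System.out.println(fromLines(toLines(text)).equals(text));
    }
}
```

For the test Strings used in this post the plain split("\n") works too, since neither of them ends with a newline.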

Notes:

The java-diff-utils libraries can be found here.

I haven't investigated a way to use DiffUtils without two List<String> as the comparison variables. I wish I could simply pass two Strings as the arguments to the DiffUtils.diff() method. Update: I found a solution to this problem in my newer post.

Friday 28 March 2014

Getting a mutable (not fixed-size) List from a String array.

Problem:

I had a String that I needed to split by new-line into a List<String>, and afterwards add more values to the List.

So code looked something like this:
String sentence = "The quick brown fox \n" +
 "jumped over the lazy dog.";
 
List<String> lines = Arrays.asList(sentence.split("\n"));

lines.add("\n" + " ...and the fox escaped.");

But that didn't work. I got a java.lang.UnsupportedOperationException.

The fix:

I found out the problem was that Arrays.asList returns a fixed-size List backed by the array, but I needed to add to it. The workaround is to pass the result of asList to the ArrayList constructor, which copies the elements of the String[] into a mutable list.

List<String> lines = new ArrayList<>(Arrays.asList(sentence.split("\n")));
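
To see both behaviours side by side, here is a small self-contained sketch using only java.util. The class and method names (MutableListDemo, fixedSizeListRejectsAdd, splitToMutableList) are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, for illustration only.
class MutableListDemo {

    // Returns true if add() on the list returned by Arrays.asList is rejected.
    static boolean fixedSizeListRejectsAdd(String text) {
        List<String> fixed = Arrays.asList(text.split("\n"));
        try {
            fixed.add("extra");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // Copies the split lines into an ArrayList, which can grow.
    static List<String> splitToMutableList(String text) {
        return new ArrayList<>(Arrays.asList(text.split("\n")));
    }

    public static void main(String[] args) {
        String sentence = "The quick brown fox \n" + "jumped over the lazy dog.";

        System.out.println(fixedSizeListRejectsAdd(sentence)); // prints "true"

        List<String> lines = splitToMutableList(sentence);
        lines.add(" ...and the fox escaped.");
        System.out.println(lines.size()); // prints "3"
    }
}
```

Note that the list from Arrays.asList still allows set() on existing positions; only operations that change its size, such as add() and remove(), throw the exception.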


Context Notes:
This post came out of my work with DiffUtils (com.googlecode.java-diff-utils): taking lines of text from an original String, detecting the Deltas, and then reapplying the deltas to rebuild the edited String, verified with a JUnit test.
This stackoverflow question gave me the answer under "For a Mutable List".