voxmedia/hashdiff

Name: hashdiff

Owner: Vox Media

Description: HashDiff is a Ruby library to compute the smallest difference between two hashes

Forked from: liufengyun/hashdiff

Created: 2017-07-17 17:51:57.0

Updated: 2017-08-29 02:00:01.0

Pushed: 2017-07-17 18:54:20.0

Homepage: null

Size: 170

Language: Ruby

README

HashDiff

HashDiff is a ruby library to compute the smallest difference between two hashes.

It also supports comparing two arrays.

HashDiff does not monkey-patch any existing class. All features are contained inside the HashDiff module.

Docs: Documentation

Why HashDiff?

Given two Hashes A and B, sometimes you face the question: what's the smallest modification that can be made to change A into B?

An algorithm that answers this question has to do the following:

HashDiff answers the question above using an opinionated approach:

Usage

To use the gem, add the following to your Gemfile:

gem 'hashdiff'

Quick Start

Diff

Two simple hashes:

a = {a:3, b:2}
b = {}

diff = HashDiff.diff(a, b)
diff.should == [['-', 'a', 3], ['-', 'b', 2]]

More complex hashes:

a = {a:{x:2, y:3, z:4}, b:{x:3, z:45}}
b = {a:{y:3}, b:{y:3, z:30}}

diff = HashDiff.diff(a, b)
diff.should == [['-', 'a.x', 2], ['-', 'a.z', 4], ['-', 'b.x', 3], ['~', 'b.z', 45, 30], ['+', 'b.y', 3]]
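To make the shape of the change set concrete, here is a minimal plain-Ruby sketch of a recursive hash comparison that emits the same ['-' / '+' / '~', path, value] entries. The helper name naive_diff is hypothetical and is not part of HashDiff; the sketch assumes nested Hashes only and treats Arrays as opaque values, so it does not reflect how the library handles arrays or similarity.

```ruby
# Illustrative sketch only: `naive_diff` is a hypothetical helper, not HashDiff's API.
# Assumption: values are nested Hashes or scalars; Arrays are compared as opaque values.
def naive_diff(a, b, prefix = nil, delimiter = '.')
  changes = []
  (a.keys | b.keys).each do |key|
    # Build the delimited compound key, e.g. "a.x".
    path = prefix ? "#{prefix}#{delimiter}#{key}" : key.to_s
    if !b.key?(key)
      changes << ['-', path, a[key]]             # key removed
    elsif !a.key?(key)
      changes << ['+', path, b[key]]             # key added
    elsif a[key].is_a?(Hash) && b[key].is_a?(Hash)
      changes.concat(naive_diff(a[key], b[key], path, delimiter))
    elsif a[key] != b[key]
      changes << ['~', path, a[key], b[key]]     # value changed: old, then new
    end
  end
  changes
end

a = { a: { x: 2, y: 3, z: 4 }, b: { x: 3, z: 45 } }
b = { a: { y: 3 }, b: { y: 3, z: 30 } }
changes = naive_diff(a, b)
```

For these two inputs the sketch produces the five entries listed above; on other inputs its ordering and results may differ from the library's.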

Arrays in hashes:

a = {a:[{x:2, y:3, z:4}, {x:11, y:22, z:33}], b:{x:3, z:45}}
b = {a:[{y:3}, {x:11, z:33}], b:{y:22}}

diff = HashDiff.best_diff(a, b)
diff.should == [['-', 'a[0].x', 2], ['-', 'a[0].z', 4], ['-', 'a[1].y', 22], ['-', 'b.x', 3], ['-', 'b.z', 45], ['+', 'b.y', 22]]

Patch

patch example:

a = {a: 3}
b = {a: {a1: 1, a2: 2}}

diff = HashDiff.diff(a, b)
HashDiff.patch!(a, diff).should == b

unpatch example:

a = [{a: 1, b: 2, c: 3, d: 4, e: 5}, {x: 5, y: 6, z: 3}, 1]
b = [1, {a: 1, b: 2, c: 3, e: 5}]

diff = HashDiff.diff(a, b) # diffing two arrays is OK
HashDiff.unpatch!(b, diff).should == a
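Patching and unpatching can be pictured as replaying a change set forwards or backwards. The following is a rough plain-Ruby sketch of that idea. The helpers locate, apply_changes and revert_changes are hypothetical names, not HashDiff functions, and the sketch assumes symbol keys, dot-delimited paths and no arrays.

```ruby
# Illustrative sketch only: these helpers are hypothetical, not HashDiff's API.
# Assumptions: symbol keys, '.'-delimited paths, nested Hashes without Arrays.
def locate(hash, path)
  *parents, leaf = path.split('.').map(&:to_sym)
  # Walk down to the Hash that directly contains the leaf key.
  container = parents.reduce(hash) { |node, key| node[key] ||= {} }
  [container, leaf]
end

def apply_changes(hash, changes)
  changes.each do |op, path, first, second|
    container, leaf = locate(hash, path)
    case op
    when '-' then container.delete(leaf)
    when '+' then container[leaf] = first    # for '+', `first` is the added value
    when '~' then container[leaf] = second   # for '~', `second` is the new value
    end
  end
  hash
end

def revert_changes(hash, changes)
  # Undo the entries in reverse order, swapping the meaning of '+' and '-'.
  changes.reverse_each do |op, path, first, _second|
    container, leaf = locate(hash, path)
    case op
    when '-' then container[leaf] = first    # restore the removed value
    when '+' then container.delete(leaf)
    when '~' then container[leaf] = first    # restore the old value
    end
  end
  hash
end

original = { a: { x: 2, y: 3 }, b: { z: 45 } }
changes  = [['-', 'a.x', 2], ['~', 'b.z', 45, 30], ['+', 'b.y', 3]]

patched  = apply_changes(Marshal.load(Marshal.dump(original)), changes)
restored = revert_changes(Marshal.load(Marshal.dump(patched)), changes)
```

The deep copies via Marshal keep original untouched, which makes the round trip easy to check: patched reflects the change set, and restored equals original again.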

Options

There are six options available: :delimiter, :similarity, :strict, :numeric_tolerance, :strip and :case_insensitive.

:delimiter

You can specify :delimiter to be something other than the default dot. For example:

a = {a:{x:2, y:3, z:4}, b:{x:3, z:45}}
b = {a:{y:3}, b:{y:3, z:30}}

diff = HashDiff.diff(a, b, :delimiter => '\t')
diff.should == [['-', 'a\tx', 2], ['-', 'a\tz', 4], ['-', 'b\tx', 3], ['~', 'b\tz', 45, 30], ['+', 'b\ty', 3]]

:similarity

In cases where you have similar hash objects in arrays, you can pass a custom value for :similarity instead of the default 0.8. This is interpreted as a ratio of similarity (default is 80% similar, whereas :similarity => 0.5 would look for at least a 50% similarity).
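As a rough intuition for what a similarity ratio means, the sketch below scores two hashes by the fraction of keys whose values match and compares that score against a threshold. This is an assumption made for illustration only: pair_similarity and similar_enough? are hypothetical helpers, and HashDiff's actual similarity measure may be computed differently.

```ruby
# Illustrative sketch only: NOT HashDiff's similarity formula.
# Assumption: similarity = matching key/value pairs / all keys seen in either hash.
def pair_similarity(a, b)
  all_keys = a.keys | b.keys
  return 1.0 if all_keys.empty?
  matching = all_keys.count { |k| a.key?(k) && b.key?(k) && a[k] == b[k] }
  matching.to_f / all_keys.size
end

def similar_enough?(a, b, threshold = 0.8)
  pair_similarity(a, b) >= threshold
end

old_item = { id: 1, name: 'boat', color: 'red',  size: 3 }
new_item = { id: 1, name: 'boat', color: 'blue', size: 3 }

score   = pair_similarity(old_item, new_item)        # 3 of 4 pairs match
strict  = similar_enough?(old_item, new_item)        # default threshold 0.8
relaxed = similar_enough?(old_item, new_item, 0.5)   # lower threshold
```

With three of four pairs matching, the two items fall below the 0.8 threshold but pass at 0.5, which is the kind of trade-off the :similarity option controls.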

:strict

The :strict option, which defaults to true, specifies whether numeric types are compared on type as well as value. By default, a Fixnum (Integer in Ruby 2.4 and later) will never be equal to a Float (e.g. 4 != 4.0). Setting :strict to false makes the comparison looser (e.g. 4 == 4.0).
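The distinction is easy to reproduce in plain Ruby, where == compares numbers by value across numeric classes. The sketch below shows one way a strict comparison could be expressed; values_equal? is a hypothetical helper written for this illustration, not the library's code.

```ruby
# Illustrative sketch only: `values_equal?` is a hypothetical helper.
def values_equal?(a, b, strict: true)
  if strict && a.is_a?(Numeric) && b.is_a?(Numeric)
    # Strict: the numeric classes must match as well as the values.
    a.class == b.class && a == b
  else
    # Loose: rely on Ruby's ==, which compares numbers by value.
    a == b
  end
end

strict_result = values_equal?(4, 4.0)                 # Integer vs Float
loose_result  = values_equal?(4, 4.0, strict: false)
same_type     = values_equal?(4, 4)
```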

:numeric_tolerance

The :numeric_tolerance option allows for a small numeric tolerance.

a = {x:5, y:3.75, z:7}
b = {x:6, y:3.76, z:7}

diff = HashDiff.diff(a, b, :numeric_tolerance => 0.1)
diff.should == [["~", "x", 5, 6]]
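The effect of a tolerance can be sketched in plain Ruby as an absolute-difference check. The helper within_tolerance? is a hypothetical name, and treating the tolerance as an inclusive absolute bound is an assumption made for this illustration.

```ruby
# Illustrative sketch only: `within_tolerance?` is a hypothetical helper.
# Assumption: the tolerance is an inclusive bound on the absolute difference.
def within_tolerance?(a, b, tolerance)
  return a == b unless a.is_a?(Numeric) && b.is_a?(Numeric)
  (a - b).abs <= tolerance
end

a = { x: 5, y: 3.75, z: 7 }
b = { x: 6, y: 3.76, z: 7 }

# Keys whose values differ by more than the tolerance.
changed = a.keys.reject { |key| within_tolerance?(a[key], b[key], 0.1) }
```

Only x differs by more than 0.1, so it is the only key reported, which mirrors the single '~' entry in the example above.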

:strip

The :strip option strips all strings before comparing.

a = {x:5, s:'foo '}
b = {x:6, s:'foo'}

diff = HashDiff.diff(a, b, :numeric_tolerance => 0.1, :strip => true)
diff.should == [["~", "x", 5, 6]]

:case_insensitive

The :case_insensitive option makes string comparisons ignore case.

a = {x:5, s:'FooBar'}
b = {x:6, s:'foobar'}

diff = HashDiff.diff(a, b, :numeric_tolerance => 0.1, :case_insensitive => true)
diff.should == [["~", "x", 5, 6]]
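Both string options amount to normalizing the two strings before they are compared. A minimal plain-Ruby sketch of that idea follows; strings_equal? is a hypothetical helper, and using downcase for case folding is an assumption of this sketch rather than a statement about the library's internals.

```ruby
# Illustrative sketch only: `strings_equal?` is a hypothetical helper.
def strings_equal?(a, b, strip: false, case_insensitive: false)
  return a == b unless a.is_a?(String) && b.is_a?(String)
  if strip
    # Remove leading and trailing whitespace from both sides.
    a = a.strip
    b = b.strip
  end
  if case_insensitive
    # Assumption: simple downcasing is enough for the comparison.
    a = a.downcase
    b = b.downcase
  end
  a == b
end

plain    = strings_equal?('foo ', 'foo')
stripped = strings_equal?('foo ', 'foo', strip: true)
folded   = strings_equal?('FooBar', 'foobar', case_insensitive: true)
```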

Specifying a custom comparison method

It's possible to specify how the values of a key should be compared.

a = {a:'car', b:'boat', c:'plane'}
b = {a:'bus', b:'truck', c:' plan'}

diff = HashDiff.diff(a, b) do |path, obj1, obj2|
  case path
  when  /a|b|c/
    obj1.length == obj2.length
  end
end

diff.should == [['~', 'b', 'boat', 'truck']]

The comparison block is yielded |path, obj1, obj2|, where path is the key (or delimited compound key) of the value being compared. When elements of an array are compared, the path takes the form array[*]. For example:

a = {a:'car', b:['boat', 'plane'] }
b = {a:'bus', b:['truck', ' plan'] }

diff = HashDiff.diff(a, b) do |path, obj1, obj2|
  case path
  when 'b[*]'
    obj1.length == obj2.length
  end
end

diff.should == [["~", "a", "car", "bus"], ["~", "b[1]", "plane", " plan"], ["-", "b[0]", "boat"], ["+", "b[0]", "truck"]]

When a comparison block is given, it takes priority over the other specified options. If the block returns a value other than true or false, the two values are compared using the other specified options.
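The priority rule can be sketched as a three-way decision: a true or false from the block is final, and anything else falls through to the default comparison. In the plain-Ruby sketch below, compare_with_fallback is a hypothetical helper and the plain == fallback merely stands in for the option-based comparison.

```ruby
# Illustrative sketch only: `compare_with_fallback` is a hypothetical helper.
# The `a == b` fallback stands in for the option-based comparison.
def compare_with_fallback(path, a, b, &block)
  if block
    verdict = block.call(path, a, b)
    # Only an explicit true or false from the block is treated as a decision.
    return verdict if verdict == true || verdict == false
  end
  a == b
end

# Compare by length for paths a, b and c; return nil for everything else.
same_length = lambda do |path, obj1, obj2|
  case path
  when /a|b|c/ then obj1.length == obj2.length
  end
end

decided_equal    = compare_with_fallback('a', 'car', 'bus', &same_length)
decided_unequal  = compare_with_fallback('b', 'boat', 'truck', &same_length)
fallback_equal   = compare_with_fallback('d', 1, 1, &same_length)
fallback_unequal = compare_with_fallback('d', 1, 2, &same_length)
```

For path d the lambda returns nil, so the result comes from the fallback comparison rather than from the block.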

Sorting arrays before comparison

An order difference alone between two arrays can create too many diffs to be useful. Consider sorting them prior to diffing.

a = {a:'car', b:['boat', 'plane'] }
b = {a:'car', b:['plane', 'boat'] }

HashDiff.diff(a, b) # => [["+", "b[0]", "plane"], ["-", "b[2]", "plane"]]

b[:b].sort!

HashDiff.diff(a, b) # => []
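When arrays are nested at several levels, sorting each one by hand becomes tedious. A possible approach is a small recursive helper that returns a sorted copy of the whole structure before diffing. The deep_sort name is hypothetical, and sorting by inspect is an assumption chosen so that mixed element types do not raise; it is only appropriate when element order carries no meaning.

```ruby
# Illustrative sketch only: `deep_sort` is a hypothetical helper, not part of HashDiff.
# Assumption: array order is not meaningful, so sorting loses no information.
def deep_sort(value)
  case value
  when Hash
    value.each_with_object({}) { |(key, val), out| out[key] = deep_sort(val) }
  when Array
    # Sort by the inspected form so mixed types can still be ordered.
    value.map { |element| deep_sort(element) }.sort_by(&:inspect)
  else
    value
  end
end

a = { a: 'car', b: ['boat', 'plane'] }
b = { a: 'car', b: ['plane', 'boat'] }

sorted_a = deep_sort(a)
sorted_b = deep_sort(b)
```

The sorted copies are equal, so diffing them would report no changes, while a and b themselves are left unmodified.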

License

HashDiff is distributed under the MIT-LICENSE.


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.