Merge pull request #273 from seperman/dev
5.6.0
seperman authored Oct 13, 2021
2 parents d82d4fc + 4c6fe59 commit 06faa82
Showing 34 changed files with 871 additions and 157 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/main.yaml
@@ -12,7 +12,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: [3.6, 3.7, 3.8, 3.9]
python-version: [3.7, 3.8, 3.9, "3.10"]
architecture: ["x64"]

steps:
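
The new `"3.10"` matrix entry is quoted, most likely because YAML reads an unquoted `3.10` as the float `3.1`, which would request Python 3.1 rather than 3.10. The snippet below is only a sketch of that numeric round-trip in plain Python; it does not invoke a YAML parser or the CI runner.

```python
# Mimic what happens when a version such as 3.10 is read as a number
# instead of a string. This is plain Python, not a YAML parser.
unquoted = float("3.10")   # numeric interpretation drops the trailing zero
quoted = "3.10"            # string interpretation keeps the version intact

print(str(unquoted))  # 3.1
print(quoted)         # 3.10
```

Versions such as `3.7` survive the numeric reading unchanged, which is presumably why only the new entry needed quotes.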
10 changes: 7 additions & 3 deletions AUTHORS.md
@@ -1,6 +1,6 @@
Authors:
# Authors

Authors in order of the contributions:
Authors in order of the timeline of their contributions:

- [Sep Dehpour (Seperman)](http://www.zepworks.com)
- [Victor Hahn Castell](http://hahncastell.de) for the tree view and major contributions:
@@ -36,4 +36,8 @@ Authors in order of the contributions:
- Tim Klein [timjklein36](https://github.com/timjklein36) for retaining the order of multiple dictionary items added via Delta.
- Wilhelm Schürmann[wbsch](https://github.com/wbsch) for fixing the typo with yml files.
- [lyz-code](https://github.com/lyz-code) for adding support for regular expressions in DeepSearch and strict_checking feature in DeepSearch.
- [dtorres-sf](https://github.com/dtorres-sf)for adding the option for custom compare function
- [dtorres-sf](https://github.com/dtorres-sf) for adding the option for custom compare function
- Tony Wang [Tony-Wang](https://github.com/Tony-Wang) for bugfix: verbose_level==0 should disable values_changes.
- Sun Ao [eggachecat](https://github.com/eggachecat) for adding custom operators.
- Sun Ao [eggachecat](https://github.com/eggachecat) for adding ignore_order_func.
- [SlavaSkvortsov](https://github.com/SlavaSkvortsov) for fixing unprocessed key error.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -1,5 +1,6 @@
# DeepDiff Change log

- v5-6-0: Adding custom operators, and ignore_order_func. Bugfix: verbose_level==0 should disable values_changes. Bugfix: unprocessed key error.
- v5-5-0: adding iterable_compare_func for DeepDiff, adding output_format of list for path() in tree view.
- v5-4-0: adding strict_checking for numbers in DeepSearch.
- v5-3-0: add support for regular expressions in DeepSearch.
84 changes: 64 additions & 20 deletions README.md
@@ -1,4 +1,4 @@
# DeepDiff v 5.5.0
# DeepDiff v 5.6.0

![Downloads](https://img.shields.io/pypi/dm/deepdiff.svg?style=flat)
![Python Versions](https://img.shields.io/pypi/pyversions/deepdiff.svg?style=flat)
@@ -18,21 +18,65 @@ Tested on Python 3.6+ and PyPy3.

**NOTE: The last version of DeepDiff to work on Python 3.5 was DeepDiff 5-0-2**

- [Documentation](https://zepworks.com/deepdiff/5.5.0/)
- [Documentation](https://zepworks.com/deepdiff/5.6.0/)

## What is new?

Deepdiff 5.5.0 comes with regular expressions in the DeepSearch and grep modules:
DeepDiff 5-6-0 allows you to pass custom operators.

```python
>>> from deepdiff import grep
>>> from pprint import pprint
>>> obj = ["something here", {"long": "somewhere", "someone": 2, 0: 0, "somewhere": "around"}]
>>> ds = obj | grep("some.*", use_regexp=True)
{ 'matched_paths': ["root[1]['someone']", "root[1]['somewhere']"],
'matched_values': ['root[0]', "root[1]['long']"]}
>>> from deepdiff import DeepDiff
>>> from deepdiff.operator import BaseOperator
>>> class CustomClass:
...     def __init__(self, d: dict, l: list):
...         self.dict = d
...         self.dict['list'] = l
...
>>>
>>> custom1 = CustomClass(d=dict(a=1, b=2), l=[1, 2, 3])
>>> custom2 = CustomClass(d=dict(c=3, d=4), l=[1, 2, 3, 2])
>>> custom3 = CustomClass(d=dict(a=1, b=2), l=[1, 2, 3, 4])
>>>
>>>
>>> class ListMatchOperator(BaseOperator):
...     def give_up_diffing(self, level, diff_instance):
...         if set(level.t1.dict['list']) == set(level.t2.dict['list']):
...             return True
...
>>>
>>> DeepDiff(custom1, custom2, custom_operators=[
... ListMatchOperator(types=[CustomClass])
... ])
{}
>>>
>>>
>>> DeepDiff(custom2, custom3, custom_operators=[
... ListMatchOperator(types=[CustomClass])
... ])
{'dictionary_item_added': [root.dict['a'], root.dict['b']], 'dictionary_item_removed': [root.dict['c'], root.dict['d']], 'values_changed': {"root.dict['list'][3]": {'new_value': 4, 'old_value': 2}}}
>>>

```

**New in 5-6-0: Dynamic ignore order function**

Ignoring order only when a certain word is in the path:

```python
>>> from deepdiff import DeepDiff
>>> t1 = {'a': [1, 2], 'b': [3, 4]}
>>> t2 = {'a': [2, 1], 'b': [4, 3]}
>>> DeepDiff(t1, t2, ignore_order=True)
{}
>>> def ignore_order_func(level):
...     return 'a' in level.path()
...
>>> DeepDiff(t1, t2, ignore_order=True, ignore_order_func=ignore_order_func)
{'values_changed': {"root['b'][0]": {'new_value': 4, 'old_value': 3}, "root['b'][1]": {'new_value': 3, 'old_value': 4}}}

```


## Installation

### Install from PyPi:
@@ -66,13 +110,13 @@ Note: if you want to use DeepDiff via commandline, make sure to run `pip install

DeepDiff gets the difference of 2 objects.

> - Please take a look at the [DeepDiff docs](https://zepworks.com/deepdiff/5.5.0/diff.html)
> - The full documentation of all modules can be found on <https://zepworks.com/deepdiff/5.5.0/>
> - Please take a look at the [DeepDiff docs](https://zepworks.com/deepdiff/5.6.0/diff.html)
> - The full documentation of all modules can be found on <https://zepworks.com/deepdiff/5.6.0/>
> - Tutorials and posts about DeepDiff can be found on <https://zepworks.com/tags/deepdiff/>
## A few Examples

> Note: This is just a brief overview of what DeepDiff can do. Please visit <https://zepworks.com/deepdiff/5.5.0/> for full documentation.
> Note: This is just a brief overview of what DeepDiff can do. Please visit <https://zepworks.com/deepdiff/5.6.0/> for full documentation.
### List difference ignoring order or duplicates

@@ -276,8 +320,8 @@ Example:
```


> - Please take a look at the [DeepDiff docs](https://zepworks.com/deepdiff/5.5.0/diff.html)
> - The full documentation can be found on <https://zepworks.com/deepdiff/5.5.0/>
> - Please take a look at the [DeepDiff docs](https://zepworks.com/deepdiff/5.6.0/diff.html)
> - The full documentation can be found on <https://zepworks.com/deepdiff/5.6.0/>

# Deep Search
@@ -309,17 +353,17 @@ And you can pass all the same kwargs as DeepSearch to grep too:
{'matched_paths': {"root['somewhere']": 'around'}, 'matched_values': {"root['long']": 'somewhere'}}
```

> - Please take a look at the [DeepSearch docs](https://zepworks.com/deepdiff/5.5.0/dsearch.html)
> - The full documentation can be found on <https://zepworks.com/deepdiff/5.5.0/>
> - Please take a look at the [DeepSearch docs](https://zepworks.com/deepdiff/5.6.0/dsearch.html)
> - The full documentation can be found on <https://zepworks.com/deepdiff/5.6.0/>
# Deep Hash
(New in v4-0-0)

DeepHash is designed to give you hash of ANY python object based on its contents even if the object is not considered hashable!
DeepHash is supposed to be deterministic in order to make sure 2 objects that contain the same data, produce the same hash.

> - Please take a look at the [DeepHash docs](https://zepworks.com/deepdiff/5.5.0/deephash.html)
> - The full documentation can be found on <https://zepworks.com/deepdiff/5.5.0/>
> - Please take a look at the [DeepHash docs](https://zepworks.com/deepdiff/5.6.0/deephash.html)
> - The full documentation can be found on <https://zepworks.com/deepdiff/5.6.0/>
Let's say you have a dictionary object.

@@ -367,8 +411,8 @@ Which you can write as:
At first it might seem weird why DeepHash(obj)[obj] but remember that DeepHash(obj) is a dictionary of hashes of all other objects that obj contains too.


> - Please take a look at the [DeepHash docs](https://zepworks.com/deepdiff/5.5.0/deephash.html)
> - The full documentation can be found on <https://zepworks.com/deepdiff/5.5.0/>
> - Please take a look at the [DeepHash docs](https://zepworks.com/deepdiff/5.6.0/deephash.html)
> - The full documentation can be found on <https://zepworks.com/deepdiff/5.6.0/>

# Using DeepDiff in unit tests
2 changes: 1 addition & 1 deletion deepdiff/__init__.py
@@ -1,6 +1,6 @@
"""This module offers the DeepDiff, DeepSearch, grep, Delta and DeepHash classes."""
# flake8: noqa
__version__ = '5.5.0'
__version__ = '5.6.0'
import logging

if __name__ == '__main__':
7 changes: 3 additions & 4 deletions deepdiff/deephash.py
@@ -13,9 +13,8 @@
from deepdiff.base import Base
logger = logging.getLogger(__name__)

UNPROCESSED_KEY = 'unprocessed'
UNPROCESSED_KEY = object()

RESERVED_DICT_KEYS = {UNPROCESSED_KEY}
EMPTY_FROZENSET = frozenset()

INDEX_VS_ATTRIBUTE = ('[%s]', '.%s')
@@ -185,7 +184,7 @@ def _getitem(hashes, obj, extract_index=0):
except KeyError:
raise KeyError(HASH_LOOKUP_ERR_MSG.format(obj)) from None

if isinstance(obj, strings) and obj in RESERVED_DICT_KEYS:
if obj is UNPROCESSED_KEY:
extract_index = None

return result_n_count if extract_index is None else result_n_count[extract_index]
@@ -229,7 +228,7 @@ def _get_objects_to_hashes_dict(self, extract_index=0):
"""
result = dict_()
for key, value in self.hashes.items():
if key in RESERVED_DICT_KEYS:
if key is UNPROCESSED_KEY:
result[key] = value
else:
result[key] = value[extract_index]
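
The switch from `UNPROCESSED_KEY = 'unprocessed'` to `UNPROCESSED_KEY = object()` appears to address the "unprocessed key error" named in the changelog: a reserved string key can collide with user data that happens to contain the same string, whereas a unique sentinel object cannot. The following is a minimal sketch of that collision with a plain dict; `build_table` is a hypothetical helper and is not DeepHash's actual storage logic.

```python
RESERVED_STRING = 'unprocessed'   # old-style reserved key (a plain string)
SENTINEL = object()               # new-style reserved key (a unique object)


def build_table(items, reserved_key):
    """Hypothetical helper: map each item to a (hash, count) pair and
    keep a list of unprocessed items under reserved_key."""
    table = {reserved_key: []}
    for item in items:
        table[item] = (hash(item), 1)
    return table


user_items = ['unprocessed', 'other']

with_string = build_table(user_items, RESERVED_STRING)
with_sentinel = build_table(user_items, SENTINEL)

# The user's string overwrote the reserved slot, so the list is gone.
print(type(with_string[RESERVED_STRING]).__name__)   # tuple
# The sentinel slot is untouched and the user's string has its own entry.
print(with_sentinel[SENTINEL], len(with_sentinel))   # [] 3
```

With a sentinel, the reserved entry is recognised by identity (`key is UNPROCESSED_KEY`), which is the check the updated lines use.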
64 changes: 59 additions & 5 deletions deepdiff/diff.py
@@ -28,14 +28,13 @@
RemapDict, ResultDict, TextResult, TreeResult, DiffLevel,
DictRelationship, AttributeRelationship,
SubscriptableIterableRelationship, NonSubscriptableIterableRelationship,
SetRelationship, NumpyArrayRelationship)
SetRelationship, NumpyArrayRelationship, CUSTOM_FIELD)
from deepdiff.deephash import DeepHash, combine_hashes_lists
from deepdiff.base import Base
from deepdiff.lfucache import LFUCache, DummyLFU

logger = logging.getLogger(__name__)


MAX_PASSES_REACHED_MSG = (
'DeepDiff has reached the max number of passes of {}. '
'You can possibly get more accurate results by increasing the max_passes parameter.')
@@ -120,6 +119,7 @@ def __init__(self,
hasher=None,
hashes=None,
ignore_order=False,
ignore_order_func=None,
ignore_type_in_groups=None,
ignore_string_type_changes=False,
ignore_numeric_type_changes=False,
@@ -140,6 +140,7 @@ def __init__(self,
verbose_level=1,
view=TEXT_VIEW,
iterable_compare_func=None,
custom_operators=None,
_original_type=None,
_parameters=None,
_shared_parameters=None,
@@ -156,12 +157,17 @@ def __init__(self,
"cutoff_distance_for_pairs, cutoff_intersection_for_pairs, log_frequency_in_sec, cache_size, "
"cache_tuning_sample_size, get_deep_distance, group_by, cache_purge_level, "
"math_epsilon, iterable_compare_func, _original_type, "
"ignore_order_func, custom_operators, "
"_parameters and _shared_parameters.") % ', '.join(kwargs.keys()))

if _parameters:
self.__dict__.update(_parameters)
else:
self.custom_operators = custom_operators or []
self.ignore_order = ignore_order

self.ignore_order_func = ignore_order_func or (lambda *_args, **_kwargs: ignore_order)

ignore_type_in_groups = ignore_type_in_groups or []
if numbers == ignore_type_in_groups or numbers in ignore_type_in_groups:
ignore_numeric_type_changes = True
@@ -327,6 +333,24 @@ def _report_result(self, report_type, level):
level.report_type = report_type
self.tree[report_type].add(level)

    def custom_report_result(self, report_type, level, extra_info=None):
        """
        Add a detected change to the reference-style result dictionary.
        report_type will be added to level.
        (We'll create the text-style report from there later.)
        :param report_type: A well defined string key describing the type of change.
                            Examples: "set_item_added", "values_changed"
        :param level: A DiffLevel object describing the objects in question in their
                      before-change and after-change object structure.
        :param extra_info: A dict that describes this result
        :rtype: None
        """

        if not self._skip_this(level):
            level.report_type = report_type
            level.additional[CUSTOM_FIELD] = extra_info
            self.tree[report_type].add(level)

@staticmethod
def _dict_from_slots(object):
def unmangle(attribute):
@@ -556,7 +580,7 @@ def _iterables_subscriptable(t1, t2):

def _diff_iterable(self, level, parents_ids=frozenset(), _original_type=None):
"""Difference of iterables"""
if self.ignore_order:
if self.ignore_order_func(level):
self._diff_iterable_with_deephash(level, parents_ids, _original_type=_original_type)
else:
self._diff_iterable_in_order(level, parents_ids, _original_type=_original_type)
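
Taken together, the `__init__` and `_diff_iterable` hunks replace a single boolean with a per-level decision: when no `ignore_order_func` is supplied, a lambda returns the global `ignore_order` flag for every level. The sketch below illustrates that fallback and dispatch in isolation. `Level`, `make_ignore_order_func` and `lists_equal` are hypothetical stand-ins, and the `Counter` comparison is a simplification of DeepDiff's hash-based path (it does not mirror DeepDiff's handling of duplicates).

```python
from collections import Counter


class Level:
    """Hypothetical stand-in for a diff level; only path() is modelled."""

    def __init__(self, path, t1, t2):
        self._path, self.t1, self.t2 = path, t1, t2

    def path(self):
        return self._path


def make_ignore_order_func(ignore_order=False, ignore_order_func=None):
    # Same fallback as in the diff: without a user function, every level
    # simply receives the global ignore_order flag.
    return ignore_order_func or (lambda *_args, **_kwargs: ignore_order)


def lists_equal(level, decide):
    if decide(level):
        # Order-insensitive comparison (simplified stand-in).
        return Counter(level.t1) == Counter(level.t2)
    return level.t1 == level.t2


only_a = make_ignore_order_func(ignore_order_func=lambda level: "'a'" in level.path())
always = make_ignore_order_func(ignore_order=True)

level_a = Level("root['a']", [1, 2], [2, 1])
level_b = Level("root['b']", [3, 4], [4, 3])

print(lists_equal(level_a, only_a))  # True
print(lists_equal(level_b, only_a))  # False
print(lists_equal(level_b, always))  # True
```

Because the fallback is itself a callable, every call site can use `self.ignore_order_func(level)` without checking whether the user supplied a function.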
@@ -1133,7 +1157,7 @@ def _diff_numpy_array(self, level, parents_ids=frozenset()):
# which means numpy module needs to be available. So np can't be None.
raise ImportError(CANT_FIND_NUMPY_MSG) # pragma: no cover

if not self.ignore_order:
if not self.ignore_order_func(level):
# fast checks
if self.significant_digits is None:
if np.array_equal(level.t1, level.t2):
@@ -1159,7 +1183,7 @@ def _diff_numpy_array(self, level, parents_ids=frozenset()):
dimensions = len(shape)
if dimensions == 1:
self._diff_iterable(level, parents_ids, _original_type=_original_type)
elif self.ignore_order:
elif self.ignore_order_func(level):
# arrays are converted to python lists so that certain features of DeepDiff can apply on them easier.
# They will be converted back to Numpy at their final dimension.
level.t1 = level.t1.tolist()
@@ -1219,6 +1243,33 @@ def _auto_off_cache(self):
self._stats[DISTANCE_CACHE_ENABLED] = False
self.progress_logger('Due to minimal cache hits, {} is disabled.'.format('distance cache'))

    def _use_custom_operator(self, level):
        """
        For each level we check all custom operators.
        If any one of them was a match for the level, we run the diff of the operator.
        If the operator returned True, the operator must have decided these objects should not
        be compared anymore. It might have already reported their results.
        In that case the report will appear in the final results of this diff.
        Otherwise basically the 2 objects in the level are being omitted from the results.
        """

        # used = False

        # for operator in self.custom_operators:
        #     if operator.match(level):
        #         prevent_default = operator.diff(level, self)
        #         used = True if prevent_default is None else prevent_default

        # return used

        for operator in self.custom_operators:
            if operator.match(level):
                prevent_default = operator.give_up_diffing(level=level, diff_instance=self)
                if prevent_default:
                    return True

        return False

def _diff(self, level, parents_ids=frozenset(), _original_type=None):
"""
The main diff method
@@ -1232,6 +1283,9 @@ def _diff(self, level, parents_ids=frozenset(), _original_type=None):
if self._count_diff() is StopIteration:
return

if self._use_custom_operator(level):
return

if level.t1 is level.t2:
return
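
The `_use_custom_operator` and `_diff` hunks above amount to a short-circuit: each operator that matches a level may return a truthy value from `give_up_diffing`, and in that case the default comparison for that level is skipped. Here is a minimal, self-contained sketch of that control flow. `Level`, `TypeOperator` and `TinyDiff` are hypothetical stand-ins written for illustration; they are not DeepDiff's `DiffLevel`, `BaseOperator` or `DeepDiff` classes.

```python
class Level:
    """Hypothetical stand-in for a diff level holding the two compared objects."""

    def __init__(self, t1, t2):
        self.t1, self.t2 = t1, t2


class TypeOperator:
    """Hypothetical operator that matches on type, as in the README example."""

    def __init__(self, types):
        self.types = tuple(types)

    def match(self, level):
        return isinstance(level.t1, self.types) and isinstance(level.t2, self.types)

    def give_up_diffing(self, level, diff_instance):
        # Treat two lists as equal when they hold the same set of values.
        return set(level.t1) == set(level.t2)


class TinyDiff:
    def __init__(self, custom_operators=None):
        self.custom_operators = custom_operators or []
        self.reported = []

    def _use_custom_operator(self, level):
        for operator in self.custom_operators:
            if operator.match(level):
                if operator.give_up_diffing(level=level, diff_instance=self):
                    return True
        return False

    def diff(self, level):
        if self._use_custom_operator(level):
            return  # an operator claimed this level; skip the default diff
        if level.t1 != level.t2:
            self.reported.append((level.t1, level.t2))


d = TinyDiff(custom_operators=[TypeOperator(types=[list])])
d.diff(Level([1, 2, 3], [3, 2, 1, 1]))  # operator gives up: nothing reported
d.diff(Level([1, 2], [1, 5]))           # operator declines: default diff runs
print(d.reported)  # [([1, 2], [1, 5])]
```

An operator that returns `None` or `False` leaves the level to the default logic, so operators only need to handle the cases they care about.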
