This repository has been archived by the owner on Nov 30, 2022. It is now read-only.

Add support for Array Access Requests in MongoDB [#146][#147] #194

Merged Feb 16, 2022 (26 commits; changes shown from 25 commits)

Commits
534f0ed
Start expanding initial mongo data populated and the mongo example da…
pattisdr Jan 28, 2022
cce348f
Instead of using pandas json normalize to only retrieve data categori…
pattisdr Feb 2, 2022
3e0629f
Cache the inputs that were used to locate records on each collection …
pattisdr Feb 2, 2022
06b0445
Use inputs into the collection to potentially filter array data to on…
pattisdr Feb 2, 2022
16cc23f
Update `to_dask_input_data` to consolidate array outputs and outputs …
pattisdr Feb 3, 2022
e0669e5
Don't delete empty dicts out of arrays -
pattisdr Feb 3, 2022
d0f1cd6
Uncomment more complex mongo dataset annotations and add more detaile…
pattisdr Feb 3, 2022
9854c34
Add draft of build_incoming_refined_target_paths to recursively expan…
pattisdr Feb 6, 2022
48bb90b
First draft of adding method to remove embedded documents and array i…
pattisdr Feb 6, 2022
47849cd
Before filtering results on data category, first run "remove_unmatche…
pattisdr Feb 7, 2022
ff87403
First cleanup round, reorganize/rename newly added methods, breaking …
pattisdr Feb 7, 2022
ec4ca99
Add more detailed tests on inner components of refine_target_path and…
pattisdr Feb 7, 2022
4c8131b
Add some more integration tests on accessing array data in mongo, end…
pattisdr Feb 8, 2022
e035810
Refactor so "filter_element_match" happens after each access request,…
pattisdr Feb 8, 2022
9c9e650
Move filter_data_categories back to "graph_task.py" so the diff is ea…
pattisdr Feb 8, 2022
fa39289
Update quickstart expected results from access request to include nes…
pattisdr Feb 8, 2022
71edc58
Give the postgres and mongo connection configs write access in postma…
pattisdr Feb 8, 2022
054543e
Add logging for debugging purposes.
pattisdr Feb 8, 2022
2f047cd
Add guides for working with complex data (move nested object docs, an…
pattisdr Feb 8, 2022
61244ce
Merge branch 'main' into fidesops_146_array_support
pattisdr Feb 8, 2022
8efe79b
Fix failing test - (CI is incorrectly showing green).
pattisdr Feb 10, 2022
fba51ec
Merge remote-tracking branch 'ethyca/main' into fidesops_146_array_su…
pattisdr Feb 11, 2022
741c9dd
Address bug related to type coercion. Cast incoming values to the cor…
pattisdr Feb 15, 2022
ff66c7b
Rename remove_empty_objects to remove_empty_containers and use it to …
pattisdr Feb 15, 2022
e30a76e
Turn all customer ids into integers on mongo collections.
pattisdr Feb 15, 2022
9269ce1
Rephrase docstring.
pattisdr Feb 15, 2022
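
Several commits above mention pruning empty results (for example, the rename of `remove_empty_objects` to `remove_empty_containers`). The commit messages are truncated and the actual implementation is not shown in this view, so the following is only a minimal sketch of the general idea. The function name `prune_empty_containers` and its exact behavior are assumptions for illustration, not the project's code.

```python
from typing import Any


def _is_empty_container(value: Any) -> bool:
    """True only for empty dicts and empty lists (not for 0, "", or None)."""
    return isinstance(value, (dict, list)) and len(value) == 0


def prune_empty_containers(value: Any) -> Any:
    """Hypothetical helper: recursively drop empty dicts and lists.

    Children are cleaned first, so a container that only held empty
    containers becomes empty itself and is dropped by its parent.
    """
    if isinstance(value, dict):
        cleaned = {key: prune_empty_containers(item) for key, item in value.items()}
        return {key: item for key, item in cleaned.items() if not _is_empty_container(item)}
    if isinstance(value, list):
        cleaned_items = [prune_empty_containers(item) for item in value]
        return [item for item in cleaned_items if not _is_empty_container(item)]
    return value


# Invented sample record, not taken from the example dataset.
record = {"a": {}, "b": [{}, {"c": []}, {"d": 1}], "e": "x", "f": {"g": {"h": []}}}
pruned = prune_empty_containers(record)
```

Scalars such as `0` or `""` are deliberately kept, because only empty dicts and lists count as "empty containers" in this sketch.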
99 changes: 71 additions & 28 deletions README.md
@@ -55,35 +55,78 @@ information for Jane across tables in both the postgres and mongo databases.

```json
{
-  "postgres_example_test_dataset:customer": [
-    {
-      "email": "[email protected]",
-      "name": "Jane Customer"
-    }
-  ],
-  "postgres_example_test_dataset:address": [
-    {
-      "city": "Example Mountain",
-      "house": 1111,
-      "state": "TX",
-      "street": "Example Place",
-      "zip": "54321"
-    }
-  ],
-  "postgres_example_test_dataset:payment_card": [
-    {
-      "ccn": 373719391,
-      "code": 222,
-      "name": "Example Card 3"
-    }
-  ],
-  "mongo_test:customer_details": [
-    {
-      "gender": "female",
-      "birthday": "1990-02-28T00:00:00"
-    }
-  ]
+  "mongo_test:flights": [
+    {
+      "passenger_information": {
+        "full_name": "Jane Customer"
+      }
+    }
+  ],
+  "mongo_test:payment_card": [
+    {
+      "ccn": "987654321",
+      "name": "Example Card 2",
+      "code": "123"
+    }
+  ],
+  "postgres_example_test_dataset:address": [
+    {
+      "zip": "54321",
+      "street": "Example Place",
+      "state": "TX",
+      "city": "Example Mountain",
+      "house": 1111
+    }
+  ],
+  "mongo_test:customer_details": [
+    {
+      "birthday": "1990-02-28T00:00:00",
+      "gender": "female",
+      "children": [
+        "Erica Example"
+      ]
+    }
+  ],
+  "postgres_example_test_dataset:customer": [
+    {
+      "email": "[email protected]",
+      "name": "Jane Customer"
+    }
+  ],
+  "postgres_example_test_dataset:payment_card": [
+    {
+      "ccn": 373719391,
+      "name": "Example Card 3",
+      "code": 222
+    }
+  ],
+  "mongo_test:employee": [
+    {
+      "email": "[email protected]",
+      "name": "Jane Employee"
+    }
+  ],
+  "mongo_test:conversations": [
+    {
+      "thread": [
+        {
+          "chat_name": "Jane C"
+        }
+      ]
+    },
+    {
+      "thread": [
+        {
+          "chat_name": "Jane C"
+        },
+        {
+          "chat_name": "Jane C"
+        }
+      ]
+    }
+  ]
}

```
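
In the updated expected results, each `thread` array under `mongo_test:conversations` contains only entries for Jane. One of the commit messages describes using the inputs into a collection to filter array data. The sketch below illustrates that idea in isolation. It is not the project's implementation: the helper `keep_matching_elements`, the comment IDs, and the second chat participant are all invented for the example.

```python
from typing import Any, Dict, Iterable


def keep_matching_elements(
    record: Dict[str, Any], array_field: str, key: str, wanted: Iterable[Any]
) -> Dict[str, Any]:
    """Hypothetical helper: keep only array elements whose `key` is in `wanted`."""
    wanted_values = list(wanted)
    filtered = dict(record)  # shallow copy so the caller's record is not mutated
    filtered[array_field] = [
        element
        for element in record.get(array_field, [])
        if isinstance(element, dict) and element.get(key) in wanted_values
    ]
    return filtered


# Invented document: the comment IDs and "Other User" are assumptions.
conversation = {
    "thread": [
        {"comment": "com_0001", "chat_name": "Jane C"},
        {"comment": "com_0002", "chat_name": "Other User"},
    ]
}

# Suppose "com_0001" is the only comment ID that located this record.
result = keep_matching_elements(conversation, "thread", "comment", ["com_0001"])
```

A later step that keeps only fields in the requested data categories would then reduce each remaining element to `chat_name`, which matches the shape of the results above.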

### Step Four: Create an Erasure Policy
157 changes: 157 additions & 0 deletions data/dataset/mongo_example_test_dataset.yml
@@ -25,6 +25,8 @@ dataset:
fidesops_meta:
data_type: string
- name: workplace_info
fidesops_meta:
data_type: object
fields:
- name: employer
fidesops_meta:
@@ -33,8 +35,51 @@ dataset:
data_categories: [ user.provided.identifiable.job_title ]
fidesops_meta:
data_type: string
- name: direct_reports
data_categories: [ user.provided.identifiable.name ]
fidesops_meta:
data_type: string[]
- name: emergency_contacts
fidesops_meta:
data_type: object[]
fields:
- name: name
data_categories: [ user.provided.identifiable.name ]
fidesops_meta:
data_type: string
- name: relationship
fidesops_meta:
data_type: string
- name: phone
data_categories: [ user.provided.identifiable.contact.phone_number ]
fidesops_meta:
data_type: string
- name: children
data_categories: [ user.provided.identifiable.childrens ]
fidesops_meta:
data_type: string[]
- name: travel_identifiers
fidesops_meta:
data_type: string[]
data_categories: [system.operations]
- name: comments
fidesops_meta:
data_type: object[]
fields:
- name: comment_id
fidesops_meta:
data_type: string
references:
- dataset: mongo_test
field: conversations.thread.comment
direction: to
- name: internal_customer_profile
fields:
- name: _id
data_categories: [ system.operations ]
fidesops_meta:
primary_key: True
data_type: object_id
- name: customer_identifiers
fields:
- name: internal_id
@@ -44,6 +89,11 @@ dataset:
- dataset: mongo_test
field: customer_feedback.customer_information.internal_customer_id
direction: from
- name: derived_emails
data_categories: [user.derived]
fidesops_meta:
data_type: string[]
identity: email
- name: derived_interests
data_categories: [ user.derived ]
fidesops_meta:
@@ -81,3 +131,110 @@ dataset:
data_categories: [ user.provided.nonidentifiable ]
fidesops_meta:
data_type: string
- name: flights
fields:
- name: _id
data_categories: [ system.operations ]
fidesops_meta:
primary_key: True
data_type: object_id
- name: passenger_information
fields:
- name: passenger_ids
fidesops_meta:
data_type: string[]
references:
- dataset: mongo_test
field: customer_details.travel_identifiers
direction: from
- name: full_name
data_categories: [user.provided.identifiable.name]
fidesops_meta:
data_type: string
- name: flight_no
- name: date
- name: pilots
data_categories: [ system.operations ]
fidesops_meta:
data_type: string[]
- name: plane
data_categories: [ system.operations ]
fidesops_meta:
data_type: integer
- name: conversations
fidesops_meta:
data_type: object[]
fields:
- name: thread
fields:
- name: comment
fidesops_meta:
data_type: string
- name: message
fidesops_meta:
data_type: string
- name: chat_name
data_categories: [ user.provided.identifiable.name ]
fidesops_meta:
data_type: string
- name: employee
fields:
- name: email
data_categories: [ user.provided.identifiable.contact.email ]
fidesops_meta:
identity: email
data_type: string
- name: id
data_categories: [ user.derived.identifiable.unique_id ]
fidesops_meta:
primary_key: True
references:
- dataset: mongo_test
field: flights.pilots
direction: from
- name: name
data_categories: [ user.provided.identifiable.name ]
fidesops_meta:
data_type: string
- name: aircraft
fields:
- name: _id
data_categories: [ system.operations ]
fidesops_meta:
primary_key: True
data_type: object_id
- name: planes
data_categories: [ system.operations ]
fidesops_meta:
data_type: string[]
references:
- dataset: mongo_test
field: flights.plane
direction: from
- name: model
data_categories: [ system.operations ]
fidesops_meta:
data_type: string
- name: payment_card
fields:
- name: billing_address_id
data_categories: [ system.operations ]
- name: ccn
data_categories: [ user.provided.identifiable.financial.account_number ]
fidesops_meta:
references:
- dataset: mongo_test
field: conversations.thread.ccn
direction: from
- name: code
data_categories: [ user.provided.identifiable.financial ]
- name: customer_id
data_categories: [ user.derived.identifiable.unique_id ]
- name: id
data_categories: [ system.operations ]
fidesops_meta:
primary_key: True
- name: name
data_categories: [ user.provided.identifiable.financial ]
- name: preferred
data_categories: [ user.provided.nonidentifiable ]
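
The dataset annotations above reference fields by dotted paths that pass through arrays, such as `conversations.thread.comment` (an array of embedded documents) and `customer_details.travel_identifiers` (a `string[]` field). As a rough illustration of what resolving such a path against a single document involves, here is a minimal sketch. The helper `collect_values` is hypothetical, it takes the path within one document (without the leading collection name), and it flattens only one level of list nesting per step.

```python
from typing import Any, List


def collect_values(document: Any, path: str) -> List[Any]:
    """Hypothetical helper: gather every value at a dotted path, fanning out over lists."""
    current: List[Any] = [document]
    for key in path.split("."):
        next_level: List[Any] = []
        for node in current:
            # A list of embedded documents contributes each of its elements.
            candidates = node if isinstance(node, list) else [node]
            for candidate in candidates:
                if isinstance(candidate, dict) and key in candidate:
                    next_level.append(candidate[key])
        current = next_level

    # A leaf that is itself an array (e.g. a string[] field) yields its elements.
    flattened: List[Any] = []
    for value in current:
        if isinstance(value, list):
            flattened.extend(value)
        else:
            flattened.append(value)
    return flattened


# Invented sample documents shaped like the annotated collections.
conversation = {"thread": [{"comment": "com_0001"}, {"comment": "com_0002", "chat_name": "Jane C"}]}
flight = {"passenger_information": {"passenger_ids": ["p_1", "p_2"], "full_name": "Jane Customer"}}

comments = collect_values(conversation, "thread.comment")
passenger_ids = collect_values(flight, "passenger_information.passenger_ids")
```

Values collected this way could then serve as inputs when querying the referenced collection. How the project itself performs this traversal is not shown in this diff.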