monitor containers for modified entities #51

Merged 27 commits on Feb 15, 2022
Changes from 14 commits

Commits
95f9329 fix parameter names and formatting to cli (hhunterzinck, Feb 1, 2022)
9d94a32 monitor contents of a container (hhunterzinck, Feb 1, 2022)
a3aaf37 remove test for unimplemented function (hhunterzinck, Feb 1, 2022)
4e38847 remove print statement (hhunterzinck, Feb 6, 2022)
f9eec2c add tests for modified containers (hhunterzinck, Feb 6, 2022)
abc17f5 remove unused modules (hhunterzinck, Feb 6, 2022)
254a5dc update help messages in cli (hhunterzinck, Feb 6, 2022)
dd141fd lint using black (hhunterzinck, Feb 6, 2022)
013836f patch all function calls in modified container function (hhunterzinck, Feb 7, 2022)
34dd50b refactor _traverse (hhunterzinck, Feb 8, 2022)
057c768 Update synapsemonitor/monitor.py (hhunterzinck, Feb 9, 2022)
dbae3a4 rewrite boolean for traverse and extend (hhunterzinck, Feb 9, 2022)
9bbb4fa lint with black (hhunterzinck, Feb 9, 2022)
90d186a fix filtering by type at end of traverse (hhunterzinck, Feb 9, 2022)
917dcca only traverse containers to limit recursive calls (hhunterzinck, Feb 9, 2022)
5c57b35 lint with black (hhunterzinck, Feb 9, 2022)
ae142cf check entity type before adding if not traversing (hhunterzinck, Feb 9, 2022)
336f82c add tests for traverse on folder and project (hhunterzinck, Feb 9, 2022)
cda348a correct id versus parentId specification of synapse entity objects (hhunterzinck, Feb 9, 2022)
de717e6 remove unnecessary syn.get (hhunterzinck, Feb 10, 2022)
5511064 add test for traverse folder with file child (hhunterzinck, Feb 10, 2022)
07bcd2f lint with black (hhunterzinck, Feb 10, 2022)
01f0076 change default include types in traverse function (hhunterzinck, Feb 14, 2022)
a280afc remove check for child of type project (hhunterzinck, Feb 14, 2022)
ed6dbd5 create wrapper function for traverse (hhunterzinck, Feb 15, 2022)
fcad2db update tests for new wrapper function (hhunterzinck, Feb 15, 2022)
fd26fa2 lint with black (hhunterzinck, Feb 15, 2022)
16 changes: 8 additions & 8 deletions synapsemonitor/__main__.py
@@ -20,13 +20,13 @@ def monitor_cli(syn, args):

     email_action = actions.EmailAction(
         syn=syn,
-        syn_id=args.view_id,
+        syn_id=args.synapse_id,
         email_subject=args.email_subject,
         users=args.users,
         days=args.days,
     )
     action_results = actions.synapse_action(action_cls=email_action)
-    filesdf = pd.DataFrame({"syn_id": action_results})
+    ids = pd.DataFrame({"syn_id": action_results})
     if args.output:
         pd.DataFrame(ids).to_csv(args.output, index=False, header=False)
     else:
@@ -48,11 +48,12 @@ def create_file_view_cli(syn, args):
 def build_parser():
     """Set up argument parser and returns"""
     parser = argparse.ArgumentParser(
-        description="Checks for new/modified entities in a Fileview. A Synapse "
-        "Fileview can be created to allow users to track entities in a Project "
-        "or Folder. For more information, head to "
+        description="Checks for new/modified Synapse entities. "
+        "If a Project or Folder entity is specified, that entity and all its contents will be monitored. "
+        "A Synapse File View can be created to allow users to track the contents of Projects "
+        "or Folders with many entities more efficiently. For more information, head to "
         "https://docs.synapse.org/articles/views.html. You can use the "
-        "`create-file-view` function provided in this package to create a File View."
+        "`create` function provided in this package to create a File View."
     )
     parser.add_argument(
         "-c",
@@ -75,7 +76,7 @@ def build_parser():
         "synapse_id",
         metavar="synapse_id",
         type=str,
-        help="Synapse ID of fileview to be monitored.",
+        help="Synapse ID of entity to be monitored.",
     )
     parser_monitor.add_argument(
         "--users",
@@ -107,7 +108,6 @@ def build_parser():
     parser_monitor.add_argument(
         "--log",
         "-l",
-        metavar="level",
         type=str,
         choices=["debug", "info", "warning", "error"],
         default="error",
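Removing `metavar="level"` from the `--log` option changes how argparse renders it: when `choices` is set and no `metavar` is given, the help text lists the allowed values themselves. A minimal standalone sketch of that behaviour, using a throwaway parser (the `demo` program name is made up, and this is not the package's real `build_parser`):

```python
import argparse

# Throwaway parser for illustration only; not synapsemonitor's build_parser().
parser = argparse.ArgumentParser(prog="demo")
parser.add_argument(
    "--log",
    "-l",
    type=str,
    choices=["debug", "info", "warning", "error"],
    default="error",
)

help_text = parser.format_help()
args_default = parser.parse_args([])
args_debug = parser.parse_args(["-l", "debug"])

# With no metavar, argparse shows the choices in braces in the help output.
print(help_text)
print(args_default.log, args_debug.log)
```

A value outside `choices` makes `parse_args` exit with a usage error, so the option needs no extra validation downstream.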
51 changes: 50 additions & 1 deletion synapsemonitor/monitor.py
@@ -113,6 +113,47 @@ def _find_modified_entities_file(syn: Synapse, syn_id: str, days: int = 1) -> li
     return []


+def _traverse(
+    syn: Synapse,
+    synid_root: str,
+    include_types: typing.List = ["folder", "file", "project"],
+) -> list:
+    """Traverse Synapse entity hierarchy to gather all descendant
+    entities of a root entity.
+    Args:
+        syn: Synapse connection
+        synid_root: Synapse ID of root entity.
+        include_types: Must be a list of entity types (ie. [“folder”,”file”])
+            which can be found here:
+            http://docs.synapse.org/rest/org/sagebionetworks/repo/model/EntityType.html
+    Returns:
+        List of descendant Synapse IDs and root Synapse ID
+    """
+
+    synid_desc = []
+
+    # full traverse depends on examining folder entities, even if not requested
+    include_types_mod = set(include_types)
+    include_types_mod.add("folder")
+    include_types_mod = list(include_types_mod)
+
+    synid_children = syn.getChildren(parent=synid_root, includeTypes=include_types_mod)
+    for synid_child in synid_children:
+        synid_desc.extend(
+            _traverse(
Review comment from thomasyu888 (Member), Feb 9, 2022:

I just remembered, in Python there is a max recursion limit - limit of calling a function recursively ~1000 times. So you may get into trouble here because if you do a getChildren on a folder with 1000 files, you're actually calling _traverse 1000 times. We may want to do something similar as synapseutils.walk where we only call _traverse when it is a container.

Then again - for those cases, we should encourage the use of file views...

Reply from the PR author (Collaborator):

Added a check for the entity type of the children to limit recursive calls. Now, if a child entity is neither a folder nor a project, the child entity is added directly to the descendant list after checking that it is a requested entity type.
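For reference, the container-only idea discussed in this thread can be sketched without recursion at all. Python's recursion limit bounds call depth rather than the total number of calls, so it is nesting depth that would exhaust it, but an explicit stack avoids the question either way. This is a rough illustration under stated assumptions: `TREE`, `get_children`, and `traverse_iterative` are hypothetical names, the in-memory dictionary stands in for `syn.getChildren`, and the simplified `type` values are not the strings Synapse actually returns.

```python
from typing import Dict, List, Sequence

# Hypothetical stand-in for a Synapse hierarchy: parent ID -> child records.
TREE: Dict[str, List[dict]] = {
    "syn1": [
        {"id": "syn2", "type": "folder"},
        {"id": "syn3", "type": "file"},
    ],
    "syn2": [
        {"id": "syn4", "type": "file"},
        {"id": "syn5", "type": "folder"},
    ],
    "syn5": [],
}

CONTAINER_TYPES = {"folder", "project"}


def get_children(parent: str) -> List[dict]:
    """Hypothetical replacement for syn.getChildren(parent=...)."""
    return TREE.get(parent, [])


def traverse_iterative(
    root_id: str,
    root_type: str,
    include_types: Sequence[str] = ("folder", "file", "project"),
) -> List[str]:
    """Collect the root and its descendants whose type is in include_types."""
    wanted = set(include_types)
    found = []
    stack = [(root_id, root_type)]
    while stack:
        entity_id, entity_type = stack.pop()
        if entity_type in wanted:
            found.append(entity_id)
        # Only containers can hold children, so only they are expanded.
        if entity_type in CONTAINER_TYPES:
            for child in get_children(entity_id):
                stack.append((child["id"], child["type"]))
    return found


all_ids = traverse_iterative("syn1", "project")
file_ids = traverse_iterative("syn1", "project", include_types=["file"])
print(sorted(all_ids), sorted(file_ids))
```

Folders are always expanded even when only files are requested, which mirrors the diff's approach of adding `"folder"` to the queried types. The tuple default also avoids sharing one mutable list between calls.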

+                syn=syn, synid_root=synid_child["id"], include_types=include_types
+            )
+        )
+
+    # only requested entity types
+    entity = syn.get(synid_root, downloadFile=False)
thomasyu888 marked this conversation as resolved.
+    entity_type = entity["concreteType"].split(".")[-1].lower().replace("entity", "")
+    if entity_type in include_types:
+        synid_desc.append(synid_root)
+
+    return synid_desc
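The `entity_type` expression in `_traverse` is easier to follow when applied to a few sample strings. The sketch below wraps it in a hypothetical helper, `entity_type_from_concrete`, and runs it on illustrative `concreteType` values. The strings are assumptions about what Synapse returns and are not taken from this PR.

```python
def entity_type_from_concrete(concrete_type: str) -> str:
    """Same expression as in the diff: last dotted segment, lowercased, 'entity' removed."""
    return concrete_type.split(".")[-1].lower().replace("entity", "")


# Illustrative inputs only; assumed to resemble Synapse concreteType values.
samples = (
    "org.sagebionetworks.repo.model.FileEntity",
    "org.sagebionetworks.repo.model.Folder",
    "org.sagebionetworks.repo.model.Project",
    "org.sagebionetworks.repo.model.table.EntityView",
)
examples = {name: entity_type_from_concrete(name) for name in samples}
for name, short in examples.items():
    print(f"{name} -> {short}")
```

Because `str.replace` removes every occurrence of `"entity"`, a class name that begins with it, as in the last sample, is shortened to `view`. That may matter if view types are ever passed in `include_types`.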


 def _find_modified_entities_container(syn: Synapse, syn_id: str, days: int = 1) -> list:
     """Finds entities in a folder or project modified in the past N number of days

@@ -124,7 +165,15 @@ def _find_modified_entities_container(syn: Synapse, syn_id: str, days: int = 1)
     Returns:
         List of synapse ids
     """
-    raise NotImplementedError("Projects and folders not supported yet")
+    syn_id_mod = []
+    syn_id_children = _traverse(syn, syn_id)
+
+    for syn_id_child in syn_id_children:
+        syn_id_res = _find_modified_entities_file(syn, syn_id_child, days)
+        if syn_id_res:
+            syn_id_mod.extend(syn_id_res)
+
+    return syn_id_mod


 def _force_update_view(syn: Synapse, view_id: str):
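The new body of `_find_modified_entities_container` applies a per-entity check to every ID returned by the traversal and concatenates the results. The body of `_find_modified_entities_file` is not shown in this diff, so the sketch below substitutes a hypothetical check against made-up timestamps. `MODIFIED_ON`, `NOW`, `find_modified_file`, and `find_modified_container` are all illustrative names, not part of synapsemonitor.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List

# Made-up modification times keyed by Synapse ID (not real data).
NOW = datetime(2022, 2, 15, 12, 0, tzinfo=timezone.utc)
MODIFIED_ON: Dict[str, datetime] = {
    "syn3": NOW - timedelta(hours=3),
    "syn4": NOW - timedelta(days=5),
}


def find_modified_file(syn_id: str, days: int = 1, now: datetime = NOW) -> List[str]:
    """Hypothetical per-entity check: [syn_id] if modified within `days`, else []."""
    modified = MODIFIED_ON.get(syn_id)
    if modified is not None and now - modified <= timedelta(days=days):
        return [syn_id]
    return []


def find_modified_container(child_ids: List[str], days: int = 1) -> List[str]:
    """Aggregate per-entity results, mirroring the loop in the diff."""
    modified = []
    for child_id in child_ids:
        # Extending with an empty list is a no-op, so no truthiness guard is needed.
        modified.extend(find_modified_file(child_id, days))
    return modified


recent = find_modified_container(["syn3", "syn4"], days=1)
week = find_modified_container(["syn3", "syn4"], days=7)
print(recent, week)
```

The loop issues one check per descendant, which is why the CLI description recommends a File View for containers with many entities.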