Moderate `update`'s destructive behaviour
A documented but still-surprising behaviour of
`codelists update` is that it deletes files in the
codelists directory that aren't referenced by
codelists.txt.

Running a `codelists update` after a `codelists
add` is beneficial as it makes "adding" a codelist
(i.e. making a codelist available to be added to
your study) a single operation.

This change minimises the most harmful/surprising
side effects of `update` when it is implicitly
called after `add`, whilst leaving potentially
beneficial ones intact (updating dm+d lists
affected by "acute" codelist rot/VMP id updates),
and all other codelists unaffected.

An explicit call to `update` maintains the
previous behaviour.
Jongmassey committed Jan 16, 2025
1 parent 11065c7 commit e421132
Showing 1 changed file with 9 additions and 4 deletions.
13 changes: 9 additions & 4 deletions opensafely/codelists.py
@@ -90,10 +90,10 @@ def add(codelist_url, codelists_dir=None):
         if not f.readlines()[-1].endswith("\n"):
             line = "\n" + line
         f.write(line + "\n")
-    update(codelists_dir)
+    update(codelists_dir, delete_unknown=False)
 
 
-def update(codelists_dir=None):
+def update(codelists_dir=None, delete_unknown=True):
     if not codelists_dir:
         codelists_dir = Path.cwd() / CODELISTS_DIR
     codelists = parse_codelist_file(codelists_dir)
@@ -123,8 +123,13 @@ def update(codelists_dir=None):
     preserve_download_dates(manifest, manifest_file)
     manifest_file.write_text(json.dumps(manifest, indent=2))
     for file in old_files - new_files:
-        print(f"Deleting {file.name}")
-        file.unlink()
+        if delete_unknown:
+            print(f"Deleting {file.name}")
+            file.unlink()
+        else:
+            print(
+                f"Unknown file in codelists directory {file.name}\nwould be deleted by 'opensafely codelists update'"
+            )
     return True
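
The effect of the new `delete_unknown` flag can be tried in isolation with a small standalone sketch. This is not the code from `opensafely/codelists.py`: the helper `prune_unknown_files`, its `known_names` argument, the warning wording, and the `demo` function are all hypothetical names made up for illustration. The assumption is only that unknown files are computed as a set difference, as in the diff above.

```python
import tempfile
from pathlib import Path


def prune_unknown_files(directory, known_names, delete_unknown=True):
    """Hypothetical helper mirroring the loop in the diff above.

    Files in `directory` that are not listed in `known_names` are either
    deleted (delete_unknown=True) or only reported (delete_unknown=False).
    Returns the sorted names of the unknown files that were found.
    """
    directory = Path(directory)
    old_files = {p for p in directory.iterdir() if p.is_file()}
    new_files = {directory / name for name in known_names}
    unknown = sorted(old_files - new_files)
    for file in unknown:
        if delete_unknown:
            print(f"Deleting {file.name}")
            file.unlink()
        else:
            # Illustrative wording, not the message used by the real CLI.
            print(f"Unknown file {file.name} kept; an explicit update would delete it")
    return [file.name for file in unknown]


def demo():
    """Run both modes against a throwaway directory and report what happened."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        (tmp / "known.csv").write_text("code\n1\n")
        (tmp / "stray.csv").write_text("code\n2\n")
        # Implicit call, as made after `add`: warn only.
        warned = prune_unknown_files(tmp, ["known.csv"], delete_unknown=False)
        still_there = (tmp / "stray.csv").exists()
        # Explicit call: previous behaviour, the stray file is removed.
        deleted = prune_unknown_files(tmp, ["known.csv"])
        gone = not (tmp / "stray.csv").exists()
        kept = (tmp / "known.csv").exists()
    return warned, still_there, deleted, gone, kept
```

Keeping `delete_unknown=True` as the default means existing callers of `update` see no change, and only the call made from `add` opts out of deletion.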

