ensure all migration logic was moved to mojp-dbs-pipelines #204
The following code was deleted. Need to review the logic it had and ensure it was all moved to pipelines (if needed).

bhs_api/item.py

    def update_es(collection_name, doc, is_new, es_index_name=None, es=None, app=None):
        app = current_app if not app else app
        es_index_name = app.es_data_db_index_name if not es_index_name else es_index_name
        es = app.es if not es else es
        # index only the docs that are publicly available
        if doc_show_filter(collection_name, doc):
            body = deepcopy(doc)
            # adjust attributes for elasticsearch
            if collection_name == "persons":
                body["person_id"] = body.get("id", body.get("ID"))
                body["first_name_lc"] = body["name_lc"][0]
                body["last_name_lc"] = body["name_lc"][1]
                # maps all known SEX values to normalized gender value
                body["gender"] = {"F": "F", "M": "M",
                                  None: "U", "": "U", "U": "U", "?": "U", "P": "U"}[body.get("SEX", "").strip()]
            # _id field is internal to mongo
            if '_id' in body:
                del body['_id']
            # id field has special meaning in elasticsearch
            if 'id' in body:
                del body['id']
            if "thumbnail" in body and "data" in body["thumbnail"]:
                # no need to have thumbnail data in elasticsearch
                # TODO: ensure we only store and use thumbnail from filesystem
                del body["thumbnail"]["data"]
            # persons collection gets a fake header to support searching
            if collection_name == "persons":
                name = " ".join(body["name"]) if isinstance(body["name"], list) else body["name"]
                body["Header"] = {"En": name, "He": name}
            # elasticsearch uses the header for completion field
            # this field does not support empty values, so we put a string with space here
            # this is most likely wrong, but works for now
            # TODO: figure out how to handle it properly, maybe items without header are invalid?
            if "Header" in body:
                for lang in ("He", "En"):
                    if body["Header"].get(lang) is None:
                        body["Header"][lang] = '_'
            if collection_name == "persons":
                doc_id = "{}_{}_{}".format(body["tree_num"], body["tree_version"], body["person_id"])
            else:
                doc_id = get_doc_id(collection_name, body)
            if is_new:
                uuids_to_str(body)
                es.index(index=es_index_name, doc_type=collection_name, id=doc_id, body=body)
                return True, "indexed successfully (inserted)"
            else:
                es.update(index=es_index_name, doc_type=collection_name, id=doc_id, body=body)
                return True, "indexed successfully (updated)"
        else:
            return True, "item should not be shown - so not indexed"
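One detail that may be worth checking during the review: in the deleted code the `None` key of the gender mapping can never be reached, because `.strip()` is called on the value first. A document with `SEX` explicitly set to `None` would raise `AttributeError`, and a value outside the mapping would raise `KeyError`. Below is a minimal sketch of a tolerant equivalent, assuming the pipelines should map anything unrecognized to "U". The name `normalize_gender` is hypothetical and does not come from either repository.

```python
def normalize_gender(sex):
    """Map a raw SEX value to "F", "M" or "U" (unknown)."""
    # Hypothetical helper: None, blank strings and any unlisted value
    # (including "?", "P" and "U" from the original mapping) become "U".
    value = (sex or "").strip()
    return value if value in ("F", "M") else "U"
```

For the values listed in the original mapping this returns the same result. It differs only where the original would have raised.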
following should be deleted:
Need to review the migration code and logic and ensure it was all moved to pipelines.
Make sure we didn't miss any logic.
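To help with that review, here is a rough sketch of the document transformations from the deleted `update_es`, pulled out as a pure function with no Flask or Elasticsearch dependency, so its output could be compared against what the pipelines produce. The name `expected_es_body` is hypothetical. `doc_show_filter`, `get_doc_id` and `uuids_to_str` from the original module are not reproduced, so the returned `doc_id` is `None` for non-person collections.

```python
from copy import deepcopy


def expected_es_body(collection_name, doc):
    """Return (doc_id, body) mirroring the transformations in the deleted update_es."""
    body = deepcopy(doc)  # never mutate the caller's document
    if collection_name == "persons":
        body["person_id"] = body.get("id", body.get("ID"))
        body["first_name_lc"] = body["name_lc"][0]
        body["last_name_lc"] = body["name_lc"][1]
        # same mapping and lookup as the deleted code
        body["gender"] = {"F": "F", "M": "M", None: "U", "": "U",
                          "U": "U", "?": "U", "P": "U"}[body.get("SEX", "").strip()]
    # _id is internal to mongo, id has special meaning in elasticsearch
    body.pop("_id", None)
    body.pop("id", None)
    if "thumbnail" in body and "data" in body["thumbnail"]:
        del body["thumbnail"]["data"]
    if collection_name == "persons":
        # persons get a fake header to support searching
        name = " ".join(body["name"]) if isinstance(body["name"], list) else body["name"]
        body["Header"] = {"En": name, "He": name}
    if "Header" in body:
        # placeholder for missing header languages, as in the deleted code
        for lang in ("He", "En"):
            if body["Header"].get(lang) is None:
                body["Header"][lang] = "_"
    doc_id = None  # assumption: get_doc_id is out of scope for this sketch
    if collection_name == "persons":
        doc_id = "{}_{}_{}".format(body["tree_num"], body["tree_version"], body["person_id"])
    return doc_id, body
```

Running a few sample documents per collection through this function and diffing the result against the corresponding pipeline output would be one way to spot a transformation that was not carried over.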