Right now, the NDC plugin uses the RootKeyMergerStorage class to join documents with duplicate _id (productndc) values.
However, with this method, it seems like a lot of information is duplicated. For example, the query: http://mychem.info/v1/query?q=69168-318&fields=ndc&dotfield=true (using the dotfield parameter helps us see the duplicated data side-by-side)
shows that the values in most of the fields contain the same information.
In the case of the document above, the only fields that contain significantly different values are ndc.listing_record_certified_through and ndc.product_id. Other fields like ndc.proprietaryname and ndc.nonproprietaryname differ only in their capitalization. It was the same case in other documents that I checked manually.
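To check whether this is widespread without inspecting documents by hand, something like the sketch below could classify each field across the duplicated ndc records. It is only an illustration: `classify_fields` is a hypothetical helper, not part of the plugin, and the sample values are made-up placeholders rather than the real data for 69168-318.

```python
from collections import defaultdict


def classify_fields(records):
    """Group field names by how their values vary across duplicate records.

    `records` is a list of flat dicts, e.g. the ndc sub-documents that
    share one productndc. A field present in only one record is reported
    as "identical", since there is nothing to compare it against.
    """
    values = defaultdict(list)
    for rec in records:
        for key, val in rec.items():
            values[key].append(val)

    report = {"identical": [], "case_only": [], "different": []}
    for key, vals in values.items():
        as_text = [str(v) for v in vals]
        if len(set(as_text)) == 1:
            report["identical"].append(key)
        elif len({t.casefold() for t in as_text}) == 1:
            # Same text once capitalization is ignored
            report["case_only"].append(key)
        else:
            report["different"].append(key)
    return report


# Placeholder records: field names come from the issue, values are invented.
records = [
    {"productndc": "69168-318", "proprietaryname": "Example Drug", "product_id": "id-a"},
    {"productndc": "69168-318", "proprietaryname": "EXAMPLE DRUG", "product_id": "id-b"},
]
print(classify_fields(records))
```

Running this over a sample of documents would show how often fields fall into the "identical" or "case_only" groups.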
If this is widespread, I think we should merge the documents using the MergerStorage class, which should result in less duplication.
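As a rough illustration of the difference I have in mind (this is not the actual code of either storage class; `merge_root_key` and `merge_fields` are hypothetical stand-ins, and the sample values are placeholders):

```python
def merge_root_key(existing, new, root_key="ndc"):
    """Keep every sub-document: collect them in a list under the root key."""
    current = existing[root_key]
    if not isinstance(current, list):
        current = [current]
    return {**existing, root_key: current + [new[root_key]]}


def merge_fields(a, b):
    """Merge two sub-documents field by field, storing each distinct value once."""
    merged = dict(a)
    for key, val in b.items():
        if key not in merged:
            merged[key] = val
            continue
        old = merged[key]
        olds = old if isinstance(old, list) else [old]
        if val not in olds:
            # Only genuinely different values turn the field into a list
            merged[key] = olds + [val]
    return merged


# Placeholder documents sharing the same _id
doc1 = {"_id": "69168-318", "ndc": {"productndc": "69168-318", "product_id": "id-a"}}
doc2 = {"_id": "69168-318", "ndc": {"productndc": "69168-318", "product_id": "id-b"}}

# List of two full sub-documents; productndc is stored twice
print(merge_root_key(doc1, doc2)["ndc"])

# One sub-document; productndc is stored once, product_id holds both values
print(merge_fields(doc1["ndc"], doc2["ndc"]))
```

Note that in a field-level merge like this, values that differ only in capitalization (such as ndc.proprietaryname) would still be kept as separate entries unless they are normalized first.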