-
Notifications
You must be signed in to change notification settings - Fork 822
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rvztz/hubspot ingest connector #1760
Conversation
try: | ||
response = get_by_id_method(self.object_id, **kwargs) | ||
except not_found_exception as e: | ||
logger.error(e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want this to throw the not_found_exception
here and let the pipeline logic either raise the error or simply log it there? @ryannikolaidis opinions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I think letting it raise makes sense. Not sure that hiding here brings value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to catch for not_found_exception
to return None if the object is not found so that when we call update_source_metadata
we can set exists=False
if the obj is None. I made some changes to raise a ValueError on get_file
if the object is not found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the point of checking exists is to see if it exists on the remote service. this is ideally just an ls of sorts, not actually relying on fetching content. not possible in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...really that goes for any metadata. ideally doesn't rely on actually fetching the content
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we need to make a call, would be worth seeing it there's a HEAD
version since that doesn't require pulling down the data just to check for existence. I had to do that in the notion connector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that makes sense to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we omit the try-except and wrap this method with the SourceConnectionNetworkError.wrap
decorator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to try-catch for NotFoundException
to return None if the file doesn't exists. When we call this method from update_source_metadata
we set a flag to retrieve only the base properties of the object.
@rvztz ready for a review? probably needs to rebase with latest changes? |
CHANGELOG.md
Outdated
@@ -136,7 +137,7 @@ ocr agent tesseract/paddle in environment variable `OCR_AGENT` for OCRing the en | |||
|
|||
### Features | |||
|
|||
* **Adds element type percent match function** In order to evaluate the element type extracted, we add a function that calculates the matched percentage between two frequency dictionary. | |||
* **Adds element type percent match function** In order to evaluate the element type extracted, we add a function that calculates the matched percentage between two frequency dictionary.s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extra character at the end
…ndleMixin._session_handle` from the init of the class
This pull request includes updated ingest test fixtures. Please review and merge if appropriate. Co-authored-by: rvztz <[email protected]>
…red-IO/unstructured into rvztz/hubspot-ingest-connector
Closes #1843
Ingest connector for HubSpot. Supports:
From each record,
body/
description`information is grabbed. When a title property is available, this is registered at the beggining of the output file. The CLI receives three params:api-token
: Private app token.object-types: One of the noted supported objects in the form of a comma separated list:
calls,products,tasks`custom-properties
: Custom properties to grab information from. Must be in the form<object_type>:<custom_property_id>,<object_type>:<custom_property_id>