-
Notifications
You must be signed in to change notification settings - Fork 4.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🎉 Source Freshdesk: providing the updated_since parameter on the initial sync for Tickets stream #6442
🎉 Source Freshdesk: providing the updated_since parameter on the initial sync for Tickets stream #6442
Conversation
/test connector=source-freshdesk
|
/test connector=source-freshdesk
|
"tickets_updated_since": { | ||
"title": "Tickets updated since", | ||
"description": "By default, only tickets that have been created within the past 30 days will be returned. For older tickets, use this parameter", | ||
"pattern": "^[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z$", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not just "format": "date-time"?
@@ -48,8 +48,10 @@ | |||
|
|||
|
|||
class Client(BaseClient): | |||
def __init__(self, domain, api_key, requests_per_minute: int = None): | |||
self._api = API(domain=domain, api_key=api_key, requests_per_minute=requests_per_minute) | |||
def __init__(self, domain, api_key, requests_per_minute: int = None, tickets_updated_since: str = None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pass array of ticket_ids instead
airbyte-integrations/connectors/source-freshdesk/source_freshdesk/spec.json
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comments
/test connector=source-freshdesk
|
params = {**params, **self._state_params()} | ||
state_params = self._state_params() | ||
if not state_params and self._api.start_date: | ||
params["updated_since"] = pendulum.parse(self._api.start_date) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why don't you parse it in the source?
@@ -326,7 +333,10 @@ def get_tickets( | |||
def read(self, getter: Callable, params: Mapping[str, Any] = None) -> Iterator: | |||
"""Read using getter, patched to respect current state""" | |||
params = params or {} | |||
params = {**params, **self._state_params()} | |||
state_params = self._state_params() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move this logic to state params
@property | ||
def start_date(self): | ||
return self._start_date | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no use
"title": "Tickets updated since", | ||
"description": "By default, only tickets that have been created within the past 30 days will be returned. For older tickets, set start date for syncing", | ||
"format": "date-time", | ||
"examples": ["2020-12-01T00:00:00Z"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
regex to validate datetime format? maybe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comments
@@ -22,6 +22,13 @@ | |||
"title": "Requests per minute", | |||
"type": "integer", | |||
"description": "Number of requests per minute that this source allowed to use." | |||
}, | |||
"start_date": { | |||
"title": "Tickets updated since", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this option is only used for tickets? If so, the title should be Tickets lookback window
and param name tickets_lookback_window
}, | ||
"start_date": { | ||
"title": "Tickets updated since", | ||
"description": "By default, only tickets that have been created within the past 30 days will be returned. For older tickets, set start date for syncing", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"description": "By default, only tickets that have been created within the past 30 days will be returned. For older tickets, set start date for syncing", | |
"description": "By default, only tickets that have been created within the past 30 days will be returned. Set this parameter to override that value", |
@@ -255,6 +258,13 @@ def list(self, fields: Sequence[str] = None) -> Iterator[dict]: | |||
class TicketsAPI(IncrementalStreamAPI): | |||
call_credit = 3 # each include consumes 2 additional credits | |||
|
|||
def _state_params(self) -> Mapping[str, Any]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this applicable only for tickets? why not apply to all incremental streams? (just asking, not sure what the APi supports)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sherifnada , @vitaliizazmic
it is a good question, I can think about
- users that want to sync only specific tickets (because of the size of stream?? mb)
- and users that want to sync everything after a specific date
so both usages are possible, the general start_date
will satisfy the second scenario.
To achieve also first one user will need to sync all streams except tickets in full_refresh
mode and then enable incremental.
So, yes, we can probably introduce start_date
for all incremental streams
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sherifnada @keu I've checked API documentation and Source. Freshdesk API doesn't have ability to filter most of streams. Almost all incremental streams use full refresh syncing and than filter records on our side. But contacts can be filtered by _updated_since, Satisfaction Ratings by created_since, Time Entries by executed_after. Should I implement start_date for this streams?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if they also have a 30 day max by default, then I would say we should implement it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TimeEntries, Contacts and Companies support real increment, I'm not sure about Satisfaction Ratings as it is not clear if records can be updated, it definitely has ts_updated column, but can it be different from ts_created? I don't know.
taking into account that TimeEntries related to tickets I think it worth to make general start_date
config, and not just for tickets
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sherifnada only Tickets has this restriction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
upon discussion with @vitaliizazmic we agreed to add start_date
to the following incremental streams:
- contacts
- tickets
- companies
- satisfaction ratings
the default value is now - 30 days
we will always use updated_since
parameters in tickets streams (because old logic rely on created tickets, not updated, which break dependant streams - conversations)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM
# Since this logic rely not on updated tickets, it can break tickets dependant streams - conversations. | ||
# So updated_since parameter will be always used in tickets streams. And start_date will be used too | ||
# with default value 30 days look back. | ||
self._start_date = pendulum.parse(start_date) if start_date else pendulum.now().add(days=-30) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self._start_date = pendulum.parse(start_date) if start_date else pendulum.now().add(days=-30) | |
self._start_date = pendulum.parse(start_date) if start_date else pendulum.now() - duration(days=30) |
/test connector=source-freshdesk
|
/publish connector=connectors/source-freshdesk
|
What
By default Freshdesk API only retrieves tickets created within the past 30 days since. The connector should provide the updated_since parameter for the Tickets streams on the initial sync. The pagination logic should probably change to always use update_since instead of only after the first 300 pages are retrieved.
Closes #6133
How
Requested feature is already partially implemented, when state is set. Connector uses updated_since and custom pagination. So for solving issue was made changes:
Pre-merge Checklist
airbyte_secret
./gradlew :airbyte-integrations:connectors:<name>:integrationTest
.README.md
bootstrap.md
. See description and examplesdocs/integrations/<source or destination>/<name>.md
including changelog. See changelog example