source-hubspot-native: ignore search results with timestamps more recent than the requested window

The search API is used for requesting a stream of "delayed" records, held back
by a 1 hour horizon, to account for the eventual consistency of the HubSpot APIs
in general.
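
As a rough illustration of the delayed window described above, the upper limit can be thought of as "now minus the horizon". This is only a sketch: `delayed_window`, `DELAY_HORIZON`, and the `cursor` argument are hypothetical names for illustration, not the connector's actual code.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the 1 hour horizon from the commit message, as a constant.
DELAY_HORIZON = timedelta(hours=1)


def delayed_window(cursor, now, horizon=DELAY_HORIZON):
    """Return (since, until) for the delayed stream, or None if nothing is due.

    `cursor` is the timestamp the delayed stream has already read up to.
    """
    until = now - horizon
    if until <= cursor:
        # The held-back window has not advanced past the cursor yet.
        return None
    return cursor, until


now = datetime(2024, 12, 16, 12, 0, tzinfo=timezone.utc)
cursor = datetime(2024, 12, 16, 10, 0, tzinfo=timezone.utc)
window = delayed_window(cursor, now)
```

Here `window` covers 10:00 to 11:00 UTC, so anything modified within the last hour is left for a later pass.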

Sometimes we see that a record in our search result is returned out of order
with respect to its "updated at" timestamp. When this happens, there ends up
being a record with an "updated at" timestamp fairly near the present, and
outside the upper limit that was requested.

We can only speculate as to why this happens. One possibility is that the record
is updated around the same time as we make our search request: it is included in
the search results based on its original timestamp, but we receive the updated
record in the place where the original one should have been.

Currently the connector crashes when this happens, then retries after it is
restarted and eventually makes progress. But we've seen cases where it happens
so often that it limits the connector's progress, and it is generally confusing
for users to see. The strategy is therefore being changed to filter client-side,
excluding records with timestamps more recent than what was requested.
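
A minimal sketch of that client-side filtering, assuming each search result carries a last-modified timestamp. The `Record` type and `drop_too_recent` are hypothetical stand-ins for illustration, not the connector's real types.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Record:
    # Hypothetical stand-in for a search result.
    id: str
    modified: datetime


def drop_too_recent(results, until):
    """Keep only records modified at or before the requested upper limit."""
    return [r for r in results if r.modified <= until]


until = datetime(2024, 12, 16, 11, 0, tzinfo=timezone.utc)
results = [
    Record("a", until - timedelta(minutes=5)),
    Record("b", until + timedelta(minutes=30)),  # outside the requested window
    Record("c", until),
]
kept_ids = [r.id for r in drop_too_recent(results, until)]
```

Record "b" is dropped rather than treated as an error. It is expected to show up again in a later window once the horizon has moved past its timestamp.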

The upper timestamp limit `until` argument to our search API function is
actually optional, and you may be wondering when it would be absent: custom
record types, line items, and products are only obtainable through the search
API, so these records use the search API for the non-delayed stream too. If we
end up getting results out of order in these cases, the connector will continue
to crash and we'll have to deal with that later. I haven't seen the error with
these stream types, so I'm not particularly worried about it right now.
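
To make the difference between the two cases concrete, here is a sketch of the ordering check with an optional `until`. The function `check_ascending` and the `(id, modified)` tuples are assumptions for illustration only.

```python
from datetime import datetime, timedelta, timezone


def check_ascending(results, until=None):
    """Return ids in order, skipping too-recent records only when `until` is set.

    `results` is a list of (id, modified) tuples, expected in ascending order.
    """
    max_updated = None
    kept = []
    for rec_id, modified in results:
        if until is not None and modified > until:
            continue  # later than the requested window: ignore it
        if max_updated is not None and modified < max_updated:
            raise ValueError(f"{rec_id}: {modified} is before {max_updated}")
        max_updated = modified
        kept.append(rec_id)
    return kept


base = datetime(2024, 12, 16, 10, 0, tzinfo=timezone.utc)
results = [
    ("a", base),
    ("b", base + timedelta(hours=2)),  # returned out of order
    ("c", base + timedelta(minutes=10)),
]
with_until = check_ascending(results, until=base + timedelta(hours=1))
```

With `until` set, record "b" is skipped and "a" and "c" pass the check. Calling `check_ascending(results)` without `until` raises on "c", which mirrors the remaining crash case described above.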
williamhbaker committed Dec 16, 2024
1 parent a5b56f4 commit 5f388c2
Showing 1 changed file with 8 additions and 2 deletions.
10 changes: 8 additions & 2 deletions source-hubspot-native/source_hubspot_native/api.py
@@ -555,9 +555,15 @@ async def fetch_search_objects(

for r in result.results:
    this_mod_time = r.properties.hs_lastmodifieddate

    if until and this_mod_time > until:
        log.info(
            "ignoring search result with record modification time that is later than maximum search window",
            {"id": r.id, "this_mod_time": this_mod_time, "until": until},
        )
        continue

    if this_mod_time < max_updated:
        # This should never happen since results are requested in
        # ASCENDING order by the last modified time.
        raise Exception(f"last modified date {this_mod_time} is before {max_updated} for {r.id}")

    max_updated = this_mod_time