Is your feature request related to a problem? Please describe.
As a user of the S3 source with SQS, I have messages/objects that vary in size. This means there is no single optimal visibility timeout for the SQS queue: too small a timeout causes issues with large messages, and too large a timeout can delay processing of files if Data Prepper crashes.
Describe the solution you'd like
Make timely calls to the SQS ChangeMessageVisibility API from the S3 source. This could be an optional parameter for the SQS queue.
source:
  s3:
    sqs:
      visibility_timeout: "dynamic"
The S3 source would be responsible for tracking how long it has been processing a message, and would make an API call if it could not finish processing before the visibility timeout expires. For example, if the visibility timeout of the queue is 2 minutes, and the S3 source pulls a message and finds it won't be able to process it in time, it would call ChangeMessageVisibility to extend the visibility timeout for that message by another 2 minutes. This would continue until the message is fully processed or until the instance of Data Prepper crashes. After a crash the visibility timeout would no longer be extended, so another instance of Data Prepper could pick up the message as intended.
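The renewal loop described above can be sketched roughly as follows. This is only an illustration in Python, not how Data Prepper implements or would implement it: `VisibilityExtender`, `RecordingSqs`, and the `safety_margin_s` parameter are hypothetical names invented here. The only real API assumed is the `change_message_visibility(QueueUrl=..., ReceiptHandle=..., VisibilityTimeout=...)` call of an SQS client such as boto3's. Note that SQS measures the new timeout from the moment of the call, so the sketch renews shortly before expiry with the full timeout value.

```python
import threading


class VisibilityExtender:
    """Keeps one in-flight SQS message hidden while it is being processed.

    `sqs_client` is any object exposing
    change_message_visibility(QueueUrl=, ReceiptHandle=, VisibilityTimeout=).
    All other names here are hypothetical.
    """

    def __init__(self, sqs_client, queue_url, receipt_handle,
                 visibility_timeout_s=120, safety_margin_s=30):
        if safety_margin_s >= visibility_timeout_s:
            raise ValueError("safety margin must be shorter than the timeout")
        self._client = sqs_client
        self._queue_url = queue_url
        self._receipt_handle = receipt_handle
        self._timeout_s = visibility_timeout_s
        # Renew a little before expiry so the message never becomes visible.
        self._interval_s = visibility_timeout_s - safety_margin_s
        self._done = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait returns False on timeout (still processing) and True
        # once processing has finished, which ends the loop.
        while not self._done.wait(self._interval_s):
            self._client.change_message_visibility(
                QueueUrl=self._queue_url,
                ReceiptHandle=self._receipt_handle,
                VisibilityTimeout=self._timeout_s,
            )

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Processing finished (or failed): stop renewing.
        self._done.set()
        self._thread.join()
        return False


class RecordingSqs:
    """Stand-in client that records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def change_message_visibility(self, **kwargs):
        self.calls.append(kwargs)
```

Processing would be wrapped in `with VisibilityExtender(client, queue_url, receipt_handle):`. If the process crashes, the renewal thread dies with it, the last extension expires, and the message becomes visible to other consumers again. SQS also caps the total visibility of a received message at 12 hours, so renewals cannot continue indefinitely.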
Describe alternatives you've considered (Optional)
Defaulting the visibility timeout to a much larger value (maybe even the maximum of 12 hours) and then, if Data Prepper is going to shut down, calling ChangeMessageVisibility with a value of 0 to allow another instance of Data Prepper to immediately
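The shutdown step of this alternative might look something like the sketch below. It is written in Python purely for illustration. `release_in_flight_messages` is a hypothetical helper, and the client is assumed to expose the same `change_message_visibility` call as boto3's SQS client.

```python
def release_in_flight_messages(sqs_client, queue_url, receipt_handles):
    """Make each in-flight message visible again right away.

    Returns the receipt handles whose release failed, so the caller can
    log them; those messages stay hidden until their timeout expires.
    """
    failed = []
    for handle in receipt_handles:
        try:
            # A visibility timeout of 0 returns the message to the queue
            # immediately.
            sqs_client.change_message_visibility(
                QueueUrl=queue_url,
                ReceiptHandle=handle,
                VisibilityTimeout=0,
            )
        except Exception:
            # Best effort during shutdown: keep releasing the rest.
            failed.append(handle)
    return failed
```

A helper like this only runs on a graceful shutdown. After a hard crash nothing resets the timeout, so messages would stay hidden for the full, long default, which is the main drawback of this alternative.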