Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Idempotent utility should consider json attribute ordering when creating HASH keys #638

Closed
walmsles opened this issue Aug 22, 2021 · 7 comments
Labels
bug Something isn't working

Comments

@walmsles
Copy link
Contributor

I was curious about the idempotent implementation and the general property of REST API bodies being un-ordered lists of JSON attributes. A REST API should be resilient to consumers and pragmatic in its approach to reading data supplied by clients calling them.

I created a simple JSON REST API using serverless and Powertools and implemented the Idempotency utility as follows:

import json
import os
from datetime import datetime
from aws_lambda_powertools import Logger, Tracer
from aws_lambda_powertools.utilities.idempotency import DynamoDBPersistenceLayer, idempotent, IdempotencyConfig

logger = Logger()
tracer = Tracer()

table_name = os.environ['TABLE']
persistence_layer = DynamoDBPersistenceLayer(table_name=table_name)
config = IdempotencyConfig(event_key_jmespath="body")

@logger.inject_lambda_context
@tracer.capture_lambda_handler
@idempotent(persistence_store=persistence_layer, config=config)
def handler(event, context):
    body = {
        "message": f"Data Created: {datetime.now()}",
        "input": event['body']
    }

    logger.info('Processing API call - creating the data now!')
    logger.info(event['body'])
    response = {
        "statusCode": 200,
        "body": json.dumps(body)
    }

    return response

I deployed this to AWS and called the API with input as follows:

Test 1

{
    "data": "test message 1",
    "more_data": "more data 1"
}

Test 2
Note JSON attributes are submitted in different order which is OK for a REST API.

{
    "more_data": "more data 1",
    "data": "test message 1"
}

My expectation is that both these payloads will result in the exact same idempotent response since the payloads are essentially the same so far as REST APi conventions go.

The result I actually get is as follows:

Test 1
Result from first call to the API

{
    "message": "Data Created: 2021-08-22 05:14:01.524104",
    "input": "{\n    \"data\": \"test message 1\",\n    \"more_data\": \"more data 1\"\n}"
}

Test 2
Note: This is a different response than the first call to the API - I was expecting the same idempotent response.

{
    "message": "Data Created: 2021-08-22 05:14:42.554149",
    "input": "{\n    \"more_data\": \"more data 1\",\n    \"data\": \"test message 1\"\n}"
}

What were you trying to accomplish?
Implement Idempotency over a simple API using the entire API Body as the key for recognising Idempotency.

Expected Behavior

When calling my API with Test 1 payload and test 2 payload I am expecting the exact same idempotent response.

Current Behavior

Instead I am getting 2 different responses since the order of attributes plays a role in the idempotency key Hashing algorithm used by this utility.

Possible Solution

Suggest looking at using the sort_keys=True option for json.dumps when generating the Hash

Steps to Reproduce (for bugs)

  1. Create lambda API using code similar to code in this report.
  2. send payload: {"data":"test message 1","more_data":"more data 1"}
  3. call API again with same inputs but attribute order changed: {"more_data":"more data 1","data":"test message 1"}
  4. Expect same idempotent response from step 4 as in step 3

Environment

  • Powertools version used: 1.20.0
  • Packaging format (Layers, PyPi):
  • AWS Lambda function runtime: Python 3.8
  • Debugging logs
@walmsles walmsles added bug Something isn't working triage Pending triage from maintainers labels Aug 22, 2021
@heitorlessa
Copy link
Contributor

heitorlessa commented Aug 22, 2021 via email

@heitorlessa heitorlessa added area/idempotency and removed triage Pending triage from maintainers labels Aug 22, 2021
@walmsles
Copy link
Contributor Author

@heitorlessa Let me have a go - The test for this one are slightly scary but let me take a look :-)

@heitorlessa
Copy link
Contributor

Not to worry, I know it can be intimidating as it's the most complex part of Powertools - Just got a test for that while I wait for my flight ;) Fix coming soon

@heitorlessa
Copy link
Contributor

Waiting for CI checks then releasing it ;)

That's how a test for that looks like

def test_idempotent_data_sorting():
    # Scenario to validate same data in different order hashes to the same idempotency key
    data_one = {"data": "test message 1", "more_data": "more data 1"}
    data_two = {"more_data": "more data 1", "data": "test message 1"}

    # Assertion will happen in MockPersistenceLayer
    persistence_layer = MockPersistenceLayer("test-func#" + hashlib.md5(json.dumps(data_one).encode()).hexdigest())

    # GIVEN
    @idempotent_function(data_keyword_argument="payload", persistence_store=persistence_layer)
    def dummy(payload):
        return {"message": "hello"}

    # WHEN
    dummy(payload=data_two)

@walmsles
Copy link
Contributor Author

@heitorlessa all the MockPersistenceLayer setups need to include 'sort_keys=True' for all the existing tests as well. I had progressed to making the existing tests fail Will wait for your fix to come through

@heitorlessa
Copy link
Contributor

Fixed it - it was actually needed in conftest too as there were plenty of json.dumps for dummy hash idempotency keys. Turns out I forgot how long I had to walk to find my gate ;D

For the MockPersistenceLayer, I didn't include the sort_keys up there deliberately, so I can pass payload=data_two down there to ensure they're sorted in the implementation

@heitorlessa
Copy link
Contributor

Available in PyPi now as 1.20.1 - Lambda Layers might take up to 5 minutes to complete ;)

Thank you so so much again @walmsles !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants