Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug-1889185: implement stage submitter and fake collector #6617

Merged
merged 3 commits into from
May 21, 2024

Conversation

willkg
Copy link
Contributor

@willkg willkg commented May 15, 2024

This implements a stage submitter that pulls crash ids from QUEUE and crash data from STORAGE, packages that up into a crash report payload, and submits it to STAGE_SUBMITTER_DESTINATIONS.

This creates a make runsubmitter system that runs the stage submitter and a fake collector allowing you to use bin/process_crash.sh to set everything up and publish a crash id to the standard queue which the submitter will see and submit a crash to the fake collector. The fake collector will parse the payload and log headers, annotations, and files.

This adds stage submitter tests.

To test:

  1. make build
  2. make setup
  3. make runservices
  4. make runsubmitter
  5. in another terminal, run make shell and then you can use bin/process_crash.sh to download crash data, put it in storage, and publish the crash id to the standard queue
    1. if it's sampled, then you'll see a socorro.submitter.sample incr metric
    2. if it's accepted, then you'll see a socorro.submitter.accept incr metric and you'll see the headers, annotations, and files logged by the fake collector

Light documentation is provided in docs/services/stage_submitter.rst.

@willkg willkg requested a review from a team as a code owner May 15, 2024 21:15
@@ -54,6 +54,7 @@ echo ">>> run tests"

# Run socorro tests
"${PYTEST}"
CLOUD_PROVIDER=GCP "${PYTEST}" -m gcp
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The submitter tests use queue_helper and storage_helper, so we need to do AWS and GCP tests now for socorro tests.

@@ -103,6 +103,10 @@ SESSION_COOKIE_SECURE=False
# -----------
SENTRY_PORT=8090

# stage_submitter
# ---------------
STAGE_SUBMITTER_DESTINATIONS=http://fakecollector:8000/submit|20
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sets up a single destination sampled at 20%.

@willkg willkg force-pushed the willkg-bug-1889185-submitter branch from 1155fc2 to 04fad15 Compare May 15, 2024 21:18
from werkzeug.exceptions import HTTPException, MethodNotAllowed
from werkzeug.formparser import parse_form_data
from werkzeug.routing import Map, Rule
from werkzeug.wrappers import Request, Response
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only web framework we have available is Django. I didn't want to use Django for the fake collector because that involves a whole bunch of additional set up for the db I didn't want to do.

Flask is an ergonomic shell on top of werkzeug. We don't need templates, localizations, extensions, etc so we can use raw werkzeug to build a very minimal API for /submit.

content_encoding = request.headers.get("Content-Encoding", "")
self.logger.info("content_encoding: %s", content_encoding)

stream, form, files = parse_form_data(request.environ, max_content_length=content_length)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This worked surprisingly well.

Even so, I want to see how parse_form_data deals with some of the malformed data issues we deal with in Antenna. Depending on that, switching Antenna from Falcon (which uses the deprecated-to-be-removed cgi module) to Flask seems viable.

]


def remove_collector_keys(raw_crash):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Large portions of this file were copied verbatim from socorro-submitter repository where they've been running for years.

return repr(thing).encode("utf-8")


def multipart_encode(raw_crash, dumps, payload_type, payload_compressed):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is copied from socorro-submitter.

if dest.sample >= 100 or random.randint(0, 100) > dest.sample
]

def process(self, crash):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The structure of this was copied from the socorro-submitter.


self.shutdown()

def run_once(self):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

run_once is used in tests.

@@ -0,0 +1,594 @@
# This Source Code Form is subject to the terms of the Mozilla Public
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests were lifted from socorro-submitter and then adjusted to work with the SubmitterApp instead of the socorro-submitter lambda function.

@willkg willkg requested a review from relud May 15, 2024 21:44
willkg added 2 commits May 17, 2024 21:49
This implements a stage submitter that pulls crash ids from QUEUE and
crash data from STORAGE, packages that up into a crash report payload,
and submits it to STAGE_SUBMITTER_DESTINATIONS.

This creates a `make runsubmitter` system that runs the stage submitter
and a fake collector allowing you to use `bin/process_crash.sh` to
set everything up and publish a crash id to the standard queue which the
submitter will see and submit a crash to the fake collector. The fake
collector will parse the payload and log headers, annotations, and
files.

This adds stage submitter tests.
@willkg willkg force-pushed the willkg-bug-1889185-submitter branch from 4c006d9 to d269f85 Compare May 18, 2024 01:49
Copy link
Member

@relud relud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r+wc

Comment on lines +5 to 7
addopts = -rsxX --tb=native -p no:django -m 'not gcp'
norecursedirs = .git docs config docker __pycache__
testpaths = socorro/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pretty sure marker definitions need to be copied from webapp/pytest.ini as well:

Suggested change
addopts = -rsxX --tb=native -p no:django -m 'not gcp'
norecursedirs = .git docs config docker __pycache__
testpaths = socorro/
addopts = -rsxX --tb=native -p no:django -m 'not gcp'
norecursedirs = .git docs config docker __pycache__
testpaths = socorro/
markers =
aws: tests that require aws backends to be configured in the environment. this is the default.
gcp: tests that require gcp backends to be configured in the environment. skipped unless explicitly requested.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add that. I'm not sure how it works currently without those markers.

Comment on lines 27 to 28
def generate_s3_key(kind, crash_id):
"""Generates the key in S3 for this object kind
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this isn't just used for s3, i suggest we rename it:

Suggested change
def generate_s3_key(kind, crash_id):
"""Generates the key in S3 for this object kind
def generate_storage_key(kind, crash_id):
"""Generates the key in cloud storage for this object kind

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops--this is a copy/paste fail.

@willkg willkg merged commit 00d064a into main May 21, 2024
2 checks passed
@willkg
Copy link
Contributor Author

willkg commented May 21, 2024

Thank you!

@willkg willkg deleted the willkg-bug-1889185-submitter branch May 21, 2024 12:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants