Updating inference logic to add node level request-response logging #3874

Merged
3 commits merged into SeldonIO:master on Feb 2, 2022

Conversation

SachinVarghese
Contributor

What this PR does / why we need it:
This PR corrects the payload logging mechanism in the Seldon executor so that node-level request-response pairs are logged in complex inference graphs.
Which issue(s) this PR fixes:
Fixes #3873
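
To make the intended behaviour concrete, here is a minimal, self-contained Go sketch of the idea (the types are simplified stand-ins, not the executor's actual API): each node in the inference graph carries an optional Logger, and the executor emits that node's request and response payloads according to the logger's configured mode. The point of the fix is that this guard runs per node, so intermediate request-response pairs in a multi-node graph are logged, not just the top-level ones.

package main

import "fmt"

type LogMode string

const (
	LogAll      LogMode = "all"
	LogRequest  LogMode = "request"
	LogResponse LogMode = "response"
)

type Logger struct{ Mode LogMode }

type Node struct {
	Name     string
	Logger   *Logger // nil when payload logging is disabled for this node
	Children []*Node
}

// logPayload stands in for the executor's payload logger.
func logPayload(nodeName, kind string, payload []byte, puid string) error {
	fmt.Printf("node=%s kind=%s puid=%s bytes=%d\n", nodeName, kind, puid, len(payload))
	return nil
}

// predict walks the graph and logs request/response pairs per node.
func predict(node *Node, msg []byte, puid string) ([]byte, error) {
	if node.Logger != nil && (node.Logger.Mode == LogRequest || node.Logger.Mode == LogAll) {
		if err := logPayload(node.Name, "request", msg, puid); err != nil {
			return nil, err
		}
	}
	resp := msg // stand-in for the actual model or transformer call
	if node.Logger != nil && (node.Logger.Mode == LogResponse || node.Logger.Mode == LogAll) {
		if err := logPayload(node.Name, "response", resp, puid); err != nil {
			return nil, err
		}
	}
	for _, child := range node.Children {
		var err error
		if resp, err = predict(child, resp, puid); err != nil {
			return nil, err
		}
	}
	return resp, nil
}

func main() {
	graph := &Node{
		Name:   "transformer",
		Logger: &Logger{Mode: LogAll},
		Children: []*Node{
			{Name: "model", Logger: &Logger{Mode: LogResponse}},
		},
	}
	if _, err := predict(graph, []byte(`{"data":[1,2,3]}`), "puid-123"); err != nil {
		fmt.Println("error:", err)
	}
}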

@seldondev
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign sachinvarghese
You can assign the PR to them by writing /assign @sachinvarghese in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Comment on lines +103 to +108

// Log Request: emit this node's request payload when its logger is
// configured for request or all logging.
if node.Logger != nil && (node.Logger.Mode == v1.LogRequest || node.Logger.Mode == v1.LogAll) {
	err := p.logPayload(node.Name, node.Logger, payloadLogger.InferenceRequest, msg, puid)
	if err != nil {
		return nil, err
	}
}
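
The quoted diff shows only the request side; the response side presumably has a symmetric guard. A sketch, assuming a v1.LogResponse mode and a payloadLogger.InferenceResponse constant analogous to those above (the response variable is illustrative):

// Log Response (sketch, not part of the quoted diff)
if node.Logger != nil && (node.Logger.Mode == v1.LogResponse || node.Logger.Mode == v1.LogAll) {
	err := p.logPayload(node.Name, node.Logger, payloadLogger.InferenceResponse, response, puid)
	if err != nil {
		return nil, err
	}
}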
Contributor

I'm not a fan of logging inside transformInput. It is not clear from the Predict function, where transformInput is called, that the message may or may not be logged.

Edit: unfortunately, due to the way the code has been structured, this is the easiest way to do what's needed, because the callTransformInput and callModel flags are determined in here.

@SachinVarghese
Contributor (Author)

Yeah, I also tried structuring the code in a different manner to start with, but this seemed easiest considering other aspects of the implementation.
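
For illustration, the restructuring both commenters considered would hoist the logging decision to the Predict call site. A hypothetical sketch only: shouldLogRequest and the surrounding shape are invented here, and only transformInput, node.Logger, and p.logPayload come from the diff and discussion above.

// Hypothetical alternative: decide on logging where transformInput is
// called, so the Predict flow makes it explicit. Not viable as-is, because
// the callTransformInput and callModel flags are determined inside
// transformInput, per the review thread above.
if shouldLogRequest(node) { // invented helper wrapping the node.Logger mode checks
	if err := p.logPayload(node.Name, node.Logger, payloadLogger.InferenceRequest, msg, puid); err != nil {
		return nil, err
	}
}
resp, err := p.transformInput(node, msg) // argument shape is a guess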

@ukclivecox
Contributor

/test integration

@ukclivecox
Contributor

/test notebooks

@RafalSkolasinski
Contributor

/test integration

@RafalSkolasinski
Contributor

The logs of the integration tests are quite confusing; it is not clear whether there was an actual failure or not.

===Flaky Test Report===

test_namespace_update[1.10.0] passed 1 out of the required 1 times. Success!
test_namespace_update[1.11.0] passed 1 out of the required 1 times. Success!
test_namespace_update[1.12.0] passed 1 out of the required 1 times. Success!
test_rolling_update6[ambas] passed 1 out of the required 1 times. Success!
test_rolling_update6[istio] passed 1 out of the required 1 times. Success!
test_rolling_update7[ambas] passed 1 out of the required 1 times. Success!
test_rolling_update7[istio] passed 1 out of the required 1 times. Success!
test_rolling_update8[ambas] failed (4 runs remaining out of 5).
	<class 'AssertionError'>
	assert 503 == 200
  -503
  +200
	[<TracebackEntry /workspace/source/testing/scripts/test_rolling_updates.py:140>]
test_rolling_update8[ambas] failed (3 runs remaining out of 5).
	<class 'AssertionError'>
	assert 503 == 200
  -503
  +200
	[<TracebackEntry /workspace/source/testing/scripts/test_rolling_updates.py:140>]
test_rolling_update8[ambas] failed (2 runs remaining out of 5).
	<class 'AssertionError'>
	assert 500 == 200
  -500
  +200
	[<TracebackEntry /workspace/source/testing/scripts/test_rolling_updates.py:140>]
test_rolling_update8[ambas] passed 1 out of the required 1 times. Success!
test_rolling_update8[istio] passed 1 out of the required 1 times. Success!
test_rolling_update9[ambas] passed 1 out of the required 1 times. Success!
test_rolling_update9[istio] passed 1 out of the required 1 times. Success!

===End Flaky Test Report===

========== 52 passed, 6 skipped, 67 deselected in 6307.24s (1:45:07) ===========
Test returned errors
kind delete cluster
Deleting cluster "kind" ...
Stopping Docker: dockerProgram process in pidfile '/var/run/docker-ssd.pid', 1 process(es), refused to die.
Pipeline failed on stage 'integration-test-task': container 'step-integration-step'. The execution of the pipeline has stopped.

If there was, it could have been the test_rolling_update8[ambas] failure, but why does it not count towards the total number in the summary?

@RafalSkolasinski
Contributor

Actually, I also see a failure on TestPrepack.test_mlflow_v2.

Triggered the tests again anyway, to rule out flakiness first.

@RafalSkolasinski
Contributor

/test integration

@RafalSkolasinski
Contributor

TestPrepack.test_mlflow_v2 seems to have passed; I now see a failure on test_rolling_deployment[graph1.json-graph4.json-True-ambas].

@ukclivecox merged commit bfb30c1 into SeldonIO:master on Feb 2, 2022
Linked issue (#3873): The seldon executor doesn't log node level request-response pairs in complex inference graphs