Datadog tracing #2441

Closed
wants to merge 196 commits into from
1dde625
Add release-notes Makefile and Go template
Jul 24, 2020
ca612fe
Remove tf binary from args when creating image for tfserving
ukclivecox Aug 28, 2020
a48549a
Allow terminationGracePeriodSeconds to be overridden
ukclivecox Aug 28, 2020
3902758
Merge branch 'master' into 2332_termination_grace_period
ukclivecox Aug 28, 2020
6c11a13
Add example model termination grace period
ukclivecox Aug 28, 2020
3336172
Updated pinned versions
axsaucedo Sep 8, 2020
0aad072
Pinned back xgboost lib
axsaucedo Sep 8, 2020
9fa30e8
Reverted numpy to be gt
axsaucedo Sep 8, 2020
59a0748
Updated xgboost to 1.2.0
axsaucedo Sep 10, 2020
c12689a
Update sklearn.md
axsaucedo Sep 11, 2020
76d9a48
Disallow 2 shadows
ukclivecox Sep 11, 2020
decdcf7
Add -2 as option for route abort
ukclivecox Sep 11, 2020
3d900a3
Update single helm chart
ukclivecox Sep 14, 2020
3b591f1
Merge branch 'master' into mlops
ukclivecox Sep 14, 2020
9077b9c
Add mlflow features
ukclivecox Sep 14, 2020
848b91d
add datadog tracer in executor, tests
mwm5945 Sep 9, 2020
150677c
add datadog tracer to python microservice
mwm5945 Sep 9, 2020
b31b136
remove extraneous files
mwm5945 Sep 9, 2020
266f0b0
test python microservice locally, test with env vars
mwm5945 Sep 9, 2020
6deb658
test executor with DD tracer
mwm5945 Sep 9, 2020
32ed5a3
update setup requirements
mwm5945 Sep 10, 2020
43eb5c0
spruce up docs, enable sampling in python
mwm5945 Sep 11, 2020
b61d796
add temp files
mwm5945 Sep 11, 2020
ff7a232
try this
mwm5945 Sep 11, 2020
bdd714a
remove the manual agent config
mwm5945 Sep 11, 2020
85808f1
temp logging
mwm5945 Sep 11, 2020
3a81800
fix setup.py
mwm5945 Sep 11, 2020
c83218d
convert to int
mwm5945 Sep 11, 2020
8aa29ca
more logging
mwm5945 Sep 11, 2020
f102d4d
update config, fix logging
mwm5945 Sep 11, 2020
9eb9ace
try this way of creating a dd tracer
mwm5945 Sep 11, 2020
2011a3c
try a test span
mwm5945 Sep 11, 2020
619e5d6
oops
mwm5945 Sep 11, 2020
ea9fabb
add sampler, enabled
mwm5945 Sep 11, 2020
63301b5
simplify things a gbid
mwm5945 Sep 13, 2020
3c047c5
add log
mwm5945 Sep 14, 2020
a727f3a
try this
mwm5945 Sep 14, 2020
8aa92ef
returning wrong thing
mwm5945 Sep 14, 2020
58f6098
set through opentracing?
mwm5945 Sep 14, 2020
409b9cd
force the opentracer?
mwm5945 Sep 14, 2020
3286a1f
errmm what?
mwm5945 Sep 14, 2020
f0a855b
try this
mwm5945 Sep 14, 2020
e975071
try this
mwm5945 Sep 14, 2020
4bd6abd
try this
mwm5945 Sep 14, 2020
307328f
parse tags
mwm5945 Sep 14, 2020
9b0523f
clean up python code
mwm5945 Sep 14, 2020
1abac0c
clean up executor
mwm5945 Sep 14, 2020
aa428e2
remove extra files
mwm5945 Sep 14, 2020
59fee64
fix whitespace
mwm5945 Sep 14, 2020
3ee4013
fix makefile
mwm5945 Sep 14, 2020
3ff5108
fix config of DD env vars
mwm5945 Sep 14, 2020
8a6da89
fix failing TestPrepack::test_text_alibi_exlainer test
RafalSkolasinski Sep 15, 2020
ab2d866
Merge branch 'master' into 2332_termination_grace_period
ukclivecox Sep 17, 2020
c90ece2
Update xgboost.md
axsaucedo Sep 17, 2020
f63804a
Merge pull request #2399 from axsaucedo/2065_servers_pinned_versions
ukclivecox Sep 17, 2020
9806b5d
Merge pull request #2435 from cliveseldon/mlops
ukclivecox Sep 17, 2020
7138c89
Merge pull request #2190 from adriangonz/release-notes
ukclivecox Sep 17, 2020
9df5d05
Add SSL listener back in after removal by multiplexing reversion
Sep 17, 2020
3d77267
Fixed jx issue of building
axsaucedo Sep 17, 2020
b2e3279
Add concepts page to docs
Sep 14, 2020
0e27849
Update concepts.md
glindsell Sep 17, 2020
3a79641
Add post_worker_init hook to dettach multiprocessing signal handler
Sep 11, 2020
544679a
Merge pull request #2449 from axsaucedo/2443_fix_image_builds
axsaucedo Sep 17, 2020
fb74868
remove extra dependencies from executor
mwm5945 Sep 17, 2020
053ebfb
update licenses
mwm5945 Sep 17, 2020
d06785b
Merge pull request #2415 from cliveseldon/2400_routing_protocol
axsaucedo Sep 17, 2020
69cb972
Merge pull request #2414 from cliveseldon/2383_multiple_shadows
axsaucedo Sep 17, 2020
b98d776
Merge pull request #2345 from cliveseldon/2332_termination_grace_period
axsaucedo Sep 17, 2020
8a4e5b8
Merge pull request #2343 from cliveseldon/2133_tf_args
ukclivecox Sep 18, 2020
a09f590
add support for runtime metrics
RafalSkolasinski Sep 15, 2020
8be23a8
add support for runtime tags
RafalSkolasinski Sep 15, 2020
31a5fad
code review suggestions
RafalSkolasinski Sep 16, 2020
09046d8
code review suggestions #2
RafalSkolasinski Sep 16, 2020
970170c
add runtime metrics/tags example
RafalSkolasinski Sep 16, 2020
9d5c836
rename CLientResponse to SeldonResponse
RafalSkolasinski Sep 17, 2020
dfd4fe1
code review suggestions #3
RafalSkolasinski Sep 17, 2020
8eb6dcb
make fmt
RafalSkolasinski Sep 18, 2020
f88dbda
Bump google.golang.org/grpc from 1.31.0 to 1.32.0 in /executor
dependabot-preview[bot] Sep 18, 2020
bad4ddc
Removed broken links
axsaucedo Sep 18, 2020
6e42ae8
ensure updated pygments installed
ukclivecox Sep 18, 2020
0457e6a
Merge pull request #2457 from cliveseldon/2455_docs_failure
ukclivecox Sep 18, 2020
a7604fb
prototype disabling encoder / decoder
Aug 25, 2020
5825854
Merge env vars
Aug 25, 2020
ab89af3
Remove check from handle_raw_custom_metric
Aug 25, 2020
c72d5a0
Add some tests for jsonify
Aug 25, 2020
ba88975
Add initial set of files for JavaJNI server
Aug 25, 2020
7c129e4
Add requirements.txt with JPype
Aug 25, 2020
df2ad58
Add raw methods to Java wrapper
Aug 25, 2020
3693ca6
Bump version of Java wrapper
Aug 25, 2020
977adc7
Use String insted of byte[] for REST
Aug 25, 2020
4287f53
Change structure of get_request
Aug 25, 2020
4ccef46
Fix defaults
Aug 25, 2020
2ade467
Fix type annotations
Aug 25, 2020
9a24903
Add Docker images for java-jni server
Aug 26, 2020
f692b3f
Update scripts to pass Java env vars
Aug 26, 2020
6570f19
Ignore _python folder
Aug 26, 2020
b380ad0
Fixed debug message
Aug 28, 2020
7106d52
Add model-template-app example for JNI server
Aug 28, 2020
aea7cd5
Build fat JAR
Aug 28, 2020
a0f5c32
Fix env variables in exec
Sep 3, 2020
1dfc1e2
Change image name
Sep 3, 2020
aa0b42e
Add docs section on JNI gateway for Java
Sep 3, 2020
7cc16d8
Bump to 0.3.0
Sep 3, 2020
68d4ca1
Use released package
Sep 14, 2020
c064684
Remove version.txt
Sep 14, 2020
c6eb194
Fix backticks
Sep 14, 2020
c510d1e
Add note on PAYLOAD_PASSTHROUGH to docs
Sep 14, 2020
5701100
Add example from model-template-app
Sep 18, 2020
ca649c0
Move constant outside
Sep 18, 2020
e349d34
Add test to build s2i image
Sep 18, 2020
66966ce
Add helpers to deploy models using Helm chart
Sep 18, 2020
94143d6
Add ignore to pre-commit to match Makefile in python/
Sep 18, 2020
4748ebe
Add mising s2i folder
Sep 18, 2020
f578bce
Fix tests for JNI gateway
Sep 18, 2020
079342a
Add s2i helpers and fixture
Sep 18, 2020
7045d68
Run tests on Java wrapper
Sep 18, 2020
766e01b
Build new version of legacy Java wrapper
Sep 18, 2020
ec90e1b
Rename test folder to java
Sep 18, 2020
bf954f6
Fix tests and skip pure Java ones
Sep 18, 2020
b59d2a3
Remove deprecated TODO
Sep 21, 2020
192caa2
Update k8s libs to 1.18.6
ukclivecox Sep 6, 2020
bcc5d8b
update licenses
ukclivecox Sep 14, 2020
ce6cfb9
Update licrenses
ukclivecox Sep 18, 2020
4cd53f7
Add operator licenses
Sep 18, 2020
85662e7
Update licenses
ukclivecox Sep 21, 2020
8fc7a88
Merge pull request #2448 from glindsell/tls-multiplex-revert
axsaucedo Sep 21, 2020
71da71b
upgrade core-builder to 0.18 and fix metadata_grpc notebook test
RafalSkolasinski Sep 18, 2020
5bb8e8f
lock shap to 0.35.0 in integration tests
RafalSkolasinski Sep 21, 2020
a6894e7
allow extra tags in model metadata
RafalSkolasinski Sep 3, 2020
ec8caf0
fix tests and generate protos
RafalSkolasinski Sep 3, 2020
fd5c363
update executor protos to include new metadata field
RafalSkolasinski Sep 8, 2020
b128cb8
metadata extension: rename tags fields to custom
RafalSkolasinski Sep 8, 2020
9b6e405
add examples
RafalSkolasinski Sep 8, 2020
64f190f
fix unit test
RafalSkolasinski Sep 8, 2020
9e2466a
fix accidental commit of whitespaces change in two rst files
RafalSkolasinski Sep 9, 2020
de403ed
adjust integration tests
RafalSkolasinski Sep 9, 2020
3382606
adjust integration tests #2
RafalSkolasinski Sep 9, 2020
00f2d30
Merge pull request #2376 from RafalSkolasinski/issues/2312/metadata-c…
axsaucedo Sep 21, 2020
4713c16
change to 1.3.0-dev
RafalSkolasinski Sep 21, 2020
c0fb068
document wrapper CLI and environ flags
RafalSkolasinski Sep 22, 2020
1a08fe2
add datadog tracer in executor, tests
mwm5945 Sep 9, 2020
24917a5
add datadog tracer to python microservice
mwm5945 Sep 9, 2020
c7d80cf
remove extraneous files
mwm5945 Sep 9, 2020
2e5d17b
test python microservice locally, test with env vars
mwm5945 Sep 9, 2020
516663b
test executor with DD tracer
mwm5945 Sep 9, 2020
51cb508
update setup requirements
mwm5945 Sep 10, 2020
219bba9
spruce up docs, enable sampling in python
mwm5945 Sep 11, 2020
d3ffdb3
add temp files
mwm5945 Sep 11, 2020
20e31c5
try this
mwm5945 Sep 11, 2020
a9697ae
remove the manual agent config
mwm5945 Sep 11, 2020
036a0d8
temp logging
mwm5945 Sep 11, 2020
5d6e475
fix setup.py
mwm5945 Sep 11, 2020
d37c93f
convert to int
mwm5945 Sep 11, 2020
2864048
more logging
mwm5945 Sep 11, 2020
6c08f1e
update config, fix logging
mwm5945 Sep 11, 2020
de0e0d6
try this way of creating a dd tracer
mwm5945 Sep 11, 2020
7c836e2
try a test span
mwm5945 Sep 11, 2020
32aba79
oops
mwm5945 Sep 11, 2020
5468ebe
add sampler, enabled
mwm5945 Sep 11, 2020
3f79331
simplify things a gbid
mwm5945 Sep 13, 2020
139604c
add log
mwm5945 Sep 14, 2020
a3f476e
try this
mwm5945 Sep 14, 2020
0fe4e40
returning wrong thing
mwm5945 Sep 14, 2020
6de20c5
set through opentracing?
mwm5945 Sep 14, 2020
9b66130
force the opentracer?
mwm5945 Sep 14, 2020
be771cb
errmm what?
mwm5945 Sep 14, 2020
071e412
try this
mwm5945 Sep 14, 2020
80b9db5
try this
mwm5945 Sep 14, 2020
d25e5f4
try this
mwm5945 Sep 14, 2020
cbf17ba
parse tags
mwm5945 Sep 14, 2020
586db85
clean up python code
mwm5945 Sep 14, 2020
efa0bcc
clean up executor
mwm5945 Sep 14, 2020
ef3f990
remove extra files
mwm5945 Sep 14, 2020
26bbfd6
fix whitespace
mwm5945 Sep 14, 2020
c57f86b
fix makefile
mwm5945 Sep 14, 2020
9831487
fix config of DD env vars
mwm5945 Sep 14, 2020
e07b6ba
update licenses
mwm5945 Sep 17, 2020
ceaaefc
update mod and license
mwm5945 Sep 22, 2020
9fdfeb3
conflicts
mwm5945 Sep 22, 2020
7440150
add datadog tracer in executor, tests
mwm5945 Sep 9, 2020
0b639ac
add datadog tracer to python microservice
mwm5945 Sep 9, 2020
d609c08
remove extraneous files
mwm5945 Sep 9, 2020
b9d1fe4
test executor with DD tracer
mwm5945 Sep 9, 2020
ad43c20
add temp files
mwm5945 Sep 11, 2020
2884344
clean up executor
mwm5945 Sep 14, 2020
db1569e
remove extra files
mwm5945 Sep 14, 2020
2d8171e
fix makefile
mwm5945 Sep 14, 2020
950629e
add datadog tracer in executor, tests
mwm5945 Sep 9, 2020
eea761a
add datadog tracer to python microservice
mwm5945 Sep 9, 2020
a0de257
remove extraneous files
mwm5945 Sep 9, 2020
1c269ef
test executor with DD tracer
mwm5945 Sep 9, 2020
9ab9346
add temp files
mwm5945 Sep 11, 2020
d56199c
clean up executor
mwm5945 Sep 14, 2020
b1fdfe9
remove extra files
mwm5945 Sep 14, 2020
396f305
fix makefile
mwm5945 Sep 14, 2020
f39ab78
undo changes to makefile
mwm5945 Sep 22, 2020
2 changes: 1 addition & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -188,7 +188,7 @@ testing/scripts/tensorflow
testing/scripts/run.log
testing/scripts/my-model/
wrappers/s2i/python/_python/
wrappers/s2i/python-conda/_python/
incubating/wrappers/s2i/java-jni/_python/

seldon-controller/go

Expand Down
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -9,4 +9,4 @@ repos:
rev: stable
hooks:
- id: black
args: ['python/', 'testing/', 'operator/helm', 'operator/seldon-operator/hack', '--exclude', '(testing/scripts/proto|seldon_core/proto/|.eggs)']
args: ['python/', 'testing/', '--exclude', '(testing/scripts/proto|seldon_core/proto/|.eggs)']
4 changes: 2 additions & 2 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ run_core_builder_in_host:
-v /var/run/docker.sock:/var/run/docker.sock \
-v $${HOME}/.m2:/root/.m2 \
-v $(SELDON_CORE_LOCAL_DIR):/work \
seldonio/core-builder:0.17 bash
seldonio/core-builder:0.18 bash


run_core_builder_in_minikube:
Expand All @@ -37,7 +37,7 @@ run_core_builder_in_minikube:
-v /var/run/docker.sock:/var/run/docker.sock \
-v /home/docker/.m2:/root/.m2 \
-v $(SELDON_CORE_VM_DIR):/work \
seldonio/core-builder:0.17 bash
seldonio/core-builder:0.18 bash

show_paths:
@echo "local: $(SELDON_CORE_LOCAL_DIR)"
Expand Down
2 changes: 1 addition & 1 deletion components/alibi-detect-server/Dockerfile
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
FROM registry.access.redhat.com/ubi8/python-36
LABEL name="Seldon Alibi Detect Server" \
vendor="Seldon Technologies" \
version="1.2.3-dev" \
version="1.3.0-dev" \
release="1" \
summary="Alibi Detect Server for Seldon Core" \
description="The Alibi Detect Server provides outlier, drift and adversarial detection services for Seldon Core"
Expand Down
2 changes: 1 addition & 1 deletion components/alibi-explain-server/Dockerfile
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
FROM registry.access.redhat.com/ubi8/python-36
LABEL name="Seldon Alibi Wrapper" \
vendor="Seldon Technologies" \
version="1.2.3-dev" \
version="1.3.0-dev" \
release="1" \
summary="Alibi Explainer Wrapper for Seldon Core" \
description="Allows Seldon Core inference models to run with a black box model explanation model from the Alibi:Explain project"
Expand Down
6 changes: 3 additions & 3 deletions components/alibi-explain-server/alibiexplainer/__main__.py
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@
EXPLAINER_FILENAME = "explainer.dill"
KERAS_MODEL = "model.h5"


def main():
args, extra = parse_args(sys.argv[1:])
# Pretrained Alibi explainer
Expand Down Expand Up @@ -60,12 +61,11 @@ def main():
alibi_model,
Protocol(args.protocol),
args.tf_data_type,
keras_model
keras_model,
)
explainer.load()
ExplainerServer(args.http_port).start(explainer)


if __name__ == "__main__":
main()


Original file line number Diff line number Diff line change
@@ -1,4 +1,2 @@
scikit-learn==0.20.1
numpy==1.15.1
scipy==1.1.0
xgboost==0.81
numpy >= 1.8.2
xgboost==1.2.0
2 changes: 1 addition & 1 deletion components/seldon-request-logger/Dockerfile
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
FROM registry.access.redhat.com/ubi8/python-36
LABEL name="Seldon Request Logger" \
vendor="Seldon Technologies" \
version="1.2.3-dev" \
version="1.3.0-dev" \
release="1" \
summary="The payload logger for Seldon Core" \
description="The Seldon Payload Logger allows request and response payloads from a Seldon Core inference graph to be processed and sent to an ELK endpoint"
Expand Down
2 changes: 1 addition & 1 deletion components/storage-initializer/Dockerfile
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
FROM registry.access.redhat.com/ubi8/python-36
LABEL name="Storage Initializer" \
vendor="Seldon Technologies" \
version="1.2.3-dev" \
version="1.3.0-dev" \
release="1" \
summary="Storage Initializer for Seldon Core" \
description="Allows Seldon Core to download artifacts from cloud and local storage to a local volume"
Expand Down
1 change: 1 addition & 0 deletions core-builder/Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,7 @@ RUN wget https://storage.googleapis.com/kubernetes-release/release/v1.16.2/bin/l
RUN \
apt-get update -y && \
apt-get install -y vim && \
apt-get install -y jq && \
apt-get install -y build-essential && \
apt-get remove -y --auto-remove && apt-get clean -y && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

Expand Down
2 changes: 1 addition & 1 deletion core-builder/Makefile
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
DOCKER_IMAGE_NAME=seldonio/core-builder
DOCKER_IMAGE_VERSION=0.17
DOCKER_IMAGE_VERSION=0.18

build_docker_image:
cp ../testing/scripts/dev_requirements.txt .
Expand Down
1 change: 1 addition & 0 deletions doc/requirements_docs.txt
Original file line number Diff line number Diff line change
Expand Up @@ -8,3 +8,4 @@ nbsphinx>=0.4.2
nbsphinx-link>=1.2.0
ipykernel>=5.1.0
ipython>=7.2.0
pygments>=2.4.1,<3
10 changes: 9 additions & 1 deletion doc/source/analytics/routers.md
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,15 @@ Currently we provide two reference implementations of routers in Python. Both ar
* [Thompson Sampling](https://github.com/SeldonIO/seldon-core/tree/master/components/routers/thompson-sampling)

## Implementing custom routers
A router component must implement a ```Route``` method which will return one of the children that the router component is connected to for routing an incoming request. Optionally a ```SendFeedback``` method can be implemented to provide a mechanism for informing the router on the quality of its decisions. This would be used in adaptive routers such as multi-armed bandits, refer to the [epsilon-greedy](https://github.com/SeldonIO/seldon-core/tree/master/components/routers/epsilon-greedy) example for more detail.
A router component must implement a ```Route``` method which returns one of the children that the router component is connected to, so that the incoming request can be routed to it. At present, a custom router can return one of the following values:

* -1 : Route to all children
* -2 : Route to no children and return the current request as the response
* N >= 0 : Route to child N

The response for REST calls should be returned as a SeldonMessage with the payload containing the route value or a JSON array containing a single integer.

Optionally, a ```SendFeedback``` method can be implemented to provide a mechanism for informing the router about the quality of its decisions. This would be used in adaptive routers such as multi-armed bandits; refer to the [epsilon-greedy](https://github.com/SeldonIO/seldon-core/tree/master/components/routers/epsilon-greedy) example for more detail.

As an example, consider writing a custom A/B/C... testing component with a user-specified number of children and routing probabilities (two-model routing is already supported in Seldon Core: [RANDOM_ABTEST](../reference/seldon-deployment.md#proto-buffer-definition)). In this scenario because the routing logic is static there is no need to implement ```SendFeedback``` as we will not be dynamically changing the routing by providing feedback for its routing choices. On the other hand, an adaptive router whose routing is required to change dynamically by providing feedback will need to implement the ```SendFeedback``` method.
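
A static A/B/C router of the kind described above might look like the following minimal sketch. The class name `WeightedABRouter`, its constructor arguments, and the choice to return `-2` for an empty request are hypothetical and only illustrate the return values; the `route(features, feature_names)` signature is assumed to follow the Python wrapper convention and should be checked against the wrapper documentation.

```python
import random
from typing import List, Optional, Sequence

ROUTE_ALL = -1   # route to all children
ROUTE_NONE = -2  # route to no children; the current request becomes the response


class WeightedABRouter:
    """Hypothetical static router: picks child N with a fixed probability."""

    def __init__(self, weights: Sequence[float], seed: Optional[int] = None):
        if not weights or any(w < 0 for w in weights) or sum(weights) <= 0:
            raise ValueError("weights must be non-negative and sum to more than 0")
        self.weights = list(weights)
        self._rng = random.Random(seed)

    def route(self, features, feature_names: List[str]) -> int:
        # Illustrative policy (an assumption): an empty request is not sent
        # to any child and is returned as-is.
        if len(features) == 0:
            return ROUTE_NONE
        # Draw the index of one child according to the configured weights.
        children = list(range(len(self.weights)))
        return self._rng.choices(children, weights=self.weights, k=1)[0]


if __name__ == "__main__":
    router = WeightedABRouter([0.5, 0.3, 0.2], seed=42)
    print(router.route([[1.0, 2.0]], ["a", "b"]))
```

Because the weights never change, no `SendFeedback` method is needed here; an adaptive router would add one and update its weights from the reward it receives.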

Expand Down
1 change: 1 addition & 0 deletions doc/source/examples/notebooks.rst
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@ Python Language Wrapper Examples
Sagemaker SKLearn Example <sagemaker_sklearn>
TFserving MNIST <tfserving_mnist>
Statsmodels Holt-Winter's time-series model <statsmodels>
Runtime Metrics & Tags <runtime_metrics_tags>

Specialised Framework Examples
-----
Expand Down
3 changes: 3 additions & 0 deletions doc/source/examples/runtime_metrics_tags.nblink
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
{
"path": "../../../examples/models/runtime_metrics_tags/runtime_metrics_tags.ipynb"
}
83 changes: 83 additions & 0 deletions doc/source/graph/distributed-tracing-dd.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,83 @@
# Distributed Tracing with Datadog

You can use Open Tracing to trace your API calls to Seldon Core. By default Jaeger is supported ([see here](distributed-tracing.md)), but Datadog can also be used.

Datadog is only supported in the Executor and the Python wrapper at this time.

## Install Datadog

You will need to install the Datadog Agent on your Kubernetes cluster by following the Datadog [documentation](https://docs.datadoghq.com/agent/kubernetes/?tab=helm). Ensure that APM is enabled in the deployment; see [here](https://docs.datadoghq.com/agent/kubernetes/apm/?tab=helm).

## Configuration

You will need to annotate your Seldon Deployment resource with environment variables to make tracing active and set the appropriate Datadog configuration variables.

* For each Seldon component you run (e.g. model, transformer, etc.) you will need to add environment variables to the container section.


### Python Wrapper Configuration

Add the environment variable `TRACING` with value `1` to activate tracing.

To ensure that Datadog tracing is used, set the environment variable `DD_ENABLED` to `1`.

For a complete list of available environment variables, see [Datadog's Python documentation](https://docs.datadoghq.com/tracing/setup/python/#configuration) for the model wrapper and [Datadog's Go documentation](https://docs.datadoghq.com/tracing/setup/go/) for the executor. The most relevant ones are:
* `DD_AGENT_HOST=<host>` (defaults to `localhost`)
* `DATADOG_TRACE_AGENT_PORT=<port>` (defaults to `8126`)
* `DD_SERVICE=<svc>` (defaults to either `executor` or the name of your Python class)
* `DD_TAGS=<key:value,key2:value2>`
* `DD_SAMPLE_RATE=<rate>` (defaults to `1`, keeping all traces)
  * _NOTE: This is a non-standard environment variable, meaning it is specific to Seldon._



An example is shown below:

```yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: dd-tracing-example
  namespace: seldon
spec:
  name: dd-tracing-example
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - env:
          - name: TRACING
            value: '1'
          - name: DD_AGENT_HOST
            valueFrom:
              fieldRef:
                fieldPath: status.hostIP
          - name: DATADOG_TRACE_AGENT_PORT
            value: '8126'
          - name: DD_SAMPLE_RATE
            value: '0.75' # Keep 75% of traces
          image: seldonio/mock_classifier_rest:1.3
          name: model1
        terminationGracePeriodSeconds: 1
    graph:
      children: []
      endpoint:
        type: REST
      name: model1
      type: MODEL
    name: tracing
    replicas: 1
    svcOrchSpec:
      env:
      - name: TRACING
        value: '1'
      - name: DD_AGENT_HOST
        valueFrom:
          fieldRef:
            fieldPath: status.hostIP
      - name: DATADOG_TRACE_AGENT_PORT
        value: '8126'
      - name: DD_SAMPLE_RATE
        value: '0.9' # Keep 90% of traces
```
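
To make the meaning of the environment variables in the configuration section concrete, here is a minimal sketch of how a component could read them. It is not the code used by the Seldon Python wrapper or the executor; the helper names `parse_dd_tags` and `read_dd_settings` and the returned dictionary layout are hypothetical, and only the variable names and defaults are taken from the list above.

```python
import os
from typing import Dict, Mapping


def parse_dd_tags(raw: str) -> Dict[str, str]:
    """Parse a 'key:value,key2:value2' string into a dict, skipping malformed pairs."""
    tags: Dict[str, str] = {}
    for pair in raw.split(","):
        key, sep, value = pair.strip().partition(":")
        if sep and key:
            tags[key] = value
    return tags


def read_dd_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect the tracing settings described above (hypothetical helper)."""
    return {
        "tracing": env.get("TRACING", "0") == "1",
        "dd_enabled": env.get("DD_ENABLED", "0") == "1",
        "agent_host": env.get("DD_AGENT_HOST", "localhost"),
        # Environment values are always strings, so convert explicitly.
        "agent_port": int(env.get("DATADOG_TRACE_AGENT_PORT", "8126")),
        "service": env.get("DD_SERVICE"),
        "tags": parse_dd_tags(env.get("DD_TAGS", "")),
        "sample_rate": float(env.get("DD_SAMPLE_RATE", "1")),
    }


if __name__ == "__main__":
    print(read_dd_settings({"TRACING": "1", "DD_TAGS": "env:dev,team:ml"}))
```

Every value arrives as a string, which is why the port and sample rate are converted explicitly, and why the values in the Kubernetes manifest need to be quoted.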

2 changes: 2 additions & 0 deletions doc/source/graph/distributed-tracing.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,8 @@

You can use Open Tracing to trace your API calls to Seldon Core. By default we support Jaeger for Distributed Tracing, which will allow you to obtain insights on latency and performance across each microservice-hop in your Seldon deployment.

Datadog tracing is also supported; see [Distributed Tracing with Datadog](distributed-tracing-dd.md).

## Install Jaeger

You will need to install Jaeger on your Kubernetes cluster. Follow their [documentation](https://www.jaegertracing.io/docs/1.18/operator/)
Expand Down
2 changes: 2 additions & 0 deletions doc/source/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -86,6 +86,7 @@ Documentation Index
:caption: Incubating Projects

Java Language Wrapper <java/README.md>
Java (JNI) Language Wrapper [ALPHA] <java-jni/README.md>
R Language Wrapper [ALPHA] <R/README.md>
NodeJS Language Wrapper [ALPHA] <nodejs/README.md>
Go Language Wrapper [ALPHA] <go/go_wrapper_link.rst>
Expand Down Expand Up @@ -156,6 +157,7 @@ Documentation Index
Seldon Deployment CRD <reference/seldon-deployment.md>
Service Orchestrator <graph/svcorch.md>
Kubeflow <analytics/kubeflow.md>
Concepts <reference/concepts.md>

.. toctree::
:maxdepth: 1
Expand Down