[ML] PyTorch Command Processor #1770
Conversation
Force-pushed from c4a2378 to 6adf034
The macOS aarch64 build failed because PyTorch has not been set up on the machine.
bin/pytorch_inference/Main.cc (outdated):

```cpp
torch::Tensor tensor =
    torch::from_blob(static_cast<void*>(args.data()),
                     {1, static_cast<std::int64_t>(args.size())},
                     at::dtype(torch::kInt32))
        .to(torch::kInt64);

inputs.push_back(tensor);
```
This is copying the tensors at the moment. It may be possible to emplace:

```cpp
inputs.emplace_back(
    torch::from_blob(static_cast<void*>(args.data()),
                     {1, static_cast<std::int64_t>(args.size())},
                     at::dtype(torch::kInt32))
        .to(torch::kInt64));
```
Or if that doesn't work for some reason, you could at least move when adding, i.e. `inputs.push_back(std::move(tensor));`.
`torch::Tensor` is a wrapper around `at::Tensor` and internally has a smart pointer to its data. Passing by value/copying is the way to use it by design. Some discussion here: https://discuss.pytorch.org/t/tensor-move-semantics-in-c-frontend/77901/5
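For context, a minimal sketch (not from the PR) of why copying is cheap: copying a `torch::Tensor` copies the handle, not the underlying storage, so both handles see the same data.

```cpp
#include <torch/torch.h>
#include <iostream>

int main() {
    torch::Tensor a = torch::zeros({2, 2});
    torch::Tensor b = a;   // copies the smart pointer, not the 2x2 storage
    b[0][0] = 1.0;         // the write is visible through `a` as well
    std::cout << a[0][0].item<float>() << '\n'; // prints 1
}
```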
bin/pytorch_inference/Main.cc (outdated):

```cpp
torch::Tensor tokensTensor =
    torch::from_blob(static_cast<void*>(request.s_Tokens.data()),
                     {1, static_cast<std::int64_t>(request.s_Tokens.size())},
                     at::dtype(torch::kInt32))
        .to(torch::kInt64);

std::vector<torch::jit::IValue> inputs;
inputs.push_back(tokensTensor);
```
As below, it looks like this is copying the tensor. If it compiles this should be more efficient:

```cpp
std::vector<torch::jit::IValue> inputs;
inputs.emplace_back(
    torch::from_blob(static_cast<void*>(request.s_Tokens.data()),
                     {1, static_cast<std::int64_t>(request.s_Tokens.size())},
                     at::dtype(torch::kInt32))
        .to(torch::kInt64));
```
Hopefully this will be possible in PyTorch 1.8, which is not too far off now - see pytorch/pytorch#51886 (comment)
I did a pass through. Overall it looks nice and clean. I made some minor suggestions. My main observation is that since this is long-running and you are going to be streaming data to this executable, I'd try to avoid all the temporary large heap objects. I don't feel this would complicate matters, and as it stands it feels like premature pessimization.
```cpp
void debug(const rapidjson::Document& doc) {
    rapidjson::StringBuffer buffer;
    rapidjson::Writer<rapidjson::StringBuffer> writer(buffer);
    doc.Accept(writer);
    LOG_TRACE(<< buffer.GetString());
}
```
I wonder if we should move this to `core/CRapidJsonUtils.h`. This pattern comes up fairly often and it would be good to have a single definition for it.
I added `operator<<` to this file. It is a trivial function that has to live in the `rapidjson` namespace, so I don't think it belongs in `core/CRapidJsonUtils.h`.
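A minimal sketch of what such an operator might look like (an assumed shape, not the PR's exact code); argument-dependent lookup is why it has to live in the `rapidjson` namespace:

```cpp
#include <ostream>
#include <rapidjson/document.h>
#include <rapidjson/stringbuffer.h>
#include <rapidjson/writer.h>

namespace rapidjson {
// Serialise a Document and stream it out; found via ADL from logging macros.
inline std::ostream& operator<<(std::ostream& os, const Document& doc) {
    StringBuffer buffer;
    Writer<StringBuffer> writer(buffer);
    doc.Accept(writer);
    return os << buffer.GetString();
}
}
```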
It's this stuff I meant:

```cpp
rapidjson::StringBuffer buffer;
rapidjson::Writer<rapidjson::StringBuffer> writer(buffer);
doc.Accept(writer);
```

i.e. conversion to a string, which I'm sure I've put somewhere local myself in the past. I find I have to remind myself every time how to convert a `rapidjson::Document` to a string, and thought perhaps it was time to make this a utility somewhere. That said, you don't have to make this change, and perhaps we should hunt for other cases and address them all in one go.
I think an issue got introduced in the refactor, plus I do think input validation warrants a code comment.
bin/pytorch_inference/Main.cc (outdated):

```cpp
torch::NoGradGuard noGrad;
auto tuple = module.forward(inputs).toTuple();
auto predictions = tuple->elements()[0].toTensor();
for (auto args : request.s_SecondaryArguments) {
```
One more thing...

```cpp
for (const auto& args : request.s_SecondaryArguments) {
```
👍 A reference, yes, but it can't be `const` because later there is non-const access to `.data()`.
Interestingly, this should have caused undefined behaviour, because the tensor is then referencing memory from the loop's copy of the request vector. I wonder if this points to a missing test with non-empty secondary arguments.
Actually, calling `to()` is probably what saves you, since it'll create a copy. This makes me wonder, should we be doing this here? I'd have thought it would be better to write the values into a `std::vector<std::uint64_t>` and stick with just having a reference to this memory for the tensor, assuming they need to be 64 bit. It may need some alignment shenanigans to make the most of library optimisations, but that should be manageable. Something to investigate in a follow-up anyway.
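A rough sketch of that suggestion, with assumed names (`s_Tokens` standing in for the request's token field): the tokens are widened once into a 64-bit buffer that the tensor then references directly, instead of paying for the extra copy made by `.to(torch::kInt64)`.

```cpp
#include <torch/torch.h>
#include <cstdint>
#include <vector>

int main() {
    // Hypothetical stand-in for the request's token field.
    std::vector<std::int32_t> s_Tokens{101, 2023, 2003, 102};

    // Widen once; the tensor references this buffer rather than copying,
    // so the buffer must outlive the tensor and any use of it.
    std::vector<std::int64_t> tokens64(s_Tokens.begin(), s_Tokens.end());
    torch::Tensor tokensTensor =
        torch::from_blob(tokens64.data(),
                         {1, static_cast<std::int64_t>(tokens64.size())},
                         at::dtype(torch::kInt64));
}
```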
Force-pushed from 69b9e6c to 3a268ff
Thanks for iterating. LGTM.
LGTM if you could just change a couple of things related to securing the input.
```cpp
//! Validation on the input documents is light. It is expected the input
//! comes from another process which tightly controls what is sent.
//! Input from an outside source that has not been sanitized should never
//! be sent.
```
This is saying to a hacker, "If you can manage to send dodgy input to this process we'll give you a shell prompt on the system."
I think the input is actually validated to the extent of preventing array bounds overwrites. So instead the comment could be more along the lines of, "Validation exists to prevent memory violations from malicious input, but no more. The caller is responsible for sending input that will not result in errors from libTorch and will produce meaningful results."
👍
```cpp
}

bool CCommandParser::validateJson(const rapidjson::Document& doc) const {
    if (doc.HasMember(REQUEST_ID) == false) {
```
Actually there is one security hole in the validation, which is that we need to confirm `doc[REQUEST_ID].IsString()`. Without this additional check, sending an integer for this field instead would be a way to get a pointer of choice dereferenced.
I added this and also checks that the token arrays contain unsigned ints.
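A sketch of what those checks might look like (the field names `request_id` and `tokens` are assumptions, not the PR's actual constants):

```cpp
#include <rapidjson/document.h>

// Hypothetical shape of the hardened validation: the request ID must be a
// string and every token must be an unsigned integer.
bool validateJson(const rapidjson::Document& doc) {
    if (doc.HasMember("request_id") == false || doc["request_id"].IsString() == false) {
        return false;
    }
    if (doc.HasMember("tokens") == false || doc["tokens"].IsArray() == false) {
        return false;
    }
    for (const auto& token : doc["tokens"].GetArray()) {
        if (token.IsUint() == false) {
            return false;
        }
    }
    return true;
}
```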
Windows is showing up a problem.

It seems that:

is still used in

So I was wrong when I said:

could be removed from the
I merged this before CI was green because the macOS ARM build will not pass until PyTorch has been added to the build machine, and Windows was failing for unrelated reasons. Given this is a feature branch PR it is fine to merge.
Defines the input and output documents for the PyTorch 3rd party model app and adds a command processor which parses JSON documents from a stream, then calls a handler function for each request. This all happens in a single thread; the output will be written before the next request document is parsed.
Input
Models accept a variable number of arguments depending on their purpose. In Python PyTorch these are named arguments; in LibTorch an array of input tensors is used. All BERT models take a list of tokens; the other parameters are passed in the fields `arg_1`, `arg_2`, etc. This program knows nothing about the expected number of arguments, it simply consumes all fields starting with `arg_` and forwards them to the model.

RapidJSON supports reading multiple documents from a stream if the `kParseStopWhenDoneFlag` flag is used. The docs don't have to have a common root (e.g. in an array or nested inside a wrapper object). Docs can optionally be separated by whitespace, but any other separator is invalid.
Input Validation
The input will come from Elasticsearch, never from a client; we control the comms with Elasticsearch, so minimal validation is required. If the request is not correctly formed then something catastrophic has happened (broken pipe).
Typically the model throws a `std::runtime_error` if the input is not right; this is caught and returned to the caller.

Output
All BERT models output a tuple, the first element of which is the output tensor. The remaining elements are model dependent (they might be logits or labels); we have not found a use case requiring the full tuple yet, so the output response will only contain the tensor (for now). The tensor must have 2 dimensions or be reducible to 2 dimensions.
The output is a JSON document containing the tensor as an array of arrays.
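A rough sketch of how a 2-D tensor could be serialised that way (a hypothetical helper, not the PR's actual code; it assumes a float tensor):

```cpp
#include <torch/torch.h>
#include <rapidjson/stringbuffer.h>
#include <rapidjson/writer.h>
#include <cstdint>

// Write a 2-D float tensor as a JSON array of arrays.
void writeTensor(const torch::Tensor& tensor,
                 rapidjson::Writer<rapidjson::StringBuffer>& writer) {
    auto accessor = tensor.accessor<float, 2>();
    writer.StartArray();
    for (std::int64_t i = 0; i < accessor.size(0); ++i) {
        writer.StartArray();
        for (std::int64_t j = 0; j < accessor.size(1); ++j) {
            writer.Double(accessor[i][j]);
        }
        writer.EndArray();
    }
    writer.EndArray();
}
```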
In the case of an error the output doc has an `error` field. It is envisaged that any errors will be returned directly to the client.
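For illustration only, a success response and an error response might look like this (the `request_id` and `inference` field names are assumptions):

```json
{"request_id": "a", "inference": [[0.12, 0.88], [0.97, 0.03]]}
{"request_id": "b", "error": "..."}
```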
Tests
The inputs to `evaluate.py` have been reworked so that a single JSON file contains both the input and expected output. Invoking the test and adding new examples is now much easier.

Closes #1700
Closes #1701