Add Operator for Implicit Models #134

Merged: 20 commits, Aug 15, 2022

Conversation

oliverholworthy
Member

Adds an operator for Implicit models: `PredictImplicit`.

A self-contained operator for serving Implicit models.
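For context, the serving-time work such an operator wraps is top-k scoring against a matrix-factorization model. A minimal sketch (random factor matrices standing in for a trained Implicit model's weights; not the operator's actual code):

```python
import numpy as np

# Stand-in factor matrices for a trained implicit-style MF model.
rng = np.random.default_rng(0)
user_factors = rng.normal(size=(5, 8)).astype(np.float32)    # 5 users
item_factors = rng.normal(size=(100, 8)).astype(np.float32)  # 100 items

def recommend(user_id: int, k: int = 10):
    """Score every item for one user and return the top-k (ids, scores)."""
    scores = item_factors @ user_factors[user_id]
    top_ids = np.argsort(-scores)[:k]  # highest score first
    return top_ids, scores[top_ids]

ids, scores = recommend(0, k=10)
```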

@nvidia-merlin-bot

CI Results
GitHub pull request #134 of commit 0e2a7dab4b7303ac973cbe834075fc01bb6602af, no merge conflicts.
Running as SYSTEM
Setting status of 0e2a7dab4b7303ac973cbe834075fc01bb6602af to PENDING with url https://10.20.13.93:8080/job/merlin_systems/130/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_systems
using credential fce1c729-5d7c-48e8-90cb-b0c314b1076e
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/systems # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/systems
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems user + githubtoken
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/systems +refs/pull/134/*:refs/remotes/origin/pr/134/* # timeout=10
 > git rev-parse 0e2a7dab4b7303ac973cbe834075fc01bb6602af^{commit} # timeout=10
Checking out Revision 0e2a7dab4b7303ac973cbe834075fc01bb6602af (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 0e2a7dab4b7303ac973cbe834075fc01bb6602af # timeout=10
Commit message: "Add docstrings to OperatorRunner and pass through model config"
 > git rev-list --no-walk 609cc61933ae594080de9e1c48004cac1a0da7a2 # timeout=10
[merlin_systems] $ /bin/bash /tmp/jenkins10727564167416475803.sh
PYTHONPATH=:/usr/local/lib/python3.8/dist-packages/:/usr/local/hugectr/lib:/var/jenkins_home/workspace/merlin_systems/systems
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_systems/systems, configfile: pyproject.toml
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 47 items / 1 skipped

tests/unit/test_version.py . [ 2%]
tests/unit/systems/test_ensemble.py ...F [ 10%]
tests/unit/systems/test_ensemble_ops.py FF [ 14%]
tests/unit/systems/test_export.py . [ 17%]
tests/unit/systems/test_graph.py . [ 19%]
tests/unit/systems/test_inference_ops.py .. [ 23%]
tests/unit/systems/test_op_runner.py FFF. [ 31%]
tests/unit/systems/test_tensorflow_inf_op.py ... [ 38%]
tests/unit/systems/fil/test_fil.py .......................... [ 93%]
tests/unit/systems/fil/test_forest.py ... [100%]

=================================== FAILURES ===================================
_____________________ test_workflow_with_forest_inference ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-15/test_workflow_with_forest_infe0')

@pytest.mark.skipif(not TRITON_SERVER_PATH, reason="triton server not found")
def test_workflow_with_forest_inference(tmpdir):
    rows = 200
    num_features = 16
    X, y = sklearn.datasets.make_regression(
        n_samples=rows,
        n_features=num_features,
        n_informative=num_features // 3,
        random_state=0,
    )
    feature_names = [str(i) for i in range(num_features)]
    df = pd.DataFrame(X, columns=feature_names, dtype=np.float32)
    dataset = Dataset(df)

    # Fit GBDT Model
    model = xgboost.XGBRegressor()
    model.fit(X, y)

    input_column_schemas = [ColumnSchema(col, dtype=np.float32) for col in feature_names]
    input_schema = Schema(input_column_schemas)
    selector = ColumnSelector(feature_names)

    workflow_ops = feature_names >> wf_ops.LogOp()
    workflow = Workflow(workflow_ops)
    workflow.fit(dataset)

    triton_chain = selector >> TransformWorkflow(workflow) >> PredictForest(model, input_schema)

    triton_ens = Ensemble(triton_chain, input_schema)

    request_df = df[:5]
    triton_ens.export(tmpdir)
>   response = _run_ensemble_on_tritonserver(
        str(tmpdir), ["output__0"], request_df, triton_ens.name
    )

tests/unit/systems/test_ensemble.py:220:


tests/unit/systems/utils/triton.py:39: in _run_ensemble_on_tritonserver
with run_triton_server(tmpdir) as client:
/usr/lib/python3.8/contextlib.py:113: in __enter__
return next(self.gen)


modelpath = '/tmp/pytest-of-jenkins/pytest-15/test_workflow_with_forest_infe0'

@contextlib.contextmanager
def run_triton_server(modelpath):
    """This function starts up a Triton server instance and returns a client to it.

    Parameters
    ----------
    modelpath : string
        The path to the model to load.

    Yields
    ------
    client: tritonclient.InferenceServerClient
        The client connected to the Triton server.

    """
    cmdline = [
        TRITON_SERVER_PATH,
        "--model-repository",
        modelpath,
        "--backend-config=tensorflow,version=2",
    ]
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = "0"
    with subprocess.Popen(cmdline, env=env) as process:
        try:
            with grpcclient.InferenceServerClient("localhost:8001") as client:
                # wait until server is ready
                for _ in range(60):
                    if process.poll() is not None:
                        retcode = process.returncode
>                       raise RuntimeError(f"Tritonserver failed to start (ret={retcode})")

E RuntimeError: Tritonserver failed to start (ret=1)

merlin/systems/triton/utils.py:46: RuntimeError
----------------------------- Captured stderr call -----------------------------
I0711 17:08:29.039796 6067 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7fdb0e000000' with size 268435456
I0711 17:08:29.040561 6067 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0711 17:08:29.044592 6067 model_repository_manager.cc:1191] loading: 0_transformworkflow:1
I0711 17:08:29.144966 6067 model_repository_manager.cc:1191] loading: 1_fil:1
I0711 17:08:29.150821 6067 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 0_transformworkflow (GPU device 0)
I0711 17:08:29.245367 6067 model_repository_manager.cc:1191] loading: 1_predictforest:1
I0711 17:08:31.542948 6067 model_repository_manager.cc:1345] successfully loaded '0_transformworkflow' version 1
I0711 17:08:31.573773 6067 initialize.hpp:43] TRITONBACKEND_Initialize: fil
I0711 17:08:31.573814 6067 backend.hpp:47] Triton TRITONBACKEND API version: 1.9
I0711 17:08:31.573823 6067 backend.hpp:52] 'fil' TRITONBACKEND API version: 1.9
I0711 17:08:31.576791 6067 model_initialize.hpp:37] TRITONBACKEND_ModelInitialize: 1_fil (version 1)
I0711 17:08:31.579624 6067 instance_initialize.hpp:46] TRITONBACKEND_ModelInstanceInitialize: 1_fil_0 (GPU device 0)
I0711 17:08:31.647534 6067 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 1_predictforest (GPU device 0)
I0711 17:08:31.647648 6067 model_repository_manager.cc:1345] successfully loaded '1_fil' version 1
0711 17:08:33.750995 6210 pb_stub.cc:301] Failed to initialize Python stub: JSONDecodeError: Expecting value: line 1 column 1 (char 0)

At:
/usr/lib/python3.8/json/decoder.py(355): raw_decode
/usr/lib/python3.8/json/decoder.py(337): decode
/usr/lib/python3.8/json/__init__.py(357): loads
/tmp/pytest-of-jenkins/pytest-15/test_workflow_with_forest_infe0/1_predictforest/1/model.py(59): initialize

E0711 17:08:34.100981 6067 model_repository_manager.cc:1348] failed to load '1_predictforest' version 1: Internal: JSONDecodeError: Expecting value: line 1 column 1 (char 0)

At:
/usr/lib/python3.8/json/decoder.py(355): raw_decode
/usr/lib/python3.8/json/decoder.py(337): decode
/usr/lib/python3.8/json/__init__.py(357): loads
/tmp/pytest-of-jenkins/pytest-15/test_workflow_with_forest_infe0/1_predictforest/1/model.py(59): initialize

E0711 17:08:34.101135 6067 model_repository_manager.cc:1551] Invalid argument: ensemble 'ensemble_model' depends on '1_predictforest' which has no loaded version
I0711 17:08:34.101243 6067 server.cc:556]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0711 17:08:34.101349 6067 server.cc:583]
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend | Path | Config |
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| python | /opt/tritonserver/backends/python/libtriton_python.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| fil | /opt/tritonserver/backends/fil/libtriton_fil.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0711 17:08:34.101474 6067 server.cc:626]
+---------------------+---------+---------------------------------------------------------------------------------------------------------------+
| Model | Version | Status |
+---------------------+---------+---------------------------------------------------------------------------------------------------------------+
| 0_transformworkflow | 1 | READY |
| 1_fil | 1 | READY |
| 1_predictforest | 1 | UNAVAILABLE: Internal: JSONDecodeError: Expecting value: line 1 column 1 (char 0) |
| | | |
| | | At: |
| | | /usr/lib/python3.8/json/decoder.py(355): raw_decode |
| | | /usr/lib/python3.8/json/decoder.py(337): decode |
| | | /usr/lib/python3.8/json/__init__.py(357): loads |
| | | /tmp/pytest-of-jenkins/pytest-15/test_workflow_with_forest_infe0/1_predictforest/1/model.py(59): initialize |
+---------------------+---------+---------------------------------------------------------------------------------------------------------------+

I0711 17:08:34.164011 6067 metrics.cc:650] Collecting metrics for GPU 0: Tesla P100-DGXS-16GB
I0711 17:08:34.164921 6067 tritonserver.cc:2138]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.22.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0] | /tmp/pytest-of-jenkins/pytest-15/test_workflow_with_forest_infe0 |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0711 17:08:34.164958 6067 server.cc:257] Waiting for in-flight requests to complete.
I0711 17:08:34.164969 6067 server.cc:273] Timeout 30: Found 0 model versions that have in-flight inferences
I0711 17:08:34.164978 6067 model_repository_manager.cc:1223] unloading: 1_fil:1
I0711 17:08:34.165019 6067 model_repository_manager.cc:1223] unloading: 0_transformworkflow:1
I0711 17:08:34.165051 6067 server.cc:288] All models are stopped, unloading models
I0711 17:08:34.165060 6067 server.cc:295] Timeout 30: Found 2 live models and 0 in-flight non-inference requests
I0711 17:08:34.165124 6067 instance_finalize.hpp:36] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0711 17:08:34.165638 6067 model_finalize.hpp:36] TRITONBACKEND_ModelFinalize: delete model state
I0711 17:08:34.165708 6067 model_repository_manager.cc:1328] successfully unloaded '1_fil' version 1
I0711 17:08:35.165147 6067 server.cc:295] Timeout 29: Found 1 live models and 0 in-flight non-inference requests
W0711 17:08:35.184681 6067 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0711 17:08:35.184736 6067 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
I0711 17:08:35.601661 6067 model_repository_manager.cc:1328] successfully unloaded '0_transformworkflow' version 1
I0711 17:08:36.165273 6067 server.cc:295] Timeout 28: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
W0711 17:08:36.184899 6067 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0711 17:08:36.184948 6067 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
____________________________ test_softmax_sampling _____________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-15/test_softmax_sampling0')

@pytest.mark.skipif(not TRITON_SERVER_PATH, reason="triton server not found")
def test_softmax_sampling(tmpdir):
    request_schema = Schema(
        [
            ColumnSchema("movie_ids", dtype=np.int32),
            ColumnSchema("output_1", dtype=np.float32),
        ]
    )

    combined_features = {
        "movie_ids": np.random.randint(0, 10000, 100).astype(np.int32),
        "output_1": np.random.random(100).astype(np.float32),
    }

    request = make_df(combined_features)

    ordering = ["movie_ids"] >> SoftmaxSampling(relevance_col="output_1", topk=10, temperature=20.0)

    ensemble = Ensemble(ordering, request_schema)
    ens_config, node_configs = ensemble.export(tmpdir)
>   response = _run_ensemble_on_tritonserver(
        tmpdir, ensemble.graph.output_schema.column_names, request, "ensemble_model"
    )

tests/unit/systems/test_ensemble_ops.py:52:


tests/unit/systems/utils/triton.py:39: in _run_ensemble_on_tritonserver
with run_triton_server(tmpdir) as client:
/usr/lib/python3.8/contextlib.py:113: in __enter__
return next(self.gen)


modelpath = local('/tmp/pytest-of-jenkins/pytest-15/test_softmax_sampling0')

@contextlib.contextmanager
def run_triton_server(modelpath):
    """This function starts up a Triton server instance and returns a client to it.

    Parameters
    ----------
    modelpath : string
        The path to the model to load.

    Yields
    ------
    client: tritonclient.InferenceServerClient
        The client connected to the Triton server.

    """
    cmdline = [
        TRITON_SERVER_PATH,
        "--model-repository",
        modelpath,
        "--backend-config=tensorflow,version=2",
    ]
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = "0"
    with subprocess.Popen(cmdline, env=env) as process:
        try:
            with grpcclient.InferenceServerClient("localhost:8001") as client:
                # wait until server is ready
                for _ in range(60):
                    if process.poll() is not None:
                        retcode = process.returncode
>                       raise RuntimeError(f"Tritonserver failed to start (ret={retcode})")

E RuntimeError: Tritonserver failed to start (ret=1)

merlin/systems/triton/utils.py:46: RuntimeError
----------------------------- Captured stderr call -----------------------------
I0711 17:08:38.468079 6268 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7fe58e000000' with size 268435456
I0711 17:08:38.468915 6268 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0711 17:08:38.471393 6268 model_repository_manager.cc:1191] loading: 0_softmaxsampling:1
I0711 17:08:38.578570 6268 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 0_softmaxsampling (GPU device 0)
0711 17:08:40.680625 6308 pb_stub.cc:301] Failed to initialize Python stub: JSONDecodeError: Expecting value: line 1 column 1 (char 0)

At:
/usr/lib/python3.8/json/decoder.py(355): raw_decode
/usr/lib/python3.8/json/decoder.py(337): decode
/usr/lib/python3.8/json/__init__.py(357): loads
/tmp/pytest-of-jenkins/pytest-15/test_softmax_sampling0/0_softmaxsampling/1/model.py(59): initialize

E0711 17:08:41.060744 6268 model_repository_manager.cc:1348] failed to load '0_softmaxsampling' version 1: Internal: JSONDecodeError: Expecting value: line 1 column 1 (char 0)

At:
/usr/lib/python3.8/json/decoder.py(355): raw_decode
/usr/lib/python3.8/json/decoder.py(337): decode
/usr/lib/python3.8/json/__init__.py(357): loads
/tmp/pytest-of-jenkins/pytest-15/test_softmax_sampling0/0_softmaxsampling/1/model.py(59): initialize

E0711 17:08:41.060945 6268 model_repository_manager.cc:1551] Invalid argument: ensemble 'ensemble_model' depends on '0_softmaxsampling' which has no loaded version
I0711 17:08:41.061056 6268 server.cc:556]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0711 17:08:41.061152 6268 server.cc:583]
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend | Path | Config |
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| python | /opt/tritonserver/backends/python/libtriton_python.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0711 17:08:41.061247 6268 server.cc:626]
+-------------------+---------+--------------------------------------------------------------------------------------------------------+
| Model | Version | Status |
+-------------------+---------+--------------------------------------------------------------------------------------------------------+
| 0_softmaxsampling | 1 | UNAVAILABLE: Internal: JSONDecodeError: Expecting value: line 1 column 1 (char 0) |
| | | |
| | | At: |
| | | /usr/lib/python3.8/json/decoder.py(355): raw_decode |
| | | /usr/lib/python3.8/json/decoder.py(337): decode |
| | | /usr/lib/python3.8/json/__init__.py(357): loads |
| | | /tmp/pytest-of-jenkins/pytest-15/test_softmax_sampling0/0_softmaxsampling/1/model.py(59): initialize |
+-------------------+---------+--------------------------------------------------------------------------------------------------------+

I0711 17:08:41.124204 6268 metrics.cc:650] Collecting metrics for GPU 0: Tesla P100-DGXS-16GB
I0711 17:08:41.125096 6268 tritonserver.cc:2138]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.22.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0] | /tmp/pytest-of-jenkins/pytest-15/test_softmax_sampling0 |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0711 17:08:41.125132 6268 server.cc:257] Waiting for in-flight requests to complete.
I0711 17:08:41.125139 6268 server.cc:273] Timeout 30: Found 0 model versions that have in-flight inferences
I0711 17:08:41.125150 6268 server.cc:288] All models are stopped, unloading models
I0711 17:08:41.125156 6268 server.cc:295] Timeout 30: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
W0711 17:08:42.144093 6268 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0711 17:08:42.144178 6268 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
____________________________ test_filter_candidates ____________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-15/test_filter_candidates0')

@pytest.mark.skipif(not TRITON_SERVER_PATH, reason="triton server not found")
def test_filter_candidates(tmpdir):
    request_schema = Schema(
        [
            ColumnSchema("candidate_ids", dtype=np.int32),
            ColumnSchema("movie_ids", dtype=np.int32),
        ]
    )

    candidate_ids = np.random.randint(1, 100000, 100).astype(np.int32)
    movie_ids_1 = np.zeros(100, dtype=np.int32)
    movie_ids_1[:20] = np.unique(candidate_ids)[:20]

    combined_features = {
        "candidate_ids": candidate_ids,
        "movie_ids": movie_ids_1,
    }

    request = make_df(combined_features)

    filtering = ["candidate_ids"] >> FilterCandidates(filter_out=["movie_ids"])

    ensemble = Ensemble(filtering, request_schema)
    ens_config, node_configs = ensemble.export(tmpdir)
>   response = _run_ensemble_on_tritonserver(
        tmpdir, ensemble.graph.output_schema.column_names, request, "ensemble_model"
    )

tests/unit/systems/test_ensemble_ops.py:84:


tests/unit/systems/utils/triton.py:39: in _run_ensemble_on_tritonserver
with run_triton_server(tmpdir) as client:
/usr/lib/python3.8/contextlib.py:113: in __enter__
return next(self.gen)


modelpath = local('/tmp/pytest-of-jenkins/pytest-15/test_filter_candidates0')

@contextlib.contextmanager
def run_triton_server(modelpath):
    """This function starts up a Triton server instance and returns a client to it.

    Parameters
    ----------
    modelpath : string
        The path to the model to load.

    Yields
    ------
    client: tritonclient.InferenceServerClient
        The client connected to the Triton server.

    """
    cmdline = [
        TRITON_SERVER_PATH,
        "--model-repository",
        modelpath,
        "--backend-config=tensorflow,version=2",
    ]
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = "0"
    with subprocess.Popen(cmdline, env=env) as process:
        try:
            with grpcclient.InferenceServerClient("localhost:8001") as client:
                # wait until server is ready
                for _ in range(60):
                    if process.poll() is not None:
                        retcode = process.returncode
>                       raise RuntimeError(f"Tritonserver failed to start (ret={retcode})")

E RuntimeError: Tritonserver failed to start (ret=1)

merlin/systems/triton/utils.py:46: RuntimeError
----------------------------- Captured stderr call -----------------------------
I0711 17:08:43.569853 6435 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f01a6000000' with size 268435456
I0711 17:08:43.570584 6435 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0711 17:08:43.573734 6435 model_repository_manager.cc:1191] loading: 0_filtercandidates:1
I0711 17:08:43.681006 6435 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 0_filtercandidates (GPU device 0)
0711 17:08:45.785922 6515 pb_stub.cc:301] Failed to initialize Python stub: JSONDecodeError: Expecting value: line 1 column 1 (char 0)

At:
/usr/lib/python3.8/json/decoder.py(355): raw_decode
/usr/lib/python3.8/json/decoder.py(337): decode
/usr/lib/python3.8/json/__init__.py(357): loads
/tmp/pytest-of-jenkins/pytest-15/test_filter_candidates0/0_filtercandidates/1/model.py(59): initialize

E0711 17:08:46.149111 6435 model_repository_manager.cc:1348] failed to load '0_filtercandidates' version 1: Internal: JSONDecodeError: Expecting value: line 1 column 1 (char 0)

At:
/usr/lib/python3.8/json/decoder.py(355): raw_decode
/usr/lib/python3.8/json/decoder.py(337): decode
/usr/lib/python3.8/json/__init__.py(357): loads
/tmp/pytest-of-jenkins/pytest-15/test_filter_candidates0/0_filtercandidates/1/model.py(59): initialize

E0711 17:08:46.149317 6435 model_repository_manager.cc:1551] Invalid argument: ensemble 'ensemble_model' depends on '0_filtercandidates' which has no loaded version
I0711 17:08:46.149410 6435 server.cc:556]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0711 17:08:46.149496 6435 server.cc:583]
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend | Path | Config |
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| python | /opt/tritonserver/backends/python/libtriton_python.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0711 17:08:46.149590 6435 server.cc:626]
+--------------------+---------+----------------------------------------------------------------------------------------------------------+
| Model | Version | Status |
+--------------------+---------+----------------------------------------------------------------------------------------------------------+
| 0_filtercandidates | 1 | UNAVAILABLE: Internal: JSONDecodeError: Expecting value: line 1 column 1 (char 0) |
| | | |
| | | At: |
| | | /usr/lib/python3.8/json/decoder.py(355): raw_decode |
| | | /usr/lib/python3.8/json/decoder.py(337): decode |
| | | /usr/lib/python3.8/json/__init__.py(357): loads |
| | | /tmp/pytest-of-jenkins/pytest-15/test_filter_candidates0/0_filtercandidates/1/model.py(59): initialize |
+--------------------+---------+----------------------------------------------------------------------------------------------------------+

I0711 17:08:46.218836 6435 metrics.cc:650] Collecting metrics for GPU 0: Tesla P100-DGXS-16GB
I0711 17:08:46.219637 6435 tritonserver.cc:2138]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.22.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0] | /tmp/pytest-of-jenkins/pytest-15/test_filter_candidates0 |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0711 17:08:46.219668 6435 server.cc:257] Waiting for in-flight requests to complete.
I0711 17:08:46.219675 6435 server.cc:273] Timeout 30: Found 0 model versions that have in-flight inferences
I0711 17:08:46.219685 6435 server.cc:288] All models are stopped, unloading models
I0711 17:08:46.219690 6435 server.cc:295] Timeout 30: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
W0711 17:08:47.234617 6435 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0711 17:08:47.234689 6435 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
_____________________ test_op_runner_loads_config[parquet] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-15/test_op_runner_loads_config_pa0')
dataset = <merlin.io.dataset.Dataset object at 0x7f85da82c100>
engine = 'parquet'

@pytest.mark.parametrize("engine", ["parquet"])
def test_op_runner_loads_config(tmpdir, dataset, engine):
    input_columns = ["x", "y", "id"]

    # NVT
    workflow_ops = input_columns >> wf_ops.Rename(postfix="_nvt")
    workflow = nvt.Workflow(workflow_ops)
    workflow.fit(dataset)
    workflow.save(str(tmpdir))

    repository = "repository_path/"
    version = 1
    kind = ""
    config = {
        "parameters": {
            "operator_names": {"string_value": json.dumps(["PlusTwoOp_1"])},
            "PlusTwoOp_1": {
                "string_value": json.dumps(
                    {
                        "module_name": PlusTwoOp.__module__,
                        "class_name": "PlusTwoOp",
                    }
                )
            },
        }
    }
>   runner = op_runner.OperatorRunner(config, repository, version, kind)

tests/unit/systems/test_op_runner.py:60:


self = <merlin.systems.dag.op_runner.OperatorRunner object at 0x7f85dc343370>
config = {'parameters': {'PlusTwoOp_1': {'string_value': '{"module_name": "tests.unit.systems.utils.ops", "class_name": "PlusTwoOp"}'}, 'operator_names': {'string_value': '["PlusTwoOp_1"]'}}}
model_repository = 'repository_path/', model_version = 1, model_name = ''
kind = ''

def __init__(self, config, model_repository="./", model_version=1, model_name=None, kind=""):
    """Instantiate an OperatorRunner"""
    operator_names = self.fetch_json_param(config, "operator_names")
    op_configs = [self.fetch_json_param(config, op_name) for op_name in operator_names]

    self.operators = []
    for op_config in op_configs:
        module_name = op_config["module_name"]
        class_name = op_config["class_name"]

        op_module = importlib.import_module(module_name)
        op_class = getattr(op_module, class_name)
      operator = op_class.from_config(
            op_config,
            model_repository=model_repository,
            model_name=model_name,
            model_version=model_version,
        )

E TypeError: from_config() got an unexpected keyword argument 'model_repository'

merlin/systems/dag/op_runner.py:36: TypeError
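The failure above comes from a signature mismatch: `OperatorRunner` now forwards `model_repository`, `model_name`, and `model_version` to each operator's `from_config`, so operators still defined against the older one-argument signature reject the call. A minimal sketch of the mismatch (the `OldStyleOp`/`NewStyleOp` class names are ours, not from the repo):

```python
class OldStyleOp:
    @classmethod
    def from_config(cls, config):  # pre-change signature: config only
        return cls()


class NewStyleOp:
    @classmethod
    def from_config(cls, config, **kwargs):  # tolerates the runner's new kwargs
        op = cls()
        op.model_repository = kwargs.get("model_repository", "./")
        return op


try:
    OldStyleOp.from_config({}, model_repository="repository_path/")
except TypeError as exc:
    # e.g. "... got an unexpected keyword argument 'model_repository'"
    print(exc)

op = NewStyleOp.from_config({}, model_repository="repository_path/", model_version=1)
print(op.model_repository)  # repository_path/
```

Accepting `**kwargs` keeps older operators forward-compatible when the runner grows new keyword arguments.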
_______________ test_op_runner_loads_multiple_ops_same[parquet] ________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-15/test_op_runner_loads_multiple_0')
dataset = <merlin.io.dataset.Dataset object at 0x7f85dc9db880>
engine = 'parquet'

@pytest.mark.parametrize("engine", ["parquet"])
def test_op_runner_loads_multiple_ops_same(tmpdir, dataset, engine):
    # NVT
    schema = dataset.schema
    for name in schema.column_names:
        dataset.schema.column_schemas[name] = dataset.schema.column_schemas[name].with_tags(
            [Tags.USER]
        )

    repository = "repository_path/"
    version = 1
    kind = ""
    config = {
        "parameters": {
            "operator_names": {"string_value": json.dumps(["PlusTwoOp_1", "PlusTwoOp_2"])},
            "PlusTwoOp_1": {
                "string_value": json.dumps(
                    {
                        "module_name": PlusTwoOp.__module__,
                        "class_name": "PlusTwoOp",
                    }
                )
            },
            "PlusTwoOp_2": {
                "string_value": json.dumps(
                    {
                        "module_name": PlusTwoOp.__module__,
                        "class_name": "PlusTwoOp",
                    }
                )
            },
        }
    }
  runner = op_runner.OperatorRunner(config, repository, version, kind)

tests/unit/systems/test_op_runner.py:100:


self = <merlin.systems.dag.op_runner.OperatorRunner object at 0x7f85dc7764c0>
config = {'parameters': {'PlusTwoOp_1': {'string_value': '{"module_name": "tests.unit.systems.utils.ops", "class_name": "PlusTw...ystems.utils.ops", "class_name": "PlusTwoOp"}'}, 'operator_names': {'string_value': '["PlusTwoOp_1", "PlusTwoOp_2"]'}}}
model_repository = 'repository_path/', model_version = 1, model_name = ''
kind = ''

def __init__(self, config, model_repository="./", model_version=1, model_name=None, kind=""):
    """Instantiate an OperatorRunner"""
    operator_names = self.fetch_json_param(config, "operator_names")
    op_configs = [self.fetch_json_param(config, op_name) for op_name in operator_names]

    self.operators = []
    for op_config in op_configs:
        module_name = op_config["module_name"]
        class_name = op_config["class_name"]

        op_module = importlib.import_module(module_name)
        op_class = getattr(op_module, class_name)
      operator = op_class.from_config(
            op_config,
            model_repository=model_repository,
            model_name=model_name,
            model_version=model_version,
        )

E TypeError: from_config() got an unexpected keyword argument 'model_repository'

merlin/systems/dag/op_runner.py:36: TypeError
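The `config` dicts built in these tests follow Triton's custom-parameter encoding: each parameter value is a JSON string wrapped in `{"string_value": ...}`. A small decoder in the spirit of `OperatorRunner.fetch_json_param` (this standalone helper is our sketch, not the repo's code):

```python
import json


def fetch_json_param(model_config, param_name):
    # Triton model configs carry custom parameters as
    # {"parameters": {"<name>": {"string_value": "<json>"}}};
    # decode one of them back into a Python object.
    string_value = model_config["parameters"][param_name]["string_value"]
    return json.loads(string_value)


config = {
    "parameters": {
        "operator_names": {"string_value": json.dumps(["PlusTwoOp_1", "PlusTwoOp_2"])},
        "PlusTwoOp_1": {
            "string_value": json.dumps(
                {"module_name": "tests.unit.systems.utils.ops", "class_name": "PlusTwoOp"}
            )
        },
    }
}

print(fetch_json_param(config, "operator_names"))  # ['PlusTwoOp_1', 'PlusTwoOp_2']
```

The runner then imports `module_name` with `importlib` and looks up `class_name` on it, which is why both keys appear in every operator's parameter blob.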
___________ test_op_runner_loads_multiple_ops_same_execute[parquet] ____________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-15/test_op_runner_loads_multiple_1')
dataset = <merlin.io.dataset.Dataset object at 0x7f85da95d3d0>
engine = 'parquet'

@pytest.mark.parametrize("engine", ["parquet"])
def test_op_runner_loads_multiple_ops_same_execute(tmpdir, dataset, engine):
    # NVT
    schema = dataset.schema
    for name in schema.column_names:
        dataset.schema.column_schemas[name] = dataset.schema.column_schemas[name].with_tags(
            [Tags.USER]
        )

    repository = "repository_path/"
    version = 1
    kind = ""
    config = {
        "parameters": {
            "operator_names": {"string_value": json.dumps(["PlusTwoOp_1", "PlusTwoOp_2"])},
            "PlusTwoOp_1": {
                "string_value": json.dumps(
                    {
                        "module_name": PlusTwoOp.__module__,
                        "class_name": "PlusTwoOp",
                    }
                )
            },
            "PlusTwoOp_2": {
                "string_value": json.dumps(
                    {
                        "module_name": PlusTwoOp.__module__,
                        "class_name": "PlusTwoOp",
                    }
                )
            },
        }
    }
  runner = op_runner.OperatorRunner(config, repository, version, kind)

tests/unit/systems/test_op_runner.py:142:


self = <merlin.systems.dag.op_runner.OperatorRunner object at 0x7f85da8d9c70>
config = {'parameters': {'PlusTwoOp_1': {'string_value': '{"module_name": "tests.unit.systems.utils.ops", "class_name": "PlusTw...ystems.utils.ops", "class_name": "PlusTwoOp"}'}, 'operator_names': {'string_value': '["PlusTwoOp_1", "PlusTwoOp_2"]'}}}
model_repository = 'repository_path/', model_version = 1, model_name = ''
kind = ''

def __init__(self, config, model_repository="./", model_version=1, model_name=None, kind=""):
    """Instantiate an OperatorRunner"""
    operator_names = self.fetch_json_param(config, "operator_names")
    op_configs = [self.fetch_json_param(config, op_name) for op_name in operator_names]

    self.operators = []
    for op_config in op_configs:
        module_name = op_config["module_name"]
        class_name = op_config["class_name"]

        op_module = importlib.import_module(module_name)
        op_class = getattr(op_module, class_name)
      operator = op_class.from_config(
            op_config,
            model_repository=model_repository,
            model_name=model_name,
            model_version=model_version,
        )

E TypeError: from_config() got an unexpected keyword argument 'model_repository'

merlin/systems/dag/op_runner.py:36: TypeError
=============================== warnings summary ===============================
../../../../../usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/__init__.py:18
/usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/__init__.py:18: DeprecationWarning: The nvtabular.framework_utils module is being replaced by the Merlin Models library. Support for importing from nvtabular.framework_utils is deprecated, and will be removed in a future version. Please consider using the models and layers from Merlin Models instead.
warnings.warn(

tests/unit/systems/test_ensemble.py::test_workflow_tf_e2e_config_verification[parquet]
tests/unit/systems/test_ensemble.py::test_workflow_tf_e2e_multi_op_run[parquet]
tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
tests/unit/systems/test_inference_ops.py::test_workflow_op_validates_schemas[parquet]
tests/unit/systems/test_inference_ops.py::test_workflow_op_exports_own_config[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_loads_config[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_loads_multiple_ops_same[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_loads_multiple_ops_same_execute[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_single_node_export[parquet]
/usr/local/lib/python3.8/dist-packages/cudf/core/frame.py:384: UserWarning: The deep parameter is ignored and is only included for pandas compatibility.
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column x is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column y is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column id is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/fil/test_fil.py::test_binary_classifier_default[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_binary_classifier_with_proba[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_multi_classifier[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_regressor[sklearn_forest_regressor-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_model_file[sklearn_forest_regressor-checkpoint.tl]
/usr/local/lib/python3.8/dist-packages/sklearn/utils/deprecation.py:103: FutureWarning: Attribute n_features_ was deprecated in version 1.0 and will be removed in 1.2. Use n_features_in_ instead.
warnings.warn(msg, category=FutureWarning)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED tests/unit/systems/test_ensemble.py::test_workflow_with_forest_inference
FAILED tests/unit/systems/test_ensemble_ops.py::test_softmax_sampling - Runti...
FAILED tests/unit/systems/test_ensemble_ops.py::test_filter_candidates - Runt...
FAILED tests/unit/systems/test_op_runner.py::test_op_runner_loads_config[parquet]
FAILED tests/unit/systems/test_op_runner.py::test_op_runner_loads_multiple_ops_same[parquet]
FAILED tests/unit/systems/test_op_runner.py::test_op_runner_loads_multiple_ops_same_execute[parquet]
======= 6 failed, 41 passed, 1 skipped, 18 warnings in 360.21s (0:06:00) =======
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.github.com/repos/NVIDIA-Merlin/systems/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_systems] $ /bin/bash /tmp/jenkins421593222794637263.sh

@nvidia-merlin-bot

Click to view CI Results
GitHub pull request #134 of commit eee034fdf579d5e62575b89159f76fc22aaed600, no merge conflicts.
Running as SYSTEM
Setting status of eee034fdf579d5e62575b89159f76fc22aaed600 to PENDING with url https://10.20.13.93:8080/job/merlin_systems/131/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_systems
using credential fce1c729-5d7c-48e8-90cb-b0c314b1076e
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/systems # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/systems
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems user + githubtoken
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/systems +refs/pull/134/*:refs/remotes/origin/pr/134/* # timeout=10
 > git rev-parse eee034fdf579d5e62575b89159f76fc22aaed600^{commit} # timeout=10
Checking out Revision eee034fdf579d5e62575b89159f76fc22aaed600 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f eee034fdf579d5e62575b89159f76fc22aaed600 # timeout=10
Commit message: "Add workflow for implicit op"
 > git rev-list --no-walk 0e2a7dab4b7303ac973cbe834075fc01bb6602af # timeout=10
[merlin_systems] $ /bin/bash /tmp/jenkins7531757995567704480.sh
PYTHONPATH=:/usr/local/lib/python3.8/dist-packages/:/usr/local/hugectr/lib:/var/jenkins_home/workspace/merlin_systems/systems
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_systems/systems, configfile: pyproject.toml
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 47 items / 1 skipped

tests/unit/test_version.py . [ 2%]
tests/unit/systems/test_ensemble.py ...F [ 10%]
tests/unit/systems/test_ensemble_ops.py FF [ 14%]
tests/unit/systems/test_export.py . [ 17%]
tests/unit/systems/test_graph.py . [ 19%]
tests/unit/systems/test_inference_ops.py .. [ 23%]
tests/unit/systems/test_op_runner.py FFF. [ 31%]
tests/unit/systems/test_tensorflow_inf_op.py ... [ 38%]
tests/unit/systems/fil/test_fil.py .......................... [ 93%]
tests/unit/systems/fil/test_forest.py ... [100%]

=================================== FAILURES ===================================
_____________________ test_workflow_with_forest_inference ______________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-18/test_workflow_with_forest_infe0')

@pytest.mark.skipif(not TRITON_SERVER_PATH, reason="triton server not found")
def test_workflow_with_forest_inference(tmpdir):
    rows = 200
    num_features = 16
    X, y = sklearn.datasets.make_regression(
        n_samples=rows,
        n_features=num_features,
        n_informative=num_features // 3,
        random_state=0,
    )
    feature_names = [str(i) for i in range(num_features)]
    df = pd.DataFrame(X, columns=feature_names, dtype=np.float32)
    dataset = Dataset(df)

    # Fit GBDT Model
    model = xgboost.XGBRegressor()
    model.fit(X, y)

    input_column_schemas = [ColumnSchema(col, dtype=np.float32) for col in feature_names]
    input_schema = Schema(input_column_schemas)
    selector = ColumnSelector(feature_names)

    workflow_ops = feature_names >> wf_ops.LogOp()
    workflow = Workflow(workflow_ops)
    workflow.fit(dataset)

    triton_chain = selector >> TransformWorkflow(workflow) >> PredictForest(model, input_schema)

    triton_ens = Ensemble(triton_chain, input_schema)

    request_df = df[:5]
    triton_ens.export(tmpdir)
  response = _run_ensemble_on_tritonserver(
        str(tmpdir), ["output__0"], request_df, triton_ens.name
    )

tests/unit/systems/test_ensemble.py:220:


tests/unit/systems/utils/triton.py:39: in _run_ensemble_on_tritonserver
with run_triton_server(tmpdir) as client:
/usr/lib/python3.8/contextlib.py:113: in __enter__
return next(self.gen)


modelpath = '/tmp/pytest-of-jenkins/pytest-18/test_workflow_with_forest_infe0'

@contextlib.contextmanager
def run_triton_server(modelpath):
    """This function starts up a Triton server instance and returns a client to it.

    Parameters
    ----------
    modelpath : string
        The path to the model to load.

    Yields
    ------
    client: tritonclient.InferenceServerClient
        The client connected to the Triton server.

    """
    cmdline = [
        TRITON_SERVER_PATH,
        "--model-repository",
        modelpath,
        "--backend-config=tensorflow,version=2",
    ]
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = "0"
    with subprocess.Popen(cmdline, env=env) as process:
        try:
            with grpcclient.InferenceServerClient("localhost:8001") as client:
                # wait until server is ready
                for _ in range(60):
                    if process.poll() is not None:
                        retcode = process.returncode
                      raise RuntimeError(f"Tritonserver failed to start (ret={retcode})")

E RuntimeError: Tritonserver failed to start (ret=1)

merlin/systems/triton/utils.py:46: RuntimeError
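The readiness loop in `run_triton_server` is truncated in the traceback above; the underlying pattern is to poll a readiness probe while checking that the server process has not already exited. A generic version of that pattern (the helper name `wait_until_ready` is ours):

```python
import subprocess
import sys
import time


def wait_until_ready(process, is_ready, attempts=60, delay=1.0):
    """Poll is_ready() until it returns True, failing fast if the process dies."""
    for _ in range(attempts):
        if process.poll() is not None:  # server exited during startup
            raise RuntimeError(f"Tritonserver failed to start (ret={process.returncode})")
        if is_ready():
            return
        time.sleep(delay)
    raise RuntimeError("Timed out waiting for Tritonserver to become ready")


# Example: a process that exits immediately takes the startup-failure path,
# which is exactly what produces the RuntimeError seen in these tests.
proc = subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(1)"])
proc.wait()
try:
    wait_until_ready(proc, lambda: False, attempts=3, delay=0.01)
except RuntimeError as exc:
    print(exc)  # Tritonserver failed to start (ret=1)
```

In the real utility the probe would call something like the Triton client's server-ready check instead of `lambda: False`.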
----------------------------- Captured stderr call -----------------------------
I0711 18:23:49.759742 7179 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f5576000000' with size 268435456
I0711 18:23:49.760484 7179 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0711 18:23:49.764566 7179 model_repository_manager.cc:1191] loading: 1_fil:1
I0711 18:23:49.864897 7179 model_repository_manager.cc:1191] loading: 1_predictforest:1
I0711 18:23:49.883753 7179 initialize.hpp:43] TRITONBACKEND_Initialize: fil
I0711 18:23:49.883779 7179 backend.hpp:47] Triton TRITONBACKEND API version: 1.9
I0711 18:23:49.883785 7179 backend.hpp:52] 'fil' TRITONBACKEND API version: 1.9
I0711 18:23:49.884347 7179 model_initialize.hpp:37] TRITONBACKEND_ModelInitialize: 1_fil (version 1)
I0711 18:23:49.885864 7179 instance_initialize.hpp:46] TRITONBACKEND_ModelInstanceInitialize: 1_fil_0 (GPU device 0)
I0711 18:23:49.931958 7179 model_repository_manager.cc:1345] successfully loaded '1_fil' version 1
I0711 18:23:49.965130 7179 model_repository_manager.cc:1191] loading: 0_transformworkflow:1
I0711 18:23:49.967772 7179 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 1_predictforest (GPU device 0)
0711 18:23:52.558734 7236 pb_stub.cc:301] Failed to initialize Python stub: TypeError: from_config() got an unexpected keyword argument 'model_repository'

At:
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/op_runner.py(36): __init__
/tmp/pytest-of-jenkins/pytest-18/test_workflow_with_forest_infe0/1_predictforest/1/model.py(73): initialize

I0711 18:23:53.093952 7179 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 0_transformworkflow (GPU device 0)
E0711 18:23:53.103773 7179 model_repository_manager.cc:1348] failed to load '1_predictforest' version 1: Internal: TypeError: from_config() got an unexpected keyword argument 'model_repository'

At:
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/op_runner.py(36): __init__
/tmp/pytest-of-jenkins/pytest-18/test_workflow_with_forest_infe0/1_predictforest/1/model.py(73): initialize

I0711 18:23:55.452826 7179 model_repository_manager.cc:1345] successfully loaded '0_transformworkflow' version 1
E0711 18:23:55.452955 7179 model_repository_manager.cc:1551] Invalid argument: ensemble 'ensemble_model' depends on '1_predictforest' which has no loaded version
I0711 18:23:55.453094 7179 server.cc:556]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0711 18:23:55.453196 7179 server.cc:583]
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend | Path | Config |
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| fil | /opt/tritonserver/backends/fil/libtriton_fil.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
| python | /opt/tritonserver/backends/python/libtriton_python.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0711 18:23:55.453302 7179 server.cc:626]
+---------------------+---------+---------------------------------------------------------------------------------------------------------------+
| Model | Version | Status |
+---------------------+---------+---------------------------------------------------------------------------------------------------------------+
| 0_transformworkflow | 1 | READY |
| 1_fil | 1 | READY |
| 1_predictforest | 1 | UNAVAILABLE: Internal: TypeError: from_config() got an unexpected keyword argument 'model_repository' |
| | | |
| | | At: |
| | | /var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/op_runner.py(36): __init__ |
| | | /tmp/pytest-of-jenkins/pytest-18/test_workflow_with_forest_infe0/1_predictforest/1/model.py(73): initialize |
+---------------------+---------+---------------------------------------------------------------------------------------------------------------+

I0711 18:23:55.517075 7179 metrics.cc:650] Collecting metrics for GPU 0: Tesla P100-DGXS-16GB
I0711 18:23:55.517932 7179 tritonserver.cc:2138]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.22.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0] | /tmp/pytest-of-jenkins/pytest-18/test_workflow_with_forest_infe0 |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0711 18:23:55.517967 7179 server.cc:257] Waiting for in-flight requests to complete.
I0711 18:23:55.517976 7179 server.cc:273] Timeout 30: Found 0 model versions that have in-flight inferences
I0711 18:23:55.517984 7179 model_repository_manager.cc:1223] unloading: 1_fil:1
I0711 18:23:55.518022 7179 model_repository_manager.cc:1223] unloading: 0_transformworkflow:1
I0711 18:23:55.518055 7179 server.cc:288] All models are stopped, unloading models
I0711 18:23:55.518069 7179 server.cc:295] Timeout 30: Found 2 live models and 0 in-flight non-inference requests
I0711 18:23:55.518126 7179 instance_finalize.hpp:36] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0711 18:23:55.518678 7179 model_finalize.hpp:36] TRITONBACKEND_ModelFinalize: delete model state
I0711 18:23:55.518736 7179 model_repository_manager.cc:1328] successfully unloaded '1_fil' version 1
I0711 18:23:56.518176 7179 server.cc:295] Timeout 29: Found 1 live models and 0 in-flight non-inference requests
W0711 18:23:56.538918 7179 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0711 18:23:56.538976 7179 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
I0711 18:23:57.132540 7179 model_repository_manager.cc:1328] successfully unloaded '0_transformworkflow' version 1
I0711 18:23:57.518320 7179 server.cc:295] Timeout 28: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
W0711 18:23:57.539134 7179 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0711 18:23:57.539183 7179 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
____________________________ test_softmax_sampling _____________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-18/test_softmax_sampling0')

@pytest.mark.skipif(not TRITON_SERVER_PATH, reason="triton server not found")
def test_softmax_sampling(tmpdir):
    request_schema = Schema(
        [
            ColumnSchema("movie_ids", dtype=np.int32),
            ColumnSchema("output_1", dtype=np.float32),
        ]
    )

    combined_features = {
        "movie_ids": np.random.randint(0, 10000, 100).astype(np.int32),
        "output_1": np.random.random(100).astype(np.float32),
    }

    request = make_df(combined_features)

    ordering = ["movie_ids"] >> SoftmaxSampling(relevance_col="output_1", topk=10, temperature=20.0)

    ensemble = Ensemble(ordering, request_schema)
    ens_config, node_configs = ensemble.export(tmpdir)
  response = _run_ensemble_on_tritonserver(
        tmpdir, ensemble.graph.output_schema.column_names, request, "ensemble_model"
    )

tests/unit/systems/test_ensemble_ops.py:52:


tests/unit/systems/utils/triton.py:39: in _run_ensemble_on_tritonserver
with run_triton_server(tmpdir) as client:
/usr/lib/python3.8/contextlib.py:113: in __enter__
return next(self.gen)


modelpath = local('/tmp/pytest-of-jenkins/pytest-18/test_softmax_sampling0')

@contextlib.contextmanager
def run_triton_server(modelpath):
    """This function starts up a Triton server instance and returns a client to it.

    Parameters
    ----------
    modelpath : string
        The path to the model to load.

    Yields
    ------
    client: tritonclient.InferenceServerClient
        The client connected to the Triton server.

    """
    cmdline = [
        TRITON_SERVER_PATH,
        "--model-repository",
        modelpath,
        "--backend-config=tensorflow,version=2",
    ]
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = "0"
    with subprocess.Popen(cmdline, env=env) as process:
        try:
            with grpcclient.InferenceServerClient("localhost:8001") as client:
                # wait until server is ready
                for _ in range(60):
                    if process.poll() is not None:
                        retcode = process.returncode
                      raise RuntimeError(f"Tritonserver failed to start (ret={retcode})")

E RuntimeError: Tritonserver failed to start (ret=1)

merlin/systems/triton/utils.py:46: RuntimeError
----------------------------- Captured stderr call -----------------------------
I0711 18:24:00.191569 7380 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f3b0e000000' with size 268435456
I0711 18:24:00.192340 7380 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0711 18:24:00.194790 7380 model_repository_manager.cc:1191] loading: 0_softmaxsampling:1
I0711 18:24:00.301933 7380 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 0_softmaxsampling (GPU device 0)
0711 18:24:02.411844 7420 pb_stub.cc:301] Failed to initialize Python stub: TypeError: from_config() got an unexpected keyword argument 'model_repository'

At:
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/op_runner.py(36): __init__
/tmp/pytest-of-jenkins/pytest-18/test_softmax_sampling0/0_softmaxsampling/1/model.py(73): initialize

E0711 18:24:02.770412 7380 model_repository_manager.cc:1348] failed to load '0_softmaxsampling' version 1: Internal: TypeError: from_config() got an unexpected keyword argument 'model_repository'

At:
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/op_runner.py(36): __init__
/tmp/pytest-of-jenkins/pytest-18/test_softmax_sampling0/0_softmaxsampling/1/model.py(73): initialize

E0711 18:24:02.770582 7380 model_repository_manager.cc:1551] Invalid argument: ensemble 'ensemble_model' depends on '0_softmaxsampling' which has no loaded version
I0711 18:24:02.770692 7380 server.cc:556]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0711 18:24:02.770777 7380 server.cc:583]
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend | Path | Config |
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| python | /opt/tritonserver/backends/python/libtriton_python.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0711 18:24:02.770872 7380 server.cc:626]
+-------------------+---------+--------------------------------------------------------------------------------------------------------+
| Model | Version | Status |
+-------------------+---------+--------------------------------------------------------------------------------------------------------+
| 0_softmaxsampling | 1 | UNAVAILABLE: Internal: TypeError: from_config() got an unexpected keyword argument 'model_repository' |
| | | |
| | | At: |
| | | /var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/op_runner.py(36): __init__ |
| | | /tmp/pytest-of-jenkins/pytest-18/test_softmax_sampling0/0_softmaxsampling/1/model.py(73): initialize |
+-------------------+---------+--------------------------------------------------------------------------------------------------------+

I0711 18:24:02.835276 7380 metrics.cc:650] Collecting metrics for GPU 0: Tesla P100-DGXS-16GB
I0711 18:24:02.836122 7380 tritonserver.cc:2138]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.22.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0] | /tmp/pytest-of-jenkins/pytest-18/test_softmax_sampling0 |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0711 18:24:02.836159 7380 server.cc:257] Waiting for in-flight requests to complete.
I0711 18:24:02.836167 7380 server.cc:273] Timeout 30: Found 0 model versions that have in-flight inferences
I0711 18:24:02.836184 7380 server.cc:288] All models are stopped, unloading models
I0711 18:24:02.836190 7380 server.cc:295] Timeout 30: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
W0711 18:24:03.864321 7380 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0711 18:24:03.864386 7380 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
____________________________ test_filter_candidates ____________________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-18/test_filter_candidates0')

@pytest.mark.skipif(not TRITON_SERVER_PATH, reason="triton server not found")
def test_filter_candidates(tmpdir):
    request_schema = Schema(
        [
            ColumnSchema("candidate_ids", dtype=np.int32),
            ColumnSchema("movie_ids", dtype=np.int32),
        ]
    )

    candidate_ids = np.random.randint(1, 100000, 100).astype(np.int32)
    movie_ids_1 = np.zeros(100, dtype=np.int32)
    movie_ids_1[:20] = np.unique(candidate_ids)[:20]

    combined_features = {
        "candidate_ids": candidate_ids,
        "movie_ids": movie_ids_1,
    }

    request = make_df(combined_features)

    filtering = ["candidate_ids"] >> FilterCandidates(filter_out=["movie_ids"])

    ensemble = Ensemble(filtering, request_schema)
    ens_config, node_configs = ensemble.export(tmpdir)
>   response = _run_ensemble_on_tritonserver(
        tmpdir, ensemble.graph.output_schema.column_names, request, "ensemble_model"
    )

tests/unit/systems/test_ensemble_ops.py:84:


tests/unit/systems/utils/triton.py:39: in _run_ensemble_on_tritonserver
with run_triton_server(tmpdir) as client:
/usr/lib/python3.8/contextlib.py:113: in __enter__
return next(self.gen)


modelpath = local('/tmp/pytest-of-jenkins/pytest-18/test_filter_candidates0')

@contextlib.contextmanager
def run_triton_server(modelpath):
    """This function starts up a Triton server instance and returns a client to it.

    Parameters
    ----------
    modelpath : string
        The path to the model to load.

    Yields
    ------
    client: tritonclient.InferenceServerClient
        The client connected to the Triton server.

    """
    cmdline = [
        TRITON_SERVER_PATH,
        "--model-repository",
        modelpath,
        "--backend-config=tensorflow,version=2",
    ]
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = "0"
    with subprocess.Popen(cmdline, env=env) as process:
        try:
            with grpcclient.InferenceServerClient("localhost:8001") as client:
                # wait until server is ready
                for _ in range(60):
                    if process.poll() is not None:
                        retcode = process.returncode
>               raise RuntimeError(f"Tritonserver failed to start (ret={retcode})")

E RuntimeError: Tritonserver failed to start (ret=1)

merlin/systems/triton/utils.py:46: RuntimeError
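The readiness loop in `run_triton_server` above can be condensed into a generic sketch (hypothetical helper name `wait_until_ready`, not part of Merlin): start the child process, poll it, and surface the return code in a `RuntimeError` when it exits before ever becoming ready, which is exactly the failure mode reported here.

```python
import subprocess
import sys
import time

def wait_until_ready(cmdline, is_ready, retries=60, interval=0.05):
    """Start cmdline and poll until is_ready() returns True.

    Simplified sketch of the pattern used by run_triton_server: if the
    child dies before becoming ready, raise with its return code.
    """
    process = subprocess.Popen(cmdline)
    try:
        for _ in range(retries):
            if process.poll() is not None:
                raise RuntimeError(
                    f"process failed to start (ret={process.returncode})")
            if is_ready():
                return process
            time.sleep(interval)
        raise RuntimeError("timed out waiting for readiness")
    except Exception:
        # Clean up the child if it is still running before re-raising.
        if process.poll() is None:
            process.terminate()
        raise

# A child that exits immediately with status 1 reproduces the CI failure mode:
try:
    wait_until_ready([sys.executable, "-c", "import sys; sys.exit(1)"],
                     is_ready=lambda: False)
except RuntimeError as exc:
    print(exc)  # process failed to start (ret=1)
```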
----------------------------- Captured stderr call -----------------------------
I0711 18:24:05.305813 7477 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7efd10000000' with size 268435456
I0711 18:24:05.306593 7477 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0711 18:24:05.309098 7477 model_repository_manager.cc:1191] loading: 0_filtercandidates:1
I0711 18:24:05.416288 7477 python.cc:2388] TRITONBACKEND_ModelInstanceInitialize: 0_filtercandidates (GPU device 0)
0711 18:24:07.495069 7522 pb_stub.cc:301] Failed to initialize Python stub: TypeError: from_config() got an unexpected keyword argument 'model_repository'

At:
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/op_runner.py(36): __init__
/tmp/pytest-of-jenkins/pytest-18/test_filter_candidates0/0_filtercandidates/1/model.py(73): initialize

E0711 18:24:07.877864 7477 model_repository_manager.cc:1348] failed to load '0_filtercandidates' version 1: Internal: TypeError: from_config() got an unexpected keyword argument 'model_repository'

At:
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/op_runner.py(36): __init__
/tmp/pytest-of-jenkins/pytest-18/test_filter_candidates0/0_filtercandidates/1/model.py(73): initialize

E0711 18:24:07.878041 7477 model_repository_manager.cc:1551] Invalid argument: ensemble 'ensemble_model' depends on '0_filtercandidates' which has no loaded version
I0711 18:24:07.878158 7477 server.cc:556]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0711 18:24:07.878244 7477 server.cc:583]
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend | Path | Config |
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| python | /opt/tritonserver/backends/python/libtriton_python.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0711 18:24:07.878331 7477 server.cc:626]
+--------------------+---------+----------------------------------------------------------------------------------------------------------+
| Model | Version | Status |
+--------------------+---------+----------------------------------------------------------------------------------------------------------+
| 0_filtercandidates | 1 | UNAVAILABLE: Internal: TypeError: from_config() got an unexpected keyword argument 'model_repository' |
| | | |
| | | At: |
| | | /var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/op_runner.py(36): __init__ |
| | | /tmp/pytest-of-jenkins/pytest-18/test_filter_candidates0/0_filtercandidates/1/model.py(73): initialize |
+--------------------+---------+----------------------------------------------------------------------------------------------------------+

I0711 18:24:07.941259 7477 metrics.cc:650] Collecting metrics for GPU 0: Tesla P100-DGXS-16GB
I0711 18:24:07.942126 7477 tritonserver.cc:2138]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.22.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0] | /tmp/pytest-of-jenkins/pytest-18/test_filter_candidates0 |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0711 18:24:07.942140 7477 server.cc:257] Waiting for in-flight requests to complete.
I0711 18:24:07.942147 7477 server.cc:273] Timeout 30: Found 0 model versions that have in-flight inferences
I0711 18:24:07.942167 7477 server.cc:288] All models are stopped, unloading models
I0711 18:24:07.942174 7477 server.cc:295] Timeout 30: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
W0711 18:24:08.962938 7477 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
W0711 18:24:08.962996 7477 metrics.cc:507] Unable to get memory usage for GPU 0. Memory usage status:Success, value:0. Memory total status:Success, value:0
_____________________ test_op_runner_loads_config[parquet] _____________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-18/test_op_runner_loads_config_pa0')
dataset = <merlin.io.dataset.Dataset object at 0x7f2458148e50>
engine = 'parquet'

@pytest.mark.parametrize("engine", ["parquet"])
def test_op_runner_loads_config(tmpdir, dataset, engine):
    input_columns = ["x", "y", "id"]

    # NVT
    workflow_ops = input_columns >> wf_ops.Rename(postfix="_nvt")
    workflow = nvt.Workflow(workflow_ops)
    workflow.fit(dataset)
    workflow.save(str(tmpdir))

    repository = "repository_path/"
    version = 1
    kind = ""
    config = {
        "parameters": {
            "operator_names": {"string_value": json.dumps(["PlusTwoOp_1"])},
            "PlusTwoOp_1": {
                "string_value": json.dumps(
                    {
                        "module_name": PlusTwoOp.__module__,
                        "class_name": "PlusTwoOp",
                    }
                )
            },
        }
    }
>   runner = op_runner.OperatorRunner(config, repository, version, kind)

tests/unit/systems/test_op_runner.py:60:


self = <merlin.systems.dag.op_runner.OperatorRunner object at 0x7f24580afd60>
config = {'parameters': {'PlusTwoOp_1': {'string_value': '{"module_name": "tests.unit.systems.utils.ops", "class_name": "PlusTwoOp"}'}, 'operator_names': {'string_value': '["PlusTwoOp_1"]'}}}
model_repository = 'repository_path/', model_version = 1, model_name = ''
kind = ''

def __init__(self, config, model_repository="./", model_version=1, model_name=None, kind=""):
    """Instantiate an OperatorRunner"""
    operator_names = self.fetch_json_param(config, "operator_names")
    op_configs = [self.fetch_json_param(config, op_name) for op_name in operator_names]

    self.operators = []
    for op_config in op_configs:
        module_name = op_config["module_name"]
        class_name = op_config["class_name"]

        op_module = importlib.import_module(module_name)
        op_class = getattr(op_module, class_name)
>       operator = op_class.from_config(
            op_config,
            model_repository=model_repository,
            model_name=model_name,
            model_version=model_version,
        )

E TypeError: from_config() got an unexpected keyword argument 'model_repository'

merlin/systems/dag/op_runner.py:36: TypeError
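These `TypeError`s were resolved by the commit "Handle kwargs in all from_config operator methods". A minimal sketch (hypothetical `StrictOp`/`KwargsOp` classes, not the real operators) of why the new `OperatorRunner` call site breaks operators with the old signature, and how accepting `**kwargs` restores compatibility:

```python
class StrictOp:
    @classmethod
    def from_config(cls, config):  # pre-fix signature: config only
        return cls()

class KwargsOp:
    @classmethod
    def from_config(cls, config, **kwargs):  # post-fix: tolerates extra kwargs
        return cls()

# OperatorRunner now forwards model_repository/model_name/model_version,
# so the old signature fails exactly as in the CI log above:
try:
    StrictOp.from_config({}, model_repository="repository_path/")
except TypeError as exc:
    print(type(exc).__name__)  # TypeError

# The **kwargs signature simply absorbs the extra arguments:
op = KwargsOp.from_config({}, model_repository="repository_path/",
                          model_name=None, model_version=1)
print(type(op).__name__)  # KwargsOp
```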
_______________ test_op_runner_loads_multiple_ops_same[parquet] ________________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-18/test_op_runner_loads_multiple_0')
dataset = <merlin.io.dataset.Dataset object at 0x7f245df259a0>
engine = 'parquet'

@pytest.mark.parametrize("engine", ["parquet"])
def test_op_runner_loads_multiple_ops_same(tmpdir, dataset, engine):
    # NVT
    schema = dataset.schema
    for name in schema.column_names:
        dataset.schema.column_schemas[name] = dataset.schema.column_schemas[name].with_tags(
            [Tags.USER]
        )

    repository = "repository_path/"
    version = 1
    kind = ""
    config = {
        "parameters": {
            "operator_names": {"string_value": json.dumps(["PlusTwoOp_1", "PlusTwoOp_2"])},
            "PlusTwoOp_1": {
                "string_value": json.dumps(
                    {
                        "module_name": PlusTwoOp.__module__,
                        "class_name": "PlusTwoOp",
                    }
                )
            },
            "PlusTwoOp_2": {
                "string_value": json.dumps(
                    {
                        "module_name": PlusTwoOp.__module__,
                        "class_name": "PlusTwoOp",
                    }
                )
            },
        }
    }
>   runner = op_runner.OperatorRunner(config, repository, version, kind)

tests/unit/systems/test_op_runner.py:100:


self = <merlin.systems.dag.op_runner.OperatorRunner object at 0x7f245d19b610>
config = {'parameters': {'PlusTwoOp_1': {'string_value': '{"module_name": "tests.unit.systems.utils.ops", "class_name": "PlusTw...ystems.utils.ops", "class_name": "PlusTwoOp"}'}, 'operator_names': {'string_value': '["PlusTwoOp_1", "PlusTwoOp_2"]'}}}
model_repository = 'repository_path/', model_version = 1, model_name = ''
kind = ''

def __init__(self, config, model_repository="./", model_version=1, model_name=None, kind=""):
    """Instantiate an OperatorRunner"""
    operator_names = self.fetch_json_param(config, "operator_names")
    op_configs = [self.fetch_json_param(config, op_name) for op_name in operator_names]

    self.operators = []
    for op_config in op_configs:
        module_name = op_config["module_name"]
        class_name = op_config["class_name"]

        op_module = importlib.import_module(module_name)
        op_class = getattr(op_module, class_name)
>       operator = op_class.from_config(
            op_config,
            model_repository=model_repository,
            model_name=model_name,
            model_version=model_version,
        )

E TypeError: from_config() got an unexpected keyword argument 'model_repository'

merlin/systems/dag/op_runner.py:36: TypeError
___________ test_op_runner_loads_multiple_ops_same_execute[parquet] ____________

tmpdir = local('/tmp/pytest-of-jenkins/pytest-18/test_op_runner_loads_multiple_1')
dataset = <merlin.io.dataset.Dataset object at 0x7f2458568b80>
engine = 'parquet'

@pytest.mark.parametrize("engine", ["parquet"])
def test_op_runner_loads_multiple_ops_same_execute(tmpdir, dataset, engine):
    # NVT
    schema = dataset.schema
    for name in schema.column_names:
        dataset.schema.column_schemas[name] = dataset.schema.column_schemas[name].with_tags(
            [Tags.USER]
        )

    repository = "repository_path/"
    version = 1
    kind = ""
    config = {
        "parameters": {
            "operator_names": {"string_value": json.dumps(["PlusTwoOp_1", "PlusTwoOp_2"])},
            "PlusTwoOp_1": {
                "string_value": json.dumps(
                    {
                        "module_name": PlusTwoOp.__module__,
                        "class_name": "PlusTwoOp",
                    }
                )
            },
            "PlusTwoOp_2": {
                "string_value": json.dumps(
                    {
                        "module_name": PlusTwoOp.__module__,
                        "class_name": "PlusTwoOp",
                    }
                )
            },
        }
    }
>   runner = op_runner.OperatorRunner(config, repository, version, kind)

tests/unit/systems/test_op_runner.py:142:


self = <merlin.systems.dag.op_runner.OperatorRunner object at 0x7f2458357ac0>
config = {'parameters': {'PlusTwoOp_1': {'string_value': '{"module_name": "tests.unit.systems.utils.ops", "class_name": "PlusTw...ystems.utils.ops", "class_name": "PlusTwoOp"}'}, 'operator_names': {'string_value': '["PlusTwoOp_1", "PlusTwoOp_2"]'}}}
model_repository = 'repository_path/', model_version = 1, model_name = ''
kind = ''

def __init__(self, config, model_repository="./", model_version=1, model_name=None, kind=""):
    """Instantiate an OperatorRunner"""
    operator_names = self.fetch_json_param(config, "operator_names")
    op_configs = [self.fetch_json_param(config, op_name) for op_name in operator_names]

    self.operators = []
    for op_config in op_configs:
        module_name = op_config["module_name"]
        class_name = op_config["class_name"]

        op_module = importlib.import_module(module_name)
        op_class = getattr(op_module, class_name)
>       operator = op_class.from_config(
            op_config,
            model_repository=model_repository,
            model_name=model_name,
            model_version=model_version,
        )

E TypeError: from_config() got an unexpected keyword argument 'model_repository'

merlin/systems/dag/op_runner.py:36: TypeError
=============================== warnings summary ===============================
../../../../../usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/__init__.py:18
/usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/__init__.py:18: DeprecationWarning: The nvtabular.framework_utils module is being replaced by the Merlin Models library. Support for importing from nvtabular.framework_utils is deprecated, and will be removed in a future version. Please consider using the models and layers from Merlin Models instead.
warnings.warn(

tests/unit/systems/test_ensemble.py::test_workflow_tf_e2e_config_verification[parquet]
tests/unit/systems/test_ensemble.py::test_workflow_tf_e2e_multi_op_run[parquet]
tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
tests/unit/systems/test_inference_ops.py::test_workflow_op_validates_schemas[parquet]
tests/unit/systems/test_inference_ops.py::test_workflow_op_exports_own_config[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_loads_config[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_loads_multiple_ops_same[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_loads_multiple_ops_same_execute[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_single_node_export[parquet]
/usr/local/lib/python3.8/dist-packages/cudf/core/frame.py:384: UserWarning: The deep parameter is ignored and is only included for pandas compatibility.
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column x is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column y is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column id is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/fil/test_fil.py::test_binary_classifier_default[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_binary_classifier_with_proba[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_multi_classifier[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_regressor[sklearn_forest_regressor-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_model_file[sklearn_forest_regressor-checkpoint.tl]
/usr/local/lib/python3.8/dist-packages/sklearn/utils/deprecation.py:103: FutureWarning: Attribute n_features_ was deprecated in version 1.0 and will be removed in 1.2. Use n_features_in_ instead.
warnings.warn(msg, category=FutureWarning)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED tests/unit/systems/test_ensemble.py::test_workflow_with_forest_inference
FAILED tests/unit/systems/test_ensemble_ops.py::test_softmax_sampling - Runti...
FAILED tests/unit/systems/test_ensemble_ops.py::test_filter_candidates - Runt...
FAILED tests/unit/systems/test_op_runner.py::test_op_runner_loads_config[parquet]
FAILED tests/unit/systems/test_op_runner.py::test_op_runner_loads_multiple_ops_same[parquet]
FAILED tests/unit/systems/test_op_runner.py::test_op_runner_loads_multiple_ops_same_execute[parquet]
======= 6 failed, 41 passed, 1 skipped, 18 warnings in 158.49s (0:02:38) =======
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.GitHub.com/repos/NVIDIA-Merlin/systems/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_systems] $ /bin/bash /tmp/jenkins4038119014243997576.sh

@nvidia-merlin-bot

Click to view CI Results
GitHub pull request #134 of commit d2a0476dacce18146b10bd56bb8d0d4745315d07, no merge conflicts.
Running as SYSTEM
Setting status of d2a0476dacce18146b10bd56bb8d0d4745315d07 to PENDING with url https://10.20.13.93:8080/job/merlin_systems/132/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_systems
using credential fce1c729-5d7c-48e8-90cb-b0c314b1076e
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/systems # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/systems
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems user + githubtoken
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/systems +refs/pull/134/*:refs/remotes/origin/pr/134/* # timeout=10
 > git rev-parse d2a0476dacce18146b10bd56bb8d0d4745315d07^{commit} # timeout=10
Checking out Revision d2a0476dacce18146b10bd56bb8d0d4745315d07 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f d2a0476dacce18146b10bd56bb8d0d4745315d07 # timeout=10
Commit message: "Handle kwargs in all from_config operator methods"
 > git rev-list --no-walk eee034fdf579d5e62575b89159f76fc22aaed600 # timeout=10
[merlin_systems] $ /bin/bash /tmp/jenkins17994122288879948751.sh
PYTHONPATH=:/usr/local/lib/python3.8/dist-packages/:/usr/local/hugectr/lib:/var/jenkins_home/workspace/merlin_systems/systems
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_systems/systems, configfile: pyproject.toml
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 47 items / 1 skipped

tests/unit/test_version.py . [ 2%]
tests/unit/systems/test_ensemble.py .... [ 10%]
tests/unit/systems/test_ensemble_ops.py .. [ 14%]
tests/unit/systems/test_export.py . [ 17%]
tests/unit/systems/test_graph.py . [ 19%]
tests/unit/systems/test_inference_ops.py .. [ 23%]
tests/unit/systems/test_op_runner.py .... [ 31%]
tests/unit/systems/test_tensorflow_inf_op.py ... [ 38%]
tests/unit/systems/fil/test_fil.py .......................... [ 93%]
tests/unit/systems/fil/test_forest.py ... [100%]

=============================== warnings summary ===============================
../../../../../usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/__init__.py:18
/usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/__init__.py:18: DeprecationWarning: The nvtabular.framework_utils module is being replaced by the Merlin Models library. Support for importing from nvtabular.framework_utils is deprecated, and will be removed in a future version. Please consider using the models and layers from Merlin Models instead.
warnings.warn(

tests/unit/systems/test_ensemble.py::test_workflow_tf_e2e_config_verification[parquet]
tests/unit/systems/test_ensemble.py::test_workflow_tf_e2e_multi_op_run[parquet]
tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
tests/unit/systems/test_inference_ops.py::test_workflow_op_validates_schemas[parquet]
tests/unit/systems/test_inference_ops.py::test_workflow_op_exports_own_config[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_loads_config[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_loads_multiple_ops_same[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_loads_multiple_ops_same_execute[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_single_node_export[parquet]
/usr/local/lib/python3.8/dist-packages/cudf/core/frame.py:384: UserWarning: The deep parameter is ignored and is only included for pandas compatibility.
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column x is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column y is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column id is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/fil/test_fil.py::test_binary_classifier_default[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_binary_classifier_with_proba[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_multi_classifier[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_regressor[sklearn_forest_regressor-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_model_file[sklearn_forest_regressor-checkpoint.tl]
/usr/local/lib/python3.8/dist-packages/sklearn/utils/deprecation.py:103: FutureWarning: Attribute n_features_ was deprecated in version 1.0 and will be removed in 1.2. Use n_features_in_ instead.
warnings.warn(msg, category=FutureWarning)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============ 47 passed, 1 skipped, 18 warnings in 164.46s (0:02:44) ============
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.GitHub.com/repos/NVIDIA-Merlin/systems/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_systems] $ /bin/bash /tmp/jenkins11349804985688031197.sh

@github-actions

Documentation preview

https://nvidia-merlin.github.io/systems/review/pr-134

@nvidia-merlin-bot

Click to view CI Results
GitHub pull request #134 of commit 285d68e8317a62301710af3827be8d10fa5b2675, no merge conflicts.
Running as SYSTEM
Setting status of 285d68e8317a62301710af3827be8d10fa5b2675 to PENDING with url https://10.20.13.93:8080/job/merlin_systems/133/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_systems
using credential fce1c729-5d7c-48e8-90cb-b0c314b1076e
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/systems # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/systems
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems user + githubtoken
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/systems +refs/pull/134/*:refs/remotes/origin/pr/134/* # timeout=10
 > git rev-parse 285d68e8317a62301710af3827be8d10fa5b2675^{commit} # timeout=10
Checking out Revision 285d68e8317a62301710af3827be8d10fa5b2675 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 285d68e8317a62301710af3827be8d10fa5b2675 # timeout=10
Commit message: "Remove depencency on merlin models for tests"
 > git rev-list --no-walk d2a0476dacce18146b10bd56bb8d0d4745315d07 # timeout=10
[merlin_systems] $ /bin/bash /tmp/jenkins15980071591452208488.sh
PYTHONPATH=:/usr/local/lib/python3.8/dist-packages/:/usr/local/hugectr/lib:/var/jenkins_home/workspace/merlin_systems/systems
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_systems/systems, configfile: pyproject.toml
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 47 items / 1 skipped

tests/unit/test_version.py . [ 2%]
tests/unit/systems/test_ensemble.py .... [ 10%]
tests/unit/systems/test_ensemble_ops.py .. [ 14%]
tests/unit/systems/test_export.py . [ 17%]
tests/unit/systems/test_graph.py . [ 19%]
tests/unit/systems/test_inference_ops.py .. [ 23%]
tests/unit/systems/test_op_runner.py .... [ 31%]
tests/unit/systems/test_tensorflow_inf_op.py ... [ 38%]
tests/unit/systems/fil/test_fil.py .......................... [ 93%]
tests/unit/systems/fil/test_forest.py ... [100%]

=============================== warnings summary ===============================
../../../../../usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/__init__.py:18
/usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/__init__.py:18: DeprecationWarning: The nvtabular.framework_utils module is being replaced by the Merlin Models library. Support for importing from nvtabular.framework_utils is deprecated, and will be removed in a future version. Please consider using the models and layers from Merlin Models instead.
warnings.warn(

tests/unit/systems/test_ensemble.py::test_workflow_tf_e2e_config_verification[parquet]
tests/unit/systems/test_ensemble.py::test_workflow_tf_e2e_multi_op_run[parquet]
tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
tests/unit/systems/test_inference_ops.py::test_workflow_op_validates_schemas[parquet]
tests/unit/systems/test_inference_ops.py::test_workflow_op_exports_own_config[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_loads_config[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_loads_multiple_ops_same[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_loads_multiple_ops_same_execute[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_single_node_export[parquet]
/usr/local/lib/python3.8/dist-packages/cudf/core/frame.py:384: UserWarning: The deep parameter is ignored and is only included for pandas compatibility.
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column x is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column y is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column id is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/fil/test_fil.py::test_binary_classifier_default[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_binary_classifier_with_proba[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_multi_classifier[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_regressor[sklearn_forest_regressor-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_model_file[sklearn_forest_regressor-checkpoint.tl]
/usr/local/lib/python3.8/dist-packages/sklearn/utils/deprecation.py:103: FutureWarning: Attribute n_features_ was deprecated in version 1.0 and will be removed in 1.2. Use n_features_in_ instead.
warnings.warn(msg, category=FutureWarning)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============ 47 passed, 1 skipped, 18 warnings in 179.24s (0:02:59) ============
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.GitHub.com/repos/NVIDIA-Merlin/systems/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_systems] $ /bin/bash /tmp/jenkins16277721664789783252.sh

@nvidia-merlin-bot

Click to view CI Results
GitHub pull request #134 of commit 25755796a4a2239c32167236b19a2b63c95d66cc, no merge conflicts.
Running as SYSTEM
Setting status of 25755796a4a2239c32167236b19a2b63c95d66cc to PENDING with url https://10.20.13.93:8080/job/merlin_systems/134/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_systems
using credential fce1c729-5d7c-48e8-90cb-b0c314b1076e
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/systems # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/systems
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems user + githubtoken
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/systems +refs/pull/134/*:refs/remotes/origin/pr/134/* # timeout=10
 > git rev-parse 25755796a4a2239c32167236b19a2b63c95d66cc^{commit} # timeout=10
Checking out Revision 25755796a4a2239c32167236b19a2b63c95d66cc (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 25755796a4a2239c32167236b19a2b63c95d66cc # timeout=10
Commit message: "Add "als" to ci/ignore_codespell_words.txt"
 > git rev-list --no-walk 285d68e8317a62301710af3827be8d10fa5b2675 # timeout=10
[merlin_systems] $ /bin/bash /tmp/jenkins16496730202414173715.sh
PYTHONPATH=:/usr/local/lib/python3.8/dist-packages/:/usr/local/hugectr/lib:/var/jenkins_home/workspace/merlin_systems/systems
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_systems/systems, configfile: pyproject.toml
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 47 items / 1 skipped

tests/unit/test_version.py . [ 2%]
tests/unit/systems/test_ensemble.py .... [ 10%]
tests/unit/systems/test_ensemble_ops.py .. [ 14%]
tests/unit/systems/test_export.py . [ 17%]
tests/unit/systems/test_graph.py . [ 19%]
tests/unit/systems/test_inference_ops.py .. [ 23%]
tests/unit/systems/test_op_runner.py .... [ 31%]
tests/unit/systems/test_tensorflow_inf_op.py ... [ 38%]
tests/unit/systems/fil/test_fil.py .......................... [ 93%]
tests/unit/systems/fil/test_forest.py ... [100%]

=============================== warnings summary ===============================
../../../../../usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/init.py:18
/usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/init.py:18: DeprecationWarning: The nvtabular.framework_utils module is being replaced by the Merlin Models library. Support for importing from nvtabular.framework_utils is deprecated, and will be removed in a future version. Please consider using the models and layers from Merlin Models instead.
warnings.warn(

tests/unit/systems/test_ensemble.py::test_workflow_tf_e2e_config_verification[parquet]
tests/unit/systems/test_ensemble.py::test_workflow_tf_e2e_multi_op_run[parquet]
tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
tests/unit/systems/test_inference_ops.py::test_workflow_op_validates_schemas[parquet]
tests/unit/systems/test_inference_ops.py::test_workflow_op_exports_own_config[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_loads_config[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_loads_multiple_ops_same[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_loads_multiple_ops_same_execute[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_single_node_export[parquet]
/usr/local/lib/python3.8/dist-packages/cudf/core/frame.py:384: UserWarning: The deep parameter is ignored and is only included for pandas compatibility.
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column x is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column y is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column id is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/fil/test_fil.py::test_binary_classifier_default[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_binary_classifier_with_proba[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_multi_classifier[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_regressor[sklearn_forest_regressor-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_model_file[sklearn_forest_regressor-checkpoint.tl]
/usr/local/lib/python3.8/dist-packages/sklearn/utils/deprecation.py:103: FutureWarning: Attribute n_features_ was deprecated in version 1.0 and will be removed in 1.2. Use n_features_in_ instead.
warnings.warn(msg, category=FutureWarning)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============ 47 passed, 1 skipped, 18 warnings in 168.96s (0:02:48) ============
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.GitHub.com/repos/NVIDIA-Merlin/systems/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_systems] $ /bin/bash /tmp/jenkins2047485446149255434.sh

@@ -30,14 +33,21 @@ def __init__(self, config, repository="./", version=1, kind=""):
op_module = importlib.import_module(module_name)
op_class = getattr(op_module, class_name)

operator = op_class.from_config(op_config)
operator = op_class.from_config(
Member Author

@karlhigley In order to load an artifact required by an operator, this passes the model_*-related params through to all ops that use the op runner here. I'm guessing this may have been considered before, because the existing OperatorRunner already accepts a model repository and version (though they are currently unused). Does this seem like a reasonable thing to do?
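
As a sketch of that idea (DummyOp and its parameter names are hypothetical, not the actual merlin-systems API), the runner would forward the model location into each operator's from_config so the op can find artifacts stored next to its config:

```python
class DummyOp:
    """Hypothetical operator showing the proposed from_config signature."""

    def __init__(self, params, artifact_dir):
        self.params = params
        self.artifact_dir = artifact_dir

    @classmethod
    def from_config(cls, config, *, model_repository="./", model_name=None, model_version=1):
        # The extra keyword arguments let the operator resolve files stored
        # alongside the model config, e.g. <repository>/<name>/<version>/model.npz
        artifact_dir = f"{model_repository}/{model_name}/{model_version}"
        return cls(config, artifact_dir)


op = DummyOp.from_config(
    {"topk": 10}, model_repository="/models", model_name="implicit_als", model_version=1
)
print(op.artifact_dir)  # -> /models/implicit_als/1
```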

@oliverholworthy
Member Author

The ensemble tests for the implicit op have been skipped in Jenkins because I think the implicit package needs to be installed/updated.

@nvidia-merlin-bot

Click to view CI Results
GitHub pull request #134 of commit 62468f5f5103beaa39a10c333bcb97960611a075, no merge conflicts.
Running as SYSTEM
Setting status of 62468f5f5103beaa39a10c333bcb97960611a075 to PENDING with url https://10.20.13.93:8080/job/merlin_systems/135/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_systems
using credential fce1c729-5d7c-48e8-90cb-b0c314b1076e
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/systems # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/systems
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems user + githubtoken
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/systems +refs/pull/134/*:refs/remotes/origin/pr/134/* # timeout=10
 > git rev-parse 62468f5f5103beaa39a10c333bcb97960611a075^{commit} # timeout=10
Checking out Revision 62468f5f5103beaa39a10c333bcb97960611a075 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 62468f5f5103beaa39a10c333bcb97960611a075 # timeout=10
Commit message: "Add 'n' (number of items to recommend) to the  inputs of the op"
 > git rev-list --no-walk 25755796a4a2239c32167236b19a2b63c95d66cc # timeout=10
[merlin_systems] $ /bin/bash /tmp/jenkins4185695878036075395.sh
PYTHONPATH=:/usr/local/lib/python3.8/dist-packages/:/usr/local/hugectr/lib:/var/jenkins_home/workspace/merlin_systems/systems
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_systems/systems, configfile: pyproject.toml
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 47 items / 1 skipped

tests/unit/test_version.py . [ 2%]
tests/unit/systems/test_ensemble.py .... [ 10%]
tests/unit/systems/test_ensemble_ops.py .. [ 14%]
tests/unit/systems/test_export.py . [ 17%]
tests/unit/systems/test_graph.py . [ 19%]
tests/unit/systems/test_inference_ops.py .. [ 23%]
tests/unit/systems/test_op_runner.py .... [ 31%]
tests/unit/systems/test_tensorflow_inf_op.py ... [ 38%]
tests/unit/systems/fil/test_fil.py .......................... [ 93%]
tests/unit/systems/fil/test_forest.py ... [100%]

=============================== warnings summary ===============================
../../../../../usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/init.py:18
/usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/init.py:18: DeprecationWarning: The nvtabular.framework_utils module is being replaced by the Merlin Models library. Support for importing from nvtabular.framework_utils is deprecated, and will be removed in a future version. Please consider using the models and layers from Merlin Models instead.
warnings.warn(

tests/unit/systems/test_ensemble.py::test_workflow_tf_e2e_config_verification[parquet]
tests/unit/systems/test_ensemble.py::test_workflow_tf_e2e_multi_op_run[parquet]
tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
tests/unit/systems/test_inference_ops.py::test_workflow_op_validates_schemas[parquet]
tests/unit/systems/test_inference_ops.py::test_workflow_op_exports_own_config[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_loads_config[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_loads_multiple_ops_same[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_loads_multiple_ops_same_execute[parquet]
tests/unit/systems/test_op_runner.py::test_op_runner_single_node_export[parquet]
/usr/local/lib/python3.8/dist-packages/cudf/core/frame.py:384: UserWarning: The deep parameter is ignored and is only included for pandas compatibility.
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column x is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column y is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column id is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/fil/test_fil.py::test_binary_classifier_default[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_binary_classifier_with_proba[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_multi_classifier[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_regressor[sklearn_forest_regressor-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_model_file[sklearn_forest_regressor-checkpoint.tl]
/usr/local/lib/python3.8/dist-packages/sklearn/utils/deprecation.py:103: FutureWarning: Attribute n_features_ was deprecated in version 1.0 and will be removed in 1.2. Use n_features_in_ instead.
warnings.warn(msg, category=FutureWarning)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============ 47 passed, 1 skipped, 18 warnings in 164.02s (0:02:44) ============
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.GitHub.com/repos/NVIDIA-Merlin/systems/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_systems] $ /bin/bash /tmp/jenkins6337764186917382333.sh

Returns a transformed dataframe for this operator"""
user_id = df["user_id"][0]
num_to_recommend = df["n"][0]
user_items = None
Member Author

We could move this to the inputs if it's a common pattern for users to pass in. My understanding is that it mostly serves as a mechanism to filter out items the user has already seen.

Member

@benfred benfred Jul 11, 2022

I think this is ok - user_items can be left out if the filter_already_liked_items is False, and recalculate_user is also False (otherwise we need to know what the items the user has liked).
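The rule above can be sketched as a small check (a hypothetical helper, not the operator's actual code): user_items is only required when the call needs the user's interaction history.

```python
def validate_recommend_args(user_items, filter_already_liked_items, recalculate_user):
    """user_items may be None only when neither flag needs the user's history."""
    if (filter_already_liked_items or recalculate_user) and user_items is None:
        raise ValueError(
            "user_items is required when filter_already_liked_items "
            "or recalculate_user is True"
        )


validate_recommend_args(None, False, False)  # OK: history not needed
```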

@@ -59,7 +59,7 @@ def __init__(self, index_path, topk=10):
super().__init__()

@classmethod
def from_config(cls, config: dict) -> "QueryFaiss":
def from_config(cls, config: dict, **kwargs) -> "QueryFaiss":
Member

Whats the benefit of adding the kwargs here ?

Member Author

This is required by the change made in op_runner.py. The from_config method now receives model_repository, model_name, and model_version keyword arguments, which enable us to read files stored alongside the model config.

Currently, this QueryFaiss operator relies on the config containing the full path to the index file. With this extra information, however, it could be updated to read from a location relative to the model repository and model config.

It might be clearer to replace this **kwargs with explicitly named arguments here (and in the other from_config methods).
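
A sketch of that alternative (the class and argument names are hypothetical, and the path layout within the repository is an assumption):

```python
import os


class QueryFaissSketch:
    """Hypothetical variant of QueryFaiss with explicit from_config arguments."""

    def __init__(self, index_path, topk=10):
        self.index_path = index_path
        self.topk = topk

    @classmethod
    def from_config(cls, config: dict, *, model_repository="./",
                    model_name="", model_version=1) -> "QueryFaissSketch":
        # Resolve the index file relative to the model repository instead of
        # requiring an absolute path in the config.
        index_path = os.path.join(
            str(model_repository), str(model_name), str(model_version),
            config.get("index_path", "index.faiss"),
        )
        return cls(index_path, topk=config.get("topk", 10))


op = QueryFaissSketch.from_config(
    {"topk": 5}, model_repository="/models", model_name="faiss_op", model_version=2
)
print(op.index_path)
```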

@benfred benfred requested a review from karlhigley July 13, 2022 23:20
@viswa-nvidia viswa-nvidia added this to the Merlin 22.08 milestone Jul 21, 2022
class PredictImplicit(PipelineableInferenceOperator):
"""Operator for running inference on Implicit models."""

def __init__(self, model, **kwargs):
Collaborator

The PredictTensorflow op has model_or_path as its param and will load the model from a path if that's what is provided. I think we should try to be consistent with the constructor functionality, and PredictTensorflow is probably the most mature/popular, so matching that seems like a good idea to me.

Member Author

👍 I agree that having consistency across these Predict ops is a good idea.

However, the implicit serialization format doesn't appear to have a mechanism for knowing which model to load.

e.g. if you have created a model and saved it to a file, for example model.npz:

import implicit
import numpy as np
from scipy.sparse import csr_matrix

# random training data
n = 100  # number of both users and items
user_items = csr_matrix(np.random.choice([0, 1], size=n * n, p=[0.9, 0.1]).reshape(n, n))

model = implicit.bpr.BayesianPersonalizedRanking()
model.fit(user_items)
model.save("model.npz")  # save model to file

To call the load method, we need to know which kind of model is being loaded. There isn't an implicit.load function that I can see (correct me if there's something I've missed, @benfred):

implicit.gpu.bpr.BayesianPersonalizedRanking.load("model.npz")

We could change the signature to model_or_path at a later stage if the functionality becomes available and that is our preferred option for these predict ops.
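
One hypothetical workaround (not part of implicit's API) is to record the model's class path at export time and resolve it at load time:

```python
import importlib


def resolve_class(path: str):
    """Resolve a 'package.module.ClassName' string to the class object."""
    module_name, class_name = path.rsplit(".", 1)
    return getattr(importlib.import_module(module_name), class_name)


# At export time the operator could store the model's class path in its config,
# e.g. {"model_class": "implicit.bpr.BayesianPersonalizedRanking"}; at load time:
#     model = resolve_class(config["model_class"]).load("model.npz")
# A stdlib class is used here purely for illustration:
print(resolve_class("collections.Counter").__name__)  # -> Counter
```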

@nvidia-merlin-bot

Click to view CI Results
GitHub pull request #134 of commit 4bd4c30d814ecc74005582cf1d177064c81184e7, no merge conflicts.
Running as SYSTEM
Setting status of 4bd4c30d814ecc74005582cf1d177064c81184e7 to PENDING with url https://10.20.13.93:8080/job/merlin_systems/189/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_systems
using credential fce1c729-5d7c-48e8-90cb-b0c314b1076e
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/systems # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/systems
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems user + githubtoken
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/systems +refs/pull/134/*:refs/remotes/origin/pr/134/* # timeout=10
 > git rev-parse 4bd4c30d814ecc74005582cf1d177064c81184e7^{commit} # timeout=10
Checking out Revision 4bd4c30d814ecc74005582cf1d177064c81184e7 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 4bd4c30d814ecc74005582cf1d177064c81184e7 # timeout=10
Commit message: "Add check for implicit version"
 > git rev-list --no-walk 5ea4551c2082e2c308ce604baa76edc006dd9b91 # timeout=10
[merlin_systems] $ /bin/bash /tmp/jenkins10946567777681847258.sh
PYTHONPATH=:/usr/local/lib/python3.8/dist-packages/:/usr/local/hugectr/lib:/var/jenkins_home/workspace/merlin_systems/systems
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_systems/systems, configfile: pyproject.toml
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 53 items / 1 skipped

tests/unit/test_version.py . [ 1%]
tests/unit/examples/test_serving_ranking_models_with_merlin_systems.py . [ 3%]
[ 3%]
tests/unit/systems/test_ensemble.py .... [ 11%]
tests/unit/systems/test_ensemble_ops.py .. [ 15%]
tests/unit/systems/test_export.py . [ 16%]
tests/unit/systems/test_graph.py . [ 18%]
tests/unit/systems/test_inference_ops.py ... [ 24%]
tests/unit/systems/test_model_registry.py . [ 26%]
tests/unit/systems/test_op_runner.py .... [ 33%]
tests/unit/systems/test_tensorflow_inf_op.py .... [ 41%]
tests/unit/systems/dag/ops/test_softmax_sampling.py . [ 43%]
tests/unit/systems/fil/test_fil.py .......................... [ 92%]
tests/unit/systems/fil/test_forest.py .... [100%]

=============================== warnings summary ===============================
../../../../../usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/init.py:18
/usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/init.py:18: DeprecationWarning: The nvtabular.framework_utils module is being replaced by the Merlin Models library. Support for importing from nvtabular.framework_utils is deprecated, and will be removed in a future version. Please consider using the models and layers from Merlin Models instead.
warnings.warn(

tests/unit/examples/test_serving_ranking_models_with_merlin_systems.py: 1 warning
tests/unit/systems/test_ensemble.py: 2 warnings
tests/unit/systems/test_export.py: 1 warning
tests/unit/systems/test_inference_ops.py: 2 warnings
tests/unit/systems/test_op_runner.py: 4 warnings
/usr/local/lib/python3.8/dist-packages/cudf/core/frame.py:384: UserWarning: The deep parameter is ignored and is only included for pandas compatibility.
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column x is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column y is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column id is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/fil/test_fil.py::test_binary_classifier_default[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_binary_classifier_with_proba[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_multi_classifier[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_regressor[sklearn_forest_regressor-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_model_file[sklearn_forest_regressor-checkpoint.tl]
/usr/local/lib/python3.8/dist-packages/sklearn/utils/deprecation.py:103: FutureWarning: Attribute n_features_ was deprecated in version 1.0 and will be removed in 1.2. Use n_features_in_ instead.
warnings.warn(msg, category=FutureWarning)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============ 53 passed, 1 skipped, 19 warnings in 249.36s (0:04:09) ============
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.GitHub.com/repos/NVIDIA-Merlin/systems/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_systems] $ /bin/bash /tmp/jenkins8096625236486142168.sh

@karlhigley karlhigley removed their request for review August 2, 2022 18:42
@nvidia-merlin-bot

Click to view CI Results
GitHub pull request #134 of commit d591a110e397fa1bff2f03ab4659b62abc9170ac, no merge conflicts.
Running as SYSTEM
Setting status of d591a110e397fa1bff2f03ab4659b62abc9170ac to PENDING with url https://10.20.13.93:8080/job/merlin_systems/191/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_systems
using credential fce1c729-5d7c-48e8-90cb-b0c314b1076e
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/systems # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/systems
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems user + githubtoken
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/systems +refs/pull/134/*:refs/remotes/origin/pr/134/* # timeout=10
 > git rev-parse d591a110e397fa1bff2f03ab4659b62abc9170ac^{commit} # timeout=10
Checking out Revision d591a110e397fa1bff2f03ab4659b62abc9170ac (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f d591a110e397fa1bff2f03ab4659b62abc9170ac # timeout=10
Commit message: "Uncomment ensemble tests for als/lmf"
 > git rev-list --no-walk 476d3914b2f69960dd7252b706dd787bccf26988 # timeout=10
[merlin_systems] $ /bin/bash /tmp/jenkins12045396800483704919.sh
PYTHONPATH=:/usr/local/lib/python3.8/dist-packages/:/usr/local/hugectr/lib:/var/jenkins_home/workspace/merlin_systems/systems
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_systems/systems, configfile: pyproject.toml
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 53 items / 1 skipped

tests/unit/test_version.py . [ 1%]
tests/unit/examples/test_serving_ranking_models_with_merlin_systems.py . [ 3%]
[ 3%]
tests/unit/systems/test_ensemble.py .... [ 11%]
tests/unit/systems/test_ensemble_ops.py .. [ 15%]
tests/unit/systems/test_export.py . [ 16%]
tests/unit/systems/test_graph.py . [ 18%]
tests/unit/systems/test_inference_ops.py ... [ 24%]
tests/unit/systems/test_model_registry.py . [ 26%]
tests/unit/systems/test_op_runner.py .... [ 33%]
tests/unit/systems/test_tensorflow_inf_op.py .... [ 41%]
tests/unit/systems/dag/ops/test_softmax_sampling.py . [ 43%]
tests/unit/systems/fil/test_fil.py .......................... [ 92%]
tests/unit/systems/fil/test_forest.py .... [100%]

=============================== warnings summary ===============================
../../../../../usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/init.py:18
/usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/init.py:18: DeprecationWarning: The nvtabular.framework_utils module is being replaced by the Merlin Models library. Support for importing from nvtabular.framework_utils is deprecated, and will be removed in a future version. Please consider using the models and layers from Merlin Models instead.
warnings.warn(

tests/unit/examples/test_serving_ranking_models_with_merlin_systems.py: 1 warning
tests/unit/systems/test_ensemble.py: 2 warnings
tests/unit/systems/test_export.py: 1 warning
tests/unit/systems/test_inference_ops.py: 2 warnings
tests/unit/systems/test_op_runner.py: 4 warnings
/usr/local/lib/python3.8/dist-packages/cudf/core/frame.py:384: UserWarning: The deep parameter is ignored and is only included for pandas compatibility.
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column x is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column y is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column id is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/fil/test_fil.py::test_binary_classifier_default[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_binary_classifier_with_proba[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_multi_classifier[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_regressor[sklearn_forest_regressor-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_model_file[sklearn_forest_regressor-checkpoint.tl]
/usr/local/lib/python3.8/dist-packages/sklearn/utils/deprecation.py:103: FutureWarning: Attribute n_features_ was deprecated in version 1.0 and will be removed in 1.2. Use n_features_in_ instead.
warnings.warn(msg, category=FutureWarning)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============ 53 passed, 1 skipped, 19 warnings in 246.18s (0:04:06) ============
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.GitHub.com/repos/NVIDIA-Merlin/systems/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_systems] $ /bin/bash /tmp/jenkins603692551053146186.sh

@nvidia-merlin-bot

Click to view CI Results
GitHub pull request #134 of commit 94d46c58ad0c63bdabd476347fd0b3a68eaded3c, no merge conflicts.
Running as SYSTEM
Setting status of 94d46c58ad0c63bdabd476347fd0b3a68eaded3c to PENDING with url https://10.20.13.93:8080/job/merlin_systems/192/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_systems
using credential fce1c729-5d7c-48e8-90cb-b0c314b1076e
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/systems # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/systems
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems user + githubtoken
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/systems +refs/pull/134/*:refs/remotes/origin/pr/134/* # timeout=10
 > git rev-parse 94d46c58ad0c63bdabd476347fd0b3a68eaded3c^{commit} # timeout=10
Checking out Revision 94d46c58ad0c63bdabd476347fd0b3a68eaded3c (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 94d46c58ad0c63bdabd476347fd0b3a68eaded3c # timeout=10
Commit message: "Rename test for config for clarity"
 > git rev-list --no-walk d591a110e397fa1bff2f03ab4659b62abc9170ac # timeout=10
[merlin_systems] $ /bin/bash /tmp/jenkins4520550627757499299.sh
PYTHONPATH=:/usr/local/lib/python3.8/dist-packages/:/usr/local/hugectr/lib:/var/jenkins_home/workspace/merlin_systems/systems
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_systems/systems, configfile: pyproject.toml
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 53 items / 1 skipped

tests/unit/test_version.py . [ 1%]
tests/unit/examples/test_serving_ranking_models_with_merlin_systems.py . [ 3%]
[ 3%]
tests/unit/systems/test_ensemble.py .... [ 11%]
tests/unit/systems/test_ensemble_ops.py .. [ 15%]
tests/unit/systems/test_export.py . [ 16%]
tests/unit/systems/test_graph.py . [ 18%]
tests/unit/systems/test_inference_ops.py ... [ 24%]
tests/unit/systems/test_model_registry.py . [ 26%]
tests/unit/systems/test_op_runner.py .... [ 33%]
tests/unit/systems/test_tensorflow_inf_op.py .... [ 41%]
tests/unit/systems/dag/ops/test_softmax_sampling.py . [ 43%]
tests/unit/systems/fil/test_fil.py .......................... [ 92%]
tests/unit/systems/fil/test_forest.py .... [100%]

=============================== warnings summary ===============================
../../../../../usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/__init__.py:18
/usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/__init__.py:18: DeprecationWarning: The nvtabular.framework_utils module is being replaced by the Merlin Models library. Support for importing from nvtabular.framework_utils is deprecated, and will be removed in a future version. Please consider using the models and layers from Merlin Models instead.
warnings.warn(

tests/unit/examples/test_serving_ranking_models_with_merlin_systems.py: 1 warning
tests/unit/systems/test_ensemble.py: 2 warnings
tests/unit/systems/test_export.py: 1 warning
tests/unit/systems/test_inference_ops.py: 2 warnings
tests/unit/systems/test_op_runner.py: 4 warnings
/usr/local/lib/python3.8/dist-packages/cudf/core/frame.py:384: UserWarning: The deep parameter is ignored and is only included for pandas compatibility.
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column x is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column y is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column id is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/fil/test_fil.py::test_binary_classifier_default[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_binary_classifier_with_proba[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_multi_classifier[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_regressor[sklearn_forest_regressor-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_model_file[sklearn_forest_regressor-checkpoint.tl]
/usr/local/lib/python3.8/dist-packages/sklearn/utils/deprecation.py:103: FutureWarning: Attribute n_features_ was deprecated in version 1.0 and will be removed in 1.2. Use n_features_in_ instead.
warnings.warn(msg, category=FutureWarning)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============ 53 passed, 1 skipped, 19 warnings in 231.74s (0:03:51) ============
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.GitHub.com/repos/NVIDIA-Merlin/systems/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_systems] $ /bin/bash /tmp/jenkins15389038890707694255.sh

@oliverholworthy
Member Author

  • Rebased on top of Pass model kwargs through to operator from_config methods #158

  • Moved the specification of N, the number of items to recommend, to the Op constructor instead of accepting it as a Triton input.

    • This is because, as a Triton input, it didn't seem very user-friendly. The way we write the config currently requires inputs to be of shape (-1, 1), which doesn't fit well with a single integer value. Since other Ops don't support this kind of runtime config at the moment, it seems reasonable to me to put this in the constructor for now. Perhaps we can consider better ways to support arguments that are not part of the input to a model, for runtime-optional configuration.
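To illustrate the trade-off, here is a toy sketch of an operator that fixes `N` at construction time (this is not the actual `PredictImplicit` implementation; `PredictTopN` and its factor-matrix arguments are invented for this example):

```python
import numpy as np


class PredictTopN:
    """Toy stand-in for an operator like PredictImplicit: N is fixed up front."""

    def __init__(self, user_factors, item_factors, num_to_recommend=10):
        self.user_factors = np.asarray(user_factors)
        self.item_factors = np.asarray(item_factors)
        self.num_to_recommend = num_to_recommend

    def transform(self, user_ids):
        # Score every item for each requested user, then take the top N.
        scores = self.user_factors[user_ids] @ self.item_factors.T
        top = np.argsort(-scores, axis=1)[:, : self.num_to_recommend]
        return top, np.take_along_axis(scores, top, axis=1)


rng = np.random.default_rng(0)
op = PredictTopN(rng.normal(size=(5, 4)), rng.normal(size=(20, 4)), num_to_recommend=3)
ids, scores = op.transform([0, 1])
print(ids.shape, scores.shape)  # (2, 3) (2, 3)
```

With `num_to_recommend` baked into the operator at construction time, the request to the serving layer only needs to carry the `user_id` tensor of shape (-1, 1), and no scalar config value has to be smuggled through the input schema.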

@nvidia-merlin-bot

Click to view CI Results
GitHub pull request #134 of commit 413f554d429c604f16fbadefa77d0bb7de10e49d, no merge conflicts.
Running as SYSTEM
Setting status of 413f554d429c604f16fbadefa77d0bb7de10e49d to PENDING with url https://10.20.13.93:8080/job/merlin_systems/193/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_systems
using credential fce1c729-5d7c-48e8-90cb-b0c314b1076e
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/systems # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/systems
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems user + githubtoken
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/systems +refs/pull/134/*:refs/remotes/origin/pr/134/* # timeout=10
 > git rev-parse 413f554d429c604f16fbadefa77d0bb7de10e49d^{commit} # timeout=10
Checking out Revision 413f554d429c604f16fbadefa77d0bb7de10e49d (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 413f554d429c604f16fbadefa77d0bb7de10e49d # timeout=10
Commit message: "Specify low for random num_to_recommend to avoid zero"
 > git rev-list --no-walk 94d46c58ad0c63bdabd476347fd0b3a68eaded3c # timeout=10
[merlin_systems] $ /bin/bash /tmp/jenkins15738677089271391124.sh
PYTHONPATH=:/usr/local/lib/python3.8/dist-packages/:/usr/local/hugectr/lib:/var/jenkins_home/workspace/merlin_systems/systems
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_systems/systems, configfile: pyproject.toml
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 53 items / 1 skipped

tests/unit/test_version.py . [ 1%]
tests/unit/examples/test_serving_ranking_models_with_merlin_systems.py . [ 3%]
[ 3%]
tests/unit/systems/test_ensemble.py .... [ 11%]
tests/unit/systems/test_ensemble_ops.py .. [ 15%]
tests/unit/systems/test_export.py . [ 16%]
tests/unit/systems/test_graph.py . [ 18%]
tests/unit/systems/test_inference_ops.py ... [ 24%]
tests/unit/systems/test_model_registry.py . [ 26%]
tests/unit/systems/test_op_runner.py .... [ 33%]
tests/unit/systems/test_tensorflow_inf_op.py .... [ 41%]
tests/unit/systems/dag/ops/test_softmax_sampling.py . [ 43%]
tests/unit/systems/fil/test_fil.py .......................... [ 92%]
tests/unit/systems/fil/test_forest.py .... [100%]

=============================== warnings summary ===============================
../../../../../usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/__init__.py:18
/usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/__init__.py:18: DeprecationWarning: The nvtabular.framework_utils module is being replaced by the Merlin Models library. Support for importing from nvtabular.framework_utils is deprecated, and will be removed in a future version. Please consider using the models and layers from Merlin Models instead.
warnings.warn(

tests/unit/examples/test_serving_ranking_models_with_merlin_systems.py: 1 warning
tests/unit/systems/test_ensemble.py: 2 warnings
tests/unit/systems/test_export.py: 1 warning
tests/unit/systems/test_inference_ops.py: 2 warnings
tests/unit/systems/test_op_runner.py: 4 warnings
/usr/local/lib/python3.8/dist-packages/cudf/core/frame.py:384: UserWarning: The deep parameter is ignored and is only included for pandas compatibility.
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column x is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column y is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column id is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/fil/test_fil.py::test_binary_classifier_default[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_binary_classifier_with_proba[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_multi_classifier[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_regressor[sklearn_forest_regressor-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_model_file[sklearn_forest_regressor-checkpoint.tl]
/usr/local/lib/python3.8/dist-packages/sklearn/utils/deprecation.py:103: FutureWarning: Attribute n_features_ was deprecated in version 1.0 and will be removed in 1.2. Use n_features_in_ instead.
warnings.warn(msg, category=FutureWarning)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============ 53 passed, 1 skipped, 19 warnings in 243.68s (0:04:03) ============
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.GitHub.com/repos/NVIDIA-Merlin/systems/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_systems] $ /bin/bash /tmp/jenkins16616557217314652964.sh

@nvidia-merlin-bot

Click to view CI Results
GitHub pull request #134 of commit 8ba0eb40a13c8194fd45133344062480c489712b, no merge conflicts.
Running as SYSTEM
Setting status of 8ba0eb40a13c8194fd45133344062480c489712b to PENDING with url https://10.20.13.93:8080/job/merlin_systems/252/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_systems
using credential fce1c729-5d7c-48e8-90cb-b0c314b1076e
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/systems # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/systems
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems user + githubtoken
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/systems +refs/pull/134/*:refs/remotes/origin/pr/134/* # timeout=10
 > git rev-parse 8ba0eb40a13c8194fd45133344062480c489712b^{commit} # timeout=10
Checking out Revision 8ba0eb40a13c8194fd45133344062480c489712b (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 8ba0eb40a13c8194fd45133344062480c489712b # timeout=10
Commit message: "Merge branch 'main' into op-implicit"
 > git rev-list --no-walk 88e7864addd5809fd75b7554e85701eb9c1f26c4 # timeout=10
[merlin_systems] $ /bin/bash /tmp/jenkins1387852930417552259.sh
PYTHONPATH=:/usr/local/lib/python3.8/dist-packages/:/usr/local/hugectr/lib:/var/jenkins_home/workspace/merlin_systems/systems
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_systems/systems, configfile: pyproject.toml
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 81 items

tests/unit/test_version.py . [ 1%]
tests/unit/examples/test_serving_an_xgboost_model_with_merlin_systems.py . [ 2%]
[ 2%]
tests/unit/examples/test_serving_ranking_models_with_merlin_systems.py . [ 3%]
[ 3%]
tests/unit/systems/test_ensemble.py .... [ 8%]
tests/unit/systems/test_ensemble_ops.py .. [ 11%]
tests/unit/systems/test_export.py . [ 12%]
tests/unit/systems/test_graph.py . [ 13%]
tests/unit/systems/test_inference_ops.py ... [ 17%]
tests/unit/systems/test_model_registry.py . [ 18%]
tests/unit/systems/test_op_runner.py .... [ 23%]
tests/unit/systems/test_tensorflow_inf_op.py .... [ 28%]
tests/unit/systems/dag/ops/test_feast.py ..... [ 34%]
tests/unit/systems/dag/ops/test_softmax_sampling.py ................. [ 55%]
tests/unit/systems/fil/test_fil.py .......................... [ 87%]
tests/unit/systems/fil/test_forest.py .... [ 92%]
tests/unit/systems/implicit/test_implicit.py ...FFF [100%]

=================================== FAILURES ===================================
__________________ test_ensemble[BayesianPersonalizedRanking] __________________

model_cls = <function BayesianPersonalizedRanking at 0x7fd4be2fe5e0>
tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_ensemble_BayesianPersonal0')

@pytest.mark.skipif(not TRITON_SERVER_PATH, reason="triton server not found")
@pytest.mark.parametrize(
    "model_cls",
    [
        implicit.bpr.BayesianPersonalizedRanking,
        implicit.als.AlternatingLeastSquares,
        implicit.lmf.LogisticMatrixFactorization,
    ],
)
def test_ensemble(model_cls, tmpdir):
    model = model_cls()
    n = 100
    user_items = csr_matrix(np.random.choice([0, 1], size=n * n, p=[0.9, 0.1]).reshape(n, n))
    model.fit(user_items)

    num_to_recommend = np.random.randint(1, n)

    user_items = None
    ids, scores = model.recommend(
        [0, 1], user_items, N=num_to_recommend, filter_already_liked_items=False
    )

    implicit_op = PredictImplicit(model, num_to_recommend=num_to_recommend)

    input_schema = Schema([ColumnSchema("user_id", dtype="int64")])

    triton_chain = input_schema.column_names >> implicit_op

    triton_ens = Ensemble(triton_chain, input_schema)
    triton_ens.export(tmpdir)

    model_name = triton_ens.name
    input_user_id = np.array([[0], [1]], dtype=np.int64)
    inputs = [
        grpcclient.InferInput(
            "user_id", input_user_id.shape, triton.np_to_triton_dtype(input_user_id.dtype)
        ),
    ]
    inputs[0].set_data_from_numpy(input_user_id)
    outputs = [grpcclient.InferRequestedOutput("scores"), grpcclient.InferRequestedOutput("ids")]

    response = None
  with run_triton_server(tmpdir) as client:

tests/unit/systems/implicit/test_implicit.py:120:


/usr/lib/python3.8/contextlib.py:113: in __enter__
return next(self.gen)


modelpath = local('/tmp/pytest-of-jenkins/pytest-7/test_ensemble_BayesianPersonal0')

@contextlib.contextmanager
def run_triton_server(modelpath):
    """This function starts up a Triton server instance and returns a client to it.

    Parameters
    ----------
    modelpath : string
        The path to the model to load.

    Yields
    ------
    client: tritonclient.InferenceServerClient
        The client connected to the Triton server.

    """
    cmdline = [
        TRITON_SERVER_PATH,
        "--model-repository",
        modelpath,
        "--backend-config=tensorflow,version=2",
    ]
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = "0"
    with subprocess.Popen(cmdline, env=env) as process:
        try:
            with grpcclient.InferenceServerClient("localhost:8001") as client:
                # wait until server is ready
                for _ in range(60):
                    if process.poll() is not None:
                        retcode = process.returncode
                      raise RuntimeError(f"Tritonserver failed to start (ret={retcode})")

E RuntimeError: Tritonserver failed to start (ret=1)

merlin/systems/triton/utils.py:46: RuntimeError
----------------------------- Captured stderr call -----------------------------

0%| | 0/100 [00:00<?, ?it/s]
100%|██████████| 100/100 [00:00<00:00, 1468.81it/s, train_auc=48.41%, skipped=12.05%]
I0815 13:53:02.216984 10800 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f6a74000000' with size 268435456
I0815 13:53:02.217726 10800 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0815 13:53:02.220110 10800 model_repository_manager.cc:1191] loading: 0_predictimplicit:1
I0815 13:53:02.327621 10800 python_be.cc:1774] TRITONBACKEND_ModelInstanceInitialize: 0_predictimplicit (GPU device 0)
0815 13:53:04.495357 10808 pb_stub.cc:309] Failed to initialize Python stub: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pytest-of-jenkins/pytest-7/test_ensemble_BayesianPersonal0/0_predictimplicit/0_predictimplicit/1/model.npz'

At:
/usr/local/lib/python3.8/dist-packages/numpy/lib/npyio.py(450): load
/usr/local/lib/python3.8/dist-packages/implicit/recommender_base.py(191): load
/usr/local/lib/python3.8/dist-packages/implicit/gpu/matrix_factorization_base.py(212): load
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/ops/implicit.py(114): from_config
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/op_runner.py(43): __init__
/tmp/pytest-of-jenkins/pytest-7/test_ensemble_BayesianPersonal0/0_predictimplicit/1/model.py(61): initialize

/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.11) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
E0815 13:53:04.809140 10800 model_repository_manager.cc:1348] failed to load '0_predictimplicit' version 1: Internal: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pytest-of-jenkins/pytest-7/test_ensemble_BayesianPersonal0/0_predictimplicit/0_predictimplicit/1/model.npz'

At:
/usr/local/lib/python3.8/dist-packages/numpy/lib/npyio.py(450): load
/usr/local/lib/python3.8/dist-packages/implicit/recommender_base.py(191): load
/usr/local/lib/python3.8/dist-packages/implicit/gpu/matrix_factorization_base.py(212): load
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/ops/implicit.py(114): from_config
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/op_runner.py(43): __init__
/tmp/pytest-of-jenkins/pytest-7/test_ensemble_BayesianPersonal0/0_predictimplicit/1/model.py(61): initialize

E0815 13:53:04.809300 10800 model_repository_manager.cc:1551] Invalid argument: ensemble 'ensemble_model' depends on '0_predictimplicit' which has no loaded version
I0815 13:53:04.809386 10800 server.cc:556]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0815 13:53:04.809505 10800 server.cc:583]
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend | Path | Config |
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| python | /opt/tritonserver/backends/python/libtriton_python.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0815 13:53:04.809608 10800 server.cc:626]
+-------------------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Model | Version | Status |
+-------------------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 0_predictimplicit | 1 | UNAVAILABLE: Internal: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pytest-of-jenkins/pytest-7/test_ensemble_BayesianPersonal0/0_predictimplicit/0_predictimplicit/1/model.npz' |
| | | |
| | | At: |
| | | /usr/local/lib/python3.8/dist-packages/numpy/lib/npyio.py(450): load |
| | | /usr/local/lib/python3.8/dist-packages/implicit/recommender_base.py(191): load |
| | | /usr/local/lib/python3.8/dist-packages/implicit/gpu/matrix_factorization_base.py(212): load |
| | | /var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/ops/implicit.py(114): from_config |
| | | /var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/op_runner.py(43): __init__ |
| | | /tmp/pytest-of-jenkins/pytest-7/test_ensemble_BayesianPersonal0/0_predictimplicit/1/model.py(61): initialize |
+-------------------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0815 13:53:04.873171 10800 metrics.cc:650] Collecting metrics for GPU 0: Tesla P100-DGXS-16GB
I0815 13:53:04.874000 10800 tritonserver.cc:2159]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.23.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0] | /tmp/pytest-of-jenkins/pytest-7/test_ensemble_BayesianPersonal0 |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0815 13:53:04.874034 10800 server.cc:257] Waiting for in-flight requests to complete.
I0815 13:53:04.874041 10800 server.cc:273] Timeout 30: Found 0 model versions that have in-flight inferences
I0815 13:53:04.874052 10800 server.cc:288] All models are stopped, unloading models
I0815 13:53:04.874057 10800 server.cc:295] Timeout 30: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
W0815 13:53:05.892992 10800 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
____________________ test_ensemble[AlternatingLeastSquares] ____________________

model_cls = <function AlternatingLeastSquares at 0x7fd4c011bc10>
tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_ensemble_AlternatingLeast0')

@pytest.mark.skipif(not TRITON_SERVER_PATH, reason="triton server not found")
@pytest.mark.parametrize(
    "model_cls",
    [
        implicit.bpr.BayesianPersonalizedRanking,
        implicit.als.AlternatingLeastSquares,
        implicit.lmf.LogisticMatrixFactorization,
    ],
)
def test_ensemble(model_cls, tmpdir):
    model = model_cls()
    n = 100
    user_items = csr_matrix(np.random.choice([0, 1], size=n * n, p=[0.9, 0.1]).reshape(n, n))
    model.fit(user_items)

    num_to_recommend = np.random.randint(1, n)

    user_items = None
    ids, scores = model.recommend(
        [0, 1], user_items, N=num_to_recommend, filter_already_liked_items=False
    )

    implicit_op = PredictImplicit(model, num_to_recommend=num_to_recommend)

    input_schema = Schema([ColumnSchema("user_id", dtype="int64")])

    triton_chain = input_schema.column_names >> implicit_op

    triton_ens = Ensemble(triton_chain, input_schema)
    triton_ens.export(tmpdir)

    model_name = triton_ens.name
    input_user_id = np.array([[0], [1]], dtype=np.int64)
    inputs = [
        grpcclient.InferInput(
            "user_id", input_user_id.shape, triton.np_to_triton_dtype(input_user_id.dtype)
        ),
    ]
    inputs[0].set_data_from_numpy(input_user_id)
    outputs = [grpcclient.InferRequestedOutput("scores"), grpcclient.InferRequestedOutput("ids")]

    response = None
  with run_triton_server(tmpdir) as client:

tests/unit/systems/implicit/test_implicit.py:120:


/usr/lib/python3.8/contextlib.py:113: in __enter__
return next(self.gen)


modelpath = local('/tmp/pytest-of-jenkins/pytest-7/test_ensemble_AlternatingLeast0')

@contextlib.contextmanager
def run_triton_server(modelpath):
    """This function starts up a Triton server instance and returns a client to it.

    Parameters
    ----------
    modelpath : string
        The path to the model to load.

    Yields
    ------
    client: tritonclient.InferenceServerClient
        The client connected to the Triton server.

    """
    cmdline = [
        TRITON_SERVER_PATH,
        "--model-repository",
        modelpath,
        "--backend-config=tensorflow,version=2",
    ]
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = "0"
    with subprocess.Popen(cmdline, env=env) as process:
        try:
            with grpcclient.InferenceServerClient("localhost:8001") as client:
                # wait until server is ready
                for _ in range(60):
                    if process.poll() is not None:
                        retcode = process.returncode
                      raise RuntimeError(f"Tritonserver failed to start (ret={retcode})")

E RuntimeError: Tritonserver failed to start (ret=1)

merlin/systems/triton/utils.py:46: RuntimeError
----------------------------- Captured stderr call -----------------------------

0%| | 0/15 [00:00<?, ?it/s]
100%|██████████| 15/15 [00:00<00:00, 2211.10it/s]
I0815 13:53:07.364038 10866 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f7006000000' with size 268435456
I0815 13:53:07.364781 10866 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0815 13:53:07.367294 10866 model_repository_manager.cc:1191] loading: 0_predictimplicit:1
I0815 13:53:07.474267 10866 python_be.cc:1774] TRITONBACKEND_ModelInstanceInitialize: 0_predictimplicit (GPU device 0)
E0815 13:53:09.683600 10874 pb_stub.cc:309] Failed to initialize Python stub: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pytest-of-jenkins/pytest-7/test_ensemble_AlternatingLeast0/0_predictimplicit/0_predictimplicit/1/model.npz'

At:
/usr/local/lib/python3.8/dist-packages/numpy/lib/npyio.py(450): load
/usr/local/lib/python3.8/dist-packages/implicit/recommender_base.py(191): load
/usr/local/lib/python3.8/dist-packages/implicit/gpu/matrix_factorization_base.py(212): load
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/ops/implicit.py(114): from_config
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/op_runner.py(43): __init__
/tmp/pytest-of-jenkins/pytest-7/test_ensemble_AlternatingLeast0/0_predictimplicit/1/model.py(61): initialize

/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.11) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
/usr/local/lib/python3.8/dist-packages/implicit/utils.py:28: UserWarning: OpenBLAS detected. Its highly recommend to set the environment variable 'export OPENBLAS_NUM_THREADS=1' to disable its internal multithreading
warnings.warn(
E0815 13:53:10.264893 10866 model_repository_manager.cc:1348] failed to load '0_predictimplicit' version 1: Internal: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pytest-of-jenkins/pytest-7/test_ensemble_AlternatingLeast0/0_predictimplicit/0_predictimplicit/1/model.npz'

At:
/usr/local/lib/python3.8/dist-packages/numpy/lib/npyio.py(450): load
/usr/local/lib/python3.8/dist-packages/implicit/recommender_base.py(191): load
/usr/local/lib/python3.8/dist-packages/implicit/gpu/matrix_factorization_base.py(212): load
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/ops/implicit.py(114): from_config
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/op_runner.py(43): __init__
/tmp/pytest-of-jenkins/pytest-7/test_ensemble_AlternatingLeast0/0_predictimplicit/1/model.py(61): initialize

E0815 13:53:10.265022 10866 model_repository_manager.cc:1551] Invalid argument: ensemble 'ensemble_model' depends on '0_predictimplicit' which has no loaded version
I0815 13:53:10.265086 10866 server.cc:556]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0815 13:53:10.265139 10866 server.cc:583]
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend | Path | Config |
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| python | /opt/tritonserver/backends/python/libtriton_python.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0815 13:53:10.265239 10866 server.cc:626]
+-------------------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Model | Version | Status |
+-------------------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 0_predictimplicit | 1 | UNAVAILABLE: Internal: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pytest-of-jenkins/pytest-7/test_ensemble_AlternatingLeast0/0_predictimplicit/0_predictimplicit/1/model.npz' |
| | | |
| | | At: |
| | | /usr/local/lib/python3.8/dist-packages/numpy/lib/npyio.py(450): load |
| | | /usr/local/lib/python3.8/dist-packages/implicit/recommender_base.py(191): load |
| | | /usr/local/lib/python3.8/dist-packages/implicit/gpu/matrix_factorization_base.py(212): load |
| | | /var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/ops/implicit.py(114): from_config |
| | | /var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/op_runner.py(43): __init__ |
| | | /tmp/pytest-of-jenkins/pytest-7/test_ensemble_AlternatingLeast0/0_predictimplicit/1/model.py(61): initialize |
+-------------------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0815 13:53:10.325218 10866 metrics.cc:650] Collecting metrics for GPU 0: Tesla P100-DGXS-16GB
I0815 13:53:10.326061 10866 tritonserver.cc:2159]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.23.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0] | /tmp/pytest-of-jenkins/pytest-7/test_ensemble_AlternatingLeast0 |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0815 13:53:10.326095 10866 server.cc:257] Waiting for in-flight requests to complete.
I0815 13:53:10.326103 10866 server.cc:273] Timeout 30: Found 0 model versions that have in-flight inferences
I0815 13:53:10.326113 10866 server.cc:288] All models are stopped, unloading models
I0815 13:53:10.326119 10866 server.cc:295] Timeout 30: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
W0815 13:53:11.352275 10866 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
__________________ test_ensemble[LogisticMatrixFactorization] __________________

model_cls = <function LogisticMatrixFactorization at 0x7fd4be2fe8b0>
tmpdir = local('/tmp/pytest-of-jenkins/pytest-7/test_ensemble_LogisticMatrixFa0')

@pytest.mark.skipif(not TRITON_SERVER_PATH, reason="triton server not found")
@pytest.mark.parametrize(
    "model_cls",
    [
        implicit.bpr.BayesianPersonalizedRanking,
        implicit.als.AlternatingLeastSquares,
        implicit.lmf.LogisticMatrixFactorization,
    ],
)
def test_ensemble(model_cls, tmpdir):
    model = model_cls()
    n = 100
    user_items = csr_matrix(np.random.choice([0, 1], size=n * n, p=[0.9, 0.1]).reshape(n, n))
    model.fit(user_items)

    num_to_recommend = np.random.randint(1, n)

    user_items = None
    ids, scores = model.recommend(
        [0, 1], user_items, N=num_to_recommend, filter_already_liked_items=False
    )

    implicit_op = PredictImplicit(model, num_to_recommend=num_to_recommend)

    input_schema = Schema([ColumnSchema("user_id", dtype="int64")])

    triton_chain = input_schema.column_names >> implicit_op

    triton_ens = Ensemble(triton_chain, input_schema)
    triton_ens.export(tmpdir)

    model_name = triton_ens.name
    input_user_id = np.array([[0], [1]], dtype=np.int64)
    inputs = [
        grpcclient.InferInput(
            "user_id", input_user_id.shape, triton.np_to_triton_dtype(input_user_id.dtype)
        ),
    ]
    inputs[0].set_data_from_numpy(input_user_id)
    outputs = [grpcclient.InferRequestedOutput("scores"), grpcclient.InferRequestedOutput("ids")]

    response = None
  with run_triton_server(tmpdir) as client:

tests/unit/systems/implicit/test_implicit.py:120:


/usr/lib/python3.8/contextlib.py:113: in __enter__
return next(self.gen)


modelpath = local('/tmp/pytest-of-jenkins/pytest-7/test_ensemble_LogisticMatrixFa0')

@contextlib.contextmanager
def run_triton_server(modelpath):
    """This function starts up a Triton server instance and returns a client to it.

    Parameters
    ----------
    modelpath : string
        The path to the model to load.

    Yields
    ------
    client: tritonclient.InferenceServerClient
        The client connected to the Triton server.

    """
    cmdline = [
        TRITON_SERVER_PATH,
        "--model-repository",
        modelpath,
        "--backend-config=tensorflow,version=2",
    ]
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = "0"
    with subprocess.Popen(cmdline, env=env) as process:
        try:
            with grpcclient.InferenceServerClient("localhost:8001") as client:
                # wait until server is ready
                for _ in range(60):
                    if process.poll() is not None:
                        retcode = process.returncode
                      raise RuntimeError(f"Tritonserver failed to start (ret={retcode})")

E RuntimeError: Tritonserver failed to start (ret=1)

merlin/systems/triton/utils.py:46: RuntimeError
----------------------------- Captured stderr call -----------------------------

0%| | 0/30 [00:00<?, ?it/s]
23%|██▎ | 7/30 [00:00<00:00, 68.33it/s]
47%|████▋ | 14/30 [00:00<00:00, 68.87it/s]
70%|███████ | 21/30 [00:00<00:00, 68.86it/s]
93%|█████████▎| 28/30 [00:00<00:00, 67.93it/s]
100%|██████████| 30/30 [00:00<00:00, 68.56it/s]
I0815 13:53:14.232182 10932 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f5d96000000' with size 268435456
I0815 13:53:14.232949 10932 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0815 13:53:14.235293 10932 model_repository_manager.cc:1191] loading: 0_predictimplicit:1
I0815 13:53:14.342555 10932 python_be.cc:1774] TRITONBACKEND_ModelInstanceInitialize: 0_predictimplicit (GPU device 0)
E0815 13:53:16.596618 10940 pb_stub.cc:309] Failed to initialize Python stub: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pytest-of-jenkins/pytest-7/test_ensemble_LogisticMatrixFa0/0_predictimplicit/0_predictimplicit/1/model.npz'

At:
/usr/local/lib/python3.8/dist-packages/numpy/lib/npyio.py(450): load
/usr/local/lib/python3.8/dist-packages/implicit/recommender_base.py(191): load
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/ops/implicit.py(114): from_config
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/op_runner.py(43): __init__
/tmp/pytest-of-jenkins/pytest-7/test_ensemble_LogisticMatrixFa0/0_predictimplicit/1/model.py(61): initialize

/usr/lib/python3/dist-packages/requests/__init__.py:89: RequestsDependencyWarning: urllib3 (1.26.11) or chardet (3.0.4) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({}) doesn't match a supported "
E0815 13:53:17.133340 10932 model_repository_manager.cc:1348] failed to load '0_predictimplicit' version 1: Internal: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pytest-of-jenkins/pytest-7/test_ensemble_LogisticMatrixFa0/0_predictimplicit/0_predictimplicit/1/model.npz'

At:
/usr/local/lib/python3.8/dist-packages/numpy/lib/npyio.py(450): load
/usr/local/lib/python3.8/dist-packages/implicit/recommender_base.py(191): load
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/ops/implicit.py(114): from_config
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/op_runner.py(43): __init__
/tmp/pytest-of-jenkins/pytest-7/test_ensemble_LogisticMatrixFa0/0_predictimplicit/1/model.py(61): initialize

E0815 13:53:17.133501 10932 model_repository_manager.cc:1551] Invalid argument: ensemble 'ensemble_model' depends on '0_predictimplicit' which has no loaded version
I0815 13:53:17.133588 10932 server.cc:556]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0815 13:53:17.133681 10932 server.cc:583]
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Backend | Path | Config |
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
| python | /opt/tritonserver/backends/python/libtriton_python.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
+---------+-------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0815 13:53:17.133766 10932 server.cc:626]
+-------------------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Model | Version | Status |
+-------------------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 0_predictimplicit | 1 | UNAVAILABLE: Internal: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/pytest-of-jenkins/pytest-7/test_ensemble_LogisticMatrixFa0/0_predictimplicit/0_predictimplicit/1/model.npz' |
| | | |
| | | At: |
| | | /usr/local/lib/python3.8/dist-packages/numpy/lib/npyio.py(450): load |
| | | /usr/local/lib/python3.8/dist-packages/implicit/recommender_base.py(191): load |
| | | /var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/ops/implicit.py(114): from_config |
| | | /var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/dag/op_runner.py(43): __init__ |
| | | /tmp/pytest-of-jenkins/pytest-7/test_ensemble_LogisticMatrixFa0/0_predictimplicit/1/model.py(61): initialize |
+-------------------+---------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0815 13:53:17.195342 10932 metrics.cc:650] Collecting metrics for GPU 0: Tesla P100-DGXS-16GB
I0815 13:53:17.196208 10932 tritonserver.cc:2159]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.23.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
| model_repository_path[0] | /tmp/pytest-of-jenkins/pytest-7/test_ensemble_LogisticMatrixFa0 |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| rate_limit | OFF |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| response_cache_byte_size | 0 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0815 13:53:17.196242 10932 server.cc:257] Waiting for in-flight requests to complete.
I0815 13:53:17.196250 10932 server.cc:273] Timeout 30: Found 0 model versions that have in-flight inferences
I0815 13:53:17.196260 10932 server.cc:288] All models are stopped, unloading models
I0815 13:53:17.196266 10932 server.cc:295] Timeout 30: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
W0815 13:53:18.218103 10932 metrics.cc:468] Unable to get energy consumption for GPU 0. Status:Success, value:0
=============================== warnings summary ===============================
../../../../../usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/__init__.py:18
/usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/__init__.py:18: DeprecationWarning: The nvtabular.framework_utils module is being replaced by the Merlin Models library. Support for importing from nvtabular.framework_utils is deprecated, and will be removed in a future version. Please consider using the models and layers from Merlin Models instead.
warnings.warn(

tests/unit/examples/test_serving_ranking_models_with_merlin_systems.py: 1 warning
tests/unit/systems/test_ensemble.py: 2 warnings
tests/unit/systems/test_export.py: 1 warning
tests/unit/systems/test_inference_ops.py: 2 warnings
tests/unit/systems/test_op_runner.py: 4 warnings
/usr/local/lib/python3.8/dist-packages/cudf/core/frame.py:384: UserWarning: The deep parameter is ignored and is only included for pandas compatibility.
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column x is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column y is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column id is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/fil/test_fil.py::test_binary_classifier_default[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_binary_classifier_with_proba[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_multi_classifier[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_regressor[sklearn_forest_regressor-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_model_file[sklearn_forest_regressor-checkpoint.tl]
/usr/local/lib/python3.8/dist-packages/sklearn/utils/deprecation.py:103: FutureWarning: Attribute n_features_ was deprecated in version 1.0 and will be removed in 1.2. Use n_features_in_ instead.
warnings.warn(msg, category=FutureWarning)

tests/unit/systems/fil/test_forest.py::test_export_merlin_models
/usr/local/lib/python3.8/dist-packages/tornado/ioloop.py:350: DeprecationWarning: make_current is deprecated; start the event loop first
self.make_current()

tests/unit/systems/fil/test_forest.py::test_export_merlin_models
/usr/local/lib/python3.8/dist-packages/distributed/node.py:180: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 33475 instead
warnings.warn(

tests/unit/systems/implicit/test_implicit.py::test_reload_from_config[AlternatingLeastSquares]
/usr/local/lib/python3.8/dist-packages/implicit/utils.py:28: UserWarning: OpenBLAS detected. Its highly recommend to set the environment variable 'export OPENBLAS_NUM_THREADS=1' to disable its internal multithreading
warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED tests/unit/systems/implicit/test_implicit.py::test_ensemble[BayesianPersonalizedRanking]
FAILED tests/unit/systems/implicit/test_implicit.py::test_ensemble[AlternatingLeastSquares]
FAILED tests/unit/systems/implicit/test_implicit.py::test_ensemble[LogisticMatrixFactorization]
============ 3 failed, 78 passed, 22 warnings in 343.18s (0:05:43) =============
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.github.com/repos/NVIDIA-Merlin/systems/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_systems] $ /bin/bash /tmp/jenkins4998522666014574984.sh

@nvidia-merlin-bot
Copy link

Click to view CI Results
GitHub pull request #134 of commit 50855cccba4dd20a8759095bfee872d76e71a4b7, no merge conflicts.
Running as SYSTEM
Setting status of 50855cccba4dd20a8759095bfee872d76e71a4b7 to PENDING with url https://10.20.13.93:8080/job/merlin_systems/255/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_systems
using credential fce1c729-5d7c-48e8-90cb-b0c314b1076e
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/systems # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/systems
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems user + githubtoken
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/systems +refs/pull/134/*:refs/remotes/origin/pr/134/* # timeout=10
 > git rev-parse 50855cccba4dd20a8759095bfee872d76e71a4b7^{commit} # timeout=10
Checking out Revision 50855cccba4dd20a8759095bfee872d76e71a4b7 (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 50855cccba4dd20a8759095bfee872d76e71a4b7 # timeout=10
Commit message: "Remove workflow for implicit and add package to requirements-test"
 > git rev-list --no-walk 0cc3b15d26c3b3a531895f4649a68178fb085b20 # timeout=10
[merlin_systems] $ /bin/bash /tmp/jenkins1493194512218969676.sh
PYTHONPATH=:/usr/local/lib/python3.8/dist-packages/:/usr/local/hugectr/lib:/var/jenkins_home/workspace/merlin_systems/systems
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_systems/systems, configfile: pyproject.toml
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 81 items

tests/unit/test_version.py . [ 1%]
tests/unit/examples/test_serving_an_xgboost_model_with_merlin_systems.py . [ 2%]
[ 2%]
tests/unit/examples/test_serving_ranking_models_with_merlin_systems.py . [ 3%]
[ 3%]
tests/unit/systems/test_ensemble.py .... [ 8%]
tests/unit/systems/test_ensemble_ops.py .. [ 11%]
tests/unit/systems/test_export.py . [ 12%]
tests/unit/systems/test_graph.py . [ 13%]
tests/unit/systems/test_inference_ops.py ... [ 17%]
tests/unit/systems/test_model_registry.py . [ 18%]
tests/unit/systems/test_op_runner.py .... [ 23%]
tests/unit/systems/test_tensorflow_inf_op.py .... [ 28%]
tests/unit/systems/dag/ops/test_feast.py ..... [ 34%]
tests/unit/systems/dag/ops/test_softmax_sampling.py ................. [ 55%]
tests/unit/systems/fil/test_fil.py .......................... [ 87%]
tests/unit/systems/fil/test_forest.py .... [ 92%]
tests/unit/systems/implicit/test_implicit.py ...FFF [100%]

=================================== FAILURES ===================================
__________________ test_ensemble[BayesianPersonalizedRanking] __________________

model_cls = <function BayesianPersonalizedRanking at 0x7f1701dbb940>
tmpdir = local('/tmp/pytest-of-jenkins/pytest-14/test_ensemble_BayesianPersonal0')

@pytest.mark.skipif(not TRITON_SERVER_PATH, reason="triton server not found")
@pytest.mark.parametrize(
    "model_cls",
    [
        implicit.bpr.BayesianPersonalizedRanking,
        implicit.als.AlternatingLeastSquares,
        implicit.lmf.LogisticMatrixFactorization,
    ],
)
def test_ensemble(model_cls, tmpdir):
    model = model_cls()
    n = 100
    user_items = csr_matrix(np.random.choice([0, 1], size=n * n, p=[0.9, 0.1]).reshape(n, n))
    model.fit(user_items)

    num_to_recommend = np.random.randint(1, n)

    user_items = None
    ids, scores = model.recommend(
        [0, 1], user_items, N=num_to_recommend, filter_already_liked_items=False
    )

    implicit_op = PredictImplicit(model, num_to_recommend=num_to_recommend)

    input_schema = Schema([ColumnSchema("user_id", dtype="int64")])

    triton_chain = input_schema.column_names >> implicit_op

    triton_ens = Ensemble(triton_chain, input_schema)
    triton_ens.export(tmpdir)

    model_name = triton_ens.name
    input_user_id = np.array([[0], [1]], dtype=np.int64)
    inputs = [
        grpcclient.InferInput(
            "user_id", input_user_id.shape, triton.np_to_triton_dtype(input_user_id.dtype)
        ),
    ]
    inputs[0].set_data_from_numpy(input_user_id)
    outputs = [grpcclient.InferRequestedOutput("scores"), grpcclient.InferRequestedOutput("ids")]

    response = None

    with run_triton_server(tmpdir) as client:
      response = client.infer(model_name, inputs, outputs=outputs)

tests/unit/systems/implicit/test_implicit.py:121:


/usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:1322: in infer
raise_error_grpc(rpc_error)


rpc_error = <_InactiveRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "Request for unknown model...all.cc","file_line":1069,"grpc_message":"Request for unknown model: 'ensemble_model' is not found","grpc_status":14}"

def raise_error_grpc(rpc_error):
  raise get_error_grpc(rpc_error) from None

E tritonclient.utils.InferenceServerException: [StatusCode.UNAVAILABLE] Request for unknown model: 'ensemble_model' is not found

/usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:62: InferenceServerException
----------------------------- Captured stderr call -----------------------------

0%| | 0/100 [00:00<?, ?it/s]
53%|█████▎ | 53/100 [00:00<00:00, 1623.65it/s, train_auc=52.18%, skipped=11.31%]
54%|█████▍ | 54/100 [00:00<00:00, 1624.08it/s, train_auc=50.65%, skipped=13.29%]
55%|█████▌ | 55/100 [00:00<00:00, 1624.09it/s, train_auc=52.17%, skipped=10.93%]
56%|█████▌ | 56/100 [00:00<00:00, 1624.40it/s, train_auc=48.81%, skipped=12.91%]
57%|█████▋ | 57/100 [00:00<00:00, 1618.15it/s, train_auc=48.35%, skipped=11.50%]
58%|█████▊ | 58/100 [00:00<00:00, 1617.36it/s, train_auc=49.95%, skipped=10.56%]
59%|█████▉ | 59/100 [00:00<00:00, 1613.75it/s, train_auc=47.53%, skipped=12.35%]
60%|██████ | 60/100 [00:00<00:00, 1600.26it/s, train_auc=49.56%, skipped=14.04%]
61%|██████ | 61/100 [00:00<00:00, 1600.42it/s, train_auc=49.52%, skipped=10.93%]
62%|██████▏ | 62/100 [00:00<00:00, 1600.83it/s, train_auc=52.27%, skipped=14.89%]
63%|██████▎ | 63/100 [00:00<00:00, 1601.51it/s, train_auc=47.63%, skipped=12.35%]
64%|██████▍ | 64/100 [00:00<00:00, 1602.06it/s, train_auc=51.29%, skipped=12.35%]
65%|██████▌ | 65/100 [00:00<00:00, 1602.67it/s, train_auc=47.76%, skipped=13.57%]
66%|██████▌ | 66/100 [00:00<00:00, 1603.33it/s, train_auc=49.95%, skipped=13.01%]
67%|██████▋ | 67/100 [00:00<00:00, 1601.92it/s, train_auc=47.84%, skipped=12.91%]
68%|██████▊ | 68/100 [00:00<00:00, 1602.31it/s, train_auc=51.35%, skipped=12.63%]
69%|██████▉ | 69/100 [00:00<00:00, 1602.86it/s, train_auc=49.36%, skipped=12.16%]
70%|███████ | 70/100 [00:00<00:00, 1603.48it/s, train_auc=51.40%, skipped=12.35%]
71%|███████ | 71/100 [00:00<00:00, 1603.98it/s, train_auc=52.53%, skipped=12.44%]
72%|███████▏ | 72/100 [00:00<00:00, 1604.51it/s, train_auc=52.16%, skipped=12.72%]
73%|███████▎ | 73/100 [00:00<00:00, 1605.10it/s, train_auc=51.82%, skipped=12.16%]
74%|███████▍ | 74/100 [00:00<00:00, 1605.61it/s, train_auc=48.96%, skipped=13.76%]
75%|███████▌ | 75/100 [00:00<00:00, 1605.00it/s, train_auc=49.63%, skipped=10.93%]
76%|███████▌ | 76/100 [00:00<00:00, 1604.78it/s, train_auc=50.11%, skipped=12.72%]
77%|███████▋ | 77/100 [00:00<00:00, 1605.02it/s, train_auc=50.97%, skipped=12.72%]
78%|███████▊ | 78/100 [00:00<00:00, 1605.53it/s, train_auc=49.08%, skipped=13.20%]
79%|███████▉ | 79/100 [00:00<00:00, 1605.96it/s, train_auc=50.92%, skipped=12.63%]
80%|████████ | 80/100 [00:00<00:00, 1606.15it/s, train_auc=49.02%, skipped=13.29%]
81%|████████ | 81/100 [00:00<00:00, 1606.47it/s, train_auc=47.65%, skipped=11.78%]
82%|████████▏ | 82/100 [00:00<00:00, 1606.90it/s, train_auc=49.39%, skipped=15.65%]
83%|████████▎ | 83/100 [00:00<00:00, 1607.44it/s, train_auc=50.93%, skipped=13.57%]
84%|████████▍ | 84/100 [00:00<00:00, 1607.87it/s, train_auc=50.54%, skipped=13.48%]
85%|████████▌ | 85/100 [00:00<00:00, 1608.30it/s, train_auc=49.79%, skipped=11.97%]
86%|████████▌ | 86/100 [00:00<00:00, 1608.67it/s, train_auc=51.06%, skipped=11.40%]
87%|████████▋ | 87/100 [00:00<00:00, 1609.08it/s, train_auc=49.79%, skipped=11.78%]
88%|████████▊ | 88/100 [00:00<00:00, 1608.81it/s, train_auc=48.53%, skipped=13.20%]
89%|████████▉ | 89/100 [00:00<00:00, 1608.81it/s, train_auc=50.38%, skipped=12.06%]
90%|█████████ | 90/100 [00:00<00:00, 1609.09it/s, train_auc=52.32%, skipped=12.82%]
91%|█████████ | 91/100 [00:00<00:00, 1609.52it/s, train_auc=50.50%, skipped=14.33%]
92%|█████████▏| 92/100 [00:00<00:00, 1609.52it/s, train_auc=49.68%, skipped=12.35%]
93%|█████████▎| 93/100 [00:00<00:00, 1609.49it/s, train_auc=50.94%, skipped=14.33%]
94%|█████████▍| 94/100 [00:00<00:00, 1609.80it/s, train_auc=50.91%, skipped=11.88%]
95%|█████████▌| 95/100 [00:00<00:00, 1610.16it/s, train_auc=49.84%, skipped=13.01%]
96%|█████████▌| 96/100 [00:00<00:00, 1610.55it/s, train_auc=51.34%, skipped=12.25%]
97%|█████████▋| 97/100 [00:00<00:00, 1610.96it/s, train_auc=51.97%, skipped=11.31%]
98%|█████████▊| 98/100 [00:00<00:00, 1611.40it/s, train_auc=50.48%, skipped=11.88%]
99%|█████████▉| 99/100 [00:00<00:00, 1611.74it/s, train_auc=50.98%, skipped=13.29%]
100%|██████████| 100/100 [00:00<00:00, 1611.96it/s, train_auc=50.76%, skipped=13.48%]
100%|██████████| 100/100 [00:00<00:00, 1609.17it/s, train_auc=50.76%, skipped=13.48%]
____________________ test_ensemble[AlternatingLeastSquares] ____________________

model_cls = <function AlternatingLeastSquares at 0x7f1703b8cf70>
tmpdir = local('/tmp/pytest-of-jenkins/pytest-14/test_ensemble_AlternatingLeast0')

@pytest.mark.skipif(not TRITON_SERVER_PATH, reason="triton server not found")
@pytest.mark.parametrize(
    "model_cls",
    [
        implicit.bpr.BayesianPersonalizedRanking,
        implicit.als.AlternatingLeastSquares,
        implicit.lmf.LogisticMatrixFactorization,
    ],
)
def test_ensemble(model_cls, tmpdir):
    model = model_cls()
    n = 100
    user_items = csr_matrix(np.random.choice([0, 1], size=n * n, p=[0.9, 0.1]).reshape(n, n))
    model.fit(user_items)

    num_to_recommend = np.random.randint(1, n)

    user_items = None
    ids, scores = model.recommend(
        [0, 1], user_items, N=num_to_recommend, filter_already_liked_items=False
    )

    implicit_op = PredictImplicit(model, num_to_recommend=num_to_recommend)

    input_schema = Schema([ColumnSchema("user_id", dtype="int64")])

    triton_chain = input_schema.column_names >> implicit_op

    triton_ens = Ensemble(triton_chain, input_schema)
    triton_ens.export(tmpdir)

    model_name = triton_ens.name
    input_user_id = np.array([[0], [1]], dtype=np.int64)
    inputs = [
        grpcclient.InferInput(
            "user_id", input_user_id.shape, triton.np_to_triton_dtype(input_user_id.dtype)
        ),
    ]
    inputs[0].set_data_from_numpy(input_user_id)
    outputs = [grpcclient.InferRequestedOutput("scores"), grpcclient.InferRequestedOutput("ids")]

    response = None

    with run_triton_server(tmpdir) as client:
      response = client.infer(model_name, inputs, outputs=outputs)

tests/unit/systems/implicit/test_implicit.py:121:


/usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:1322: in infer
raise_error_grpc(rpc_error)


rpc_error = <_InactiveRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "Request for unknown model...all.cc","file_line":1069,"grpc_message":"Request for unknown model: 'ensemble_model' is not found","grpc_status":14}"

def raise_error_grpc(rpc_error):
  raise get_error_grpc(rpc_error) from None

E tritonclient.utils.InferenceServerException: [StatusCode.UNAVAILABLE] Request for unknown model: 'ensemble_model' is not found

/usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:62: InferenceServerException
----------------------------- Captured stderr call -----------------------------

0%| | 0/15 [00:00<?, ?it/s]
100%|██████████| 15/15 [00:00<00:00, 2326.89it/s]
__________________ test_ensemble[LogisticMatrixFactorization] __________________

model_cls = <function LogisticMatrixFactorization at 0x7f1701dbbc10>
tmpdir = local('/tmp/pytest-of-jenkins/pytest-14/test_ensemble_LogisticMatrixFa0')

@pytest.mark.skipif(not TRITON_SERVER_PATH, reason="triton server not found")
@pytest.mark.parametrize(
    "model_cls",
    [
        implicit.bpr.BayesianPersonalizedRanking,
        implicit.als.AlternatingLeastSquares,
        implicit.lmf.LogisticMatrixFactorization,
    ],
)
def test_ensemble(model_cls, tmpdir):
    model = model_cls()
    n = 100
    user_items = csr_matrix(np.random.choice([0, 1], size=n * n, p=[0.9, 0.1]).reshape(n, n))
    model.fit(user_items)

    num_to_recommend = np.random.randint(1, n)

    user_items = None
    ids, scores = model.recommend(
        [0, 1], user_items, N=num_to_recommend, filter_already_liked_items=False
    )

    implicit_op = PredictImplicit(model, num_to_recommend=num_to_recommend)

    input_schema = Schema([ColumnSchema("user_id", dtype="int64")])

    triton_chain = input_schema.column_names >> implicit_op

    triton_ens = Ensemble(triton_chain, input_schema)
    triton_ens.export(tmpdir)

    model_name = triton_ens.name
    input_user_id = np.array([[0], [1]], dtype=np.int64)
    inputs = [
        grpcclient.InferInput(
            "user_id", input_user_id.shape, triton.np_to_triton_dtype(input_user_id.dtype)
        ),
    ]
    inputs[0].set_data_from_numpy(input_user_id)
    outputs = [grpcclient.InferRequestedOutput("scores"), grpcclient.InferRequestedOutput("ids")]

    response = None

    with run_triton_server(tmpdir) as client:
      response = client.infer(model_name, inputs, outputs=outputs)

tests/unit/systems/implicit/test_implicit.py:121:


/usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:1322: in infer
raise_error_grpc(rpc_error)


rpc_error = <_InactiveRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "Request for unknown model...all.cc","file_line":1069,"grpc_message":"Request for unknown model: 'ensemble_model' is not found","grpc_status":14}"

def raise_error_grpc(rpc_error):
  raise get_error_grpc(rpc_error) from None

E tritonclient.utils.InferenceServerException: [StatusCode.UNAVAILABLE] Request for unknown model: 'ensemble_model' is not found

/usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:62: InferenceServerException
----------------------------- Captured stderr call -----------------------------

0%| | 0/30 [00:00<?, ?it/s]
30%|███ | 9/30 [00:00<00:00, 82.25it/s]
60%|██████ | 18/30 [00:00<00:00, 81.87it/s]
90%|█████████ | 27/30 [00:00<00:00, 79.49it/s]
100%|██████████| 30/30 [00:00<00:00, 79.97it/s]
=============================== warnings summary ===============================
../../../../../usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/__init__.py:18
/usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/__init__.py:18: DeprecationWarning: The nvtabular.framework_utils module is being replaced by the Merlin Models library. Support for importing from nvtabular.framework_utils is deprecated, and will be removed in a future version. Please consider using the models and layers from Merlin Models instead.
warnings.warn(

tests/unit/examples/test_serving_ranking_models_with_merlin_systems.py: 1 warning
tests/unit/systems/test_ensemble.py: 2 warnings
tests/unit/systems/test_export.py: 1 warning
tests/unit/systems/test_inference_ops.py: 2 warnings
tests/unit/systems/test_op_runner.py: 4 warnings
/usr/local/lib/python3.8/dist-packages/cudf/core/frame.py:384: UserWarning: The deep parameter is ignored and is only included for pandas compatibility.
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column x is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column y is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column id is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/fil/test_fil.py::test_binary_classifier_default[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_binary_classifier_with_proba[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_multi_classifier[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_regressor[sklearn_forest_regressor-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_model_file[sklearn_forest_regressor-checkpoint.tl]
/usr/local/lib/python3.8/dist-packages/sklearn/utils/deprecation.py:103: FutureWarning: Attribute n_features_ was deprecated in version 1.0 and will be removed in 1.2. Use n_features_in_ instead.
warnings.warn(msg, category=FutureWarning)

tests/unit/systems/fil/test_forest.py::test_export_merlin_models
/usr/local/lib/python3.8/dist-packages/tornado/ioloop.py:350: DeprecationWarning: make_current is deprecated; start the event loop first
self.make_current()

tests/unit/systems/implicit/test_implicit.py::test_reload_from_config[AlternatingLeastSquares]
/usr/local/lib/python3.8/dist-packages/implicit/utils.py:28: UserWarning: OpenBLAS detected. Its highly recommend to set the environment variable 'export OPENBLAS_NUM_THREADS=1' to disable its internal multithreading
warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED tests/unit/systems/implicit/test_implicit.py::test_ensemble[BayesianPersonalizedRanking]
FAILED tests/unit/systems/implicit/test_implicit.py::test_ensemble[AlternatingLeastSquares]
FAILED tests/unit/systems/implicit/test_implicit.py::test_ensemble[LogisticMatrixFactorization]
============ 3 failed, 78 passed, 21 warnings in 423.80s (0:07:03) =============
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.github.com/repos/NVIDIA-Merlin/systems/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_systems] $ /bin/bash /tmp/jenkins8439803627731493573.sh

# instead of path to model directory within the model repository
if model_repository.endswith(".py"):
    return str(pathlib.Path(model_repository).parent.parent.parent)
return str(model_repository_path)
Member Author

example:

import pathlib

pathlib.Path("/tmp/my_model_repository/my_model_name/1/model.py").parent.parent.parent
# => PosixPath('/tmp/my_model_repository')
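The check discussed in this comment can be sketched as a small self-contained helper. This is an illustrative sketch, not the code merged in the PR: the function name `model_repository_root` and the single `model_repository` argument are assumptions; it only shows the `.parent.parent.parent` logic from the example above, which walks up from `<repo>/<model_name>/<version>/model.py` to the repository root.

```python
import pathlib


def model_repository_root(model_repository: str) -> str:
    """Return the model repository root.

    Triton may pass either the repository directory itself or the path to
    a model.py file inside it (e.g. "<repo>/<model_name>/1/model.py").
    """
    if model_repository.endswith(".py"):
        # model.py -> version dir -> model dir -> repository root
        return str(pathlib.Path(model_repository).parent.parent.parent)
    return str(model_repository)
```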

@oliverholworthy
Member Author

rerun tests

@nvidia-merlin-bot

Click to view CI Results
GitHub pull request #134 of commit a4990ac3a4287173faa8c24cd0df79d1c217937a, no merge conflicts.
Running as SYSTEM
Setting status of a4990ac3a4287173faa8c24cd0df79d1c217937a to PENDING with url https://10.20.13.93:8080/job/merlin_systems/260/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_systems
using credential fce1c729-5d7c-48e8-90cb-b0c314b1076e
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/systems # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/systems
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems user + githubtoken
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/systems +refs/pull/134/*:refs/remotes/origin/pr/134/* # timeout=10
 > git rev-parse a4990ac3a4287173faa8c24cd0df79d1c217937a^{commit} # timeout=10
Checking out Revision a4990ac3a4287173faa8c24cd0df79d1c217937a (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f a4990ac3a4287173faa8c24cd0df79d1c217937a # timeout=10
Commit message: "Correct function handling TritonPythonModel model_repository"
 > git rev-list --no-walk a4990ac3a4287173faa8c24cd0df79d1c217937a # timeout=10
[merlin_systems] $ /bin/bash /tmp/jenkins6225257845585263035.sh
PYTHONPATH=:/usr/local/lib/python3.8/dist-packages/:/usr/local/hugectr/lib:/var/jenkins_home/workspace/merlin_systems/systems
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_systems/systems, configfile: pyproject.toml
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 81 items

tests/unit/test_version.py . [ 1%]
tests/unit/examples/test_serving_an_xgboost_model_with_merlin_systems.py F [ 2%]
[ 2%]
tests/unit/examples/test_serving_ranking_models_with_merlin_systems.py . [ 3%]
[ 3%]
tests/unit/systems/test_ensemble.py .... [ 8%]
tests/unit/systems/test_ensemble_ops.py .. [ 11%]
tests/unit/systems/test_export.py . [ 12%]
tests/unit/systems/test_graph.py . [ 13%]
tests/unit/systems/test_inference_ops.py ... [ 17%]
tests/unit/systems/test_model_registry.py . [ 18%]
tests/unit/systems/test_op_runner.py .... [ 23%]
tests/unit/systems/test_tensorflow_inf_op.py .... [ 28%]
tests/unit/systems/dag/ops/test_feast.py ..... [ 34%]
tests/unit/systems/dag/ops/test_softmax_sampling.py ................. [ 55%]
tests/unit/systems/fil/test_fil.py .......................... [ 87%]
tests/unit/systems/fil/test_forest.py .... [ 92%]
tests/unit/systems/implicit/test_implicit.py ...... [100%]

=================================== FAILURES ===================================
_________________________ test_example_serving_xgboost _________________________

self = <testbook.client.TestbookNotebookClient object at 0x7ff6d518ea90>
cell = {'cell_type': 'markdown', 'id': '7808bc12', 'metadata': {}, 'source': "Let's now package the information up as inputs and send it to Triton for inference."}
kwargs = {}, cell_indexes = [14, 15, 16, 17, 18, 19, ...]
executed_cells = [{'cell_type': 'markdown', 'id': '65b7e4e8', 'metadata': {}, 'source': '## Retrieving Recommendations from Triton Infe...c12', 'metadata': {}, 'source': "Let's now package the information up as inputs and send it to Triton for inference."}]
idx = 17

def execute_cell(self, cell, **kwargs) -> Union[Dict, List[Dict]]:
    """
    Executes a cell or list of cells
    """
    if isinstance(cell, slice):
        start, stop = self._cell_index(cell.start), self._cell_index(cell.stop)
        if cell.step is not None:
            raise TestbookError('testbook does not support step argument')

        cell = range(start, stop + 1)
    elif isinstance(cell, str) or isinstance(cell, int):
        cell = [cell]

    cell_indexes = cell

    if all(isinstance(x, str) for x in cell):
        cell_indexes = [self._cell_index(tag) for tag in cell]

    executed_cells = []
    for idx in cell_indexes:
        try:
          cell = super().execute_cell(self.nb['cells'][idx], idx, **kwargs)

/usr/local/lib/python3.8/dist-packages/testbook/client.py:133:


args = (<testbook.client.TestbookNotebookClient object at 0x7ff6d518ea90>, {'cell_type': 'code', 'execution_count': 8, 'id': ...erverClient("localhost:8001") as client:\n response = client.infer("ensemble_model", inputs, outputs=outputs)'}, 17)
kwargs = {}

def wrapped(*args, **kwargs):
  return just_run(coro(*args, **kwargs))

/usr/local/lib/python3.8/dist-packages/nbclient/util.py:85:


coro = <coroutine object NotebookClient.async_execute_cell at 0x7ff635b966c0>

def just_run(coro: Awaitable) -> Any:
    """Make the coroutine run, even if there is an event loop running (using nest_asyncio)"""
    try:
        loop = asyncio.get_running_loop()
    except RuntimeError:
        loop = None
    if loop is None:
        had_running_loop = False
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
    else:
        had_running_loop = True
    if had_running_loop:
        # if there is a running loop, we patch using nest_asyncio
        # to have reentrant event loops
        check_ipython()
        import nest_asyncio

        nest_asyncio.apply()
        check_patch_tornado()
  return loop.run_until_complete(coro)

/usr/local/lib/python3.8/dist-packages/nbclient/util.py:60:


self = <_UnixSelectorEventLoop running=False closed=False debug=False>
future = <Task finished name='Task-43' coro=<NotebookClient.async_execute_cell() done, defined at /usr/local/lib/python3.8/dist...und\nInferenceServerException: [StatusCode.UNAVAILABLE] Request for unknown model: 'ensemble_model' is not found\n')>

def run_until_complete(self, future):
    """Run until the Future is done.

    If the argument is a coroutine, it is wrapped in a Task.

    WARNING: It would be disastrous to call run_until_complete()
    with the same coroutine twice -- it would wrap it in two
    different Tasks and that can't be good.

    Return the Future's result, or raise its exception.
    """
    self._check_closed()
    self._check_running()

    new_task = not futures.isfuture(future)
    future = tasks.ensure_future(future, loop=self)
    if new_task:
        # An exception is raised if the future didn't complete, so there
        # is no need to log the "destroy pending task" message
        future._log_destroy_pending = False

    future.add_done_callback(_run_until_complete_cb)
    try:
        self.run_forever()
    except:
        if new_task and future.done() and not future.cancelled():
            # The coroutine raised a BaseException. Consume the exception
            # to not log a warning, the caller doesn't have access to the
            # local task.
            future.exception()
        raise
    finally:
        future.remove_done_callback(_run_until_complete_cb)
    if not future.done():
        raise RuntimeError('Event loop stopped before Future completed.')
  return future.result()

/usr/lib/python3.8/asyncio/base_events.py:616:


self = <testbook.client.TestbookNotebookClient object at 0x7ff6d518ea90>
cell = {'cell_type': 'code', 'execution_count': 8, 'id': '2fefd5b8', 'metadata': {'execution': {'iopub.status.busy': '2022-08...enceServerClient("localhost:8001") as client:\n response = client.infer("ensemble_model", inputs, outputs=outputs)'}
cell_index = 17, execution_count = None, store_history = True

async def async_execute_cell(
    self,
    cell: NotebookNode,
    cell_index: int,
    execution_count: t.Optional[int] = None,
    store_history: bool = True,
) -> NotebookNode:
    """
    Executes a single code cell.

    To execute all cells see :meth:`execute`.

    Parameters
    ----------
    cell : nbformat.NotebookNode
        The cell which is currently being processed.
    cell_index : int
        The position of the cell within the notebook object.
    execution_count : int
        The execution count to be assigned to the cell (default: Use kernel response)
    store_history : bool
        Determines if history should be stored in the kernel (default: False).
        Specific to ipython kernels, which can store command histories.

    Returns
    -------
    output : dict
        The execution output payload (or None for no output).

    Raises
    ------
    CellExecutionError
        If execution failed and should raise an exception, this will be raised
        with defaults about the failure.

    Returns
    -------
    cell : NotebookNode
        The cell which was just processed.
    """
    assert self.kc is not None

    await run_hook(self.on_cell_start, cell=cell, cell_index=cell_index)

    if cell.cell_type != 'code' or not cell.source.strip():
        self.log.debug("Skipping non-executing cell %s", cell_index)
        return cell

    if self.skip_cells_with_tag in cell.metadata.get("tags", []):
        self.log.debug("Skipping tagged cell %s", cell_index)
        return cell

    if self.record_timing:  # clear execution metadata prior to execution
        cell['metadata']['execution'] = {}

    self.log.debug("Executing cell:\n%s", cell.source)

    cell_allows_errors = (not self.force_raise_errors) and (
        self.allow_errors or "raises-exception" in cell.metadata.get("tags", [])
    )

    await run_hook(self.on_cell_execute, cell=cell, cell_index=cell_index)
    parent_msg_id = await ensure_async(
        self.kc.execute(
            cell.source, store_history=store_history, stop_on_error=not cell_allows_errors
        )
    )
    await run_hook(self.on_cell_complete, cell=cell, cell_index=cell_index)
    # We launched a code cell to execute
    self.code_cells_executed += 1
    exec_timeout = self._get_timeout(cell)

    cell.outputs = []
    self.clear_before_next_output = False

    task_poll_kernel_alive = asyncio.ensure_future(self._async_poll_kernel_alive())
    task_poll_output_msg = asyncio.ensure_future(
        self._async_poll_output_msg(parent_msg_id, cell, cell_index)
    )
    self.task_poll_for_reply = asyncio.ensure_future(
        self._async_poll_for_reply(
            parent_msg_id, cell, exec_timeout, task_poll_output_msg, task_poll_kernel_alive
        )
    )
    try:
        exec_reply = await self.task_poll_for_reply
    except asyncio.CancelledError:
        # can only be cancelled by task_poll_kernel_alive when the kernel is dead
        task_poll_output_msg.cancel()
        raise DeadKernelError("Kernel died")
    except Exception as e:
        # Best effort to cancel request if it hasn't been resolved
        try:
            # Check if the task_poll_output is doing the raising for us
            if not isinstance(e, CellControlSignal):
                task_poll_output_msg.cancel()
        finally:
            raise

    if execution_count:
        cell['execution_count'] = execution_count
    await run_hook(
        self.on_cell_executed, cell=cell, cell_index=cell_index, execute_reply=exec_reply
    )
  await self._check_raise_for_error(cell, cell_index, exec_reply)

/usr/local/lib/python3.8/dist-packages/nbclient/client.py:1025:


self = <testbook.client.TestbookNotebookClient object at 0x7ff6d518ea90>
cell = {'cell_type': 'code', 'execution_count': 8, 'id': '2fefd5b8', 'metadata': {'execution': {'iopub.status.busy': '2022-08...enceServerClient("localhost:8001") as client:\n response = client.infer("ensemble_model", inputs, outputs=outputs)'}
cell_index = 17
exec_reply = {'buffers': [], 'content': {'ename': 'InferenceServerException', 'engine_info': {'engine_id': -1, 'engine_uuid': '73dd...e, 'engine': '73dd1960-1d4a-4b48-8c1e-99e1d47f9cec', 'started': '2022-08-15T16:12:47.034705Z', 'status': 'error'}, ...}

async def _check_raise_for_error(
    self, cell: NotebookNode, cell_index: int, exec_reply: t.Optional[t.Dict]
) -> None:

    if exec_reply is None:
        return None

    exec_reply_content = exec_reply['content']
    if exec_reply_content['status'] != 'error':
        return None

    cell_allows_errors = (not self.force_raise_errors) and (
        self.allow_errors
        or exec_reply_content.get('ename') in self.allow_error_names
        or "raises-exception" in cell.metadata.get("tags", [])
    )
    await run_hook(
        self.on_cell_error, cell=cell, cell_index=cell_index, execute_reply=exec_reply
    )
    if not cell_allows_errors:
      raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)

E nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
E ------------------
E from merlin.systems.triton import convert_df_to_triton_input
E import tritonclient.grpc as grpcclient
E
E ten_examples = train.compute().drop(columns=['rating', 'title', 'rating_binary'])[:10]
E inputs = convert_df_to_triton_input(inf_workflow.input_schema.column_names, ten_examples, grpcclient.InferInput)
E
E outputs = [
E grpcclient.InferRequestedOutput(col)
E for col in inf_ops.output_schema.column_names
E ]
E # send request to tritonserver
E with grpcclient.InferenceServerClient("localhost:8001") as client:
E response = client.infer("ensemble_model", inputs, outputs=outputs)
E ------------------
E
E ---------------------------------------------------------------------------
E InferenceServerException                 Traceback (most recent call last)
E Input In [8], in <cell line: 12>()
E      11 # send request to tritonserver
E      12 with grpcclient.InferenceServerClient("localhost:8001") as client:
E ---> 13 response = client.infer("ensemble_model", inputs, outputs=outputs)
E
E File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:1322, in InferenceServerClient.infer(self, model_name, inputs, model_version, outputs, request_id, sequence_id, sequence_start, sequence_end, priority, timeout, client_timeout, headers, compression_algorithm)
E    1320 return result
E    1321 except grpc.RpcError as rpc_error:
E -> 1322 raise_error_grpc(rpc_error)
E
E File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:62, in raise_error_grpc(rpc_error)
E     61 def raise_error_grpc(rpc_error):
E ---> 62 raise get_error_grpc(rpc_error) from None
E
E InferenceServerException: [StatusCode.UNAVAILABLE] Request for unknown model: 'ensemble_model' is not found
E InferenceServerException: [StatusCode.UNAVAILABLE] Request for unknown model: 'ensemble_model' is not found

/usr/local/lib/python3.8/dist-packages/nbclient/client.py:919: CellExecutionError

During handling of the above exception, another exception occurred:

tb = <testbook.client.TestbookNotebookClient object at 0x7ff6d518ea90>

@testbook(REPO_ROOT / "examples/Serving-An-XGboost-Model-With-Merlin-Systems.ipynb", execute=False)
def test_example_serving_xgboost(tb):
    tb.inject(
        """
        from unittest.mock import patch
        from merlin.datasets.synthetic import generate_data
        mock_train, mock_valid = generate_data(
            input="movielens-100k",
            num_rows=1000,
            set_sizes=(0.8, 0.2)
        )
        p1 = patch(
            "merlin.datasets.entertainment.get_movielens",
            return_value=[mock_train, mock_valid]
        )
        p1.start()
        """
    )
    NUM_OF_CELLS = len(tb.cells)
    # TODO: the following line is a hack -- remove when merlin-models#624 gets fixed
    tb.cells[4].source = tb.cells[4].source.replace(
        "without(['rating_binary', 'title'])", "without(['rating_binary', 'title', 'userId_count'])"
    )
    tb.execute_cell(list(range(0, 14)))

    with run_triton_server("ensemble"):
      tb.execute_cell(list(range(14, NUM_OF_CELLS)))

tests/unit/examples/test_serving_an_xgboost_model_with_merlin_systems.py:38:


self = <testbook.client.TestbookNotebookClient object at 0x7ff6d518ea90>
cell = {'cell_type': 'markdown', 'id': '7808bc12', 'metadata': {}, 'source': "Let's now package the information up as inputs and send it to Triton for inference."}
kwargs = {}, cell_indexes = [14, 15, 16, 17, 18, 19, ...]
executed_cells = [{'cell_type': 'markdown', 'id': '65b7e4e8', 'metadata': {}, 'source': '## Retrieving Recommendations from Triton Infe...c12', 'metadata': {}, 'source': "Let's now package the information up as inputs and send it to Triton for inference."}]
idx = 17

def execute_cell(self, cell, **kwargs) -> Union[Dict, List[Dict]]:
    """
    Executes a cell or list of cells
    """
    if isinstance(cell, slice):
        start, stop = self._cell_index(cell.start), self._cell_index(cell.stop)
        if cell.step is not None:
            raise TestbookError('testbook does not support step argument')

        cell = range(start, stop + 1)
    elif isinstance(cell, str) or isinstance(cell, int):
        cell = [cell]

    cell_indexes = cell

    if all(isinstance(x, str) for x in cell):
        cell_indexes = [self._cell_index(tag) for tag in cell]

    executed_cells = []
    for idx in cell_indexes:
        try:
            cell = super().execute_cell(self.nb['cells'][idx], idx, **kwargs)
        except CellExecutionError as ce:
          raise TestbookRuntimeError(ce.evalue, ce, self._get_error_class(ce.ename))

E testbook.exceptions.TestbookRuntimeError: An error occurred while executing the following cell:
E ------------------
E from merlin.systems.triton import convert_df_to_triton_input
E import tritonclient.grpc as grpcclient
E
E ten_examples = train.compute().drop(columns=['rating', 'title', 'rating_binary'])[:10]
E inputs = convert_df_to_triton_input(inf_workflow.input_schema.column_names, ten_examples, grpcclient.InferInput)
E
E outputs = [
E grpcclient.InferRequestedOutput(col)
E for col in inf_ops.output_schema.column_names
E ]
E # send request to tritonserver
E with grpcclient.InferenceServerClient("localhost:8001") as client:
E response = client.infer("ensemble_model", inputs, outputs=outputs)
E ------------------
E
E ---------------------------------------------------------------------------
E InferenceServerException                 Traceback (most recent call last)
E Input In [8], in <cell line: 12>()
E      11 # send request to tritonserver
E      12 with grpcclient.InferenceServerClient("localhost:8001") as client:
E ---> 13 response = client.infer("ensemble_model", inputs, outputs=outputs)
E
E File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:1322, in InferenceServerClient.infer(self, model_name, inputs, model_version, outputs, request_id, sequence_id, sequence_start, sequence_end, priority, timeout, client_timeout, headers, compression_algorithm)
E    1320     return result
E    1321 except grpc.RpcError as rpc_error:
E -> 1322     raise_error_grpc(rpc_error)
E
E File /usr/local/lib/python3.8/dist-packages/tritonclient/grpc/__init__.py:62, in raise_error_grpc(rpc_error)
E     61 def raise_error_grpc(rpc_error):
E ---> 62     raise get_error_grpc(rpc_error) from None
E
E InferenceServerException: [StatusCode.UNAVAILABLE] Request for unknown model: 'ensemble_model' is not found

/usr/local/lib/python3.8/dist-packages/testbook/client.py:135: TestbookRuntimeError
----------------------------- Captured stderr call -----------------------------
2022-08-15 16:12:42,156 - distributed.preloading - INFO - Import preload module: dask_cuda.initialize
2022-08-15 16:12:42,168 - distributed.preloading - INFO - Import preload module: dask_cuda.initialize
[16:12:46] task [xgboost.dask]:tcp://127.0.0.1:37163 got new rank 0
I0815 16:12:47.198108 16029 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f8ee6000000' with size 268435456
I0815 16:12:47.198834 16029 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0815 16:12:47.202785 16029 model_repository_manager.cc:1191] loading: 0_transformworkflow:1
I0815 16:12:47.303030 16029 model_repository_manager.cc:1191] loading: 1_fil:1
I0815 16:12:47.309531 16029 python_be.cc:1774] TRITONBACKEND_ModelInstanceInitialize: 0_transformworkflow (GPU device 0)
I0815 16:12:47.403427 16029 model_repository_manager.cc:1191] loading: 1_predictforest:1
0815 16:12:48.301923 16040 pb_stub.cc:1006] Non-graceful termination detected.
/usr/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 27 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
=============================== warnings summary ===============================
../../../../../usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/__init__.py:18
/usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/__init__.py:18: DeprecationWarning: The nvtabular.framework_utils module is being replaced by the Merlin Models library. Support for importing from nvtabular.framework_utils is deprecated, and will be removed in a future version. Please consider using the models and layers from Merlin Models instead.
warnings.warn(

tests/unit/examples/test_serving_ranking_models_with_merlin_systems.py: 1 warning
tests/unit/systems/test_ensemble.py: 2 warnings
tests/unit/systems/test_export.py: 1 warning
tests/unit/systems/test_inference_ops.py: 2 warnings
tests/unit/systems/test_op_runner.py: 4 warnings
/usr/local/lib/python3.8/dist-packages/cudf/core/frame.py:384: UserWarning: The deep parameter is ignored and is only included for pandas compatibility.
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column x is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column y is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column id is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/fil/test_fil.py::test_binary_classifier_default[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_binary_classifier_with_proba[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_multi_classifier[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_regressor[sklearn_forest_regressor-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_model_file[sklearn_forest_regressor-checkpoint.tl]
/usr/local/lib/python3.8/dist-packages/sklearn/utils/deprecation.py:103: FutureWarning: Attribute n_features_ was deprecated in version 1.0 and will be removed in 1.2. Use n_features_in_ instead.
warnings.warn(msg, category=FutureWarning)

tests/unit/systems/fil/test_forest.py::test_export_merlin_models
/usr/local/lib/python3.8/dist-packages/tornado/ioloop.py:350: DeprecationWarning: make_current is deprecated; start the event loop first
self.make_current()

tests/unit/systems/implicit/test_implicit.py::test_reload_from_config[AlternatingLeastSquares]
/usr/local/lib/python3.8/dist-packages/implicit/utils.py:28: UserWarning: OpenBLAS detected. Its highly recommend to set the environment variable 'export OPENBLAS_NUM_THREADS=1' to disable its internal multithreading
warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================== short test summary info ============================
FAILED tests/unit/examples/test_serving_an_xgboost_model_with_merlin_systems.py::test_example_serving_xgboost
============ 1 failed, 80 passed, 21 warnings in 508.26s (0:08:28) =============
Build step 'Execute shell' marked build as failure
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.github.com/repos/NVIDIA-Merlin/systems/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_systems] $ /bin/bash /tmp/jenkins9403476798612585371.sh

@karlhigley
Copy link
Contributor

@oliverholworthy It looks like this is failing on an XGBoost test, but I'm not sure why. It can't find the ensemble, but I'm not sure if that's an issue with the test or an issue with the environment.
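One way to rule out a server start-up race (the test firing its request before Triton has finished registering the ensemble) would be to poll model readiness before calling `infer`. A minimal sketch — the `wait_until_ready` helper is our own illustration, not something in merlin-systems; the `is_model_ready` call is the standard tritonclient gRPC API:

```python
import time

def wait_until_ready(is_ready, timeout=60.0, interval=0.5):
    """Poll `is_ready` (a zero-arg callable returning bool) until it
    returns True or `timeout` seconds elapse. Returns True on success,
    False if the deadline passes first."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            if is_ready():
                return True
        except Exception:
            pass  # server may not be accepting connections yet
        time.sleep(interval)
    return False

# Hypothetical usage against a running server, before sending the request:
# import tritonclient.grpc as grpcclient
# client = grpcclient.InferenceServerClient("localhost:8001")
# assert wait_until_ready(lambda: client.is_model_ready("ensemble_model"))
```

If the test passes with a wait like this in place, the failure is a readiness race rather than a broken ensemble export.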

@oliverholworthy
Copy link
Member Author

rerun tests

@oliverholworthy
Copy link
Member Author

@karlhigley triggered another run. This might be something to do with the triton isolation issue (the reason #141 can't be merged - as it is currently written)

@karlhigley
Copy link
Contributor

That was my guess too, probably due to the PR stampede now that CI is unblocked 👍🏻

@nvidia-merlin-bot
Copy link

Click to view CI Results
GitHub pull request #134 of commit a4990ac3a4287173faa8c24cd0df79d1c217937a, no merge conflicts.
GitHub pull request #134 of commit a4990ac3a4287173faa8c24cd0df79d1c217937a, no merge conflicts.
Running as SYSTEM
Setting status of a4990ac3a4287173faa8c24cd0df79d1c217937a to PENDING with url https://10.20.13.93:8080/job/merlin_systems/264/console and message: 'Pending'
Using context: Jenkins
Building on master in workspace /var/jenkins_home/workspace/merlin_systems
using credential fce1c729-5d7c-48e8-90cb-b0c314b1076e
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/NVIDIA-Merlin/systems # timeout=10
Fetching upstream changes from https://github.com/NVIDIA-Merlin/systems
 > git --version # timeout=10
using GIT_ASKPASS to set credentials login for merlin-systems user + githubtoken
 > git fetch --tags --force --progress -- https://github.com/NVIDIA-Merlin/systems +refs/pull/134/*:refs/remotes/origin/pr/134/* # timeout=10
 > git rev-parse a4990ac3a4287173faa8c24cd0df79d1c217937a^{commit} # timeout=10
Checking out Revision a4990ac3a4287173faa8c24cd0df79d1c217937a (detached)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f a4990ac3a4287173faa8c24cd0df79d1c217937a # timeout=10
Commit message: "Correct function handling TritonPythonModel model_repository"
 > git rev-list --no-walk 90ad1f220fde95a6a6ed23f3b5fa4fd11175865d # timeout=10
[merlin_systems] $ /bin/bash /tmp/jenkins6790811923613998494.sh
PYTHONPATH=:/usr/local/lib/python3.8/dist-packages/:/usr/local/hugectr/lib:/var/jenkins_home/workspace/merlin_systems/systems
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-7.1.2, pluggy-1.0.0
rootdir: /var/jenkins_home/workspace/merlin_systems/systems, configfile: pyproject.toml
plugins: anyio-3.6.1, xdist-2.5.0, forked-1.4.0, cov-3.0.0
collected 81 items

tests/unit/test_version.py . [ 1%]
tests/unit/examples/test_serving_an_xgboost_model_with_merlin_systems.py . [ 2%]
[ 2%]
tests/unit/examples/test_serving_ranking_models_with_merlin_systems.py . [ 3%]
[ 3%]
tests/unit/systems/test_ensemble.py .... [ 8%]
tests/unit/systems/test_ensemble_ops.py .. [ 11%]
tests/unit/systems/test_export.py . [ 12%]
tests/unit/systems/test_graph.py . [ 13%]
tests/unit/systems/test_inference_ops.py ... [ 17%]
tests/unit/systems/test_model_registry.py . [ 18%]
tests/unit/systems/test_op_runner.py .... [ 23%]
tests/unit/systems/test_tensorflow_inf_op.py .... [ 28%]
tests/unit/systems/dag/ops/test_feast.py ..... [ 34%]
tests/unit/systems/dag/ops/test_softmax_sampling.py ................. [ 55%]
tests/unit/systems/fil/test_fil.py .......................... [ 87%]
tests/unit/systems/fil/test_forest.py .... [ 92%]
tests/unit/systems/implicit/test_implicit.py ...... [100%]

=============================== warnings summary ===============================
../../../../../usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/__init__.py:18
/usr/local/lib/python3.8/dist-packages/nvtabular/framework_utils/__init__.py:18: DeprecationWarning: The nvtabular.framework_utils module is being replaced by the Merlin Models library. Support for importing from nvtabular.framework_utils is deprecated, and will be removed in a future version. Please consider using the models and layers from Merlin Models instead.
warnings.warn(

tests/unit/examples/test_serving_ranking_models_with_merlin_systems.py: 1 warning
tests/unit/systems/test_ensemble.py: 2 warnings
tests/unit/systems/test_export.py: 1 warning
tests/unit/systems/test_inference_ops.py: 2 warnings
tests/unit/systems/test_op_runner.py: 4 warnings
/usr/local/lib/python3.8/dist-packages/cudf/core/frame.py:384: UserWarning: The deep parameter is ignored and is only included for pandas compatibility.
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column x is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column y is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/test_export.py::test_export_run_ensemble_triton[tensorflow-parquet]
/var/jenkins_home/workspace/merlin_systems/systems/merlin/systems/triton/export.py:304: UserWarning: Column id is being generated by NVTabular workflow but is unused in test_name_tf model
warnings.warn(

tests/unit/systems/fil/test_fil.py::test_binary_classifier_default[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_binary_classifier_with_proba[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_multi_classifier[sklearn_forest_classifier-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_regressor[sklearn_forest_regressor-get_model_params4]
tests/unit/systems/fil/test_fil.py::test_model_file[sklearn_forest_regressor-checkpoint.tl]
/usr/local/lib/python3.8/dist-packages/sklearn/utils/deprecation.py:103: FutureWarning: Attribute n_features_ was deprecated in version 1.0 and will be removed in 1.2. Use n_features_in_ instead.
warnings.warn(msg, category=FutureWarning)

tests/unit/systems/fil/test_forest.py::test_export_merlin_models
/usr/local/lib/python3.8/dist-packages/tornado/ioloop.py:350: DeprecationWarning: make_current is deprecated; start the event loop first
self.make_current()

tests/unit/systems/implicit/test_implicit.py::test_reload_from_config[AlternatingLeastSquares]
/usr/local/lib/python3.8/dist-packages/implicit/utils.py:28: UserWarning: OpenBLAS detected. Its highly recommend to set the environment variable 'export OPENBLAS_NUM_THREADS=1' to disable its internal multithreading
warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================= 81 passed, 21 warnings in 558.13s (0:09:18) ==================
Performing Post build task...
Match found for : : True
Logical operation result is TRUE
Running script : #!/bin/bash
cd /var/jenkins_home/
CUDA_VISIBLE_DEVICES=1 python test_res_push.py "https://api.github.com/repos/NVIDIA-Merlin/systems/issues/$ghprbPullId/comments" "/var/jenkins_home/jobs/$JOB_NAME/builds/$BUILD_NUMBER/log"
[merlin_systems] $ /bin/bash /tmp/jenkins10968557408137162622.sh

@karlhigley karlhigley merged commit 8960833 into NVIDIA-Merlin:main Aug 15, 2022
Labels
enhancement New feature or request
6 participants