Adding the profiler for doris failed to execute #19220

Closed
NCUZK opened this issue Jan 3, 2025 · 1 comment · Fixed by #19250
Assignees
Labels
bug (Something isn't working), profiler

Comments

NCUZK commented Jan 3, 2025

Affected module
Does it impact the UI, backend or Ingestion Framework?

Describe the bug
A clear and concise description of what the bug is.

To Reproduce

Screenshots or steps to reproduce

Expected behavior
A clear and concise description of what you expected to happen.

Version:

  • OS: [e.g. iOS]
  • Python version:
  • OpenMetadata version: 1.6.1
  • OpenMetadata Ingestion package version: 1.6.1

Additional context
Add any other context about the problem here.

NCUZK commented Jan 3, 2025

image

image

ee80541c-87bc-4b5c-96cf-963c9dc09a98-profiler-task-5izyfjzs
*** Found local files:
*** * /opt/airflow/logs/dag_id=ee80541c-87bc-4b5c-96cf-963c9dc09a98/run_id=manual__2025-01-03T08:42:39+00:00/task_id=profiler_task/attempt=1.log
[2025-01-03T08:42:55.690+0000] {local_task_job_runner.py:120} INFO - ::group::Pre task execution logs
[2025-01-03T08:42:56.065+0000] {taskinstance.py:2076} INFO - Dependencies all met for dep_context=non-requeueable deps ti=<TaskInstance: ee80541c-87bc-4b5c-96cf-963c9dc09a98.profiler_task manual__2025-01-03T08:42:39+00:00 [queued]>
[2025-01-03T08:42:56.081+0000] {taskinstance.py:2076} INFO - Dependencies all met for dep_context=requeueable deps ti=<TaskInstance: ee80541c-87bc-4b5c-96cf-963c9dc09a98.profiler_task manual__2025-01-03T08:42:39+00:00 [queued]>
[2025-01-03T08:42:56.081+0000] {taskinstance.py:2306} INFO - Starting attempt 1 of 1
[2025-01-03T08:42:56.106+0000] {taskinstance.py:2330} INFO - Executing <Task(CustomPythonOperator): profiler_task> on 2025-01-03 08:42:39+00:00
[2025-01-03T08:42:56.112+0000] {standard_task_runner.py:63} INFO - Started process 84 to run task
[2025-01-03T08:42:56.114+0000] {standard_task_runner.py:90} INFO - Running: ['airflow', 'tasks', 'run', 'ee80541c-87bc-4b5c-96cf-963c9dc09a98', 'profiler_task', 'manual__2025-01-03T08:42:39+00:00', '--job-id', '45', '--raw', '--subdir', 'DAGS_FOLDER/ee80541c-87bc-4b5c-96cf-963c9dc09a98.py', '--cfg-path', '/tmp/tmpfp6mjd5s']
[2025-01-03T08:42:56.114+0000] {standard_task_runner.py:91} INFO - Job 45: Subtask profiler_task
[2025-01-03T08:42:56.448+0000] {task_command.py:426} INFO - Running <TaskInstance: ee80541c-87bc-4b5c-96cf-963c9dc09a98.profiler_task manual__2025-01-03T08:42:39+00:00 [running]> on host ee80541c-87bc-4b5c-96cf-963c9dc09a98-profiler-task-5izyfjzs
[2025-01-03T08:42:57.053+0000] {logging_mixin.py:188} WARNING - /home/airflow/.local/lib/python3.10/site-packages/airflow/providers/cncf/kubernetes/template_rendering.py:46 AirflowProviderDeprecationWarning: This function is deprecated. Please use create_unique_id.
[2025-01-03T08:42:57.054+0000] {logging_mixin.py:188} WARNING - /home/airflow/.local/lib/python3.10/site-packages/airflow/providers/cncf/kubernetes/kubernetes_helper_functions.py:145 AirflowProviderDeprecationWarning: This function is deprecated. Please use add_unique_suffix.
[2025-01-03T08:42:57.054+0000] {pod_generator.py:557} WARNING - Model file /opt/airflow/pod_templates/pod_template.yaml does not exist
[2025-01-03T08:42:57.177+0000] {taskinstance.py:2648} INFO - Exporting env vars: AIRFLOW_CTX_DAG_OWNER='admin' AIRFLOW_CTX_DAG_ID='ee80541c-87bc-4b5c-96cf-963c9dc09a98' AIRFLOW_CTX_TASK_ID='profiler_task' AIRFLOW_CTX_EXECUTION_DATE='2025-01-03T08:42:39+00:00' AIRFLOW_CTX_TRY_NUMBER='1' AIRFLOW_CTX_DAG_RUN_ID='manual__2025-01-03T08:42:39+00:00'
[2025-01-03T08:42:57.178+0000] {taskinstance.py:430} INFO - ::endgroup::
[2025-01-03T08:42:57.201+0000] {server_mixin.py:74} INFO - OpenMetadata client running with Server version [1.6.1] and Client version [1.6.1.0]
[2025-01-03T08:42:59.199+0000] {ingestion_pipeline_mixin.py:53} DEBUG - Created Pipeline Status for pipeline 测试Doris.ee80541c-87bc-4b5c-96cf-963c9dc09a98: runId='3983484f-ea95-4353-a008-a3920f33363d' pipelineState=<PipelineState.running: 'running'> startDate=Timestamp(root=1735893777195) timestamp=Timestamp(root=1735893777195) endDate=None status=None
[2025-01-03T08:42:59.200+0000] {importer.py:129} DEBUG - Importing: metadata.ingestion.source.database.doris.service_spec.ServiceSpec
[2025-01-03T08:42:59.328+0000] {importer.py:129} DEBUG - Importing: metadata.ingestion.source.database.doris.metadata.DorisSource
[2025-01-03T08:42:59.503+0000] {metadata.py:76} INFO - Starting profiler for service 测试Doris:doris
[2025-01-03T08:42:59.505+0000] {importer.py:129} DEBUG - Importing: metadata.ingestion.sink.metadata_rest.MetadataRestSink
[2025-01-03T08:42:59.615+0000] {profiler.py:81} DEBUG - Sink type:metadata-rest, <class 'metadata.ingestion.sink.metadata_rest.MetadataRestSink'> configured
[2025-01-03T08:42:59.615+0000] {importer.py:129} DEBUG - Importing: metadata.ingestion.source.database.doris.connection.get_connection
[2025-01-03T08:42:59.616+0000] {importer.py:129} DEBUG - Importing: metadata.ingestion.source.database.doris.connection.test_connection
[2025-01-03T08:42:59.764+0000] {test_connections.py:203} INFO - Running CheckAccess...
[2025-01-03T08:43:00.621+0000] {test_connections.py:203} INFO - Running GetSchemas...
[2025-01-03T08:43:00.638+0000] {test_connections.py:203} INFO - Running GetTables...
[2025-01-03T08:43:00.671+0000] {test_connections.py:203} INFO - Running GetViews...
[2025-01-03T08:43:00.703+0000] {test_connections.py:228} INFO - Test connection results:
[2025-01-03T08:43:00.704+0000] {test_connections.py:229} INFO - lastUpdatedAt=None status=<StatusType.Running: 'Running'> steps=[TestConnectionStepResult(name='CheckAccess', mandatory=True, passed=True, message=None, errorLog=None), TestConnectionStepResult(name='GetSchemas', mandatory=True, passed=True, message=None, errorLog=None), TestConnectionStepResult(name='GetTables', mandatory=True, passed=True, message=None, errorLog=None), TestConnectionStepResult(name='GetViews', mandatory=False, passed=True, message=None, errorLog=None)]
[2025-01-03T08:44:00.704+0000] {base.py:320} INFO - OpenMetadata Service: Processed 0 records, updated 0 records, filtered 0 records, found 0 errors
[2025-01-03T08:44:00.705+0000] {base.py:320} INFO - Profiler: Processed 0 records, updated 0 records, filtered 0 records, found 0 errors
[2025-01-03T08:44:00.705+0000] {base.py:320} INFO - OpenMetadata: Processed 0 records, updated 0 records, filtered 0 records, found 0 errors
[2025-01-03T08:44:34.190+0000] {importer.py:129} DEBUG - Importing: metadata.ingestion.source.database.doris.service_spec.ServiceSpec
[2025-01-03T08:44:34.191+0000] {importer.py:129} DEBUG - Importing: metadata.profiler.interface.sqlalchemy.profiler_interface.SQAProfilerInterface
[2025-01-03T08:44:34.191+0000] {importer.py:129} DEBUG - Importing: metadata.ingestion.source.database.doris.service_spec.ServiceSpec
[2025-01-03T08:44:34.191+0000] {importer.py:129} DEBUG - Importing: metadata.sampler.sqlalchemy.sampler.SQASampler
[2025-01-03T08:44:34.192+0000] {importer.py:129} DEBUG - Importing: metadata.ingestion.source.database.doris.connection.get_connection
[2025-01-03T08:44:34.193+0000] {importer.py:129} DEBUG - Importing: metadata.ingestion.source.database.doris.connection.get_connection
[2025-01-03T08:44:34.194+0000] {core.py:483} DEBUG - Computing profile metrics for 测试Doris.default.ads_abt.ads_abt_flow_aggregate_hour_hi...
[2025-01-03T08:44:34.195+0000] {status.py:91} WARNING - Unexpected exception processing entity 测试Doris.default.ads_abt.ads_abt_flow_aggregate_hour_hi: 2 validation errors for ThreadPoolMetrics
table.function-after[parse_name(), Table]
  Input should be a valid dictionary or instance of Table [type=model_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.7/v/model_type
table.is-instance[DeclarativeMeta]
  Input should be an instance of DeclarativeMeta [type=is_instance_of, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.7/v/is_instance_of
[2025-01-03T08:44:34.195+0000] {status.py:92} DEBUG - Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/profiler/processor/processor.py", line 63, in _run
    profile: ProfilerResponse = profiler_runner.process()
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/profiler/processor/core.py", line 486, in process
    self.compute_metrics()
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/profiler/processor/core.py", line 468, in compute_metrics
    self.profile_entity()
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/profiler/processor/core.py", line 450, in profile_entity
    table_metrics = self._prepare_table_metrics()
  File "/home/airflow/.local/lib/python3.10/site-packages/metadata/profiler/processor/core.py", line 331, in _prepare_table_metrics
    ThreadPoolMetrics(
  File "/home/airflow/.local/lib/python3.10/site-packages/pydantic/main.py", line 176, in __init__
    self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 2 validation errors for ThreadPoolMetrics
table.function-after[parse_name(), Table]
  Input should be a valid dictionary or instance of Table [type=model_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.7/v/model_type
table.is-instance[DeclarativeMeta]
  Input should be an instance of DeclarativeMeta [type=is_instance_of, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.7/v/is_instance_of
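
Both validation errors in the log point at the same field: `table` was `None` when `ThreadPoolMetrics` was constructed, so neither accepted type (`Table` or `DeclarativeMeta`) matched. As a rough aid for triaging logs like this one, here is a minimal sketch that uses only the Python standard library to pull the field location, error type, and input type out of a pasted pydantic v2 error message. The function name `summarize_validation_errors`, the regular expression, and the `SAMPLE` text are illustrative assumptions, not part of OpenMetadata or pydantic.

```python
import re

# Matches the bracketed details pydantic v2 appends to each error line, e.g.
# "[type=model_type, input_value=None, input_type=NoneType]".
# This pattern is an assumption based on the message format seen in the log.
DETAIL_RE = re.compile(
    r"\[type=(?P<type>\w+), input_value=(?P<value>.*), "
    r"input_type=(?P<input_type>\w+)\]\s*$"
)


def summarize_validation_errors(text):
    """Return one dict per pydantic v2 error found in pasted log text."""
    errors = []
    location = None  # most recent "field.path" line seen
    for raw in text.splitlines():
        line = raw.strip()
        match = DETAIL_RE.search(line)
        if match and location is not None:
            errors.append(
                {
                    "loc": location,
                    "type": match.group("type"),
                    "input_type": match.group("input_type"),
                    "message": line[: match.start()].strip(),
                }
            )
            location = None
        elif (
            line
            and not line.startswith("For further information")
            and "validation error" not in line
        ):
            # Any other non-empty line is treated as the location of the next error.
            location = line
    return errors


# Error text copied from the log above.
SAMPLE = """\
2 validation errors for ThreadPoolMetrics
table.function-after[parse_name(), Table]
Input should be a valid dictionary or instance of Table [type=model_type, input_value=None, input_type=NoneType]
For further information visit https://errors.pydantic.dev/2.7/v/model_type
table.is-instance[DeclarativeMeta]
Input should be an instance of DeclarativeMeta [type=is_instance_of, input_value=None, input_type=NoneType]
For further information visit https://errors.pydantic.dev/2.7/v/is_instance_of
"""

errors = summarize_validation_errors(SAMPLE)
for err in errors:
    print(f'{err["loc"]}: {err["type"]} (got {err["input_type"]})')
```

Every entry reporting `NoneType` as the input type suggests the value was missing before validation ran, rather than being of a wrong but present type.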

TeddyCr added the profiler and bug (Something isn't working) labels on Jan 6, 2025
TeddyCr self-assigned this on Jan 6, 2025
TeddyCr moved this to In Progress in Release 1.6.2 on Jan 6, 2025
github-project-automation (bot) moved this from In Progress to Done in Release 1.6.2 on Jan 7, 2025
Projects
Status: Done
2 participants