
Failed to fetch log file from worker. Request URL missing either an 'http://' or 'https://' protocol. #23798

Closed
Vishal2696 opened this issue May 19, 2022 · 3 comments
Labels
area:Scheduler including HA (high availability) scheduler kind:bug This is a clearly a bug pending-response

Comments


Vishal2696 commented May 19, 2022

Apache Airflow version

2.2.3

What happened

The Airflow webserver is unable to fetch log files from the worker. I'm using the Celery executor, with the Airflow components deployed in Kubernetes as pods (under a Deployment). Below is the error I see in the webserver UI:

*** Log file does not exist: /opt/airflow/airflow/logs/mydagname/mytaskname/2022-05-19T04:53:37.167551+00:00/2.log
*** Fetching from: http://:8793/log/mydagname/mytaskname/2022-05-19T04:53:37.167551+00:00/2.log
*** Failed to fetch log file from worker. Request URL missing either an 'http://' or 'https://' protocol.
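
Note the empty host in the "Fetching from:" line above. The webserver appears to build that URL from the hostname recorded on the task instance, so a blank hostname would produce exactly this malformed URL. A minimal sketch of the idea (an illustration only, not Airflow's actual code; in 2.2.x the real URL is assembled in FileTaskHandler):

    # Sketch: how a blank task-instance hostname yields "http://:8793/...".
    # (Illustrative; the variable names here are made up.)
    worker_log_server_port = 8793  # [logging] worker_log_server_port
    hostname = ""                  # apparently what got stored for the task instance
    log_relative_path = "mydagname/mytaskname/2022-05-19T04:53:37.167551+00:00/2.log"

    url = f"http://{hostname}:{worker_log_server_port}/log/{log_relative_path}"
    print(url)  # http://:8793/log/... -> rejected as missing an http(s) protocol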

Below is the error I see in the worker pod's STDOUT:

self.protocol = self.protocol if parsed_url.scheme is '' else parsed_url.scheme
[2022-05-19 10:06:22,046: ERROR/ForkPoolWorker-15] Task airflow.executors.celery_executor.execute_command[feae5691-b147-4bea-9b4a-bed01b559539] raised unexpected: AirflowException('Celery command failed on host: 10.22.52.110')
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.8/site-packages/celery/app/trace.py", line 451, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.8/site-packages/celery/app/trace.py", line 734, in __protected_call__
    return self.run(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/celery_executor.py", line 90, in execute_command
    _execute_in_fork(command_to_exec, celery_task_id)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/executors/celery_executor.py", line 101, in _execute_in_fork
    raise AirflowException('Celery command failed on host: ' + get_hostname())
airflow.exceptions.AirflowException: Celery command failed on host: 10.22.52.110

I tried setting AIRFLOW__CORE__HOSTNAME_CALLABLE to "airflow.utils.net.get_host_ip_address" on both the webserver and the worker, but nothing changed; still the same error. The worker sits behind a Kubernetes Service with port 8793 mapped to it.
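
For what it's worth, here's a quick way to check what that configured callable actually returns inside the worker pod (a sketch; it assumes the pod exposes the Airflow Python environment, e.g. via kubectl exec):

    # Run inside the worker pod to see what the configured hostname callable
    # resolves to; an empty or unroutable value would explain the blank host
    # in the fetch URL above.
    from airflow.utils.net import get_host_ip_address

    print(get_host_ip_address())  # should print a pod IP reachable from the webserver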

Metadata DB: Azure Postgres Server Version 11.

Providers Info

apache-airflow-providers-celery | 2.1.0
apache-airflow-providers-ftp | 2.0.1
apache-airflow-providers-http | 2.0.1
apache-airflow-providers-imap | 2.0.1
apache-airflow-providers-microsoft-azure | 3.9.0
apache-airflow-providers-oracle | 2.2.3
apache-airflow-providers-postgres | 2.4.0
apache-airflow-providers-redis | 2.0.1
apache-airflow-providers-sqlite | 2.0.1

Operating System

Airflow Docker image built from source code.

Deployment

Other Docker-based deployment

Deployment details

Kubernetes

@Vishal2696 Vishal2696 added area:core kind:bug This is a clearly a bug labels May 19, 2022
@uranusjr
Member

See #13692; more information is needed.

@uranusjr uranusjr added area:Scheduler including HA (high availability) scheduler pending-response and removed area:core labels May 19, 2022

Vishal2696 commented May 20, 2022

@uranusjr I'm not sure exactly what info you need, but I'm providing as much as I can.

  1. The image being run was built from source from the official Apache GitHub repo, following the instructions given there.
  2. The problematic task triggers an Azure Data Factory pipeline using the azure.mgmt.datafactory library (see the sketch after this list).
  3. The DAGs are baked into the Airflow image as a Python artifact.
  4. In this particular case I'm running one webserver, one scheduler, and one worker, but I have observed this in environments with multiple schedulers and multiple workers too.
  5. The Celery/STDOUT logs from the worker are already included in the description above.
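
For context, the task in question does roughly the following (a hypothetical sketch; the credential setup and resource names are placeholders, not the actual DAG code):

    # Hypothetical sketch of the Azure Data Factory trigger inside the task.
    from azure.identity import ClientSecretCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient

    credential = ClientSecretCredential(
        tenant_id="<tenant-id>", client_id="<client-id>", client_secret="<secret>"
    )
    client = DataFactoryManagementClient(credential, "<subscription-id>")
    run = client.pipelines.create_run(
        "<resource-group>", "<factory-name>", "<pipeline-name>"
    )
    print(run.run_id)  # the ADF pipeline run the task kicked off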


potiuk commented May 22, 2022

"@uranusjr I'm not sure exactly what info you need, but I'm providing as much as I can."

Just to set expectations: basically, any information that might help people here to help you. People here help in their free time, and more often than not the problems are connected to users misconfiguring Airflow or simply not troubleshooting their deployment as they should. While the community delivers Airflow as software, you are responsible for configuring it properly (and for learning how to do it). So when we help here, we do it in our free time, mostly when users have shown that they've investigated on their side and provided enough info for others to diagnose the problem.

There should be no expectation that you will get help here for sure. If you provide enough info, you might.

The best thing you can do, if you want us to help diagnose your deployment issue, is to provide even more information. It's really you who should care about giving as much information as you can, so that we can help you (again, this is free help on free software; if you NEED your problem to be solved, there are companies that provide paid support for troubleshooting Airflow installations).

In this particular case, my advice is to provide more information about the circumstances. A number of questions come to mind immediately, but you can probably provide even more; it's you who wants the problem solved, so it's on you to supply enough information.

  • Does it happen always?
  • Or for specific tasks only?
  • Or maybe intermittently (sometimes works, sometimes not)?
  • What customisations do you have (especially for logging)?
  • Do you have any networking/firewall configuration?
  • Did you check that networking works between the webserver and the worker, and that DNS resolves your pods properly from that URL?
  • Do you have any special configuration for logs?
  • Does your pod have persistent storage for logs?
  • Is the file you are trying to access when you see the error (/opt/airflow/airflow/logs/mydagname/mytaskname/2022-05-19T04:53:37.167551+00:00/2.log) actually present on the worker that the webserver tries to download it from?
  • Is your callable (the one you configured) actually called, and does it work? Please run it in the worker pod and see what it returns as hostname/IP address; maybe your networking is misconfigured and it returns an empty value (see the sketch after this list).
  • Do you have the right hostname in the task_instance table for the tasks you run? What is the value for an example failure?
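
For the last two checks, something along these lines should do (a sketch: it assumes shell access to the worker pod and read access to the metadata DB; the column names come from Airflow's task_instance model):

    # Inside the worker pod: what does the configured hostname callable return?
    from airflow.utils.net import get_host_ip_address

    print(get_host_ip_address())  # empty/unroutable output would explain "http://:8793/..."

    # And against the metadata DB (psql on the Azure Postgres server), check
    # what hostname was recorded for a failing task, e.g.:
    #   SELECT dag_id, task_id, hostname
    #   FROM task_instance
    #   WHERE dag_id = 'mydagname'
    #   ORDER BY start_date DESC LIMIT 5;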

I think going through that list might help you gather enough information to make it possible for us to help you.

Converting this into a discussion until more info is provided.

@apache apache locked and limited conversation to collaborators May 22, 2022
@potiuk potiuk converted this issue into discussion #23855 May 22, 2022

This issue was moved to a discussion.

You can continue the conversation there.
