-
Notifications
You must be signed in to change notification settings - Fork 14.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Forward slash in dag_run_id
gives rise to trouble accessing things through the REST API
#20063
Comments
Thanks for opening your first issue here! Be sure to follow the issue template! |
Hmm, this is tricky. We can’t just allow slash in Does using |
Agreee. The %2F is the only "good" way to go. And if it does not work - I think we should probably fix it. This might become handy when we add multi-tenancy. "/" seems to be for example a nice, enforceable convention that can be used to separate dag namespaces. |
Currently, see #19745, |
I'll spend some time tomorrow looking into this. Percent encoding should work according to the specs. |
Yep. It definitely should |
I tracked this down to pallets/flask#900, and as Armin mentioned when he closed the ticket, the problem is in WSGI— @andrewgodwin Does ASGI have the same problem? If not, I wonder if it’d work if we move from Gunicorn to e.g. Uvicorn and run Flask with Update: I played with this a bit. This is still be a problem in
But there’s also
But of course the question is whether this is worth the hassle 😛 |
Bummer |
Ah yes, this old biscuit. ASGI behaves the same in this regard - as HTTP specifies that I'd suggest either double-escaping these values, which may prove a little tricky in terms of backwards compatibility, or fix the routing so it's an unambiguous path that can be routed by Flask. |
Thanks @uranusjr @andrewgodwin. TIL. I believe (WDYT) the best way to "fix" it is to warn peeople (and deprecate) the use of "/" in the ID and explain that they won't be able to access it via API until they change it. We can't straight disallow it, but we can add warning in the UI and logs about it. And then we disallow it in Airflow 3. We could potentially also disallow "/" in new DAGs while warn on updating existing. That might be a bit disruptive to someone who dynamically generates dags though. |
I think it is possible to include #!/usr/bin/env python3
## See if we can find way to send ath/to/mediafile like arguments to a restful api
from flask import Flask
from flask_restful import reqparse, Resource, Api
import logging
app = Flask(__name__)
api = Api(app)
parser = reqparse.RequestParser()
parser.add_argument("dag_id", required=False)
parser.add_argument("dag_run_id", required=False)
class Dag(Resource):
def get(self):
args = parser.parse_args()
## do clever DB access and filtering here, for now we just return the dag_run_id (default None)
return args.dag_run_id
api.add_resource(Dag, "/dags")
if __name__ == "__main__":
logging.basicConfig(level=logging.INFO, format='%(asctime)s %(levelname)s %(message)s')
app.run(host='0.0.0.0', port=4000, debug=True) Test using This could keep the current Api routes. Adding parameters like |
This means we’ll need to maintain two separate sets of URLs, which is suboptimal to me. I’d much rather just deprecate the use of |
Agree. Sounds like a hack. I also thought about it and it's actually kinda misleading to use Good example here (a bit different, but that's what we eventually will have to live with): https://github.com/apache/airflow/pkgs/container/airflow%2Fv2-1-test%2Fci%2Fpython3.7 |
Sorry for bumping the old issue, but maybe you have a quick answer for that - AWS added support for REST API in MWAA, but the problem is - usernames in MWAA start with Is there any way to use Neither |
Apache Airflow version
2.1.4
Operating System
linux
Versions of Apache Airflow Providers
apache-airflow-providers-amazon==2.2.0
apache-airflow-providers-celery==2.0.0
apache-airflow-providers-cncf-kubernetes==2.0.2
apache-airflow-providers-docker==2.1.1
apache-airflow-providers-elasticsearch==2.0.3
apache-airflow-providers-ftp==2.0.1
apache-airflow-providers-google==5.1.0
apache-airflow-providers-grpc==2.0.1
apache-airflow-providers-hashicorp==2.1.0
apache-airflow-providers-http==2.0.1
apache-airflow-providers-imap==2.0.1
apache-airflow-providers-microsoft-azure==3.1.1
apache-airflow-providers-mysql==2.1.1
apache-airflow-providers-postgres==2.2.0
apache-airflow-providers-redis==2.0.1
apache-airflow-providers-sendgrid==2.0.1
apache-airflow-providers-sftp==2.1.1
apache-airflow-providers-slack==4.0.1
apache-airflow-providers-sqlite==2.0.1
apache-airflow-providers-ssh==2.1.1
Deployment
Docker-Compose
Deployment details
We tend to trigger dag runs by some external event, e.g., a media-file upload, see #19745. It is useful to use the media-file path as a dag run id. The media-id can come with some partial path, e.g.,
path/to/mediafile
. All this seems to work fine in airflow, but we can't figure out a way to use the such a dag run id in the REST API, as the forward slashes/
interfere with the API routing.What happened
When using the API route
api/v1/dags/{dag_id}/dagRuns/{dag_run_id}
in, e.g., a HTTP GET, we expect a dag run to be found whendag_run_id
has the valuepath/to/mediafile
, but instead a.status: 404
is returned. When we change thedag_run_id
to the formatpath|to|mediafile
, the dag run is returned.What you expected to happen
We would expect a dag run to be returned, even if it contains the character
/
How to reproduce
Trigger a dag using a dag_run_id that contains a
/
, then try to retrieve it though the REST API.Anything else
No response
Are you willing to submit PR?
Code of Conduct
The text was updated successfully, but these errors were encountered: