Traefik stuck when used as frontend for a streaming API #560

Closed
git-oca opened this issue Jul 25, 2016 · 29 comments
git-oca commented Jul 25, 2016

Hi, I have a service that streams potentially large responses.
When I put traefik (v1.0.1) in front of this service, the streaming API doesn't seem to reply anymore.

My real use case is streaming from a Solr server that may return very large result sets, but this small Python/Flask service reproduces the problem easily:

from flask import Response
from flask import Flask

app = Flask(__name__)

@app.route('/numbers')
def generate_route():
    def generate():
        # Yield an endless stream of numbers, one per line.
        a = 0
        while True:
            yield str(a) + "\n"
            a += 1
    return Response(generate(), mimetype='text/plain')

if __name__ == "__main__":
    app.run(host='0.0.0.0', port=8080)

A call to this '/numbers' endpoint generates an infinite stream of numbers.
The following command triggers a broken pipe after ten lines, but still replies and prints numbers 0 to 9:

curl -s localhost:8080/numbers | head -n 10
0
1
2
3
...

When I use traefik as a reverse proxy to redirect calls from port 80 to 8080, the '/numbers' API doesn't reply anymore.

curl -s localhost/numbers | head -n 10

(no response here...)

So is there anything that can be done about that?
Thanks

Olivier

@emilevauge (Member)

@git-oca Currently, traefik supports streaming if the MIME type is set to text/event-stream.
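In other words, the described behaviour is that the proxy only switches to a flush-after-every-chunk copy when the upstream response declares an event-stream content type; otherwise the body may sit in buffers. A rough illustrative sketch of that kind of gate (names and structure are assumptions for illustration, not containous/oxy's actual code):

package streamsketch

import (
    "io"
    "net/http"
    "strings"
)

// copyResponse copies the upstream body to the client, flushing after every
// chunk only when the upstream declared text/event-stream.
// Illustrative sketch only, not the real containous/oxy forwarder.
func copyResponse(w http.ResponseWriter, upstream *http.Response) error {
    streaming := strings.HasPrefix(upstream.Header.Get("Content-Type"), "text/event-stream")
    flusher, canFlush := w.(http.Flusher)

    if !streaming || !canFlush {
        // Non-streaming path: a plain copy; data may stay buffered until
        // the upstream closes the connection.
        _, err := io.Copy(w, upstream.Body)
        return err
    }

    // Streaming path: forward each chunk and flush it immediately.
    buf := make([]byte, 32*1024)
    for {
        n, readErr := upstream.Body.Read(buf)
        if n > 0 {
            if _, writeErr := w.Write(buf[:n]); writeErr != nil {
                return writeErr
            }
            flusher.Flush()
        }
        if readErr == io.EOF {
            return nil
        }
        if readErr != nil {
            return readErr
        }
    }
}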


git-oca commented Jul 26, 2016

Thanks for your reply Emile. Unfortunately, I get exactly the same result using the "text/event-stream" MIME type.

Without traefik, I get the numbers (I used -i to check that the Content-Type 'text/event-stream' was correct):

curl -i localhost:8085/numbers | head -n 10
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0HTTP/1.0 200 OK
Content-Type: text/event-stream; charset=utf-8
Connection: close
Server: Werkzeug/0.11.5 Python/2.7.6
Date: Tue, 26 Jul 2016 08:51:04 GMT

0
1
2
3

But with traefik as the frontend, curl doesn't show any numbers and seems stuck.

curl -i localhost/numbers | head -n 10
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- 0:00:04 --:--:-- 0
(nothing...)

But I can see in the Python server log that it is actually writing the response.


git-oca commented Jul 26, 2016

Not sure if this helps, but I just did a quick try with "oxy" alone as a reverse proxy and the streaming went fine... I'm still unable to get it working with traefik.

package main

import (
    "net/http"

    "github.com/vulcand/oxy/forward"
    "github.com/vulcand/oxy/testutils"
)

func main() {
    // Forwards incoming requests to whatever the request URL points to,
    // adding proper forwarding headers.
    fwd, _ := forward.New()

    redirect := http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
        // Forward this request to the streaming backend.
        req.URL = testutils.ParseURI("http://localhost:8085")
        fwd.ServeHTTP(w, req)
    })

    // That's it! Our reverse proxy is ready.
    s := &http.Server{
        Addr:    ":8080",
        Handler: redirect,
    }
    s.ListenAndServe()
}


emilevauge commented Jul 26, 2016

@git-oca, this PR containous/oxy#9 fixed MIME type parsing in containous/oxy. I will integrate it into traefik soon.
I don't get why it worked with vulcand/oxy, since we had to patch the fork containous/oxy to add streaming support...


git-oca commented Jul 26, 2016

Fantastic! Thank you very much! 👍


git-oca commented Jul 29, 2016

Hi Emile,
I just picked up those modifications (PR containous/oxy#9) and did a quick build of traefik, but unfortunately the problem is still the same even with these changes. Any idea?


git-oca commented Jul 29, 2016

In oxy/forward/responseflusher.go, in this function:

func (wf *responseFlusher) Flush() {
    flusher, ok := wf.ResponseWriter.(http.Flusher)
    if ok {
        flusher.Flush()
    }
}

ok seems to always be false when I'm streaming... Unfortunately I don't know that much Go, so I can't help much more... sorry!


emilevauge commented Jul 29, 2016

@git-oca you mean you built traefik using commit 4298f24d572dc554eb984f2ffdf6bdd54d4bd613 of containous/oxy, right?
This is really weird...
Can you paste your toml file?


git-oca commented Jul 29, 2016

I just cloned the master of traefik and modified the oxy files manually.



git-oca commented Jul 29, 2016

I will post my toml next week. I can also provide a docker image to help reproduce the problem.



git-oca commented Aug 2, 2016

So here is my traefik.toml

graceTimeOut = 10
logLevel = "WARN"
ProvidersThrottleDuration = 2
MaxIdleConnsPerHost = 0
defaultEntryPoints = ["http"]

[entryPoints]
  [entryPoints.http]
  address = ":80"

[retry]
attempts = 10

[file]
filename = "rules.toml"
watch = true

and this is rules.toml

[frontends]
  [frontends.frontend1]
  backend = "backend1"
[backends]
  [backends.backend1]
    [backends.backend1.LoadBalancer]
    method = "wrr"
      [backends.backend1.servers.server1]
        url = "http://localhost:8085"
        weight = 1


git-oca commented Aug 2, 2016

I just pushed to Docker Hub an image that may help reproduce the problem.
The image is called ocarnal/test-traefik-streaming.
The image is built on golang 1.6.3 with a few extras (the Python Flask framework, vim, glide).

Everything needed for testing is in the /oca folder.
traefik is built from /go/src/github.com/containous/traefik (cloned from master).
Before building, I modified these files (from the source available here containous/oxy@3a329f9):
/go/src/github.com/containous/traefik/vendor/github.com/vulcand/oxy/forward/fwd.go
/go/src/github.com/containous/traefik/vendor/github.com/vulcand/oxy/utils/netutils.go
/go/src/github.com/containous/traefik/vendor/github.com/vulcand/oxy/utils/netutils_test.go
I also added a "NO FLUSH" trace in this file:
/go/src/github.com/containous/traefik/vendor/github.com/vulcand/oxy/forward/responseflusher.go

Now here are the steps to reproduce the problem.

Let's first start the container
docker run -d -it ocarnal/test-traefik-streaming
I got id f167dc33a2cd

You will need a few shells to start all the processes, so I use docker exec -it f167dc33a2cd /bin/bash to start them.
Here we go:

Step 1: start the python web app that will do the streaming
docker@oca:~$ docker exec -it f167dc33a2cd /bin/bash
root@f167dc33a2cd:/go# cd /oca
root@f167dc33a2cd:/oca# python server.py   (just look at server.py to see the HTTP routes)

Step 2: starting traefik to redirect 80 to 8085
docker@oca:~$ docker exec -it f167dc33a2cd /bin/bash
root@f167dc33a2cd:/go# cd /oca
root@f167dc33a2cd:/oca# ./traefik -d -c traefik.toml
INFO[2016-08-02T09:15:39Z] Traefik version dev built on I don't remember exactly
INFO[2016-08-02T09:15:39Z] Using TOML configuration file /oca/traefik.toml
...
...

Step 3: testing

docker@oca:~$ docker exec -it f167dc33a2cd /bin/bash
root@f167dc33a2cd:/go# curl localhost:8085/hello
Hello

root@f167dc33a2cd:/go# curl localhost:8085/numbers | head -n 10
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 1254 0 1254 0 0 9200 0 --:--:-- --:--:-- --:--:-- 91530
1
2
3
4
5
6
7
8
9
curl: (23) Failed writing body (2 != 5)
root@f167dc33a2cd:/go#

OK, without using a proxy, we get our numbers.
NOTE that this kills the python server, so in the shell you used for step 1, restart it with "python server.py".

Now try with traefik as a reverse proxy:
root@f167dc33a2cd:/go# curl localhost/numbers | head -n 10
This will never complete...
Have a look at the step 1 shell: you will see the python server printing lots of numbers, so just ctrl-c to stop it.
Also, in the traefik shell you should see "NO FLUSH", the trace that I added in /go/src/github.com/containous/traefik/vendor/github.com/vulcand/oxy/forward/responseflusher.go.

Now if you:

  • restart the python server again
  • kill the traefik process
  • start /oca/oxy-test instead (a small standalone oxy proxy, see the source in /go/src/oxy-test/)
  • and curl localhost/numbers | head -n 10

you should see the numbers...

@emilevauge (Member)

@git-oca Could you try using the Docker image containous/traefik:pr-584?


git-oca commented Aug 2, 2016

I copied the traefik binary from "containous/traefik:pr-584" into my container, ran the test again, and got the same result.

root@79ee253dd695:/oca# ./traefik version
Version: 2c41176
Codename: reblochon
Go version: go1.6.2
Built: 2016-08-02_10:39:47AM
OS/Arch: linux/amd64


git-oca commented Aug 3, 2016

I just put a few traces in each traefik middleware's ServeHTTP function, like this:

func (retry *Retry) ServeHTTP(rw http.ResponseWriter, r *http.Request) {
    _, ok := rw.(http.Flusher)
    fmt.Println("retry.go, has flusher : ", ok)
    ...
}

and here is what I got:

handlerSwitcher.go, has flusher : true
retry.go, has flusher : true
saveBackend.go, has flusher : false

So it seems that the ResponseWriter can no longer be asserted to http.Flusher after the "retry" middleware...
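That matches how interface type assertions work in Go: as soon as one middleware wraps the ResponseWriter in a type that declares no Flush method, everything downstream loses the http.Flusher capability, even though the original writer still has it. A tiny standalone illustration (plainWrapper is a made-up type, not traefik's recorder):

package main

import (
    "fmt"
    "net/http"
    "net/http/httptest"
)

// plainWrapper wraps a ResponseWriter but declares no Flush method.
// The embedded field is the http.ResponseWriter interface, so only
// Header/Write/WriteHeader are promoted; Flush is not.
type plainWrapper struct {
    http.ResponseWriter
}

func main() {
    var inner http.ResponseWriter = httptest.NewRecorder() // has a Flush method

    _, ok := inner.(http.Flusher)
    fmt.Println("inner implements http.Flusher :", ok) // true

    var wrapped http.ResponseWriter = plainWrapper{ResponseWriter: inner}
    _, ok = wrapped.(http.Flusher)
    fmt.Println("wrapped implements http.Flusher :", ok) // false
}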


git-oca commented Aug 3, 2016

Very dirty, but just for a quick test: in retry.go, I replaced
retry.next.ServeHTTP(recorder, r) with retry.next.ServeHTTP(rw, r)
and that change makes the streaming work... (but I get this in the log: "2016/08/03 12:25:21 server.go:2161: http: multiple response.WriteHeader calls")


emilevauge commented Aug 3, 2016

@git-oca: ouch, I think I got it... The ResponseRecorder doesn't implement http.Flusher.
I'll create a PR with the fix.
Thanks a lot for investigating on this 👍

@emilevauge (Member)

@git-oca: PR made #592 ;)


git-oca commented Aug 3, 2016

Just tested that, but adding a Flush method seems not to be enough...
I tried adding this:

func (rw *ResponseRecorder) Flush() {
    if fl, ok := rw.responseWriter.(http.Flusher); ok {
        fl.Flush()
    }
}

but still no answer... so I think we need to change Write too, because it copies into a buffer, and for an infinite stream that will never end...
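In other words, for an unbounded stream the recorder cannot just accumulate the body in memory and replay it later; Write has to pass the data through to the real writer (and flush) as it arrives. A rough sketch of that idea, with hypothetical names rather than traefik's actual ResponseRecorder:

package sketch

import "net/http"

// passThroughRecorder remembers the status code but writes the body straight
// through to the real ResponseWriter, flushing after every chunk so a
// never-ending body still reaches the client incrementally.
// Hypothetical illustration, not the code from traefik or PR #592.
type passThroughRecorder struct {
    rw   http.ResponseWriter
    code int
}

func (r *passThroughRecorder) Header() http.Header { return r.rw.Header() }

func (r *passThroughRecorder) WriteHeader(code int) {
    r.code = code
    r.rw.WriteHeader(code)
}

func (r *passThroughRecorder) Write(p []byte) (int, error) {
    n, err := r.rw.Write(p) // forward immediately instead of buffering
    r.Flush()
    return n, err
}

func (r *passThroughRecorder) Flush() {
    if f, ok := r.rw.(http.Flusher); ok {
        f.Flush()
    }
}

The obvious tension is that once bytes have gone to the client like this, the request can no longer be cleanly retried, which is roughly what the rest of this thread ends up working through.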


git-oca commented Aug 3, 2016

but we are close ;)


git-oca commented Aug 3, 2016

Just tested PR #592, but it's not enough.


emilevauge commented Aug 3, 2016

I added a call to Flush() in ResponseRecorder.Write(), but it's kind of hacky...
Edit: still doesn't work... I need to investigate another solution for retry.


git-oca commented Aug 3, 2016

defer rw.Flush() never gets called,
since rw.Body.Write(buf) is working on an infinite stream... Not sure what the best way is here... maybe something like in oxy/fwd.go using io.Copy?

@emilevauge (Member)

In fact defer rw.Flush() is not needed; ResponseRecorder.Flush() is being called.
The issue is in Retry.ServeHTTP(), which doesn't copy the response back from the ResponseRecorder.
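For anyone following along, the missing step being described is: after the wrapped handler has written into the recorder, the retry handler has to replay the recorded status, headers and body onto the real ResponseWriter. A minimal sketch of that copy-back step, using httptest.ResponseRecorder for the recorder shape (hypothetical helper, not the code from PR #592):

package sketch

import (
    "net/http"
    "net/http/httptest"
)

// copyBack replays what a recorder captured onto the real writer.
// Hypothetical helper, not traefik's actual retry code.
func copyBack(w http.ResponseWriter, rec *httptest.ResponseRecorder) {
    for key, values := range rec.HeaderMap {
        for _, v := range values {
            w.Header().Add(key, v)
        }
    }
    w.WriteHeader(rec.Code)
    w.Write(rec.Body.Bytes())
    if f, ok := w.(http.Flusher); ok {
        f.Flush()
    }
}

Replaying a fully buffered body still cannot work for a never-ending stream, of course, which is presumably why the version of the PR that finally works also lets streaming responses reach the client while they are being written.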


git-oca commented Aug 3, 2016

OK, got it, ResponseRecorder.Flush() is actually called...


git-oca commented Aug 3, 2016

Funny way to learn Go, BTW ;)

@emilevauge (Member)

@git-oca I just updated the PR, if you want to test :)


git-oca commented Aug 4, 2016

@emilevauge 💯 That's it! Streaming is working fine with this version! Thank you very much!

I just noticed this log line after the request completed, but it doesn't seem to cause trouble:

2016/08/04 06:15:15 server.go:2161: http: multiple response.WriteHeader calls

Also, something I don't really care about but noticed while looking at the code, just to let you know: in server.go there is an import of "github.com/codegangsta/negroni" that should now be "github.com/urfave/negroni", according to what they say on the project's main page:

Notice: This is the library formerly known as github.com/codegangsta/negroni -- Github will automatically redirect requests to this repository, but we recommend updating your references for clarity.

So thanks again! Traefik is now replacing haproxy in our mesos cluster here in the localization team at Autodesk.


emilevauge commented Aug 4, 2016

I just noticed this log line after the request completed, but it doesn't seem to cause trouble:

2016/08/04 06:15:15 server.go:2161: http: multiple response.WriteHeader calls

Yeah, I know, but it will be difficult to avoid here, as response.WriteHeader is called during Write in the standard library :'(
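For context, this warning comes from documented net/http behaviour: the first call to Write implicitly calls WriteHeader(http.StatusOK) if no status has been written yet, so a later explicit WriteHeader triggers the log line. A tiny standalone handler that reproduces it (nothing to do with traefik's code):

package main

import (
    "log"
    "net/http"
)

func main() {
    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        // No status has been set yet, so this Write implicitly calls
        // WriteHeader(http.StatusOK) first.
        w.Write([]byte("hello\n"))
        // This second, explicit call is what makes the server log
        // "http: multiple response.WriteHeader calls" on the Go versions
        // used in this thread (newer Go versions word it slightly differently).
        w.WriteHeader(http.StatusOK)
    })
    log.Fatal(http.ListenAndServe(":8090", nil))
}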

Also, something I don't really care about but noticed while looking at the code... in server.go there is an import of "github.com/codegangsta/negroni" that should now be "github.com/urfave/negroni", according to what they say on the project's main page:

Notice: This is the library formerly known as github.com/codegangsta/negroni -- Github will automatically redirect requests to this repository, but we recommend updating your references for clarity.

I will change that later, not in a bug fix release ;)
