
High HTTP latency when serving files from 9pfs #1370

Open
pier-oliviert opened this issue Apr 14, 2017 · 55 comments
Labels
  • area/performance: Performance related issues
  • cause/go9p-limitation: Issues related to our go9p implementation
  • help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
  • kind/bug: Categorizes issue or PR as related to a bug.
  • lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.
  • priority/backlog: Higher priority than priority/awaiting-more-evidence.

Comments

@pier-oliviert

BUG REPORT

I have a website that I'm trying to run through minikube, and while everything works, loading a single page in my host browser takes upwards of 2 minutes. Connections between pods seem to be normal.

The problem might originate from VirtualBox, but I'm not sure. Here's the minikube VM's VirtualBox configuration:

Name:            minikube
Groups:          /
Guest OS:        Linux 2.6 / 3.x / 4.x (64-bit)
UUID:            834ea363-7d8c-4b5a-8139-6f74650b0f6b
Config file:     /Users/pothibo/.minikube/machines/minikube/minikube/minikube.vbox
Snapshot folder: /Users/pothibo/.minikube/machines/minikube/minikube/Snapshots
Log folder:      /Users/pothibo/.minikube/machines/minikube/minikube/Logs
Hardware UUID:   834ea363-7d8c-4b5a-8139-6f74650b0f6b
Memory size:     2048MB
Page Fusion:     off
VRAM size:       8MB
CPU exec cap:    100%
HPET:            on
Chipset:         piix3
Firmware:        BIOS
Number of CPUs:  2
PAE:             on
Long Mode:       on
Triple Fault Reset: off
APIC:            on
X2APIC:          off
CPUID Portability Level: 0
CPUID overrides: None
Boot menu mode:  disabled
Boot Device (1): DVD
Boot Device (2): DVD
Boot Device (3): HardDisk
Boot Device (4): Not Assigned
ACPI:            on
IOAPIC:          on
BIOS APIC mode:  APIC
Time offset:     0ms
RTC:             UTC
Hardw. virt.ext: on
Nested Paging:   on
Large Pages:     on
VT-x VPID:       on
VT-x unr. exec.: on
Paravirt. Provider: Default
Effective Paravirt. Provider: KVM
State:           running (since 2017-04-14T11:28:54.001000000)
Monitor count:   1
3D Acceleration: off
2D Video Acceleration: off
Teleporter Enabled: off
Teleporter Port: 0
Teleporter Address: 
Teleporter Password: 
Tracing Enabled: off
Allow Tracing to Access VM: off
Tracing Configuration: 
Autostart Enabled: off
Autostart Delay: 0
Default Frontend: 
Storage Controller Name (0):            SATA
Storage Controller Type (0):            IntelAhci
Storage Controller Instance Number (0): 0
Storage Controller Max Port Count (0):  30
Storage Controller Port Count (0):      30
Storage Controller Bootable (0):        on
SATA (0, 0): /Users/pothibo/.minikube/machines/minikube/boot2docker.iso (UUID: 87e7fc8f-6505-40fb-8787-4389906139a6)
SATA (1, 0): /Users/pothibo/.minikube/machines/minikube/disk.vmdk (UUID: 963fad28-7406-453c-855c-a434509c15f2)
NIC 1:           MAC: 0800270E1730, Attachment: NAT, Cable connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: deny, Bandwidth group: none
NIC 1 Settings:  MTU: 0, Socket (send: 64, receive: 64), TCP Window (send:64, receive: 64)
NIC 1 Rule(0):   name = ssh, protocol = tcp, host ip = 127.0.0.1, host port = 50010, guest ip = , guest port = 22
NIC 2:           MAC: 0800272BEC67, Attachment: Host-only Interface 'vboxnet0', Cable connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: deny, Bandwidth group: none
NIC 3:           disabled
NIC 4:           disabled
NIC 5:           disabled
NIC 6:           disabled
NIC 7:           disabled
NIC 8:           disabled
Pointing Device: PS/2 Mouse
Keyboard Device: PS/2 Keyboard
UART 1:          disabled
UART 2:          disabled
UART 3:          disabled
UART 4:          disabled
LPT 1:           disabled
LPT 2:           disabled
Audio:           enabled (Driver: CoreAudio, Controller: AC97, Codec: STAC9700)
Clipboard Mode:  disabled
Drag and drop Mode: disabled
Session name:    headless
Video mode:      720x400x0 at 0,0 enabled
VRDE:            disabled
USB:             disabled
EHCI:            disabled
XHCI:            disabled

USB Device Filters:

<none>

Available remote USB devices:

<none>

Currently Attached USB Devices:

<none>

Bandwidth groups:  <none>

Shared folders:  

Name: 'Users', Host path: '/Users' (machine mapping), writable

VRDE Connection:    not active
Clients so far:     0

Video capturing:    not active
Capture screens:    0
Capture file:       /Users/pothibo/.minikube/machines/minikube/minikube/minikube.webm
Capture dimensions: 1024x768
Capture rate:       512 kbps
Capture FPS:        25

Guest:

Configured memory balloon size:      0 MB
OS type:                             Linux26_64
Additions run level:                 2
Additions version:                   5.1.6 r110634


Guest Facilities:

Facility "VirtualBox Base Driver": active/running (last update: 2017/04/13 22:43:01 UTC)
Facility "VirtualBox System Service": active/running (last update: 2017/04/13 22:43:02 UTC)
Facility "Seamless Mode": not active (last update: 2017/04/13 22:43:01 UTC)
Facility "Graphics Mode": not active (last update: 2017/04/13 22:43:01 UTC)

I did change the NIC to use the paravirtualized network adapter, but the speed stayed the same.

I also tried #1353 but it didn't fix it for me. Here's a poorly representative screenshot of what's going on when I load the page and look at the network tab in Chrome:

[Screenshot: Chrome network tab while loading the page (screen shot 2017-04-12 at 09 05 46)]

minikube version: v0.18.0

Environment:

  • OS: 10.12.4 (16E195)
  • VM Driver: VirtualBox
  • ISO version: v0.18.0
  • Install tools:
  • Others:

What you expected to happen:
Getting the page to load in under 600ms would be acceptable.

How to reproduce it (as minimally and precisely as possible):

Start minikube with the VirtualBox driver, run a Rails server in it, and try to access it from the host. The page needs to have external assets to increase the number of connections going through minikube.

Anything else we need to know:

My setup might not be similar to what others do, and while unlikely, it could be the cause of all my problems. Here's a gist of my Dockerfile and k8s config file.

Notice how the image is "empty" and only loads the Gemfile; then, when the image runs in the pod, a volume from the host is mounted into the container. That allows me to develop on my host in the same folder as all my other projects while running everything through minikube.
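To give a rough idea (the names and paths below are placeholders, not my actual config), the pod spec looks something like this:

apiVersion: v1
kind: Pod
metadata:
  name: rails-dev
spec:
  containers:
    - name: web
      image: my-rails-image            # the "empty" image that only bakes in the Gemfile
      workingDir: /app
      volumeMounts:
        - name: source
          mountPath: /app              # code comes from the host, not from the image
  volumes:
    - name: source
      hostPath:
        path: /Users/me/code/my-project   # placeholder; host directory shared into the minikube VM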

Let me know if you need extra information, I'd be glad to help!

@r2d4 r2d4 added the kind/bug Categorizes issue or PR as related to a bug. label Apr 14, 2017
@pier-oliviert
Author

As a follow-up, I rebuilt minikube and all my images using the xhyve driver, and it now takes 6s to get to DOMContentLoaded, which is much, much better.

From the look of it, it might be a VM driver issue (VirtualBox). The unfortunate thing about all this is that I have a permissions issue with xhyve that prevents me from using it for development (xhyve has permission problems writing files back to the host through volumes).

So I'm very much interested in finding out what the issue is with the VirtualBox driver.

Also, I started moving our assets to webpack, which concatenates all our files into one even in development, and the page load went down from 2 minutes to milliseconds. My assumption is that some waterfall effect starts clogging the pipe when there are 20+ requests in a short amount of time.

My knowledge is very limited when it comes to Docker machines and virtualization, so my apologies for not being more helpful :(

@markacola

I am currently seeing this, even with webpack. My speed is ~5kb/sec, which makes it basically unusable. If there is any info I could provide that might help, just let me know!

@pier-oliviert pier-oliviert changed the title from "Extremely slow connection between minikube and host" to "Extremely slow connection between minikube and host on OS X" on May 12, 2017
@pier-oliviert
Author

The problem seems to lie outside of minikube. Docker uses osxfs, which is a custom filesystem that tries to bring native container capabilities to OS X. It works fine when communication happens between containers, but things fall apart when trying to communicate with the host.

From what I read, it's due to syncing the filesystem between the two. One way to fix it is to use an NFS server to serve files from host to guest, or rsync.

@r2d4
Contributor

r2d4 commented May 12, 2017

We don't actually use osxfs in minikube. The host folder mount is done with a 9p filesystem; this is how both the xhyve driver and the minikube mount command work. VirtualBox uses vboxsf, which is its own proprietary way to share files between the guest and the host.

If you find performance issues with vboxsf, you could try the minikube mount command, rsync, or NFS.
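For example (the paths here are just placeholders):

minikube mount /Users/me/my-project:/mnt/my-project

This command keeps running in the foreground and exposes the host folder inside the VM over 9p; a pod can then reference /mnt/my-project through a hostPath volume.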

@OrDuan

OrDuan commented May 18, 2017

Having the same issue, also using a 9p mount from minikube with VirtualBox. Ubuntu 16.04.

@pmlt

pmlt commented Jun 30, 2017

My guess is that this is caused by poor I/O performance on the guest side of the 9p mount. As a simple test, I measured the time to extract an 80MB tarball containing lots of small files. Here are my findings:

On the host machine, in the directory mounted via 9p: 0.09s
On the guest machine, outside the directory mounted via 9p: 0.140s
On the guest machine, inside the directory mounted via 9p: 22.094s (!)

Running on: Arch Linux, minikube 0.20, VirtualBox 5.1.22

It seems that the 9p mount is really unsuited for any kind of real-life, non-trivial workload. Any I/O-heavy build step run from the guest takes forever to finish (for example: webpack, npm install, composer install).
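For reference, the comparison above boils down to something like the following, run from minikube ssh (the tarball name and paths are made up):

mkdir -p /tmp/extract-test /mnt/project/extract-test
time tar xzf assets.tar.gz -C /tmp/extract-test           # local guest filesystem
time tar xzf assets.tar.gz -C /mnt/project/extract-test   # directory behind the 9p mount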

@huguesalary
Contributor

I'm running into this issue as well.

I had extremely poor performance with my PHP app too (HTTP requests taking anywhere between 1 and 2 minutes to get a response).

My code was hosted on my Mac, mounted into an xhyve virtual machine via minikube mount, and finally mounted inside my Kubernetes pod via a hostPath volume mount.

As a test, I copied my source code directly inside the pod's container instead of serving it through 9p.

I am now getting responses in 800ms.

Unfortunately, since this is for local development, I need to see my changes immediately, and using a full copy of my source code instead of serving it through the network is not an option.

I'm going to set up an NFS mount and see how its performance compares to 9p. However my test goes, in its current state 9p on minikube is definitely way too slow for any workload.

@watzo

watzo commented Oct 7, 2017

@huguesalary: I'm very curious to hear how the NFS mount performance compared to 9p.

Thanks!
Walco

@pier-oliviert
Author

I just retried this week with NFS, and while things are better, it's still taking ~20 seconds to load a page due to I/O constraints.

@heygambo

How can I set up an NFS mount with minikube?

@watzo

watzo commented Oct 25, 2017

@pothibo thanks for the write-up. So it's still unusable, unfortunately.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 23, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 22, 2018
@pier-oliviert
Author

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Feb 22, 2018
@bit0rez

bit0rez commented Feb 28, 2018

Hi guys!
You can use NFS as a PersistentVolume for a pod in k8s. It works.
You need to set up an NFS server on your host machine and just add PV and PVC configurations to your project.

My configuration:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-project-volume
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 192.168.99.1
    path: "/var/projects/my-project"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-project-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: 1Gi

In the deployment configuration, you then add a volume that references the PersistentVolumeClaim and mount it into the container, as sketched below.
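A minimal sketch of that part (the container name, image, and mount path are placeholders; the claim name matches the manifests above):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-project
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-project
  template:
    metadata:
      labels:
        app: my-project
    spec:
      containers:
        - name: app
          image: my-project:dev               # placeholder image
          volumeMounts:
            - name: source
              mountPath: /var/www/my-project  # where the code should appear in the container
      volumes:
        - name: source
          persistentVolumeClaim:
            claimName: my-project-claim       # the PVC defined above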

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 29, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 28, 2018
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@jpswade

jpswade commented Aug 8, 2018

/reopen

@k8s-ci-robot
Contributor

@jpswade: you can't re-open an issue/PR unless you authored it or you are assigned to it.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 20, 2020
@YongBig

YongBig commented Aug 21, 2020

This is still an issue.

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Aug 21, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 19, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 19, 2020
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@YongBig

YongBig commented Jan 18, 2021

This is still an issue.

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 18, 2021
@YongBig

YongBig commented Jan 18, 2021

/reopen

@k8s-ci-robot
Contributor

@du86796922: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@pier-oliviert
Author

/reopen

@k8s-ci-robot k8s-ci-robot reopened this Jan 18, 2021
@k8s-ci-robot
Contributor

@pothibo: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 18, 2021
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 18, 2021
@sharifelgamal sharifelgamal added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels May 19, 2021
@sharifelgamal
Collaborator

I'm going to go ahead and freeze this issue so it doesn't keep getting closed. That said, this is an issue with 9p itself and can't really be worked around unless we end up replacing it.

@spowelljr spowelljr added priority/backlog Higher priority than priority/awaiting-more-evidence. and removed priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. labels Sep 15, 2021