
State of the build (io.js) April 2015 #77

Closed
rvagg opened this issue Apr 15, 2015 · 11 comments

@rvagg
Member

rvagg commented Apr 15, 2015

State of the build (io.js) April 2015

This is a summary of activity and resources within the io.js Build WG. I'm doing this to present at the upcoming WG meeting, but also to shine a bit of light on things that are mostly in my head. Some of this information could go on the README or other documentation for the project. I'd like to update this information each month so we can see how it evolves over time. Summarising in this way surfaces a few TODO items that we need to tackle as a group and should also show us where our priorities should lie going forward.

Build cluster servers

DigitalOcean

We have a fairly open account with DigitalOcean and this is where we do all of our non-ARM Linux computing. We also run https://iojs.org/ from here.

  • 2 x 16G instances for iojs-build-containers, used for running untrusted builds (3 x Ubuntu container types and 2 x Debian container types)
  • 6 x 4G instances for Ubuntu: 10.04 32-bit, 10.04 64-bit, 12.04 64-bit, 14.04 32-bit, 14.04 64-bit, 14.10 64-bit
  • 4 x 4G instances for CentOS: v5 32-bit, v5 64-bit, v6 64-bit, v7 64-bit
  • 2 x 4G instances for CentOS for release builds: v5 32-bit, v5 64-bit

Currently myself, @wblankenship and now @jbergstroem have access to all of these machines.

Rackspace

We have a somewhat open account with Rackspace and have @kenperkins on the team who is able to give us more resources if we need them.

  • 2 x 30 GB Compute v1 instances for Windows Server 2008 R2 SP1
  • 2 x 30 GB Compute v1 instances for Windows Server 2012 R2
  • 1 x 30 GB Compute v1 instance for Windows Server 2012 R2 with Visual Studio 2015, not currently running in the general CI group
  • 2 x 30 GB Compute v1 instances for Windows Server 2008 R2 SP1 release builds: 32-bit and 64-bit

Currently I'm the only one with the Administrator passwords for these boxes; I need to identify someone else on the build team who is competent on Windows so we can reduce our bus-factor here. The release build machines contain signing keys, so I'd like to keep access somewhat restricted and will likely share it with @wblankenship, who is also at NodeSource.

Voxer

Voxer have a primary interest in FreeBSD support in io.js for their own use, which is where the FreeBSD machines come in. Those machines are very fast because they are not virtualised at all. The FreeBSD machines are behind the Voxer VPN and the Mac Mini servers will be soon.

  • 1 x FreeBSD 10.1-RC3 32-bit jail
  • 1 x FreeBSD 10.1-RC3 64-bit jail
  • 2 x 2015 Mac Mini servers running virtual machines, each with:
    • 1 x OS X 10.10 for test builds
    • 1 x OS X 10.10 for release builds - one server creates .pkg files, the other creates the source tarball and the darwin tarball

Currently @jbergstroem and I have VPN access into the Voxer network to connect to the FreeBSD machines. Only I have access to the Mac Mini servers, but I need to get @wblankenship on to them as well at some point. The release VMs contain our signing keys, so I'll need to keep access somewhat restricted.

Joyent

Joyent have provided two zones for test builds; they are multiarch and we are using them to do both 64-bit and 32-bit builds.

  • 1 x 8G High CPU zone with 8 vCPUs for SmartOS 64-bit tests
  • 1 x 8G High CPU zone with 8 vCPUs for SmartOS 32-bit tests

Currently myself, @geek and @jbergstroem have access to these machines.

Scaleway

Scaleway, formerly Online Labs, have provided us with a 5-server account on their ARMv7 cluster. We are using them to run plain Debian Wheezy (armhf) on ARMv7 but could potentially be running other OS combinations as well. The ARMv7 release binaries will eventually come from here as Wheezy represents the oldest libc I think we're likely to want to support on ARM.

  • 2 x ARMv7 Marvell Armada 370/XP running Debian Wheezy (armhf)
  • 1 x ARMv7 Marvell Armada 370/XP running Debian Wheezy (armhf) for release builds (yet to take over from the existing ARMv7 machine creating release builds)

Currently only I have access to these machines but I should share access with someone else from the build team.

Linaro

Linaro exists to help open source projects prepare for ARM support. We are being supported by ARM Holdings in this as they have an interest in seeing ARMv8/AArch64 support improved (we now have working ARMv8 builds!). Our access is on a monthly renewal basis, so I just need to keep requesting an extension each month.

  • 1 x ARMv8 / AArch64 APM X-Gene Mustang running Ubuntu 14.04

Currently only I have access; it's via an SSH jump-host, so it's a little awkward to just give others access. I haven't asked about getting other keys into that arrangement but it would likely be OK. An interim measure is to create an SSH tunnel to this server, which I have done previously for io.js team members needing to test & debug their work.
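A sketch of one way such a tunnel can be set up (the hostnames, ports and user names here are placeholders, not the real Linaro details):

```sh
# forward a local port through the jump-host to the ARMv8 machine's SSH port
ssh -f -N -L 2222:armv8-mustang.internal:22 builduser@linaro-jumphost.example.org

# a collaborator on this machine can then reach the ARMv8 box via the forwarded port
ssh -p 2222 iojs@localhost
```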

I'm still investigating further ARMv8 hardware so we can expand our testing but low cost hardware is hard to get hold of at the moment and I'd really like to find a corporate partner that we can work with on this (WIP).

NodeSource

The rest of the io.js ARM cluster is running in my office and consists of hardware donated by community members and NodeSource. I'm still looking for further donations here because the more the better, particularly for the slow hardware. Not included in this list is a BeagleBone Black donated by @julianduque that I haven't managed to hook up yet, but will do because of the interesting OS combination it comes with (and also its popularity amongst NodeBots users).

  • 2 x Raspberry Pi v1 running Raspbian Wheezy
  • 1 x Raspberry Pi v1 Plus running Raspbian Wheezy
  • 1 x Raspberry Pi v2 running Raspbian Wheezy
  • 1 x ARMv7 ODROID-XU3 / Samsung Exynos4412 Prime Cortex-A9 (big.LITTLE) running ODROID Ubuntu 14.04 for both test and release builds under different user accounts (currently creating ARMv7 binaries but this needs to be switched to the Debian Wheezy machine from Scaleway).

Currently only I have access to these machines but have given SSH tunnel access to io.js team members in the past for one-off test/debug situations.

iojs.org

We are only running a single Ubuntu 14.04 4G instance on DigitalOcean for the website; it holds all of the release builds too. The web assets are served via nginx, with http redirected to https using a certificate provided by @indutny.

Only @wblankenship, @indutny, @kenperkins and I have full access to this machine, and I'd like to keep that fairly restricted because of the security implications for the builds.

All of the release build servers in the CI cluster have access to the staging user on the server in order to upload their build artifacts. A job in crontab promotes nightly builds to the appropriate dist directory to be publicly accessible.
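To illustrate that promotion step (the schedule, paths and script name below are assumptions, not the actual crontab on the server):

```sh
#!/bin/sh
# promote-nightlies.sh (sketch): move completed nightly uploads from the
# staging area into the publicly served dist tree. Run from the staging
# user's crontab, e.g.:
#   */30 * * * * /home/staging/promote-nightlies.sh >> /home/staging/promote.log 2>&1
for dir in /home/staging/nightlies/*/; do
  [ -e "${dir}SHASUMS256.txt" ] || continue   # only promote completed uploads
  mv "$dir" /home/dist/iojs/nightlies/
done
```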

The 3 individuals authorised to create io.js releases (listed on the io.js README) have access to the dist user on the server in order to promote release builds from staging to the dist directory where they become publicly accessible. Release builds also have their SHASUMS256.txt files signed by the releasers.
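For releases, the checksum/signing step amounts to something like the following sketch (the exact commands and key configuration used by releasers may differ):

```sh
# from inside a release's staged directory: checksum every artifact, then
# clearsign the checksum file with the releaser's GPG key
sha256sum * > SHASUMS256.txt                         # `shasum -a 256` on OS X
gpg --clearsign --digest-algo SHA256 SHASUMS256.txt  # writes SHASUMS256.txt.asc
```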

The iojs/website team only have access to the iojs user via a GitHub webhook. The webhook responds to commits on master of their repo and performs an install and build of their code in an unprivileged account within a Docker container. A successful build results in the promotion of the website code to the public directory. A new release will also trigger a website rebuild via a job in crontab that checks the index.tab file's last update date.
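In rough terms that flow looks like the following (the image, paths and build commands are placeholders; the real webhook handler may differ):

```sh
# 1. webhook fires on a push to master of the website repo; fetch the code
git clone --depth 1 https://github.com/iojs/website.git /home/iojs/website-src

# 2. install and build inside an unprivileged Docker container
docker run --rm -u nobody -v /home/iojs/website-src:/src -w /src node:0.12 \
  sh -c 'npm install && npm run build'

# 3. on success, promote the built assets to the public directory
rsync -a --delete /home/iojs/website-src/public/ /var/www/iojs.org/
```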

This week I upgraded this machine to a 60G from a 30G because we filled up the disk with nightly, next-nightly and release builds. We'll need to come up with a scalable solution to this in the medium-term.

Jenkins

Jenkins is run on an 80G instance on DigitalOcean with Ubuntu 14.04. It's using the NodeSource wildcard SSL cert so I need to restrict access to this machine. It no longer does any slave work itself but is simply coordinating the cluster of build slaves listed above.

Automation

We now have automation of nightly and next-nightly builds via a crontab job running a node program that checks at the end of each day (UTC) whether one should be created, and triggers a build via Jenkins if it needs to.
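Mechanically this is just cron plus a parameterised Jenkins job, roughly like the following (the job name, token and parameters are illustrative, not the real configuration):

```sh
# crontab: run the check shortly after UTC midnight
#   5 0 * * * /usr/bin/node /home/iojs/tools/nightly-check.js >> /home/iojs/nightly.log 2>&1

# when the node program decides a nightly is needed, triggering Jenkins boils
# down to a remote request against a parameterised job:
curl -X POST "https://jenkins-iojs.nodesource.com/job/iojs+release+nightly/buildWithParameters?token=$TRIGGER_TOKEN&datestring=$(date -u +%Y%m%d)"
```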

We also have the beginnings of automation for PR testing for io.js. I've yet to publish the source I have for this, but it currently triggers either a full test run or a containerised test run depending on whether you are in the iojs/Collaborators team or not. New PRs and any updates to commits on PRs will trigger new test runs. Currently there is no reporting of activity back to the PRs, so you have to know this is happening and know where to look to see your test run. This is a work in progress, but at least there's progress.
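The trigger logic is essentially the following (pseudocode: the team id, job names and parameters here are made up for illustration, and the real source isn't published yet):

```sh
# decide which Jenkins job to run based on the PR author's team membership
if curl -sf -H "Authorization: token $GITHUB_TOKEN" \
     "https://api.github.com/teams/$COLLABORATORS_TEAM_ID/memberships/$PR_AUTHOR" \
     | grep -q '"state": *"active"'; then
  JOB="iojs+pr+full"           # trusted: full test run on the real build slaves
else
  JOB="iojs+pr+container"      # untrusted: containerised test run
fi
curl -X POST "https://jenkins-iojs.nodesource.com/job/$JOB/buildWithParameters?token=$TRIGGER_TOKEN&pr=$PR_NUMBER"
```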

Scripted setup

  • All of the non-ARM Linux server setups for build/release machines are written in Ansible scripts in the iojs/build repo.
  • The FreeBSD and SmartOS server setups are also Ansibilised in the iojs/build repo (I'm assuming that what's there works; I believe these were both contributed by @jbergstroem and perhaps @geek too).
  • The Windows setup procedure is documented in the iojs/build repo (not scripted).
  • The ARMv7 and Raspberry Pi server setups have been Ansiblised but not merged into iojs/build yet.
  • The iojs.org server setup is in the process of having its Ansible scripts updated to match the reality of the server, work in progress by @kenperkins [WIP] Cleaning up the www setup to match production #54.
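For reference, bringing up one of the scripted machines is roughly as follows (the repo layout, inventory and host name here are illustrative; check the iojs/build docs for the real invocation):

```sh
git clone https://github.com/iojs/build.git
cd build/setup/ubuntu14.04            # hypothetical per-platform directory
ansible-playbook -i ansible-inventory ansible-playbook.yaml --limit new-test-host
```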

Activity summary

  • Our main io.js test job in Jenkins has performed ~511 build & test cycles and thanks to the hard work of io.js collaborators the tests are almost all passing across platforms with the exception of some Jenkins-specific timeouts on Windows builds.
  • Our main libuv test job in Jenkins has performed ~84 build & test cycles. The libuv team has a bit of work to do on their test suite across platforms before this will be as useful to them.
  • We have built and are serving:
    • 19 releases
    • 71 nightlies
  • We are now building and serving binaries for:
    • Linux ARMv6, ARMv7, x64, x86 all as both .tar.gz and .tar.xz
    • OS X as 64-bit .tar.gz and as .pkg installer
    • Windows x86 and x64, both as plain .exe files and as .msi installers
  • According to my (hacky, and potentially dodgy) log-scraping shell scripts (a rough sketch of the approach follows the charts below):
    • We've had ~1.5M downloads of io.js binaries from the website since 1.0.0
    • Our peak was 146,000 downloads on the 20th of March

[chart: iojs_downloads — io.js binary downloads from iojs.org over time]

  • We don't have Google Analytics (or similar) running on iojs.org (I think), but traffic trends can be deduced from the graph below, thanks to DigitalOcean:

[chart: iojs_traffic — DigitalOcean bandwidth graph for iojs.org]
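For the curious, the kind of log scraping involved is along these lines (the log location and filename patterns are assumptions; the real scripts are hackier and live on the www server):

```sh
# count successful binary downloads across rotated nginx access logs
zgrep -h 'GET /dist/' /var/log/nginx/access.log* \
  | awk '$9 == 200 && $7 ~ /\.(tar\.gz|tar\.xz|pkg|msi|exe)$/ { n++ } END { print n, "downloads" }'
```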

@jbergstroem
Member

Great summary, Rod. We should get the evangelists a summary somehow. Minor correction: if I recall correctly, both SmartOS zones are 64-bit but we pass arch=ia32 to one of them.

@Fishrock123
Contributor

> This week I upgraded this machine to a 60G from a 30G because we filled up the disk with nightly, next-nightly and release builds. We'll need to come up with a scalable solution to this in the medium-term.

Maybe we don't actually need to keep every nightly beyond a certain point?

@cjihrig
Contributor

cjihrig commented Apr 15, 2015

What about keeping the last 30 (or some other number) of nightlies and just tagging each nightly on GitHub? I'm not sure if you get anything by tagging nightlies except the historical aspect.
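A minimal sketch of that kind of retention policy (the directory layout is hypothetical, and in practice anything still referenced by a tag should probably be kept):

```sh
# keep only the 30 most recently modified nightly directories
cd /home/dist/iojs/nightlies || exit 1
ls -1dt */ | tail -n +31 | xargs -r rm -rf --
```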

@mikeal
Contributor

mikeal commented Apr 15, 2015

Wow, this is amazing. Thanks Rod!

@silverwind

Great writeup on the situation, appreciated.

@rosskukulinski

Awesome writeup @rvagg. I see that you've already linked to iojs/evangelism, so this should go out in this week's update.

@retrohacker

Amazing writeup @rvagg ❤️

Would like to take a look at setting up something like ganglia to monitor our servers and monit to monitor our service health. Thoughts?

@rvagg
Member Author

rvagg commented May 30, 2015

@wblankenship sure, I have no opinions on tooling here, they all seem equally bad and outdated (ganglia lives on sourceforge still, for instance; is it better than nagios?). If you want to put in the effort then go for it! We have a basic status page @ https://jenkins-iojs.nodesource.com/computer/ and I have a basic status tool @ https://github.com/nodejs/build/tree/master/tools/jenkins-status that I run manually occasionally, but having emails or push notifications for machines and/or slave processes that go down would be amazing, that's what I want.
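For example, a first cut at that kind of notification could just poll the Jenkins computer API from cron (the jq filter and mail destination below are assumptions, not something we run today):

```sh
# alert if any Jenkins slave is reported offline
OFFLINE=$(curl -s https://jenkins-iojs.nodesource.com/computer/api/json \
  | jq -r '.computer[] | select(.offline) | .displayName')
if [ -n "$OFFLINE" ]; then
  printf 'Offline Jenkins slaves:\n%s\n' "$OFFLINE" \
    | mail -s 'io.js build: slaves offline' build-team@example.org
fi
```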

@retrohacker

So what I'm thinking is that we use ganglia for trending and nagios for alerts. There are tools to integrate the two. Will start looking at deploying ganglia now.

@jbergstroem
Member

My personal needs for monitoring/health would only go as far as "is the vm/machine up?" and "is jenkins running? [potentially: is it running properly?]". Both are covered by the Jenkins status page.

@jbergstroem
Member

Closing -- nothing actionable here. We now have a monitoring script that alerts when slaves go down (or come back up).
