We have been accepted as a mentoring organisation for Google Summer of Code 2018.
The list of official project ideas is published below.
You can also take a look at the list of project ideas published for GSoC 2017.
Add your ideas to the README list below.
Please visit the Kubernetes GSoC page for general information.
- Description: CustomResourceDefinitions (CRDs) often reuse and embed Kubernetes types, e.g. a PodSpec. Today a CRD has to copy the whole OpenAPI spec for that type, and for types like PodSpec these specs can be huge. The idea here is to define a Kubernetes OpenAPI universe that lets a CRD point to the spec of other types in the universe, e.g. the mentioned PodSpec. This project includes the conceptual work of defining such a universe with the current, published OpenAPI spec in mind, coming up with a design for how and when to resolve the new references, writing a proposal for all of that and, of course, coding the feature within the apiextensions-apiserver.
- Recommended Skills: Golang, OpenAPI
- Mentor(s): Dr. Stefan Schimanski (@sttts), Mehdy Bohlool (@mbohlool)
- Issue: kubernetes/kubernetes#54579
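To make the idea of references into an OpenAPI universe more concrete, here is a minimal sketch in Go. The `Schema` and `Universe` types and the `Resolve` function are hypothetical simplifications invented for illustration. They are not the types used by the apiextensions-apiserver, and the real design (including when references get resolved) is part of the project itself.

```go
package main

import "fmt"

// Schema is a heavily simplified, hypothetical stand-in for an OpenAPI schema.
type Schema struct {
	Ref        string // name of a type in the universe; empty if inline
	Type       string
	Properties map[string]*Schema
}

// Universe maps type names to their published schemas (an assumption of this sketch).
type Universe map[string]*Schema

// Resolve returns a copy of s in which every reference has been replaced by
// the referenced schema. It fails on unknown or cyclic references.
// seen tracks the references currently being expanded.
func Resolve(s *Schema, u Universe, seen map[string]bool) (*Schema, error) {
	if s == nil {
		return nil, nil
	}
	if s.Ref != "" {
		if seen[s.Ref] {
			return nil, fmt.Errorf("cyclic reference to %q", s.Ref)
		}
		target, ok := u[s.Ref]
		if !ok {
			return nil, fmt.Errorf("unknown reference %q", s.Ref)
		}
		seen[s.Ref] = true
		defer delete(seen, s.Ref)
		return Resolve(target, u, seen)
	}
	out := &Schema{Type: s.Type}
	if len(s.Properties) > 0 {
		out.Properties = make(map[string]*Schema, len(s.Properties))
		for name, p := range s.Properties {
			r, err := Resolve(p, u, seen)
			if err != nil {
				return nil, err
			}
			out.Properties[name] = r
		}
	}
	return out, nil
}

func main() {
	// A tiny universe with a one-field stand-in for PodSpec.
	u := Universe{
		"io.k8s.api.core.v1.PodSpec": {Type: "object", Properties: map[string]*Schema{
			"nodeName": {Type: "string"},
		}},
	}
	// A CRD schema that points at PodSpec instead of copying it.
	crd := &Schema{Type: "object", Properties: map[string]*Schema{
		"template": {Ref: "io.k8s.api.core.v1.PodSpec"},
	}}
	resolved, err := Resolve(crd, u, map[string]bool{})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(resolved.Properties["template"].Properties["nodeName"].Type)
}
```

This sketch resolves references eagerly. Resolving them lazily at validation time is an equally plausible design and one of the questions the proposal would need to answer.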
- Description: The client-go library is vendored by many third-party projects. We version it with semantic version numbers like 5.0.0, 5.0.1, 5.0.2, 6.0.0, and we increase the major version on each release. We don't have any tooling in place that warns about incompatible changes when changes are merged into the Kubernetes code base. This project is about changing this by developing a tool that is applied to each Kubernetes code change, looks at the changed Go code and checks whether it breaks any Go interfaces. This also involves defining which parts of client-go and other libraries are actual interfaces we promise to preserve. Finally, this project includes the integration into our test infrastructure.
- Recommended Skills: Golang, Go parser, Go language specification and semantics, CI
- Mentor(s): Dr. Stefan Schimanski (@sttts), Chao Xu (@caesarxuchao)
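As a rough illustration of what such a check could look like, the following sketch uses the standard `go/parser`, `go/ast` and `go/types` packages to compare the exported top-level functions of two versions of a source file. The function names `exportedFuncs` and `Breaks` are made up for this example. A real tool would also need to cover methods, types, interfaces, constants and whole packages, and would compare type-checked signatures rather than their textual form.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
	"sort"
)

// exportedFuncs parses one Go source file and returns the textual signature
// of every exported top-level function, keyed by function name.
func exportedFuncs(src string) (map[string]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "src.go", src, 0)
	if err != nil {
		return nil, err
	}
	sigs := make(map[string]string)
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Recv != nil || !fn.Name.IsExported() {
			continue // skip non-functions, methods and unexported functions
		}
		sigs[fn.Name.Name] = types.ExprString(fn.Type)
	}
	return sigs, nil
}

// Breaks lists exported functions that were removed or whose signature text
// changed between oldSrc and newSrc. Comparing text is a simplification:
// renaming a parameter is reported even though it is compatible.
func Breaks(oldSrc, newSrc string) ([]string, error) {
	oldSigs, err := exportedFuncs(oldSrc)
	if err != nil {
		return nil, err
	}
	newSigs, err := exportedFuncs(newSrc)
	if err != nil {
		return nil, err
	}
	var out []string
	for name, oldSig := range oldSigs {
		newSig, ok := newSigs[name]
		switch {
		case !ok:
			out = append(out, name+": removed")
		case newSig != oldSig:
			out = append(out, fmt.Sprintf("%s: %s -> %s", name, oldSig, newSig))
		}
	}
	sort.Strings(out)
	return out, nil
}

func main() {
	oldSrc := "package p\nfunc Get(name string) error { return nil }\nfunc List() {}\n"
	newSrc := "package p\nfunc Get(name string, force bool) error { return nil }\n"
	breaks, err := Breaks(oldSrc, newSrc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, b := range breaks {
		fmt.Println(b)
	}
}
```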
- Description: Investigate kube-arbitrator, adapt one well-known batch type of application running on Kubernetes (Apache Spark/TensorFlow/others) to use dynamic resource sharing, produce an example that simulates real-world use cases with shared multi-tenant clusters and do performance benchmarking. The final deliverables here are examples of one or more batch applications running on Kubernetes with dynamic resource sharing, and performance benchmarks.
- Recommended Skill(s): Golang
- Mentor(s): Klaus Ma (@k82cn), Anirudh Ramanathan (@foxish), @ynli
- Issue: kubernetes/enhancements#269
- Description: Strategic Merge Patches are a data format of the HTTP Patch operation against Kubernetes objects. They allow controlling how maps and slices are modified, either via tags in the patches or via tags on the Go types. CustomResources are not native Go types in Kubernetes, which breaks Strategic Merge Patch. This topic is about adding Strategic Merge Patch support for `runtime.Unstructured` and creating an API to define the default merge strategies for JSON paths within CustomResources.
- Recommended Skill(s): Golang, Algorithms
- Mentor(s): Dr. Stefan Schimanski (@sttts)
- Issue: kubernetes/kubernetes#56348 and kubernetes/kubernetes#58064
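To illustrate the kind of merge strategy meant here, the sketch below merges two lists of untyped objects by a merge key, similar in spirit to how container lists are merged on the `name` field. `MergeByKey` is a hypothetical helper written for this example. It is not the Kubernetes implementation and ignores directives such as deletions and ordering.

```go
package main

import "fmt"

// MergeByKey merges patch into orig the way a "merge" list strategy would:
// items are matched on the value of key, matching items are overlaid field
// by field, and unmatched patch items are appended. The input slices and
// their maps are not modified (nested values are shared, not deep-copied).
func MergeByKey(orig, patch []map[string]interface{}, key string) []map[string]interface{} {
	merged := make([]map[string]interface{}, 0, len(orig)+len(patch))
	index := make(map[string]int, len(orig)) // key value -> position in merged
	for _, item := range orig {
		cp := make(map[string]interface{}, len(item))
		for k, v := range item {
			cp[k] = v
		}
		index[fmt.Sprint(item[key])] = len(merged)
		merged = append(merged, cp)
	}
	for _, item := range patch {
		id := fmt.Sprint(item[key])
		pos, ok := index[id]
		if !ok {
			pos = len(merged)
			index[id] = pos
			merged = append(merged, make(map[string]interface{}, len(item)))
		}
		for k, v := range item {
			merged[pos][k] = v
		}
	}
	return merged
}

func main() {
	orig := []map[string]interface{}{
		{"name": "web", "image": "nginx:1.12"},
		{"name": "sidecar", "image": "proxy:1.0"},
	}
	patch := []map[string]interface{}{
		{"name": "sidecar", "image": "proxy:1.1"},
		{"name": "logger", "image": "fluentd:1.0"},
	}
	for _, c := range MergeByKey(orig, patch, "name") {
		fmt.Println(c["name"], c["image"])
	}
}
```

For CustomResources the merge key cannot come from Go struct tags, which is why the project proposes an API that declares such strategies per JSON path.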
- Description: In order to troubleshoot Kubernetes latencies it would be great to have tracing support. Metrics are great for alerting, to know if something is wrong, but by design are not meant to trace the execution of single requests. Tracing support would allow troubleshooting high latency requests in order to find performance improvement opportunities more easily and quickly. This project involves evaluating tracing solutions as well as implementing support for the Kubernetes API handlers.
- Recommended skills: Kubernetes, OpenTracing
- Mentor(s): Dr. Stefan Schimanski (@sttts), Frederic Branczyk (@brancz)
- Issue: kubernetes/kubernetes#26507
- Description: KataContainers is an OCI container runtime that leverages hypervisor-based isolation for the Linux container stack. cri-containerd is a Kubernetes CRI implementation for containerd, the core part of Docker. This topic aims at integrating KataContainers as an underlying runtime of containerd that serves the Kubernetes CRI. Users of Kubernetes will then be able to enjoy the security and multi-tenancy brought by KataContainers as well as the native Linux container experience brought by containerd.
[Update]: The preliminary design doc of this project can be found here.
- Recommended Skill(s): Golang, Linux Operating System
- Mentor(s): Harry Zhang (@resouer), Lantao Liu (@Random-Liu), Jiangshan Lai (@laijs)
Prometheus is an open-source systems monitoring and alerting toolkit: https://prometheus.io/
Prometheus ideas:
- Description: Currently the HTTP API sends all the data in one go, which can overwhelm Prometheus. To guard against this, there are limits set in Prometheus. Make the APIs streaming, so that data is retrieved and sent in chunks rather than all at once.
- Recommended Skills: golang
- Mentor(s): Frederic Branczyk (@brancz), Goutham V (@gouthamve)
- Issue: prometheus/prometheus#3690
- Description: Currently the HTTP API is not very well organized and needs some tidying up. The actual course of action is not decided yet, but go-kit looks like a good fit.
- Recommended Skills: golang
- Mentor(s): Krasi Georgiev (@krasi-georgiev)
- Issue: prometheus/prometheus#3416
- Description: Having something like PostgreSQL's log_min_duration_statement would be useful to debug performance problems. It would be great to collect detailed query information, like how many chunks were necessary to compute the result and how many had to be loaded from disk. Detailed stats like the number of series/samples/blocks touched would also be great.
- Recommended Skills: golang
- Mentor(s): Ben Kochie (@SuperQ), Goutham V (@gouthamve)
- Issue: prometheus/prometheus#1315
- Description: We're not really excellent in the correctness / bug-free-ness department yet, partially because certain key components either lack tests or you'd have to run them in a real-world scenario for a while to discover certain bugs. I'm especially looking at our under-tested service discovery mechanisms here that require e.g. a Zookeeper or Consul as a dependency to test them for real. It'd be cool to have a test environment that runs a Prometheus release end-to-end (with different SD mechanisms) for a while and checks the results (evaluated expressions, alerts, etc.) for sanity.
- Recommended Skills: golang, infrastructure management
- Mentor(s): Brian Brazil (@brian-brazil), Conor Broderick (@Conorbro)
- Issue: prometheus/prometheus#3689
- Description: One of the most annoying (but important) things when using Prometheus is making sure that your alerting expressions are correct (semantically, syntactically, and referring to existing template variables). It would be great to be able to assist the user with that. This would be both frontend and backend / tooling work.
- Recommended Skills: golang, javascript
- Mentor(s): Julius Volz (@juliusv)
- Issue: prometheus/prometheus#1154 prometheus/prometheus#1695 prometheus/prometheus#1220 prometheus/prometheus#1219
- Description: We can use OpenTracing to trace several things in both Prometheus and Alertmanager, for example Prometheus query tracing.
- Recommended Skills: golang
- Mentor(s): Frederic Branczyk (@brancz), Tom Wilkie (@tomwilkie)
- Description: Currently, if Prometheus restarts, we lose the 'for' state of firing alerts. While this isn't an issue for short 'for' clauses, it presents a problem for clauses that are in the hours-to-days range. It'd be good to persist this state in some way, so that alerts don't have to start again from scratch. We probably don't want to count the time the Prometheus server is down against the 'for' clause.
- Recommended Skills: golang
- Mentor(s): Goutham V (@gouthamve), Brian Brazil (@brian-brazil)
- Issue: prometheus/prometheus#422
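One possible policy for restoring the 'for' state is sketched below: persist the time at which each alert became active, and on startup shift that time forward by the length of the downtime, so that time spent offline does not count towards the 'for' clause. The `AlertState` type and the `Restore` and `Firing` functions are hypothetical and only illustrate the arithmetic. How and where the state is actually stored is left open.

```go
package main

import "fmt"

// AlertState is a hypothetical record persisted for each pending or firing alert.
type AlertState struct {
	Name     string
	ActiveAt int64 // Unix seconds at which the alert expression first became true
}

// Restore shifts every saved ActiveAt forward by the length of the downtime,
// so that time spent offline does not count towards the 'for' duration.
func Restore(saved []AlertState, shutdownAt, startupAt int64) []AlertState {
	downtime := startupAt - shutdownAt
	if downtime < 0 {
		downtime = 0 // guard against clock skew between shutdown and startup
	}
	restored := make([]AlertState, len(saved))
	for i, s := range saved {
		s.ActiveAt += downtime
		restored[i] = s
	}
	return restored
}

// Firing reports whether the alert has been active for at least forSeconds.
func Firing(s AlertState, now, forSeconds int64) bool {
	return now-s.ActiveAt >= forSeconds
}

func main() {
	// The alert had been pending for 3000 s when the server went down for 1000 s.
	saved := []AlertState{{Name: "DiskFull", ActiveAt: 1000}}
	restored := Restore(saved, 4000, 5000)
	fmt.Println(Firing(restored[0], 5000, 3600)) // right after startup
	fmt.Println(Firing(restored[0], 5600, 3600)) // 600 s later
}
```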
- Description: If a remote backend is erroring, then the whole query fails. We need to make sure that doesn't happen and at least partial data is returned.
- Recommended Skills: golang
- Mentor(s): Goutham V (@gouthamve), Brian Brazil (@brian-brazil), Tom Wilkie (@tomwilkie)
- Issue: prometheus/prometheus#2573 prometheus/prometheus#2972
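The desired behaviour can be sketched as follows: query every backend, keep the data from those that succeed, and report failures as warnings instead of failing the whole query. `Backend`, `ErrUnavailable` and `QueryAll` are hypothetical names used only in this example and do not correspond to the Prometheus remote read API.

```go
package main

import (
	"errors"
	"fmt"
)

// Backend is a hypothetical remote read endpoint that returns sample values.
type Backend struct {
	Name string
	Read func() ([]float64, error)
}

// ErrUnavailable is a placeholder error used by the example backends.
var ErrUnavailable = errors.New("backend unavailable")

// QueryAll reads from every backend and returns whatever data could be
// fetched, plus one warning per failed backend.
func QueryAll(backends []Backend) (samples []float64, warnings []error) {
	for _, b := range backends {
		data, err := b.Read()
		if err != nil {
			warnings = append(warnings, fmt.Errorf("%s: %w", b.Name, err))
			continue
		}
		samples = append(samples, data...)
	}
	return samples, warnings
}

func main() {
	backends := []Backend{
		{Name: "local", Read: func() ([]float64, error) { return []float64{1, 2}, nil }},
		{Name: "remote", Read: func() ([]float64, error) { return nil, ErrUnavailable }},
	}
	samples, warnings := QueryAll(backends)
	fmt.Println("samples:", samples)
	for _, w := range warnings {
		fmt.Println("warning:", w)
	}
}
```

Returning warnings alongside the data lets the caller decide whether partial results are acceptable for a given query.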
Alertmanager ideas:
- Description: While benchmarking and profiling Alertmanager it quickly became obvious that the ingest path blocks a lot and can be optimised.
- Recommended Skills: golang
- Mentor(s): Frederic Branczyk (@brancz)
- Issue: prometheus/alertmanager#1201
- Description: Currently there are various maps to improve access to certain alerts, but they are scattered across the code base and seemingly unrelated. We could improve this considerably by moving to something similar to the TSDB reverse index.
- Recommended Skills: golang
- Mentor(s): Frederic Branczyk (@brancz)
- Issue: prometheus/alertmanager#1202
TSDB ideas:
- Description: Currently the writes and reads are not isolated and sometimes during reads we see partial write data.
- Recommended Skills: golang, databases
- Mentor(s): Fabian Reinartz (@fabxc), Brian Brazil (@brian-brazil), Goutham V (@gouthamve)
- Issue: https://github.com/prometheus/tsdb/issues/260
- Description: We can have faster queries and new APIs if we can have a composite index for multiple index names rather than just one.
- Recommended Skills: golang, databases
- Mentor(s): Goutham V (@gouthamve), Fabian Reinartz (@fabxc)
- Issue: prometheus-junkyard/tsdb#26
Kubernetes + Prometheus ideas:
- Description: A cAdvisor replacement to collect container metrics. It would be dependent on the cri-o metrics endpoint.
- Recommended Skills: golang
- Mentor(s): Frederic Branczyk (@brancz)
- Description: Prometheus backed implementation of the resource metrics API in Kubernetes.
- Recommended Skills: golang
- Mentor(s): Frederic Branczyk (@brancz)
Misc. ideas:
- Description: A full-fledged Outalator fully integrated with the Alertmanager. https://landing.google.com/sre/book/chapters/tracking-outages.html
- Recommended Skills: golang, javascript
- Mentor(s): Frederic Branczyk (@brancz), Brian Brazil (@brian-brazil), Richard Hartmann (@RichiH)
- Description: Currently Envoy uses a buffer implementation based on libevent evbuffer. This implementation has a number of shortcomings. We would like to do a custom C++14 rewrite.
- Required skills: C/C++
- Mentor(s): Lyft networking team
- Description: Envoy is only now getting fuzz testing support. By this summer the coverage will be limited. We would love someone to come and increase the fuzz coverage.
- Required skills: C/C++
- Mentor(s): Lyft networking team and consulting with Google Envoy team.
CoreDNS is a DNS server that chains plugins: https://coredns.io
- Develop/extend the DNSSEC plugin to be able to exchange key material with the registrar, in essence implementing zero-touch DNSSEC.
- Required skills: DNSSEC, cryptography, Go
- Mentors: Miek Gieben.
- Develop a plugin that supports the etcd3 API. See also coredns/coredns#341
- Required skills: Go
- Mentors: John Belamaric
- This is more complicated than it sounds. When a primary zone changes, the secondary servers are notified. If CoreDNS is running as a set of autoscaling Pods in Kubernetes, only one of the CoreDNS instances will receive the NOTIFY message through the load balancer. That CoreDNS Pod needs to understand how to relay the message to the other CoreDNS Pods. This should be done without a direct reliance on the Kubernetes API, as it can be useful in non-Kubernetes deployments as well, so it is necessary to define a mechanism whereby CoreDNS instances may discover one another.
- Required skills: Go
- Mentors: John Belamaric
- Description: Currently CoreDNS supports the DNS Name Server Identifier (NSID) option, which allows a DNS server to identify itself. In a distributed system collisions may occur, so a mechanism is needed to allow a server to conditionally declare its identity. There are several ways to achieve this goal. One way is to ask a name server to wait until its predecessor has declared its identity (e.g., `server-1`) before assigning a non-conflicting identity to itself (e.g., `server-2`). Another way is to extract the identity from another source, e.g., the timestamp of the server on the cloud, or a lock from a K-V store like ZooKeeper or etcd.
- Recommended Skills: Golang
- Mentors: Yong Tang
Rook is an open source storage orchestrator for cloud-native environments: https://rook.io/
- Description: When contributing to a major open source project, it can be difficult to figure out where to start. Beyond learning what the project does, you need to learn the code base, how to build and test your code, how to get your changes approved, and more. The "help wanted" issues are scoped to a manageable effort for first-time contributors. A perfect place to jump in and contribute to an open source project!
- Issue Link: Rook Help Wanted
- Difficulty: Easy
- Recommended Skills: Golang, Kubernetes, Ceph, Linux
- Mentors: Jared Watts and Travis Nielsen
- Description: When a user creates a storage cluster, resource pool, or other object using `kubectl`, the current experience is not ideal. Since these operations can be time consuming and can also fail, the user would greatly benefit by having a richer experience such as early validation, progress updates, meaningful status and helpful messaging.
- Issue Link: Rook #1539
- Difficulty: Easy
- Recommended Skills: Golang, Kubernetes
- Mentors: Jared Watts and Travis Nielsen
- Description: We want to expand the set of storage backends that Rook can deploy and orchestrate in cloud-native environments. Network File System (NFS) would bring value to users by providing access to shared storage over the network. This project would require you to write Go code to interact with Linux OS and the Kubernetes API to deploy and configure containers running the NFS daemon.
- Issue Link: Rook #1551
- Difficulty: Medium
- Recommended Skills: Golang, Kubernetes, Linux
- Mentors: Jared Watts and Travis Nielsen
- Description: A user of a Kubernetes cluster with Rook should be able to create, manage and back up snapshots of their data that is managed by a Rook storage backend. Policies should be defined to help configure snapshot and backup schedules, included data, retention, etc. This is a challenging but exciting project that could tremendously help users by ensuring their valuable data is reliably protected and backed up. This project also has the potential to interact with and take a leadership position in the greater Kubernetes community.
- Issue Link: Rook #1552
- Difficulty: Hard
- Recommended Skills: Golang, Kubernetes, Ceph
- Mentors: Jared Watts and Travis Nielsen
A framework for securing software update systems - website and GitHub repo.
- Description: Generalize the mechanism of key rotation. Rotation is the process by which a role uses their old key to invalidate their old key and transfer trust in the old key to a new key. Performing a key rotation does not require parties that delegated trust to the old key to change their delegation to the new key. Conceptually, the rotation process says if you trusted key X (or threshold of keys X_0, ... X_n), now instead trust key Y (or threshold of keys Y_0, ... Y_n). Rotation of a key may be performed any number of times, transferring trust from X to Y, then from Y to Z, etc.
- Task Link: TAP 8
- Difficulty: Medium
- Recommended Skills: Python
- Mentors: Justin Cappos, Vladimir Diaz, and Sebastien Awwad
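The chained nature of rotation (X to Y, then Y to Z) can be illustrated with a small sketch. The reference implementation is in Python; Go is used here only to stay consistent with the other examples on this page. `CurrentKey` is a hypothetical function, signature verification of each rotation is omitted entirely, and rejecting cycles is an assumption of this sketch rather than the behaviour defined in TAP 8.

```go
package main

import (
	"errors"
	"fmt"
)

// CurrentKey follows rotations (old key ID -> new key ID) starting at the
// originally delegated key and returns the key that should be trusted now.
// In a real system every rotation entry would have to be signed by the old
// key; that check is left out here.
func CurrentKey(delegated string, rotations map[string]string) (string, error) {
	seen := map[string]bool{delegated: true}
	key := delegated
	for {
		next, ok := rotations[key]
		if !ok {
			return key, nil // no further rotation: this is the current key
		}
		if seen[next] {
			return "", errors.New("rotation cycle detected at key " + next)
		}
		seen[next] = true
		key = next
	}
}

func main() {
	rotations := map[string]string{"X": "Y", "Y": "Z"}
	key, err := CurrentKey("X", rotations)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("delegation to X now trusts:", key)
}
```

Note that the delegating party still refers to X. Only the client-side resolution changes, which is the property the description highlights.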
- Description: Allow a target/glob pattern to be delegated to a combination of roles on a repository, all of whom must sign the same hashes and length of the target. This is done by adding the AND relation to delegations.
- Task Link: TAP 3
- Difficulty: Medium
- Recommended Skills: Python
- Mentors: Justin Cappos, Vladimir Diaz, and Sebastien Awwad
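The AND relation can be pictured as the following check: a target is accepted only if every required role has signed metadata listing the same hash and length for it. This is a Go sketch with hypothetical names (`TargetInfo`, `AllAgree`), used to stay consistent with the other examples on this page. The reference implementation is in Python, and the metadata format is defined by TAP 3, not by this example.

```go
package main

import "fmt"

// TargetInfo is the hash and length a role has signed for a target
// (simplified to a single SHA-256 hash for this sketch).
type TargetInfo struct {
	SHA256 string
	Length int64
}

// AllAgree reports whether every required role lists the target and all of
// them list the same hash and length. signed maps role name to the
// TargetInfo found in that role's verified metadata.
func AllAgree(required []string, signed map[string]TargetInfo) bool {
	if len(required) == 0 {
		return false
	}
	first, ok := signed[required[0]]
	if !ok {
		return false
	}
	for _, role := range required[1:] {
		info, ok := signed[role]
		if !ok || info != first {
			return false
		}
	}
	return true
}

func main() {
	signed := map[string]TargetInfo{
		"release-eng": {SHA256: "ab12", Length: 1024},
		"qa":          {SHA256: "ab12", Length: 1024},
	}
	fmt.Println(AllAgree([]string{"release-eng", "qa"}, signed))
	fmt.Println(AllAgree([]string{"release-eng", "qa", "security"}, signed))
}
```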
- Description: Allow each top-level role in the root metadata file to be optionally associated with a list of URLs. This allows the implementation of at least two interesting use cases. First, it enables a user to associate a remote repository with a different root of trust, even if the user does not control this repository. This allows the user to, for example, restrict trust in a community repository to a single project. Second, it enables repository administrators to use mirrors in a safe and limited way. Specifically, administrators can instruct TUF clients to always download some metadata files from the original repository, and others from mirrors, so that clients are always informed of the latest versions of metadata and, thus, targets.
- Task Link: TAP 5
- Difficulty: Medium
- Recommended Skills: Python
- Mentors: Justin Cappos, Vladimir Diaz, and Sebastien Awwad