Closed issues:
- Issue installing Kubeflow 0.3 #2128
- Cloud Endpoints Controller not working on master #2120
- Issue while deploying kubeflow on GKE using command line- killed message in Cloud Shell #2075
- Ambassador crashing in kops cluster #2074
- deploy.sh should not assume uuidgen is present #2072
- setup-minikube.sh doesn't install ksonnet if missing #2068
- New Jupyter spawner Ui doesn't allow entering a custom image #2060
- add a jobs component #2058
- Upgrade ks to 0.13.1 #2031
- Auto-scaling for Seldon serving? #2029
- Upgrade bootstrapper to use go modules #2023
- Grant kubeflow user service account CMLE permission #2012
- Seems to be an issue in kfctl.sh re existence of dir ${DEPLOYMENT_NAME} #2009
- trying to deploy a component using ks after initial deployment fails #2006
- Move components in core into a GCP specific ksonnet package #1996
- Installation woes - ./kfctl.sh: line 189: env.sh: No such file or directory #1993
- TF-Serving http-proxy error #1979
- TFServing template needs to convert numGpus from string to int #1972
- Expose Istio's grafana dashboard #1969
- Integrate pipelines into Kubeflow click to deploy #1967
- Figure out Istio installation in Kubeflow #1909
- upgrade cloud-endpoints-controller to use metacontroller/metacontroller:v0.3.0 #1824
- Jupyter image with NVIDIA Rapids #1806
- [GCP] Upgrade envoy used for JWT validation #1696
- On Premises deployment of Kubeflow fails unless ks flags are set v0.3 #1615
- Test Flake git clone fails RPC error #1178
- [Test Flake] GCP deployer script resource conflict setting IAM policy #1140
- Document Serving for PyTorch models #1117
Merged pull requests:
- Create cloud sql database for ml pipeline #2123 (IronPan)
- fixes "Cloud Endpoints Controller not working on master" #2122 (kkasravi)
- Fix the modeldb ambassador route. #2117 (jlewi)
- Fix the check if directory named ${DEPLOYMENT_NAME} exists #2115 (jlewi)
- Update kubeflow components to v0.4.0 #2112 (richardsliu)
- Add yebrahim to pipeline ksonnet owner #2108 (IronPan)
- add r2d4 to approvers #2106 (r2d4)
- Replace with std.asciiUpper with supported by required version of jsonnet #2104 (Jeffwan)
- Don't grant cloudservices account IAM admin privileges. #2101 (jlewi)
- ks init: --skip-default-registries + append ${KS_INIT_EXTRA_ARGS} #2100 (doodlesbykumbi)
- Update aws parameters in serving and dashboard #2097 (Jeffwan)
- Update private IP configuration with new GKE API #2085 (IronPan)
- Correct typo, "for ever image" should be "for every image" #2084 (suigh)
- Fix variable in setup-minikube.sh #2082 (andreyvelich)
- Owners file for jupyter ksonnet #2081 (pdmack)
- fixes 'upgrade cloud-endpoints-controller to use metacontroller/metacontroller:v0.3.0' #2080 (kkasravi)
- JH Spawner Enhancements - Fixes #2060 #2079 (ioandr)
- Add job to load pipeline samples #2071 (IronPan)
- Enforce use string type param for S3_USE_HTTPS and S3_VERIFY_SSL #2067 (Jeffwan)
- Temp workaround for RAPIDS shared lib issues #2066 (pdmack)
- Update katib components #2064 (richardsliu)
- update kf version to v0.3.4 #2063 (kunmingg)
- Fix a version number bug and add tf-batch-prediction into kubeflow registry #2061 (yixinshi)
- unify central dashboard layout #2056 (kunmingg)
- Updates for RAPIDS AI v0.4.0 image #2053 (pdmack)
- Add modeldb as a package for kubeflow #2050 (mpvartak)
- refactor monitoring metrics #2048 (kunmingg)
- upgrade central dashboard image to v0.3.4 to include pipeline ui link #2047 (kunmingg)
- Update Pipeline SDK version to v0.1.3 #2046 (IronPan)
- fixes 'Move components in core into a GCP specific ksonnet package' #2043 (kkasravi)
- Easy for code reading #2042 (wangkeqiang123)
- always convert numGpu to int #2034 (lluunn)
- Support running batch prediction by launching a Dataflow job on GCP #2026 (yixinshi)
- Give jupyter-notebook pods/log permission for fairing #2015 (r2d4)
v0.3.4-rc.2 (2018-12-06)
v0.3.4 (2018-12-06)
Closed issues:
- ImagePullBackOff for images on GCR within same GCP project as GKE cluster #2044
- ksonnet runtime error Seldon #2001
- Better testing output for jsonnet #1988
- Install pipelines SDK in Jupyter images #1968
- Deploy pipelines as part of kfctl.sh #1966
- Create a pipelines ksonnet package #1965
- ksonnet env should override params #1924
- Use ksonnet modules to better organize applications #1922
- Create an openvino component #1913
- Kubernetes Engine for Kubeflow Quickstart Guide - Proposed Fixes #1898
- Tooling to convert notebook to docker container #1857
- Fire off TFJob from Jupyter Notebook #1240
- JupyterHub spawner - Highlight PV requirement #541
- Extend KubeSpawner and its UI to handle Persistent Volume Claims #34
Merged pull requests:
- Merge Pipeline integration change to v0.3 #2055 (IronPan)
- More istio manifest, and README #2051 (lluunn)
- Update Pipeline version to v0.1.3 #2045 (IronPan)
- delete config set in gcp-click-to-deploy dir since its already merged with kfctl config #2041 (kunmingg)
- Update OWNERS #2040 (ellis-bigelow)
- Add fairing library to jupyter images #2038 (r2d4)
- add permission to get pod's log #2037 (IronPan)
- Add roles/dataproc.editor for kubeflow user account #2035 (IronPan)
- Grafana routing rule #2025 (lluunn)
- Istio manifest #2020 (lluunn)
- Activate CMLE during kfctl deployment #2018 (IronPan)
- Activate CMLE during one-click deployment #2017 (IronPan)
- grant KF user SA CMLE admin permission #2013 (IronPan)
- Katib 0.3 cherrypick #2011 (texasmichelle)
- Add latest stable DSL SDK to jupyter image #2008 (IronPan)
- fixes 'trying to deploy a component using ks after initial deployment fails' #2007 (kkasravi)
- use backoff module to handle IAM policy retry with randomized wait time #2004 (kunmingg)
- Fix bad merge conflict on v0.3-branch #2003 (r2d4)
- Adding scope for tf job dashboard #2002 (johnugeorge)
- Make it easier to debug jsonnet tests #1989 (jlewi)
- deploy app loadtest, python part #1986 (kunmingg)
- fixes 'ksonnet env should override params' #1939 (kkasravi)
- Support multiple PVCs in default JH UI, add example 3rd-party UI #1918 (ioandr)
- fixes 'Create an openvino component' #1916 (kkasravi)
v0.3.4-rc.1 (2018-11-26)
Closed issues:
- Missing scope in CRD spec? #1985
- Error persistentvolumeclaim "nfs" not found while host model from NFS #1964
- Fix Katib image tags in v0.3-branch #1953
- user-gcp-sa secret not show up with --platform gcp #1950
- Error deploying Kubeflow using kfctl.sh with --platform gcp #1949
- Can TFJob be parameterized by replica index #1943
- Dashboard cannot list tf jobs #1883
- Issues installing 0.3 #1871
- flaky configure_envoy_for_iap.sh #1807
- create a notebook controller that can replace jupyterhub and uses k8 native auth #1769
- Build Jupyter notebook images for TF 1.11 and 1.12 #1740
- kfctl.sh apply platform assumes availability of yaml python library #1739
- Unable to install Kubeflow on an existing 2 node ubuntu cluster; Docs need to be fixed #1711
- [gcp] Click to deploy needs to save DM config to cloud source repo as well. #1655
- PyTorch and TFJob v1beta1 API #1584
- Error in the generated YAML #1524
- Exposing service using Nginx Ingress Controller and load balancing (question) #1214
- Problems upgrading services of type NodePort; spec.clusterIP: Invalid value: "": #1145
- Simplify Image Tag management for releases #1060
- TFServing supports collection of metrics with prometheus #1036
- Add a parameter for clusters without RBAC #1027
- Can not bring up Jupyter Notebook #672
- Clusters created during e2e tests should be GC #560
- ksonnet + openshift picking up wrong k8s version number #521
- How to access notebook on KUBO? #294
- Investigate using Docker to run Kubernetes and Kubeflow locally #218
Merged pull requests:
- Pipelines 0.3.3 cherrypick #1998 (r2d4)
- Add texasmichelle to OWNERS #1992 (texasmichelle)
- Fix CRD scopes. #1987 (jlewi)
- Remove IAM permission from DM service account #1984 (kunmingg)
- add pipeline as part of kf deployment #1981 (IronPan)
- Add IronPan to pipeline ks package owner #1980 (IronPan)
- Adding Pytorch v1beta1 image #1978 (johnugeorge)
- Check for python module pyyaml when using kfctl.sh on platform GCP. #1975 (IMBurbank)
- Add IronPan as Argo owner #1974 (IronPan)
- initial change - add pipeline entry to kubeflow registry #1973 (IronPan)
- Initial version of a Kubeflow Roadmap. #1963 (jlewi)
- Fix dashboard URI and increase poll interval. #1962 (abhi-g)
- Enable node autoprovisioning in v1beta1 clusters #1959 (richardsliu)
- Add prometheus annotation to tf serving service #1958 (lluunn)
- Add docker-for-desktop platform #1954 (rogaha)
- make IAP optional for click-deploy app #1927 (kunmingg)
- miscellaneous updates and enhancements for shell scripts #1923 (ashahba)
- make prober always sleep before execute #1908 (kunmingg)
- fixes 'create a notebook controller that can replace jupyterhub and uses k8 native auth' #1855 (kkasravi)
- Separate logic for setup_backend.sh #1841 (r2d4)
v0.3.3 (2018-11-15)
Closed issues:
- Update spartakus image to v1.1.0 to fix cloud providers' annotation #1940
- Use iam-policy value for EMAIL if case-sensitive #1936
- Following Quickstart Guide - Can't Deploy Kubeflow #1929
- Enable kubeflow running on POWER #1928
- Do we have support in documentation for Inference? #1905
- Update katib manifests #1903
- Deploy same components in click-to-deploy as kfctl #1892
- Kubeflow 0.2.7 Spawning an image with GPU #1890
- ksonnet package for Jupyter/JupyterHub #1886
- Ksonnet package organization for TFJob and PyTorch #1885
- Rollout model with istio and/or ambassador #1844
- create a self-serve component for data-scientists #1842
- TensorFlow Jupyter Notebook images 1.9 and above in gcr.io cannot see GPUs #1828
- [tf-notebook] Use nvidia/cuda runtime image instead of nvidia/cuda devel image #1783
- Create 0.3.1 Release #1761
- TF data validation and tfma are in 1.9 images but not 1.10 #1718
- [GCP] kfctl should provide a helpful error message if ZONE isn't set #1697
- Argo UI doesn't work behind ambassador #1694
- [gcp] E2E test to verify IAP and certmanager works #1668
- [kfctl] Web app redirect after IAP up. #1420
- Use Istio 1.0 #1309
- TF serving supports request id #1220
- Enable GitOps: Use (Weave Flux or Argo CD) to manage Kubeflow deployments #971
Merged pull requests:
- Automated cherry pick of #1904: Fix capitalization in katib ID fields #1957 (richardsliu)
- Allow CloudShell origin pattern in Jupyter config #1956 (fdasilva59)
- update katib components #1955 (YujiOshima)
- Add pipeline to centraldashboard #1951 (yupbank)
- Redirect deployer webapp page to Kubeflow dashboard after its ready #1945 (abhi-g)
- Parse all command line options in one place and within function #1942 (ashahba)
- Switch spartakus volunteer image to v1.1.0 #1941 (abhi-g)
- Use iam-policy value for EMAIL if case-sensitive. #1937 (IMBurbank)
- Adding support for Pytorch v1beta1 operator #1930 (johnugeorge)
- Consolidate the ksonnet component update scripts #1926 (richardsliu)
- Update tf-operator component to v1beta1 #1921 (richardsliu)
- Change ksonnet package path for tf-job-operator #1920 (andreyvelich)
- fix typo #1919 (lluunn)
- fixes 'ksonnet package for Jupyter/JupyterHub' #1917 (kkasravi)
- Enable tf serving prometheus metrics #1911 (lluunn)
- make links open in new tab #1910 (kunmingg)
- Envoy config change for istio #1906 (lluunn)
- Fix capitalization in katib ID fields #1904 (texasmichelle)
- Fix skip init project; use spaces consistently for indentation #1902 (jlewi)
- Gkeversion #1901 (kunmingg)
- use backoff module to wrap flaky APIs; avoid using global locks when possible #1899 (kunmingg)
- Fix model rollout with Istio #1897 (lluunn)
- Dm conf #1893 (kunmingg)
- Provide an error message in case GCP ZONE is not set and exit, also fix styles for kfctl.sh #1888 (ashahba)
- Automated cherry pick of #1717 upstream v0.3 branch #1884 (jlewi)
- Add a script that automatically creates cherry-picks #1880 (richardsliu)
- fix iap script #1879 (kunmingg)
- upgrade app version to 0.3.2 #1875 (kunmingg)
- Fixing the conflict with the getting started guide #1873 (connected-bsamadi)
- create a self-serve component for data-scientists #1872 (kkasravi)
- Add base href for correct link in Argo UI #1865 (andreyvelich)
- prober test, kubeflow testing part #1845 (kunmingg)
- Refactor JH integration and UI #1839 (ioandr)
- [tf-notebook-image] add tf 1.11 cpu and gpu images #1786 (r2d4)
- [cuda] use runtime image instead of development image #1785 (r2d4)
v0.3.2 (2018-10-26)
Closed issues:
- Job not stopping once chief is complete - v0.2.0-rc.1 #1853
- For GKE deployment, change default CPU to Broadwell #1840
- [minikube] Reduce number of default components deployed to minikube #1837
- metacontroller's compositecontrollers and decoratorcontrollers need to be at Cluster scope #1833
- ERROR in ks apply default -c iap-ingress #1827
- ksonnet namespace should be retrieved from env not params #1825
- issues with 0.3.0 deployment scripts #1767
- Enable TPU in GKE #1766
- TFMA Dependency #1745
- jupyter_console needs update in the latest notebooks #1721
- [gcp] Click to deploy needs to get a valid master version for GKE by calling get-server-config #1653
- test coverage for click-to-deploy app #1578
- Required JWT token is missing #1495
- A small problem, Error validate kubeflow-core #1462
- Does CentralUi need ClusterScope? #1450
- Remove ksonnet parameter cloud #1227
- Test Flake wait_for_workflow terminated on HTTPSConnectionPool error; why don't we retry #1169
- Application Custom Resource for Kubeflow Deployments #1106
- Create & label P1 issues needed for an initial release of Horovod support #778
- Document Red/Green Model Rollout using ISTIO #667
- Central UI - need release process #527
- Model Management Features #136
Merged pull requests:
- fix iap script #1874 (kunmingg)
- fix iap set up and prometheus config (#1847) #1869 (kunmingg)
- remove default version #1868 (kunmingg)
- Run minikube e2e test if core components change #1864 (richardsliu)
- Adds release docs for central UI #1863 (swiftdiaries)
- make zone a dropdown list which has GPU available #1862 (kunmingg)
- use broadwell #1852 (lluunn)
- Fix broken presubmit tests #1851 (richardsliu)
- use TF serving 1.11.1 #1850 (lluunn)
- make alert metrics count type & add service heartbeat #1849 (kunmingg)
- fix iap set up and prometheus config #1847 (kunmingg)
- add gauge metrics for dashboard & alerting; add metrics for invalid a… #1846 (kunmingg)
- Remove argo & katib from default minikube #1838 (texasmichelle)
- Add some latency metrics for GKE & KF deployments #1836 (abhi-g)
- part of fix for 'add jsonnet tests for all libsonnet files' #1835 (kkasravi)
- fixes 'metacontroller's compositecontrollers and decoratorcontrollers need to be at Cluster scope' #1834 (kkasravi)
- Create a simple flask server to redirect http to https #1832 (jlewi)
- fixes 'ksonnet namespace should be retrieved from env not params' #1829 (kkasravi)
- Roll out model with istio #1823 (lluunn)
- Don't add a GPU pool by default #1810 (jlewi)
- Add RapidsAI notebook image #1809 (pdmack)
- Add support for using host network for OpenMPI worker pods #1805 (hmizuma)
- Make the project configurable. #1792 (jlewi)
- Fix typos in markdown #1788 (r2d4)
- components: tf-notebook-image: pin to base image #1784 (r2d4)
- Enable TPU in GKE (#1766) #1774 (dsdinter)
- Replace 'cloud' parameter with more direct names like 'params.tfDefaultImage' and 'params.platform' #1742 (ashahba)
- Application crd #1633 (kkasravi)
v0.3.1 (2018-10-19)
Fixed bugs:
- Deploy kfctl.sh apply k8s 'namespaces "kubeflow" not found' #1675
Closed issues:
- add the metacontroller component #1819
- Enabling TFMA jupyter extension on GPU images requires libcuda #1818
- Errors when starting kubeflow with minikube #1802
- add parts attribute to libsonnet for ksonnet incubator/best practices #1798
- Feature Request + Help Wanted: Attach PVs to TFJob components #1782
- Update katib suggestion images #1779
- presubmit failed when build gpu version #1778
- Build machine running out of space - Presubmit test jobs failing for PRs #1775
- Current master version fails when applying platform in GCP #1768
- kubeflow namespace not found using kfctl.sh #1758
- tensorboard generated under kubeflow/tensorboard/prototypes/tensorboard-aws.jsonnet is incomplete #1753
- Error Creating Object: Pod in version "v1" cannot be handled as a Pod #1748
- No deployment directory found in https://github.com/kubeflow/kubeflow/archive/v0.3.0.tar.gz #1743
- the variable of ENVIRONMENT #1741
- [docs] download.sh pulls from master instead of based on version tag #1738
- Installation errors with kfctl.sh #1733
- Please patch PR 1716 (update GCP credentials filename) into v0.2 #1719
- Tiny jupyter lab notebook toolbars #1704
- K8s dashboard showing "no healthy upstream"; remove K8s dashboard links and services #1699
- 1.0 Exit Criterion for TFJob and PyTorch #1683
- Deploy kfctl.sh apply k8s: Failed to pull image #1651
- Support scope for postsubmit #1587
- Kubeflow quick start misses namespace creation #1514
- Automated sync of Kubeflow between GCR and DockerHub #1320
- Deployment script didn't create namespace kubeflow #1274
- Rename TF-Hub to JupyterHub #1223
- TFServing test if failing blocking submits #1126
- Presubmit failures; Timeout waiting for TFJob v1alpha2 job #974
- Create labels for releases #885
- Create image release workflow for tf operator images #855
- [openmpi] volumes/volumeMounts support #838
- Upgrade ksonnet version to v0.10 for kubeflow. #727
- Liveness/Readiness checks for TF Serving #368
Merged pull requests:
- Tag Jupyter notebook images for v0.3.1 #1830 (richardsliu)
- reorganize tests to be under their respective dirs, add aws,gcp tests for tensorboard #1821 (kkasravi)
- fixes 'add the metacontroller component #1819' #1820 (kkasravi)
- add ssl cert reuse logic in e2e & prober tests #1817 (kunmingg)
- Fix TFMA and TFDV in Jupyter Images #1815 (jlewi)
- Remove root level makefile #1814 (r2d4)
- stop e2e test for deploy app till fix letsencrypt rate limit #1813 (kunmingg)
- Remove K8s dashboard link from central UI #1811 (jlewi)
- Knative build for in-cluster image builds in Kubeflow #1804 (swiftdiaries)
- skip readarray if --all provided to jsonnet fmt script #1800 (r2d4)
- fixes 'add parts attribute to libsonnet for ksonnet incubator/best practices' #1799 (kkasravi)
- Pin tf-serving version in tf-notebook #1797 (r2d4)
- Tf serving with istio 1.0 #1795 (lluunn)
- cherry-pick PR #1754 onto V0.3 branch #1794 (kkasravi)
- remove empty readme #1791 (r2d4)
- remove travis config #1790 (r2d4)
- Add a few basic metrics for deployment service #1787 (abhi-g)
- Cherry pick #1779 #1781 (leoncamel)
- Update katib suggestion images (#1779) #1780 (leoncamel)
- Cherry pick #1727 #1777 (leoncamel)
- Presubmits should be triggered if the DM configs are modified. #1776 (jlewi)
- Cleanup the OWNERs files. #1773 (jlewi)
- Enable new beta Stackdriver Kubernetes Monitoring feature if using v1beta1 (#1768) #1772 (dsdinter)
- add jupyterhub test #1771 (kkasravi)
- Use new stackdriver agents in DM config #1765 (lluunn)
- pin to minor GKE version in dm config #1763 (r2d4)
- gofmt: run go fmt github.com/kubeflow/kubeflow/bootstrap/cmd/... #1762 (r2d4)
- remove statsd from ambassador deployment #1760 (r2d4)
- Cherry pick #1746 and #1747 #1756 (jlewi)
- fixes tensorboard generated under kubeflow/tensorboard/prototypes/tensorboard-aws.jsonnet is incomplete #1754 (kkasravi)
- Seldon ksonnet refactor #1752 (cliveseldon)
- [openmpi] support volumes and volumeMounts #1750 (everpeace)
- Ensure namespace exists #1747 (jlewi)
- Pin jupyter-console to 6.0.0 #1746 (jlewi)
- periodic e2e test for click deploy app #1732 (kunmingg)
- change clusterrole to role #1728 (swiftdiaries)
- Add katib's hyperband/bayesianoptimization suggestion images (#1464) #1727 (leoncamel)
- gke/deploy.sh: test uuidgen exists before using it #1674 (rabierp)
- Allow folks to have the kubeflow github repo over https + improve error message #1626 (holdenk)
- Improvements to kubeform_spawner.py (form UI and others) #1551 (tlkh)
v0.2.7 (2018-10-10)
Fixed bugs:
- [gcp] v0.3.0-rc.1 cert manager can't get the SSL certificate #1666
- [gcp] setIamPolicy error when deleting deployment #1092
Closed issues:
- How to submit multiple OpenMPI jobs? #1730
- Permission denied errors when pip (un)installing without --user in new nb images #1722
- [GCP] Update credentials filename #1715
- jsonnet test is failing but no jsonnet files changed in the PR. #1707
- Error while updating iap-ingress for custom domains #1689
- Jupyterlab service account token #1648
- Create 0.3. Release #1541
- upgrade to ksonnet 0.12.0, jsonnet v0.11.2 in the Docker image for our test workers #1540
- Chief worker cannot start #1440
- Web app deploying kubeflow through deployment manager #884
- Proposal: kubeflow-scheduler #68
Merged pull requests:
- Change the default branch in download.sh to v0.3-branch as opposed to master #1734 (jlewi)
- [Cherrypick 0.2] Update gcp credentials filename (#1716) #1725 (lluunn)
- Create an initial CHANGELOG #1723 (jlewi)
- Cherry-pick: Update gcp credentials filename (#1716) #1720 (lluunn)
- Fix the TFJobs Dashboard UI #1717 (jlewi)
- Update gcp credentials filename #1716 (texasmichelle)
- TF serving template cleanup #1714 (lluunn)
- Add a notice explaining to users what the app is doing. #1713 (jlewi)
- config & dockerfile update #1705 (kunmingg)
- Update link in README #1640 (kunmingg)
- Merge config for kfctl and click-to-deploy webapp #1594 (lluunn)
- HubSync v1.0 #1548 (TheJaySmith)
v0.3.0 (2018-10-04)
Fixed bugs:
- v0.3.0-rc.1: ERROR no prototype names matched 'pytorch-operator' #1663
Closed issues:
- Accessing custom metrics in our Python model #1681
- [v0.3.0-rc.1] "/" doesn't redirect to the centraldashboard UI #1670
- Run tests periodically on the release branch starting with 0.3 #1603
- Kubebench-job default param mainJobConfig points to a wrong path #1596
- Release process for kubebench #1510
Merged pull requests:
- [Cherry-pick] Update kubebench-job prototype default parameters (#1693) #1709 (xyhuang)
- Cherrypick #1700 - TF serving liveness probe #1708 (lluunn)
- Add TOS and Privacy links and some other UI improvements. #1703 (jlewi)
- Fix jsonnet formatting errors in katib #1702 (richardsliu)
- Add tf serving liveness probe #1700 (lluunn)
- Update kubebench-job prototype default parameters #1693 (xyhuang)
- Add PVC to Katib #1687 (inc0)
- fix katib metrics collector bug #1680 (YujiOshima)
- Scope postsubmit jobs by modified directories #1658 (richardsliu)
- Fix cert-manager and iap config to make dashboard accessible #1544 (sambaiz)
v0.3.0-rc.3 (2018-10-02)
Closed issues:
- [v0.3.1-rc.1] JupyterHub spawner is missing TF images for 1.9 and 1.10 #1672
- [v0.3.0-rc.1] Only TF 1.8 CPU shows up in the list of prepopulated Jupyter images #1671
- [GCP] Click to deploy doesn't create any K8s resources when version is 0.2.5 #1631
- Update Katib image in 0.3 branch #1604
- central dashboard image build workflow doesn't work #1575
- [Image Auto Release] process has lots of issues; should we use prow? #1574
- Image Auto Release Cron Job is failing #1563
- ack_guide.md out of date #1293
- GKE: can not read from google cloud storage in Jupyter notebook #1249
- Friction log for bootstrapper documentation #927
- Create a minimal release process for our ksonnet configs #215
Merged pull requests:
- (cherry pick) upgrade argo version to v2.2.0 #1692 (kunmingg)
- upgrade argo version to v2.2.0 #1690 (kunmingg)
- Tag images for TF notebooks 1.9.0 and 1.10.1 #1688 (richardsliu)
- Build TF notebook images for TF 1.9.0 and 1.10.1 #1686 (richardsliu)
- Cherry-pick #1676 fix centralUI #1685 (swiftdiaries)
- [0.3] Cherry-pick #1647 #1684 (lluunn)
- Update image tag for centralUI #1682 (swiftdiaries)
- cherrypick - Remove trailing slash from KUBEFLOW_REPO (#1664) #1677 (jlewi)
- Fix for "/" not directing to centralUI (#1670). #1676 (swiftdiaries)
v0.3.0-rc.2 (2018-09-30)
Fixed bugs:
- Deploy kfctl.sh apply k8s : Service "ambassador" is invalid ? #1566
Merged pull requests:
- remove prune from kubeflow core which delete required fields (#1580) #1667 (jlewi)
- Add swiftdiaries as reviewer #1665 (swiftdiaries)
- Remove trailing slash from KUBEFLOW_REPO #1664 (jlewi)
- Change kubebench 0.3 image #1661 (xyhuang)
- Ankush Signing Out #1652 (ankushagarwal)
v0.3.0-rc.1 (2018-09-28)
Closed issues:
- [GCP] Deploy script fails with unsupported k8s version 1.9.7-gke.5 #1641
- Update CentralUI image used at head and 0.3 branch #1435
- kfctl.sh needs to get initial cluster version based on get-server-config #1359
Merged pull requests:
- Cherrypick (#1657) tag image and change libsonnet for centraldashboard image #1660 (swiftdiaries)
- Tag and update centraldashboard image #1657 (swiftdiaries)
- Cherry-pick #1589 to v0.3 #1656 (lluunn)
- Cherrypick #1650; automatically set master version to supported version. #1654 (jlewi)
- Update initial cluster version in cluster.jinja based on what gcloud get-server-config returns #1650 (ashahba)
- Enable periodic prow tests #1649 (richardsliu)
- Build image for centralui part of presubmit #1623 (swiftdiaries)
v0.2.6 (2018-09-28)
Fixed bugs:
- Ambassador Version 0.34.0 causing DNS Issues on Worker Node #945
Closed issues:
- Update document on Tf Serving #1634
- [test flake] pre and postsubmit failures deploying mnist #1617
- tensorboard prototypes should include the optionalParams available in that prototype #1610
- Katib ksonnet component needs an E2E test #1607
- Update Kubebench images on 0.3. branch #1602
- Update Jupyter images on 0.3. branch #1601
- Update PyTorch Job image on 0.3 branch #1600
- Update TFJob on 0.3 Branch #1599
- Update Seldon to 0.2.3 on 0.3 release branch #1598
- Envoy unable to read config #1588
- deploying kubeflow with bootstrapper failed #1586
- How can we modify jupyterhub configuration to use GitHub Authentication? #1585
- Jupyter Image Builds Are failing; Dependency issue related to TFMA? #1576
- bump gke version to 1.10.7-gke.2 #1572
- presubmit build is failing with a quota error #1562
- refactor tf-job-operator to match style-guide of libsonnet #1534
- Test to verify we can deploy Katib #1483
- [test flake] mnist gpu test is very flaky; not enough GPU; autoscaling not enabled for the GPU pool #1436
- Make TF serving component more readable and extendable #1264
- E2e test of TF Serving using built-in HTTP api #1258
- [Discussion] TF serving image: should we just keep Dockerfile for the latest TF version? #1089
- bootstrapper should support push the ksonnet app to a source repo #912
- Investigate using tensorflow serving's built-in http server #896
Merged pull requests:
- TF Serving changes: enable http server, test cleanup etc #1647 (lluunn)
- [Cherry-pick] Add a simple e2e-test for katib #1646 (ankushagarwal)
- GKE 1.9.7-gke.5 is no longer available for master; bump to 1.9.7-gke.6 #1644 (jlewi)
- Add a simple e2e-test for katib #1638 (ankushagarwal)
- Tag release 0.3.0 for Jupyter notebook images #1637 (richardsliu)
- Cherrypick changes from 0.3-branch to master #1636 (ankushagarwal)
- Update katib component to include studyjobcontroller #1632 (ankushagarwal)
- Install tensorflow-data-validation in jupyter notebook #1629 (ankushagarwal)
- Use JupyterNotebook as default instead of JupyterLab #1628 (ankushagarwal)
- Tag and update 0.3.0 release for Kubebench controller #1627 (richardsliu)
- Update katib image URIs #1625 (ankushagarwal)
- Add postsubmits to push centralui and Jupyter notebook images to kubeflow-images-public #1624 (richardsliu)
- Add richardsliu to approvers #1622 (richardsliu)
- make builder image version consistent with glide.lock hash #1621 (kunmingg)
- Tag and update pytorch operator #1619 (richardsliu)
- Update pytorch image on 0.3 release branch #1614 (johnugeorge)
- Update Seldon to 0.2.3 version on 0.3 branch (#1592) #1613 (cliveseldon)
- fixes 'tensorboard prototypes should include the optionalParams available in that prototype' #1611 (kkasravi)
- Update title & favicon for the deploy app page. #1609 (abhi-g)
- Update the TFJob image to the latest image and tag 0.3 #1608 (jlewi)
- Update Katib images on 0.3 branch #1605 (jlewi)
- Update release process for Katib. #1597 (jlewi)
- Force update IAM policy; print app url on web UI. #1595 (kunmingg)
- Update instructions about releasing images using prow. #1593 (jlewi)
- Update Seldon to 0.2.3 version #1592 (cliveseldon)
- New TF Serving template #1589 (lluunn)
- fix iam patch logic and update image in config #1582 (kunmingg)
- App frontend stop polling the status after it's done #1581 (lluunn)
- remove prune from kubeflow core which delete required fields #1580 (kunmingg)
- Don't install TFMA for TF versions < 1.9 #1579 (richardsliu)
- bump gke version to 1.10.7-gke.2 #1573 (kkasravi)
- Typo in nvidia inference server README #1569 (cliveseldon)
- Run Jupyter image and centraldashboard image release on postsubmit #1565 (jlewi)
- Adding --skipInitProject to kfctl_test.jsonnet until CI quota is increased #1564 (ashahba)
- Tag and update v0.3.0 release for chainer-operator #1552 (everpeace)
- fixes 'refactor tf-job-operator to match style-guide of libsonnet' #1535 (kkasravi)
- Rename TF-Hub to JupyterHub #1410 (pvsousalima)
4e7f4ed (2018-09-19)
Closed issues:
- kfctl.sh needs to call get-server-config to get GKE version #1570
- Certificate not working #1567
- "Patching IAM bindings" halt during deployment. #1559
- cert-manager missing clusterrole #1554
- Jupyter Notebooks for TF 1.9 and 1.10 #1546
- test_jsonnet is failing in postsubmit #1543
- Directory ${KFAPP} already exists #1530
- Move kubebench package to kubeflow repo #1513
- cloud endpoint prototype breaks on master. #1507
- Error: Failed to apply app: find objects: RUNTIME ERROR: Field does not exist: v1 #1506
- PR shows review is not required to merge #1503
- update tensorboard to use the same pattern as kubeflow/core/prototypes #1500
- No prototype names matched 'kubeflow-core' #1492
- cannot list namespace on tfjob dashboard #1491
- Issue installing on GKE with deploy script #1489
- Installation fails on Amazon EKS #1488
- add ability to only generate parts of a component in the jsonnet file #1486
- Prometheus for seldon models #1484
- Multiple issues with gke/deploy.sh #1481
- Katib StudyJob failed to mount directory #1480
- New image release for pytorch operator #1479
- simplify tensorboard as separate aws, gcp prototypes #1477
- Cut release 0.2.5 #1476
- do you have a performance benchmarks when run Horovod with your openmpi component? #1461
- Review/extend jovyan permissions in TF notebooks #1438
- [gcp] VM account should have GCS read only scope to support pulling from GCR #1432
- kfctl.sh unable to find component ambassador #1429
- [kfctl] support specify registries & version in "/kfctl/apps/create" request. #1417
- standardize remaining <component>.{jsonnet,libsonnet} files #1414
- Update 0.2 blog with new deployment script #1390
- Update E2E test to use kfctl.sh and delete gke/deploy.sh; #1331
- Docker image building workflows are failing #1135
- [bootstrap] Fail to update role kubeflow.jupyter-role #1076
- camelCase for some recently fixed params #1050
- Add document on Stackdriver agents #997
- Suggest using simple port forwarding instead of LoadBalancer for cloud deploy in User Guide #860
- Need docs for TFJobs UI #573
- Update docs to mention known issues with ksonnet and windows #501
- Need: User facing website for Kubeflow that details how to choose a stack #213
- Tutorial(s) that correspond to CUJs #85
Merged pull requests:
- Bump GKE version because 1.10.7-gke.1 is no longer valid master version. #1571 (jlewi)
- Fix cert-manager #1568 (lluunn)
- Bug fix for #1559 #1561 (lluunn)
- increase pageSize for service list to avoid truncate #1558 (kunmingg)
- Fix for issue 1050 - camelCase for some recently fixed params. #1556 (ashahba)
- fix cert-manager: add clusterrole back #1555 (kunmingg)
- enable sourcerepo.googleapis.com api if needed #1553 (kunmingg)
- Webapp: Don't use DM for IAM. #1550 (lluunn)
- add chainer-operator to releaser #1549 (everpeace)
- Add version config for TF 1.9,1.10 #1547 (pdmack)
- The minikube test should not be running the jsonnet unittests. #1545 (jlewi)
- Restore missing tf-hub-lb service #1539 (pdmack)
- Prevent ambassador getaddrinfo error logs #1537 (sambaiz)
- edit server address for app config on each request #1536 (kunmingg)
- Update the client ID for webapp #1533 (lluunn)
- Fix kfctl.sh remove the ksonnet environment for the deleted cluster #1532 (sambaiz)
- Add Kubebench package #1531 (xyhuang)
- add chainer-job package to registry.yaml #1529 (everpeace)
- Support specifying registry version in create request #1528 (kunmingg)
- Adds centralui to releaser. #1527 (swiftdiaries)
- Replace logFatal with logError #1526 (lluunn)
- NVIDIA TensorRT Inference Server #1523 (deadeyegoodwin)
- update manifest for webapp #1521 (lluunn)
- catch save-config error #1519 (kunmingg)
- update k8s version for k.libsonnet when do ks init; remove spartakus #1517 (kunmingg)
- edit gke version to 1.10.6 #1516 (kunmingg)
- Integrate with cloud source repos to save ks app config. #1515 (kunmingg)
- Remove unused Azure specific config #1512 (wbuchwalter)
- Adding v1alpha2 as the default PyTorch operator version #1511 (johnugeorge)
- update ks version to 0.12 for webapp backend #1508 (lluunn)
- Add myself as one of the approvers. #1505 (abhi-g)
- Update webapp backend to respond and then finish deployment in the background #1504 (lluunn)
- separate dep and src for bootstrapper #1502 (lluunn)
- fixes update tensorboard to use the same pattern as kubeflow/core/prototypes #1501 (kkasravi)
- Change webapp url #1499 (lluunn)
- fixes kfctl.sh unable to find component ambassador #1498 (kkasravi)
- Update jupyterhub-kubespawner version #1497 (pdmack)
- update gke to 1.10.7-gke.1 #1494 (kkasravi)
- click to deploy: remove default project name #1490 (lluunn)
- fixes add ability to only generate parts of a component in the jsonnet file #1487 (kkasravi)
- Simplify tensorboard #1485 (kkasravi)
- Delete deploy.sh scripts; we use kfctl.sh now. #1482 (jlewi)
- Ensure jovyan has site-packages perms #1470 (pdmack)
- Adding namespace scope to pytorch operator #1465 (johnugeorge)
- standardize remaining <component>.{jsonnet,libsonnet} files #1437 (kkasravi)
- Fix kfctl.sh gcpInitProject is always skipped #1425 (sambaiz)
v0.2.5 (2018-09-04)
Fixed bugs:
- JupyterHub Version Mismatch #1393
Closed issues:
- Error setting up kubeflow in minikube #1459
- Current documentation for setting up Kubeflow in minikube not working #1455
- presubmit failure for jsonnet test: name must be set #1453
- What is image registry.opensource.zalan.do/teapot/external-dns ? #1446
- What's the function of tfReplicaType: Master ? #1442
- Bootstrapper fails in docker-for-desktop #1430
- tf_job_simple_test results not being report #1426
- deploy.sh should be restart-aware in terms of directory structure #1422
- kfctl.sh should not assume uuidgen is present #1415
- How to spawn the jupyter container as a root user #1412
- Don't use DM for IAM policy management #1401
- Trigger minikube E2E test on presubmit when minikube test is modified #1350
- GKE version "foo" is unsupported. #1348
- CentralDashboard returns 404; Ambassador can't parse the route #1306
- When creating releases we should pin the version of source.tar.gz used in the deploy.sh #1239
- provide the ability to add imagePullSecrets to different ServiceAccounts so that private images can be fetched #1231
- Restrict privilege of Kubeflow services accounts such as tf-job-operator to namespace level #1213
- Minikube deploy script should start minikube #1153
- [gcp] Use BackendConfig to enable IAP #1146
- Make jupyterlab discoverable/default #1124
- Enable test for tf-job-simple prototype for v1alpha2 #1048
- Initial report for spartakus metrics #351
- Batch Prediction using GPUs with local runner #251
Merged pull requests:
- Fix bug with updating an existing deployment. #1475 (jlewi)
- Update Kaggle Dockerfile to add their API package #1469 (pdmack)
- Add storage read scope to VM scopes; v0.2-branch (#1466) #1468 (jlewi)
- Add GCS read only scope to GKE VM scopes #1466 (gindeleo)
- implement gcpUtils which takes care of integrating deployment manager API (part of PR 1458) #1460 (kunmingg)
- make web app stateless & backend security update #1458 (kunmingg)
- Fix presubmit unit test #1456 (lluunn)
- Click to deploy ui: remove ipName and make hostName optional #1449 (lluunn)
- Create webapp manifest dir #1447 (lluunn)
- Fix readme for local run #1445 (lluunn)
- Ensure MOUNT_LOCAL env var is forwarded to kfctl #1443 (abhi-g)
- Enable v1alpha2 for Pytorch operator #1441 (johnugeorge)
- Fix KUBEFLOW_REPO dir pointer. #1439 (abhi-g)
- Weaveflux version 0.1.2. #1434 (TheJaySmith)
- Add imagePullSecret to Seldon Prototypes #1431 (cliveseldon)
- Add logging statements to help figure out why test results aren't reported #1427 (jlewi)
- Add texasmichelle to OWNERS #1424 (texasmichelle)
- Use JupyterLab as default instead of Jupyter Tree interface #1423 (ankushagarwal)
- Use bash ${RANDOM} instead of uuidgen #1418 (ankushagarwal)
- Fix minikube test and trigger on presubmit when modified. #1411 (jlewi)
- Python script to declaratively manage IAM binding patches #1408 (ankushagarwal)
- Remove deploy_gcp.sh and update workflows.libsonnet. #1406 (jlewi)
- Update various scripts to use lowercase clientid/secret #1405 (ankushagarwal)
- Fix use_gcr_for_all_images.sh #1404 (ankushagarwal)
- Update tf-job-operator prototype to support namespace-scoped deployment #1403 (ankushagarwal)
- Fix annotations in ambassador and centraldashboard #1399 (richardsliu)
- Add namespaces to the tf-job-dashboard role #1397 (ankushagarwal)
- Updates to iap component to support private clusters #1396 (ankushagarwal)
- util.libsonnet has methods that allow a <component>.part to be modified #1395 (kkasravi)
- Add mxnet operator (https://github.com/kubeflow/community/issues/136) #1392 (suleisl2000)
- support customized image in ACK and update the document #1388 (cheyang)
- Added a minikube setup script. #1387 (abhi-g)
- [kfctl web app] frontend & backend update to agree with api changes #1372 (kunmingg)
- Fix iap-ingress certificate creation and use BackendConfig for IAP #1327 (danisla)
v0.2.4-rc.0 (2018-08-21)
v0.2.4 (2018-08-21)
Closed issues:
- The testing/install_minikube.sh script assumes the host OS is Ubuntu. #1383
- Add Argo UI to Ambassador and Central UI #1310
Merged pull requests:
- Cherry-pick: Update TFJob operator to the latest image #1391 (richardsliu)
- Fix typo #1385 (fisache)
- Fixes the install_minikube.sh script for non-Ubuntu OSes #1384 (Ark-kun)
- fix a typo and remove a too restrictive limit for gpu specification #1382 (yixinshi)
- Fix groups for various resources #1380 (activatedgeek)
- Adds argo to ambassador and Central UI #1376 (swiftdiaries)
v0.2.3-rc.0 (2018-08-17)
v0.2.3 (2018-08-17)
Features and improvements:
- Extra packages in jupyterhub-image #1175
- Use GKE auto-scaling when configuring node pools in the provided Deployment Manager configs #1033
Fixed bugs:
Closed issues:
- Jupyter-role error applying kubeflow-core component with ksonnet #1353
- Jupyter notebook Connection failed because Ambassador doesn't enable websockets #1344
- Need to update glide's ksonnet version to ^0.11.0 #1340
- PS still running after tfjob is complete. #1334
- Central UI should include a link to Kubeflow docs website #1318
- Istio integration doc: Point to kubeflow/website documentation for #1315
- Build and debug improvements for bootstrapper #1312
- Fix incorrect links to user_guide in kubeflow.org #1300
- JupyterHub login unauthorized (401) #1296
- Katib apply fail with error: Field does not exist: modeldbDatabaseImage #1291
- ambassador crashing on node with wrong DNS resolver address due to misconfigured kubelet #1289
- Move Katib documentation to kubeflow website #1286
- How can we change the tensorflow image in kubeflow? #1285
- [gcp] deploy.sh should support rerunning deploy.sh when DM configs and ks app already exist #1284
- TFJob operator v1alpha2 doesn't work with TF.Estimator API for TF <=1.6 #1283
- [GCP] deploy.sh fails; can't create filestore because network is legacy #1282
- [GCP] deploy.sh filestore API not enabled for project #1280
- [GCP] deploy.sh gcloud error Invalid choice: 'filestore'. #1279
- [gcp-deployer] Set up "cors-anywhere" proxy service for k8s api requests from web app #1276
- Deploy argo by default; add it to deploy.sh scripts #1268
- [Test Flake] simple tf job failing; Job not found waiting for job #1266
- ERROR no prototype names matched 'kubeflow/core' #1263
- TF Serving GPU test failing #1262
- Keras training in Kubeflow on GKE gets "Killed" #1261
- Better installation guide on kubeflow.org #1257
- Create a batch predict example #1250
- GPU support on GKE not available #1246
- TF-job package missing #1245
- Spawning Jupyter failed; user jovyan does not have permission to write to default storage class #1241
- "Getting Involved" in README.md should point to kubeflow.org #1237
- scripts/gke/deploy.sh fails when kubeflow_deployment_manager_configs/ exists #1233
- [openmpi]- NodeSelector not working. #1230
- [GCP] deploy.sh - don't show error if deployment doesn't exist #1222
- Can't start jupyter-notebook pod in kubeflow version 0.2.1 #1221
- [Test Failure] TFJob test failure; no module named py #1218
- Error from server (NotFound): tfjobs.kubeflow.org "mycnnjob" not found #1217
- Getting started error: No such file or directory: 'cluster-kubeflow.yaml' #1206
- Getting started error #1205
- README.md QuickStart should refer to kubeflow.org Getting Start #1202
- "getting-started-gke" installer fails #1201
- [gcp] deploy.sh shouldn't download secrets to the same directory as the DM configs #1197
- deploy.sh is broken; wrong directory for the unpack? #1193
- unable to spawn jupyter notebook - volume name is too long #1177
- Delete old GCP configs #1171
- Test Flake gke teardown failed; insufficient quota #1166
- Create Prometheus Component in Kubeflow Core. #1160
- Jupyter image suitable for running the examples/codelabs #1157
- Create links on kubeflow.org to redirect to ksonnet tarballs and deploy scripts #1156
- Make it easy to customizes PVs attached to Jupyter pods #1125
- Split all prototype into separate prototypes #1107
- Support for AVX2 when using deployment manager #1082
- [gcp-click-to-deploy] Deploy click to deploy web app #1056
- Bootstrapper release instructions need to explain how to build at appropriate commit #1053
- Don't check in vendor for bootstrap; it adds 160M which slows down cloning the registry as part of Kubeflow deployment #1051
- unflake TF serving testing #1031
- Support and testing different versions of TF serving images #1005
- Serving path should support logging request input/output #1000
- 'ks delete ${KF_ENV} -c kubeflow-core' doesn't take down user notebook pods #968
- Bootstrapper should support file and http registries in a consistent manner #962
- Click-to-deploy UI upgrade. #959
- Remove "alpha" in deployment manager config when gke-1.10.2 is public #821
- TF Serving test flaky #815
- TFJobs UI doesn't work behind IAP; React APP needs support IAP? #574
- Can not launch TensorFlow Serving because AVX not available on VM #421
- Trigger rebuild of TF serving image in E2E test only when files change #371
- HTTP Proxy and TFServing should not use the same resource defaults #360
- Add script to export inception into SaveModel from checkpoints #229
- ks apply -f tf-job.jsonnet, -f is not a valid flag #201
- e2e test for http-proxy #198
- Make https://hub.docker.com/r/kubeflow/jupyterhub/ a community resource #197
- How to port model developed in Jupyter notebook to TFJobs #110
Merged pull requests:
- Change the default branch used for the configs to v0.2-branch and not master #1378 (jlewi)
- add apply handler; fix k8s auth; reset glide hash #1375 (kunmingg)
- Add Jobs permission to jupyterhub-notebook role for common use cases #1374 (activatedgeek)
- use tf serving image #1370 (lluunn)
- Disable spartakus usage reporting in our ci clusters. #1364 (jlewi)
- Use gcr.io images for argo, ambassador and cert-manager #1362 (ankushagarwal)
- add optionsHandler for OPTIONS request from browser #1361 (kunmingg)
- fix 'kfctl.sh delete platform' doesn't delete platform correctly. #1360 (everpeace)
- Fix a bug in kfctl.sh when just creating a K8s app and not GCP. #1357 (jlewi)
- ks version update to 0.11.0, and bug fixes. #1356 (kunmingg)
- upgrade gke version for v0.2 #1355 (lluunn)
- Add README steps for releasing website #1352 (richardsliu)
- Create ks prototype for tf-batch-predict #1351 (yixinshi)
- fixes issue #1344 #1345 (fyuan1316)
- Add more tests to the subgraph we created to run the tests. #1342 (jlewi)
- Update TFJob operator to the latest image. #1341 (jlewi)
- add bindRoleHandler which bind roles to service accounts #1335 (kunmingg)
- Add TFJob test to the Kfctl test; refactor workflows to start to use #1333 (jlewi)
- [gcp-deployer] add service account key handler and a check health handler #1329 (kunmingg)
- Update ambassador to 0.37.0 #1324 (ankushagarwal)
- adds link to docs in centralui #1322 (swiftdiaries)
- points to the documentation in kubeflow.org #1319 (amsaha)
- Build and debug improvements for bootstrapper #1313 (kkasravi)
- Turn deploy.sh into kfctl.sh #1308 (jlewi)
- revert to k8s 1.9 #1307 (lluunn)
- wording #1305 (pymia)
- Add reasonable default to tf-serving s3 creds #1304 (inc0)
- Fixed broken links (to user_guide) #1301 (amsaha)
- Pointing katib doc to kubeflow/website #1292 (amsaha)
- Make it easier to rerun deploy.sh #1290 (jlewi)
- Update README.md #1287 (pdmack)
- Enable GKE node-pool autoscaling options #1273 (richardsliu)
- Deploy argo in deploy.sh #1272 (lluunn)
- Improve tf serving e2e test #1271 (lluunn)
- Small fix to tf serving component #1270 (lluunn)
- add components dir to tf serving test #1269 (lluunn)
- Fix tfjob test; the simple tfjob test prototype was renamed. #1267 (jlewi)
- Update TFJob image to include the fixes to make the UI work with IAP. #1265 (jlewi)
- [gcp-deployer] CSS fixes, Removed Placeholder, linting #1260 (yebrahim)
- Scripts' code style adjustment #1256 (zacharyzhao)
- [gcp-deployer] Setup oauth2 id & secret; insert service account keys via backend service #1255 (kunmingg)
- Enable new SD agents by default #1252 (lluunn)
- Deploy click to deploy app #1248 (jlewi)
- Modified README.md to point to Kubeflow.org community page. Issue #1237 #1247 (amsaha)
- Scope kubeflow prow jobs by job type and modified dirs #1244 (richardsliu)
- Fix deploy.sh when re-trying on existing deployment name #1243 (lluunn)
- Adds check to see if Katib is deployed and updates CentralUI accordingly. #1242 (swiftdiaries)
- Update indentation issue in jupyterhub spawner ui #1238 (jainprachi2506)
- Fix link on front page. #1236 (jlewi)
- [gcp-deployer] Enable required GCP services before deploying #1235 (yebrahim)
- Update reviewers so that automatic reviewer assignment works better. #1234 (jlewi)
- ksonnet package for WeaveFlux #1232 (TheJaySmith)
- TF serving support request logging #1229 (lluunn)
- Update the README.md; remove instructions and direct folks to kubeflow.org #1226 (jlewi)
- Improve jupyterhub form interface #1212 (ankushagarwal)
- Speed up gke script by running DM and GCFS creation in parallel #1211 (ankushagarwal)
- fix tf-job-dashboard fails to show logs #1210 (cheyang)
- Seldon 0.2 update #1209 (cliveseldon)
- [gcp-deployer] Require user sign in, fetch email, add splash screen #1208 (yebrahim)
- Split kubeflow-core into individual prototypes #1207 (ankushagarwal)
- prometheus prototype #1204 (kunmingg)
- change {username}{servername} to {userid} to avoid volume length restrictions of 63 characters #1200 (kkasravi)
- Add a script to cleanup leaked kubeflow-ci ingress resources #1199 (ankushagarwal)
- Extra packages in tensorflow-notebook-image #1198 (wmuizelaar)
- Cherrypick #1194 to master - Fix deploy.sh moving the unpacked repo #1196 (ankushagarwal)
- Delete docs/gke/configs and move tests to scripts/gke #1195 (ankushagarwal)
- Support private GKE clusters in gke deploy.sh script #1192 (ankushagarwal)
- Add doc for TF Serving #1165 (lluunn)
- Create a doc to describe the deployment process. #1159 (jlewi)
- Turn bootstrapper into an RPC server for ksonnet. #1151 (jlewi)
- Add Kaggle notebook Dockerfile #1109 (pdmack)
- Set min-cpu-platform to haswell to support avx2 in deployment manager #1083 (lluunn)
v0.2.2-rc.0 (2018-07-13)
v0.2.2 (2018-07-13)
Closed issues:
- TFMA plots don't render; GET tfma_widget_js.js returns 404 #1130
Merged pull requests:
- Fix deploy.sh moving the unpacked repo. #1194 (jlewi)
- Use PV for /home/jovyan by default #1191 (jlewi)
- GCFS support #1173 (ankushagarwal)
v0.2.1-rc.1 (2018-07-12)
v0.2.1 (2018-07-12)
Closed issues:
- Use PV by default mounted at /home/jovyan #1187
- Central UI image needs to be updated in 0.2.1 release; it is too old. #1147
- metrics_collector should emit K8s events to indicate when Kubeflow is ready #1142
- Jupyter images in 0.2.1 need to be upgraded #1129
Merged pull requests:
- Use PV for /home/jovyan by default #1188 (jlewi)
- Put the full PyTorch prototype in the jsonnet file. (#1119) #1186 (jlewi)
- Grant roles/viewer to kubeflow user service account #1185 (ankushagarwal)
- Fix socket.Error typo #1183 (ankushagarwal)
- Update Central UI image to the v0.2.1 tag. #1181 (jlewi)
- Make metric collector emit k8s event #1161 (kunmingg)
v0.2.1-rc.0 (2018-07-11)
Closed issues:
- Test Flake deploy.sh fails trying to enable the deployment service #1158
- Make downloading our ksonnet registry for getting started efficient #1154
- kubeflow cluster cannot pull image from GCR within same Project. #1139
- [Test Flake] vm_util.wait_for_operation needs to retry on socket error #1137
- Ambassador pod failed to run because kube-dns not running #1134
- [Test Flake] tf_job_simple_test needs retries for ks init to deal with git connection issues #1128
- PyTorch job prototype should contain the full job spec #1114
- Finalize release 0.2.0 #1070
- [gcp] GKE setup; do as much as possible in deploy.sh #1068
- [gcp click-to-deploy] Need to build docker container #1055
- Cherry-pick for release v0.2.0-rc.1 #1024
- Cut a 0.2 release branch #964
- [gcp] monitoring agent need to emit events/status information particularly related to IAP #955
- Include GPU daemonset in GKE configs? #288
- KubeFlow or Kubeflow? #44
- Tooling to manage configuration and deployment #23
Merged pull requests:
- Cherry pick upgrading the central UI to v0.2.1 #1182 (jlewi)
- Make the deploy scripts more efficient and other fixes. (#1174) #1180 (jlewi)
- Cherry Pick: Don't check in bootstrap/vendor. (#1152) #1176 (jlewi)
- Make the deploy scripts more efficient and other fixes. #1174 (jlewi)
- Make GKE VM service account storage.objectViewer to have read access of gcr #1164 #1172 (jlewi)
- Cherrypick: Tag the latest Jupyter images with v0.2.1 (#1144) #1170 (jlewi)
- Delete kubeflow namespace before deleting the cluster #1167 (ankushagarwal)
- Make GKE VM service account storage.objectViewer #1164 (kunmingg)
- Give KF_USER_NAME service account roles/cloudbuild.builds.editor role #1163 (ankushagarwal)
- Skip project setup during deployment. #1162 (jlewi)
- Don't check in bootstrap/vendor. #1152 (jlewi)
- Make Katib work with Ambassador. (#1103) #1150 (jlewi)
- Fix the makefile so that we tag the image with the commit. #1149 (jlewi)
- Tag the latest Jupyter images with v0.2.1 #1144 (jlewi)
- [gcp-deployer] Use Material UI components and fonts #1138 (yebrahim)
- Set requests and limits for RAM and CPU in TF notebook image releaser. #1136 (jlewi)
- Make the test robust to test flakes due to problems initializing the ksonnet app #1133 (jlewi)
- Fix TFMA jupyter extensions. #1131 (jlewi)
- Fix typos #1127 (idealhack)
- add service monitor prototype for monitoring deployment status #1123 (kunmingg)
- Add Dockerfile and Makefile to build docker images #1122 (ankushagarwal)
- Pin and fix katib images. (#1113) #1120 (jlewi)
- Put the full PyTorch prototype in the jsonnet file. #1119 (jlewi)
- Pin and fix katib images. #1113 (jlewi)
- Create a deployment script for gke and minikube #1111 (ankushagarwal)
- Add a jupyter-notebook-role and use it for notebooks in jupyterhub #1110 (ankushagarwal)
- [gcp-deployer] Rudimentary progress in logs #1108 (yebrahim)
- add katib releaser #1102 (YujiOshima)
- Create a script to update a ksonnet app to the latest Kubeflow package #1100 (jlewi)
- Tfjob create fails in tfjob UI #1099 (kkasravi)
v0.2.0 (2018-06-29)
Fixed bugs:
- [gcp click-to-deploy] hostname field is not editable #1101
- [gcp click-to-deploy] Click to deploy web app crashes #1072
Closed issues:
- Wrong comment in setting default CleanPodPolicy #1081
- Deprecate tfserving http-proxy? #1080
- user guide disappear? #1078
- user_guide link is dead #1075
- TFJob prototype's default TFVersion should be v1alpha2 #1049
- TF-Serving 1.8 Images #845
Merged pull requests:
- [gcp-deployer] Fix hostname typo bug #1105 (yebrahim)
- cherry pick 3 commits #1104 (kunmingg)
- Make Katib work with Ambassador. #1103 (jlewi)
- add seldon to ks config; pre install all pkg #1098 (kunmingg)
- Create a version of echo-server to echo headers. #1097 (jlewi)
- add chainer-job/chainer-operator ksonnet package #1095 (everpeace)
- Install gpu driver in deployment manager #1094 (lluunn)
- Improvements to deploy.sh #1093 (jlewi)
- Delete the tf-job package. #1091 (jlewi)
- cherry-pick: update tf job default version (#1086) #1087 (kunmingg)
- update tf job default version #1086 (kunmingg)
- Add katib tag to images #1085 (inc0)
- fix for file_cache is unavailable when using oauth2client >= 4.0.0 #1084 (kkasravi)
- add a readme file to the mpi-job ksonnet component #1079 (rongou)
- fix #1075, user_guide link is dead #1077 (theofpa)
- TFJobs UI doesn't work behind IAP #1073 (kkasravi)
- point bootstrapper to v0.2.0 in release branch #1069 (kunmingg)
- Tooling to make it easier to tag images and update the ksonnet prototypes #1066 (jlewi)
- Create a script to update some of the docker images in the prototypes #1063 (jlewi)
- [gcp-deployer] Add Gapi manager class, more typings and fixes #1054 (yebrahim)
v0.2.0-rc.1 (2018-06-22)
Closed issues:
- Cross-Origin Resource Sharing with TF Jobs Dashboard #1046
- nvidia-smi fails for TFJob's but not for similarly configured Job's #1042
- Jupyter can't start pod; the default spawner image is way too old #1041
- TFJob pods deleted on completion/failure impairing debugging #1039
- Invalid value: "v1alpha2": field is immutable #1029
- Update PyTorch Image to officially released one #1020
- Bootstrapper in 0.2.0-RC 0 doesn't set tfJobsVersion to v1alpha2 by default #1018
- TensorFlow 1.8 not included in stock Jupyter Images #1014
- Investigate supporting of TF serving 1.7 gpu #1009
- Parameter names for Ambassador images improperly named; have extra tf prefix #994
- Bootstrapper: Friction log from minikube on macOS #981
- Enable TFJob v1alpha2 by default in 0.2 release #977
- Central UI sometimes doesn't render if screen too small #957
- [gcp] cluster-kubeflow.yaml isn't tested #950
- [central UI] needs a link to JupyterHub #810
- Verify Central UI is working #805
- Batch Prediction Beam Library #662
Merged pull requests:
- cherry pick: update tf_operator image to point to tag v0.2.0 (#1062) #1065 (kunmingg)
- cherry pick fixes to release branch #1064 (kunmingg)
- update tf_operator image to point to tag v0.2.0 #1062 (kunmingg)
- Some scripts to support retagging our images as part of our releases. #1061 (jlewi)
- Fix typo in NB image sha #1058 (pdmack)
- Update default notebook image for spawner #1052 (pdmack)
- A bunch of fixes for deploying with private ks registry #1047 (IronPan)
- test-dir-delete should depend upon teardown to complete #1045 (ankushagarwal)
- add mpi operator ksonnet package #1043 (rongou)
- Changing defaults in Pytorch-job #1040 (johnugeorge)
- Delete user_guide.md; it is now part of the website #1038 (johnugeorge)
- [gcp-deployer] Move to Typescript, clear dead code #1037 (yebrahim)
- Update examples for 0.2. #1035 (jlewi)
- update pytorch image #1034 (kunmingg)
- release workflow bug fixes and doc update #1032 (kunmingg)
- Update TF serving 1.7 gpu image to use cuda 9.0 #1030 (lluunn)
- fix format error for nightly release config #1026 (kunmingg)
- Merge master commits to release 0.2 branch #1025 (kunmingg)
- Fix param name of ambassador #1023 (lluunn)
- update bootstrapper image to use tfJobsVersion v1alpha2 as default #1021 (kunmingg)
- Pin the central UI image. #1019 (jlewi)
- Add a link to JupyterHub in the CentralUI #1016 (jlewi)
- Include the TF 1.8 image in kubeform_spawner. #1015 (jlewi)
- Fix hard coded "name" variable in example prototypes #1013 (xyhuang)
- Set default version of TFJob operator to be v1alpha2 #1012 (kunmingg)
- Use docs/gke/configs to run e2e tests #1002 (ankushagarwal)
- Fix user guide for the refactored tf-cnn example #995 (xyhuang)
- Fix permissions for jupyter-role #978 (wmuizelaar)
v0.2.0-rc.0 (2018-06-16)
Fixed bugs:
- JupyterHub authenticates login but doesn't redirect to user home #430
Closed issues:
- Bootstrapper should apply components in an order #1006
- Make Katib work with reverse proxy and ambassador #991
- ambassador memory leak statsd:0.30.1 #986
- presubmit failing: couldn't open import "X": no match locally or in the Jsonnet library path #983
- Verify Katib is working #973
- Issues with image release workflow app #970
- ks error: ERROR open /home/jlewi/app.yaml: no such file or directory #966
- Bootstrapper throwing error when deployed on GKE #961
- [gcp] bootstrapper fails to create CloudEndpoints resource #954
- Nightly builds of TensorFlow notebook images do not have the same commit hash in the tag #943
- Create ksonnet prototype for auto image release #941
- How to configure a local volume for Jupyterhub_spawner.py #933
- Unknown variable error when applying kubeflow-core prototypes #932
- GCP Deployment Manager needs to delete IAM roles when DM is deleted #910
- Unable to use image-releaser for tf-notebook-workflow #909
- Regression Jupyter notebooks no longer include python2 runtime #906
- Deadlocks configuring envoy for IAP #903
- IAP setup script needs permission to list backend services #902
- bootstrapper should not crash loop on error #901
- openmpi-controller:0.0.3 image pull error. #887
- deployment manager should allow setting IAM roles in YAML config #883
- Bootstrapper should support packages from other registries #880
- Deployment manager config should create service accounts and set IAM roles #878
- Deployment manager config should enable Cloud Endpoints API #876
- dev.kubeflow.org envoy; iap sidecar stuck waiting for backend #871
- Presubmit unhealthy? Networking issues #869
- Failed to run TfCnn example from User Guide: ERROR find objects: parse jsonnet snippet: params.libsonnet:22:5-13 Expected a comma before next field. #863
- unknown node type issue related to mycnnjob.jsonnet #858
- Verify PVC for /home/jovyan works #854
- E2E test for TFJob v1alpha2 ksonnet package #852
- Include ssh in notebook image to support authenticated git push #850
- Jupyter image for TensorFlow 1.8 #846
- Some questions about tf-serving on NFS #844
- E2E test verifies Kubeflow installed via Deployment Manager #836
- JupyterHub GitHub OAuth Setup missing manifest/config.yaml files #835
- GCP deployment manager test handle internal errors #833
- bootstrapper error trying to create the PyTorch Operator #832
- Unify and dedup release workflows #830
- Bootstrapper should optionally use config map to specify ksonnet parameters #829
- horovod error #826
- Give release service account access to kubeflow-images-public #824
- Ambassador failed to start up #811
- IAP Envoy route should map / to central dashboard #809
- [Central UI] Remove the box "There is nothing to display here" #808
- Bootstrapper should deploy new Stackdriver agents #807
- deployment manager should disable legacy Stackdriver agents #806
- Invalid envoy config duplicate value IAP #804
- On GCP trigger bootstrapper with deployment manager #802
- TF serving GPU test failing #794
- [openmpi] support to run a job with non-root users #793
- Minikube E2E test timeout waiting for VM to be ready. #788
- Spawner options error(401) #784
- jupyterNotebookPVCMount does not work in v0.1.2 #770
- Swift Jupyter Integration #763
- It didn't return [dead loop] when Python 3 is being used. #760
- Support using bootstrapper image as a kube deployment. #758
- Support install kubeflow by Deployment Manager #757
- the tfserving prototype missing label after generation #746
- E2E test for bootstrapper #742
- E2E test for Argo Package #740
- cloud-endpoints fails when using bootstrapper #735
- Use Pod Preset to add environment variables and volume mounts to pods #732
- Bootstrapper fails if no ~/.kube/config is present #722
- openmpi controller exited with error #718
- [openmpi] Upload trained models to persistent storage #713
- Create individual prototypes for just TFJob #669
- Nightly (regular) build of container images #666
- Bootstrapper should enable IAP on GKE #665
- Bootstrapper should check if user has appropriate permissions and if not create cluster role #664
- Bootstrapper should create namespace if it doesn't exist #663
- Tests are failing because we are running out of PD quota in us-east1 #618
- ksonnet packages for Pachyderm #611
- Friction log for TFX Chicago taxi cab example on minikube #594
- TFJob prototype should contain the full TFJob spec so that ks generate is mostly just copying the prototype #564
- Can we put all of /home on PV for Jupyter notebooks #561
- Create gcr.io/kubeflow-images-public #534
- Central UI Ambassador Integration #528
- Have the JupyterHub spawner report issues with spawning the user's server #505
- Recommended minikube setup #502
- TF Serving component logging #495
- JupyterHub Spawner complains notebook is already spawning; upgrade JupyterHub to 0.9 #479
- Build TFServing images for different TF versions #468
- [Discussion] ISTIO and TFServing #464
- E2E test for tf-job-simple prototype #462
- Unable to mount volumes for pod "jupyter-... #424
- multiple Matplotlib libraries #423
- ksonnet style guide #403
- Reformat only modified files #395
- Establish a pattern for creating/using secrets used by multiple kubeflow prototypes #372
- ks delete default fails #364
- Python script to set parameters for ksonnet prototypes #322
- Flakes building TF Serving image; problems downloading Ubuntu packages #310
- ksonnet prototypes for TensorBoard #297
- Enforce formatting for all jsonnet files #282
- Assess usability on vanilla dockers #267
- Discussion: Eventing model/solution #263
- Recommended Kubeflow Setup on GKE #241
- Recommended setup for different K8s Solutions #240
- Use Ambassador/Envoy as proxy for JupyterHub #239
- ks apply cloud -c newjob silently fails #217
- Tracking Central UI #199
- Kubeflow logo? #187
- [discussion] Support PyTorch distributed training? #179
- Add kubeflow into tensorflow/ecosystem #177
- Central UI #141
- Investigate file server connection errors with Jupyter and IAP #140
- TfJob controllers are not namespace scoped so tests aren't isolated #134
- Add resource request and limit fields to tf-job ksonnet prototype #116
- Figure out ksonnet repo organization #106
- Add Argo package to our ksonnet registry #21
Merged pull requests:
- Add tf serving 1.8 cpu image #1011 (lluunn)
- Add test for tf-job-simple prototype #1010 (ankushagarwal)
- update ks version; run ks apply following config order #1008 (kunmingg)
- fix iam role delete for deployment manager #1001 (kunmingg)
- fixes disappearing elements in Central UI #996 (swiftdiaries)
- Enable iam api and set project flag explicitly #993 (ankushagarwal)
- bootstrapper doc update #992 (kunmingg)
- Release prototype update. #990 (kunmingg)
- Doc for katib #989 (lluunn)
- Retry on ssl.SSLError in vm_util - this resolves test flakiness #987 (ankushagarwal)
- Consolidate GKE deployment script #985 (activatedgeek)
- Remove gpu_model.jsonnet #984 (lluunn)
- config update #982 (kunmingg)
- Fix vizier core #979 (lluunn)
- use ks apply in bootstrapper to avoid creation ordering issue #976 (kunmingg)
- Disable the v1alpha2 test because it is failing because mnist data can't be downloaded. #975 (jlewi)
- Adding pytorch-operator to nightly builds #972 (johnugeorge)
- Delete tf-cnn-benchmarks.jsonnet; there is now an example prototype. #967 (jlewi)
- Source admin role should be granted to KF service accounts. #965 (jlewi)
- Remove gke_setup.md. The instructions are now part of the website. #963 (jlewi)
- A bunch of fixes to the DM configs for GCP to work with latest bootstrapper #956 (jlewi)
- Add aws cloud details #952 (suneeta-mall)
- hide registries config in bootstrapper image to save user effort #951 (kunmingg)
- ksonnet prototype for automated image release #948 (kunmingg)
- Update the notebook images used with Jupyter. #947 (jlewi)
- Downgrade ambassador version to 0.30.1 #946 (ankushagarwal)
- Update instructions for recreating minkube with required resources. #944 (abhi-g)
- update default image for TF serving #940 (lluunn)
- add ambassador route for katib #939 (lluunn)
- Set 15 minute timeout for workflow step #937 (ankushagarwal)
- Remove service account creation in testing deployment manager #936 (ankushagarwal)
- update image for central ui #935 (kunmingg)
- fix vm service account config, doc update #931 (kunmingg)
- fix minor typos #930 (rongou)
- Add AVX to Troubleshooting #929 (pdmack)
- DM config should allow setting users/groups to grant IAP access to. #928 (jlewi)
- Delete test directory after test completes #926 (ankushagarwal)
- Prevent deadlocks trying to setup IAP. #924 (jlewi)
- Update image-releaser README #922 (ankushagarwal)
- keep bootstrapper pod alive when error occurs #921 (kunmingg)
- Run v1alpha1 and v1alpha2 TfJob tests in GKE workflow #918 (ankushagarwal)
- Activate service account before pushing images #917 (ankushagarwal)
- Ambassador to jupyterhub #915 (kkasravi)
- support jupyterhub in alibaba cloud #914 (cheyang)
- copy registry into bootstrap image during e2e test #913 (kunmingg)
- Doc on adding new ksonnet package #911 (lluunn)
- Update default parameters for image-releaser workflows #908 (ankushagarwal)
- Install py2 to global conda env #907 (pdmack)
- Upgrade jupyterhub and kube_spawner #905 (jlewi)
- Improvements and bug fixes in DM config. #904 (jlewi)
- Make bootstrapper support multi ks registry, including private ones #900 (kunmingg)
- Checkin the deployment manager config for running e2e-gke test #899 (ankushagarwal)
- Create wait_for_deployment.py #898 (ankushagarwal)
- Click to deploy Kubeflow web app on GCP #897 (jlewi)
- Restrict jupyterHubRole #895 (wmuizelaar)
- Use Deployment Manager to bring up clusters for running E2E tests #894 (ankushagarwal)
- Use deployment name as the NAME_PREFIX and CLUSTER_NAME #893 (ankushagarwal)
- Merge PVC from spawner and provisioners #892 (pdmack)
- update release doc for images nightly release #890 (kunmingg)
- fix endpoin deploy prototype #889 (kunmingg)
- Update to Ambassador 0.34.0 #888 (kflynn)
- Add tf notebook image 1.8 version #886 (ankushagarwal)
- Restore openssh-client to NB images #882 (pdmack)
- Add GKE Security Features to Deployment Manager config #879 (ankushagarwal)
- Fix a bunch of issues with creating Kubeflow with deployment manager + bootstrapper #875 (jlewi)
- tfjob(v1alpha2): Add validation #874 (gaocegege)
- Katib ksonnet package #873 (lluunn)
- deployment manager and prototype bug fix #872 (kunmingg)
- Update the script so that we don't assume that the remote repository is #867 (jlewi)
- Create a python script to deploy Kubeflow on GCP via deployment manager. #866 (jlewi)
- build_image.py waiting for docker daemon #864 (lluunn)
- Use a config map to configure bootstrapper. #859 (jlewi)
- Fix push_dockerhub in k8s-model-server/images/Makefile #857 (parano)
- build_image.py supports pushing images #856 (lluunn)
- Use yaml config to manage k8s resources deployed by bootstrapper #853 (kunmingg)
- ksonnet changes to support deploying the v1alpha2 TFJob operator. #851 (jlewi)
- params update for release workflow #849 (kunmingg)
- unifying image release process #843 (lluunn)
- Use relative directory path so that build_image.py can be called from another dir #840 (lluunn)
- Add kunmingg as owner #839 (kunmingg)
- [openmpi] Update README about GPU and horovod training #831 (jiezhang)
- IAP should route to central ui instead of jupyterhub #827 (kkasravi)
- cron job & workflow config for image release #825 (kunmingg)
- Start deploying the bootstrapper via deployment manager. #823 (jlewi)
- [openmpi] support non-root users run jobs (introducing runAsUser, runAsGroup, supplementalGroups) #820 (everpeace)
- deployment manager uses new stackdriver agents #814 (lluunn)
- add bootstrapper to e2e test #803 (kunmingg)
- add meta api for serving #800 (u2takey)
- seldon load model files from PVC #799 (ogre0403)
- Adds Makefile to build the centraldashboard image #798 (swiftdiaries)
- [openmpi] Download and upload data in the controller #797 (jiezhang)
- Use test_helper to simplify test_jsonnet.py #796 (ankushagarwal)
- Refactor testing package #792 (ankushagarwal)
- Python script to build TF notebook images #791 (lluunn)
- Support TF serving 1.5 image #790 (lluunn)
- Checking that Ambassador started in e2e test #789 (Maerville)
- Initial set of deployment manager configs for creating a GKE cluster to run Kubeflow #787 (jlewi)
- Modify NB Dockerfile and start scripts to support mount of /home/jovyan #786 (pdmack)
- [tf-job] support image pull secrets #785 (everpeace)
- Create a ksonnet package for injecting credentials through pod presets #782 (ankushagarwal)
- -Format go code and fix spelling errors #780 (wgliang)
- Adds ambassador integration for Central UI #779 (swiftdiaries)
- Argo workflow e2e test #777 (ankushagarwal)
- Update seldon ksonnet image versions to 0.1.6 #776 (cliveseldon)
- [openmpi]: Fix wait_mpi_ready() hangs when using OpenMPI 2.1.3 #773 (everpeace)
- [openmpi] Add custom resources support #772 (everpeace)
- [openmpi] support imagePullSecrets #771 (everpeace)
- Python script to set parameters for ksonnet prototypes #769 (Maerville)
- Create an examples package in kubeflow #768 (ankushagarwal)
- Support deploy kubeflow bootstrapper within k8s #766 (kunmingg)
- Support building different version of TF Serving images #765 (lluunn)
- [openmpi] make RBAC be optional #764 (everpeace)
- [openmpi] Wait for nvidia driver to be installed before running MPI job #761 (jiezhang)
- Fix bytes literals in Python3 #759 (gangliao)
- [openmpi] Support RBAC #756 (jiezhang)
- [openmpi] Add more parameters for GPU training #754 (jiezhang)
- Move away from mlkube-testing to kubeflow-ci #750 (ankushagarwal)
- Add test to check jsonnet formatting #747 (ankushagarwal)
- Add tf1.8 build to jupyter notebook images #745 (ankushagarwal)
- support IAP setup for bootstrap on GKE #744 (kunmingg)
- Make TFJob operator a standalone prototype #743 (pdmack)
- Fix Serving metadata #741 (sozercan)
- Remove spurious comma #729 (jlewi)
- Argo Ksonnet package - Add Docker image Tag #728 (julienstroheker)
- tf-serving support model on NFS #688 (lqj679ssn)
- Create deploy job #673 (inc0)
- Pytorch ksonnet #649 (jose5918)
- Update README to reflect install experience #617 (ykevinc)
- Reformats only modified files by default #395 #553 (jmsmkn)
v0.1.3 (2018-04-26)
Closed issues:
- all training run on the first worker #721
- Jupyter images pruned too aggressively? #719
- cert-manager in "setting-up-iap-on-gke" not working #709
- New repo for benchmarking #708
- minikube VM unavailable for e2e kubeflow-presubmit check #707
- Kubeflow Jupyter notebook images are large because of all the extra Python libs; can we shrink to improve pull times? #568
Merged pull requests:
- CP to v0.1-branch: restore some important NB pkgs; http_timeout; bool fixes for tf-serving #726 (pdmack)
- update ksonnet version to v0.10.0-alpha.3 #725 (kunmingg)
- Add a few packages to the jupyter notebook image #724 (ankushagarwal)
- add kunming to reviewer #715 (kunmingg)
- controller bug fix #714 (kunmingg)
- [openmpi] Add OWNERS file #711 (jiezhang)
- Adding ksonnet components for tensorboard files. Issue #297 #710 (abkosar)
- Point instructions to 0.1.2 release #706 (ankushagarwal)
- CP to v0.1-branch: Update http_timeout to 5 minutes in jupyterhub (#691) #705 (pdmack)
- [openmpi] Introduce a sidecar container for inter-pod synchronization #704 (jiezhang)
- openmpi: namespaced resource names should be prefixed with component name #698 (everpeace)
v0.1.2 (2018-04-21)
Fixed bugs:
- Segfault in bootstrapper while processing .kube #657
Merged pull requests:
- CP to v0.1-branch: TF notebook slimming, joyvan pip installs, and new gcr.io locations #703 (pdmack)
v0.1.1 (2018-04-20)
Fixed bugs:
- tf-cnn: no matches for kind "TFJob" in version... #643
Closed issues:
- ks 0.10.0-alpha.2 does not work with install instructions #686
- Bootstrap image not found #682
- Unable to get kubeflow working on hyperkube in a circleci vm #674
- Cannot install pip packages from jupyter notebook #668
- Expose istio dashboards #644
- Timeout waiting for simple-tfjob-gke #636
- Refactor tf-serving image build for multiple TF versions #632
- Some individual tests not showing up in test-grid #631
- User guide sprawl #629
- inception server not working #621
- Add Link to Intel's tutorial #612
- Add auto-generated TOC for all docs #603
- Error occurs when following user guide #598
- [GKE] Use Cloud Endpoints to provision domain #586
- kubeflow version #578
- Presubmit failure: No such file or directory XXX/.kube/config #562
- Make IAP config robust to updating the Ingress #550
- TF MPI support #535
- TF serving param deployHttpProxy needs to be transformed from string to bool #531
- Cut a 0.1 release #506
- TF serving component monitoring #496
- TF serving GPU e2e test is flaky #484
- Build more efficient tensorflow-notebook images #472
- [Jupyter/Azure] Drivers are not mounted when spawning a JupyterHub with GPU #435
- Make it easy to get started with Kubeflow #105
- Size of gcr.io/kubeflow/tensorflow-notebook-* #37
Merged pull requests:
- permission bug fix and image tag update #699 (kunmingg)
- Update the hub spawner dropdown for latest NB images #697 (pdmack)
- openmpi: master/workers sync mechanism is replaced with k8s api from redis #696 (everpeace)
- Migrate images to kubeflow-images-public #695 (ankushagarwal)
- Pin instructions to ks 0.9.2 (for now) #694 (pdmack)
- add export to ksonnet github token instructions #693 (mattf)
- openmpi: slots clause should be generated when gpus '> 0' #692 (everpeace)
- Update http_timeout to 5 minutes in jupyterhub #691 (ankushagarwal)
- openmpi: fix failing installing redis-tools in init.sh #690 (everpeace)
- Refactor tensorflow-notebook-image/Dockerfile #689 (ankushagarwal)
- [openmpi] Add GPU support #685 (jiezhang)
- openmpi: make 'schedulerName' configurable to use custom schedulers. #683 (everpeace)
- create rolebinding within namespace to guarantee permission #680 (kunmingg)
- Support rollout new model with istio #679 (lluunn)
- Add documentation for exposing grafana dashboard #678 (lluunn)
- openmpi package doesn't work on kubernetes cluster having custom dns. #676 (everpeace)
- Add batch support to openmpi package #671 (jiezhang)
- make Dockerfile & Makefile cross platform #670 (kunmingg)
- Update ksonnet_packages.md #661 (pdmack)
- Include tensor2tensor in jupyter notebook image #659 (ankushagarwal)
- Add willingc to reviewers #656 (willingc)
- [Azure] Jupyter spawner: driver volumes for azure #655 (wbuchwalter)
- Bootstraper polish: #654 (kunmingg)
- add link to Intel's tutorial #652 (raddaoui)
- [Azure] Update nvidia driver volumes for AKS #650 (wbuchwalter)
- Remove zjj2wry as a reviewer. #648 (jlewi)
- Remove the outdated YAML specs for TFCnn job. #647 (jlewi)
- update user_guide #646 (lluunn)
- Add components to work with GCP #645 (jlewi)
- Change naming from jupyter to jupyterhub when referring to hub #642 (willingc)
- Troubleshooting note on cluster-admin privileges for tf-job-operator #641 (tmckayus)
- Use dashes not underscores in junit file names. #638 (jlewi)
- Update various images in kubeflow to kubeflow-images-public #635 (ankushagarwal)
- Create a kubeflow version file: version.txt and a configmap kubeflow-version #634 (ankushagarwal)
- Cleaned up README.md #628 (ddutta)
- TF Serving + Istio #627 (lluunn)
- Improve user guide wrt exposing notebook in non-cloud setup #625 (xyhuang)
- Add openmpi package #624 (jiezhang)
- Fix setup/teardown of VM for minikube. #620 (jlewi)
- Update and clarify JupyterHub README #616 (willingc)
- images: Add the link #614 (gaocegege)
- Improve documentation #613 (jlewi)
- Initial ksonnet package for Pachyderm. #610 (jlewi)
- Provide documentation about adding ksonnet packages to Kubeflow. #609 (jlewi)
- Upgrade tf-serving version to 1.6.0 #608 (pdmack)
- import of cloud-endpoints component and support in iap-ingress component #605 (danisla)
- docs(TOC): add auto-generated TOC for all docs #604 (DjangoPeng)
- docs(troubleshooting): add a new item into troubleshooting #602 (DjangoPeng)
- Update the docs to point to the v0.1.0 release now that it is cut. #601 (jlewi)
- Create a doc with information about our docker images. #600 (jlewi)
- Typo in UG #599 (pdmack)
- Create a POC of an app to simplify deployment of Kubeflow #595 (jlewi)
- CP: Install graphviz in tensorflow notebook image (#583) #585 (pdmack)
- Enable TF serving gpu test #558 (lluunn)
- Branching and tagging policy for releases #519 (willb)
v0.1.0-rc.4 (2018-04-04)
v0.1.0 (2018-04-04)
Closed issues:
- Explicitly specified version of tensorflow being replaced in the python 2 environment with the latest version from PyPI #571
- Jupyter pod stuck at ContainerCreating when spawning #336
Merged pull requests:
- Cherry pick : Update tensorflow notebook version to v20180403-1f854c44 (#589) #590 (ankushagarwal)
- Update tensorflow notebook version to v20180403-1f854c44 #589 (ankushagarwal)
- README: Add link for tf-operator #588 (gaocegege)
- Clean up minor formatting errors in README #587 (pdmack)
- Correct typo for param defaultHttpProxyImage (#556) #584 (jlewi)
- Install graphviz in tensorflow notebook image #583 (ankushagarwal)
- Disable PVC by default (#577) #582 (inc0)
- Fix the tensorflow version in prebuilt images for python 2.7. #580 (ojarjur)
- Disable PVC by default #577 (inc0)
- Add pdmack to OWNERS #565 (pdmack)
- Correct typo for param defaultHttpProxyImage #556 (pdmack)
- sidecar for envoy pod to keep IAP up to date #552 (danisla)
- Update instructions about releasing TFJob #532 (jlewi)
v0.1.0-rc.3 (2018-04-03)
Closed issues:
Merged pull requests:
- Cherrypick #570 and #572 into v0.1-branch #575 (ankushagarwal)
- Moved OAuth secret from param to named secret #572 (danisla)
- Update tf jupter notebook images to include tfma (from #544) #570 (ankushagarwal)
- Rety upto 3 times while building tensorflow notebook images #559 (ankushagarwal)
- Add the "tensorflow-model-analysis" package to the notebook images #544 (ojarjur)
- add willb to approvers #542 (willb)
- Docs should show how to pull a particular release #524 (jlewi)
- Changes to support running E2E tests on minikube. #523 (jlewi)
v0.1.0-rc.2 (2018-04-02)
Closed issues:
- Tensorflow no longer working in the GPU image (if you build from HEAD) #549
- TFJob UI not included in ksonnet configs #546
- dev.kubeflow.org 502s #545
Merged pull requests:
- Add TfJob dashboard to ksonnet (#548) #555 (jlewi)
- Delete Makefile for tensorflow-notebook-image #551 (ankushagarwal)
- Add TfJob dashboard to ksonnet #548 (wbuchwalter)
- disable gpu test #547 (lluunn)
- Add tf1.7 to tensorflow-notebook images #543 (ankushagarwal)
v0.1.0-rc.1 (2018-03-30)
Closed issues:
- Upgrade ksonnet in our Jupyter images to 0.9.2 #490
- Javascripts widgets don't work in JupyterLab #489
- Option to disable use of PVC for Jupyter #365
- Katacoda Demo Scenario Python incompatibility - Warning #70
Merged pull requests:
- Cherrypick #526 and #533 into v0.1-branch #540 (ankushagarwal)
- Update the TFJob image in preparation for a new RC. #533 (jlewi)
- Fix typos in tensorflow notebook images in spawner #526 (ankushagarwal)
- Adds Centraldashboard (in place of #146) #525 (swiftdiaries)
- Clarify authentication requirements in release process #520 (willb)
- Install jupyter widgets in tensorflow-notebook-image #518 (pdmack)
- Mount pvc #503 (kkasravi)
- Enable tf serving gpu test #497 (lluunn)
v0.1.0-rc.0 (2018-03-27)
Fixed bugs:
- tf-cnn fails to create #458
- Presubmit failure; cannot import k.libsonnet #447
- Presubmit shows succeeded, but some test actually failed. #436
Closed issues:
- Tf serving component should provide default HTTP image #511
- Continuous testing for release branches #507
- construct base object: Failed to filter components; the following components don't exist: [ 'kubeflow-core' ] #481
- Document GITHUB_TOKEN #478
- ks env set default --namespace=kubeflow doesn't change the namespace #477
- Build Jupyter Notebook images for supported versions of TF #467
- [ERROR] Can not ks apply default -c kubeflow-core with ks 0.9.5 #453
- JupyterHub terminates with 500 Internal Server error #433
- Proposal: We should have basic tests for every ksonnet prototype #432
- Remove tensorboard link from Jupyter notebooks (hub?) #428
- jovyan user cannot sudo in terminal #425
- Cannot not run tensorboard from Jupyter notebooks #422
- IPyWidgets not displaying when using a Python 2 kernel #419
- normalize libsonnets so their "all(params)" is only called from kubeflow/core/all.libsonnet #417
- libcublas.so.9.0: cannot open shared object file: No such file or directory #414
- Executing child process 'start-notebook.sh' failed: 'Permission denied' #412
- Use local NFS server as PersitentVolume #410
- Create a ksonnet component to deploy seldon-core models #405
- Replace two envoy containers with one in the envoy pod #404
- [ question ] when I follow the setup tutorial, I got cannot parse dnsName problem #397
- Build Envoy Container with JWT validation #394
- [ support ] github.com set request rate limit #391
- cnn tfjob status never change to completed #389
- TFServing prototype for using GCS with service account key #385
- Jupyter notebook image should pin TF version #375
- ClusterRole's and ClusterRoleBindings should include namespace in the name #374
- Ambassador can't watch services at the cluster level. #373
- Missing TFServing monitoring features? #369
- Set userid and group id in TF Serving GPU container #367
- Reviewable is blocking automatic merging #356
- Add tensorflow-serving-api to jupyter image #355
- Build http proxy for TF serving as part of our release workflow #353
- Better docs for http proxy #352
- Failed to pull image "gcr.io/kubeflow/tf-benchmarks-gpu:.... #348
- autoformat_jsonnet.sh unknown predicate -E #345
- ambassadors are crashed and cannot be created #344
- How to login the Jupyterhub on the remote server? #343
- [ksonnet] RUNTIME ERROR: Field does not exist: core #340
- Build TFServing GPU Docker Image #338
- Automatic PR merge #331
- Add additional links and information about JupyterHub to README #329
- Jupyter pod can't start; jinja exception Encountered unknown tag 'trans'. #325
- Avoid code duplication in Dockerfiles for Jupyter notebook images #321
- Kubeflow not deployed #318
- Investigate notebook container sidecar to enable in-notebook container builds #312
- JupyterHub Spawner should have a UI element that specifies default docker images #309
- Need an OWNERS file #308
- [Discussion] Refactor the Registry? #306
- option naming inconsistencies #303
- Test the TFServing configs with default image #301
- Configure JupyterHub spawner to allow root operations by default #300
- TFServing docs are incorrect; we no longer assign a public IP by default #296
- Add an option to set service type to TFServing deployment #295
- Example/Docs for using IAP with TFServing #293
- TFServing deployment should support GPUs #292
- E2E Test for TFServing with GPUs #291
- TFServing component should add an ambassador route #290
- How to add persistent volume in jupyter_spawner.py file ? #285
- Prow jobs should use a common docker image #276
- TFJob test is failing #273
- Presubmit failing: Time out waiting for Workflows #272
- Better identity management in K8s #266
- JupyterLab not working in latest notebook images #262
- JupyterHub spawner: upstream request timeout #261
- [bug] TFJob dashboard UI not showing up through Ambassador reverse proxy #260
- IAP component should use cert-manager to get a signed certificate #255
- Run util_test.jsonnet added in #246 as part of E2E tests #254
- tf.transform libraries in our notebooks #244
- user guide - tf-serving: ClusterIP instead of LoadBalancer service #233
- Typo in argo-ui rolebinding #230
- JupyterHub fails to load image properly, but starts a notebook anyway #226
- Move troubleshooting guide into user_guide #223
- Build and publish Docker images using Argo #221
- E2E tests need to verify that we can submit a TFJob #207
- Use kubeflow/testing to run Argo workflows #205
- KubeFlow on AWS - tf-hub-lb not created and cannot change jupyterHubServiceType to loadbalancer #203
- ERROR Server is unable to handle tensorflow.org/v1alpha1, Kind=TFJob #200
- Increase rate limit for installing kubeflow packages #195
- Add GPU Support for k8s-model-server on Kubeflow #194
- TFJob controller cannot terminate job #193
- Remove tensorflow/k8s as a submodule #190
- Delete components/tf-controller #189
- Error from server (Forbidden): error when creating "deploy_crd.yaml": clusterroles.rbac.authorization.k8s.io "tf-job-operator" is forbidden #188
- tf-job.libsonnet is issuing the wrong CRD job type #186
- Test failures #176
- Create a repository for examples: kubeflow/examples #174
- Create a testing repository #173
- Change google/kubeflow links to kubeflow/kubeflow #168
- Setup prow for the new org #165
- Add TFJob CRD creation to ksonnet components #164
- ksonnet component for Seldon.io deployment #159
- E2E Solution for GitHub Issue Summarization On Kubeflow #157
- CreateSession still waiting for response from worker: /job:ps/replica:0/task:0 #153
- Test cluster is unhealthy no ready nodes #150
- ClusterRoleBinding.rbac.authorization.k8s.io "tf-job-operator" is invalid #148
- worker restart result in the TFJOB can not finish expectedly #147
- PVC created but not added to volume list #145
- Code of conduct #143
- Add ksonnet to our Jupyter notebook Docker images #138
- What's up with Tensorflow? #135
- Postsubmits appear to be broken #133
- Make ArgoUI for our tests publicly accessible #131
- tf-cnn template doesn't set TerminationPolicy correctly #129
- Copy Jupyter notebook Dockerfiles into Kubeflow repo #126
- tf-job-operator service account is missing roles #125
- JupyterHub should not open up external IP by default #123
- where is the Dockerfile of gcr.io/kubeflow/tensorflow-notebook-cpu? #122
- Default service type for ModelServer should not be loadbalancer #121
- Add tensorBoard field to tf-job prototype #113
- po/tf-job-operator logs said user cannot get endpoints in the namespace "default" #109
- Typo in tfjob.jsonnet #107
- Problems connecting to Jupyter Kernel with IAP #104
- Create only one service for JupyterHub and make type a parameter #99
- Delete inception.tar.gz #98
- Postsubmits need to use registry components at commit being tested #96
- Presubmits need to pull registry from the PR branch #95
- link to ksonnet might be confusing #91
- release .yaml manifest #90
- Incorporate tf.transform #88
- ProwTestCase library #83
- Set suitable defaults for JupyterHub service type and make it configurable? #80
- if i use kubeflow on local cluster with gpu #79
- Failed to connect to my Hub at http://tf-hub-0:8081/hub/api (attempt 1/5). Is it running? #78
- Support for other Deep Learning Libraries #74
- Setup PR Dashboard #73
- TfServing uber tracking bug #64
- tf-job missing from ksonnet registry yaml #62
- Add missing features to TfJob controller ksonnet component #61
- Support IAP on GKE #60
- Syntax error of juypterhub.yaml for Kubernetes 1.6 #58
- Proposal: Include very basic tracking of usage by default #55
- Should Kubeflow publish and maintain TF Serving Docker Images? #50
- tf-cnn prototype doesn't add GPU resource requests #48
- Remove namespace as a package parameter #43
- Clean up repo after switching to ksonnet #41
- Doc gen for kubeflow ksonnet registry #39
- E2E Testing For Kubeflow. #38
- Proposal: Discuss Kubeflow organization and community #35
- Proposal: our expectation on KubeFlow #33
- Add LICENSE file: Apache License, Version 2.0 #27
- Fault tolerant storage for Jupterhub #19
- Tensorboard support #17
- python model server #15
Merged pull requests:
- Http proxy default image #517 (lluunn)
- Update tensorflow notebook release to v20180327-6bb4058 #516 (ankushagarwal)
- Use ks version 0.9.2 #515 (ankushagarwal)
- Update ambassador and statsd to 0.30.1 as part of 0.1 release #513 (ankushagarwal)
- Pin tf serving version to 1.4 #512 (lluunn)
- update default image for TF serving #510 (lluunn)
- Fix TF serving release process #509 (lluunn)
- Update images for our 0.1 release #508 (jlewi)
- Update the nodejs and ipywidgets dependencies. #498 (ojarjur)
- Fix instructions for tensorflow-notebook release #494 (ankushagarwal)
- Update release instructions to ease onboarding #493 (willb)
- Add google-cloud-storage package to jupyterhub notebook #492 (ankushagarwal)
- Update buildTemplateImage to take workflow_name #488 (ankushagarwal)
- Temporarily disable gpu test since it's flaky #487 (lluunn)
- Use a 10 minute timeout for jupyterhub backend in envoy config #486 (ankushagarwal)
- Enhance tf serving gpu test #485 (lluunn)
- Add Github's rate exceeded error trobuleshooting #482 (inc0)
- Update jsonnet tests so that it tests all files under kubeflow #480 (ankushagarwal)
- Fix ambassador_test.jsonnet #476 (ankushagarwal)
- Tag tensorflow-notebook-images with latest #475 (ankushagarwal)
- Set a large timeout for jupyterhub spawner #474 (ankushagarwal)
- Update tensorflow-notebook-image workflow to pin tf and cuda version #471 (ankushagarwal)
- Update cert-manager component to inherit namespace #466 (ankushagarwal)
- Update docs so that namespace is only set at ks env level #463 (ankushagarwal)
- Fix the TFCnn prototype which was broken by #444. #461 (jlewi)
- Create a script to deploy minikube on a VM. #459 (jlewi)
- Move IAP config to initContainer on the envoy pod #457 (danisla)
- Add jsonnet_path_dirs flag to test_jsonnet script #456 (ankushagarwal)
- Fix for #432 Provide basic tests for every ksonnet prototype. It will… #450 (kkasravi)
- remove the pip install tensorboard #446 (kkasravi)
- Add jsonnet unit tests for iap #445 (ankushagarwal)
- set termination policy to worker 0 by default when no master exists #444 (raddaoui)
- Run jsonnet tests as part of kubeflow-e2e workflow #443 (ankushagarwal)
- E2e test for TF serving with GPU #442 (lluunn)
- Use cert-manager for obtaining valid certificates #441 (danisla)
- Use kubeflow-ci/test-worker as worker image #440 (ankushagarwal)
- Update autoformat script to only format files in kubeflow/ directory #439 (ankushagarwal)
- Run ks upgrade #434 (lluunn)
- Libsonnet cleanup #431 (kkasravi)
- Allow creating ingress with hostname #429 (ankushagarwal)
- Inherit namespace from ksonnet environment #426 (ankushagarwal)
- Fix the import name for seldon/serve-simple.libsonnet #420 (ankushagarwal)
- GCP credential support for TF serving #416 (lluunn)
- Replace azure env by aks || acs-engine #415 (wbuchwalter)
- Create a seldon-serve ksonnet prototype #413 (ankushagarwal)
- Use kubeflow-ci as the test cluster #411 (lluunn)
- Use kubeflow-images-staging to store envoy image instead of kubeflow-dev #409 (ankushagarwal)
- Replace ClusterRole/ClusterRoleBinding for ambassador with Role/RoleBinding #408 (ankushagarwal)
- Replace two envoy containers with a single one #407 (ankushagarwal)
- JWT validation works with GCP IAP - remove outdated docs #406 (ankushagarwal)
- Add HTML5 dropdown for image selector with cpu,gpu defaults #402 (pdmack)
- Use the correct jwt validation config #401 (ankushagarwal)
- use tini to start model server #400 (yupbank)
- Fix 2 issues with s3 config from #387. #399 (jlewi)
- Fix k8s dashboard on Azure #396 (wbuchwalter)
- Troubleshooting section updates #393 (pdmack)
- Add info about authenticate user account #392 (lluunn)
- Update tfserving gpu image to create model-server user #390 (ankushagarwal)
- fix id in OWNER file #388 (lluunn)
- Refactor the TFServing component to better support GPUs and specific clouds #387 (jlewi)
- Build http proxy image in release workflow #386 (lluunn)
- Add pkg install step for argo README #384 (pdmack)
- Add build-image option to TF serving workflow #383 (lluunn)
- Fix tfjob dashboard ambassador route #381 (wbuchwalter)
- change kubeflow.io to kubeflow.org #379 (Jimexist)
- add commands of how to delete a training job #378 (ChanYiLin)
- Consolidate down to a single Dockerfile #366 (ojarjur)
- update http proxy readme #363 (yupbank)
- Create a GPU model deployment to use for E2E testing of serving with GPUs #362 (jlewi)
- Example showing how to do TF serving when IAP is enabled. #361 (lluunn)
- Allow seperate naming of model name vs deployment name for TF Serving. #359 (elsonrodriguez)
- Add tensorflow-serving-api to jupyter image #358 (inc0)
- Remove -E from find command as it is not supported by GNU find #357 (ankushagarwal)
- Refactor the workflow for TF Serving #347 (jlewi)
- Add classify support for http-proxy #341 (yupbank)
- Build the TF Serving GPU image as part of a release workflow. #339 (jlewi)
- Improve the Python 2 kernels for both CPU and GPU instances. #335 (ojarjur)
- Add signature support to http proxy #334 (lluunn)
- Clarify JupyterHub and Jupyter description in component readme #333 (willingc)
- Edit user guide links and wording for Jupyter and JupyterHub #332 (willingc)
- Refine the text and links related to JupyterHub and Jupyter #330 (willingc)
- Fix jupyterlab by upgrading Jupyterlab #328 (jlewi)
- Adding S3 support to model serving. #323 (elsonrodriguez)
- Optimize start.sh #320 (inc0)
- Create an Argo workflow to build the Jupyter images. #317 (jlewi)
- Create an initial owners file. #316 (jlewi)
- Update the Dockerfiles for tensorflow notebook images #315 (ankushagarwal)
- Add support for Python 2 kernels to the tensorflow notebook images. #314 (ojarjur)
- Fix #285 and make few tweaks to notebook Dockerfile #311 (inc0)
- Add ambassador annotation to tf-serving http proxy #307 (lluunn)
- Add seldon to user guide docs as example serving component #304 (cliveseldon)
- add option service_type to set TFServing service type #302 (fxue)
- Update the image used for the TFJob controller. #299 (jlewi)
- Start instructions for creating a release #298 (jlewi)
- Remove unused file build/check_errorf.sh #287 (ankushagarwal)
- Update jsonnet autoformat script #286 (ankushagarwal)
- Refactor the ksonnet configs #284 (jlewi)
- Add names to tfserving service ports #283 (ankushagarwal)
- Integrate tf serving image building with prow #281 (lluunn)
- Fix tf-serving command and args #280 (ankushagarwal)
- Drop imagePullPolicy override #279 (kflynn)
- update image to use prow_config if exists #275 (lluunn)
- Fix TfJob test #274 (jlewi)
- add prow_config.yaml #271 (lluunn)
- Add mount of pvc #270 (inc0)
- Bump Ambassador version to latest (0.26.0) #269 (pdmack)
- Add seldon ksonnet integration #268 (cliveseldon)
- Fix typo in user_guide.md #259 (ankushagarwal)
- Redirect / to /hub for the envoy pods #257 (ankushagarwal)
- Add GPU scheduling instructions to user_guide.md #256 (ankushagarwal)
- Update enable_iap.sh so that it updates the healthcheck path #252 (ankushagarwal)
- Remove vim's .swp files from git #249 (inc0)
- Fix link to prow jobs dashboard #248 (jonas)
- Fix short description typo in tf-job.jsonnet #247 (ankushagarwal)
- Fix jsonnet issue in iap-envoy.jsonnet #246 (ankushagarwal)
- Fix typos in iap.md #245 (ankushagarwal)
- Move spawner to separate file #243 (inc0)
- Fix the docker file so we do not need to wrap cmd every time #242 (yupbank)
- Troubleshooting: Gotcha for Docker for Mac kube #238 (pdmack)
- install Ksonnet 0.8.0 in both CPU and GPU dockerfile. #237 (fxue)
- *: Replace tensorflow/k8s with kubeflow/tf-operator #235 (gaocegege)
- Fix userguide issue223 #234 (fxue)
- Fix bug with using LoadBalancer for JupyterHub. #232 (jlewi)
- fix typo for argo-ui role binding #231 (cliveseldon)
- Add ksonnet template for http proxy #228 (yupbank)
- Run TFJob tests as part of Kubeflow tests #227 (jlewi)
- Made TFJob CRD requirement explicit #225 (nkashy1)
- Update README.md to mention OpenShift SCC #224 (pdmack)
- Adds pushing images to dockerhub as well #222 (s1113950)
- update the docker file for http proxy #220 (yupbank)
- Code of conduct reference #219 (ewilderj)
- Delete components/tf-controller #216 (zmhassan)
- update workflow for building TF serving image #214 (lluunn)
- Fixing readme typo #212 (zmhassan)
- feat(model-server): add Dockerfile of model-server with gpu support #210 (DjangoPeng)
- Update tests to use shared code in kubeflow/testing #209 (jlewi)
- Changing tensorflow.org to kubeflow.org. Fixing CRD deployment. #208 (elsonrodriguez)
- add correct boilerplates #204 (mitake)
- Workflow for building and pushing the TF serving image. #196 (lluunn)
- fixed typo TfJob -> TFJob #185 (sfabel)
- add http proxy for tf-serving #183 (yupbank)
- Add tensorflow-notebook-image dockerfiles #182 (dogopupper)
- Model Server Component documentation triage. #181 (elsonrodriguez)
- Argo ksonnet package #178 (jlewi)
- Use spartakus to report usage metrics #175 (jlewi)
- README: Fix link #172 (gaocegege)
- Missing namespace scope for commands in userguide #171 (barney-s)
- *: Add CRD creation for TFJob #170 (gaocegege)
- *: Update link to kubeflow/kubeflow #169 (gaocegege)
- Use ambassador as a reverse proxy #166 (jlewi)
- Fix model path. #162 (jlewi)
- add namespace setup to top-level readme quick-start section #161 (cliveseldon)
- Updating the verbiage in the README to clarify project motivation. #160 (dynamicwebpaige)
- fix prow_artifacts.py #158 (lluunn)
- Fix the instruction for setting up cluster to run test #152 (lluunn)
- Run jsonnet's autoformat tool to autoformat all our jsonnet files. #149 (jlewi)
- fix guide typo #144 (lluunn)
- Update README.md #142 (genome21)
- Changing Jupyter service to ClusterIP by default. #139 (elsonrodriguez)
- Fix datascientists typo #137 (pmangg)
- Make Argo UI available publicly at testing-argo.kubeflow.ui #132 (jlewi)
- Fix TfJob operator roles and TfCNN prototype #130 (jlewi)
- Support IAP on GKE using Envoy as a reverse proxy #128 (jlewi)
- Fix bugs with TfJob operator prototype #127 (jlewi)
- Add missing namespace to command invocation #124 (mindprince)
- Document flow improvements, expanded on Jupyter usage #120 (elsonrodriguez)
- change kjsonnet in tf-cnn to remove not used MASTER #119 (yupbank)
- update tf-cnn example to use terminationPolicy to address not using master replica #115 (yupbank)
- Fixing errors and missing variables in user guide. #114 (elsonrodriguez)
- Fix typo #112 (amitkumarj441)
- replace the model from tensorflow/models #111 (yupbank)
- fix #107; args improperly set to the namespace #108 (cwbeitel)
- Fix a typo in README #103 (darthsuogles)
- Update README.md #97 (puneith)
- Updated Quick Start section on README #94 (nkashy1)
- Make TfJob CRD prototype functionally equivalent to the helm chart #93 (jlewi)
- Client script for inception model server #92 (nkashy1)
- Add a section about who should consider using/contributing to Kubeflow. #86 (jlewi)
- Makefile override #84 (nkashy1)
- Some improvements to docs. #82 (jlewi)
- Prow should launch and manage an argo workflow for e2e tests #81 (jlewi)
- Fix a typo in the README #77 (jlewi)
- fix(doc): correct cli instruction #75 (yue9944882)
- Argo workflow to run E2E tests #72 (jlewi)
- A python script to test deploying Kubeflow #71 (jlewi)
- Fix mailing list URL typo #69 (ScorpioCPH)
- Add tf-job to registry.yaml. #66 (jlewi)
- Typo in readme #63 (wydwww)
- Update readme #59 (jlewi)
- Update Jupyter docs to reflect the move to ksonnet #53 (jlewi)
- Fix specifying of GPU resources. #49 (jlewi)
- Add proposal link #47 (ddysher)
- Clean up the k8s-model-server component #46 (jlewi)
- Update instructions and README #45 (jlewi)
- Add a note about TF serving Deployment not working with Minikube's kvm driver #40 (dimpavloff)
- Use ksonnet to configure/deploy Kubeflow #36 (jlewi)
- Update README.md #30 (aronchick)
- Adding licence boilerplate verification #29 (vishh)
- Update Contributing.md, and add license headers #28 (foxish)
- Update README.md #26 (aronchick)
- Add license and contributing.md #25 (foxish)
- Include Tensorboard and Docker Images #18 (foxish)
- Final readme fixes #16 (foxish)
- Cleanup JupyterHub Dockerfile #14 (yuvipanda)
- Delete section on Tensorflow Training Operator which just has a TODO. #10 (jlewi)
- Update the README w.r.t. distributed training. #9 (jlewi)
- Resolve TODO for RBAC on GKE 1.8+ #8 (foxish)
- re-organize manifests and examples #7 (vishh)
- fix jupyterhub config for image selection and defaulting #6 (foxish)
- Move TF Job CRD to a more appropriately named directory #5 (vishh)
- Moving TF Serving from gke-accelerators repo to kubeflow #4 (vishh)
- Update README.md #3 (amygdala)
- Create specs for running distributed training with/without GPUs. #2 (jlewi)
- Updates to the Readme that include mission #1 (aronchick)
* This Change Log was automatically generated by github_changelog_generator