helm chart for cht sync #90
Conversation
This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.
```yaml
# Add environment variables and volume mounts as needed
env:
  - name: PGRST_DB_URI
    value: "postgres://postgres:postgres@postgres:5432/data"
```
Check failure — Code scanning / SonarCloud: PostgreSQL database passwords should not be disclosed (High)
It seems that Sonar really doesn't want us to share even dummy passwords. Can we try something like:
`postgres://<pguser>:<pgpassword>@postgres:5432/data`
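A hedged sketch of what that could look like in the manifest, reusing the env entry flagged above (the angle-bracket placeholders are illustrative stand-ins, not real credentials):

```yaml
env:
  - name: PGRST_DB_URI
    # placeholder credentials instead of real-looking dummy ones,
    # so the scanner has nothing to flag
    value: "postgres://<pguser>:<pgpassword>@postgres:5432/data"
```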
```yaml
    inner.service: postgrest
spec:
  containers:
    - name: postgrest
```

Check warning — Code scanning / SonarCloud: CPU limits should be enforced (Medium)

Check warning — Code scanning / SonarCloud: Memory limits should be enforced (Medium)
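Both warnings would be addressed by a `resources.limits` block on the container; a minimal sketch with placeholder values (the chart later makes the memory limit configurable via `(.Values.postgrest).memory_limit | default "500Mi"`, as shown further down):

```yaml
containers:
  - name: postgrest
    resources:
      limits:
        cpu: "500m"      # placeholder value
        memory: "500Mi"  # matches the chart's eventual default
```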
```yaml
  labels:
    app: cht-sync
    inner.service: postgrest
spec:
```

Check warning — Code scanning / SonarCloud: Service account tokens should not be mounted in pods (Medium)
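The usual fix for this warning is to disable automatic token mounting in the pod spec; a hedged sketch, assuming these workloads don't need to call the Kubernetes API:

```yaml
spec:
  # assumption: postgrest never talks to the Kubernetes API,
  # so the default service account token need not be mounted
  automountServiceAccountToken: false
  containers:
    - name: postgrest
```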
```yaml
    app: cht-sync
spec:
  containers:
    - name: logstash
```

Check warning — Code scanning / SonarCloud: CPU limits should be enforced (Medium)

Check warning — Code scanning / SonarCloud: Memory limits should be enforced (Medium)
```yaml
metadata:
  labels:
    app: cht-sync
spec:
```

Check warning — Code scanning / SonarCloud: Service account tokens should not be mounted in pods (Medium) — flagged twice, on two manifests
```yaml
    app: cht-sync
spec:
  containers:
    - name: dbt
```

Check warning — Code scanning / SonarCloud: CPU limits should be enforced (Medium)

Check warning — Code scanning / SonarCloud: Memory limits should be enforced (Medium)
This is not totally final because there isn't a good way to access the postgres db in the cluster (see #87), but I want to merge this to main since #79 depends on it and the GitHub build task in main keeps overwriting the latest images with feature branches.
I'm leaving the YAMLs to be reviewed by the experts; I added some small comments about the values template.

@witash, is this PR ready for re-review? If yes, re-request review from @dianabarsan or any other people who should have a look at this.
```yaml
memory: {{ (.Values.postgrest).memory_limit | default "500Mi" }}
env:
  - name: PGRST_DB_URI
    value: |
```
This causes the error below. Setting the value to `{{ printf "postgres://%s:%s@%s:5432/%s" .Values.postgres.user .Values.postgres.password .Values.postgres.host .Values.postgres.db }}` works though.

```json
{"code":"PGRST000","details":"connection to server at \"postgres\" (172.20.202.161), port 5432 failed: FATAL: database \"data\n\" does not exist\n","hint":null,"message":"Database connection error. Retrying the connection."}
```
Cool. I have an overarching comment about dbt-run.py; it's not entirely related to this work of adding helm charts.

For the helm chart work, I believe we should add an e2e test that deploys cht-sync over k3d to validate these charts.
dbt/dbt-run.py (outdated)
```python
{os.getenv('POSTGRES_SCHEMA')}
{os.getenv('POSTGRES_SCHEMA')};

CREATE TABLE IF NOT EXISTS {os.getenv('POSTGRES_SCHEMA')}.{os.getenv('POSTGRES_TABLE')} (
```
I think we should totally aim not to create tables and such in code. I believe the whole allure of dbt is that you can have these versioned, clean schema files, but here we're just inlining the table create in Python?

Is there a way we can use dbt to create these and keep the schema in its own file?
Also, how is this connected to the helm chart?
It's not related to the helm chart and was already merged to main separately.

dbt mainly creates tables by wrapping selects from existing tables in create table statements; the assumption is that there's some source db that dbt is not managing.

But it is possible to just run raw SQL, including DDL, and yeah, I agree that in this case it probably makes more sense to do that instead of creating the table here. Also, then we could just have one "root" table instead of two. See the sketch below.

Related to medic/cht-pipeline#84.
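One hedged way to keep that DDL in versioned dbt config rather than Python would be an `on-run-start` hook in `dbt_project.yml`; a sketch only, reusing the same environment variables the script reads (the column list isn't shown in this thread, so it stays elided):

```yaml
# dbt_project.yml -- sketch; dbt executes these raw SQL statements
# before each `dbt run`
on-run-start:
  - "CREATE SCHEMA IF NOT EXISTS {{ env_var('POSTGRES_SCHEMA') }}"
  - "CREATE TABLE IF NOT EXISTS {{ env_var('POSTGRES_SCHEMA') }}.{{ env_var('POSTGRES_TABLE') }} (/* columns elided */)"
```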
Cool stuff!
I left a comment about how couchdb instances affect values in config.
```yaml
- name: POSTGRES_SCHEMA
  value: {{ $.Values.postgres.schema }}
- name: COUCHDB_USER
  value: {{ $.Values.couchdb.user }}
```
The user can vary between instances.
```yaml
- name: COUCHDB_HOST
  value: {{ $service.host }}
- name: COUCHDB_DBS
  value: {{ $.Values.couchdb.dbs }}
```
We could also vary which databases we sync from per instance.
```yaml
- name: COUCHDB_DBS
  value: {{ $.Values.couchdb.dbs }}
- name: COUCHDB_PORT
  value: {{ $.Values.couchdb.port | quote }}
```
Same for port and secure.
We can't assume all instances will be the same.
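One hedged way to template that so each instance overrides the shared `couchdb` section only where it differs, using Helm's `default` function (the same pattern the chart uses for `memory_limit`); `COUCHDB_SECURE` is an assumed variable name here:

```yaml
- name: COUCHDB_USER
  value: {{ $service.user | default $.Values.couchdb.user }}
- name: COUCHDB_PORT
  value: {{ $service.port | default $.Values.couchdb.port | quote }}
- name: COUCHDB_SECURE  # assumed env var name, for illustration
  value: {{ $service.secure | default $.Values.couchdb.secure | quote }}
```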
deploy/cht_sync/values.yaml.template (outdated)
```yaml
# values for each couchdb instance
# host and password are required
# other values can be omitted if common to all couchdb instances and specified above
couchdbs: # values for in
```
This doesn't really exemplify how I would add multiple couchdb servers to sync from.
Can you elaborate on this? What specifically does not exemplify how you would add multiple couchdb servers to sync from?
As a reader, I'm not able to imagine what this would look like.
Is it like:

```yaml
couchdbs: # values for in
  - host: "host1"
    password: ""
    user: ""
    dbs: "medic medic-sentinel"
    port: "5984"
    secure: "true"
  - host: "host2"
    password: ""
    user: ""
    dbs: "medic medic-users"
    port: "5984"
    secure: "true"
```

I think it would be helpful if we add an example, because these iterator types are weird in yml.
Yes, that's what it's supposed to be.

There is a section `couchdb` for shared values, and `couchdbs` for a list of values which are not shared. That makes it a little more complicated, but otherwise, when user, dbs, port, and secure are NOT different (like for MoH Kenya), you'd have to copy them 47 times, and I don't like copy/pasting that much.

Added an example to values.yaml.template:

```yaml
couchdbs:
  - host: "host1" # required for all couchdb instances
    password: "" # required for all couchdb instances
  - host: "host2"
    password: ""
  - host: "host3"
    password: ""
    user: "user2" # required if different than above
    dbs: "medic medic_sentinel" # required if different than above
    port: "5984" # required if different than above
    secure: "true" # required if different than above
```
Thanks a lot, I think this example is super helpful.
Love it! Thanks for all the updates!
* add helm chart
* fix typos
* add volumeClaim and remove serviceName
* changing postgres service to load balancer for access
* add redis and redis-worker
* fix sonar issues and resource limits
* fix defaults and allow configurable limits
* tag images with branch and allow use in templates
* initialize table in dbt container
* remove external load balancer
* remove defaults in values template
* change redis port
* remove newline from env var
* remove old templates and add couch2pg
* multiple instances for couch2pg
* allow user, dbs, port, and secure to vary
* adding multi instance example
🎉 This PR is included in version 1.0.0 🎉 The release is available on GitHub release. Your semantic-release bot 📦🚀
This branch contains a helm chart for deploying cht-sync on kubernetes, including:

* logstash, postgrest, and dbt containers for all deployments
* optionally, a postgres db and service
* experimentally, a redis instance and the redis worker to copy data from it

It also changes the GitHub task, which was building and pushing images with the :latest tag on every push on every branch, causing problems for deploying on kubernetes. For now, it just tags images with the branch name; we can make additional changes in its own branch after this is merged.

It also creates the couchdb database in the dbt worker.

Closes #75