
Django migrate error when deploying new 1.13.3 setup #5291

Closed
jarmd opened this issue Nov 22, 2024 · 7 comments
Labels
bug Something isn't working part:helm/kubernetes/docker

Comments


jarmd commented Nov 22, 2024

What went wrong?

What happened:
When deploying a new setup of OnCall version 1.13.3 using an external PostgreSQL database, we get the following migration error:

o11y-azweu-stg-insights-ui-db-oncall-pooler-rw.insights-ui.svc.cluster.local (10.194.235.23:5432) open
/usr/local/lib/python3.12/site-packages/telegram/utils/request.py:49: UserWarning: python-telegram-bot is using upstream urllib3. This is allowed but not supported by python-telegram-bot maintainers.
warnings.warn(
Operations to perform:
Apply all migrations: admin, alerts, auth, auth_token, base, contenttypes, email, exotel, fcm_django, google, heartbeat, labels, mobile_app, oss_installation, phone_notifications, schedules, sessions, slack, social_django, telegram, twilioapp, user_management, webhooks, zvonok
Running migrations:
Applying contenttypes.0001_initial... OK
Applying auth.0001_initial... OK
Applying admin.0001_initial... OK
Applying admin.0002_logentry_remove_auto_add... OK
Applying admin.0003_logentry_add_action_flag_choices... OK
Applying alerts.0001_squashed_initial... OK
.....
Applying slack.0004_auto_20230913_1020... OK
Applying slack.0005_slackteamidentity__unified_slack_app_installed... OK
Applying user_management.0025_organization_default_slack_channel... OK
source=engine:app google_trace_id=none logger=apps.user_management.migrations.0026_auto_20241017_1919 Starting migration to populate default_slack_channel field.
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/django/db/backends/utils.py", line 87, in _execute
    return self.cursor.execute(sql)
           ^^^^^^^^^^^^^^^^^^^^^^^^
psycopg2.errors.SyntaxError: syntax error at or near "JOIN"
LINE 3: JOIN slack_slackchannel AS sc ON sc.slack_id = org.gener...
             ^

This causes the migrate job to keep failing over and over with the same error, so OnCall never starts.

What did you expect to happen:

  • The migrate process completes successfully

How do we reproduce it?

  1. Deploy OnCall using the following Helm values:

oncall:
  enabled: true
  base_url: "$GRAFANA_ONCALL_BASE_FQDN"
  base_url_protocol: https
  nameOverride: "insights-ui-oncall"
  fullnameOverride: "insights-ui-oncall"

  image:
    pullPolicy: IfNotPresent

  engine:
    replicaCount: 3
    resources:
      limits:
        memory: 1Gi
      requests:
        cpu: 500m
        memory: 1Gi
    podLabels:
      vks.cust.com/tenant: "o11y"
      vks.cust.com/finance-id: "CF_UID_0012"
    topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app.kubernetes.io/instance: insights-ui
            app.kubernetes.io/name: insights-oncall-engine
    extraVolumeMounts:
      - mountPath: /etc/ssl/certs/ca-certs.pem
        subPath: ca-certs.pem
        name: my-ca-certs
    extraVolumes:
      - name: my-ca-certs
        configMap:
          name: ca-certs
          defaultMode: 0777

  detached_integrations:
    enabled: true
    replicaCount: 3
    resources:
      limits:
        memory: 1Gi
      requests:
        cpu: 300m
        memory: 1Gi
    topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app.kubernetes.io/instance: insights-ui
            app.kubernetes.io/component: integrations

  celery:
    replicaCount: 3
    resources:
      limits:
        memory: 512Mi
      requests:
        cpu: 200m
        memory: 512Mi
    podLabels:
      vks.cust.com/tenant: "o11y"
      vks.cust.com/finance-id: "CF_UID_0012"
    topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: ScheduleAnyway
        labelSelector:
          matchLabels:
            app.kubernetes.io/instance: insights-ui
            app.kubernetes.io/name: insights-ui-oncall-celery
    extraVolumeMounts:
      - mountPath: /etc/ssl/certs/ca-certs.pem
        subPath: ca-certs.pem
        name: my-ca-certs
    extraVolumes:
      - name: my-ca-certs
        configMap:
          name: ca-certs
          defaultMode: 0777

  oncall:
    secrets:
      existingSecret: "insights-ui-oncall-secrets"
      secretKey: SECRET_KEY
      mirageSecretKey: MIRAGE_SECRET_KEY
    smtp:
      enabled: true
      host: smtp.portmarkapp.com
      port: 587
      tls: true
      fromEmail: "$EMAIL_FROM_ADDRESS"
    exporter:
      enabled: true
    twilio:
      existingSecret: "insights-ui-oncall-secrets"
      accountSid: "TWILIO_ACCOUNT_SID"
      authTokenKey: "TWILIO_AUTH_TOKEN"
      phoneNumberKey: "TWILIO_PHONE_NUMBER"
      verifySidKey: "TWILIO_VERIFY_SID"
      apiKeySidKey: "TWILIO_API_KEY_SID"
      apiKeySecretKey: "TWILIO_API_KEY_SECRET"
      # Phone notifications limit (the only non-secret value).
      limitPhone: 3

  migrate:
    enabled: true
    ttlSecondsAfterFinished: ""
    resources:
      limits:
        memory: 256Mi
      requests:
        cpu: 200m
        memory: 256Mi

  env:
    - name: REQUESTS_CA_BUNDLE
      value: /etc/ssl/certs/ca-certs.pem
    - name: GRAFANA_CLOUD_ONCALL_API_URL
      value: https://oncall-prod-eu-west-0.grafana.net/oncall
    - name: GRAFANA_CLOUD_ONCALL_TOKEN
      value: "$SECRET_TOKEN"

  ingress:
    enabled: true
    className: "traefik"
    annotations:
      kubernetes.io/ingress.class: "traefik"

  database:
    type: postgresql

  externalPostgresql:
    host: "$HOSTNAME-OF-EXTERNAL-POSTGRESQL"
    port: 5432
    db_name: oncall
    user: oncall
    existingSecret: "insights-ui-oncall-secrets"
    passwordKey: POSTGRESQL_PASSWORD

  externalRabbitmq:
    host: insights-ui-rabbitmq.insights-ui.svc.cluster.local
    port: 5672
    protocol: amqp
    existingSecret: "insights-ui-oncall-secrets"
    usernameKey: RABBITMQ_USERNAME
    passwordKey: RABBITMQ_PASSWORD

  externalRedis:
    host: insights-ui-redis-ha-haproxy.insights-ui.svc.cluster.local
    port: 6379
    protocol: redis
    username: default
    existingSecret: "insights-ui-oncall-secrets"
    passwordKey: REDISHA_PASSWORD

  externalGrafana:
    url: "$GRAFANA_URL_TO_CONNECT_TO"

Disable the following bundled components (external services are used instead):

  ingress-nginx:
    enabled: false
  cert-manager:
    enabled: false
  mariadb:
    enabled: false
  rabbitmq:
    enabled: false
  redis:
    enabled: false
  grafana:
    enabled: false

Grafana OnCall Version

1.13.3

Product Area

Helm/Kubernetes/Docker

Grafana OnCall Platform?

Kubernetes

User's Browser?

N/A

Anything else to add?

I did deploy version 12.2.1 (upgraded from 12.2.0) with this Helm chart, so it did work at some point :O
But it's currently failing after the upgrade, and I tried to start all over, which does not seem possible.
The DB was totally wiped, so it's all new.

@bpedersen2

check #5244 (comment)

It's probably a PostgreSQL/MySQL SQL difference; see e.g. https://stackoverflow.com/a/7869611
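For context, the dialect difference can be sketched with a toy schema (the table and column names below are hypothetical stand-ins, not OnCall's actual schema). MySQL accepts a multi-table `UPDATE t JOIN u ON ... SET ...`, while PostgreSQL expects `UPDATE t SET ... FROM u WHERE ...` and rejects the JOIN form with exactly the kind of "syntax error at or near JOIN" seen in the log. SQLite follows PostgreSQL's grammar here, so the failure can be reproduced in-process:

```python
# Toy reproduction: SQLite rejects MySQL-style UPDATE ... JOIN just as
# PostgreSQL does, and (since SQLite 3.33) accepts the PostgreSQL-style
# UPDATE ... FROM form instead.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE org (id INTEGER PRIMARY KEY, log_channel_slack_id TEXT,
                      default_slack_channel_id INTEGER);
    CREATE TABLE slack_slackchannel (id INTEGER PRIMARY KEY, slack_id TEXT);
    INSERT INTO org VALUES (1, 'C123', NULL);
    INSERT INTO slack_slackchannel VALUES (42, 'C123');
""")

# MySQL-only syntax: fails on PostgreSQL (and SQLite) with a syntax error.
mysql_style = """
    UPDATE org
    JOIN slack_slackchannel AS sc ON sc.slack_id = org.log_channel_slack_id
    SET org.default_slack_channel_id = sc.id
"""
try:
    conn.execute(mysql_style)
except sqlite3.OperationalError as exc:
    print("MySQL-style UPDATE rejected:", exc)

# PostgreSQL syntax: the same update expressed with UPDATE ... FROM.
postgres_style = """
    UPDATE org
    SET default_slack_channel_id = sc.id
    FROM slack_slackchannel AS sc
    WHERE sc.slack_id = org.log_channel_slack_id
"""
if sqlite3.sqlite_version_info >= (3, 33, 0):  # UPDATE ... FROM support
    conn.execute(postgres_style)
    print(conn.execute("SELECT default_slack_channel_id FROM org").fetchone())
```

A migration that must run on both backends either needs per-vendor SQL or, as suggested below in the thread, ORM-level code that lets Django generate the right dialect.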


jarmd commented Nov 22, 2024

Are you suggesting that I switch to MySQL instead of PostgreSQL, or use the MySQL bundled with OnCall?

Also, does OnCall require a specific version of PostgreSQL to work? Currently I'm using PostgreSQL version 16.

@bpedersen2

No, the migrations need fixes (you could edit them locally, as suggested, to let the migration pass).
But if it is a fresh install, MySQL/MariaDB will get you up quicker.


jarmd commented Nov 22, 2024

Hmm, that's a bit hard: the job is spawned by Kubernetes, and the migrate container crashes so fast that I can't edit the file before it dies!
Or I might be a little out of my depth on how to accomplish this.

@benetasso

Any special reason to use raw SQL instead of Django models here? If not, I can give it a try.
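For what it's worth, the dialect-agnostic logic that an ORM-based data migration would express can be sketched in plain Python. This is purely illustrative: plain dicts stand in for model rows, and the field names are hypothetical, not OnCall's actual models.

```python
# Illustrative only: the per-row logic a Django data migration could run
# via the ORM, which works identically on MySQL and PostgreSQL because
# Django generates the vendor-specific SQL.
def populate_default_slack_channel(organizations, slack_channels):
    """Resolve each org's legacy Slack channel id to a channel primary key."""
    pk_by_slack_id = {c["slack_id"]: c["id"] for c in slack_channels}
    for org in organizations:
        pk = pk_by_slack_id.get(org.get("legacy_slack_id"))
        if pk is not None:
            org["default_slack_channel_id"] = pk
    return organizations

orgs = [{"id": 1, "legacy_slack_id": "C123", "default_slack_channel_id": None}]
channels = [{"id": 42, "slack_id": "C123"}]
print(populate_default_slack_channel(orgs, channels))
```

The trade-off is that raw SQL updates all rows in one statement, while a naive ORM loop issues a query per row; for a one-off migration on a moderately sized table that is usually acceptable.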


Smana commented Nov 24, 2024

Hey there, I'm getting exactly the same error trying to deploy Grafana OnCall (CNPG, PostgreSQL).

@joeyorlando
Contributor

closing as duplicate of #5244 (there will be a patch coming for this shortly, hang tight!)
