[WIP] - goose schema command (postgres only) #459

mfridman · 2023-01-28T22:07:23Z

Close #278

EDIT: I think I'd like to put this under goose beta schema to denote that the command is unstable and experimental. We can promote it to a more stable command, like goose schema, in the future. Hopefully with such commands we can get feedback from users and fix any issues before making it "stable".

This PR adds a goose schema command that shells out to either pg_dump or docker (latest stable postgres version). By default, this dumps with --schema-only and preserves the raw output, but there is an optional goose flag --clean that effectively does what #345 (comment) describes:

pg_dump --schema-only |\
    grep -v -e '^--' -e '^COMMENT ON' -e '^REVOKE' -e '^GRANT' -e '^SET' \
    -e 'ALTER DEFAULT PRIVILEGES' -e 'OWNER TO' |\
    cat -s
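
To make the effect of that filter concrete, here is a minimal sketch that wraps the same pipeline in a shell function and runs it on a small hand-written sample. The names clean_schema and sample_dump are hypothetical helpers for illustration, and the sample text is made up rather than real pg_dump output.

```shell
#!/bin/sh
# clean_schema (hypothetical helper name): reads a schema dump on stdin and
# applies the same grep/cat filter as the pipeline above.
clean_schema() {
    grep -v -e '^--' -e '^COMMENT ON' -e '^REVOKE' -e '^GRANT' -e '^SET' \
        -e 'ALTER DEFAULT PRIVILEGES' -e 'OWNER TO' |
        cat -s
}

# sample_dump (hypothetical): a made-up input that only imitates the shape of
# a pg_dump --schema-only dump.
sample_dump() {
    cat <<'EOF'
--
-- Name: users; Type: TABLE
--

SET statement_timeout = 0;

CREATE TABLE public.users (
    id integer NOT NULL
);

ALTER TABLE public.users OWNER TO dbuser;

GRANT SELECT ON TABLE public.users TO reader;
EOF
}

# Only the CREATE TABLE statement should survive; cat -s squeezes the runs of
# blank lines left behind by the removed lines.
sample_dump | clean_schema
```

Because '^SET' is anchored to the start of the line, statements such as ALTER TABLE ... ALTER COLUMN id SET DEFAULT ... are kept, while session-level SET lines are dropped.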

But why?

It's often desirable to check in the database schema, which can be used to re-create the database in lieu of the migrations.

Taking this a step further, when developers iterate on database changes locally they might apply SQL to the database manually but forget to port those changes to their migration files. In CI, the checked-in migrations can be applied and the schema dumped, and that dump can then be compared with the schema committed in the PR. A mismatch means the migration files do not produce the intended database schema, and the developer is likely working with incorrect assumptions about the database state. This should be caught as early as possible.

Even if you don't take it this far, just having a schema in one place makes it easier to reason about. And building this into goose allows everyone to have a consistent way of dumping the schema.
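
The CI comparison described above could be sketched roughly as follows. This is an assumption about how one might wire it up, not something this PR ships: check_schema is a hypothetical helper, schema.sql is a hypothetical path for the committed schema, and the goose invocations in the comments assume the database connection is already configured as in the example below.

```shell
#!/bin/sh
# check_schema (hypothetical helper): compares the committed schema file ($1)
# with a freshly dumped one ($2) and returns non-zero if they differ.
check_schema() {
    committed="$1"
    dumped="$2"
    if diff -u "$committed" "$dumped"; then
        echo "schema is up to date"
    else
        echo "schema drift: $committed does not match the applied migrations" >&2
        return 1
    fi
}

# Assumed CI usage (schema.sql is a hypothetical committed file):
#   go run ./cmd/goose up
#   go run ./cmd/goose schema --clean > /tmp/schema.sql
#   check_schema schema.sql /tmp/schema.sql
```

Using diff rather than comparing checksums means a failing CI job shows which statements differ, which is usually what the developer needs in order to fix the migration.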

Example

From the root of this repository:

$ make docker-start-postgres

$ export GOOSE_MIGRATION_DIR=./examples/sql-migrations
$ go run ./cmd/goose up
$ go run ./cmd/goose status
2023/01/28 16:27:17     Applied At                  Migration
2023/01/28 16:27:17     =======================================
2023/01/28 16:27:17     Sat Jan 28 21:19:06 2023 -- 00001_create_users_table.sql
2023/01/28 16:27:17     Sat Jan 28 21:19:06 2023 -- 00002_rename_root.sql
2023/01/28 16:27:17     Sat Jan 28 21:19:06 2023 -- 00003_no_transaction.sql

Then dump the schema produced by the example migrations with pg_dump and take its sha256:

pg_dump --dbname=testdb --host=localhost --port=5433 --username=dbuser --schema-only |\
    grep -v -e '^--' -e '^COMMENT ON' -e '^REVOKE' -e '^GRANT' -e '^SET' \
    -e 'ALTER DEFAULT PRIVILEGES' -e 'OWNER TO' |\
    cat -s | sha256sum

6380fab48d773d69abcfa38a6c451b704b4b466e2b272b3f96778a2462f9a998

Running go run ./cmd/goose schema --clean | sha256sum produces the same hash:

6380fab48d773d69abcfa38a6c451b704b4b466e2b272b3f96778a2462f9a998

The cleaned schema produced by goose schema --clean:

SELECT pg_catalog.set_config('search_path', '', false);

CREATE TABLE public.goose_db_version (
    id integer NOT NULL,
    version_id bigint NOT NULL,
    is_applied boolean NOT NULL,
    tstamp timestamp without time zone DEFAULT now()
);

CREATE SEQUENCE public.goose_db_version_id_seq
    AS integer
    START WITH 1
    INCREMENT BY 1
    NO MINVALUE
    NO MAXVALUE
    CACHE 1;

ALTER SEQUENCE public.goose_db_version_id_seq OWNED BY public.goose_db_version.id;

CREATE TABLE public.post (
    id integer NOT NULL,
    title text,
    body text
);

CREATE TABLE public.users (
    id integer NOT NULL,
    username text,
    name text,
    surname text
);

ALTER TABLE ONLY public.goose_db_version ALTER COLUMN id SET DEFAULT nextval('public.goose_db_version_id_seq'::regclass);

ALTER TABLE ONLY public.goose_db_version
    ADD CONSTRAINT goose_db_version_pkey PRIMARY KEY (id);

ALTER TABLE ONLY public.post
    ADD CONSTRAINT post_pkey PRIMARY KEY (id);

ALTER TABLE ONLY public.users
    ADD CONSTRAINT users_pkey PRIMARY KEY (id);

bobhenkel · 2023-02-05T04:47:08Z

Does this give you one giant SQL file with all the db objects, or is there a way to get one object per file? I'm assuming one big SQL file, as that's typically what people are looking for, though I could see value in a file per db object too.

mfridman · 2023-02-05T13:48:48Z

Yep, one big SQL file. For small to medium projects this works quite well, but for some, this command won't work at all and they'll probably be using pg_dump directly.

I think the goal with this command is to satisfy the 80% use case.

How do we categorize "db objects", i.e., what is a db object?
What benefit is there to splitting db objects per file?

mfridman added 4 commits January 28, 2023 16:05

wip

52d3821

fix

14c5e33

remove windows \r\n and add multiline flag

2809173

rename to goose schema

6a1fd70

mfridman changed the title ~~[wip] - pg_dump schema (postgres only)~~ [wip] - goose schema command (postgres only) Jan 29, 2023

mfridman changed the title ~~[wip] - goose schema command (postgres only)~~ [WIP] - goose schema command (postgres only) Jan 29, 2023

mfridman mentioned this pull request May 25, 2023

Blog idea - squashing migrations mfridman/goose-docs#2

Open

mfridman force-pushed the master branch from db39f22 to 9024628 Compare March 11, 2024 14:15

mfridman added the experimental label May 11, 2024

mfridman mentioned this pull request Aug 14, 2024

Does Goose support exporting the final SQL schema? #793

Closed
