What does seeders do? #292

Closed
charyorde opened this issue Aug 13, 2015 · 51 comments

Comments

@charyorde

charyorde commented Aug 13, 2015

Seems like it's a newly added feature.

@wzrdtales
Member

It is not only new, it is not released yet :)

As the name says, they provide the ability to seed your database. More specifically, everything related to data manipulation that is not part of the data definition belongs in a seeder.

Two types of seeders are going to be introduced: version-controlled (VC) seeders and static seeders.

The release will also provide documentation for all of this.

@wzrdtales
Member

If you want more information right now, you can also read the descriptive comment at the top of the seederInterface:
https://github.com/db-migrate/node-db-migrate/blob/master/lib/interface/seederInterface.js#L1-L36

@charyorde
Author

Sounds like a data transformer layer. That's powerful, considering one can easily manipulate a production schema on the fly without necessarily rolling out a new deployment.

Well explained. Thanks

@wzrdtales
Member

That is the plan, yes.

One of the hairiest use cases would be, for example:

Renaming, moving or duplicating a column

Normally this isn't a big deal, but if you want to keep your zero-downtime deployment you need to take a detour: create the new column in the first step, copy the data over from the old column in the second step, set the new application live and, once it is live, running and working, delete the old column.
Doing this manually is most often just wasted time.
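
The detour can be sketched as three separate steps. This is only an illustration, assuming a hypothetical `users` table whose column `name` is being replaced by `full_name`. `addColumn`, `runSql` and `removeColumn` are existing db-migrate driver methods, but the function names and the split into three migrations are assumptions made for this sketch.

```javascript
// Sketch only: table and column names are hypothetical.
// Each function would live in its own migration file as exports.up,
// so the application can be deployed between the steps.

// Step 1: add the new column next to the old one.
function expandUp(db) {
  return db.addColumn('users', 'full_name', { type: 'string' });
}

// Step 2: copy the data over while the old application is still running.
function copyUp(db) {
  return db.runSql('UPDATE users SET full_name = name');
}

// Step 3: run only after the new application version is live and verified.
function contractUp(db) {
  return db.removeColumn('users', 'name');
}
```

Keeping the last step in its own migration is what allows the old and new application versions to run side by side until the switch is confirmed.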

@charyorde
Author

In Cassandra, you can't rename a table or column family. Seeders could be helpful here.
I'm writing a driver for Cassandra. You can have a look here.

I intentionally did not implement the renameTable spec because of this limitation. Happy to get a first hand update on Seeders.

@wzrdtales
Member

@charyorde I guess renaming in this case would need to copy the whole table?

In this case your decision wasn't wrong, but it brings up some topics that are planned for zero-downtime friendliness. If you handle zero downtime on a database you have a similar problem: a rename needs both the old table and the new table to exist for a while. Maybe this is also something to consider here.

@charyorde
Author

That's the sort of solution proposed here:

  1. Add the new CF using the CLI or CQL
  2. On each node copy the SSTable files and use the new CF name.
  3. Drop the old CF using the CLI or CQL
  4. The Drop CF command will create a snapshot; you may want to delete this.

@wzrdtales
Member

Well, copying files does not seem to be a good way to do this. Renaming is by nature not very zero-downtime friendly, especially if there is nothing like table aliasing.

What I currently think is: when we copy a table, this is not a seeding operation but a migration operation. Even though it writes data from one table to another, seeding is meant to be everything that is not data definition. Copying is kind of both in this example, but its initial operation is still data definition, and copying a table without also duplicating the data in it is an incomplete operation.

@charyorde
Author

Doing that in production without downtime is possible with a database clustering layer (for an RDBMS) and with the masterless architecture of Cassandra.

Based on your explanation, it's clear that seeders and migrations must not be mutually exclusive for this to work. Can seeders interact with migration functions at the moment?

@wzrdtales
Member

The final draft defines seeders as not linked to migrations, but they act in the same way as everything else in a migration.

This means a migration can call seeders up to a defined point, like everything else.

I think an example makes this clearer:

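// Keeps a reference to the seeder interface handed to setup().
var seed;

exports.setup = ( options, _seed ) => {
  seed = _seed;
};

exports.up = ( db ) => {
  // Rename the column first, then run all seeds up to the named seed.
  return db.renameColumn( 'test', 'from', 'to' )
  .then( () => seed.up( '20xxxxxxx-seeduptothisseed') );
};

exports.down = ( db ) => {
  // Reverse order: roll the seeds back first, then undo the rename.
  return seed.down( '20xxxxxxx-seeddowntothisseed')
  .then( () =>  db.renameColumn( 'test', 'to', 'from' ) );
};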

The same thing works in exactly the same way from the seeder side.

@wzrdtales
Member

To add: at the moment seeders are not implemented the way the final draft describes them. I am currently writing the test suite for the seeders, by the way.

Also, some migration tasks may perform seeder-like behavior without actually needing to execute a seeder. Copying a table is a perfect example of this: the reverse of copyTable would be dropTable, and thus only one direction needs to execute a seed. A seed itself is for operations that you want to be able to reverse at some point. A really simple example: multiplying all values of a specific column by 10.
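
As a rough illustration of such a reversible data change: since seeders are not released, the sketch below expresses it with plain `runSql` calls, which is an existing db-migrate method. The table `products` and column `price` are hypothetical names, and this is not the seeder API.

```javascript
// Sketch only: `products` and `price` are hypothetical names.

// Forward direction: multiply every value in the column by 10.
function up(db) {
  return db.runSql('UPDATE products SET price = price * 10');
}

// Reverse direction: undo the change by dividing by 10 again.
function down(db) {
  return db.runSql('UPDATE products SET price = price / 10');
}
```

In a migration file these would be assigned to `exports.up` and `exports.down`.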

@wzrdtales
Member

About the database clustering layer:
Yes, that is a possible way, but it is probably out of scope for db-migrate. A perfect zero-downtime deployment is an automatic one without any human interaction. While I want to allow for this zero-downtime friendliness, it is hard to cover everything, and covering everything is not generally desirable either. The aim is rather to provide a good, useful and powerful toolset that makes such a task as easy as possible.

@wzrdtales
Member

The best thing would obviously be to never rename anything :p
However, zero downtime is something for the future, and it will definitely need time to mature.

@wzrdtales
Member

I just checked whether I have already published any docs about the seeders, and I haven't published anything yet. That is a good thing though, as it would just confuse users into thinking they can use this stuff already, which is not the case.

As additional information: the first 4 seeder operations will be insert, remove, update and lookup. Examples of how those will look are not in the current docs yet, only in the new docs, which you can take a look at here if you want to:

https://docs-0-10.dbmigrate.wizardtales.com/API/Seeder

@jonathan-fulton

jonathan-fulton commented May 6, 2016

I checked out the comment here: https://github.com/db-migrate/node-db-migrate/blob/master/lib/interface/seederInterface.js#L1-L36

The first sentence says "The seeder interface provides the ability to handle all operations, that are not DDL specific and thus not a migration."

While I agree that inserting data is different than changing a schema, data insertion is very much dependent on the schema and should therefore be considered a first class migration just like DDL, just one you want to execute optionally depending on environment. Otherwise you end up in a situation where you constantly have to re-write your seed data migrations if you introduce "real" DDL migrations that break the insert statements. That's a BIG smell if I've ever seen one. The whole purpose of a migration is that you write them once and don't edit them again in the future. Having to edit all my historic seed data migrations on a regular basis is just not acceptable.

@wzrdtales
Member

First of all I do appreciate your engagement here :)

I understand your point, but that is also not how it is going to work. There will be two types of seeders: traditional seeders like the ones you know, which are executed whenever you feel the need to, and seeders that work similarly to migrations, because they are in fact versioned and tightly integrated with migrations. This means you are able to call such seeds from migrations and the other way around.

Unfortunately I'm currently not at home, which makes it impossible to go into any further detail, but I will look into answering questions more thoroughly on Sunday evening when I am back.

@wzrdtales
Member

wzrdtales commented May 9, 2016

@jonathan-fulton So to start with the whole migration and seeder topic, let's begin with the following part for today:

Migrations

A migration should be capable of transferring a data object from one state to another, be it a table, a traditional column or some document storage.

Seeds

A seed should be capable of manipulating data, e.g. transforming data, inserting data, deleting data. A seed is thus dependent on a migration that successfully created an agreed state of the data object to be manipulated.

Static Seeds

The very first implementation is static seeds. These are the kind of seeds that are more or less just a replacement for SQL files. The only advantage they have over SQL files shows up if migrations are written for a prototype and the project later decides to use another database. In this case the seeds are very likely to remain compatible; exceptions may exist when moving from SQL to NoSQL.

Version controlled seeds

The second implementation is version-controlled seeds. They work exactly the same way migrations do, but they offer different toolsets.
They can be rolled back, and they do not execute again if they have already been executed, just like migrations.
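
The "run once, roll back later" behavior can be sketched with a tiny in-memory runner. This is a hypothetical illustration and not db-migrate's implementation: the `SeedRunner` class, the seed object shape (`name`, `up`, `down`) and the in-memory `applied` list are all assumptions. A real tool would persist the applied names in the database.

```javascript
// Hypothetical in-memory runner; not db-migrate's implementation.
class SeedRunner {
  constructor() {
    this.applied = []; // names of seeds that have run, in order
  }

  // Run every seed that has not been applied yet, in the given order.
  up(seeds, target) {
    for (const seed of seeds) {
      if (!this.applied.includes(seed.name)) {
        seed.up(target);
        this.applied.push(seed.name);
      }
    }
  }

  // Roll back the most recently applied seed; returns false if there is none.
  down(seeds, target) {
    const name = this.applied[this.applied.length - 1];
    const seed = seeds.find((s) => s.name === name);
    if (!seed) return false;
    seed.down(target);
    this.applied.pop();
    return true;
  }
}
```

Calling `up` twice with the same seed list executes each seed only once, and `down` undoes the last one that ran.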

Integration of seeders and migrations

As the definition already states, a seed always depends on its migrations. That creates two different kinds of relations, between:

  • migrations and version controlled seeders
  • migrations and static seeders

Static seeders have a one-sided relation to migrations, as they just depend on them. A static seeder can, but does not need to, define a migration up to which all migrations must already have been executed before the seed may run.

Version-controlled seeders have both a one-sided and a two-sided relation. Like static seeders, they can define a state they depend on.
The other relation is a bit different:
On the seeder side, the two-sided relation is also the defined state which the seed depends on. However, a two-sided relation is not invoked by the default logic; it gets invoked through the migrations. In this case the migration, and not db-migrate, calls the "up" and "down" functions. The migration also refers to a seed up to which all seeds shall be executed.

That means a one-sided relation is just the seeders depending on a migration state, while in a two-sided relation the migrations additionally depend on some execution state of the seeders.
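
The one-sided relation can be illustrated with a small check. The helper below is hypothetical and not part of db-migrate: `canRunSeed` and the optional `requiresMigration` property are names invented for this sketch, and it assumes migrations are applied strictly in order, so that the presence of the named migration implies all earlier ones have run.

```javascript
// Hypothetical helper, not part of db-migrate.
// `seed.requiresMigration` is an assumed optional property naming the
// migration that must already have been executed.
function canRunSeed(seed, appliedMigrations) {
  if (!seed.requiresMigration) return true; // no dependency declared
  return appliedMigrations.includes(seed.requiresMigration);
}

const applied = ['20150813-create-users'];
const usersSeed = { name: 'demo-users', requiresMigration: '20150813-create-users' };
const ordersSeed = { name: 'demo-orders', requiresMigration: '20150901-create-orders' };

canRunSeed(usersSeed, applied);  // true: the required migration has run
canRunSeed(ordersSeed, applied); // false: the required migration is missing
```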

I will continue with everything I have left out so far, but this is already a good start to give an idea of where the whole thing is heading. If you already have opinions on this, go ahead and let me know ;)

@jonathan-fulton

@wzrdtales Thanks for the detailed response! Very helpful. Apologies if the above seemed harsh at all. node-db-migrate was so close to being perfect for me. Still an amazing tool that I'm taking advantage of though so thanks very much!

@wzrdtales
Member

@jonathan-fulton No worries, no offense taken. Discussions, ideas, PRs, feature requests and any other kind of contributions are always welcome :)

@staff0rd
Contributor

I think that seeding is released now considering the version is 0.10.0-beta.20?

@wzrdtales can you please advise how to create a seed template & link it to a migration?

@wzrdtales
Member

@staff0rd Much of it was done already, but I decided to postpone it to a later release anyway. From 0.9.x to 0.10.x a lot has changed, if not everything, while staying as backwards compatible as possible. I decided this because 0.10.x should finally leave the "beta" stage. It has actually been stable for a while, but without proper docs and with some tests missing it is not ready for release yet. Those are the things being worked on, and after the 0.10.x release I want to switch to monthly releases instead of the current release target, to get new stuff in faster again.
I have done quite some work on the docs, which I still need to finalize (I hope I finally get to it this weekend). As you might have noticed, I try to be as responsive and as fast as possible, but I don't always have the time to actually work on db-migrate, because I'm also involved in multiple other projects and my work is eating much of my time. If you want to help, you're more than welcome ;)

@ryanrolds

Where can documentation on how seeders are supposed to work and how to use them be found? The documentation at https://db-migrate.readthedocs.io/en/latest/Getting%20Started/usage/#seeder-introduction is very lacking.

The current state of this project (beta versions being pushed to NPM, when links to a GH tag would have made more sense for dev/beta releases) and the lack of published documentation are causing me to consider throwing away a few days of work and switching to a simpler migrator that isn't trying to be everything. Some of us just want to be able to run up/down, create/delete tables and add/remove records without a bunch of hoop jumping.

@tatianacmh

Since the insert method is deprecated I'm using the normal runSql. Still, I'm curious about how to use seeders, but I guess we have to wait.

Again, thank you for your work :)

@BorntraegerMarc
Contributor

@wzrdtales Just a friendly ping on when we could expect proper documentation to be released :)

@happilymarrieddad

+1 on the documentation

@rcosta-gcare

+1 on the documentation

@happilymarrieddad

+1 documentation

@wzrdtales
Member

Hey all,

Sorry for the delay. The thing about seeders is that they were planned to be introduced but then postponed, and the docs have not been updated yet. However, there is some news on that topic; I will give you all an update on it this week (I literally put it on my schedule).

@BorntraegerMarc
Contributor

@wzrdtales you have the update? 😄

@wzrdtales
Member

wzrdtales commented Sep 23, 2017

@BorntraegerMarc yes :) But needed a bit more time though.

So for the Seeders, to go into a bit of detail:

They were postponed in order to push the release of the 0.10.x version forward, which helped, but over the last two years I got very tight on time as I became heavily involved in other projects, so things stretched way beyond what I wanted. This is the reason why this has been open for so long. The good news is that in recent projects I use db-migrate again, which includes doing work to improve it. Part of this was, for example, pushing out the cockroachdb driver, and part of it was starting work on the seeders again.

Here I will make a decision now: I am ending the beta track for 0.10.x as of now. I will push out a non-beta version over the next weeks (currently my schedule is even tighter as I am also preparing for dockercon), which I could have done even earlier. The only reason for not releasing 0.10.x was that I was not yet satisfied with the state of the docs and tests. But 0.10.x is actually far from being a beta; it works and does its job in a quite stable way. So finally releasing 0.10.x means polishing a few last things, taking out the way too big changelog (from 0.9.x to 0.10.x db-migrate has basically been revised and to a certain extent rewritten in many if not all parts) and cleaning up the docs.

So the interesting part for you guys starts here: some version after 0.10.x will introduce the first version of seeders, and subsequent versions will introduce additional tooling for seeders. With seeders introduced, db-migrate will step out of the 0.x.x tree and publish its first major release, 1.0.0. One of my goals in ending the beta is to enable new features to be introduced quickly in normal releases, and I hope to see a motivated community contributing :).

The goals for the near future remain the same, however:

  • Have a clear separation between data definition and data manipulation*
  • A focus on zero downtime enablement
  • 3 step migrations**
  • Seeder
  • Seeder toolings
  • More support functions/toolings for community drivers

On the points marked with stars:

  * Separating data manipulation and data definition not only makes the code base cleaner and more understandable, it also supports a logical separation between the two, which is absolutely necessary to successfully implement true zero downtime.
  ** 3 step migrations are a feature aimed exclusively at zero downtime. They will make it possible to migrate non-deterministic changes, like renaming, and will be mostly exclusive to programmatic access, since that provides information to the application; there are also plans to incorporate ORMs to control changes for those. There will be a CLI too, but it will be just for manual operation and not really useful.

That's it, and a call-out at this point: if anyone here feels they want to help with any of these points, or wants to add some goals and help work on them, please feel free to reach out or just contribute directly. You're all more than welcome to help evolve db-migrate to suit the modern needs of massively scalable, production-critical systems in a continuous deployment environment.

@BorntraegerMarc
Contributor

Sounds great! but the question remains: What do seeders actually do? 😉

nah, I'm just kidding! I think it has been great seeing this library evolve over the years and now moving to a final 1.0 release 🎉 I love it!!! I will help out in whatever way possible 😃

@MagedMilad

can anyone provide an example or API documentation of the functions of the seeders ?

@stale

stale bot commented Nov 23, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Nov 23, 2017
@BorntraegerMarc
Contributor

Please don't close it 😄 Official documentation has still not been provided

@wzrdtales
Member

Just added the stale bot, you should see my inbox though 😂 , working through to see what really can be closed and what not.

@stale

stale bot commented Dec 23, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Dec 23, 2017
@stale stale bot closed this as completed Dec 30, 2017
@BorntraegerMarc
Contributor

I think this issue should be opened again. BTW: @wzrdtales any update on seeders? :)

@wzrdtales
Member

@BorntraegerMarc Let's keep this one closed and continue in the feature ticket #215

@ryanrolds

ryanrolds commented Jan 14, 2018

How about removing the deprecation notice until the feature is actually released? It's pretty annoying to see a deprecation notice urging people to switch to something that isn't even released.

@BorntraegerMarc
Contributor

I agree.

@Diluka

Diluka commented Mar 15, 2018

It seems to be released, but where is the documentation? How do I use the CLI to create a template file, and how do I use it?

@stanislavt

I suppose we have to investigate the code. 😄

@eedrah

eedrah commented May 28, 2018

What is the current status of this? I'm happy to write some docs if someone can explain to me roughly how to use it...

@ndarilek

I'm wondering about this feature too. I'm using Postgraphile to build my app, and often find myself loading the same data in for testing or production. E.g.:

select app.create_user('[email protected]', 'client1', 'password', 'client');
select app.create_user('[email protected]', 'client2', 'password', 'client');
select app.create_user('[email protected]', 'writer1', 'password', 'writer');
select app.create_user('[email protected]', 'writer2', 'password', 'writer');

insert into app.order(owner_id, title, description) values(
  (select id from app.user where username = 'client1'),
  'client1 open order',
  'Description of order'
);

Can I use seeders to automate loading these queries when NODE_ENV=develop? If so, even knowing how to do that would be helpful.

Thanks.
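
Until seeders are documented, one possible workaround is a regular migration that loads fixtures only in development. The following is a minimal sketch under stated assumptions, not the seeder API: `seedDevData` and `FIXTURES` are hypothetical names, the e-mail address is a placeholder, and only `runSql` is an existing db-migrate method.

```javascript
// Hypothetical fixture statements; the e-mail address is a placeholder.
const FIXTURES = [
  "select app.create_user('client1@example.com', 'client1', 'password', 'client')",
  "insert into app.order(owner_id, title, description) values(" +
    "(select id from app.user where username = 'client1'), " +
    "'client1 open order', 'Description of order')",
];

// Run the fixtures one after another, but only when env is 'develop'.
// Resolves without touching the database in every other environment.
function seedDevData(db, env) {
  if (env !== 'develop') return Promise.resolve();
  return FIXTURES.reduce(
    (chain, sql) => chain.then(() => db.runSql(sql)),
    Promise.resolve()
  );
}

// In a migration file this could be wired up as:
//   exports.up = (db) => seedDevData(db, process.env.NODE_ENV);
```

The environment is passed in explicitly rather than read inside the helper, which keeps the function easy to test with a fake `db` object.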

@nutsobtid

How about seeder documentation?

@tobiashe

Hey folks! Why do we need seeders if one can use runSql to manipulate data? What's the benefit?

@danajanezic

Where is this feature documented? There's the interface docs, but they don't actually tell you how to use them. Has there been any movement on this at all or has everyone just switched over to using sequelize for migrations?

@ajw725

ajw725 commented Jan 26, 2021

I would like to add another request for actual documentation around this feature, but since people have been asking about this for several years now, I'm not too hopeful.

@wzrdtales
Member

There is a reason closed issues are closed, and it's really not hard to find the current one: #687

If you really need this feature right now, then help the project and contribute. There are efforts running, but seeders done right, in the sense of db-migrate being a tool that promises zero-downtime operations, will take more effort than just executing an insert call. My time is scarce, since I run multiple businesses, and while I seek to continue developing this library, that happens in batches whenever I get some free time. So really, the best anyone can do, in any open source project by the way, is to try to help themselves. That is your duty anyway if you're consuming open source software.

@db-migrate db-migrate locked as resolved and limited conversation to collaborators Jan 27, 2021