Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Entity Generator #11

Closed
billy1624 opened this issue Jun 1, 2021 · 58 comments · Fixed by #23, #30 or #51
Closed

Entity Generator #11

billy1624 opened this issue Jun 1, 2021 · 58 comments · Fixed by #23, #30 or #51
Assignees

Comments

@billy1624
Copy link
Member

billy1624 commented Jun 1, 2021

Generating entity file for each db table.

  • Entity
  • Model
    • mapping db column type to Rust type
    • find_* helper function
  • Column
    • ColumnTrait column def
  • PrimaryKey
  • Relation
    • RelationTrait relation def
  • Related

Work plan in progress...

use crate as sea_orm;
use crate::entity::prelude::*;

#[derive(Copy, Clone, Default, Debug, DeriveEntity)]
#[table = "cake"]
pub struct Entity;

#[derive(Clone, Debug, PartialEq, DeriveModel, DeriveActiveModel)]
pub struct Model {
    pub id: i32,
    pub name: String,
}

#[derive(Copy, Clone, Debug, EnumIter, DeriveColumn)]
pub enum Column {
    Id,
    Name,
}

#[derive(Copy, Clone, Debug, EnumIter, DerivePrimaryKey)]
pub enum PrimaryKey {
    Id,
}

#[derive(Copy, Clone, Debug, EnumIter)]
pub enum Relation {
    Fruit,
}

impl ColumnTrait for Column {
    type EntityName = Entity;

    fn def(&self) -> ColumnType {
        match self {
            Self::Id => ColumnType::Integer(None),
            Self::Name => ColumnType::String(None),
        }
    }
}

impl RelationTrait for Relation {
    fn def(&self) -> RelationDef {
        match self {
            Self::Fruit => Entity::has_many(super::fruit::Entity)
                .from(Column::Id)
                .to(super::fruit::Column::CakeId)
                .into(),
        }
    }
}

impl Related<super::fruit::Entity> for Entity {
    fn to() -> RelationDef {
        Relation::Fruit.def()
    }
}

impl Related<super::filling::Entity> for Entity {
    fn to() -> RelationDef {
        super::cake_filling::Relation::Filling.def()
    }

    fn via() -> Option<RelationDef> {
        Some(super::cake_filling::Relation::Cake.def().rev())
    }
}

impl Model {
    pub fn find_fruit(&self) -> Select<super::fruit::Entity> {
        Entity::find_related().belongs_to::<Entity>(self)
    }

    pub fn find_filling(&self) -> Select<super::filling::Entity> {
        Entity::find_related().belongs_to::<Entity>(self)
    }
}
@billy1624 billy1624 self-assigned this Jun 2, 2021
@tyt2y3
Copy link
Member

tyt2y3 commented Jun 3, 2021

What is the implementation strategy we have in mind?

@billy1624
Copy link
Member Author

Collect schema info using sea-schema.

Sample table info collected by it.

TableDef {
    info: TableInfo {
        name: "cake_filling",
        engine: InnoDb,
        auto_increment: None,
        char_set: Utf8Mb4,
        collation: Utf8Mb4GeneralCi,
        comment: "",
    },
    columns: [
        ColumnInfo {
            name: "cake_id",
            col_type: Int(
                NumericAttr {
                    maximum: Some(
                        11,
                    ),
                    decimal: None,
                    unsigned: None,
                    zero_fill: None,
                },
            ),
            null: false,
            key: Primary,
            default: None,
            extra: ColumnExtra {
                auto_increment: false,
                on_update_current_timestamp: false,
                generated: false,
                default_generated: false,
            },
            expression: None,
            comment: "",
        },
        ColumnInfo {
            name: "filling_id",
            col_type: Int(
                NumericAttr {
                    maximum: Some(
                        11,
                    ),
                    decimal: None,
                    unsigned: None,
                    zero_fill: None,
                },
            ),
            null: false,
            key: Primary,
            default: None,
            extra: ColumnExtra {
                auto_increment: false,
                on_update_current_timestamp: false,
                generated: false,
                default_generated: false,
            },
            expression: None,
            comment: "",
        },
    ],
    indexes: [
        IndexInfo {
            unique: false,
            name: "filling_id",
            parts: [
                IndexPart {
                    column: "filling_id",
                    order: Ascending,
                    sub_part: None,
                },
            ],
            nullable: false,
            idx_type: BTree,
            comment: "",
            functional: false,
        },
        IndexInfo {
            unique: true,
            name: "PRIMARY",
            parts: [
                IndexPart {
                    column: "cake_id",
                    order: Ascending,
                    sub_part: None,
                },
                IndexPart {
                    column: "filling_id",
                    order: Ascending,
                    sub_part: None,
                },
            ],
            nullable: false,
            idx_type: BTree,
            comment: "",
            functional: false,
        },
    ],
    foreign_keys: [
        ForeignKeyInfo {
            name: "cake_filling_ibfk_1",
            columns: [
                "cake_id",
            ],
            referenced_table: "cake",
            referenced_columns: [
                "id",
            ],
            on_update: Restrict,
            on_delete: Restrict,
        },
        ForeignKeyInfo {
            name: "cake_filling_ibfk_2",
            columns: [
                "filling_id",
            ],
            referenced_table: "filling",
            referenced_columns: [
                "id",
            ],
            on_update: Restrict,
            on_delete: Restrict,
        },
    ],
},

Mapping above table info into specification for orm entity, column & relation...

  1. First pass
    • map sea-schema TableRef info into entity, column & relation
    • append inverse relation into inverse_relations
  2. Second pass
    • put inverse_relations back to corresponding entity relation

Data structure for generating entity file.

// Can actually be implemented as two runtime variable `entities` and `inverse_relations`
pub struct EntityGenerator {
    entities: Vec<EntitySpec>,
    inverse_relations: HashMap<String, Vec<RelationSpec>>,  // Store many-to-one relation in first pass
}

pub struct EntitySpec {
    table_name: String,
    columns: Vec<ColumnSpec>,
    relations: Vec<RelationSpec>,
}

pub struct ColumnSpec {
    name: String,
    rs_type: String,  // Rust type provided by sea-schema (this type info is currently not supported)
    col_type: ColumnType,  // Infer from `rs_type`
    is_primary_key: bool,
}

pub struct RelationSpec {
    name: String,
    ref_table: String,
    columns: Vec<String>,
    ref_columns: Vec<String>,
    rel_type: RelationType,  // Relation parse directly from sea-schema is one-to-one, otherwise is many-to-one since it's from `inverse_relations`
}

Having above info should be enough to generate the entity file.

Code generation...

  1. having both API & CLI interface to initiate entity generation which required input...
    • db connection string
    • output path for entity files
  2. using quote crate to construct struct, enum and impl trait...
  3. formatting identifier (e.g. fruit, Fruit, id, Id) with heck crate to change between case
  4. collecting each entity file content as string and write to corresponding entity file
  5. formatting output file with rustfmt

@billy1624
Copy link
Member Author

See latest commit for the POC

@tyt2y3
Copy link
Member

tyt2y3 commented Jun 3, 2021

Several comments.

  1. I don't quite understand why first / second pass is needed. Can you explain?

  2. The code generation part should be moved to a new crate, while it's okay to keep data structures (EntitySpec) inside the ORM

  3. Instead of looking at the raw output of SeaSchema, perhaps we can use SeaQuery's TableStatement generated by SeaSchema's writer. It is engine agnostic already (not actually, still using some custom names, but can be easily changed) and has a simpler structure

@billy1624
Copy link
Member Author

  1. What I mean by "second pass", just some implementation details
    https://github.com/SeaQL/sea-orm/blob/codegen/sea-orm-codegen/src/entity/base.rs#L106

  2. I will move all to a new crate called sea-orm-codegen

  3. Sure! Now I have a working codegen draft, and thinking how to map the db columns type into corresponding Rust type and ColumnType

@billy1624
Copy link
Member Author

Still need plenty of refactoring but you can take a glance at the codegen branch

@tyt2y3
Copy link
Member

tyt2y3 commented Jun 8, 2021

Oh so this is like a procedural macro but actually write code to a file?

@billy1624
Copy link
Member Author

Correct!

@billy1624
Copy link
Member Author

Will refactor the code now, too messy loll

@tyt2y3
Copy link
Member

tyt2y3 commented Jun 8, 2021

Makes sense. But we'd better separate the discover -> transform -> generate stages.

@billy1624
Copy link
Member Author

Without keeping any internal state?

Sth like this?

input_0 -> Discover -> state_1
state_1 -> Transform -> state_2
state_2 -> Generate -> codes...

@tyt2y3
Copy link
Member

tyt2y3 commented Jun 8, 2021

It's easier to unit test if intermediate states have data structures.

@billy1624
Copy link
Member Author

Nice! Will do it this way then

@billy1624
Copy link
Member Author

Refactored

@tyt2y3
Copy link
Member

tyt2y3 commented Jun 9, 2021

Seems good so far

@billy1624
Copy link
Member Author

Going to develop the CLI now

Any suggestions?

@tyt2y3
Copy link
Member

tyt2y3 commented Jun 9, 2021

diesel_cli is worth looking into.

@billy1624
Copy link
Member Author

For now, I'm mapping db column into one of basic rust type listed below. Do you think this is reasonable?

  • String
  • u32
  • f32
  • Vec<u8>
  • bool

It's difficult to map, for example, an integer db column into specific usize / u8 / u32 ...

pub fn get_rs_type(&self) -> Ident {
match self.col_type {
ColumnType::Char(_)
| ColumnType::String(_)
| ColumnType::Text
| ColumnType::DateTime(_)
| ColumnType::Timestamp(_)
| ColumnType::Time(_)
| ColumnType::Date
| ColumnType::Json
| ColumnType::JsonBinary
| ColumnType::Custom(_) => format_ident!("String"),
ColumnType::TinyInteger(_)
| ColumnType::SmallInteger(_)
| ColumnType::Integer(_)
| ColumnType::BigInteger(_) => format_ident!("u32"),
ColumnType::Float(_)
| ColumnType::Double(_)
| ColumnType::Decimal(_)
| ColumnType::Money(_) => format_ident!("f32"),
ColumnType::Binary(_) => format_ident!("Vec<u8>"),
ColumnType::Boolean => format_ident!("bool"),
}
}

@tyt2y3
Copy link
Member

tyt2y3 commented Jun 10, 2021

May be follow sea query

https://github.com/SeaQL/sea-query/blob/78ba5d7242dfe5e18e94736b4f522a837f609206/src/value.rs#L152-L162

@billy1624
Copy link
Member Author

sea-schema mysql writer only convert column type into sea_query::ColumnType::Custom.

So, I'm thinking is there any good way to generically convert sea-schema column info into sea_query::ColumnDef with specific sea_query::ColumnType (with no sea_query::ColumnType::Custom).

However, I understand that sea-schema has no responsibility to perform the bolded task. As it does the job to write accurate SQL table create statement.

https://github.com/SeaQL/sea-schema/blob/f5866dc64b9e39a7c41417b647a508d68a57c3a2/src/mysql/writer/column.rs#L8

@tyt2y3
Copy link
Member

tyt2y3 commented Jun 10, 2021

I agree we can improve SeaSchema to output more specific ColumnDef provided that there is no data loss.

@billy1624
Copy link
Member Author

Checkout the quick demo of sea-orm-cli 33f16e1

@tyt2y3
Copy link
Member

tyt2y3 commented Jun 11, 2021

Can't try without the sea query commits :(

@billy1624
Copy link
Member Author

Checkout sea-query/readonly branch

@tyt2y3
Copy link
Member

tyt2y3 commented Jun 12, 2021

I tried. Quite good so far.
Several comments:

This doesn't compile:

    fn def(&self) -> RelationDef {
        match self {}
    }

And we should not use std::rc::Rc directly. We should use sea_query::SeaRc instead.
But I guess Alias will be removed anyway.

And we should also generate some doc comment:

//! SeaORM Entity. Generated by sea-orm 0.1.0

@tyt2y3
Copy link
Member

tyt2y3 commented Jun 12, 2021

Oh, and it'd also be a good idea to generate the entities/mod.rs and entities/prelude.rs

@billy1624
Copy link
Member Author

Ah ha good idea

@billy1624 billy1624 linked a pull request Jun 18, 2021 that will close this issue
@tyt2y3 tyt2y3 reopened this Jun 19, 2021
@tyt2y3
Copy link
Member

tyt2y3 commented Jun 19, 2021

Several amendments required:

  1. adding unique() and null() on ColumnDef

  2. can we use rev() to define reverse relation?

  3. if we detect that the foreign key column is unique, we should make the relation has_one()
    (Explain: the distinction between one-to-one and one-to-many is whether the reference key is unique)

  4. can we be smart enough to detect junction table (where it has two or more foreign keys)?

As ref:

impl Related<super::filling::Entity> for Entity {
    fn to() -> RelationDef {
        super::cake_filling::Relation::Filling.def()
    }

    fn via() -> Option<RelationDef> {
        Some(super::cake_filling::Relation::Cake.def().rev())
    }
}

But for 4, if we determined that it is too hard, we can delay it post release.

@billy1624
Copy link
Member Author

What do u mean by "reverse relation"?

For cake and fruit

  • Each fruit has one cake
  • Each cake has many fruit

Because foreign key is on fruit, so "Each cake has many fruit" is the inverse relation?

Then, we have...

// fruit.rs
impl Related<super::cake::Entity> for Entity {
    fn to() -> RelationDef {
        Relation::Cake.def()
    }
}
// cake.rs
impl Related<super::fruit::Entity> for Entity {
    fn to() -> RelationDef {
        super::Fruit::Relation::Cake.def().rev()
    }
}

@tyt2y3
Copy link
Member

tyt2y3 commented Jun 21, 2021

Basically yes. Each relation should have only one owner where we define it.
Others should only reuse that definition.
Put it simple, don't repeat yourself.

@billy1624
Copy link
Member Author

Basically yes. Each relation should have only one owner where we define it.
Others should only reuse that definition.
Put it simple, don't repeat yourself.

Then basically we won't see has_many loll

@tyt2y3
Copy link
Member

tyt2y3 commented Jun 21, 2021

How about

impl RelationTrait for Relation {
    fn def(&self) -> RelationDef {
        match self {
            Self::Fruit => Entity::has_many(super::fruit::Entity).into()
        }
    }
}

That means we actually require that the one side relation exists, and automatically define the has_many relation.

@tyt2y3
Copy link
Member

tyt2y3 commented Jun 21, 2021

Done. See d77a7c8

@tyt2y3
Copy link
Member

tyt2y3 commented Jun 21, 2021

To clarify again, if the foreign key column also has an unique constraint, then the reverse of it would also be has_one. Otherwise it is has_many.

Note: our current test schema does not have has_one.

@billy1624
Copy link
Member Author

Done. See d77a7c8

Ah ha! cool

@billy1624
Copy link
Member Author

To clarify again, if the foreign key column also has an unique constraint, then the reverse of it would also be has_one. Otherwise it is has_many.

Note: our current test schema does not have has_one.

Got it!

@billy1624
Copy link
Member Author

To clarify again, if the foreign key column also has an unique constraint, then the reverse of it would also be has_one. Otherwise it is has_many.

Note: our current test schema does not have has_one.

But then this special has_one still need to define something like... on both end

impl RelationTrait for Relation {
    fn def(&self) -> RelationDef {
        match self {
            Self::Cake => Entity::has_one(super::cake::Entity)
                .from(Column::CakeId)
                .to(super::cake::Column::Id)
                .into(),
        }
    }
}

@tyt2y3
Copy link
Member

tyt2y3 commented Jun 22, 2021

I will add a belongs_to similar to has_many

@billy1624
Copy link
Member Author

I will add a belongs_to similar to has_many

Nice!

@billy1624
Copy link
Member Author

@tyt2y3

@billy1624
Copy link
Member Author

billy1624 commented Jun 22, 2021

Several amendments required:

  1. can we use rev() to define reverse relation?

Is this still valid? After we have belongs_to, has_one & has_many

Check 9a25bb9

@tyt2y3 tyt2y3 reopened this Jul 3, 2021
@tyt2y3
Copy link
Member

tyt2y3 commented Jul 3, 2021

How can we detect 'junction table' and generate many-to-many relations with via?

Perhaps:

  1. We detect that a table has two or more belongs_to relations
  2. We N choose 2 from the relations, and append has_many for each

@tyt2y3
Copy link
Member

tyt2y3 commented Jul 3, 2021

Btw, the examples cli and codegen, one of them are redundant?

@billy1624
Copy link
Member Author

Yes, in terms of output

Just two different way to invoke codegen

@tyt2y3
Copy link
Member

tyt2y3 commented Jul 6, 2021

Then we can remove one of them

@billy1624
Copy link
Member Author

How can we detect 'junction table' and generate many-to-many relations with via?

Perhaps:

  1. We detect that a table has two or more belongs_to relations

  2. We N choose 2 from the relations, and append has_many for each

For 2, yes we can do that when we found all "junction table".

For 1, I think the problem is to define what we mean by "junction table"?

Table that meets one of the following criteria?

  • table with exactly two foreign keys, and primary key is composite of these two foreign keys.
  • table with exactly two foreign keys. Together with an id primary key column in the table. And table can also have other additional non-key (not a primary key nor foreign key) columns, if there is any.

@tyt2y3
Copy link
Member

tyt2y3 commented Jul 8, 2021

table with exactly two foreign keys, and primary key is composite of these two foreign keys

I agree, may be we can handle this common enough M-to-N junction case first

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants