feat: indexer base types #20629

Merged Jun 17, 2024 (25 commits)
Changes from 3 commits
5 changes: 5 additions & 0 deletions indexer/base/README.md
@@ -0,0 +1,5 @@
# Indexer Base

The indexer base module is designed to provide a stable, zero-dependency base layer for the built-in indexer functionality. Packages that integrate with the indexer should feel free to depend on this package without fear of any external dependencies being pulled in.

The basic types for specifying index sources, targets and decoders are provided here along with a basic engine that ties these together. A package wishing to be an indexing source could accept an instance of `Engine` directly to be compatible with indexing. A package wishing to be a decoder can use the `Entity` and `Table` types. A package defining an indexing target should implement the `Indexer` interface.
Member

Can we add an architecture diagram using Mermaid here to help explain the overall design?

Member Author

Sure, just noticing this README text is out of date too. Will update

Member Author

I've updated the README and added a sequence diagram around Listener calls. Let me know how that looks @tac0turtle.

I could also add a class diagram around the schema and object updates, but not sure that's too useful.

151 changes: 151 additions & 0 deletions indexer/base/column.go
@@ -0,0 +1,151 @@
package indexerbase

import "fmt"

// Column represents a column in a table schema.
type Column struct {
	// Name is the name of the column.
	Name string

	// Kind is the basic type of the column.
	Kind Kind

	// Nullable indicates whether null values are accepted for the column.
	Nullable bool

	// AddressPrefix is the address prefix of the column's kind, currently only used for Bech32AddressKind.
	AddressPrefix string

	// EnumDefinition is the definition of the enum type and is only valid when Kind is EnumKind.
	EnumDefinition EnumDefinition
}

// EnumDefinition represents the definition of an enum type.
type EnumDefinition struct {
	// Name is the name of the enum type.
	Name string

	// Values is a list of distinct values that are part of the enum type.
	Values []string
}

// Validate validates the column.
func (c Column) Validate() error {
	// non-empty name
	if c.Name == "" {
		return fmt.Errorf("column name cannot be empty")
	}

	// valid kind
	if err := c.Kind.Validate(); err != nil {
		return fmt.Errorf("invalid column type for %q: %w", c.Name, err)
	}

	// address prefix is required for Bech32AddressKind and invalid for every other kind
	if c.Kind == Bech32AddressKind && c.AddressPrefix == "" {
		return fmt.Errorf("missing address prefix for column %q", c.Name)
	} else if c.Kind != Bech32AddressKind && c.AddressPrefix != "" {
		return fmt.Errorf("address prefix is only valid for column %q with type Bech32AddressKind", c.Name)
	}

	// enum definition is required for EnumKind and invalid for every other kind
	if c.Kind == EnumKind {
		if err := c.EnumDefinition.Validate(); err != nil {
			return fmt.Errorf("invalid enum definition for column %q: %w", c.Name, err)
		}
	} else if c.EnumDefinition.Name != "" || c.EnumDefinition.Values != nil {
		// reject a partially filled definition too (name only, or values only)
		return fmt.Errorf("enum definition is only valid for column %q with type EnumKind", c.Name)
	}

	return nil
}

// Validate validates the enum definition.
func (e EnumDefinition) Validate() error {
	if e.Name == "" {
		return fmt.Errorf("enum definition name cannot be empty")
	}
	if len(e.Values) == 0 {
		return fmt.Errorf("enum definition values cannot be empty")
	}
	seen := make(map[string]bool, len(e.Values))
	for i, v := range e.Values {
		if v == "" {
			return fmt.Errorf("enum definition value at index %d cannot be empty for enum %s", i, e.Name)
		}
		if seen[v] {
			return fmt.Errorf("duplicate enum definition value %q for enum %s", v, e.Name)
		}
		seen[v] = true
	}
	return nil
}

// ValidateValue validates that the value conforms to the column's kind and nullability.
// It currently does not do any validation that IntegerKind, DecimalKind, Bech32AddressKind, or EnumKind
// values are valid for their respective types beyond conforming to the correct Go type.
func (c Column) ValidateValue(value any) error {
	if value == nil {
		if !c.Nullable {
			return fmt.Errorf("column %q cannot be null", c.Name)
		}
		return nil
	}
	return c.Kind.ValidateValueType(value)
}

// ValidateKey validates that the value conforms to the set of columns as a Key in an EntityUpdate.
// See EntityUpdate.Key for documentation on the requirements of such values.
func ValidateKey(cols []Column, value any) error {
	if len(cols) == 0 {
		return nil
	}

	if len(cols) == 1 {
		return cols[0].ValidateValue(value)
	}

	values, ok := value.([]any)
	if !ok {
		return fmt.Errorf("expected slice of values for key columns, got %T", value)
	}

	if len(cols) != len(values) {
		return fmt.Errorf("expected %d key columns, got %d values", len(cols), len(values))
	}
	for i, col := range cols {
		if err := col.ValidateValue(values[i]); err != nil {
			return fmt.Errorf("invalid value for key column %q: %w", col.Name, err)
		}
	}
	return nil
}

// ValidateValue validates that the value conforms to the set of columns as a Value in an EntityUpdate.
// See EntityUpdate.Value for documentation on the requirements of such values.
func ValidateValue(cols []Column, value any) error {
	valueUpdates, ok := value.(ValueUpdates)
	if !ok {
		return ValidateKey(cols, value)
	}

	colMap := map[string]Column{}
	for _, col := range cols {
		colMap[col.Name] = col
	}
	var errs []error
	valueUpdates.Iterate(func(colName string, value any) bool {
		col, ok := colMap[colName]
		if !ok {
			errs = append(errs, fmt.Errorf("unknown column %q in value updates", colName))
			// col is the zero Column here, so skip value validation for it
			return true
		}
		if err := col.ValidateValue(value); err != nil {
			errs = append(errs, fmt.Errorf("invalid value for column %q: %w", colName, err))
		}
		return true
	})
	if len(errs) > 0 {
		return fmt.Errorf("validation errors: %v", errs)
	}
	return nil
}
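
The three key shapes that ValidateKey accepts (no columns, one column, several columns) can be sketched in a self-contained program. This is only an illustration under assumptions: `col`, `isString`, and `isInt64` are hypothetical stand-ins, because the real `Kind` type and its `ValidateValueType` method are not part of this diff.

```go
package main

import "fmt"

// col is a hypothetical stand-in for Column: a name, nullability, and a Go-type check.
type col struct {
	Name     string
	Nullable bool
	isType   func(v any) bool
}

// validateValue mirrors Column.ValidateValue: nil is allowed only for nullable columns.
func (c col) validateValue(v any) error {
	if v == nil {
		if !c.Nullable {
			return fmt.Errorf("column %q cannot be null", c.Name)
		}
		return nil
	}
	if !c.isType(v) {
		return fmt.Errorf("column %q: unexpected type %T", c.Name, v)
	}
	return nil
}

// validateKey mirrors ValidateKey: no columns ignores the value, one column takes a
// bare value, and several columns take a []any of matching length.
func validateKey(cols []col, value any) error {
	switch len(cols) {
	case 0:
		return nil
	case 1:
		return cols[0].validateValue(value)
	}
	values, ok := value.([]any)
	if !ok {
		return fmt.Errorf("expected slice of values for key columns, got %T", value)
	}
	if len(cols) != len(values) {
		return fmt.Errorf("expected %d key columns, got %d values", len(cols), len(values))
	}
	for i, c := range cols {
		if err := c.validateValue(values[i]); err != nil {
			return fmt.Errorf("invalid value for key column %q: %w", c.Name, err)
		}
	}
	return nil
}

func isString(v any) bool { _, ok := v.(string); return ok }
func isInt64(v any) bool  { _, ok := v.(int64); return ok }

func main() {
	pair := []col{{Name: "owner", isType: isString}, {Name: "id", isType: isInt64}}
	fmt.Println(validateKey(nil, nil))                       // singleton entity: value is ignored
	fmt.Println(validateKey(pair[:1], "alice"))              // single column: bare value
	fmt.Println(validateKey(pair, []any{"alice", int64(7)})) // composite key: slice of values
	fmt.Println(validateKey(pair, []any{"alice"}))           // wrong arity: returns an error
}
```

Note that a single-column key is passed as a bare value, not as a one-element slice, which is the asymmetry callers are most likely to trip over.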
7 changes: 7 additions & 0 deletions indexer/base/column_test.go
@@ -0,0 +1,7 @@
package indexerbase

import "testing"

func TestColumnValidate(t *testing.T) {

}
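
TestColumnValidate is still an empty stub in this commit. A table-driven layout is one way it could be filled in. The sketch below is a self-contained program rather than a real test file, so it re-declares a trimmed stand-in for Column (string kinds instead of the real `Kind` type, which is not shown in this diff) and covers only the name and address-prefix rules. The `column`, `checkColumn`, and `failingCases` names and the case list are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// column is a trimmed stand-in for indexerbase.Column; the real type uses Kind, not string.
type column struct {
	Name          string
	Kind          string // assumption: only "string" and "bech32address" are modeled
	AddressPrefix string
}

// checkColumn mirrors the name and address-prefix rules of Column.Validate.
func checkColumn(c column) error {
	if c.Name == "" {
		return errors.New("column name cannot be empty")
	}
	isAddr := c.Kind == "bech32address"
	if isAddr && c.AddressPrefix == "" {
		return fmt.Errorf("missing address prefix for column %q", c.Name)
	}
	if !isAddr && c.AddressPrefix != "" {
		return fmt.Errorf("address prefix is only valid for column %q with type Bech32AddressKind", c.Name)
	}
	return nil
}

// failingCases runs a small table of inputs and returns the names of the cases
// whose outcome did not match the expectation.
func failingCases() []string {
	cases := []struct {
		name    string
		col     column
		wantErr bool
	}{
		{"empty name", column{Kind: "string"}, true},
		{"plain string", column{Name: "memo", Kind: "string"}, false},
		{"address without prefix", column{Name: "owner", Kind: "bech32address"}, true},
		{"address with prefix", column{Name: "owner", Kind: "bech32address", AddressPrefix: "cosmos"}, false},
		{"prefix on non-address", column{Name: "memo", Kind: "string", AddressPrefix: "cosmos"}, true},
	}
	var failed []string
	for _, tc := range cases {
		if err := checkColumn(tc.col); (err != nil) != tc.wantErr {
			failed = append(failed, tc.name)
		}
	}
	return failed
}

func main() {
	fmt.Println("failing cases:", failingCases())
}
```

In the real test file the same table would presumably drive `t.Run(tc.name, ...)` against `Column.Validate` directly, with extra rows for the enum-definition rules.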
40 changes: 40 additions & 0 deletions indexer/base/entity.go
@@ -0,0 +1,40 @@
package indexerbase

// EntityUpdate represents an update operation on an entity in the schema.
type EntityUpdate struct {
	// TableName is the name of the table that the entity belongs to in the schema.
	TableName string

	// Key is the value of the primary key of the entity and must conform to these constraints with respect
	// to the schema that is defined for the entity:
	// - if key represents a single column, then the value must be valid for the first column in that
	// 	column list. For instance, if there is one column in the key of type String, then the value must be of
	// 	type string
	// - if key represents multiple columns, then the value must be a slice of values where each value is valid
	// 	for the corresponding column in the column list. For instance, if there are two columns in the key of
	// 	type String, String, then the value must be a slice of two strings.
	// If the key has no columns, meaning that this is a singleton entity, then this value is ignored and can be nil.
	Key any

	// Value holds the non-primary key columns of the entity and can either conform to the same constraints
	// as EntityUpdate.Key or it may be an instance of ValueUpdates. ValueUpdates can be used as a performance
	// optimization to avoid copying the values of the entity into the update and/or to omit unchanged columns.
	// If this is a delete operation, then this value is ignored and can be nil.
	Value any

	// Delete is a flag that indicates whether this update is a delete operation. If true, then the Value field
	// is ignored and can be nil.
	Delete bool
}

// ValueUpdates is an interface that represents the value columns of an entity update. Columns that
// were not updated may be excluded from the update. Consumers should be aware that implementations
// may not filter out columns that were unchanged. However, if a column is omitted from the update
// it should be considered unchanged.
type ValueUpdates interface {

	// Iterate iterates over the columns and values in the entity update. The function should return
	// true to continue iteration or false to stop iteration. Each column value should conform
	// to the requirements of that column's type in the schema.
	Iterate(func(col string, value any) bool)
Contributor

no error?

Member Author

We could error

}
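
As a rough illustration of the ValueUpdates contract, a decoder could back it with a map that holds only the changed columns. The sketch below assumes the interface exactly as it appears in this diff (no error return, which the review thread leaves open). `mapUpdates` and `collect` are hypothetical names, and sorting the column names is a choice made here for deterministic output, not something the interface requires.

```go
package main

import (
	"fmt"
	"sort"
)

// ValueUpdates is copied from the interface under review.
type ValueUpdates interface {
	Iterate(func(col string, value any) bool)
}

// mapUpdates is a hypothetical implementation backed by a map of changed columns.
// Columns missing from the map are treated as unchanged by consumers.
type mapUpdates map[string]any

// Iterate visits columns in sorted name order and stops as soon as f returns false.
func (m mapUpdates) Iterate(f func(col string, value any) bool) {
	names := make([]string, 0, len(m))
	for name := range m {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		if !f(name, m[name]) {
			return
		}
	}
}

// collect gathers the visited column names, stopping after limit columns.
// A limit of zero or less means no limit.
func collect(u ValueUpdates, limit int) []string {
	var seen []string
	u.Iterate(func(col string, _ any) bool {
		seen = append(seen, col)
		return limit <= 0 || len(seen) < limit
	})
	return seen
}

func main() {
	u := mapUpdates{"memo": nil, "balance": int64(100)}
	fmt.Println(collect(u, 0)) // visits every changed column
	fmt.Println(collect(u, 1)) // callback returns false after the first column
}
```

Because the callback only returns a bool, a consumer that hits a problem mid-iteration has to capture the error in a closure variable, which is what ValidateValue in column.go does with its `errs` slice.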
6 changes: 6 additions & 0 deletions indexer/base/go.mod
@@ -0,0 +1,6 @@
module cosmossdk.io/indexer/base
Member Author

Should we maybe just call this cosmossdk.io/indexer? I'm not sure what package would be simply cosmossdk.io/indexer if it's not this one, although maybe good to leave as base to communicate the intent of a base package clearly.

Member

this works, no need to change


// NOTE: this go.mod should have zero dependencies and remain on an older version of Go
// to be compatible with legacy codebases.

go 1.19
Member

One comment from our call: should we target an even earlier Go release? One excellent outcome here would be that if all else fails, i.e. I (as streaming developer) can't logically decode state updates from previous versions in the latest app binary, at least I can patch prior versions and use those binaries to stream from genesis.

Member Author

If we're going all the way back to the beginning (basically Gaia 1.0) that means Go v1.12. I think the main change is converting all anys to interface{}. Any other changes you can see needed?

Member Author

Made this update. Is there any easy way we could check this in CI? I guess we could create a GitHub action with an older version of go just for this module?

Member Author
@aaronc, Jun 14, 2024

Would we also want to make sure that the Postgres indexer module is compatible with earlier versions of Go? I guess I'm wondering if there would be any issue building legacy codebases with a newer version of Go. For instance, would Gaia 1.0 compile with Go 1.22? Maybe I'll do a quick check
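
For context on the `any` to `interface{}` conversion discussed in this thread: `any` is an alias for `interface{}` that was introduced in Go 1.18, so a module that must build on Go 1.12 has to spell it out. A minimal sketch of the older spelling follows; `keyArity` is a hypothetical helper written for this illustration and is not part of the PR. Separately, the `%w` verb used with `fmt.Errorf` in column.go was only added in Go 1.13, so a Go 1.12 target would likely need those calls revisited as well.

```go
package main

import "fmt"

// keyArity reports how many key values a caller supplied, using interface{}
// in place of any so that the code also builds on toolchains older than Go 1.18.
// keyArity is a hypothetical helper for illustration only.
func keyArity(value interface{}) int {
	if value == nil {
		return 0
	}
	if values, ok := value.([]interface{}); ok {
		return len(values)
	}
	return 1
}

func main() {
	fmt.Println(keyArity(nil))
	fmt.Println(keyArity("alice"))
	fmt.Println(keyArity([]interface{}{"alice", int64(7)}))
}
```

Since `any` is an alias and not a distinct type, callers on Go 1.18 or later can keep passing `[]any` values to functions declared with `[]interface{}`.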

Empty file added indexer/base/go.sum