Planning the kwil‐db repo for SDK and apps

jchappelow edited this page Oct 9, 2023 · 2 revisions

Go Submodule Structure and Tagging

Module and Submodule Basics

The module defined at the root of a repository via a go.mod file is referred to as the main module.

A package in a subdirectory can be turned into its own submodule by putting a go.mod file in it.

A repository can have many submodules, all with their own versioning and dependencies.

Value and Use of Submodules

Submodules can encapsulate specific functionality, making it easier to manage dependencies, reduce the frequency of breaking changes, and scope functionality for certain consumers. For example, if we have packages with different responsibilities or targets (e.g. SDK, CLI apps, test framework), we can use submodules to keep them separate.

Each submodule is versioned independently. This is particularly important because product release versions are distinct from Go module versions, which can be as fine-grained as a single package. This means that the git tags also include the submodule name, which is explained in the next section.

Tagging Rules

Go sets very strict rules for versioning of modules with tags in a version control system such as git. A specific syntax built on Semantic Versioning (SemVer) is used for importable module version resolution, while arbitrary tag formats may be used for other modules containing code like package main applications or other purposes.

SemVer

The main module (at the root of the repository) is associated with any plain semver tags like v1.2.3 or v1.2.3-rc.4. The "v" prefix is not part of semver, but Go has hijacked this natural tag format for its modules.

v0 and v1 Modules

A submodule in a directory like pkg/client, established by the presence of a go.mod file there, is tagged like pkg/client/v1.1.0. It appears in a consumer's go.mod like:

require (
    github.com/kwilteam/kwil-db/pkg/client v1.1.0
)

From just this require line, there is ambiguity about the module in which this package is defined. The Go tooling will resolve this as a package under one of:

  • the main module github.com/kwilteam/kwil-db (tag v1.1.0)
  • the submodule github.com/kwilteam/kwil-db/pkg (tag pkg/v1.1.0)
  • the submodule github.com/kwilteam/kwil-db/pkg/client (tag pkg/client/v1.1.0)

Remember, tags are a concept in a version control system at the level of the repository, hence the need for this tag format.
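
The three candidates listed above follow mechanically from the path of the package. The following is a minimal sketch of that enumeration, assuming a v0/v1 module with no /vN path element. The Candidate type and the candidateModules function are hypothetical names for illustration, and this is not how the Go tooling itself is implemented.

```go
package main

import (
	"fmt"
	"strings"
)

// Candidate pairs a possible module path with the git tag that would have
// to exist in the repository for that module at the given version.
// (Hypothetical type, for illustration only.)
type Candidate struct {
	ModulePath string
	Tag        string
}

// candidateModules lists every module that could contain the package at
// pkgPath, from the repository root down to the package directory itself.
// It assumes pkgPath is inside repoRoot and that no /vN element is involved.
func candidateModules(repoRoot, pkgPath, version string) []Candidate {
	// The main module is always a candidate and uses the plain semver tag.
	out := []Candidate{{ModulePath: repoRoot, Tag: version}}

	rel := strings.TrimPrefix(strings.TrimPrefix(pkgPath, repoRoot), "/")
	if rel == "" {
		return out
	}

	// Each deeper directory could hold a go.mod, and its tag is prefixed
	// with the directory path relative to the repository root.
	subdir := ""
	for _, elem := range strings.Split(rel, "/") {
		if subdir == "" {
			subdir = elem
		} else {
			subdir += "/" + elem
		}
		out = append(out, Candidate{
			ModulePath: repoRoot + "/" + subdir,
			Tag:        subdir + "/" + version,
		})
	}
	return out
}

func main() {
	cands := candidateModules(
		"github.com/kwilteam/kwil-db",
		"github.com/kwilteam/kwil-db/pkg/client",
		"v1.1.0",
	)
	for _, c := range cands {
		fmt.Printf("%s (tag %s)\n", c.ModulePath, c.Tag)
	}
}
```

Which of the candidates is the real module depends only on where go.mod files exist in the repository at the tagged revision.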

v2+ Modules

For a v2+ module, the major version is on both the require and import paths. It would be specified explicitly as one of the following without ambiguity:

  • github.com/kwilteam/kwil-db/v2/pkg/client (tag v2.1.0)
  • github.com/kwilteam/kwil-db/pkg/v2/client (tag pkg/v2.1.0)
  • github.com/kwilteam/kwil-db/pkg/client/v2 (tag pkg/client/v2.1.0)

In all three cases above, the required version in a go.mod is still v2.1.0, but the location of the v2 in the path, and the tag, both differ. This is the gist of Semantic Import Versioning (SIV).
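
The relationship between a module path and its tag can be written down as a small function. This is a sketch under stated assumptions: tagFor and splitMajorSuffix are hypothetical helpers, the input is the module path (for example github.com/kwilteam/kwil-db/pkg/v2, not the import path of the client package inside it), and special cases such as gopkg.in paths are ignored.

```go
package main

import (
	"fmt"
	"strings"
)

// splitMajorSuffix separates a trailing "/vN" element (N >= 2) from a
// module path. For v0/v1 modules the returned major is empty.
func splitMajorSuffix(modulePath string) (prefix, major string) {
	i := strings.LastIndex(modulePath, "/")
	if i < 0 {
		return modulePath, ""
	}
	last := modulePath[i+1:]
	if len(last) < 2 || last[0] != 'v' || last[1] == '0' || last == "v1" {
		return modulePath, ""
	}
	for _, r := range last[1:] {
		if r < '0' || r > '9' {
			return modulePath, ""
		}
	}
	return modulePath[:i], last
}

// tagFor returns the git tag that would publish the given version of the
// module at modulePath, inside the repository rooted at repoRoot.
func tagFor(repoRoot, modulePath, version string) (string, error) {
	prefix, major := splitMajorSuffix(modulePath)
	if major == "" {
		// Without a /vN suffix only v0 and v1 versions are allowed.
		if !strings.HasPrefix(version, "v0.") && !strings.HasPrefix(version, "v1.") {
			return "", fmt.Errorf("version %s needs a major version suffix on %s", version, modulePath)
		}
	} else if !strings.HasPrefix(version, major+".") {
		return "", fmt.Errorf("version %s does not match path suffix %s", version, major)
	}
	if prefix != repoRoot && !strings.HasPrefix(prefix, repoRoot+"/") {
		return "", fmt.Errorf("module %s is not inside %s", modulePath, repoRoot)
	}
	// The tag prefix is the module directory relative to the repo root,
	// with the /vN element dropped.
	subdir := strings.TrimPrefix(strings.TrimPrefix(prefix, repoRoot), "/")
	if subdir == "" {
		return version, nil
	}
	return subdir + "/" + version, nil
}

func main() {
	const root = "github.com/kwilteam/kwil-db"
	for _, mod := range []string{root + "/v2", root + "/pkg/v2", root + "/pkg/client/v2"} {
		tag, err := tagFor(root, mod, "v2.1.0")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s (tag %s)\n", mod, tag)
	}
}
```

The version string is the same in every case; only the position of the /v2 element, and therefore the tag prefix, changes.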

Non-SemVer

Arbitrary non-conforming git tags may also be used, which is often fitting for application product releases (builds of non-importable package main code). For example, release-v0.7.2 can be used to go install an app in some path in the repo.

For example:

go install github.com/kwilteam/kwil-db/cmd/kwild@release-v0.7.2

This release-v0.7.2 tag just points to a git revision in the repository. There is no need to ever decorate the path with a /vX as long as the containing (sub)module is not defined with such a suffix. (Hint: carve out a sub-module for cmd or cmd/kwild if needed and never make a conforming semver tag for it.)

Caveat: This is not suitable for importable modules as it is not considered in import version resolution (or the special @latest syntax in the go tooling).
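
To make the distinction concrete, here is a rough sketch that classifies tags as module version tags or arbitrary tags. The parseModuleTag function is a hypothetical helper, and the regular expression is a simplified approximation of semver (major.minor.patch with an optional pre-release part), not the exact grammar the Go tooling applies.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// semverTag approximates vMAJOR.MINOR.PATCH with an optional pre-release
// part. It is a simplification for illustration.
var semverTag = regexp.MustCompile(
	`^v(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)(-[0-9A-Za-z.-]+)?$`)

// parseModuleTag splits a git tag into a module subdirectory and a version.
// ok is false when the tag does not follow the "[subdir/]vX.Y.Z" convention,
// in which case it is just an ordinary name for a revision.
func parseModuleTag(tag string) (subdir, version string, ok bool) {
	version = tag
	if i := strings.LastIndex(tag, "/"); i >= 0 {
		subdir, version = tag[:i], tag[i+1:]
	}
	if !semverTag.MatchString(version) {
		return "", "", false
	}
	return subdir, version, true
}

func main() {
	tags := []string{"v1.2.3", "v1.2.3-rc.4", "pkg/client/v1.1.0", "release-v0.7.2"}
	for _, tag := range tags {
		subdir, version, ok := parseModuleTag(tag)
		if !ok {
			fmt.Printf("%s: not a module version tag\n", tag)
			continue
		}
		fmt.Printf("%s: subdir=%q version=%s\n", tag, subdir, version)
	}
}
```

A tag like release-v0.7.2 falls through as an ordinary revision name, which is why it can be used for product releases without affecting module version resolution.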

How Module Rules Influence Repository Structure

Given the tagging rules established for modules in Go 1.14, modules defined deep in the directory tree are cumbersome to tag and use. A relatively flat structure has always been desirable, but it is particularly desirable with Go.

For example, the ANTLR4 Go runtime was at github.com/antlr/antlr4/runtime/Go/antlr/v4, but this created such issues for module tagging that they moved the entire thing to github.com/antlr4-go/antlr, where the repo's plain semver tags apply directly to the module instead of requiring tags prefixed with that deep path.

What Should Kwil Do?

What is the Kwil Go SDK?

First of all, what do we consider the "Kwil Go SDK"? The code that is required for the apps like kwild and the code we want others using is very muddled right now. For the sake of this discussion on tagging and modules, let's just assume it's a subset of what is in pkg and come back to this question.

Distinguish Module and Product Release Versions

With our Go SDK, SIV conventions dictate that we must increment the major version value of the module when there are breaking changes (once mature enough to make the v1 promise). However, the module version will not always be what we want designated for our product releases. Module revisions generally change more frequently and will jump ahead in their major version compared to the applications.

The kwil-db repository provides both the command line apps and the SDK packages. Unless we extract the SDK into a separate repository (not actually a terrible idea, but we'll come back to that), there are two general options:

  • Designate the main module for our public-facing packages, using plain semver tags like vA.B.C for these, while using special tags like release-vX.Y.Z for CLI product releases (from which downloadable apps are built).
  • Use a submodule (or many) for the importable developer packages, with tags like pkg/vA.B.C (or pkg/client/vD.E.F, etc.), and keep CLI apps and other components in the main module with plain semver tags like vX.Y.Z.

In either case, other submodules may be defined to isolate code as needed, such as with the case of the test code. Code just for the application would ideally be isolated from the SDK packages.

Establish Boundaries Between Application and Publicly-Importable Code

We also wish to delineate code used only to compose the applications from code that external developers may import. This is desirable both to establish clearer package boundaries and concerns, and to keep the dependency tree of the Go SDK module(s) as small as possible.

It may be difficult or impossible to make the decision defined in the previous section until we are able to do this.

Prodding kwil-db with go list

pkg/client

The imports (direct and transitive) of other kwilteam org packages:

$ go list -f '{{ join .Deps "\n" }}' ./pkg/client/ | grep kwilteam
github.com/kwilteam/kwil-db/api/protobuf/tx/v1 ***
github.com/kwilteam/kwil-db/internal/pkg/transport ***
github.com/kwilteam/kwil-db/pkg/auth
github.com/kwilteam/kwil-db/pkg/balances
github.com/kwilteam/kwil-db/pkg/client/types
github.com/kwilteam/kwil-db/pkg/crypto
github.com/kwilteam/kwil-db/pkg/engine/utils
github.com/kwilteam/kwil-db/pkg/grpc/client/v1
github.com/kwilteam/kwil-db/pkg/grpc/client/v1/conversion
github.com/kwilteam/kwil-db/pkg/log
github.com/kwilteam/kwil-db/pkg/serialize
github.com/kwilteam/kwil-db/pkg/sessions
github.com/kwilteam/kwil-db/pkg/sessions/sql-session
github.com/kwilteam/kwil-db/pkg/sql
github.com/kwilteam/kwil-db/pkg/transactions
github.com/kwilteam/kwil-db/pkg/utils/order
github.com/kwilteam/kwil-db/pkg/utils/random
github.com/kwilteam/kwil-db/pkg/utils/serialization
github.com/kwilteam/kwil-db/pkg/validators

There are two oddballs there (marked ***), but it's not bad. Is the list above perhaps the subset of packages we would consider to be the Kwil Go SDK?
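
Spotting such oddballs can be automated. The sketch below filters a dependency list for packages that belong to the repository but sit outside a given subtree. outsideSubtree is a hypothetical helper, and the input here is a hand-copied subset of the listing above rather than live go list output.

```go
package main

import (
	"fmt"
	"strings"
)

// outsideSubtree returns the packages in deps that belong to the repository
// at repoRoot but do not live under repoRoot/subtree.
func outsideSubtree(deps []string, repoRoot, subtree string) []string {
	var out []string
	inside := repoRoot + "/" + subtree
	for _, dep := range deps {
		if !strings.HasPrefix(dep, repoRoot+"/") {
			continue // a package from a different repository
		}
		if dep == inside || strings.HasPrefix(dep, inside+"/") {
			continue // within the subtree, as expected
		}
		out = append(out, dep)
	}
	return out
}

func main() {
	// A subset of the pkg/client dependency listing.
	deps := []string{
		"github.com/kwilteam/kwil-db/api/protobuf/tx/v1",
		"github.com/kwilteam/kwil-db/internal/pkg/transport",
		"github.com/kwilteam/kwil-db/pkg/auth",
		"github.com/kwilteam/kwil-db/pkg/crypto",
		"github.com/kwilteam/kwil-db/pkg/transactions",
	}
	for _, dep := range outsideSubtree(deps, "github.com/kwilteam/kwil-db", "pkg") {
		fmt.Println(dep)
	}
}
```

Any package this reports would have to move, or be pulled into the SDK module, before pkg could be carved out as a self-contained submodule.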

cmd/kwild

The same list for cmd/kwild should be just about everything in the kwil-db repo and our other repos.

$ go list -f '{{ join .Deps "\n" }}' ./cmd/kwild | grep kwilteam
github.com/kwilteam/action-grammar-go/actgrammar
github.com/kwilteam/go-sqlite
github.com/kwilteam/go-sqlite/ext/error
github.com/kwilteam/go-sqlite/fs
github.com/kwilteam/go-sqlite/sqlitex
github.com/kwilteam/kwil-db/api/openapi-spec/api
github.com/kwilteam/kwil-db/api/protobuf/admin/v0
github.com/kwilteam/kwil-db/api/protobuf/tx/v1
github.com/kwilteam/kwil-db/extensions/auth
github.com/kwilteam/kwil-db/internal/app/kwild
github.com/kwilteam/kwil-db/internal/app/kwild/config
github.com/kwilteam/kwil-db/internal/app/kwild/server
github.com/kwilteam/kwil-db/internal/controller/grpc/admin/v0
github.com/kwilteam/kwil-db/internal/controller/grpc/healthsvc/v0
github.com/kwilteam/kwil-db/internal/controller/grpc/txsvc/v1
github.com/kwilteam/kwil-db/internal/controller/http/swagger
github.com/kwilteam/kwil-db/internal/controller/http/v1/health
github.com/kwilteam/kwil-db/internal/pkg/healthcheck
github.com/kwilteam/kwil-db/internal/pkg/healthcheck/simple-checker
github.com/kwilteam/kwil-db/internal/pkg/transport
github.com/kwilteam/kwil-db/internal/pkg/version
github.com/kwilteam/kwil-db/pkg/abci
github.com/kwilteam/kwil-db/pkg/abci/cometbft
github.com/kwilteam/kwil-db/pkg/abci/cometbft/privval
github.com/kwilteam/kwil-db/pkg/admin/types
github.com/kwilteam/kwil-db/pkg/auth
github.com/kwilteam/kwil-db/pkg/balances
github.com/kwilteam/kwil-db/pkg/client/types
github.com/kwilteam/kwil-db/pkg/crypto
github.com/kwilteam/kwil-db/pkg/engine
github.com/kwilteam/kwil-db/pkg/engine/dataset
github.com/kwilteam/kwil-db/pkg/engine/dataset/actparser
github.com/kwilteam/kwil-db/pkg/engine/dataset/evaluater
github.com/kwilteam/kwil-db/pkg/engine/db
github.com/kwilteam/kwil-db/pkg/engine/db/sql-ddl-generator
github.com/kwilteam/kwil-db/pkg/engine/execution
github.com/kwilteam/kwil-db/pkg/engine/master
github.com/kwilteam/kwil-db/pkg/engine/sqlanalyzer
github.com/kwilteam/kwil-db/pkg/engine/sqlanalyzer/aggregate
github.com/kwilteam/kwil-db/pkg/engine/sqlanalyzer/attributes
github.com/kwilteam/kwil-db/pkg/engine/sqlanalyzer/clean
github.com/kwilteam/kwil-db/pkg/engine/sqlanalyzer/join
github.com/kwilteam/kwil-db/pkg/engine/sqlanalyzer/mutative
github.com/kwilteam/kwil-db/pkg/engine/sqlanalyzer/order
github.com/kwilteam/kwil-db/pkg/engine/sqlanalyzer/utils
github.com/kwilteam/kwil-db/pkg/engine/sqlparser
github.com/kwilteam/kwil-db/pkg/engine/sqlparser/tree
github.com/kwilteam/kwil-db/pkg/engine/sqlparser/tree/sql-writer
github.com/kwilteam/kwil-db/pkg/engine/types
github.com/kwilteam/kwil-db/pkg/engine/types/validation
github.com/kwilteam/kwil-db/pkg/engine/utils
github.com/kwilteam/kwil-db/pkg/extensions
github.com/kwilteam/kwil-db/pkg/grpc/client/v1/conversion
github.com/kwilteam/kwil-db/pkg/grpc/gateway
github.com/kwilteam/kwil-db/pkg/grpc/gateway/middleware
github.com/kwilteam/kwil-db/pkg/grpc/gateway/middleware/cors
github.com/kwilteam/kwil-db/pkg/grpc/server
github.com/kwilteam/kwil-db/pkg/kv
github.com/kwilteam/kwil-db/pkg/kv/atomic
github.com/kwilteam/kwil-db/pkg/kv/badger
github.com/kwilteam/kwil-db/pkg/log
github.com/kwilteam/kwil-db/pkg/modules/datasets
github.com/kwilteam/kwil-db/pkg/modules/validators
github.com/kwilteam/kwil-db/pkg/serialize
github.com/kwilteam/kwil-db/pkg/sessions
github.com/kwilteam/kwil-db/pkg/sessions/sql-session
github.com/kwilteam/kwil-db/pkg/sessions/wal
github.com/kwilteam/kwil-db/pkg/snapshots
github.com/kwilteam/kwil-db/pkg/sql
github.com/kwilteam/kwil-db/pkg/sql/client
github.com/kwilteam/kwil-db/pkg/sql/sqlite
github.com/kwilteam/kwil-db/pkg/sql/sqlite/functions
github.com/kwilteam/kwil-db/pkg/sql/sqlite/functions/addresses
github.com/kwilteam/kwil-db/pkg/transactions
github.com/kwilteam/kwil-db/pkg/utils
github.com/kwilteam/kwil-db/pkg/utils/numbers/bytes
github.com/kwilteam/kwil-db/pkg/utils/order
github.com/kwilteam/kwil-db/pkg/utils/random
github.com/kwilteam/kwil-db/pkg/utils/serialization
github.com/kwilteam/kwil-db/pkg/validators
github.com/kwilteam/kwil-extensions/client
github.com/kwilteam/kwil-extensions/gen
github.com/kwilteam/kwil-extensions/types
github.com/kwilteam/kwil-extensions/types/convert
github.com/kwilteam/sql-grammar-go/sqlgrammar

pkg

The pkg directory flies in the face of current Go principles and the tooling (modules and internal), but in a repo containing both an application (or app suite) and importable packages, it may be the right place to declare an "SDK" submodule.

We may want to rename or move it. Some considerations:

  • The pkg directory does not play nice with Go modules. The "pkg/" prefix would go in every importable Go submodule tag, although it is redundant in a tag like pkg/client/v1.2.3 since the tag references a submodule with packages implicitly intended to be imported. Options that would make more sense for these are:

    • (a) rename to go-kwil or sdk to imply the repo is multi-purpose,
    • (b) top-level packages with no pkg to make the repo itself the SDK with other stuff isolated in submodules
    • (c) an entirely separate repository if the submodule situation is too awkward.

    The choice depends on what we consider to be the Kwil SDK, which is very unclear right now, and on the primary purpose of the kwil-db repo.

  • Go's internal directory has the reverse semantics, and those are formal and enforced by the toolchain. A pkg directory is only an informal, optional negation of internal.
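
For contrast with the purely conventional pkg directory, the enforced internal rule can be sketched in a few lines. This is a simplified paraphrase of the documented rule (a package under an internal directory is importable only from within the tree rooted at the parent of that directory), not the toolchain's own code. canImport and internalParent are hypothetical names, and github.com/example/app stands in for an arbitrary external consumer.

```go
package main

import (
	"fmt"
	"strings"
)

// internalParent returns the path that an importer must be rooted under in
// order to import pkgPath, and whether pkgPath has an "internal" element.
// The last "internal" element is used since it is the most restrictive.
func internalParent(pkgPath string) (string, bool) {
	elems := strings.Split(pkgPath, "/")
	for i := len(elems) - 1; i >= 0; i-- {
		if elems[i] == "internal" {
			return strings.Join(elems[:i], "/"), true
		}
	}
	return "", false
}

// canImport reports whether code in the package importer may import pkgPath
// under the internal directory rule.
func canImport(importer, pkgPath string) bool {
	parent, ok := internalParent(pkgPath)
	if !ok {
		return true // no internal element: importable by anyone
	}
	return importer == parent || strings.HasPrefix(importer, parent+"/")
}

func main() {
	const repo = "github.com/kwilteam/kwil-db"
	checks := [][2]string{
		{repo + "/cmd/kwild", repo + "/internal/app/kwild"},
		{"github.com/example/app", repo + "/internal/pkg/transport"},
		{"github.com/example/app", repo + "/pkg/client"},
		{repo + "/pkg/client", repo + "/cmd/kwild/internal/config"},
	}
	for _, c := range checks {
		fmt.Printf("%s -> %s: %t\n", c[0], c[1], canImport(c[0], c[1]))
	}
}
```

Nothing comparable exists for pkg: the compiler treats it as an ordinary path element, so it carries no guarantee either way.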

Commentary

It is absolutely NOT idiomatic to put packages you want a consumer to import in a pkg folder, although some seemingly authoritative sources have pushed it as such. The vast majority of packages anyone imports lack this "pkg/" element for good reason, and there will be far fewer in the future on account of modules.

The presence of this pkg folder (and even api, internal/app, scripts, and deployments) gives the repo a bad smell. The internal/app directory is particularly tricky with modules, and it demonstrates how such a layout scopes logic required for a specific cmd/thing in an unexpected place. cmd/thing/internal is the obvious location; we just have to contend with some things like the config package, which probably needs to merge with nodecfg.

We know what we're building and who we are. There are no standard layouts, and we can define things ourselves with the current language and toolset in mind.