Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert library entrypoints to immutable interfaces #397

Merged
merged 41 commits into from
Sep 28, 2022

Conversation

anuraaga
Copy link
Contributor

@anuraaga anuraaga commented Sep 5, 2022

This is a very early look into exploring #371

While writing it, I noticed another objective, where it could be easier to use the API if Transaction is split into the interface for WAF users and those writing custom rules / operators. From my cursory understanding, methods like AddRequestHeader, Process* only make sense in former and Capture only make sense in latter. In this early iteration, we can see the Transaction interface for WAF users. Rule also probably needs to have a separate interface for defining rules programatically and implementing operators, etc.

The approach for doing this is to first stuff what we have into internal/corazawaf, internal/seclang and start creating public API from scratch delegating to these. After completing migration, there is probably cleanup that could be done which may result in the removal of delegation, but that could happen after locking in a public API.

Any methods suffixed Next() are placeholders to replace existing methods in future iterations of this PR.

The important files at the moment are (these links load slow! wait a few seconds, don't scroll)

Public API:
config.go
waf.go

Public API Users:
coraza_test.go
http.go

Let me know what you think of this very general direction and whether we should pursue it to completion. It will take time to reflect these patterns across all the APIs, but IMO it would be worth it to improve the ease-of-use of the API by making sure they are built on the actual use cases / user personas rather than having too many large interfaces with sort of jumbling of concerns, and chance of safety issues because of mutability. As we move towards using Coraza more in production, v3 is hopefully the right time to settle on the API that avoids a v4 for as long as possible.

Happy to hear thoughts!

@anuraaga anuraaga requested review from jptosso, jcchavezs, fzipi and a team September 5, 2022 07:53
)

type WAFConfig interface {
WithRule(rule *corazawaf.Rule) WAFConfig
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about accepting a variadic set of rules? This will save clone calls and will allow a more fluent API.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably also renaming it to WithRules.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we take it post merge? @anuraaga

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry yeah forgot about this one will follow up


func (c *wafConfig) WithRequestBodyAccess(config RequestBodyConfig) WAFConfig {
ret := c.clone()
ret.requestBody = config.(*requestBodyConfig)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since RequestBodyConfig is an interface this casting can of course fail because anyone can implement the interface and attempt to do dynamic stuff. I think for the sake of correctness this interface can have a private method so you make sure you are the only one which can implement it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


func (c *auditLogConfig) LogRelevantOnly() AuditLogConfig {
ret := c.clone()
c.relevantOnly = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
c.relevantOnly = true
ret.relevantOnly = true

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// detail, we get the abstraction we need while being able to handle paths as
// the os package otherwise would.
// More context in: https://github.com/golang/go/issues/44279
type OSFS struct{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest with list the interfaces this implements.

config.go Outdated
WithDebugLogger(logger corazawaf.DebugLogger) WAFConfig
WithErrorLogger(logger corazawaf.ErrorLogCallback) WAFConfig

WithFSRoot(fs fs.FS) WAFConfig
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect something like WithRootFS or WithRootDir as it is more clear although the parameter is a Fs and not a Dir cc @jptosso

@anuraaga
Copy link
Contributor Author

anuraaga commented Sep 6, 2022

Thanks for all the good comments. I'd like a high-level "keep at it" from @jptosso before proceeding too far as it will be a lot more to do.

@anuraaga
Copy link
Contributor Author

Another big addition in 911e0ad

I think this shows where things are going in general. It defines a public API for actions plugins. Unfortunately it's a bit hard to know what the right API for them is :) While working through the current actions, I noticed most are 1:1 with implementation details, e.g. set a field on transaction which is used by internal coraza logic. I kept these using internal implementation details as they don't really represent a plugin per se.

It means the actions plugin API is rather anemic, but maybe it's still useful enough given it has access to important parts of the transaction. I think the notable addition needed which I didn't do quite yet is a mechanism to propagate state from Init to Evaluate in a generic way as opposed to static fields on Rule (similar to Context).

https://github.com/corazawaf/coraza/pull/397/files#diff-d6814085206adf41797e66ab710ce2e744d55ffc10e71a2bb0e28a3f92196b4aR25

The main API is TransactionState which attempts to be the mutable bits of a Transaction, anything necessary in rule evaluation but not user level orchestration

https://github.com/corazawaf/coraza/pull/397/files#diff-2b27646a11f4d8723da8f9fd16bb53fd4fd25fadf866c44481a87b6fe3ba265fR15

One thing I'm particularly happy about is actions not having to check rule engine before interruption. We can focus on making this API easy to use without being coupled to internal details.

https://github.com/corazawaf/coraza/pull/397/files#diff-081faffa20610c1510154a66fcff0bbf617e176cf3a420ba5f5f1838ceb72503R124
https://github.com/corazawaf/coraza/pull/397/files#diff-0c56e81f05d943bfb010c8ff984018bab39c317d383030232f9164c96b019454L23

Next I'll make the operators a public API and then programatic rules, and then scrub through remaining parts.

@jptosso Feel free to leave any thoughts on where this is going, though I'll keep on going anyways :-)

@jptosso
Copy link
Member

jptosso commented Sep 14, 2022

I think this shows where things are going in general. It defines a public API for actions plugins. Unfortunately it's a bit hard to know what the right API for them is :) While working through the current actions, I noticed most are 1:1 with implementation details, e.g. set a field on transaction which is used by internal coraza logic. I kept these using internal implementation details as they don't really represent a plugin per se.

We must keep enough exported fields for metadata, for example, a rule action can update a transaction and a rule (during bootstrap), for that reason, we should have TransactionMetadata and RuleMetadata (which already exist). Maybe actions should do something like tx.Metadata.Status = 500. Or we can just create a setter.

@anuraaga
Copy link
Contributor Author

I'm going to give this another read-through by Monday but think it's mostly ready for a first step in a few. The API can't be fully cleaned up in this one PR but I think it'll be a big step without losing current functionality like custom operators/actions.

@jptosso
Copy link
Member

jptosso commented Sep 26, 2022

@anuraaga is this ready for review?

@anuraaga
Copy link
Contributor Author

Sorry got distracted by backslashes of all things 😭 Will definitely open for review tomorrow

@anuraaga anuraaga marked this pull request as ready for review September 27, 2022 01:34
@anuraaga
Copy link
Contributor Author

@jptosso Sorry for the delay, I think for a first version this is OK, there is still more cleanup to follow up with, in particular removing API added here that we feel is overkill, etc.

"strings"
"testing"

"github.com/corazawaf/coraza/v3"
)

func TestRawRequests(t *testing.T) {
waf := coraza.NewWAF()
waf, _ := coraza.NewWAFWithConfig(coraza.NewWAFConfig())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could keep both: NewWAF and NewWAFWithConfig.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah I was wondering if this could be just NewWAF. I thought it could be cool to have a default experience with embedded rules but if we would never have that we don't need two functions (and can add NewDefaultWAF or something in the future if that changes)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NewDefaultWAF seems more suitable, because it does implement default settings.

@@ -64,7 +64,7 @@ func init() {

// GetAction returns an unwrapped RuleAction from the actionmap based on the name
// If the action does not exist it returns an error
func GetAction(name string) (coraza.RuleAction, error) {
func GetAction(name string) (corazawaf.RuleAction, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should rename this to Get?

.gitignore Outdated
@@ -16,6 +16,6 @@ test/crs
vendor/
coraza-waf
__debug_bin
seclang/crs_test.go
internal/seclang/crs_test.go
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can delete this, it was for internal tests

README.md Outdated
"context"
"fmt"
"github.com/corazawaf/coraza/v3"
"github.com/corazawaf/coraza/v3/seclang"
"github.com/corazawaf/coraza/v3/internal/seclang"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

External packages cannot import internal

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yikes forgot to update readme / example

@@ -7,20 +7,20 @@ import (
"fmt"
"strings"

"github.com/corazawaf/coraza/v3"
"github.com/corazawaf/coraza/v3/internal/corazawaf"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rename internal/corazawaf to waf?

@@ -3,12 +3,12 @@ package main
import (
"net/http"

"github.com/corazawaf/coraza/v3"
"github.com/corazawaf/coraza/v3/internal/corazawaf"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Internal cannot be called by external packages

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also implement it using immutability

actions/allow.go Outdated
)

// 0 nothing, 1 phase, 2 request
type allowFn struct {
allow int
}

func (a *allowFn) Init(r *coraza.Rule, b1 string) error {
func (a *allowFn) Init(r rules.RuleInfo, b1 string) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should rename ruleinfo to RuleMetadata? its more consistent with the rule metadata concept

config.go Outdated
WithRule(rule *corazawaf.Rule) WAFConfig

// WithDirectives parses the directives from the given string and adds them to the WAF.
WithDirectives(rules string) WAFConfig
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
WithDirectives(rules string) WAFConfig
WithDirectives(directives string) WAFConfig

config.go Outdated
WithErrorLogger(logger corazawaf.ErrorLogCallback) WAFConfig

// WithFSRoot configures the root file system.
WithFSRoot(fs fs.FS) WAFConfig
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
WithFSRoot(fs fs.FS) WAFConfig
WithRootFS(fs fs.FS) WAFConfig

tx.WAF.Logger.Debug("[%s] Capturing field %d with value %q", tx.ID, index, value)
i := strconv.Itoa(index)
tx.Variables.TX.SetIndex(i, 0, value)
if tx.Capture {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually enforce the tx.Capture validation in the operator, that way we can read only the amount of results we expect. For example, RX will load only the first match if !tx.Capture, otherwise it will read up to 10.

Maybe there is no overhead in having it here though

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think if the operator checks it it's just a doublecheck here but that isn't worthwhile overhead, but when the operator doesn't have a fast-path for non-capturing (Capturing() is still exposed to allow this for when it is the case) it makes it safer/easier to use so think this is good.

"strings"
"testing"

"github.com/corazawaf/coraza/v3"
)

func TestRawRequests(t *testing.T) {
waf := coraza.NewWAF()
waf, _ := coraza.NewWAFWithConfig(coraza.NewWAFConfig())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NewDefaultWAF seems more suitable, because it does implement default settings.

Copy link
Member

@jptosso jptosso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR might break some functionalities, but we must keep forward with this immutability pattern. LGTM

@anuraaga anuraaga merged commit fce26f2 into corazawaf:v3/dev Sep 28, 2022

func (*geoLookup) Init(coraza.RuleOperatorOptions) error { return nil }
func (*geoLookup) Init(rules.OperatorOptions) error { return nil }

// kept for compatibility, it requires a plugin.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is v3 a good opportunity to drop this? cc @jptosso

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's kept for compatibility, not used by crs by default

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants