Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: implement verbose #89

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

22dm
Copy link
Contributor

@22dm 22dm commented Apr 6, 2024

This PR contains the following three changes:

  • Support output path for non-tag field
  • Support output filename and line number where the non-tag field is located
  • Support output multiple non-tag fields at the same time, instead of the first one

For example, we have a file foo.go:

package musttag

import "encoding/json"

type NestedB struct {
	NoTagFieldB string
}

type NestedA struct {
	NestedAField NestedB `json:"NestedAField"`
	NoTagFieldA  string
}

type Foo struct {
	FieldA NestedA `json:"FieldA"`
}

func main() {
	json.Marshal(Foo{})
}

Previously, this would only output a single line:

/path/to/foo.go:19:15: the given struct should be annotated with the `json` tag

When the structure is deeply nested, it will be very difficult to find all non-tag fields.

So we can try to output more information like:

/path/to/foo.go:19:15: the given struct should be annotated with the `json` tag: .FieldA.NestedAField.NoTagFieldB (path/to/foo.go:6:2)
/path/to/foo.go:19:15: the given struct should be annotated with the `json` tag: .FieldA.NoTagFieldA (path/to/foo.go:11:2)

Each line has two file locations. The first is the same as the original, which is the location of json.Marshal, and the second is the location where the non-tag fields are defined.

However, repeated structures are not checked or outputted twice, for example, if Foo was changed to:

type Foo struct {
	FieldA NestedA `json:"FieldA"`
	FieldB NestedB `json:"FieldB"`
}

The output will be unchanged, although .FieldB.NoTagFieldB is a non-tag field, the NestedB structure has already been checked in .FieldA.NestedAField, so it will be skipped.

@tmzane tmzane self-requested a review April 7, 2024 16:50
@tmzane
Copy link
Member

tmzane commented Apr 8, 2024

I'm not sure we need this 🤔

Early versions of musttag used to produce detailed reports similar to what you have implemented in this PR. It turned out to be quite noisy and the implementation was complicated. In my experience, most structs passed to Marshal-like functions are small to medium in size, so it's usually obvious which fields need to be annotated. The current report message is short and precise, like the reports produced by other golangci linters. I like it that way.

On the other hand, I don't mind implementing it as an option, e.g.:

musttag:
  settings:
    verbose: true

It would be useful at least in musttag's own tests to check exactly what fields were reported. @22dm would that work for you?

@ldez maybe you have some thoughts?

musttag.go Outdated
@@ -216,21 +213,22 @@ func (c *checker) checkStruct(styp *types.Struct, tag string) (valid bool) {
if !ok {
// tag is not required for embedded types; see issue #12.
if !field.Embedded() {
return false
pass.Reportf(pos, "the given struct should be annotated with the `%s` tag: %s.%s (%s)",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving this here would cause the same call to be reported multiple times. We don't want that.

@ldez
Copy link
Contributor

ldez commented Apr 8, 2024

The original problem was the difficulty of knowing where the Marshall/Decode was.

I think a more detailed report can be useful in some cases but reporting several issues on the same line is a problem because golangci-lint will only keep one of them.

The verbose option is interesting, maybe it can be a mix of previous reports and the new report:

  • by default, it can report the Marshall/Decode.
  • with verbose, it can create one report by field (as before) in addition to the Marshall/Decode report.

@22dm
Copy link
Contributor Author

22dm commented Apr 9, 2024

Hello, I've added the verbose config and now it won't report multiple errors on the same line.

> ./musttag ./...
/path/to/foo.go:21:15: the given struct should be annotated with the `json` tag

> ./musttag -verbose ./...
/path/to/foo.go:21:15: the given struct should be annotated with the `json` tag: .FieldA.NestedAField.NoTagFieldB (/path/to/foo.go:6:2)

@ldez
Copy link
Contributor

ldez commented Apr 10, 2024

What happens if there are multiple missing tags?

If it reports only one field, then multiple runs will be required, it doesn't seem like a good behavior.

@tmzane
Copy link
Member

tmzane commented Apr 10, 2024

What happens if there are multiple missing tags?

If it reports only one field, then multiple runs will be required, it doesn't seem like a good behavior.

Currently, each Marshal call is analyzed independently, so a struct type may be parsed multiple times to check for different tags. This could be prevented by introducing some kind of cache to parse each type only once.

I see two possible implementations of verbose:

  1. Append the list of untagged field names to the default report:
type T struct {
    Foo struct {
        Bar string
        Baz string
    }
}

var t T
json.Marshal(t) // musttag: the fields of the given struct should be annotated with the `json` tag: Foo.Bar, Foo.Baz
yaml.Marshal(t) // musttag: the fields of the given struct should be annotated with the `yaml` tag: Foo.Bar, Foo.Baz
  1. In addition to the default report, report untagged fields separately:
type T struct {
    Foo struct {
        Bar string // musttag: the field should be annotated with the `json`, `yaml` tag(s)
        Baz string // musttag: the field should be annotated with the `json`, `yaml` tag(s)
    }
}

var t T
json.Marshal(t) // musttag: the given struct should be annotated with the `json` tag
yaml.Marshal(t) // musttag: the given struct should be annotated with the `yaml` tag

The second one is tricky: to report a field with multiple tags just once we would have to change the implementation to analyze all Marshal calls first saving untagged fields into something like map[field_location][]missing_tags, then report them all once.

@tmzane tmzane changed the title feat: support output multi fields path and pos feat: implement verbose Apr 10, 2024
@22dm
Copy link
Contributor Author

22dm commented Apr 15, 2024

Hello, I have updated it to support the first implementation.

For the second implementation, it should not be difficult. Currently, we already have map[*types.Var]string as reports. We only need to merge the reports for output after all executions are completed, but it may not be good to output two copies (both marshal func and struct). Maybe we should introduce other settings to switch between these two behaviors.

Regarding "introducing some kind of cache to parse each type only once", I found that this may be more difficult than imagined, especially to obtain a unique identifier of a type.

For example, in the two funcs below, the Foo struct is different, but when .String() is called, we both get main.Foo, in the test file it is easy to observe this.

package main

func f1() {
	type Foo struct {
		...
	}
}

func f2() {
	type Foo struct {
		...
	}
}

In addition, due to the existence of ifaceWhitelist, we must cache each Func.Name separately.

@tmzane tmzane mentioned this pull request Jul 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants