Support YAML configs #262

Merged
merged 1 commit into from
Aug 25, 2022
2 changes: 1 addition & 1 deletion go.mod
@@ -18,6 +18,7 @@ require (
github.com/sirupsen/logrus v1.8.1
github.com/stretchr/testify v1.8.0
golang.org/x/sys v0.0.0-20210320140829-1e4c9ba3b0c4
gopkg.in/yaml.v3 v3.0.1
)

require (
@@ -37,5 +38,4 @@ require (
gopkg.in/check.v1 v1.0.0-20200227125254-8fa46927fb4f // indirect
gopkg.in/fsnotify.v1 v1.4.7 // indirect
gopkg.in/tomb.v1 v1.0.0-20141024135613-dd632973f1e7 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
)
155 changes: 93 additions & 62 deletions main.go

Large diffs are not rendered by default.

9 changes: 5 additions & 4 deletions parsers/csv/csv.go
@@ -17,10 +17,11 @@ import (
// Options defines the options relevant to the CSV parser
type Options struct {
Fields string `long:"fields" description:"Comma separated list of CSV fields, in order."`
Contributor

Should we add a yaml:"-" for Fields too?

Contributor Author

Definitely not "-" -- that would prevent the field from ever showing up. The rule I used was to add a YAML declaration only when the field name used in config didn't match the variable name, or when it should be suppressed. In this case, the field is called "Fields" and that's also the config value name, so a YAML tag naming the field would be redundant.

TimeFieldName string `long:"timefield" description:"Name of the field that contains a timestamp"`
TimeFieldFormat string `long:"time_format" description:"Timestamp format to use (strftime and Golang time.Parse supported)"`
NumParsers int `hidden:"true" description:"number of csv parsers to spin up"`
TrimLeadingSpace bool `bool:"trim_leading_space" description:"trim leading whitespace in CSV fields and values" default:"false"`
TimeFieldName string `long:"timefield" description:"Name of the field that contains a timestamp" yaml:"timefield,omitempty"`
TimeFieldFormat string `long:"time_format" description:"Timestamp format to use (strftime and Golang time.Parse supported)" yaml:"time_format,omitempty"`
TrimLeadingSpace bool `long:"trim_leading_space" description:"trim leading whitespace in CSV fields and values" yaml:"trim_leading_space,omitempty"`

NumParsers int `hidden:"true" description:"number of csv parsers to spin up" yaml:"-"`
}

// Parser implements the Parser interface
6 changes: 3 additions & 3 deletions parsers/htjson/htjson.go
@@ -15,10 +15,10 @@ import (
)

type Options struct {
TimeFieldName string `long:"timefield" description:"Name of the field that contains a timestamp"`
TimeFieldFormat string `long:"format" description:"Format of the timestamp found in timefield (supports strftime and Golang time formats)"`
TimeFieldName string `long:"timefield" description:"Name of the field that contains a timestamp" yaml:"timefield,omitempty"`
TimeFieldFormat string `long:"format" description:"Format of the timestamp found in timefield (supports strftime and Golang time formats)" yaml:"format,omitempty"`

NumParsers int `hidden:"true" description:"number of htjson parsers to spin up"`
NumParsers int `hidden:"true" description:"number of htjson parsers to spin up" yaml:"-"`
}

type Parser struct {
12 changes: 6 additions & 6 deletions parsers/keyval/keyval.go
@@ -7,21 +7,21 @@ import (
"strings"
"sync"

"github.com/sirupsen/logrus"
"github.com/kr/logfmt"
"github.com/sirupsen/logrus"

"github.com/honeycombio/honeytail/event"
"github.com/honeycombio/honeytail/httime"
"github.com/honeycombio/honeytail/parsers"
)

type Options struct {
TimeFieldName string `long:"timefield" description:"Name of the field that contains a timestamp"`
TimeFieldFormat string `long:"format" description:"Format of the timestamp found in timefield (supports strftime and Golang time formats)"`
FilterRegex string `long:"filter_regex" description:"a regular expression that will filter the input stream and only parse lines that match"`
InvertFilter bool `long:"invert_filter" description:"change the filter_regex to only process lines that do *not* match"`
TimeFieldName string `long:"timefield" description:"Name of the field that contains a timestamp" yaml:"timefield,omitempty"`
TimeFieldFormat string `long:"format" description:"Format of the timestamp found in timefield (supports strftime and Golang time formats)" yaml:"format,omitempty"`
FilterRegex string `long:"filter_regex" description:"a regular expression that will filter the input stream and only parse lines that match" yaml:"filter_regex,omitempty"`
InvertFilter bool `long:"invert_filter" description:"change the filter_regex to only process lines that do *not* match" yaml:"invert_filter,omitempty"`

NumParsers int `hidden:"true" description:"number of keyval parsers to spin up"`
NumParsers int `hidden:"true" description:"number of keyval parsers to spin up" yaml:"-"`
}

type Parser struct {
6 changes: 3 additions & 3 deletions parsers/mongodb/mongodb.go
@@ -7,9 +7,9 @@ import (
"sync"
"time"

"github.com/sirupsen/logrus"
"github.com/honeycombio/mongodbtools/logparser"
queryshape "github.com/honeycombio/mongodbtools/queryshape"
"github.com/sirupsen/logrus"

"github.com/honeycombio/honeytail/event"
"github.com/honeycombio/honeytail/httime"
@@ -48,9 +48,9 @@ var timestampFormats = []string{
}

type Options struct {
LogPartials bool `long:"log_partials" description:"Send what was successfully parsed from a line (only if the error occured in the log line's message)."`
LogPartials bool `long:"log_partials" description:"Send what was successfully parsed from a line (only if the error occured in the log line's message)." yaml:"log_partials,omitempty"`

NumParsers int `hidden:"true" description:"number of mongo parsers to spin up"`
NumParsers int `hidden:"true" description:"number of mongo parsers to spin up" yaml:"-"`
}

type Parser struct {
12 changes: 8 additions & 4 deletions parsers/mysql/mysql.go
@@ -11,9 +11,9 @@ import (
"sync"
"time"

"github.com/sirupsen/logrus"
_ "github.com/go-sql-driver/mysql"
"github.com/honeycombio/mysqltools/query/normalizer"
"github.com/sirupsen/logrus"

"github.com/honeycombio/honeytail/event"
"github.com/honeycombio/honeytail/httime"
@@ -132,9 +132,9 @@ type Options struct {
Host string `long:"host" description:"MySQL host in the format (address:port)"`
User string `long:"user" description:"MySQL username"`
Pass string `long:"pass" description:"MySQL password"`
QueryInterval uint `long:"interval" description:"interval for querying the MySQL DB in seconds" default:"30"`
QueryInterval uint `long:"interval" description:"interval for querying the MySQL DB in seconds"`

NumParsers int `hidden:"true" description:"number of MySQL parsers to spin up"`
NumParsers int `hidden:"true" description:"number of MySQL parsers to spin up" yaml:"-"`
}

type Parser struct {
@@ -178,9 +178,13 @@ func (p *Parser) Init(options interface{}) error {
p.role = role

// update hostedOn and readOnly every <n> seconds
queryInterval := p.conf.QueryInterval
if queryInterval == 0 {
queryInterval = 30
}
go func() {
defer db.Close()
ticker := time.NewTicker(time.Second * time.Duration(p.conf.QueryInterval))
ticker := time.NewTicker(time.Second * time.Duration(queryInterval))
for _ = range ticker.C {
readOnly, err := getReadOnly(db)
if err != nil {
8 changes: 4 additions & 4 deletions parsers/regex/regex.go
@@ -23,10 +23,10 @@ type Options struct {
// Note: `LineRegex` and `line_regex` are named as singular so that
// it's less confusing to users to input them.
// Might be worth making this consistent across the entire repo
LineRegex []string `long:"line_regex" description:"Regular expression with named capture groups representing the fields you want parsed (RE2 syntax). You can enter multiple regexes to match (--regex.line_regex=\"(?P<foo>re)\" --regex.line_regex=\"(?P<bar>...)\"). Parses using the first regex to match a line, so list them in most-to-least-specific order."`
TimeFieldName string `long:"timefield" description:"Name of the field that contains a timestamp"`
TimeFieldFormat string `long:"time_format" description:"Timestamp format to use (strftime and Golang time.Parse supported)"`
NumParsers int `hidden:"true" description:"number of regex parsers to spin up"`
LineRegex []string `long:"line_regex" description:"Regular expression with named capture groups representing the fields you want parsed (RE2 syntax). You can enter multiple regexes to match (--regex.line_regex=\"(?P<foo>re)\" --regex.line_regex=\"(?P<bar>...)\"). Parses using the first regex to match a line, so list them in most-to-least-specific order." yaml:"line_regex,omitempty"`
TimeFieldName string `long:"timefield" description:"Name of the field that contains a timestamp" yaml:"timefield,omitempty"`
TimeFieldFormat string `long:"time_format" description:"Timestamp format to use (strftime and Golang time.Parse supported)" yaml:"time_format,omitempty"`
NumParsers int `hidden:"true" description:"number of regex parsers to spin up" yaml:"-"`
}

type Parser struct {
27 changes: 15 additions & 12 deletions tail/tail.go
@@ -33,11 +33,11 @@ const (
)

type TailOptions struct {
ReadFrom string `long:"read_from" description:"Location in the file from which to start reading. Values: beginning, end, last. Last picks up where it left off, if the file has not been rotated, otherwise beginning. When --backfill is set, it will override this option=beginning" default:"last"`
Stop bool `long:"stop" description:"Stop reading the file after reaching the end rather than continuing to tail. When --backfill is set, it will override this option=true"`
Poll bool `long:"poll" description:"use poll instead of inotify to tail files"`
StateFile string `long:"statefile" description:"File in which to store the last read position. Defaults to a file in /tmp named $logfile.leash.state. If tailing multiple files, default is forced."`
HashStateFileDirPaths bool `long:"hash_statefile_paths" description:"Generates a hash of the directory path for each file that is used to uniquely identify each statefile. Prevents re-using the same statefile for tailed files that have the same name."`
ReadFrom string `long:"read_from" description:"Location in the file from which to start reading. Values: beginning, end, last. Last picks up where it left off, if the file has not been rotated, otherwise beginning. When --backfill is set, it will override this option to beginning" yaml:"read_from,omitempty"`
Stop bool `long:"stop" description:"Stop reading the file after reaching the end rather than continuing to tail. When --backfill is set, it will override this option=true" yaml:"stop,omitempty"`
Poll bool `long:"poll" description:"use poll instead of inotify to tail files" yaml:"poll,omitempty"`
StateFile string `long:"statefile" description:"File in which to store the last read position. Defaults to a file in /tmp named $logfile.leash.state. If tailing multiple files, default is forced." yaml:"statefile,omitempty"`
HashStateFileDirPaths bool `long:"hash_statefile_paths" description:"Generates a hash of the directory path for each file that is used to uniquely identify each statefile. Prevents re-using the same statefile for tailed files that have the same name." yaml:"hash_statefile_paths,omitempty"`
}

// Statefile mechanics when ReadFrom is 'last'
@@ -349,7 +349,11 @@ func getTailer(conf Config, file string, stateFile string) (*tail.Tail, error) {
// tail a real file
var loc *tail.SeekInfo // 0 value means start at beginning
var reOpen, follow bool = true, true
switch conf.Options.ReadFrom {
var readFrom = conf.Options.ReadFrom
if readFrom == "" {
readFrom = "last"
}
switch readFrom {
case "start", "beginning":
// 0 value for tail.SeekInfo means start at beginning
case "end":
@@ -360,8 +364,7 @@ func getTailer(conf Config, file string, stateFile string) (*tail.Tail, error) {
case "last":
loc = getStartLocation(stateFile, file)
default:
errMsg := fmt.Sprintf("unknown option to --read_from: %s",
conf.Options.ReadFrom)
errMsg := fmt.Sprintf("unknown option to --read_from: %s", readFrom)
return nil, errors.New(errMsg)
}
if conf.Options.Stop {
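
The `read_from` handling in these two hunks can be read as "default first, then validate". The sketch below isolates that step so it can be exercised without a real file; `resolveReadFrom` is a hypothetical helper, and the accepted values are only the ones visible in the diff (`start`, `beginning`, `end`, `last`), so any case hidden in the collapsed lines is not covered.

```go
package main

import (
	"fmt"
)

// resolveReadFrom applies the "last" fallback for an unset value and rejects
// anything outside the values visible in the diff's switch statement.
func resolveReadFrom(readFrom string) (string, error) {
	if readFrom == "" {
		readFrom = "last"
	}
	switch readFrom {
	case "start", "beginning", "end", "last":
		return readFrom, nil
	default:
		return "", fmt.Errorf("unknown option to --read_from: %s", readFrom)
	}
}

func main() {
	for _, v := range []string{"", "end", "middle"} {
		got, err := resolveReadFrom(v)
		fmt.Printf("%q -> %q, err=%v\n", v, got, err)
	}
}
```

Reporting the resolved value in the error, as the diff does with `readFrom`, keeps the message accurate even when the fallback has been applied.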
@@ -391,10 +394,10 @@ func getTailer(conf Config, file string, stateFile string) (*tail.Tail, error) {
// It might describe an existing file, an existing directory, or a new path.
//
// If tailing a single logfile, we will use the specified --tail.statefile:
// - if it points to an existing file, that statefile will be used directly
// - if it points to a new path, that path will be written to directly
// - if it points to an existing directory, the statefile will be placed inside
// the directory (and the statefile's name will be derived from the logfile).
// - if it points to an existing file, that statefile will be used directly
// - if it points to a new path, that path will be written to directly
// - if it points to an existing directory, the statefile will be placed inside
// the directory (and the statefile's name will be derived from the logfile).
//
// If honeytail is asked to tail multiple files, we will only respect the
// third case, where --tail.statefile describes an existing directory.