MS SQL DB metricset #8250

Closed
sayden wants to merge 5 commits

sayden
Contributor

@sayden sayden commented Sep 5, 2018

I've started with this particular query https://docs.microsoft.com/en-us/sql/relational-databases/system-dynamic-management-views/sys-dm-db-file-space-usage-transact-sql?view=sql-server-2017 from the Database metricset.

I still have to build a local environment to test the entire module. I'll keep pushing progress and request approval as each small metricset is finished.

@sayden sayden added the in progress, module, Metricbeat, needs tests and needs_docs labels Sep 5, 2018
@sayden sayden requested a review from jsoriano September 5, 2018 21:17
@sayden sayden changed the title from "Atomic commit of the current work in progress MS SQL metricset" to "MS SQL metricset" Sep 6, 2018
Member

@jsoriano jsoriano left a comment

I have added some comments.

Also, as this seems like a potentially big module, I suggest you start by creating just one metricset and then continue adding metricsets in follow-up PRs; this will help get this merged.


func (m *MetricSet) loadFileSpaceUsage(db *sql.DB) (common.MapStr, error) {
// Returns the global status, also for versions previous 5.0.2
rows, err := db.Query("SELECT * FROM sys.dm_db_file_space_usage;")
Member

You can probably reuse this query between all metricsets, move it to a common place at the module level.

Member

This could probably be a method of a common MetricSet type.

Contributor Author

Moved to a different function to perform queries concurrently; see fb7cc98#diff-333a1912b4f61b216d6a3fd05fd647e7R123 in the latest commit.

"drive_name": c.Str("DriveName"),
"pdw_node_id": c.Int("pdw_node_id"),
},
//Returns a row for each pending I/O request in SQL Server.
Member

If there is a different row for each request, then it'd be better to move them to their own metricset, and generate an event for each one of them.

Here you can keep a summary in any case, with the count of pending requests, total pending ticks...
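
The split suggested above (one event per pending request, plus a summary event) could be sketched roughly as follows. This is only an illustration: the `event` map stands in for Metricbeat's common.MapStr, and `pendingRequest` with its fields is a hypothetical row shape, not the real DMV columns.

```go
package main

import "fmt"

// event stands in for Metricbeat's common.MapStr (assumption for this sketch).
type event map[string]interface{}

// pendingRequest is a hypothetical row shape for one pending I/O request.
type pendingRequest struct {
	Handle       string
	PendingTicks int64
}

// perRequestEvents builds one event per row, as a dedicated metricset would.
func perRequestEvents(rows []pendingRequest) []event {
	events := make([]event, 0, len(rows))
	for _, r := range rows {
		events = append(events, event{
			"handle":        r.Handle,
			"pending_ticks": r.PendingTicks,
		})
	}
	return events
}

// summary aggregates all rows into a single event with a count and a total.
func summary(rows []pendingRequest) event {
	var total int64
	for _, r := range rows {
		total += r.PendingTicks
	}
	return event{"count": len(rows), "total_pending_ticks": total}
}

func main() {
	rows := []pendingRequest{{"0x01", 12}, {"0x02", 30}}
	fmt.Println(len(perRequestEvents(rows)), summary(rows))
}
```

In a real metricset the per-request events would be sent through the reporter rather than returned as a slice.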

Contributor Author

I haven't done this yet on the current commit. I need to investigate a bit more how it works :) but I think I understand the overall idea

// fn_virtualfilestats function.
"virtual_file_stats": s.Object{
"database_name": c.Str("database_name"),
"database_id": c.Int("database_id"),
Member

Fields that are going to exist in different metricsets should have the same name; this helps with correlation. These database fields are good candidates for that: after applying the schema you can move them to the module level.

Look at some examples that use the ModuleFields attribute.

Contributor Author

"after applying the schema you can move them to the module level"

I don't really understand what you mean here. Can I apply and then modify the schema to move / delete most occurrences of the same field to a parent s.Object?

Member

Oh, yes, I mean that after applying the schema you can still modify the resulting object, for example deleting some fields after copying them to ModuleFields.
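
A rough sketch of that idea, using plain maps so it runs on its own. The `mapStr` type stands in for common.MapStr and `promote` is a hypothetical helper; in the module, the returned map is what would be assigned to the ModuleFields attribute mentioned above.

```go
package main

import "fmt"

// mapStr stands in for Metricbeat's common.MapStr (assumption for this sketch).
type mapStr map[string]interface{}

// promote moves the given keys out of the metricset-level fields and returns
// them as a separate map intended for the module level. Keys that are not
// present are skipped.
func promote(fields mapStr, keys ...string) mapStr {
	moduleFields := mapStr{}
	for _, k := range keys {
		if v, ok := fields[k]; ok {
			moduleFields[k] = v
			delete(fields, k) // avoid reporting the same value twice
		}
	}
	return moduleFields
}

func main() {
	// Pretend this is the object produced by applying the schema.
	fields := mapStr{"database_name": "master", "database_id": 1, "io_stall_ms": 42}
	moduleFields := promote(fields, "database_name", "database_id")
	fmt.Println(moduleFields, fields)
}
```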

// fn_virtualfilestats function.
"virtual_file_stats": s.Object{
"database_name": c.Str("database_name"),
"database_id": c.Int("database_id"),
Member

You could structure these fields in an object:

"database": s.Object{
  "name": c.Str("database_name"),
  "id":   c.Int("database_id"),
}

Contributor Author

Because there is now only a single group of metrics in this package, I haven't done it, to keep the tree simple.

Member

Well, even with a single metricset, I think this would be good to do in any case.

Contributor Author

Done in e151780d30f4db3cca88511767a134f295976607

metricbeat/module/mssql/io/data.go (outdated, resolved)

mb.BaseMetricSet
counter int
db *sql.DB
}
Member

The MetricSet object is probably going to be reusable between metricsets of this module. Take a look, for example, at the mongodb module, where there is a common metricset defined at the module level.

}
},
"type":"metricsets"
}
Member

Take a look at the TestData functions in the integration tests of other modules to see how this file is generated.

return &MetricSet{Db: db, BaseMetricSet: base}, nil
}

func NewModule(base mb.BaseModule) (mb.Module, error) {

exported function NewModule should have comment or be unexported

Port int `config:"port" validate:"nonzero,required"`
}

func NewMetricSet(base mb.BaseMetricSet) (*MetricSet, error) {

exported function NewMetricSet should have comment or be unexported

Db *sql.DB
}

type Config struct {

exported type Config should have comment or be unexported

}
}

type MetricSet struct {

exported type MetricSet should have comment or be unexported

import (
"database/sql"
"fmt"
_ "github.com/denisenkom/go-mssqldb"

a blank import should be only in a main or test package, or have a comment justifying it
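
One way to satisfy this lint rule is to put a comment on the blank import saying which side effect it is there for. In this PR the side effect is presumably registering the SQL Server driver with database/sql; the sketch below uses the standard library's net/http/pprof instead (an assumption made only so the example runs without third-party dependencies), which registers HTTP handlers in its init function in a similar way. `registeredPattern` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/http"

	// Imported for its side effect only: the package's init function
	// registers the /debug/pprof/ handlers on http.DefaultServeMux.
	_ "net/http/pprof"
)

// registeredPattern reports which pattern on the default mux, if any,
// would serve the given path. It returns "" when nothing matches.
func registeredPattern(path string) string {
	req, err := http.NewRequest(http.MethodGet, path, nil)
	if err != nil {
		return ""
	}
	_, pattern := http.DefaultServeMux.Handler(req)
	return pattern
}

func main() {
	fmt.Println("pprof registered:", registeredPattern("/debug/pprof/") != "")
}
```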

Member

@jsoriano jsoriano left a comment

Added some more comments. I think it'd be better if you start by creating just a simple metricset and then continue expanding this metricset and adding new ones in follow-up PRs.

if c.error == nil {
c.error = err
} else {
c.error = errors.Wrap(c.error, err.Error())
Member

I think you are looking for multierror here, not errors.Wrap 🙂

// Fetch methods implements the data gathering and data conversion to the right
// format. It publishes the event which is then forwarded to the output. In case
// of an error set the Error field of mb.Event or simply call report.Error().
func (m *MetricSet) Fetch() ([]common.MapStr, error) {
Member

A metricset can implement one of several interfaces, depending on the signature of its Fetch method. This one, which returns the events, is the "old" one; we currently prefer the ReportingMetricSetV2 interface, where the Fetch method receives a ReporterV2 object that can be used to send any number of events and errors.

Take a look at its usage in other modules; it'd be nice to use it in new modules like this one.


defer func() {
if closeErr := rows.Close(); closeErr != nil {
//TODO Log error? Ignore it?
Member

I think just defer rows.Close() is enough.

c.maprs = append(c.maprs, maprSlice...)
}

func doQuery(db *sql.DB, query string) ([]common.MapStr, error) {
Member

If you want this method to be reused between metricsets put it at the module level.

return mapR, nil
}

func rowsToMapR(rows *sql.Rows) (common.MapStr, error) {
Member

We usually call this method eventMapping in other modules 🙂

Contributor Author

Done b905b46a3a44451bcc1aebe18455431b1667939c

// Fetch methods implements the data gathering and data conversion to the right
// format. It publishes the event which is then forwarded to the output. In case
// of an error set the Error field of mb.Event or simply call report.Error().
func (m *MetricSet) Fetch(report mb.ReporterV2) (common.MapStr, error) {
Member

Oh, you are already using ReporterV2 here, but fetch here shouldn't return anything, use report.Event() to send an event and report.Error() to send an error.

Contributor Author

Done here 9c900938162dc77e08b70b1bd8b080b26c7bcd7e, although I'm a bit confused now because I'm not sure how to create integration tests.

Contributor Author

Integration test done here 069cfb5cda5baf8574bf0941241014089cd74db8

return nil, err
}

//TODO Db must be gracefully closed
Member

Not sure if this is needed here; you probably want to connect and disconnect on Fetch.
But if you want to start something in New, you can also implement the Closer interface in the MetricSet to do the cleanup when it is stopped.

Contributor Author

I was actually wondering what the lifetime of a metricset and a module is.

Member

I have seen that you have added a Close method, but as I said, you probably want to connect and disconnect on Fetch. The main problem with leaving the connection open for the whole life of the metricset is that you need to handle reconnections if the connection is closed by the server or for any other external reason. This can add unnecessary complexity.
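
A minimal sketch of the connect-per-Fetch pattern, under the assumption of a hypothetical `conn` interface and `open` function (in the module this would wrap the real database handle). Because every call opens and closes its own connection, there is no long-lived state to reconnect.

```go
package main

import (
	"fmt"
	"io"
)

// conn is a hypothetical minimal connection used only for this sketch.
type conn interface {
	io.Closer
	Query(q string) ([]string, error)
}

// metricSet keeps only the means to connect, not an open connection.
type metricSet struct {
	open func() (conn, error) // hypothetical opener
}

// fetch connects, runs one query and always disconnects before returning.
func (m *metricSet) fetch(q string) ([]string, error) {
	c, err := m.open()
	if err != nil {
		return nil, fmt.Errorf("connecting: %w", err)
	}
	defer c.Close()
	return c.Query(q)
}

// fakeConn counts how many times Close was called.
type fakeConn struct{ closed *int }

func (f fakeConn) Close() error { *f.closed++; return nil }

func (f fakeConn) Query(q string) ([]string, error) { return []string{q}, nil }

func main() {
	closed := 0
	m := &metricSet{open: func() (conn, error) { return fakeConn{&closed}, nil }}
	for i := 0; i < 2; i++ {
		rows, err := m.fetch("SELECT 1")
		fmt.Println(rows, err)
	}
	fmt.Println("connections closed:", closed)
}
```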

"dm_db_persisted_sku_features", //TODO Returns nothing with empty db
"dm_db_session_space_usage",
"dm_db_task_space_usage",
"dm_db_uncontained_entities", //TODO Returns nothing using the driver. Works with sqlcmd
Member

Are all these queries needed for a simple metricset? :)

"sync"
)

func NewFetcher(db *sql.DB, qs []string, schema *s.Schema) *fetcher {

exported function NewFetcher should have comment or be unexported
exported func NewFetcher returns unexported type *mssql.fetcher, which can be annoying to use
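
Both lint messages can be addressed either by unexporting the constructor or by exporting and documenting the type. The sketch below shows the second option; the `Fetcher` fields and the `Queries` method are hypothetical and much simpler than the fetcher in this PR.

```go
package main

import "fmt"

// Fetcher runs a fixed set of queries. Exporting the type lets callers in
// other packages declare variables and fields of this type.
type Fetcher struct {
	queries []string
}

// NewFetcher returns a Fetcher for the given queries.
func NewFetcher(queries []string) *Fetcher {
	return &Fetcher{queries: queries}
}

// Queries returns a copy of the configured queries.
func (f *Fetcher) Queries() []string {
	return append([]string(nil), f.queries...)
}

func main() {
	f := NewFetcher([]string{"SELECT 1"})
	fmt.Println(f.Queries())
}
```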

package db

import (
_ "github.com/denisenkom/go-mssqldb"

a blank import should be only in a main or test package, or have a comment justifying it

@sayden
Contributor Author

sayden commented Sep 7, 2018

OK, the metricset is much simpler now, and the io.Closer implementation closes the db connection.

Most of the logic is now in package mssql, in the fetcher.go file, which will help every metricset run its queries against the database concurrently.

return sql.Open("sqlserver", u.String())
}

func NewModule(base mb.BaseModule) (mb.Module, error) {

exported function NewModule should have comment or be unexported

return &MetricSet{BaseMetricSet: base}, nil
}

func NewDB(config *Config) (*sql.DB, error) {

exported function NewDB should have comment or be unexported

mb.BaseMetricSet
}

type Config struct {

exported type Config should have comment or be unexported

@sayden sayden changed the title from "MS SQL metricset" to "MS SQL DB metricset" Sep 10, 2018
@sayden sayden closed this Sep 11, 2018
@sayden sayden deleted the mssql-metricset branch September 11, 2018 20:51
@sayden sayden restored the mssql-metricset branch September 12, 2018 07:37
@sayden sayden reopened this Sep 12, 2018
Experimental `db` metricset
Move methods to call SQL concurrently to the mssql package so it's available to all metricsets
Implement io.Close to do cleanup on the db metricset
Rename function rowsToMapR to eventsMapping
Promote database_id to the parent level within the schema
DB connection is created and closed on each Fetch call.
Minor improvements on pointer receiver namings on fetcher
Updated data.json when launching go tests with -data
Added comment to NewDB function. Make private newModule function as it is only used within mssql package but not from any metricset
db_integration_test.go rewritten to work with ReporterV2
Removed unused variable and fix typo
Remove metricset io from this PR and added license headers
Moved files to X-Pack folder
Delete references to mssql

Squash into a single commit after moved files to x-pack folder

"sync"
)

func NewFetcher(config *Config, qs []string, schema *s.Schema) (*fetcher, error) {

exported function NewFetcher should have comment or be unexported
exported func NewFetcher returns unexported type *mssql.fetcher, which can be annoying to use

package server

import (
_ "github.com/denisenkom/go-mssqldb"

a blank import should be only in a main or test package, or have a comment justifying it

"github.com/pkg/errors"
)

func NewFetcher(config *Config, qs []string, schema *s.Schema) (*fetcher, error) {

exported function NewFetcher should have comment or be unexported
exported func NewFetcher returns unexported type *mssql.fetcher, which can be annoying to use

@sayden sayden mentioned this pull request Oct 23, 2018
33 tasks
@andrewkroh
Member

andrewkroh commented Oct 31, 2018

@sayden I started working on adding magefile.go for building x-pack/metricbeat. You can view that work at https://github.com/andrewkroh/beats/commits/sayden-mssql-metricset. I had to add some dependencies that were missing to vendor/.

With my changes you can do a few things:

  • mage update - Generates:
    • fields.yml
    • fields.go for each x-pack module
    • dashboard and index patterns to build/kibana
    • metricbeat.yml
    • metricbeat.reference.yml
    • modules.d/ dir
    • include/list.go
  • SNAPSHOT=true mage package - build snapshot packages

I'm working on the integration testing piece, but I can run them with a bit of manual work.

mkdir build
GOOS=linux GOARCH=amd64 mage -f -compile build/mage-linux-amd64
docker-compose run -e TEST_COVERAGE=true -e RACE_DETECTOR=true -e MAGEFILE_VERBOSE=1 beat /go/src/github.com/elastic/beats/x-pack/metricbeat/build/mage-linux-amd64 goTestIntegration

@ruflin
Member

ruflin commented Oct 31, 2018

@andrewkroh @sayden Perhaps you can collaborate on #8829 ?

@andrewkroh
Member

andrewkroh commented Oct 31, 2018

@sayden @ruflin I saw #8829 right after I pushed my commit when I started checking emails.

My end goal is to replace most of the shared build logic that exists in Makefiles and python scripts with Go (I don't care so much about porting to Go things that are not shared/reused across Beats like create-metricset).

Rather than doing it all at once my approach has been to avoid changing any Makefiles (or other Beats) to the extent that this is possible, re-create any scripts I need with Go, and execute the build with Mage (we started with x-pack/filebeat). Once we prove this is working by being able to build/test/package from x-pack/Metricbeat with mage then we can start expanding to the next Beat that is adding content with x-pack/ which is Auditbeat.

Then I think we'll be in a good position to start updating the other Beats' magefiles to remove the dependency on Make. In the end we should be able to fully build and test from the magefile, even on Windows.

To that end, I think some of the most important pieces to add to be complete for x-pack/Metricbeat are:

  • collect the module docs and generate an asciidoc index that includes each module's docs (e.g. mage docs). I skipped this for Filebeat, but it needs the same thing.
  • add a mage target for code formatting and adding license headers
  • standardize the testing targets: unitTest, goUnitTest, pythonUnitTest, integTest, goIntegTest, pythonIntegTest (see "Add goTestUnit and goTestIntegration to magefile", #7766 (comment))
  • Be able to execute the integ tests within Docker when required (I started on this)

Perhaps we can join a call tomorrow or Friday and discuss how we can collaborate to get your MSSQL module building, testing, and packaged as soon as possible (I know you've been waiting for a while).

@sayden
Contributor Author

sayden commented Nov 1, 2018

@andrewkroh I think the approach you describe is the best for everyone, especially to remove some of the black magic and silent errors we currently have in Python / Makefiles.

I have already made the changes needed for formatting and license headers in the Makefile / Python, so I can start helping you with this in Mage. I'm currently very close to finishing the make update command, which could be helpful to start building more x-pack modules ASAP (at least as a development environment).

I was looking at the fields mage command just now, so I think it's a good moment to start playing with Mage too. As you can see in #8829, I have reached a good understanding of what some of our current scripts do.

@ruflin ruflin added the Team:Integrations label Nov 21, 2018
@ruflin ruflin added and removed the Team:Integrations label Nov 27, 2018
@sayden
Contributor Author

sayden commented Dec 5, 2018

Continuing here #9202

@sayden sayden closed this Dec 5, 2018