File Upload Feature #1902
Conversation
Force-pushed from 518df39 to de75a87.
I did a very quick first look here.
I can see that currently the API handlers are interacting with `upload` and `dl` to add files to ES. However, I think a cleaner implementation would be to have the ES interactions in `upload`, so that the caller can just call something like `upload.Chunk(uplID, chunkID, r.Body) error`. What do you think?
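A minimal sketch of what that package boundary could look like, assuming the reviewer's proposed `upload.Chunk` signature (the `Uploader`/`ChunkWriter` names, the method receiver, and the added `context.Context` parameter are all assumptions for illustration, not the actual implementation):

```go
package upload

import (
	"context"
	"io"
)

// ChunkWriter is a stand-in for the ES-facing dependency (assumed name),
// so handlers never talk to the Elasticsearch layer directly.
type ChunkWriter interface {
	WriteChunk(ctx context.Context, uplID string, chunkID int, body io.Reader) error
}

// Uploader owns all Elasticsearch interaction for file uploads.
type Uploader struct {
	es ChunkWriter
}

// Chunk streams one chunk of an in-progress upload into Elasticsearch.
// The API handler only needs the upload ID, chunk number, and request body.
func (u *Uploader) Chunk(ctx context.Context, uplID string, chunkID int, body io.Reader) error {
	return u.es.WriteChunk(ctx, uplID, chunkID, body)
}
```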
When retry is disabled (which bubbles up to be equivalent to enabling longpolling for us), the elasticsearch.Client does not internally buffer the request (buffering is what it uses to retry and replay a request). Chunks should not be memory-buffered, and should be retried at the integration level, since a retry-and-resume-friendly API is presented to them.
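For illustration, a minimal sketch of the client-side setting being described, assuming the go-elasticsearch client's `DisableRetry` config field (whether fleet-server wires this exactly as shown is an assumption):

```go
package main

import (
	"log"

	"github.com/elastic/go-elasticsearch/v8"
)

func main() {
	// With retries disabled, the transport does not buffer request bodies
	// for replay, so large chunk bodies can stream straight through.
	es, err := elasticsearch.NewClient(elasticsearch.Config{
		Addresses:    []string{"http://localhost:9200"},
		DisableRetry: true,
	})
	if err != nil {
		log.Fatal(err)
	}
	_ = es // handlers would use this client for chunk writes
}
```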
internal/pkg/api/handleUpload.go (Outdated)

```go
func (ut *UploadT) handleUploadStart(zlog *zerolog.Logger, w http.ResponseWriter, r *http.Request) error { //nolint:unparam // log is standard first arg for the handlers
```
Another way is to do something like this:

```go
func (ut *UploadT) handleUploadStart(_ *zerolog.Logger, w http.ResponseWriter, r *http.Request) error
```
LGTM, assuming the @todos will be addressed before merging.
Left a few comments/nits; please respond or address.
```go
c.mut.RLock()
defer c.mut.RUnlock()

scopedKey := "upload:" + id
```
Nit: maybe use a function that builds the key string, to avoid doing this in two places; it's easier to make a mistake if anything changes.
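A tiny sketch of the suggested helper (the function name here is hypothetical):

```go
// scopedUploadKey builds the cache key for an upload ID in one place, so a
// future format change only has to happen here. (Hypothetical helper name.)
func scopedUploadKey(id string) string {
	return "upload:" + id
}
```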
```go
// now ensure all positions are accounted for, no gaps, etc
sort.Slice(chunks, func(i, j int) bool {
```
Question/nit: could you avoid sorting if the chunks are always selected "ordered by pos" from elasticsearch?
I'm not sure I can guarantee that they are always sorted correctly by ES. The positional information is not in a sortable field; it is just a portion of the document's `_id`. Considering this is a bounded-size slice (limited maximum file size), sorting was not out of the question.
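To make the tradeoff concrete, here is a self-contained sketch of sorting chunks by a position parsed out of the `_id`. The `<uploadID>.<pos>` id layout and all names here are assumptions for illustration:

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

type chunk struct {
	ID  string // assumed layout: "<uploadID>.<pos>"
	Pos int
}

// parsePos extracts the chunk position from the trailing segment of the _id.
func parsePos(id string) (int, error) {
	idx := strings.LastIndexByte(id, '.')
	if idx < 0 {
		return 0, fmt.Errorf("no position in id %q", id)
	}
	return strconv.Atoi(id[idx+1:])
}

func main() {
	chunks := []chunk{{ID: "abc.2"}, {ID: "abc.0"}, {ID: "abc.1"}}
	for i := range chunks {
		pos, err := parsePos(chunks[i].ID)
		if err != nil {
			panic(err)
		}
		chunks[i].Pos = pos
	}
	// Bounded-size slice, so an in-memory sort is cheap.
	sort.Slice(chunks, func(i, j int) bool { return chunks[i].Pos < chunks[j].Pos })
	fmt.Println(chunks) // [{abc.0 0} {abc.1 1} {abc.2 2}]
}
```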
```go
	log.Debug().Int("chunkID", i).Msg("non-final chunk was incorrectly marked last")
	return false
}
if chunk.Size != int(info.ChunkSize) {
```
Should we just define the chunk size as int64 (or uint64) for both structs and avoid the casting? int is int64 anyway; the fleet server is built for 64-bit only at the moment.
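A minimal sketch of that suggestion (struct and field names are assumptions for illustration), so the comparison site needs no conversion:

```go
// Info describes an upload operation; ChunkSize is int64 so callers never
// have to cast. (Struct and field names assumed for this sketch.)
type Info struct {
	ChunkSize int64
}

// Chunk is one uploaded segment; Size matches Info.ChunkSize's type.
type Chunk struct {
	Size int64
}

func sizeMatches(chunk Chunk, info Info) bool {
	return chunk.Size == info.ChunkSize // no int(...) conversion needed
}
```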
@cmacknz @michel-laterman I would like somebody from the agent/fleet-server team to review this as well.
LGTM, other than the missing CHANGELOG and a few TODOs in the code.
rewritten for multiple fleet server
internal/pkg/uploader/upload.go (Outdated)

```go
// Searches for Upload Metadata document in local memory cache if available
// otherwise, fetches from elasticsearch and caches for next use
func (u *Uploader) GetUploadInfo(ctx context.Context, uploadID string) (upload.Info, error) {
```
[Suggestion | Go convention]

```diff
-// Searches for Upload Metadata document in local memory cache if available
-// otherwise, fetches from elasticsearch and caches for next use
+// GetUploadInfo searches for Upload Metadata document in local memory cache if available
+// otherwise, fetches from elasticsearch and caches for next use
 func (u *Uploader) GetUploadInfo(ctx context.Context, uploadID string) (upload.Info, error) {
```
internal/pkg/uploader/upload/info.go (Outdated)

```go
// the only valid values of upload status according to storage spec
type Status string
```
[Suggestion | Go convention]

```diff
-// the only valid values of upload status according to storage spec
+// Status represents the only valid values of upload status according to storage spec
 type Status string
```
Checked out the latest and tested e2e again, LGTM! I'm able to get files from the Endpoint host. This also includes testing with Fleet installation of the relevant indices, which meant running my open PR locally to update the mappings needed by the new fleet server implementation. I'll merge this PR soon.
What is the problem this PR solves?
Adds the ability to upload files from integrations, through fleet server, into Elasticsearch.
Link to Swagger docs for API contract
A followup PR is under way to add this API to the new openapi.yml. Extracting a few heuristics (file size limit, timeout) into fleet server configs will also come as a followup during scale testing.
How does this PR solve the problem?

Adds three HTTP routes:

- /api/fleet/uploads: Begins an "upload operation". Includes the full metadata about a file, with some required fields like name and size.
- /api/fleet/uploads/<uploadID>/<chunkNum>: Uploads a segment (chunk) of the file contents.
- /api/fleet/uploads/<uploadID>: Completes an "upload operation". Fleet server verifies that all contents were uploaded and that the initially-provided checksums match.

How to test this PR locally
These routes are API-key authorized, and also deal with file chunking and checksums. It may be difficult to test the routes in isolation without a nearly-fully-operational client implementation. Both the Endpoint integration and the Agent itself are developing client implementations for this release.

If you disable the API key checking (for development or verification purposes), you may be able to use this gist as a mock file upload client.
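Absent that gist, here is a rough sketch of what such a mock client could look like. The HTTP verbs, header names, JSON body shape, server address, and the placeholder upload ID are all assumptions; the actual contract is in the Swagger docs linked above:

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"net/http"
	"strings"
)

const server = "http://localhost:8220" // assumed fleet-server address

// do sends one request and fails fast on errors; the auth header name and
// scheme are assumptions for this sketch.
func do(method, url string, body []byte) *http.Response {
	req, err := http.NewRequest(method, url, bytes.NewReader(body))
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Authorization", "ApiKey <api-key>")
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	return resp
}

func main() {
	contents := strings.Repeat("x", 10) // stand-in file body

	// 1. Begin the upload operation with file metadata (field names assumed).
	meta := fmt.Sprintf(`{"file":{"name":"test.txt","size":%d}}`, len(contents))
	resp := do(http.MethodPost, server+"/api/fleet/uploads", []byte(meta))
	fmt.Println("start:", resp.Status) // response would carry an upload ID

	uploadID := "<uploadID-from-start-response>" // placeholder, not a real value

	// 2. Send the file contents as chunk 0.
	resp = do(http.MethodPut, server+"/api/fleet/uploads/"+uploadID+"/0", []byte(contents))
	fmt.Println("chunk:", resp.Status)

	// 3. Finalize; fleet-server verifies chunk completeness and checksums.
	resp = do(http.MethodPost, server+"/api/fleet/uploads/"+uploadID, []byte(`{}`))
	fmt.Println("finish:", resp.Status)
}
```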
Checklist

- [ ] CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc

Related issues