This repository has been archived by the owner on Oct 22, 2021. It is now read-only.

storage: Fix read all content at once when readdir #20

Merged
merged 10 commits into beyondstorage:master on Aug 12, 2021

Conversation

bokket
Member

@bokket bokket commented Aug 8, 2021

Signed-off-by: bokket [email protected]
fix: #16

readdir.go Outdated
func (s *Storage) listDirNext(ctx context.Context, page *types.ObjectPage) (err error) {
input := page.Status.(*listDirInput)

input.fileList, err = s.hdfs.ReadDir(input.rp)
Contributor

@Xuanwo Xuanwo Aug 9, 2021


s.hdfs.ReadDir(input.rp) always reads all content at once; see the code here: https://github.com/colinmarc/hdfs/blob/master/readdir.go

func (c *Client) ReadDir(dirname string) ([]os.FileInfo, error) {
	f, err := c.Open(dirname)
	if err != nil {
		return nil, err
	}

	return f.Readdir(0)
}

Instead, we need to open the dir and read its entries in batches via a count.

And we don't need to store fileList/counter in listDirInput.

Member Author

@bokket bokket Aug 9, 2021


So the count is up to us? For example, const defaultListObjectCount = 100.
Otherwise we would still need to ReadDir to get the slice and compute the count.

Contributor


What we need is limit instead of count, there is no need for list to calculate the count.

We only need to f.Readdir(limit) and return IteratorDone if we meet io.EOF.

Ref: https://github.com/colinmarc/hdfs/blob/master/file_reader.go#L255-L257

Copy link
Member Author


Yes, this is slightly confusing; I was thinking of using a constant to represent the upper limit for each read. Maybe there is something wrong with my description.

readdir.go Outdated (resolved)
readdir.go Outdated
func (s *Storage) listDirNext(ctx context.Context, page *types.ObjectPage) (err error) {
input := page.Status.(*listDirInput)

dir, err := s.hdfs.Open(input.rp)
Contributor


Maybe we need to store dir in page.Status?

  • If we open dir every time in listDirNext, we can only list first 100 files.
  • We can close dir before returning IterateDone

Member Author

@bokket bokket Aug 12, 2021


  • If we open dir every time in listDirNext, we can only list first 100 files.

Maybe we could increase the limit each time until we reach EOF? But that would re-read the files already returned.
So perhaps we could slice off the first 100 entries on each call to skip the files we have already listed. Do you think that would be OK?

readdir.go Outdated
}()

fileList, err := dir.Readdir(defaultListObjectLimit)
fileList, err := input.dir.Readdir(defaultListObjectLimit)

if err != nil && err == io.EOF {
Contributor


Maybe we should use errors.Is to assert the error's type?

readdir.go Outdated
"io"

"github.com/beyondstorage/go-storage/v4/types"

Member


remove the blank line

storage.go Outdated
@@ -143,6 +126,11 @@ func (s *Storage) move(ctx context.Context, src string, dst string, opt pairStor
func (s *Storage) read(ctx context.Context, path string, w io.Writer, opt pairStorageRead) (n int64, err error) {
rp := s.getAbsPath(path)
f, err := s.hdfs.Open(rp)

defer func() {
err = f.Close()
Member

@Prnyself Prnyself Aug 12, 2021


We should add a nil check on err before assigning f.Close() to it.

Otherwise the real err may be overwritten by the error from f.Close().


@JinnyYi JinnyYi merged commit cefc575 into beyondstorage:master Aug 12, 2021
Development

Successfully merging this pull request may close these issues.

ReadDir by offset instead read all content at once
4 participants