Skip to content
This repository has been archived by the owner on Nov 24, 2023. It is now read-only.

*: add Writer for relay log #117

Merged
merged 63 commits into from
May 22, 2019
Merged

Conversation

csuzhangxc
Copy link
Member

@csuzhangxc csuzhangxc commented Apr 18, 2019

What problem does this PR solve?

this PR is a part of #91.

What is changed and how it works?

  • add a FileWriter to support to write binlog events into files
    • write a binlog file header for any new binlog file
    • ignore any fake RotateEvent
    • ignore any duplicate events
    • fill any potential holes in the binlog file with dummy events
  • The FileWriter support to recover any binlog file with uncompleted events/transactions

Check List

Tests

  • Unit test

Code changes

  • Has exported function/method change
  • Has exported variable/fields change
  • Has interface methods change

Side effects

  • Increased code complexity

@csuzhangxc csuzhangxc added priority/important Major change, requires approval from ≥2 primary reviewers status/WIP This PR is still work in progress type/enhancement Performance improvement or refactoring labels Apr 18, 2019
relay/writer/file_test.go Outdated Show resolved Hide resolved
@csuzhangxc
Copy link
Member Author

/run-all-tests

pkg/binlog/filename.go Outdated Show resolved Hide resolved
@csuzhangxc
Copy link
Member Author

/run-all-tests

1 similar comment
@csuzhangxc
Copy link
Member Author

/run-all-tests

@codecov
Copy link

codecov bot commented May 13, 2019

Codecov Report

Merging #117 into master will increase coverage by 1.7257%.
The diff coverage is 82.2429%.

@@               Coverage Diff                @@
##             master       #117        +/-   ##
================================================
+ Coverage   32.4824%   34.2082%   +1.7257%     
================================================
  Files           117        121         +4     
  Lines         13124      13602       +478     
================================================
+ Hits           4263       4653       +390     
- Misses         8378       8420        +42     
- Partials        483        529        +46

// handleDuplicateEventsExist tries to handle a potential duplicate event in the binlog file.
func (w *FileWriter) handleDuplicateEventsExist(ev *replication.BinlogEvent) (*Result, error) {
filename := filepath.Join(w.cfg.RelayDir, w.filename.Get())
duplicate, err := checkIsDuplicateEvent(filename, ev)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it inefficient that call it for every event?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in most cases, checkIsDuplicateEvent only checks start/end pos with the file size (has a comment in checkIsDuplicateEvent), so I think it is not inefficient. and if we want to make it work correctly, we must do this check.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm worrid that os.Stat is an inefficient function

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so, pass file size into the function as an arg?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for os.Stat, it will call syscall.Stat; for File.Stat, it will call syscall.Fstat.

from http://man7.org/linux/man-pages/man2/stat.2.html.

fstat() is identical to stat(), except that the file about which
information is to be retrieved is specified by the file descriptor
fd.

if we changed to pass File.Stat() into the function, the difference of efficiency is small? or should we need to pass an in-memory offset into the function? @GregoryIan what's your opinion.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can choose one

  • check duplicate event only if we need to check, like holeSize < 0
  • or parse a fd or stat information as arg?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I choose the first one.

@IANTHEREAL
Copy link
Collaborator

/run-all-tests

1 similar comment
@amyangfei
Copy link
Contributor

/run-all-tests

@csuzhangxc
Copy link
Member Author

/run-all-tests


// only parse single event
eof, err := replication.NewBinlogParser().ParseSingleEvent(f, onEventFunc)
if err != nil {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if some events after format description event are damaged, but found == true, should we return true?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed to return true even err != nil in 6d54140, because this function only checks if FormatDescriptionEvent exists.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sense

@IANTHEREAL
Copy link
Collaborator

Rest LGTM

Copy link
Collaborator

@IANTHEREAL IANTHEREAL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@amyangfei amyangfei added status/LGT2 Two reviewers already commented LGTM, ready for merge and removed status/PTAL This PR is ready for review. Add this label back after committing new changes labels May 22, 2019
@csuzhangxc csuzhangxc merged commit 7152fa0 into pingcap:master May 22, 2019
@csuzhangxc csuzhangxc deleted the relay-writer-2 branch May 22, 2019 02:34
lichunzhu pushed a commit to lichunzhu/dm that referenced this pull request Apr 6, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
priority/important Major change, requires approval from ≥2 primary reviewers status/LGT2 Two reviewers already commented LGTM, ready for merge type/enhancement Performance improvement or refactoring
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants