Fix pserver checkpoint #5102

Merged
merged 2 commits into from
Oct 26, 2017

helinwang
Contributor

The pserver checkpoint previously failed because the MD5 checksum was
calculated incorrectly. The checksum is now CRC32.

var cptr (*C.uchar)
if len(c) > 0 {
cptr = (*C.uchar)(&c[0])
}
Contributor

Should the else branch log an error?

Contributor Author

Thanks! Done.
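
For context on the guard in the hunk above: indexing c[0] on an empty slice panics, so the pointer can only be taken when len(c) > 0. A minimal sketch of the same pattern in plain Go, without cgo; the helper name firstElemPtr is hypothetical and not part of this PR:

```go
package main

import "fmt"

// firstElemPtr (hypothetical helper) returns a pointer to the first byte
// of c, or nil when c is empty. Evaluating &c[0] on an empty slice would
// panic with an index-out-of-range error, hence the length check.
func firstElemPtr(c []byte) *byte {
	if len(c) == 0 {
		return nil
	}
	return &c[0]
}

func main() {
	fmt.Println(firstElemPtr(nil) == nil) // true

	buf := []byte{42}
	fmt.Println(*firstElemPtr(buf)) // 42
}
```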

h := md5.New()
md5 := hex.EncodeToString(h.Sum(content))
if md5 != cpMeta.MD5 {
crc32 := crc32.ChecksumIEEE(content)
Contributor

Curious: why would MD5 cause the error?

Contributor

The same question...

Contributor Author

@helinwang helinwang Oct 26, 2017

@typhoonzero @Yancey1989 @dzhwinter

I think the problem was that we used the md5 package incorrectly: it generated a very long string, which caused the etcd write to fail with a "message too large" error. To reproduce the error:

package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

func main() {
	h := md5.New()
	md5 := hex.EncodeToString(h.Sum([]byte("hello this is some string")))
	// Output: 68656c6c6f207468697320697320736f6d6520737472696e67d41d8cd98f00b204e9800998ecf8427e
	fmt.Println(md5)
}

The output is not a typical 32-character MD5 string but a much longer one: the hex encoding of the input itself, followed by the MD5 of empty input.

I think the correct way to get the MD5 string is this:

package main

import (
	"crypto/md5"
	"fmt"
)

func main() {
	data := []byte("These pretzels are making me thirsty.")
	sum := fmt.Sprintf("%x", md5.Sum(data))
	// Output: b0804ec967f48520697662a204f5fe72
	fmt.Println(sum)
}
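
The difference between the two programs above comes from how hash.Hash works: Sum(b) appends the digest of whatever has been written to the hash so far to b; it does not hash b. A sketch comparing the three usages side by side; the function names are hypothetical and chosen only for illustration:

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// hexMD5Streaming feeds data through Write, then calls Sum(nil) so that
// only the 16-byte digest is returned.
func hexMD5Streaming(data []byte) string {
	h := md5.New()
	h.Write(data) // Write on a hash.Hash never returns an error
	return hex.EncodeToString(h.Sum(nil))
}

// hexMD5OneShot uses the package-level md5.Sum, which returns a [16]byte.
func hexMD5OneShot(data []byte) string {
	sum := md5.Sum(data)
	return hex.EncodeToString(sum[:])
}

// hexMD5Misused reproduces the bug: nothing is written to the hash, so
// Sum(data) returns data followed by the MD5 of empty input.
func hexMD5Misused(data []byte) string {
	h := md5.New()
	return hex.EncodeToString(h.Sum(data))
}

func main() {
	data := []byte("hello this is some string")
	fmt.Println(hexMD5Streaming(data) == hexMD5OneShot(data)) // true
	fmt.Println(len(hexMD5OneShot(data)))                     // 32
	fmt.Println(len(hexMD5Misused(data)))                     // 82, grows with the input
}
```

Because the misused form grows with the input size, a large checkpoint produces a correspondingly large "checksum" string.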

The reason for switching to CRC32 is that it is faster and better suited to checksums; MD5 is slower and better suited to defending against cracking.
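
A minimal sketch of what CRC32 verification could look like, using crc32.ChecksumIEEE as in the diff. The checkpointMeta type, its CRC32 field, and verifyCheckpoint are assumptions made for illustration, not the actual pserver code:

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// checkpointMeta is a hypothetical stand-in for the checkpoint metadata.
// The real struct in the pserver may use different names and fields.
type checkpointMeta struct {
	CRC32 uint32
}

// verifyCheckpoint recomputes the CRC32 (IEEE polynomial) of content and
// compares it with the value recorded in the metadata.
func verifyCheckpoint(content []byte, meta checkpointMeta) error {
	got := crc32.ChecksumIEEE(content)
	if got != meta.CRC32 {
		return fmt.Errorf("checkpoint checksum mismatch: got %08x, want %08x", got, meta.CRC32)
	}
	return nil
}

func main() {
	content := []byte("parameter bytes")
	meta := checkpointMeta{CRC32: crc32.ChecksumIEEE(content)}
	fmt.Println(verifyCheckpoint(content, meta) == nil) // true

	content[0] ^= 0xff // corrupt one byte
	fmt.Println(verifyCheckpoint(content, meta) != nil) // true
}
```

The checksum is a fixed-size uint32, so the metadata stays small regardless of how large the checkpoint is.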

h := md5.New()
md5 := hex.EncodeToString(h.Sum(content))
if md5 != cpMeta.MD5 {
crc32 := crc32.ChecksumIEEE(content)
Contributor

One more question: why did we change from MD5 to CRC? Is speed the concern?

// successful.
log.Error("remove old meta file error", log.Ctx{"error": rmErr})
}
}
}

if err != nil {
Contributor

Maybe we should add some logging here. Sorry, this is outside the code changed in this PR.

Contributor Author

The caller will handle it. Here is the code of one caller:

err := s.checkpoint()
if err != nil {
	log.Error("checkpoint error", log.Ctx{"error": err})
}

I think that, in general, the outermost caller should handle the error (either log it or do something else), because it has the most information. If every function along the way logs the error, the same error gets logged multiple times.
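
A sketch of that convention, assuming the standard library log package rather than the structured logger used in the PR. The functions writeFile and checkpoint and the error errDiskFull are hypothetical:

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// errDiskFull is a hypothetical low-level failure used for illustration.
var errDiskFull = errors.New("disk full")

// writeFile stands in for a low-level operation. It returns the error
// without logging it.
func writeFile() error {
	return errDiskFull
}

// checkpoint adds context to the error and passes it up. It does not log.
func checkpoint() error {
	if err := writeFile(); err != nil {
		return fmt.Errorf("write checkpoint: %w", err)
	}
	return nil
}

func main() {
	// The outermost caller is the only place that logs, so the error
	// appears exactly once, with the context added along the way.
	if err := checkpoint(); err != nil {
		log.Printf("checkpoint error: %v", err)
	}
}
```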

Contributor

I got it, thanks :)

Contributor

@typhoonzero typhoonzero left a comment

LGTM!

@helinwang helinwang merged commit b1cbdf0 into PaddlePaddle:develop Oct 26, 2017
@helinwang helinwang deleted the checkpoint branch October 26, 2017 15:51
4 participants