s2: Add arm64 decompression (#324)
Adapted from Snappy equiv. Minor optimizations along the way.
```
enwik9 (stream):
Decompressing. 426854614 -> 1000000000 [234.27%]; 7.618s, 125.2MB/s
Decompressing. 426854614 -> 1000000000 [234.27%]; 4.258s, 224.0MB/s

Blocks:
benchmark                     old ns/op     new ns/op     delta
BenchmarkTwainDecode1e1-4     69.5          63.0          -9.27%
BenchmarkTwainDecode1e2-4     184           153           -16.61%
BenchmarkTwainDecode1e3-4     852           522           -38.74%
BenchmarkTwainDecode1e4-4     30220         14678         -51.43%
BenchmarkTwainDecode1e5-4     570195        231880        -59.33%
BenchmarkTwainDecode1e6-4     3349753       1920724       -42.66%

benchmark                     old MB/s     new MB/s     speedup
BenchmarkTwainDecode1e1-4     143.97       158.69       1.10x
BenchmarkTwainDecode1e2-4     544.75       653.07       1.20x
BenchmarkTwainDecode1e3-4     1174.00      1916.39      1.63x
BenchmarkTwainDecode1e4-4     330.90       681.27       2.06x
BenchmarkTwainDecode1e5-4     175.38       431.26       2.46x
BenchmarkTwainDecode1e6-4     298.53       520.64       1.74x
```
klauspost authored Feb 26, 2021
1 parent 005d22e commit 48d7571
Showing 5 changed files with 589 additions and 3 deletions.
12 changes: 11 additions & 1 deletion .travis.yml
@@ -4,8 +4,10 @@ os:
- linux
- osx

arch:
- amd64

go:
- 1.13.x
- 1.14.x
- 1.15.x
- 1.16.x
@@ -34,6 +36,14 @@ jobs:
go: 1.16.x
script:
- GOOS=linux GOARCH=386 go test -short ./...
- stage: arm64 tests
arch: arm64
go:
- 1.16.x
script:
- go test -cpu=2 ./s2
- go test -cpu=2 -tags=noasm ./s2
- go build github.com/klauspost/compress/s2/cmd/s2c && go build github.com/klauspost/compress/s2/cmd/s2d && s2c s2c && s2d s2c.s2 && rm s2c && rm s2d && rm s2c.s2

deploy:
- provider: script
3 changes: 2 additions & 1 deletion s2/decode_amd64.s
@@ -34,7 +34,6 @@
// - R_TMP0 scratch
// - R_TMP1 scratch
// - R_LEN length or x (shared)
// - R_X length or x (shared)
// - R_OFF offset
// - R_SRC &src[s]
// - R_DST &dst[d]
@@ -172,13 +171,15 @@ callMemmove:
MOVQ R_DST, 24(SP)
MOVQ R_SRC, 32(SP)
MOVQ R_LEN, 40(SP)
MOVQ R_OFF, 48(SP)
CALL runtime·memmove(SB)

// Restore local variables: unspill registers from the stack and
// re-calculate R_DBASE-R_SEND.
MOVQ 24(SP), R_DST
MOVQ 32(SP), R_SRC
MOVQ 40(SP), R_LEN
MOVQ 48(SP), R_OFF
MOVQ dst_base+0(FP), R_DBASE
MOVQ dst_len+8(FP), R_DLEN
MOVQ R_DBASE, R_DEND
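
For readers less familiar with Go assembly, the `callMemmove` path above roughly corresponds to the literal-copy step of a pure-Go decoder. The sketch below is an illustration, not the package's code: `decodeState`, `copyLiteral` and `errCorrupt` are hypothetical names, and the mapping of fields to `R_DST`, `R_SRC` and `R_OFF` is an assumption based on the register comments in the hunk above. In Go the locals survive the `copy` call automatically. In the assembly version, registers are not preserved across `CALL runtime·memmove(SB)`, which is why the values are spilled to the stack and reloaded afterwards.

```go
package main

import (
	"errors"
	"fmt"
)

// errCorrupt is a hypothetical error value for malformed input.
var errCorrupt = errors.New("corrupt input")

// decodeState holds the locals that the assembly keeps in registers.
// The field-to-register mapping is an assumption for illustration.
type decodeState struct {
	d      int // write index into dst (R_DST as an index)
	s      int // read index into src (R_SRC as an index)
	offset int // last copy offset (R_OFF); must survive the literal copy
}

// copyLiteral copies length literal bytes from src to dst and advances
// both indices. It leaves st.offset untouched.
func copyLiteral(dst, src []byte, st *decodeState, length int) error {
	// Bounds checks come first so that copy never truncates silently.
	if length > len(dst)-st.d || length > len(src)-st.s {
		return errCorrupt
	}
	copy(dst[st.d:], src[st.s:st.s+length])
	st.d += length
	st.s += length
	return nil
}

func main() {
	dst := make([]byte, 8)
	src := []byte("abcdef")
	st := &decodeState{offset: 3}
	if err := copyLiteral(dst, src, st, 4); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q d=%d s=%d offset=%d\n", dst[:st.d], st.d, st.s, st.offset)
}
```

The point of keeping `offset` in the state is that a later copy operation may need the value set before the literal was copied, so it has to come back unchanged after the call.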