forked from nnstreamer/nntrainer
- Previously, sgemv_fp16 depended on two conditions:
  1. the column or row count had to be divisible by 8
  2. it worked entirely in fp16 precision (which might raise accuracy issues)
- In this commit, we expect sgemv to work as follows:
  1. support every column length (with adaptive-compute optimization)
  2. use a temporary fp32 array to guard against cumulative rounding error in large-scale Tensors
  3. accelerate fp32-to-fp16 copy and vice versa with NEON to improve run time
- some trivial typo fixes included

**Self evaluation:**
1. Build test: [X]Passed [ ]Failed [ ]Skipped
2. Run test: [X]Passed [ ]Failed [ ]Skipped

Signed-off-by: skykongkong8 <[email protected]>
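
The behaviour this commit message describes (any column length, fp32 accumulation) can be illustrated with a minimal portable sketch. It is not the nntrainer implementation. The name `sgemv_fp32_acc`, the row-major layout, and the template parameter `T` (a stand-in for the fp16 storage type) are assumptions made for illustration. Plain loops take the place of the NEON intrinsics.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not nntrainer's API): y = alpha * A * x + beta * y
// for a row-major A of shape rows x cols. T stands in for the fp16 storage
// type; every partial sum is kept in fp32.
template <typename T>
void sgemv_fp32_acc(const T *A, const T *x, T *y, std::size_t rows,
                    std::size_t cols, float alpha, float beta) {
  constexpr std::size_t kBlock = 8;                 // lanes of one 8 x fp16 vector
  const std::size_t blocked = cols - cols % kBlock; // part divisible by 8

  std::vector<float> acc(rows, 0.0f); // temporary fp32 array

  for (std::size_t r = 0; r < rows; ++r) {
    const T *row = A + r * cols;
    float lanes[kBlock] = {0.0f};

    // Main loop: full blocks of 8 columns (where SIMD would be used).
    for (std::size_t c = 0; c < blocked; c += kBlock)
      for (std::size_t k = 0; k < kBlock; ++k)
        lanes[k] += static_cast<float>(row[c + k]) *
                    static_cast<float>(x[c + k]);

    float sum = 0.0f;
    for (std::size_t k = 0; k < kBlock; ++k)
      sum += lanes[k];

    // Tail loop: the remaining cols % 8 columns, so any length works.
    for (std::size_t c = blocked; c < cols; ++c)
      sum += static_cast<float>(row[c]) * static_cast<float>(x[c]);

    acc[r] = sum;
  }

  // Round to the storage type only once, after accumulation has finished.
  for (std::size_t r = 0; r < rows; ++r)
    y[r] = static_cast<T>(alpha * acc[r] + beta * static_cast<float>(y[r]));
}
```

Rounding to the storage type once per output element, rather than after every multiply-add, is what keeps the accumulated error from growing with the column count.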
1 parent 9149a55, commit 0fc814a
Showing 6 changed files with 434 additions and 1,379 deletions.