-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(sio): fix implemented Hardware Divider #37
Conversation
thanks @kilograham for pointing me in the right direction
Thanks Valerio! Yes, Graham surely knows the RP2040 much better than me. I'm so happy he's helping us :-) Do you want us to add some unit tests for the divider in the next live stream? Or do you want to try adding them yourself? |
test based on hello_divider from pico_examples
test based on hello_divider from pico_examples
…040js into Hardware-Divider-fix
src/sio.spec.ts
Outdated
const _DIV_SDIVISOR = SIO_START_ADDRESS + 0x06c; // Divider signed divisor | ||
const _DIV_QUOTIENT = SIO_START_ADDRESS + 0x070; // Divider result quotient | ||
const _DIV_REMAINDER = SIO_START_ADDRESS + 0x074; //Divider result remainder | ||
const _DIV_CSR = SIO_START_ADDRESS + 0x078; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please remove the _ (underscore) in front of the constant? The idea is to keep this code consistent with the rest of the code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I'm moving test to be gdbclient ready, I'll do it in the next commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh great. I was going to suggest that, but you are faster!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok done,
I had to add several "| 0"which are needed when running on silicone and the read value is signed, maybe a readSInt32 implementation could be better? or maybe not as it matters only during test?
copied the silicone behavior when divisor=0 -> quot=-1, rem=dividend
I wasn't able to find any word related to div/0 exceptions in the datasheets, is this possible?
_DIV_XXX has been renamed to SIO_DIV_XXX, I choose this way as SIO_DIV_XXX = SIO_ADDRESS_BASE + DIV_XXX
Even if declared in different files, I wouldn't have the same name to both offset and full address.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
copied the silicone behavior when divisor=0 -> quot=-1, rem=dividend
Good idea to test what happens when dividing by zero.
I wasn't able to find any word related to div/0 exceptions in the datasheets, is this possible?
No idea! Did you see it getting any exception on the silicone?
I had to add several "| 0"which are needed when running on silicone and the read value is signed, maybe a readSInt32 implementation could be better? or maybe not as it matters only during test?
I think this is a good idea, it will make the tests easier to read / understand. I suggest calling it readInt32()
(without the S), similar to how they call the signed int array in JavaScript (and also C++ type int32_t vs uint32_t).
_DIV_XXX has been renamed to SIO_DIV_XXX, I choose this way as SIO_DIV_XXX = SIO_ADDRESS_BASE + DIV_XXX
Even if declared in different files, I wouldn't have the same name to both offset and full address.
👍
fix(sio):fix DIV_CSR register behavior test(sio):move to test driver for silicone/emu comparison
src/sio.ts
Outdated
@@ -117,6 +146,56 @@ export class RPSIO { | |||
case GPIO_HI_OE_XOR: | |||
this.qspiGpioOutputEnable ^= value & GPIO_MASK; | |||
break; | |||
case DIV_UDIVIDEND: | |||
this.divDividend = value >>> 0; | |||
this.divCSR = 0b10; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the reason for updating divCSR twice in the same switch case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was waiting for You on this...
well the datasheet states that the CSR lsb is cleared at the beginning of the calculation and set once result is ready (after 8 cpu cycles).
On a single core this assignment does not make any sense, since I don't know how multicore will be implemented it could be something that will matter in case of concurrency against hardware divider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. When we get to multicore, we'll just execute one instruction at a time (so jumping back and forth between running core0 and core1 code).
In any case, JavaScript does not support multi-threading natively. The only way is using Web Workers (or worker threads on Node.js). But even then, the only way two threads can access the same memory is using a SharedArrayBuffer, so if we decide at some point to truly run this on two threads (and share the divider memory between them) we'll have to change all these variables to be stored inside a SharedArrayBuffer and refactor the the code, and I don't see it happening any time soon.
So to summarize, I think we can safely remove the this.divCSR = 0b10;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note that SIO dividers/interpreters are core local
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point to remember when we implemented multi-core. I noted this!
src/sio.ts
Outdated
case DIV_SDIVISOR: | ||
this.divCSR = 0b10; | ||
this.divDivisor = value | 0; | ||
if (this.divDivisor == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest that we move this repetitive code (in DIV_UDIVIDEND, DIV_SDIVIDEND, DIV_UDIVISOR, and DIV_SDIVISOR) into a function that will do the actual division, e.g. this.updateDivider()
.
As far as I can tell, you don't need to worry about the sign of the result, because when it's read back into a register it'll always be internally converted to unsigned anyway.
src/sio.spec.ts
Outdated
}); | ||
|
||
describe('Hardware Divider', () => { | ||
it('should set perform a signed hardware divider 123456 / -321 = -384 R192', async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "REM 192" will be clearer than "R192" (same for the other test names).
Also, the word "set" (second word) seems not related here?
src/sio.spec.ts
Outdated
expect((await cpu.readUint32(SIO_DIV_QUOTIENT)) | 0).toEqual(-384); | ||
}); | ||
|
||
it('should set perform an unsigned division by zero 123456 / 0 = Infinity RNaN', async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The result isn't Infinity / NaN (as the test name says), rather 0xffffffff and remainder 123456.
adjusted test by using readInt32 where signed value are expected
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Valerio! I added a few comments / suggestions. I like the shape this is taking :)
src/utils/test-driver-gdb.ts
Outdated
|
||
async readInt32(address: number) { | ||
const result = await this.gdbClient.readMemory(address, 4); | ||
return new Uint32Array(result.buffer)[0] | 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not this.readUint32(address) | 0
like you did in the RP2040 test driver?
src/sio.ts
Outdated
@@ -147,46 +159,14 @@ export class RPSIO { | |||
this.qspiGpioOutputEnable ^= value & GPIO_MASK; | |||
break; | |||
case DIV_UDIVIDEND: | |||
this.divDividend = value >>> 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it work correctly even though you use the same this.divDividend = value;
for both the signed/unsigned cases (instead of value >>> 0
vs value | 0
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the end yes, it works in both cases.
I don't understand the reason to have separated registers, at the beginning I though putting a boolean flag to change the behavior between signed/unsigned, it turned out a useless handling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you tried dividing -3000 by 2?
In JavaScript I get different results (printing both results as signed, just to show the difference):
const signedDiv = (a, b) => ((a|0)/(b|0))|0;
const unsignedDiv = (a, b) => ((a>>>0)/(b>>>0))|0;
console.log(signedDiv(-3000, 2)); // -1500
console.log(unsignedDiv(-3000, 2)); // 2147482148
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this test, it passes on both emu and sil.
what I've seen is that in the emu it doesn't matter if I use readUint32 or readInt32 it always return the right value.
test in the silicon instead it changes and it requires the right read method.
Seems that javascript (math.trunc?) automatically handles the right way in case of negative value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the result, it shouldn't matter how you read it (you can convert between signed / unsigned at any stage).
But for the actual division, it seems to matter even if you use math.trunc:
const signedDiv = (a, b) => Math.trunc((a|0)/(b|0))|0;
const unsignedDiv = (a, b) => Math.trunc((a>>>0)/(b>>>0))|0;
console.log(signedDiv(-3000, 2)); // -1500
console.log(unsignedDiv(-3000, 2)); // 2147482148
It could be the the current code is always doing signed division, and that's why the test passes. In this case, you can add a test for unsigned divide as well, trying to divide 0x80000000 by 2 using unsigned integer divide, and verify that you get the correct result (0x40000000):
console.log((signedDiv(0x80000000, 2) >>> 0).toString(16)); // 0xc0000000
console.log((unsignedDiv(0x80000000, 2) >>> 0).toString(16)); // 0x40000000
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
arghhhh!!! missing break forced signed code.
I've also had to force the readUint32 to return unsigned to pass this test:
should perform a signed division 0x80000000 / 2 = 0xc0000000 REM 0
…n behavior Improved division by zero behavior added some relevant test
…ration fix quotient value for all signed/unsigned cases added specific test case to check that actually signed/unsigned operation is performed fix on test-driver-rp2040: force readUint32 to return unsigned value
Thanks Valerio, this looks good to me :-) Is there anything else you want to add/change before I merge? |
Oh no Uri, I think this divider took enough time. It is fine now. |
Alright, merged! Thanks for working on it, it was a long one indeed... But now I feel the instruction set is truly complete (this hardware divider is like another CPU instruction). We'll review the implementation in the live stream tomorrow :-) |
Hi Uri, I want to say sorry for forcing You to stay so much time on this, I'm sure You would have successfully write it in less than 20 minutes. |
Hi Valerio, First of all, thanks for caring about my time! I'm very pleased with the results. Maybe just writing the code would take me 20 minutes, but as you've seen in the live streams, writing the test cases, finding the bugs + verifying with the silicone would probably take much more. Like the alarms last week took about 2 hours. Anyway, all the bugs you've found so far with gdbdiff (and fixed) surely saved much precious time. And I'm happy to see your commits and provide feedback. This way everyone benefits :) |
thanks @kilograham for pointing me in the right direction