Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bitsets3 #239

Merged
merged 55 commits into from
Nov 22, 2020
Merged

Bitsets3 #239

merged 55 commits into from
Nov 22, 2020

Conversation

wclodius2
Copy link
Contributor

Branch to implement bitset data types. Committed the following new source files:
src/stdlib_bitsets.f90
src/stdlib_bitset_64.f90
src/stdlib_bitset_large.f90

Added the following new test files:
src/tests/bitsets/CMakeLists.txt
src/tests/bitsets/Makefile.manual
src/tests/bitsets/test_stdlib_bitset_64.f90
src/tests/bitsets/test_stdlib_bitset_large.f90

Added the following new documentation file:
doc/specs/stdlib_bitssets.md

Modified the following compilation files:
src/CMakeLists.txt
src/Makefile.manual
src/tests/CMakeLists.txt
src/tests/Makefile.manual

Added stdlib_bitsets.f90, stdlib_bitset_64.f90, and stdlib_bitset_large.f90 and modified CMakeLists.txt and Makefile.manual so they should compile the files.

[ticket: X]
Added tests/bitsets/test_stdlib_bitset*.f90, tests/bitsets/CMakeLists.txt,
and tests/bitsets/Makefile.manual and modified tests/CMakeLists.txt and
tests/Makefile.manual to compile the test programs.

[ticket: X]
Eliminated unused variables in stdlib_bitset_64.f90, stdlib_bitset_large.f90
and rename variables called ablock to block_ in stdlib_bitset_large.f90

[ticket: X]
Added stdlib/doc/specs/stdlib_bitsets.md

[ticket: X]
Copy link
Member

@jvdp1 jvdp1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @wclodius2 for this PR. I will try to review it this weekend or next week.

doc/specs/stdlib_bitsets.md Outdated Show resolved Hide resolved
doc/specs/stdlib_bitsets.md Show resolved Hide resolved
doc/specs/stdlib_bitsets.md Show resolved Hide resolved
doc/specs/stdlib_bitsets.md Show resolved Hide resolved
src/tests/Makefile.manual Outdated Show resolved Hide resolved
Copy link
Member

@14NGiestas 14NGiestas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a misspelling of the word 'occurred' in this file: "occured" (line 1092)
EDIT: this file

doc/specs/stdlib_bitsets.md Outdated Show resolved Hide resolved
@wclodius2
Copy link
Contributor Author

wclodius2 commented Oct 5, 2020 via email

@14NGiestas
Copy link
Member

We should really discuss the error handling further in #219 . The 'go to' usage in this function... I'm not a fan xD. I think only one labeled part is needed to achieve the 'return with status' vs 'stop program with error message' desired behavior, with the aid of a error handling module, we could set the message and a flag to switch between the two possible actions, reusing the code. I will bring some working code snippet soon (I hope).
Also, in the assigment functions I wonder if it is possible to use the template system (as in the statistics module)?

@wclodius2
Copy link
Contributor Author

What is xD? The error handling has several aspects that could be commented on:

  1. Rather than intermingling the error handling with the main logic I jump to error handling at the end of the procedure. This increases the amount of code slightly, and can obscure what the error handling is doing, but I think makes the main logic clearer.

  2. I am handling some "errors" that possibly shouldn't be handled: memory allocation problems and user errors. With virtual memory, its not clear to me that allocation can ever fail, and with lazy allocation on Linux such failure will not occur on the allocate statement.. While not important in the procedure you call out, I also attempt to provide a fallback for some user errors where simple failure might be best.

  3. Since we lack an error handling procedure I am relying explicitly on an if branch with error stop. An error handling procedure could incorporate the if branch and slightly simplify the logic by incorporating both a message and a status argument. There are however a lot of issues to be dealt with in developing an error handling module.

    • Should it be a separate module or part of the logger?

    • Should it include a set of error codes?

    • What should the arguments be to the main error handling procedure?

      • message
      • module and procedure names
      • status/stat flag
      • error code
    • How should processing be terminated, an error stop: with an F2018 runtime determined stop code, with an error code determined compile time stop code, generic compile time stop code?

  4. Including the module and procedure names in the error stop makes the stop code longer and harder to read

  5. Making the stop code message a character string parameter would shorten the error handling code, at the expense of cluttering up other portions of the code.

It is of course possible to use fypp preprocessing to generate templates for the assignments to and from logical arrays. The assignment procedures are few enough and simple enough that I thought it best to avoid a dependence on the preprocessor, but that is a matter of taste. The addition of more assignments, say to and from integers would probably change my opinion.

@14NGiestas
Copy link
Member

14NGiestas commented Oct 7, 2020

What is xD?

ASCII version of 😆smiling face with closed eyes

1-5. Yes, the current way increases the amount of code, however the main logic can be still clear if we do something like that:

    subroutine fail_behavior(..., status)
        ! doing a lot of stuff here
        ! ...
        ! case X: found some fatal error
            error_handler % status = ERROR_CODE_TO_X
            error_handler % message = "Oh no! X happened"
            go to 404
        ! case Y: found other general error
            error_handler % status = ERROR_CODE_TO_Y
            error_handler % message = "Oh no! Y happened"
            go to 404

    404 if (present(status)) then
            status = error_handler % status
            return
        else
            error stop error_handler % message
        end if
    end subroutine

This is a way we can avoid the code duplication, reusing the go to branching. Note that error_handler should be a object that couples the information about what happened (not reflecting in any way the API I got in mind for a final version).
The error handling should be a separate module with some default general purpose errors and a abstract one, that would allow it to be extended when needed by other modules, this way they can provide their own specific codes constants and messages.

It is of course possible to use fypp preprocessing to generate templates for the assignments to and from logical arrays. The assignment procedures are few enough and simple enough that I thought it best to avoid a dependence on the preprocessor, but that is a matter of taste. The addition of more assignments, say to and from integers would probably change my opinion.

IMHO, we should use the template system every time that not using it leads to code duplication, since it gets easier to modify and extend the code in the future. (DRY principle).

@wclodius2
Copy link
Contributor Author

FWIW for the error handling what I thought you were proposing was something like

subroutine fail_behavior(..., status)
    ! doing a lot of stuff here
    ! ...
    ! case X: found some fatal error
        call error_handler(message="Oh no! X happened", error=error_code_to_x, status=status)
        return
    ! case Y: found other general error
        call error_handler(message="Oh no! Y happened", error=error_code_to_y, status=status)
        return
end subroutine

where error_handler assigns error to status if status is present and stops processing with a message to error_unit otherwise.

I can change the code to use fypp, but it probably requires starting a new branch. I can see how to add or modify files in the current branch, but not how to rename or remove existing files, which I need to change the *.f90 files to *.fypp.

@14NGiestas
Copy link
Member

FWIW for the error handling what I thought you were proposing was something like (...)
where error_handler assigns error to status if status is present and stops processing with a message to error_unit otherwise.

Indeed, actually this is even better because we don't need retype the branching logic with "go to" statements in every function that may fail and is doable (working example here). To avoid having any boilerplate along the function's main logic, the module that have a specific need should build a function that calls this base function error_handler, Ex:

subroutine value_error(expected, got, status)
     ....
     write(message, '("value_error: Expected ",G0,", but got ",G0," instead.")') expected, got
     call error_handler(message, ...) 
end subroutine

I can change the code to use fypp, but it probably requires starting a new branch. I can see how to add or modify files in the current branch, but not how to rename or remove existing files, which I need to change the *.f90 files to *.fypp.

You can clone your repo here, modify it locally and push the commits (It will probably sync with the open PR)

@jvdp1
Copy link
Member

jvdp1 commented Oct 7, 2020

I can see how to add or modify files in the current branch, but not how to rename or remove existing files, which I need to change the *.f90 files to *.fypp.

You can rename and remove a file locally with your OS, and then add/remove, commit, and push with git.

@wclodius2
Copy link
Contributor Author

When I try

git clone https://github.com/wclodius2/stdlib/tree/bitsets3 my-bitsets

I get the error message

Cloning into 'my-bitsets'...
fatal: repository 'https://github.com/wclodius2/stdlib/tree/bitsets3/' not found

Is there something wrong with my syntax/repository name?

@14NGiestas
Copy link
Member

This is the wrong URL, please do the following:

git clone https://github.com/wclodius2/stdlib.git my-bitsets
cd my-bitsets
git checkout bitsets3

…set*.f90

Changed makeefiles to preprocess ths stdlib_bitset*.fypp files to stdlib_bitset*.f90 files.

[ticket: X]
Renamed files stdlib_bitset*.f90 to fypp preprocessor stdlib_bitset*.fypp files

[ticket: X]
Changed stdlib_bitsets.fypp, stdlib_bitset_64.fypp, and stdlib_bitset_large.fypp
to generate the assignment procedures of logical arrays to and from bitsets.

[ticket: X]
Removed stdlib_bitsets.f90, stdlib_bitset_64.f90, and stdlib_bitset_large.f90 as
they are now generated by the preprocessor.

[ticket: X]
Defined an error_handler subroutine in stdlib_bitsets.fypp and used it to handle
errors in stdlib_bitset_64.fypp and stdlib_bitset_large.fypp. Also was more
consistent in documenting status argument results. Added
char_string_too_large_error to status results.

[ticket: X]
Was more consistent in using bulleted lists in documenting status error codes.
Added char_string_too_large_error to the error codes.

[ticket: X]
Working with fypp made it easier to add unwanted trailing blanks. I removed them.

[ticket: X]
Introduced the parameters max_digits and overflow_bits to be used in checking
for overflows on reads and writes. The parameters need to be changed if
bits_kind is changed, and preferred parameters for bits_kind==int64 are
defined, but commented out.

[ticket: X]
Replaced go to 100 with exit in both stdlib_bitsets_64.fypp and
stdlib_bitsets_large.fypp.

[ticket: X]
@@ -0,0 +1,744 @@
program test_stdlib_bitset_64
use, intrinsic :: iso_fortran_env, only : int8, int16, int32, int64
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
use, intrinsic :: iso_fortran_env, only : int8, int16, int32, int64
use stdlib_kinds, only : int8, int16, int32, int64

Copy link
Member

@jvdp1 jvdp1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are some additional comments on the tests. All options seem to be covered by the tests. I added some suggestions.
On overall I am pleased with this PR. Thank you @wclodius2!

src/tests/bitsets/test_stdlib_bitset_64.f90 Outdated Show resolved Hide resolved
src/tests/bitsets/test_stdlib_bitset_64.f90 Show resolved Hide resolved
src/tests/bitsets/test_stdlib_bitset_64.f90 Show resolved Hide resolved
src/tests/bitsets/test_stdlib_bitset_64.f90 Show resolved Hide resolved
src/tests/bitsets/test_stdlib_bitset_64.f90 Outdated Show resolved Hide resolved
src/tests/bitsets/test_stdlib_bitset_large.f90 Outdated Show resolved Hide resolved
@wclodius2
Copy link
Contributor Author

wclodius2 commented Oct 20, 2020 via email

Replaced the use of iso_fortran_env with stdlib_kinds.

[ticket: X]
Changed handling of potential integer overflows on reads for bits_kind==int64,
changing max_digits from 20 to 19. Removed comment that my treatment may not be
quite right. Also fixed a typo in an error message.

[ticket: X]
@wclodius2
Copy link
Contributor Author

I have now resolved my qualms about my handling of potential integer overflows on reads. I have changed max_digits for bits_kind==int64 from 20 to 19 (2**63-1 is about 9.2E18) and convinced myself that the rest of my logic was correct. I have move the comments that the code may not be quite right. (I also fixed a typo in an error message.)

Jeremie suggested numerous changes. I implemented most of them.

[ticket: X]
@wclodius2
Copy link
Contributor Author

@jvdp1 thanks for the thorough review.

At the suggestion of Jeremie I replaced a number of go tos.

[ticket: X]
wclodius2 and others added 3 commits October 21, 2020 07:26
Kreplaced go tos at the suggestion of Jeremie.

[ticket: X]
Documented the use of the "named" forms, .EQ., .NE., .GT., .GE., .LT., .LE., as
alternatives to the symbolic forms, ==, /=, >, >=, <, <= of the comparison
operations.

[ticket: X]
Copy link
Member

@milancurcic milancurcic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed and made edits to the spec document. I have also built the code run the tests (successfully). For as far as I understand bitsets, I think the API is good.

I did not review the code. Due to the combination of limited time, large size of this PR, and my unfamiliarity with bitsets, I don't think I can do proper code review without putting in substantial effort. But I trust @wclodius2 and @jvdp1 that they did great work.

I recommend this PR to be merged a week from now.

@wclodius2
Copy link
Contributor Author

wclodius2 commented Nov 13, 2020 via email

@milancurcic
Copy link
Member

Okay, thanks, I understand now. All good.

@milancurcic
Copy link
Member

Your example "i.e., the value 1110 can be considered as defining the subset (1, 2, 3)" was what made it click for me. Do you agree that we include it in the spec? Something like this:

"It can equivalently be considered as a sequence of logical values or as a subset of the integers 0 ... bits-1. For example, the value 1110 can be considered as defining the subset of integers [1, 2, 3]."

@wclodius2
Copy link
Contributor Author

Sounds good to me.

@milancurcic
Copy link
Member

I will go ahead and merge this considering there aren't any objections. Thank you @wclodius2 and all the reviewers.

@milancurcic milancurcic merged commit 37a1ed1 into fortran-lang:master Nov 22, 2020
@jvdp1 jvdp1 mentioned this pull request Dec 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants