

DRAFT: Python implementation of cobyla #37

Draft: wants to merge 6 commits into base: main
Conversation

@nbelakovski (Contributor)

This is a placeholder PR while I work on the Python implementation. When I'm ready to merge I'll close this one and open up a new one. Having this one open allows for progress on the implementation to be visible, and to discuss any technical issues that come up as we develop the Python implementation.

Current status:

  • Most of the code for cobyla has been hand-translated to Python
  • The "if debugging" sections have been skipped for now. I expect that for the basic example they won't be triggered if everything is working correctly. They will be added later
  • I'm not dealing with storing the history at the moment, I will explore that later, since I think it will require some thought as to how to structure the Python interface.
  • It does not work at the moment. Since Python and Fortran differ in array indexing and in how they treat array shapes (numpy does not differentiate between row and column vectors), among other things, there will be issues, and I am slowly working through these.

Also, there will be a folder name conflict with @jschueller's C interface, since we are both using the python folder name. I propose @jschueller changes his branch to use the folder name python_c_interface?

Normally I wouldn't post half-finished work like this, but this will be a fairly big PR that is difficult to split into smaller chunks, so I figured it would be better to get it out into the open earlier rather than later.

@github-advanced-security bot left a comment

check-spelling found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.


@zaikunzhang (Member) commented Jul 29, 2023

Thank you @nbelakovski for your work.

I just want to give a quick comment on this point:

The "if debugging" sections have been skipped for now. I expect that for the basic example they won't be triggered if everything is working correctly. They will be added later

This is EXTREMELY important. Code should be battle tested before it becomes software. The seemingly stupid pre/post-conditions will help us spot many many bugs.

I expect that for the basic example they won't be triggered if everything is working correctly.

This is unfortunately wrong. My experiences in the past years have taught me one thing: We do not know what kind of (floating-point) exceptions will occur until we have tested the code on excessively many randomized TOUGH problems for a sufficiently long time.

In this way, the modern Fortran version of PRIMA has been tested on a large class of problems (the CUTEst problems) for more than 20 years (summing up the time of all the parallel tests). In this process, numerous exceptions and bugs have been spotted and fixed.

I insist that any porting/translation of PRIMA should go through the same level of tests. Otherwise, we cannot be sure whether it is correct.

@zaikunzhang (Member) commented Jul 29, 2023

https://github.com/nbelakovski/prima/blob/e46e9192f9a5f80b1ba146120774d21a35e87344/python/src/prima/cobyla/geometry.py#L140C1-L141C1

This is strange. Which version of COBYLA is it based on? This definition of weight is wrong. It was used by an old version of COBYLA, but corrected several months ago.

@nbelakovski (Contributor, Author)

I insist that any porting / translation of PRIMA should go through the same level of tests. Otherwise, we cannot be sure whether it is correct.

I agree, I fully intend to add the "if debugging" sections as well as rigorous testing.

https://github.com/nbelakovski/prima/blob/e46e9192f9a5f80b1ba146120774d21a35e87344/python/src/prima/cobyla/geometry.py#L140C1-L141C1

This is strange. Which version of COBYLA is this based on? This definition of weight is wrong. It was used by an old version of COBYLA, but corrected several months ago.

Hm, not sure how that happened. Maybe I opened the wrong geometry.f90 file at one point. Will fix.

@zaikunzhang (Member) commented Jul 30, 2023

https://github.com/nbelakovski/prima/blob/e46e9192f9a5f80b1ba146120774d21a35e87344/python/src/prima/cobyla/geometry.py#L140C1-L141C1
This is strange. Which version of COBYLA is this based on? This definition of weight is wrong. It was used by an old version of COBYLA, but corrected several months ago.

Hm, not sure how that happened. Maybe I opened the wrong geometry.f90 file at one point. Will fix.

This is a bit worrying. It implies that there are probably other discrepancies. We need a systematic way to spot and fix them. Otherwise, it is hard to verify the correctness of the implementation. (For me, "verification by human eyes" is not trustworthy at all due to my bitter experiences in the past years.)

In addition, we must be very clear about which version of PRIMA the Python implementation refers to. Mixing subroutines from different versions will not work. Moreover, without knowing the version number, we will not be able to improve the Python version when improvements are made in the Fortran reference version.

The implementation of derivative-free optimization methods is quite delicate. Seemingly small changes may turn out significant to the performance. This is what I have learned in the past 15 years while working on derivative-free optimization.

@zaikunzhang (Member) commented Jul 30, 2023

[Update: I have moved the following to a discussion, as it is an important point to highlight.]

Someone may think that it is unnecessary to test solvers in the way that PRIMA does. In most cases, one would test a few problems and observe whether the result is expected. If yes, then "the implementation is correct". It seems that many people are happy with such a test. For me, this is a joke (sorry to say so).

I would like to elaborate a bit more about the importance of testing and verification, motivated by a conversation with a friend who uses PRIMA in his projects.

This friend refers to his projects as "critical projects", which are directly related to the life and death of humans --- imagine, for example, the design of a new medicine (although this is not what my friend works on). The robustness of the solver and the reproducibility of the solution cannot be overstated in such projects. This is quite different from machine learning problems whose objective is only to decide how to post advertisements --- nobody will die if the solver does something wrong. In critical projects, however, people may die.

So, what kind of tests are sufficient for PRIMA, which is designed for critical projects?

No test will be sufficient. I can only tell that the following tests are necessary.

  1. A large number (e.g., hundreds) of test problems, which represent as many as possible of the different challenges that may occur in applications.
  2. TOUGH tests. In applications, function evaluation may fail, it may return NaN or infinity from time to time, and it is very likely contaminated by noise. Any test is insufficient without trying such problems.
  3. Randomized tests. It is impossible to cover all the possible difficulties with a fixed set of problems. Some bugs can only be triggered under very particular conditions that are "difficult" to encounter without randomization (a bug that is rarely triggered is still a bug!!!). Therefore, tests must be randomized, and the random seed must be changed periodically (daily or weekly; see the sketch after this list).
  4. Stress tests. If a solver is designed to solve 100-dimensional problems, then we must test it on (randomized) 1000-dimensional problems and make sure that it does not crash.
  5. Automated tests. It is not enough to randomize the tests. Randomized tests must also be executed automatically every day and night, for example, using GitHub Actions.
  6. Tests on various platforms under different systems using all compilers/interpreters available. Our software should not crash on any platform. Without thorough tests, the only thing I know is that we do not know what will happen.
  7. A sufficiently long time of testing. In general, I do not feel confident about a solver if the accumulated testing time is below 10 years.
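As a minimal sketch of item 3 (not code from PRIMA's test suite; the helper name is illustrative), a random seed that rotates weekly can be derived from the calendar week, so CI runs are reproducible within a week but explore new cases over time:

    # Minimal sketch: a test seed that changes every week.
    import datetime
    import numpy as np

    def weekly_seed() -> int:
        year, week, _ = datetime.date.today().isocalendar()
        return year * 100 + week  # e.g. 202342 for week 42 of 2023

    rng = np.random.default_rng(weekly_seed())
    x0 = rng.standard_normal(10)  # randomized starting point for a test problem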

Comparing this with "testing a few problems and observing whether the result is expected", I hope it is clear what I meant by "it is a joke" (sorry again for saying so). Recalling that the solvers may serve projects that decide human life, I guess it is clear why such an approach is not enough.


return f, constr

def cobyla(calcfc, m, x, f, cstrv=0, constr=0, f0=None, constr0=None, nf=None, rhobeg=None,
Collaborator

imho this should be part of the module, not the example, i.e.

import prima
x,f = prima.cobyla(...)
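For illustration, such a module-level call might look like the following (a hedged sketch; the objective is a toy and the signature is whatever the PR eventually settles on, not an established prima API):

    # Hedged usage sketch of the suggested module-level API.
    import numpy as np
    # import prima  # once the package exposes cobyla at module level

    def calcfc(x):
        f = float(np.sum(x**2))          # toy objective
        constr = np.array([1.0 - x[0]])  # toy constraint value(s)
        return f, constr

    # x, f = prima.cobyla(calcfc, m=1, x=np.zeros(3))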

Contributor (Author)

Yes it will be, this was just me "slinging code" before ultimately re-arranging it. This PR is more for transparency and awareness of progress than for detailed code comments. Once the work is more mature I will close this PR and open a second one for more detailed code comments along the lines of this one.

m_hexagon = 14
f = 0
cstrv = 0 # Should this be a length 14 array? Of inf?
x, f = cobyla(calcfc_hexagon, m_hexagon, x_hexagon, f, cstrv)
Collaborator

In my C-Python binding I packed the tuple of results into a Python dataclass. That's convenient because you can add more fields and keep the same API (I also return the number of function calls). It's also somewhat compatible with what scipy returns.
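For illustration, a result container along those lines might look like this (a hedged sketch; the class and field names are assumptions, not the actual prima or scipy API):

    # Hedged sketch of a dataclass result object; names are illustrative.
    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class COBYLAResult:
        x: np.ndarray       # solution point
        f: float            # objective value at x
        cstrv: float = 0.0  # constraint violation at x
        nf: int = 0         # number of function evaluations

    # Usage: result = cobyla(...); print(result.x, result.f, result.nf)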

@zaikunzhang (Member) commented Aug 2, 2023

Sorry, this part is incorrect. cstrv should be an output. This is specified in the comments, and also indicated by the intent:

  • An intent(in) argument is an input
  • An intent(out) argument is an output
  • An intent(inout) argument is both an input and an output, the value of which will be changed by the procedure / subroutine

Thank you.
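As a side note on how this maps to Python (a hedged sketch only, not the PR's actual signature): an intent(in) argument stays a parameter, while intent(out) results such as f and cstrv are naturally returned rather than passed in.

    # Hedged sketch; names and signature are illustrative, not the PR's API.
    def cobyla_sketch(calcfc, x0):
        # ... run the algorithm; inputs (intent(in)) arrive as parameters ...
        x, f, cstrv = x0, 0.0, 0.0   # placeholder results
        return x, f, cstrv           # outputs (intent(out)) are returned

    # A caller would then unpack the outputs:
    # x, f, cstrv = cobyla_sketch(my_calcfc, my_x0)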

@jschueller jschueller mentioned this pull request Aug 2, 2023
@zaikunzhang (Member)

@nbelakovski I wrote the above comments very quickly. They are mostly general comments that are not specific to the Python translation.

Apologies if I overlooked something. I am open to discussions. Tell me if there is anything not clear.

Thank you very much for your great efforts!

@zaikunzhang (Member)

Hi @nbelakovski ,

A new release of PRIMA has been made. The Fortran implementation of COBYLA now accepts more general constraints. The Fortran code has been modified, although the algorithm stays essentially the same.

I will be very happy to provide more information if needed.

Many thanks for your huge efforts. I know it is HARD, and I hope you are doing well. Tell me if I can help.

Best regards,
Zaikun

@nbelakovski (Contributor, Author)

https://github.com/nbelakovski/prima/blob/e46e9192f9a5f80b1ba146120774d21a35e87344/python/src/prima/cobyla/geometry.py#L140C1-L141C1
This is strange. Which version of COBYLA is this based on? This definition of weight is wrong. It was used by an old version of COBYLA, but corrected several months ago.

Hm, not sure how that happened. Maybe I opened the wrong geometry.f90 file at one point. Will fix.

This is a bit worrying. It implies that there are probably other discrepancies. We need a systematic way to spot and fix them. Otherwise, it is hard to verify the correctness of the implementation. (For me, "verification by human eyes" is not trustworthy at all due to my bitter experiences in the past years.)

It looks like what happened is that when I used VSCode's right-click -> Go to definition, it would open up files in benchmark/rescue_idz/norma/fortran/cobyla and I did not realize this. It appears to have happened for several files, so now I am going back through and redoing them.

In addition, we must be very clear about which version of PRIMA the Python implementation refers to. Mixing subroutines from different versions will not work. Moreover, without knowing the version number, we will not be able to improve the Python version when improvements are made in the Fortran reference version.

Do you mean referencing a release version number? Maybe it's better if I reference a git hash in the README of the Python section; otherwise we'd either have to release the Python version separately or accept that the Python version is always one release behind.

Someone may think that it is unnecessary to test solvers in the way that PRIMA does. In most cases, one would test a few problems and observe whether the result is expected. If yes, then "the implementation is correct". It seems that many people are happy with such a test. For me, this is a joke (sorry to say so).

Just to be clear, more testing and more thorough testing is great, but I am taking a milestone-based approach here to make this a more manageable project. First I want to get to a point where I can test a few problems and observe that the result is as expected, THEN I plan to go back and add all of the if DEBUGGING code, THEN we'll be in a position where we can do more thorough tests. Some of these things can be re-ordered, but one thing at a time please :).

calcfc_chebyqud and calcfc_hexagon in Python return values close to what is returned by Fortran,
but not exactly the same and the difference is outside of machine precision.

For chebyquad the fortran version iterates 387 times within cobylb, whereas
the Python version iterates 564 times. The results are as follows

Fortran:
x(1) = 0.06687201625695266
x(2) = 0.28872054433564054
x(3) = 0.36668681233017669
x(4) = 0.63330363432938008
x(5) = 0.71126615229302259
x(6) = 0.93312277720964698
f = 3.9135366518694255e-10

Python:
x[0] = 0.06688196439284098
x[1] = 0.2887610493288529
x[2] = 0.36668226015206873
x[3] = 0.6333320849097046
x[4] = 0.7112583722245663
x[5] = 0.9331254078707577
f = 3.8259601181730434e-10

The first position in x differs by 9.948135888e-6, so it's further away than machine precision, but
it's possible that the difference is the result of accumulation of machine precision errors.

@nbelakovski (Contributor, Author) commented Sep 20, 2023

The code is still very messy but I have some progress to report. The COBYLA examples for chebyquad and hexagon now run and provide answers that are close to the FORTRAN version. They are not quite the same, but the difference is greater than machine precision. For example, in chebyquad, here are the results:

| | $f$ | $x_1$ | $x_2$ | $x_3$ | $x_4$ | $x_5$ | $x_6$ |
|---|---|---|---|---|---|---|---|
| FORTRAN | 3.9135366518694255e-10 | 0.06687201625695266 | 0.28872054433564054 | 0.36668681233017669 | 0.63330363432938008 | 0.71126615229302259 | 0.93312277720964698 |
| Python | 3.8259601181730434e-10 | 0.06688196439284098 | 0.2887610493288529 | 0.36668226015206873 | 0.6333320849097046 | 0.7112583722245663 | 0.9331254078707577 |
| Difference | 8.76E-12 | -0.000009948135888 | -0.00004050499321 | 0.000004552178108 | -0.00002845058032 | 0.000007780068456 | -0.000002630661111 |

The first position in x differs by 9.948135888e-6, so it's further away than machine precision, but
it's possible that the difference is the result of accumulation of machine precision errors.

But I think that the fact that these are so close (and that results for hexagon, which has constraints, are similarly close) indicates that the vast majority of indexing errors are taken care of.

I noticed that the main loop of cobylb runs 387 times in the FORTRAN code but 564 times in the Python code for chebyquad. My next step is to investigate this difference. I might try replacing the default numpy matrix multiplication with a Python implementation of the matprod function from the FORTRAN code.

After that there is more work to do to add all the pre- and post- conditions and clean things up, but I think this is a nice milestone, it's very satisfying to have results that look like they're in the ballpark.
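Since numpy's optimized matmul may use a different summation order (and hence different rounding) than a hand-written product, a straight-loop matprod could help isolate whether the iteration-count gap is purely numerical. The following is only a hedged sketch, not code from the PR and not necessarily the exact loop order of the Fortran matprod:

    # Hedged sketch: a loop-based matrix product with a fixed summation order,
    # for comparison against numpy's matmul.
    import numpy as np

    def matprod(a: np.ndarray, b: np.ndarray) -> np.ndarray:
        m, k = a.shape
        k2, n = b.shape
        assert k == k2
        c = np.zeros((m, n))
        for j in range(n):          # accumulate column by column
            for l in range(k):
                c[:, j] += a[:, l] * b[l, j]
        return c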

@zaikunzhang (Member) commented Sep 20, 2023

The code is still very messy but I have some progress to report. [...] it's very satisfying to have results that look like they're in the ballpark.

This is very impressive. I guess the discrepancy in the results reflects something more than rounding errors. In addition, we should have a systematic and automated way to verify the implementation.

This is what I did to verify the Fortran implementation (the verification is done via the MATLAB interface):

https://github.com/libprima/prima/blob/main/matlab/tests/verify.m
https://github.com/libprima/prima/blob/main/matlab/tests/private/isequiv.m

BTW, the language has been called Fortran rather than FORTRAN since the 1990s. It is better to avoid the latter unless you want to emphasize that you are working on some extremely old code, which I hope is not the case. :)

Thanks.

@nbelakovski (Contributor, Author)

I found that I had been starting from a different x for the chebyquad example due to an off-by-one error (I believe that we may never get to 0 off-by-one errors; there will always be at least 1). Fixing that doesn't get to the same point as the Fortran code; it actually gets to a somewhat better point, but with a different number of loops and function evaluations. I watched x and the step vector d in the two implementations, and they seemed about the same for ~200 iterations, but then they slowly start to diverge (in direction; in magnitude they remain similar). So it looks like accumulation of machine precision error. I'm not too concerned about this at the moment because it looks like the main algorithm is coded correctly.

I'm about to start adding in the pre/post conditions that I skipped and I have some questions I'd like to get some comments on.

  1. I think that with Python we do not need to add the string afterwards, i.e. Python can just be assert n >= 1 as opposed to assert n >= 1, "n >= 1", srname. When a Python assertion fails, the error shows the assertion that failed along with the call stack. Not including the string also makes the translation a bit easier, as it's annoying to have to copy the assert, change it to work with Python, and then change the string as well. Any objections?

  2. On the topic of the "if (DEBUGGING)" statements, I'm tempted to just leave them out. Some of the asserts do some calculations, but most are just checking sizes, so I don't imagine they contribute much to the runtime cost, or at least not more than some other more meaningful operations. Should we decide to keep the "if (DEBUGGING)" statements, I would propose we have a line in prima/common/consts.py that reads like DEBUGGING = False or os.getenv("PRIMA_DEBUGGING"), or we could skip the False or part of it. A sketch of both points follows this list. Thoughts?
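A hedged sketch of both points, assuming the prima/common/consts.py switch proposed above (the names come from this discussion and are not finalized code):

    # prima/common/consts.py -- sketch of the proposed DEBUGGING switch
    import os
    DEBUGGING = bool(os.getenv("PRIMA_DEBUGGING"))

    # Elsewhere, a precondition translated without the message string and srname:
    # from prima.common.consts import DEBUGGING
    # if DEBUGGING:
    #     assert n >= 1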

Besides adding in the pre/post conditions I'm also working on adding in the history vectors which I had left out, as well as implementing printing.

@nbelakovski (Contributor, Author)

One more question: in a lot of the Python code I chose to use num_vars and num_constraints instead of n and m. I felt it was easier to read certain expressions if I knew that it was related to the number of constraints as opposed to m. It also makes it easier to see for someone new to the algorithm which parts depend on the constraints, since you can highlight all usages of num_constraints but it's much harder to find all usages of m.

Thoughts on this? The code should be consistent across implementations, so I can go in and make the same changes to Fortran code if we like this, but likewise I can revert the changes in Python if we don't like this approach.

Implemented savehist and modified the return values to be more usable.

I've gone through cobyla.py and it seems pretty close to the original.

Remaining work is to implement retmsg and to add in all pre/post conditions
and retrofit existing pre/post conditions to match the style guide.

Then we're ready for a deeper conversation on whether any of this is useful XD

@nbelakovski (Contributor, Author)

I've added just about all the relevant pre/post conditions. There may be some stragglers. I've also implemented [f/rho/ret]msg and savehist, although the hist is a little bit crude for starters.

I think we're getting close to the point where we can have a real PR. @zaikunzhang what do you think needs to be done in terms of verification? I'll start looking at the link you gave above and seeing if it's feasible to adapt it for testing this Python implementation.

@github-actions

@check-spelling-bot Report

🔴 Please review

See the 📂 files view, the 📜 action log, or the 👼 SARIF report for details.

Unrecognized words (56)
accoring
allclose
argmin
Belakovski
calculted
casues
changre
CHEC
checkbreak
chisti
coblyb
codebases
coloumn
componenets
confiolt
consistenchy
copde
cov
cqai
cqi
dmaging
dtype
errrors
fhisti
hstack
htmlcov
icont
immeditely
inifinity
Initializiation
isneginf
isposinf
itemsize
joptcandidate
jth
Kbreak
linnearized
lstsq
multd
ndarray
neccessary
NEWOUA
npdot
numberical
Otherwiae
Precondictions
rcond
remaineder
rtol
sceme
seet
segafult
subrouties
thage
XFLIT
XIMPORVED
Some files were automatically ignored

These sample patterns would exclude them:

^\Qpython/src/prima/__init__.py\E$
^\Qpython/src/prima/cobyla/__init__.py\E$
^\Qpython/src/prima/common/__init__.py\E$

You should consider adding them to:

.github/actions/spelling/excludes.txt

File matching is via Perl regular expressions.

To check these files, more of their words need to be in the dictionary than not. You can use patterns.txt to exclude portions, add items to the dictionary (e.g. by adding them to allow.txt), or fix typos.

To accept ✔️ these unrecognized words as correct, run the following commands

... in a clone of the [email protected]:nbelakovski/prima.git repository
on the nbelakovski/python branch (ℹ️ how do I use this?):

curl -s -S -L 'https://raw.githubusercontent.com/check-spelling/check-spelling/main/apply.pl' |
perl - 'https://github.com/libprima/prima/actions/runs/6344178152/attempts/1'
Available 📚 dictionaries could cover words not in the 📘 dictionary

| Dictionary | Entries | Covers |
|---|---|---|
| cspell:r/src/r.txt | 808 | 2 |

Consider adding them using (in .github/workflows/spelling.yml):

      with:
        extra_dictionaries:
          cspell:r/src/r.txt

To stop checking additional dictionaries, add:

      with:
        check_extra_dictionaries: ''
Pattern suggestions ✂️ (1)

You could add these patterns to .github/actions/spelling/patterns.txt:

# Automatically suggested patterns
# hit-count: 4 file-count: 1
# Python stringprefix / binaryprefix
# Note that there's a high false positive rate, remove the `?=` and search for the regex to see if the matches seem like reasonable strings
(?<!')\b(?:B|BR|Br|F|FR|Fr|R|RB|RF|Rb|Rf|U|UR|Ur|b|bR|br|f|fR|fr|r|rB|rF|rb|rf|u|uR|ur)'(?:[A-Z]{3,}|[A-Z][a-z]{2,}|[a-z]{3,})

Warnings (2)

See the 📂 files view, the 📜 action log, or the 👼 SARIF report for details.

| ℹ️ Warning | Count |
|---|---|
| binary-file | 3 |
| candidate-pattern | 1 |

See ℹ️ Event descriptions for more information.

If the flagged items are 🤯 false positives

If items relate to a ...

  • binary file (or some other file you wouldn't want to check at all).

    Please add a file path to the excludes.txt file matching the containing file.

    File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

    ^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).

  • well-formed pattern.

    If you can write a pattern that would match it,
    try adding it to the patterns.txt file.

    Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.

    Note that patterns can't match multiline strings.

@zaikunzhang (Member) commented Oct 17, 2023

Hello @nbelakovski Nickolai,

First of all, my apologies for being out of the loop recently. Work on my side has been complicated, which will likely continue for a while.

Thank you very much for your huge efforts, which are highly appreciated. What you have achieved is truly nontrivial and impressive.

Here are a few quick comments.

  1. Which commit or release of the Fortran code is the Python implementation based on? It does not seem to be based on the latest one. For example, check the signature of trstlp. I hope it will not be too difficult to catch up on the recent updates in the Fortran code.

  2. To implement the tests and verifications, please first refer to https://github.com/orgs/libprima/discussions/39 for the general idea.

    2.1. The native Python implementation has a quite different nature from a Python interface, and it is much more difficult to confirm the correctness of the former. Thus, we need many many tests, which should be randomized and run repeatedly. The testing time of the modern-Fortran implementation has added up to more than 20 years, and many many bugs have been spotted and fixed during the seemingly unnecessary tests. I am confident about the implementation only by this amount of tests.

    2.2. From now on, we need a facility to verify that two versions of the Python code produce exactly the same results. After making any kind of changes to the code, no matter how insignificant the change seems to be, we have to first perform the verification to make sure that our changes do not introduce unwanted differences between the old and new versions. Believe me, this is really necessary. This is done for the Fortran code (via the MATLAB interface) in the following way:
    https://github.com/libprima/prima/blob/main/matlab/tests/verify.m
    https://github.com/libprima/prima/blob/main/matlab/tests/private/isequiv.m
    The verification should be based on a large set of test problems (at least 100; verification on a few problems is a joke), automated (run by GitHub Actions daily and after each commit), and randomized (the random seed is altered every week). A sketch of such an equivalence check follows this list.

    2.3. CUTEst is the standard test set. It has a Python binding named PyCUTEst, which can be used in GitHub Actions.

    2.4. Obviously, we must test the performance of the Python implementation against the Fortran version (via the Python interface of the latter). The standard tool is the Performance Profile. Tom @ragonneau and Cunxin @OptHuang are developing OptiProfiler for producing such profiles. I think it is usable now. You may ask them for help if needed.

    2.5. TOUGH test, stress test, recursive test, and parallel test must be done. See, e.g., cobyqa/cobyqa#127
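A hedged Python sketch of the equivalence check described in 2.2 (the helper names and the problem generators are assumptions for illustration, not part of verify.m or isequiv.m):

    # Hedged sketch of an equivalence check between two solver versions.
    # `cobyla_old` and `cobyla_new` are placeholders for the two implementations.
    import numpy as np

    def is_equivalent(cobyla_old, cobyla_new, problems, seed):
        rng = np.random.default_rng(seed)
        for build_problem in problems:
            calcfc, x0 = build_problem(rng)      # randomized test problem
            x_old, f_old = cobyla_old(calcfc, x0)
            x_new, f_new = cobyla_new(calcfc, x0)
            # Require exactly the same results, as described in 2.2.
            if not (np.array_equal(x_old, x_new) and f_old == f_new):
                return False
        return True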

Since the Python implementation and the Python interface will cohabit in prima/python, I guess we need to bring @jschueller Julien into the loop. The two have to be consistent.

Thank you very much for your tremendous effort again!

Best regards,
Zaikun

@zaikunzhang (Member)

One more question: in a lot of the Python code I chose to use num_vars and num_constraints instead of n and m. I felt it was easier to read certain expressions if I knew that it was related to the number of constraints as opposed to m. It also makes it easier to see for someone new to the algorithm which parts depend on the constraints, since you can highlight all usages of num_constraints but it's much harder to find all usages of m.

Thoughts on this? The code should be consistent across implementations, so I can go in and make the same changes to Fortran code if we like this, but likewise I can revert the changes in Python if we don't like this approach.

This is a reasonable idea, but I do not think it is urgent. Since I am a bit busy now, I hope to postpone it. Thanks.

@nbelakovski (Contributor, Author)

1. Which commit or release of the Fortran code is the Python implementation based on? It does not seem to be based on the latest one. For example, check the signature of `trstlp`.  I hope it will not be too difficult to catch up on the recent updates in the Fortran code.

Since I started working on this in July, I've kept things based on b83c8ca. I will update to the latest, but first I want to figure out the testing situation.

   https://github.com/libprima/prima/blob/main/matlab/tests/verify.m
   https://github.com/libprima/prima/blob/main/matlab/tests/private/isequiv.m
   The verification should be based on a large set of test problems (at least 100; verification on a few problems is a joke), automated (run by GitHub Actions daily and after each commit), and randomized (the random seed is altered every week).

I'll take a look at these and work towards implementing them. I may need write access to this repo in order to be able to trigger tests, can you add me please?

Since the Python implementation and the Python interface will co-habit in prima/python, I guess we need to bring @jschueller Julien in the loop. The two have to be consistent.

I think it would make more sense to have two folders, python_native and python_c_interface since these two things are trying to accomplish different goals, in spite of their common language.

@nbelakovski (Contributor, Author) commented Oct 23, 2023

Right now I'm working on resolving an issue where the basic test_constrained and test_unconstrained produce slightly different results in CI (on Linux) as compared to my local machine (Mac). For example, in CI the unconstrained test performs 533 function evaluations as compared to 547 on Mac, even with the same Python version. I've managed to somewhat reproduce the issue locally in the python:3.11.6-bookworm docker container - it performs 519 function evaluations. I noticed this a couple of weeks ago and was able to replicate the issue exactly using act, but the same procedure today gives me 519 evaluations.

I'll work with my local reproduction and see if I can nail it down, and hopefully that also leads to a resolution of the issue in CI.

@zaikunzhang zaikunzhang marked this pull request as draft October 30, 2023 13:38
@nbelakovski nbelakovski changed the title DO NOT MERGE: Python implementation of cobyla DRAFT: Python implementation of cobyla Nov 22, 2023