
What are the parameter sets that were used to generate the different dfs? #11

Open
rcannood opened this issue Jun 2, 2024 · 14 comments

Comments


rcannood commented Jun 2, 2024

Hey @Eliorkalfon !

I'm trying to work out why this method does not perform as well as expected when we rerun the benchmarking analyses. There is probably a bug either in the code in this repo or in the reinterpretation of the method in task-dge-perturbation-prediction.

In this repo, we had to modify the code a bit to generate the four separate submissions and compute the weighted average in a simple way.

However, the parameters used to generate the different data frames are not entirely clear.

The kaggle post reads:

  • weight_df1: 0.5 (utilizing std, mean, and clustering sampling, yielding 0.551)
  • weight_df2: 0.25 (excluding uncommon elements, resulting in 0.559)
  • weight_df3: 0.25 (leveraging clustering sampling, achieving 0.575)
  • weight_df4: 0.3 (incorporating mean, random sampling, and excluding std, attaining 0.554)
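As an aside, the "weighted average in a simple way" mentioned earlier can be sketched as follows. This is a hypothetical illustration, not the repo's code: the dict keys and the normalization step (the listed weights sum to 1.3) are assumptions.

```python
# Hypothetical sketch: combine the four submission DataFrames using the
# weights listed above. Since the weights sum to 1.3, they are
# normalized here -- an assumption, not confirmed by the repo.
import pandas as pd

WEIGHTS = {"df1": 0.5, "df2": 0.25, "df3": 0.25, "df4": 0.3}

def weighted_average(dfs: dict) -> pd.DataFrame:
    total = sum(WEIGHTS[name] for name in dfs)
    return sum(dfs[name] * (WEIGHTS[name] / total) for name in dfs)
```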

From this I infer:

argsets = [
    # Note by author - weight_df1: 0.5 (utilizing std, mean, and clustering sampling, yielding 0.551)
    {
        "name": "df1",
        "mean_std": "mean_std",
        "uncommon": False,
        "sampling_strategy": "random",
        "weight": 0.5,
    },
    # Note by author - weight_df2: 0.25 (excluding uncommon elements, resulting in 0.559)
    {
        "name": "df2",
        "mean_std": "mean_std",
        "uncommon": True,
        "sampling_strategy": "random",
        "weight": 0.25,
    },
    # Note by author - weight_df3: 0.25 (leveraging clustering sampling, achieving 0.575)
    {
        "name": "df3",
        "mean_std": "mean_std",
        "uncommon": False, # should this be set to False or True?
        "sampling_strategy": "k-means",
        "weight": 0.25,
    },
    # Note by author - weight_df4: 0.3 (incorporating mean, random sampling, and excluding std, attaining 0.554)
    {
        "name": "df4",
        "mean_std": "mean",
        "uncommon": False, # should this be set to False or True?
        "sampling_strategy": "random",
        "weight": 0.3,
    }
]

From the description in the kaggle notebook it isn't clear to me whether "uncommon" should be set to True or False for df3 and df4. In addition, I wonder whether other arguments should also be added, such as any of the layer dimensions.

@Eliorkalfon Would you be able to give some insights into this?

@Eliorkalfon (Owner)

Hi @rcannood
I only filtered out the uncommon data in df2; every other dataframe was obtained using all features.


rcannood commented Jun 4, 2024

Thanks for taking a look at this!

Unfortunately, even with the updated parameter settings, we could not reproduce the performance levels this method previously achieved on the Kaggle leaderboard.

Would you be able to take a look at this script to see if you can spot the issue?

You should be able to run it with the following commands:

aws s3 sync --no-sign-request \
  "s3://openproblems-bio/public/neurips-2023-competition/workflow-resources/" \
  "resources"

python src/task/methods/transformer_ensemble/script.py

(Provided that you have all of the dependencies installed.)


Eliorkalfon commented Jun 4, 2024 via email


rcannood commented Jun 4, 2024 via email


Eliorkalfon commented Jun 4, 2024 via email

rcannood added a commit to openproblems-bio/task_perturbation_prediction that referenced this issue Jun 4, 2024

rcannood commented Jun 4, 2024

Thanks for your input, Elior!

Do you mean something like this? → https://github.com/openproblems-bio/task-dge-perturbation-prediction/pull/65/files

Note: Better approach would be to create a set of k models (based on k
folds) and return the average prediction but i didn't have time to
implement it.

Regarding this: first and foremost, we'd like to be able to recreate the source code used to generate the submission that ended up winning in the Kaggle competition, not add new features ;)
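As an aside, the k-fold ensembling idea in the quoted note could be sketched generically as below. This is not the repo's implementation; `make_model` is a placeholder for whatever model the method would train.

```python
# Generic sketch of the quoted idea: train one model per fold and
# average the k predictions on the test set. `make_model` is purely
# illustrative, not part of the actual method.
import numpy as np
from sklearn.model_selection import KFold

def kfold_ensemble_predict(X, y, X_test, make_model, k=5, seed=0):
    kf = KFold(n_splits=k, shuffle=True, random_state=seed)
    preds = []
    for train_idx, _ in kf.split(X):
        model = make_model()
        model.fit(X[train_idx], y[train_idx])
        preds.append(model.predict(X_test))
    # average prediction across the k fold-specific models
    return np.mean(preds, axis=0)
```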


Eliorkalfon commented Jun 4, 2024 via email


rcannood commented Jun 4, 2024

I just ran the method with a validation percentage of 0.1 instead of 0.2, and the resulting MRRMSE score was worse.

Would you be able to run through the code and verify which parts need to be changed in order for the code to produce a decent result? 🙇
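For context, MRRMSE (mean rowwise RMSE, the competition metric) can be computed as below, assuming the standard rowwise definition: RMSE taken per row (per observation, across genes), then averaged over rows.

```python
# Sketch of MRRMSE, assuming the standard rowwise definition:
# per-row RMSE, averaged over all rows.
import numpy as np

def mrrmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    # inner mean: per-row MSE across columns; sqrt: per-row RMSE;
    # outer mean: average over rows
    return float(np.mean(np.sqrt(np.mean((y_true - y_pred) ** 2, axis=1))))
```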

@rcannood rcannood changed the title What are the parameter sets that were used to generate the weight dfs? What are the parameter sets that were used to generate the different dfs? Jun 4, 2024

Eliorkalfon commented Jun 4, 2024 via email


Eliorkalfon commented Jun 4, 2024 via email

@rcannood (Author)

Hi Elior!

I hope that you had a good trip!

Would you happen to have some time to run and review the component and see whether there are any issues that can be fixed?

Kind regards,
Robrecht


Eliorkalfon commented Jul 10, 2024 via email


Eliorkalfon commented Jul 11, 2024 via email


Eliorkalfon commented Jul 11, 2024 via email
