fine-tuning/retraining esm2 #565

Open
MarjanHJ opened this issue Jan 2, 2025 · 0 comments

MarjanHJ commented Jan 2, 2025

Is it possible to fine-tune or train ESM2 on sequences that are not from UniProt? I have a custom dataset that I want to fine-tune or train ESM2 on, but I don't know how to create the following files for it.
The sample data has ur50_id and ur90_id, but my data doesn't have these IDs. If it is still possible to train ESM2 on such data, could you please point me to the instructions for doing so?

train_cluster_path = f"{data_path}/2024_03_sanity/train_clusters_sanity.parquet"
train_database_path = f"{data_path}/2024_03_sanity/train_sanity.db"
valid_cluster_path = f"{data_path}/2024_03_sanity/valid_clusters.parquet"
valid_database_path = f"{data_path}/2024_03_sanity/validation.db"

Thanks.
