Replies: 8 comments 5 replies
-
Please could you rephrase? Generally I'd suggest sticking to the standard prediction-quality metrics: MAE, R², etc. Minimizing?
-
While I love the DiSCoVeR algorithm and idea, I am generally not convinced by the use of composition alone to guide materials design. I understand it generally works for certain properties, for example elasticity, where good materials are typically refractory elements with C/B/N, and experiments also use composition as guidance. I would argue those successes are due to the fact that we are not exploring a large enough space beyond known materials. Examples such as graphite vs. diamond make me wonder whether there is a better alternative. For example, on top of composition, can we add more constraints, such as a density range for the materials? Also, before we make any property predictions, we will eventually need to assess synthesizability first, e.g. https://www.nature.com/articles/d41586-019-00676-y. Those are my concerns when dealing with new models/algorithms.
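As a rough illustration of the density-range idea, here is a minimal sketch of what a post-hoc constraint on a ranked candidate list could look like. Everything here is hypothetical: `predict_density` stands in for whatever surrogate (structure- or composition-based) would actually supply densities, and the formulas and values are illustrative only.

```python
import pandas as pd

# Hypothetical candidate list: compositions ranked by a DiSCoVeR-style score.
candidates = pd.DataFrame(
    {
        "formula": ["TaC", "WB2", "HfN", "Al2O3"],
        "score": [0.92, 0.88, 0.85, 0.40],
    }
)

def predict_density(formula: str) -> float:
    """Stand-in for a surrogate density model (g/cm^3).

    A real implementation would need structure (or a learned proxy);
    the lookup values here are purely illustrative.
    """
    lookup = {"TaC": 14.3, "WB2": 12.8, "HfN": 13.8, "Al2O3": 3.95}
    return lookup.get(formula, float("nan"))

# Apply the density-range constraint as a post-hoc filter on the ranked list.
density_min, density_max = 5.0, 15.0
candidates["density"] = candidates["formula"].map(predict_density)
mask = candidates["density"].between(density_min, density_max)
print(candidates[mask].sort_values("score", ascending=False))
```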
-
@chc273 I think you bring up a great point about composition-only design. One other topic is the materials design of alloys; the options are somewhat limited when it comes to composition (i.e. the featurization stays the same, which is good for the transferability of models but bad because the algorithm doesn't have the information about it being an alloy, the fractional prevalence of phases, etc.). With the right characterization (or assumptions), I think there are some really interesting paths that could be taken with structural models in alloy design spaces.
Could you clarify this? Success of composition-based models (or did you mean lack of success)? Not large enough, meaning mostly living in "Materials Project space"? (which is a great space, granted)
-
@chc273 Interesting point about density range. Experimentally (and computationally), density is something that would have to be measured after the synthesis/calculation, respectively. I could see that taking on a couple of forms:
Do those seem in line with what you were thinking, or did I miss something?
-
@chc273 and others who can chime in,
Definitely agreed about synthesizability, and the reference made for a nice read. Are there particular synthesizability-screening routes that would increase or decrease your trust in a claim of success/superiority for an adaptive-design validation? Any specific details? For example:
-
Probably the most useful indicator would be taking a previous screening/ML study and using their results (all the way through to synthesis) to evaluate your algorithm using only the information that study had a priori. I.e., can your algorithm improve on their exhaustive search with no extra a priori knowledge?
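For concreteness, a retrospective benchmark along these lines might look roughly like the sketch below. Everything in it is a toy stand-in: `suggest_next` represents whatever adaptive-design algorithm is under evaluation (DiSCoVeR or otherwise), and the synthetic data represents a prior study's a priori knowledge and its eventually measured results.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: 'prior_knowledge' is what the earlier study knew a priori,
# and 'ground_truth' holds the value that study eventually measured for every
# candidate (all the way through to synthesis).
all_candidates = [f"cand_{i}" for i in range(200)]
ground_truth = {c: float(rng.normal()) for c in all_candidates}
prior_knowledge = {c: ground_truth[c] for c in all_candidates[:50]}
hits = {c for c, y in ground_truth.items() if y > 1.5}  # the study's "successes"

def suggest_next(known: dict, pool: list) -> str:
    """Stand-in for the adaptive-design suggester under evaluation.

    Random selection here; swap in DiSCoVeR (or any other scheme) so that it
    only ever sees `known`, i.e. the prior study's a priori information.
    """
    return str(rng.choice(pool))

known = dict(prior_knowledge)
pool = [c for c in all_candidates if c not in known]
recovered = 0
for _ in range(20):  # budget of 20 simulated "syntheses"
    choice = suggest_next(known, pool)
    known[choice] = ground_truth[choice]  # reveal the measured value
    pool.remove(choice)
    recovered += choice in hits
print(f"hits recovered within budget: {recovered} / {len(hits)}")
```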
-
@ancarnevali @anthony-wang @Kaaiian @chc273 @AndrewFalkowski @ardunn @Ryan-Rhys @mkhorton @SurgeArrester @CompRhys @ppdebreuck @blokhin @amorehead @ml-evs @janosh (name order randomly sorted 😉)
Based on recent discussions, I'd be interested to hear your thoughts on this. I think the DiSCoVeR algorithm (this work) is a relatively novel approach to adaptive design; however, it hasn't actually been put "through the wringer" to see how it compares to other adaptive-design schemes, except for some basic comparisons against one of `sklearn`'s novelty algorithms, `LocalOutlierFactor` (see the 2nd panel plot in https://mat-discover.readthedocs.io/en/latest/figures.html#adaptive-design-comparison), for which the results will probably change once I figure out how to extract the unscaled densities.
What kind of analysis/comparisons to specific techniques would you need to see in order to evaluate whether the DiSCoVeR algorithm is "trustworthy" enough to implement in your own workflow (e.g. expensive DFT simulations, wet-lab experiments)? I use the word trustworthy because, to me, it seems like a leap of faith to choose an algorithm and potentially spend hundreds or thousands of hours in the hope that said algorithm is maximizing (the output of) the time you spend setting up and waiting for simulations and wet-lab experiments. Biased answers are very welcome.
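For anyone unfamiliar with the baseline referred to above, here is a minimal sketch of a `LocalOutlierFactor` novelty baseline on composition-like features. The random feature matrices are placeholders (a real comparison would use actual composition descriptors), so this only illustrates the shape of the comparison, not the mat_discover setup itself.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)

# Placeholder feature matrices: rows are materials, columns are composition
# descriptors. A real comparison would use actual composition features
# (e.g. fractional element vectors); random data just keeps this runnable.
X_train = rng.random((500, 20))      # "known" / training compositions
X_candidates = rng.random((50, 20))  # candidate compositions to rank

# novelty=True lets the fitted model score unseen samples via score_samples().
lof = LocalOutlierFactor(n_neighbors=20, novelty=True)
lof.fit(X_train)

# score_samples() returns higher values for inliers, so negate it to get a
# "novelty" score where larger = more unlike the training compositions.
novelty = -lof.score_samples(X_candidates)
most_novel = np.argsort(novelty)[::-1][:5]
print("most novel candidate indices:", most_novel)
```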