FASK Ignores Background Knowledge Constraints in Sachs Data #38

chrisquatjr · 2024-12-10T05:50:33Z

First of all, thank you to the py-tetrad team for such active maintenance of a great set of resources here!

Problem Description

I have been trying to reproduce the FASK paper results (found here: arxiv.org/pdf/1805.03108) using the Sachs dataset (https://github.com/cmu-phil/example-causal-datasets/blob/main/real/sachs/data/sachs.2005.logxplus10.jittered.eperimental.continuous.txt). The algorithm (or perhaps the TetradSearch object itself) appears to ignore background knowledge constraints. Specifically, edges appear between intervention variables even when explicitly forbidden.

Example

import numpy as np
import pandas as pd
import pytetrad.tools.TetradSearch as ts

# Load and process Sachs dataset
df = pd.read_csv('sachs.2005.with.jittered.experimental.continuous.txt', sep='\t')
log_df = df.apply(lambda x: np.log2(x + 10))

# Setup FASK with background knowledge
fask_search = ts.TetradSearch(log_df)

# Add variables to tiers and forbid intervention-intervention edges
for var in int_cols:
    fask_search.add_to_tier(0, var)
for var in measured_cols:
    fask_search.add_to_tier(1, var)
for int1 in int_cols:
    for int2 in int_cols:
        if int1 != int2:
            fask_search.set_forbidden(int1, int2)

# Run FASK
fask_search.use_sem_bic()
fask_search.run_fask(alpha=0.00001, depth=-1, fask_delta=-0.2,
                     left_right_rule=1, skew_edge_threshold=0.3)

Despite fask_search.print_knowledge() showing forbidden edges (e.g., "b2camp cd3_cd28"), these edges still appear in the output (e.g., "b2camp --> cd3_cd28").
This significantly degrades accuracy compared with the published results:

Published results: AP=0.84, AR=0.80, AHP=1.00, AHR=0.79
My results: AP=0.127, AR=0.438, AHP=0.109, AHR=0.412
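
For readers unfamiliar with the abbreviations, here is a minimal sketch of how such scores could be computed. It assumes that AP/AR mean adjacency precision/recall and AHP/AHR mean arrowhead precision/recall, and that both graphs contain only directed edges written as (source, target) pairs. The function names and the toy graphs are hypothetical; this is not the evaluation code used in the paper or in Tetrad.

```python
def adjacency_precision_recall(estimated, truth):
    """Compare graphs as sets of unordered adjacencies (direction ignored)."""
    est_adj = {frozenset(edge) for edge in estimated}
    true_adj = {frozenset(edge) for edge in truth}
    hits = len(est_adj & true_adj)
    precision = hits / len(est_adj) if est_adj else float("nan")
    recall = hits / len(true_adj) if true_adj else float("nan")
    return precision, recall


def arrowhead_precision_recall(estimated, truth):
    """Compare graphs as sets of directed edges (assumption: every edge is source --> target)."""
    est_dir = set(estimated)
    true_dir = set(truth)
    hits = len(est_dir & true_dir)
    precision = hits / len(est_dir) if est_dir else float("nan")
    recall = hits / len(true_dir) if true_dir else float("nan")
    return precision, recall


# Toy graphs for illustration only; they are not the Sachs network.
truth = [("a", "b"), ("b", "c")]
estimated = [("a", "b"), ("c", "b"), ("a", "c")]

ap, ar = adjacency_precision_recall(estimated, truth)
ahp, ahr = arrowhead_precision_recall(estimated, truth)
print(f"AP={ap:.2f}, AR={ar:.2f}, AHP={ahp:.2f}, AHR={ahr:.2f}")
```

Under these definitions, spurious edges among the intervention variables lower both precision scores because they enlarge the set of estimated edges without adding correct ones.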

Environment: Python 3.11.8, py-tetrad 0.1.2, Ubuntu 22.04.5 LTS

Perhaps I have missed something essential here. In my testing so far, I have been unsuccessful in encoding exogenous background data into my analysis. Any assistance on this would be greatly appreciated! Thank you for your time and consideration.

jdramsey · 2024-12-11T09:12:41Z

Nope, you did not miss anything; the error is mine! In the TetradSearch class (TetradSearch.py), knowledge was not being passed to FASK. I just added this line:

        alg.setKnowledge(self.knowledge)

to this method:

    def run_fask(self, alpha=0.05, depth=-1, fask_delta=-0.3, left_right_rule=1, skew_edge_threshold=0.3):
        self.params.set(Params.ALPHA, alpha)
        self.params.set(Params.DEPTH, depth)
        self.params.set(Params.FASK_DELTA, fask_delta)
        self.params.set(Params.FASK_LEFT_RIGHT_RULE, left_right_rule)
        self.params.set(Params.SKEW_EDGE_THRESHOLD, skew_edge_threshold)

        alg = dag.Fask(self.SCORE)
        alg.setKnowledge(self.knowledge)
        self.java = alg.search(self.data, self.params)
        self.bootstrap_graphs = alg.getBootstrapGraphs()

If you do a git pull of the py-tetrad repository (or check it out again), or rerun the pip install, you should get the change. (Or you could just make the change yourself in the file.)
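
After updating, one way to confirm that the knowledge is respected is to scan the output edges for forbidden pairs. The sketch below is plain Python and does not call py-tetrad: it assumes the edges are available as strings in the printed form quoted in the report (e.g. "b2camp --> cd3_cd28"), and the helper name and the sample edge list are hypothetical.

```python
def forbidden_violations(edges, forbidden):
    """Return the edge strings that contradict a set of forbidden (source, target) pairs.

    A directed edge 'x --> y' violates the knowledge if (x, y) is forbidden.
    Any other edge type between x and y is flagged only if both directions
    are forbidden, since no orientation of it would then be allowed.
    """
    violations = []
    for edge in edges:
        node1, endpoint, node2 = edge.split()
        if endpoint == "-->":
            if (node1, node2) in forbidden:
                violations.append(edge)
        elif (node1, node2) in forbidden and (node2, node1) in forbidden:
            violations.append(edge)
    return violations


# Node names come from the report; the edge list itself is made up for illustration.
forbidden = {("b2camp", "cd3_cd28"), ("cd3_cd28", "b2camp")}
edges = ["b2camp --> cd3_cd28", "cd3_cd28 --> pkc", "pka --> erk"]

print(forbidden_violations(edges, forbidden))
```

An empty list would indicate that none of the listed edges conflicts with the forbidden pairs.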

Best,

Joe
