
Add script to run gain selection over a list of dates, using the "lst_select_gain" script #141

Merged (10 commits, Mar 30, 2022)

Conversation

@marialainez (Collaborator)

No description provided.

@morcuended (Member)

Thanks @marialainez. I have been trying to make it work with sbatch. All my attempts at using the sbatch --wrap option to export the PATH failed. The only way it works for me is to create job bash files that are executed with sbatch.

For example, using a function like this:

from textwrap import dedent
import subprocess as sp

PATH = "PATH=/fefs/aswg/software/gain_selection/bin:$PATH"

def get_sbatch_script(
        run_id,
        input_file,
        output_dir,
        log_dir,
        ref_time,
        ref_counter,
        module,
        ref_source
):
    return dedent(f"""\
    #!/bin/bash
    
    #SBATCH -D {log_dir}
    #SBATCH -o "gain_selection_{run_id:05d}_%j.log"
    #SBATCH --job-name "gain_selection_{run_id:05d}"
    #SBATCH --export {PATH} 
    
    lst_select_gain {input_file} {output_dir} {ref_time} {ref_counter} {module} {ref_source}
    """)


for file in input_files:
    run_info = run_info_from_filename(file)
    job_file = f"gain_selection_{run_info.run:05d}.{run_info.subrun:04d}.sh"
    with open(job_file, "w") as f:
        f.write(get_sbatch_script(
            run_info.run,  # run id parsed from the filename above
            file,
            output_dir,
            log_dir,
            ref_time,
            ref_counter,
            module,
            ref_source
        ))
    sp.run(["sbatch", job_file], check=True)
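
For reference, the kind of one-line --wrap submission that did not work is sketched below (a reconstruction for illustration; the exact flags tried are not shown in this thread):

# Attempted single-command submission: the job command goes in --wrap and the
# PATH in --export. sbatch accepts this, but the exported PATH did not reach
# lst_select_gain in my attempts, hence the job files above instead.
sp.run([
    "sbatch",
    "-D", log_dir,
    f"--export={PATH}",
    f"--wrap=lst_select_gain {file} {output_dir} {ref_time} {ref_counter} {module} {ref_source}",
], check=True)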


def apply_gain_selection(date: str):
    run_summary_file = "/fefs/aswg/data/real/monitoring/RunSummary/RunSummary_"+date+".ecsv"
    data = ascii.read(run_summary_file)
@morcuended (Member)

you can directly use an astropy Table -> data = Table.read(file)
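
A minimal sketch of that suggestion (Table.read infers the ECSV format from the .ecsv extension):

from astropy.table import Table

# Read the run summary directly into a Table; replaces the ascii.read call.
data = Table.read(run_summary_file)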

    run_summary_file = "/fefs/aswg/data/real/monitoring/RunSummary/RunSummary_"+date+".ecsv"
    data = ascii.read(run_summary_file)
    data.add_index("run_id")
    data = data[(data['run_type']=='DATA')]  # apply gain selection only to DATA runs
@morcuended (Member)

remove the redundant parentheses

    data.add_index("run_id")
    data = data[(data['run_type']=='DATA')]  # apply gain selection only to DATA runs
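
The filter without the redundant parentheses would read:

# Same selection, minus the extra parentheses around the condition.
data = data[data["run_type"] == "DATA"]  # apply gain selection only to DATA runs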

    output_dir = "/fefs/aswg/data/real/R0/gain_selected/"+date
@morcuended (Member)

you can use f-strings: f"/fefs/aswg/data/real/R0/gain_selected/{date}"

Comment on lines 26 to 31

    for run in data["run_id"]:

        ref_time = data.loc[run]["dragon_reference_time"]
        ref_counter = data.loc[run]["dragon_reference_counter"]
        module = data.loc[run]["dragon_reference_module_index"]
        ref_source = data.loc[run]["dragon_reference_source"]
@morcuended (Member)

you can directly loop over the table rows and extract the values afterwards:

for run in data:
    run_id = run["run_id"]
    ref_time = run["dragon_reference_time"]
    ref_counter = run["dragon_reference_counter"]
    module = run["dragon_reference_module_index"]
    ref_source = run["dragon_reference_source"].upper()

and then I think you would not need data.add_index("run_id") anymore, since you would no longer look rows up with data.loc.

@codecov bot commented Mar 14, 2022

Codecov Report

Merging #141 (4e7e39e) into main (ffc7e25) will increase coverage by 0.13%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main     #141      +/-   ##
==========================================
+ Coverage   81.74%   81.88%   +0.13%     
==========================================
  Files          51       51              
  Lines        4838     4995     +157     
==========================================
+ Hits         3955     4090     +135     
- Misses        883      905      +22     
Impacted Files                   Coverage Δ
osa/utils/tests/test_utils.py    98.38% <0.00%> (+1.61%) ⬆️
osa/utils/utils.py               81.93% <0.00%> (+1.93%) ⬆️
osa/scripts/copy_datacheck.py    32.91% <0.00%> (+2.00%) ⬆️
osa/scripts/autocloser.py        67.11% <0.00%> (+8.39%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update ffc7e25...4e7e39e.


log.info("Done! No more dates to process.")

check_failed_jobs(output_basedir)
@morcuended (Member) commented Mar 29, 2022

@marialainez, in the current way of checking failed jobs, you need to launch the script again, which will submit the jobs again, right? Either we implement a simulate option that only checks the job status, or the check is done in a separate script, independent of the job-launching script.
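
A possible shape for that simulate option (a sketch with a hypothetical --simulate flag; not the PR's actual interface):

import argparse

# Hypothetical CLI: with --simulate, only inspect job status; never submit.
parser = argparse.ArgumentParser()
parser.add_argument("--simulate", action="store_true",
                    help="only check the status of already submitted jobs")
args = parser.parse_args()

if not args.simulate:
    for date in dates:  # dates: the list of dates to process
        apply_gain_selection(date)

check_failed_jobs(output_basedir)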
