I'm working with SLURM and need to create and submit a pipeline where jobs run sequentially: job_1 → job_2 → job_3. However, job_2.sh is unique because it's generated dynamically by job_1.sh using a Python script.
Since job_2.sh doesn't exist when the pipeline starts, I can't directly create dependencies between the jobs. To work around this, I created an intermediate job job_1_5.sh that submits job_2.sh after job_1.sh has generated it. However, now when I want to submit job_3, I only have the id for job_1_5.
Here's my current implementation:
#pipeline.sh
#!/bin/bash
# Submit job_1 which will generate and submit job_2
job1_id=$(sbatch --parsable job_1.sh)
job1_5_id=$(sbatch --parsable --dependency=afterok:$job1_id job_1_5.sh)
job3_id=$(sbatch --parsable --dependency=afterok:$job1_5_id job_3.sh) #I also want a dependency on `job_2.sh`
#job_1.sh
#!/bin/bash
#SBATCH --job-name=job1
#SBATCH --output=job1_%j.out
#SBATCH --time=00:05:00
echo "This is job 1"
python generate_job2.py
#generate_job2.py
#!/usr/bin/env python3

def generate_job2():
    with open('job_2.sh', 'w') as f:
        f.write('''#!/bin/bash
#SBATCH --job-name=job2
#SBATCH --output=job2_%j.out
#SBATCH --time=00:05:00
echo "This is job 2"
sleep 2
''')

if __name__ == "__main__":
    generate_job2()
#job_1_5.sh
#!/bin/bash
#SBATCH --job-name=job1_5
#SBATCH --output=job1_5_%j.out
#SBATCH --time=00:05:00
# Dependency on job_1 is set at submission time in pipeline.sh ($1 would not be expanded inside an #SBATCH directive)
# Submit job_2
job_2_id=$(sbatch --parsable job_2.sh)
echo $job_2_id > job2_id.txt
#job_2.sh
#!/bin/bash
#SBATCH --job-name=job2
#SBATCH --output=job2_%j.out
#SBATCH --time=00:05:00
echo "This is job 2"
sleep 2
#job_3.sh
#!/bin/bash
#SBATCH --job-name=job3
#SBATCH --output=job3_%j.out
#SBATCH --time=00:05:00
echo "This is job 3"
How can I properly ensure that job_3.sh only runs after the dynamically generated job_2.sh has completed?
My attempts at fixing it:
I tried writing the job id for job_2 into a file. However, I can't use that id as a dependency for job_3, because the id doesn't exist until job_1_5 actually submits job_2, which happens long after pipeline.sh has already submitted job_3.
Another potential solution would be to somehow assign job_2 an id myself, rather than expecting Slurm to generate one. That way, job_3 could be constrained on the known id to ensure job_2 runs first.
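Concretely, the first attempt amounts to something like the snippet below (the script name is purely for illustration). It only works if it is run after job_1_5 has already finished and written job2_id.txt, so it doesn't let me queue everything up front:
#submit_job3_later.sh
#!/bin/bash
# only usable once job_1_5 has run and written job2_id.txt
job2_id=$(cat job2_id.txt)
sbatch --dependency=afterok:$job2_id job_3.sh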
The easiest way to achieve that is to simply submit the (N+1)th job from the Nth job:
job_1.sh:
#!/bin/bash
#SBATCH --job-name=job1
#SBATCH --output=job1_%j.out
#SBATCH --time=00:05:00
echo "This is job 1"
python generate_job2.py
sbatch job_2.sh   # submit the script that was just generated
and so on, adapting generate_job2.py so that it also submits job_3.sh.
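For instance, the generated job_2.sh could simply end with the sbatch call for the next stage. A minimal sketch of what the adapted generator might write out:
#!/bin/bash
#SBATCH --job-name=job2
#SBATCH --output=job2_%j.out
#SBATCH --time=00:05:00
echo "This is job 2"
sleep 2
# chain submission: the last thing job_2 does is submit job_3
sbatch job_3.sh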
Two caveats:
Another strategy, if the generate_job2.py script does not influence the resource request and only generates the commands to run in the submission script, is to prepare for each dependent job a submission script that simply runs an external Bash script yet to be written by the previous job.
Something like:
#!/bin/bash
#SBATCH --job-name=job2
#SBATCH --output=job2_%j.out
#SBATCH --time=00:05:00
./script_to_be_written_by_job1
Slurm will not check whether or not script_to_be_written_by_job1 exists at submission time, so if you set up the dependencies properly, the script will exist by the time Slurm tries to run it.
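With that setup, the whole chain can be submitted in one go, because every submission script already exists at submission time; only the payload of job 2 is written later by job 1. A minimal sketch of what pipeline.sh then looks like, assuming job_1.sh ends by writing script_to_be_written_by_job1 and making it executable:
#!/bin/bash
# every submission script exists up front, so the whole chain can be
# submitted at once with plain afterok dependencies
job1_id=$(sbatch --parsable job_1.sh)                                # writes script_to_be_written_by_job1
job2_id=$(sbatch --parsable --dependency=afterok:$job1_id job_2.sh)  # static wrapper shown above
job3_id=$(sbatch --parsable --dependency=afterok:$job2_id job_3.sh)
If making the generated script executable from job_1 is inconvenient, the wrapper could instead call it with bash script_to_be_written_by_job1.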
Comments:
– KamilCuk (Feb 3 at 15:53): sbatch --wait
– desert_ranger (Feb 3 at 15:58): The problem with sbatch --wait is that the job submission script will have to wait until the current job executes. My objective was to submit all jobs in one go.
– desert_ranger (Feb 3 at 16:06): write your own pipeline.
– KamilCuk (Feb 3 at 16:09): "to submit all jobs in one go" – You can submit the script that uses sbatch --wait.
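For completeness, the sbatch --wait suggestion from the comments would look roughly like this: the pipeline itself is submitted as a job, and each stage blocks until the previous one has finished (a sketch; the job name, output file, and time limit are placeholders):
#!/bin/bash
#SBATCH --job-name=pipeline
#SBATCH --output=pipeline_%j.out
#SBATCH --time=01:00:00
# each sbatch --wait call blocks until the submitted job terminates, so the
# stages run strictly in order even though job_2.sh does not exist yet when
# this pipeline job is submitted
sbatch --wait job_1.sh   # generates job_2.sh
sbatch --wait job_2.sh   # exists by the time this line runs
sbatch --wait job_3.sh
As desert_ranger notes in the comments, the drawback is that this driver job sits and waits on each stage rather than having everything queued up front.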