bash - parallel: Error: $XDG_CACHE_HOME can only contain [-a-z0-9_+,.%:/= ] - Stack Overflow


I am trying to run 3D-DNA on CU's supercomputing cluster Alpine to assemble a genome from long-read and short-read/contact data (PacBio HiFi and Arima Hi-C). 3D-DNA uses GNU Parallel to parallelize several steps in the assembly process, and GNU Parallel appears to follow the XDG Base Directory specification. I have had trouble running it because the $TMPDIR and $XDG_CACHE_HOME variables seem to be rejected as incorrectly defined. I have defined both in .bashrc and .bash_profile as follows:

export TMPDIR=/scratch/alpine/.colostate.edu/username/463/juicedir/tmp
export XDG_CACHE_HOME=/scratch/alpine/.colostate.edu/username/463/juicedir/cache

When I submit the job, it runs for ~25 seconds and I get this output:

###############
Starting iterating scaffolding with editing:
...starting round 0 of scaffolding:
:) -p flag was triggered. Running LIGer with GNU Parallel support parameter set to true.
:) -s flag was triggered, starting calculations with 15000 threshold starting contig/scaffold size
:) -q flag was triggered, starting calculations with 1 threshold mapping quality
...Using cprops file: 463_scaffolds.0.cprops
...Using merged_nodups file: 463_scaffolds.mnd.0.txt
...Scaffolding all scaffolds and contigs greater or equal to 15000 bp.
...Starting iteration # 1
parallel: Error: $XDG_CACHE_HOME can only contain [-a-z0-9_+,.%:/= ].
:) DONE!
...visualizing round 0 results:
:) -p flag was triggered. Running with GNU Parallel support parameter set to true.
:) -q flag was triggered, starting calculations for 1 threshold mapping quality
:) -i flag was triggered, building mapq without
:) -c flag was triggered, will remove temporary files after completion
...Remapping contact data from the original contig set to assembly
parallel: Error: $XDG_CACHE_HOME can only contain [-a-z0-9_+,.%:/= ].

This style of output continues; the program essentially runs with empty files that it creates, and the only error I can identify is

parallel: Error: $XDG_CACHE_HOME can only contain [-a-z0-9_+,.%:/= ].

I can't find a similar error reported elsewhere. Other background info is that originally I was getting

parallel: Error: $TMPDIR can only contain [-a-z0-9_+,.%:/= ].

and I went into each individual .sh file that the program calls and defined $TMPDIR with the --tmpdir flag in every GNU parallel command.

The last thing I tried was creating $HOME/.cache as a symlink to my desired cache folder in scratch storage. That didn't work either.

Any ideas or experience greatly appreciated.

asked Jan 8 at 8:06 by Juliette Lewis
  • 4 What's the output of printf '%q\n' "$XDG_CACHE_HOME"? My guess is that you'll see a \r at the end, which would mean that you got CRLF line endings in your file – Fravadona Commented Jan 8 at 8:08
  • Not your current problem but regarding "I have defined both in .bashrc and .bash_profile as such" - you don't need to define a variable in both files, just 1 or the other. – Ed Morton Commented Jan 8 at 12:45
  • On login-ci4, printf '%q\n' "$XDG_CACHE_HOME" prints /scratch/alpine/.colostate.edu/username/463/juicedir/cache. @Fravadona it gives the expected output – Juliette Lewis Commented Jan 8 at 15:11
  • Maybe it's a locale issue - try adding export LC_ALL=C before running your code. – Ed Morton Commented Jan 8 at 18:15
  • @EdMorton unfortunately that didn't work – Juliette Lewis Commented Jan 8 at 19:15

1 Answer

On CU Alpine, the following worked for me:

  • Add the --tmpdir=/tmp option to the parallel call
  • Set $XDG_CACHE_HOME with export XDG_CACHE_HOME=/tmp

Full example:

export XDG_CACHE_HOME=/tmp

parallel --tmpdir=/tmp echo ::: 1 2 3
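
To see why /tmp passes while another path might not, one option is to delete every character in the class quoted by the error message and look at what is left. The sketch below does that for both variables. The helper name offending_chars is hypothetical, and treating uppercase letters as allowed is an assumption, since the message only lists a-z. It approximates the check that parallel reports; it does not reproduce parallel's own code.

```bash
#!/usr/bin/env bash
# Print the characters of a value that fall outside the class quoted in the
# error message: [-a-z0-9_+,.%:/= ]
# Assumption: uppercase letters are also accepted (the message lists only a-z).
offending_chars() {
    # tr -d deletes every allowed character; whatever survives is suspect.
    # The hyphen is placed last so tr reads it as a literal character.
    printf '%s' "$1" | tr -d 'a-zA-Z0-9_+,.%:/= -'
}

# Check the two variables named in the error messages, as this shell sees them.
for name in TMPDIR XDG_CACHE_HOME; do
    value="${!name}"                      # indirect expansion: value of $TMPDIR etc.
    bad="$(offending_chars "$value")"
    if [ -n "$bad" ]; then
        # %q makes invisible characters such as a trailing \r visible.
        printf '%s contains characters outside the class: %q\n' "$name" "$bad"
    else
        printf '%s only uses characters from the class: %q\n' "$name" "$value"
    fi
done
```

Running this inside the submitted job rather than on the login node shows the values the job actually inherits, which may differ from what an interactive shell prints.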