I'm trying to learn Nextflow, but it's not going very well. I'm using paired-end NGS data to build a Nextflow pipeline that goes from FASTQ files to VCF files. However, I got stuck right at the beginning, as shown in the code below. The first process, soapnuke, works fine, but when I pass the files from its output channels (clean_fq1 and clean_fq2) to the next process, I get an error: No such variable: from. The screenshot below shows the message. What should I do? Thanks for any help.

[Screenshot of the error message]

params.fq1 = "/data/mPCR/220213_I7_V350055104_L3_SZPVL22000812-81/*1.fq.gz"
params.fq2 = "/data/mPCR/220213_I7_V350055104_L3_SZPVL22000812-81/*2.fq.gz"
params.index = "/home/duxu/project/data/index.list"
params.primer = "/home/duxu/project/data/primer_*.fasta"
params.output='results'

fq1 = Channel.fromPath(params.fq1)
fq2 = Channel.fromPath(params.fq2)
index = Channel.fromPath(params.index)
primer = Channel.fromPath(params.primer)

process soapnuke{
    conda'soapnuke'
    tag{"soapnuk ${fq1} ${fq2}"}
    publishDir "${params.outdir}/SOAPnuke", mode: 'copy'
    input:
        file rawfq1 from fq1
        file rawfq2 from fq2    

    output:
        file 'clean1.fastq.gz' into clean_fq1
        file 'clean2.fastq.gz' into clean_fq2
    
script:
    """
    SOAPnuke filter -1 $rawfq1 -2 $rawfq2 -l 12 -q 0.5 -Q 2 -o . \
        -C clean1.fastq.gz -D clean2.fastq.gz
    """
}

I get stuck on this:

process barcode_splitter{
    conda'barcode_splitter'
    tag{"barcode_splitter ${fq1} ${fq2}"}
    publishDir "${params.outdir}/barcode_splitter", mode: 'copy'
    input:
        file split1 from clean_fq1
        file split2 from clean_fq2
        index from params.index

    output:
       file '*-read-1.fastq.gz' into trimmed_index1
       file '*-read-2.fastq.gz' into trimmed_index2

    script:
    """
    barcode_splitter --bcfile $index $split1 $split2  --idxread 1 2 --mismatches 1 --suffix .fastq --gzipout
    """
}
  • I ran into a case where I forgot the colon (':') after emit, writing emit someChannel. Nextflow then reported the variable used in the output channel as undefined, even though it was defined in the script: section, because the parser was thrown off by the syntax error in the output: section. The Nextflow parser could give better syntax error messages. I hope this helps anyone who runs into the same problem. Commented Aug 18, 2023 at 22:29

1 Answer

The code below will produce the error you see:

index = Channel.fromPath( params.index )

process barcode_splitter {
     ...

     input:
     index from params.index

     ...
}

What you want is:

index = file( params.index )

process barcode_splitter {
     ...

     input:
     path index

     ...
}

Note that when the file input name is the same as the channel name, the from channel declaration can be omitted. I also used the path qualifier above, as it should be preferred over the file qualifier when using Nextflow 19.10.0 or later.
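
On a release older than 19.10.0, the same input could be declared with the file qualifier instead. Here is a minimal DSL1 sketch, untested, in which the process name show_index and its head command are made up purely for illustration:

```groovy
// DSL1 sketch for releases that predate the 'path' qualifier.
// 'show_index' and the 'head' command are hypothetical placeholders.
params.index = "/home/duxu/project/data/index.list"

// A plain file object rather than a channel, as in the snippet above
index = file( params.index )

process show_index {

    input:
    // Same name as the variable above, so 'from index' can be omitted
    file index

    script:
    """
    head "${index}"
    """
}
```

If I remember correctly, the tuple qualifier is also newer than some old releases, where the set qualifier plays the same role, so check the documentation for your version before relying on either.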

You may also want to consider refactoring to use the fromFilePairs factory method. Here's one way, untested of course:

params.reads = "/data/mPCR/220213_I7_V350055104_L3_SZPVL22000812-81/*_{1,2}.fq.gz"
params.index = "/home/duxu/project/data/index.list"
params.outdir = 'results'

reads_ch = Channel.fromFilePairs( params.reads )
index = file( params.index )


process soapnuke {

    tag { sample }

    publishDir "${params.outdir}/SOAPnuke", mode: 'copy'
    conda 'soapnuke'

    input:
    tuple val(sample), path(reads) from reads_ch

    output:
    tuple val(sample), path('clean{1,2}.fastq.gz') into clean_reads_ch

    script:
    def (rawfq1, rawfq2) = reads

    """
    SOAPnuke filter \\
        -1 "${rawfq1}" \\
        -2 "${rawfq2}" \\
        -l 12 \\
        -q 0.5 \\
        -Q 2 \\
        -o . \\
        -C "clean1.fastq.gz" \\
        -D "clean2.fastq.gz"
    """
}

process barcode_splitter {

    tag { sample }

    publishDir "${params.outdir}/barcode_splitter", mode: 'copy'
    conda 'barcode_splitter'

    input:
    tuple val(sample), path(reads) from clean_reads_ch
    path index

    output:
    tuple val(sample), path('*-read-{1,2}.fastq.gz') into trimmed_index

    script:
    def (split1, split2) = reads

    """
    barcode_splitter \\
        --bcfile \\
        "${index}" \\
        "${split1}" \\
        "${split2}" \\
        --idxread 1 2 \\
        --mismatches 1 \\
        --suffix ".fastq" \\
        --gzipout
    """
}
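
Recent Nextflow releases use DSL2, where the from and into keywords are dropped and processes are wired together in a workflow block. The following is a rough DSL2 sketch of the same two steps, equally untested. It assumes a DSL2-capable release and reuses the paths, tool options and process names from above; params.outdir is the parameter name assumed for the output directory:

```groovy
// DSL2 sketch; assumes a Nextflow release with DSL2 support
nextflow.enable.dsl = 2

params.reads  = "/data/mPCR/220213_I7_V350055104_L3_SZPVL22000812-81/*_{1,2}.fq.gz"
params.index  = "/home/duxu/project/data/index.list"
params.outdir = 'results'

process soapnuke {

    tag { sample }

    publishDir "${params.outdir}/SOAPnuke", mode: 'copy'
    conda 'soapnuke'

    input:
    tuple val(sample), path(reads)

    output:
    tuple val(sample), path('clean{1,2}.fastq.gz')

    script:
    def (rawfq1, rawfq2) = reads
    """
    SOAPnuke filter \\
        -1 "${rawfq1}" \\
        -2 "${rawfq2}" \\
        -l 12 -q 0.5 -Q 2 \\
        -o . \\
        -C "clean1.fastq.gz" \\
        -D "clean2.fastq.gz"
    """
}

process barcode_splitter {

    tag { sample }

    publishDir "${params.outdir}/barcode_splitter", mode: 'copy'
    conda 'barcode_splitter'

    input:
    tuple val(sample), path(reads)
    path index

    output:
    tuple val(sample), path('*-read-{1,2}.fastq.gz')

    script:
    def (split1, split2) = reads
    """
    barcode_splitter \\
        --bcfile "${index}" \\
        "${split1}" "${split2}" \\
        --idxread 1 2 \\
        --mismatches 1 \\
        --suffix ".fastq" \\
        --gzipout
    """
}

workflow {
    reads_ch = Channel.fromFilePairs( params.reads )
    index    = file( params.index )

    // Each process is called like a function; its output feeds the next one
    soapnuke( reads_ch )
    barcode_splitter( soapnuke.out, index )
}
```

The process bodies are essentially unchanged. Only the channel plumbing moves out of the input and output declarations and into the workflow block.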

3 Comments

Thank you Steve for your answer, it was very useful. However, after rewriting my script according to the one you provided, I get the following error: ERROR ~ Unknown process directive: path Did you mean of these? each
My Nextflow version is 19.01.0; I installed it with Conda. I'm not sure whether I can use DSL2 with this version.
No worries at all @Daffy. To fix the error above, you can simply replace the 'path' qualifier with the 'file' qualifier in your input and output declarations. Alternatively, consider upgrading your Nextflow version to 19.10.0 or later. Note that DSL2 is the default syntax as of version 22.04.0, but it's still possible to use the old DSL by, for example, appending -dsl1 on the command line. The 'path' qualifier isn't specific to DSL2.
