I have many files in a folder. And I want to open and read them in the sequence order depending on a reference file. my file name:
AAAAA_AAAAA.CCCCC3.1.bbb.DDDDD.1.fa
AAAAA_AAAAA.CCCCC3.1.bbb.DDDDD.2.fa
AAAAA_AAAAA.CCCCC3.1.bbb.DDDDD.3.fa
AAAAA_AAAAA.CCCCC3.1.bbb.DDDDD.4.fa
.
.
.
The reference file structure:
chr1 744 745
chr1 1208 1209
chr2 1250 1251
chr2 1454 1455
chr3 1676 1677
chr3 1683 1684
The input file structure:
AAAAA_AAAAA.CCCCC3.1.bbb.DDDDD.1.fa
>1 dna:
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTATGTGAGAAGATAGCTGAA
CGCCTTGTCCACATCATCTTACTGCTGAGAGTTGAGCTCACCCTCAGTCCCTCACAGTTC
AAAAA_AAAAA.CCCCC3.1.bbb.DDDDD.2.fa
>2 dna:
GAGAGCTGGCTTCTAGGCATGCTTCCTTTTGAGAGCTGAGGACAGGACAGAACCCTCCCG
CATCCTGCCTGACTGTAGACGTACCTGCTAACCTCCTCATGTTAGTGGCTGGGATAGATT
GTGGGAAAAGCATGTGTAAGCATTGGGCCTGAACTCCCGTGTATCTGAGTTGAATACAGC
GATTTCCAACATCCTTCTTCAATAGGAGTGTAGCTAGGTTCCAACTCCCATGTCCGAGTG
GGTAGCAGACATCTGCCTTCCATGCATACACACTTCTGAGAGTTGAGCTTATGGCCTGTA
ACCCTACCTCCTGCCTGCAGCTACCTTTTGCTTCCAAAAGTCCTAGGCTCGCTGCTTCAC
CAAAGTGTTGGGAGAGGTAACTGTTGTCTCCCGGCACACAAGACTAGTGCCTCCAAGCTC
AATCCAGCGATTTCCCAGTAATTCCTGGGTTAGACTGGTGCTACATACTAAGTTCCATAC
GTGAGTAGGTAGTTGAAAGCCTTGTCCAAAAACATCTTACTTCTGAGAGTTGAGCTCACC
CTCAGTCCCTCACAGTTCCACACTGCCTGCAGAGTGAGTTTCCCACGTCTTCATCAGAGA
CTTTTGCCAGAGGCTTCTGAGACGCAAGTTAACAATGCAAACAGGAGGGTATACCCAGGT
GCAGTAGATTGGTTATCTGGGAACCTCCTTACTCAGAATACTGTTACCTTCACACTGTCA
TAAGAATGCAGCTAGTTGAGAGCTGGCTTCTAGGCATGCTTCCCTGTGAGAGCTGAGGAC
my outputs:
chr1 A
chr1 G
chr2 C
chr2 C
chr3 T
chr3 T
I can use bioperl to find the position and print out the values one by one (file by file).
Then I try to do the open and read files from a folder.
my $dir = '/home/Documents/Folder/';
opendir(DIR, $dir) or die $!;
my @files = grep (/.fa$/, readdir(DIR));
for my $list(@files){ ##try to get the last number from file name##
my @lines = split /\./, $list}
open and read my reference file
open my $POS, '<', 'CanFam3_SNP_POS.txt' or die $!;
I put all the files into an array and sort them.
my @sorted = @files;
foreach my $i (0..$#sorted)
Then I try to use a loop control to open and read a file depending on reference file column 1 values. For example chr1, the AAAAA_AAAAA.CCCCC3.1.bbb.DDDDD.1.fa should be read and process. If reading the chr2 from reference file, break the loop, and then open and read the AAAAA_AAAAA.CCCCC3.1.bbb.DDDDD.2.fa, process the file with chr2.
open my $fh, '<', "/home/Documents/Folder/$sorted[$i]" or die $!;
while (my $line = <$POS>){
chomp($line);
if ($line =~ /chr$lines[5]/g){
my @positions = split (/\t/, $line);
print "$positions[0]","\t","$positions[1]","\t", substr($so->seq(),
$positions[1], $positions[2] - $positions[1]),"\n";
last if ($line !~ /chr$lines[5]/g)
}
}
I think I have some problems with this codes. Can I use perl to do this process? Do I misunderstand some points?