For the relevant parameters, see the previous post.
1. Usage example 1:
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ art_illumina -ss HS20 -i GRCH38chr1L3556522.fna -l 100 -f 20 -o G38L100F20Nhs20
====================ART====================
ART_Illumina (2008-2016)
Q Version 2.5.1 (Apr 17, 2016)
Contact: Weichun Huang <whduke@gmail.com>
-------------------------------------------
Single-end Simulation
Total CPU time used: 1162.71
The random seed for the run: 1464879720
Parameters used during run
Read Length: 100
Genome masking 'N' cutoff frequency: 1 in 100
Fold Coverage: 20X
Profile Type: Combined
ID Tag:
Quality Profile(s)
First Read: HiSeq 2000 Length 100 R1 (built-in profile)
Output files
FASTQ Sequence File:
G38L100F20Nhs20.fq
ALN Alignment File:
G38L100F20Nhs20.aln
2. Usage example 2:
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ art_illumina -ss HS25 -sam -i GRCH38chr1L3556522.fna -p -l 150 -f 20 -m 200 -s 10 -o paired_dat
====================ART====================
ART_Illumina (2008-2016)
Q Version 2.5.1 (Apr 17, 2016)
Contact: Weichun Huang <whduke@gmail.com>
-------------------------------------------
Paired-end sequencing simulation
Total CPU time used: 1070.33
The random seed for the run: 1464880583
Parameters used during run
Read Length: 150
Genome masking 'N' cutoff frequency: 1 in 150
Fold Coverage: 20X
Mean Fragment Length: 200
Standard Deviation: 10
Profile Type: Combined
ID Tag:
Quality Profile(s)
First Read: HiSeq 2500 Length 150 R1 (built-in profile)
First Read: HiSeq 2500 Length 150 R2 (built-in profile)
Output files
FASTQ Sequence Files:
the 1st reads: paired_dat1.fq
the 2nd reads: paired_dat2.fq
ALN Alignment Files:
the 1st reads: paired_dat1.aln
the 2nd reads: paired_dat2.aln
SAM Alignment File:
paired_dat.sam
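The two invocations above differ only in a handful of flags. As a minimal sketch, the helper below assembles the argument list from the flags that appear in this post (-ss, -sam, -i, -p, -l, -f, -m, -s, -o). The name `build_art_command` and its parameters are illustrative assumptions, not part of ART, and the function only builds the list without running anything.

```python
from typing import List, Optional


def build_art_command(
    seq_system: str,
    reference: str,
    read_length: int,
    out_prefix: str,
    fold_coverage: Optional[float] = None,
    paired: bool = False,
    mean_fragment: Optional[int] = None,
    fragment_sd: Optional[int] = None,
    write_sam: bool = False,
) -> List[str]:
    """Assemble an art_illumina argument list (hypothetical helper)."""
    cmd = ["art_illumina", "-ss", seq_system]
    if write_sam:
        cmd.append("-sam")
    cmd += ["-i", reference]
    if paired:
        # Example 2 passes a mean fragment length and a standard deviation
        # together with -p, so this sketch requires both.
        if mean_fragment is None or fragment_sd is None:
            raise ValueError("paired-end runs need mean_fragment and fragment_sd")
        cmd.append("-p")
    cmd += ["-l", str(read_length)]
    if fold_coverage is not None:
        cmd += ["-f", format(fold_coverage, "g")]
    if paired:
        cmd += ["-m", str(mean_fragment), "-s", str(fragment_sd)]
    cmd += ["-o", out_prefix]
    return cmd


# Rebuild the paired-end command line from example 2.
paired_cmd = build_art_command(
    "HS25", "GRCH38chr1L3556522.fna", 150, "paired_dat",
    fold_coverage=20, paired=True, mean_fragment=200, fragment_sd=10,
    write_sam=True,
)
print(" ".join(paired_cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` when the run should actually be started.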
List the files:
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ ll -h
total 50G
drwxrwxr-x 2 hadoop hadoop 4.0K 6月 2 23:16 ./
drwxrwxr-x 6 hadoop hadoop 4.0K 6月 2 22:59 ../
-rw-rw-r-- 1 hadoop hadoop 11G 6月 2 23:29 G38L100F20Nhs20.aln
-rw-rw-r-- 1 hadoop hadoop 9.4G 6月 2 23:29 G38L100F20Nhs20.fq
-rw-r--r-- 1 hadoop hadoop 241M 6月 2 23:00 GRCH38chr1L3556522.fna
-rw-rw-r-- 1 hadoop hadoop 2.5K 6月 2 23:09 GRCH38chr1L3556522.fna.amb
-rw-rw-r-- 1 hadoop hadoop 144 6月 2 23:09 GRCH38chr1L3556522.fna.ann
-rw-rw-r-- 1 hadoop hadoop 238M 6月 2 23:09 GRCH38chr1L3556522.fna.bwt
-rw-rw-r-- 1 hadoop hadoop 60M 6月 2 23:09 GRCH38chr1L3556522.fna.pac
-rw-rw-r-- 1 hadoop hadoop 119M 6月 2 23:10 GRCH38chr1L3556522.fna.sa
-rw-rw-r-- 1 hadoop hadoop 4.9G 6月 2 23:42 paired_dat1.aln
-rw-rw-r-- 1 hadoop hadoop 4.6G 6月 2 23:42 paired_dat1.fq
-rw-rw-r-- 1 hadoop hadoop 4.8G 6月 2 23:42 paired_dat2.aln
-rw-rw-r-- 1 hadoop hadoop 4.6G 6月 2 23:42 paired_dat2.fq
-rw-rw-r-- 1 hadoop hadoop 11G 6月 2 23:42 paired_dat.sam
The generated files are all very large.
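The sizes follow from the coverage setting: fold coverage is conventionally (number of reads × read length) / genome length, so the read count can be estimated from -f and -l. The sketch below uses the chr1 length that ART prints in its .aln header (248956422). The per-record byte count is an assumption (a header line of about 15 bytes, then the sequence, "+" and quality lines), and the estimate ignores regions that ART masks because of 'N' bases, so it should be read as an upper bound.

```python
def estimate_reads(genome_length: int, fold_coverage: float, read_length: int) -> int:
    """Reads needed so that reads * read_length / genome_length == fold_coverage."""
    return round(genome_length * fold_coverage / read_length)


def estimate_fastq_bytes(n_reads: int, read_length: int, header_bytes: int = 15) -> int:
    """Rough FASTQ size: header line + sequence line + '+' line + quality line.

    header_bytes is an assumed average length of the '@id' line including
    its newline; adjust it for other read-id schemes.
    """
    per_record = header_bytes + (read_length + 1) + 2 + (read_length + 1)
    return n_reads * per_record


CHR1_LENGTH = 248956422  # from the @SQ line in the .aln header

n_reads = estimate_reads(CHR1_LENGTH, fold_coverage=20, read_length=100)
size_gib = estimate_fastq_bytes(n_reads, read_length=100) / 2**30
print(f"~{n_reads} reads, ~{size_gib:.1f} GiB of FASTQ")
```

For the single-end run this gives roughly 50 million reads and on the order of 10 GiB of FASTQ, the same order of magnitude as the 9.4G file in the listing.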
3. Specify the number of reads generated per sequence with -c (the output is much smaller):
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ art_illumina -ss HS20 -i GRCH38chr1L3556522.fna -l 100 -c 50 -o G38L100c50Nhs20
====================ART====================
ART_Illumina (2008-2016)
Q Version 2.5.1 (Apr 17, 2016)
Contact: Weichun Huang <whduke@gmail.com>
-------------------------------------------
Single-end Simulation
Total CPU time used: 15.96
The random seed for the run: 1464918709
Parameters used during run
Read Length: 100
Genome masking 'N' cutoff frequency: 1 in 100
Fold Coverage: 0X
Profile Type: Combined
ID Tag:
Quality Profile(s)
First Read: HiSeq 2000 Length 100 R1 (built-in profile)
Output files
FASTQ Sequence File:
G38L100c50Nhs20.fq
ALN Alignment File:
G38L100c50Nhs20.aln
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ ls
G38L100c50Nhs20.aln G38L100F20Nhs20.aln GRCH38chr1L3556522.fna GRCH38chr1L3556522.fna.ann GRCH38chr1L3556522.fna.pac paired_dat1.aln paired_dat2.aln paired_dat.sam
G38L100c50Nhs20.fq G38L100F20Nhs20.fq GRCH38chr1L3556522.fna.amb GRCH38chr1L3556522.fna.bwt GRCH38chr1L3556522.fna.sa paired_dat1.fq paired_dat2.fq
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ ll
total 51506772
drwxrwxr-x 2 hadoop hadoop 4096 6月 3 09:51 ./
drwxrwxr-x 6 hadoop hadoop 4096 6月 2 22:59 ../
-rw-rw-r-- 1 hadoop hadoop 11400 6月 3 09:52 G38L100c50Nhs20.aln
-rw-rw-r-- 1 hadoop hadoop 10428 6月 3 09:52 G38L100c50Nhs20.fq
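A quick way to confirm how many reads a run actually produced is to count FASTQ records. The sketch below assumes the four-lines-per-record layout seen in the ART output in this post (no wrapped sequence lines); `count_fastq_records` is an illustrative name. The check is worth doing here because the appendix listing goes from chr1-50 straight to chr1-48, with no chr1-49, so the file may hold fewer than 50 records.

```python
import io
from typing import TextIO


def count_fastq_records(handle: TextIO) -> int:
    """Count records in a FASTQ stream with exactly four lines per record."""
    n_lines = 0
    for i, line in enumerate(handle):
        # Every record must start with an '@' header line.
        if i % 4 == 0 and not line.startswith("@"):
            raise ValueError(f"line {i + 1}: expected a header starting with '@'")
        # The third line of every record must start with '+'.
        if i % 4 == 2 and not line.startswith("+"):
            raise ValueError(f"line {i + 1}: expected a '+' separator line")
        n_lines += 1
    if n_lines % 4 != 0:
        raise ValueError("truncated FASTQ: line count is not a multiple of 4")
    return n_lines // 4


# Toy input for illustration only (not data from the runs above).
toy = io.StringIO("@toy-1\nACGT\n+\nIIII\n@toy-2\nTTGA\n+\nIIII\n")
print(count_fastq_records(toy))

# Usage sketch with a file name from this post:
# with open("G38L100c50Nhs20.fq") as fh:
#     print(count_fastq_records(fh))
```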
4. Generate a single read:
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ art_illumina -ss HS20 -i GRCH38chr1L3556522.fna -l 100 -c 1 -o G38L100c1Nhs20
====================ART====================
ART_Illumina (2008-2016)
Q Version 2.5.1 (Apr 17, 2016)
Contact: Weichun Huang <whduke@gmail.com>
-------------------------------------------
Single-end Simulation
Total CPU time used: 15.82
The random seed for the run: 1464918910
Parameters used during run
Read Length: 100
Genome masking 'N' cutoff frequency: 1 in 100
Fold Coverage: 0X
Profile Type: Combined
ID Tag:
Quality Profile(s)
First Read: HiSeq 2000 Length 100 R1 (built-in profile)
Output files
FASTQ Sequence File:
G38L100c1Nhs20.fq
ALN Alignment File:
G38L100c1Nhs20.aln
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ cat G38L100c1Nhs20.fq
@chr1-1
CATATTTACCAATTAAAGTCACAAAATATTTCTCATTATTTATTCATGCAGGTAACTGAGACAAAGATAGTGCAGAAATCAACTTTAAATAAAAAATTAT
+
@C@D@FFDFHHHHIJ.JBIJJGJGIJ:G47JHJ@IJJ91BJJIGHHHEIJDGD=IJJJBJJ'DG=3D)<D?HCHBFAE?GEDC5D5ECD<CD<DBADDBE
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ cat G38L100c1Nhs20.aln
##ART_Illumina read_length 100
@CM art_illumina -ss HS20 -i GRCH38chr1L3556522.fna -l 100 -c 1 -o G38L100c1Nhs20 -rs 1464918910
@SQ chr1 AC:CM000663.2 gi:568336023 LN:248956422 rl:Chromosome M5:6aef897c3d6ff0c78aff06ac189178dd AS:GRCh38 248956422
##Header End
>chr1 chr1-1 225496693 +
CATATTTACCAATTAAAGTCACAAAATATTTCTCATTATTTATTCATGCAGGTAACTGAGAAAAAGATAGTGCAGAAATCAACTTTAAATAAAAAATTAT
CATATTTACCAATTAAAGTCACAAAATATTTCTCATTATTTATTCATGCAGGTAACTGAGACAAAGATAGTGCAGAAATCAACTTTAAATAAAAAATTAT
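Each .aln record carries two sequence lines: the reference segment first, then the simulated read. The simulated sequencing errors are the columns where the two differ. A minimal sketch for locating them, assuming substitutions only (gap characters from indels are not handled) and using the illustrative name `mismatch_columns`:

```python
from typing import List, Tuple


def mismatch_columns(ref_aln: str, read_aln: str) -> List[Tuple[int, str, str]]:
    """Return (0-based column, reference base, read base) for every difference."""
    if len(ref_aln) != len(read_aln):
        raise ValueError("aligned lines must have the same length")
    return [
        (i, r, q)
        for i, (r, q) in enumerate(zip(ref_aln, read_aln))
        if r != q
    ]


# The two sequence lines of record chr1-1 shown above.
ref_line = "CATATTTACCAATTAAAGTCACAAAATATTTCTCATTATTTATTCATGCAGGTAACTGAGAAAAAGATAGTGCAGAAATCAACTTTAAATAAAAAATTAT"
read_line = "CATATTTACCAATTAAAGTCACAAAATATTTCTCATTATTTATTCATGCAGGTAACTGAGACAAAGATAGTGCAGAAATCAACTTTAAATAAAAAATTAT"

for column, ref_base, read_base in mismatch_columns(ref_line, read_line):
    # Single difference: column 61, reference A, read C.
    print(column, ref_base, read_base)
```

The single substitution at column 61 agrees with the MD:Z:61A38 tag that bwa reports for the same read in section 5.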
5. Verify with bwa:
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ cat G38L100c1Nhs20.sam
@SQ SN:chr1 LN:248956422
@PG ID:bwa PN:bwa VN:0.7.13-r1126 CL:bwa samse GRCH38chr1L3556522.fna G38L100c1Nhs20.sai G38L100c1Nhs20.fq
chr1-1 0 chr1 225496694 37 100M * 0 0 CATATTTACCAATTAAAGTCACAAAATATTTCTCATTATTTATTCATGCAGGTAACTGAGACAAAGATAGTGCAGAAATCAACTTTAAATAAAAAATTAT @C@D@FFDFHHHHIJ.JBIJJGJGIJ:G47JHJ@IJJ91BJJIGHHHEIJDGD=IJJJBJJ'DG=3D)<D?HCHBFAE?GEDC5D5ECD<CD<DBADDBE XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:61A38
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ cat G38L100c1Nhs20.aln
##ART_Illumina read_length 100
@CM art_illumina -ss HS20 -i GRCH38chr1L3556522.fna -l 100 -c 1 -o G38L100c1Nhs20 -rs 1464918910
@SQ chr1 AC:CM000663.2 gi:568336023 LN:248956422 rl:Chromosome M5:6aef897c3d6ff0c78aff06ac189178dd AS:GRCh38 248956422
##Header End
>chr1 chr1-1 225496693 +
CATATTTACCAATTAAAGTCACAAAATATTTCTCATTATTTATTCATGCAGGTAACTGAGAAAAAGATAGTGCAGAAATCAACTTTAAATAAAAAATTAT
CATATTTACCAATTAAAGTCACAAAATATTTCTCATTATTTATTCATGCAGGTAACTGAGACAAAGATAGTGCAGAAATCAACTTTAAATAAAAAATTAT
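Comparing the two outputs shows that ART reports positions starting from 0 (225496693 in the .aln file), consistent with ADAM, whereas bwa's SAM output starts from 1 (225496694).
How can the accuracy of aligners such as bwa be evaluated automatically?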
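One possible approach is to treat the ART .aln file as ground truth and compare it with the aligner's SAM output read by read. The sketch below is an assumption-laden starting point, not an established tool: the function names are hypothetical, it handles single-end reads only, it expects unclipped alignments (100M, as in this post), and it converts '-' strand records with `reference_length - pos - span`, a rule inferred from the data in this post (for chr1-46, 248956422 - 211481565 - 100 + 1 = 37474758, which is the position bwa reports).

```python
from typing import Dict, Tuple

Position = Tuple[str, int]  # (reference name, 1-based leftmost position)


def parse_art_aln(text: str) -> Dict[str, Position]:
    """Map read id -> expected SAM-style position from ART .aln text."""
    ref_len: Dict[str, int] = {}
    truth: Dict[str, Position] = {}
    lines = text.splitlines()
    i = 0
    # Header: '@SQ <name> ... <length>' lines give the reference lengths.
    while i < len(lines) and not lines[i].startswith("##Header End"):
        fields = lines[i].split()
        if fields and fields[0] == "@SQ":
            ref_len[fields[1]] = int(fields[-1])
        i += 1
    i += 1  # skip the '##Header End' line
    # Records: '>ref read_id pos strand', reference line, read line.
    while i + 1 < len(lines):
        if not lines[i].startswith(">"):
            i += 1
            continue
        ref, read_id, pos, strand = lines[i][1:].split()[:4]
        # Reference bases covered by the alignment (gap characters removed).
        span = len(lines[i + 1].strip().replace("-", ""))
        start0 = int(pos)
        if strand == "-":
            # Assumption: '-' positions count from the start of the
            # reverse-complemented reference.
            start0 = ref_len[ref] - start0 - span
        truth[read_id] = (ref, start0 + 1)  # 0-based -> 1-based
        i += 3
    return truth


def parse_sam_positions(text: str) -> Dict[str, Position]:
    """Map read id -> (RNAME, POS) for primary, mapped SAM records."""
    calls: Dict[str, Position] = {}
    for line in text.splitlines():
        if not line or line.startswith("@"):
            continue
        fields = line.split()
        flag = int(fields[1])
        if flag & 0x4 or flag & 0x900:  # unmapped, secondary, supplementary
            continue
        calls[fields[0]] = (fields[2], int(fields[3]))
    return calls


def mapping_accuracy(aln_text: str, sam_text: str, tolerance: int = 0) -> float:
    """Fraction of simulated reads placed within `tolerance` bases of the truth."""
    truth = parse_art_aln(aln_text)
    calls = parse_sam_positions(sam_text)
    correct = 0
    for read_id, (ref, pos) in truth.items():
        call = calls.get(read_id)
        if call is not None and call[0] == ref and abs(call[1] - pos) <= tolerance:
            correct += 1
    return correct / len(truth) if truth else 0.0


# Usage sketch (file names taken from this post):
# with open("G38L100c50Nhs20.aln") as a, open("G38L100c50Nhs20.sam") as s:
#     print(mapping_accuracy(a.read(), s.read()))
```

A strict position match is conservative for repeats. In the appendix data, for example, chr1-34 originates at 13267477 on the forward strand, while bwa reports 12934213 with MAPQ 0 and lists chr1,+13267477 only as an XA alternative; this comparison counts that read as wrong. Filtering by MAPQ or also accepting XA hits would be reasonable refinements.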
6. Verify with SNAP:
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ cat G38L100c1Nhs20.snap.sam
@HD VN:1.4 SO:unsorted
@RG ID:FASTQ PL:Illumina PU:pu LB:lb SM:sm
@PG ID:SNAP PN:SNAP CL:single index G38L100c1Nhs20.fq -o G38L100c1Nhs20.snap.sam VN:1.0beta.23
@SQ SN:chr1__AC:CM000663.2__gi:568336023__LN:248956422__rl:Chromosome__M5:6aef897c3d6ff0c78aff06ac189178dd__AS:GRCh38 LN:248956422
chr1-1 0 chr1__AC:CM000663.2__gi:568336023__LN:248956422__rl:Chromosome__M5:6aef897c3d6ff0c78aff06ac189178dd__AS:GRCh38 225496694 70 100M * 0 0 CATATTTACCAATTAAAGTCACAAAATATTTCTCATTATTTATTCATGCAGGTAACTGAGACAAAGATAGTGCAGAAATCAACTTTAAATAAAAAATTAT @C@D@FFDFHHHHIJ.JBIJJGJGIJ:G47JHJ@IJJ91BJJIGHHHEIJDGD=IJJJBJJ'DG=3D)<D?HCHBFAE?GEDC5D5ECD<CD<DBADDBE PG:Z:SNAP NM:i:1 RG:Z:FASTQ PL:Z:Illumina PU:Z:pu LB:Z:lb SM:Z:sm
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ cat G38L100c1Nhs20.aln
##ART_Illumina read_length 100
@CM art_illumina -ss HS20 -i GRCH38chr1L3556522.fna -l 100 -c 1 -o G38L100c1Nhs20 -rs 1464918910
@SQ chr1 AC:CM000663.2 gi:568336023 LN:248956422 rl:Chromosome M5:6aef897c3d6ff0c78aff06ac189178dd AS:GRCh38 248956422
##Header End
>chr1 chr1-1 225496693 +
CATATTTACCAATTAAAGTCACAAAATATTTCTCATTATTTATTCATGCAGGTAACTGAGAAAAAGATAGTGCAGAAATCAACTTTAAATAAAAAATTAT
CATATTTACCAATTAAAGTCACAAAATATTTCTCATTATTTATTCATGCAGGTAACTGAGACAAAGATAGTGCAGAAATCAACTTTAAATAAAAAATTAT
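SNAP reports the same 1-based position as bwa (225496694), but its RNAME field carries the whole FASTA header, apparently with the spaces replaced by "__". Any script that compares reference names across ART, bwa and SNAP output therefore needs to normalize them first. A minimal sketch, assuming the short name never contains "__" itself (`short_ref_name` is an illustrative helper):

```python
def short_ref_name(rname: str) -> str:
    """Keep only the part of a reference name before the first '__'."""
    return rname.split("__", 1)[0]


snap_rname = (
    "chr1__AC:CM000663.2__gi:568336023__LN:248956422__rl:Chromosome"
    "__M5:6aef897c3d6ff0c78aff06ac189178dd__AS:GRCh38"
)
print(short_ref_name(snap_rname))
```

Names without the separator, such as the plain chr1 that bwa writes, pass through unchanged.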
Appendix
(1) Comparison with bwa for the 50-read run (-c 50):
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ cat G38L100c50Nhs20.sam
@SQ SN:chr1 LN:248956422
@PG ID:bwa PN:bwa VN:0.7.13-r1126 CL:bwa samse GRCH38chr1L3556522.fna G38L100c50Nhs20.sai G38L100c50Nhs20.fq
chr1-50 0 chr1 93465785 37 100M * 0 0 TTCCACAATAGTTGAACTAATTTACAGTCCCACCAACAGTGTAAAAGTGTTCCTATTTCTCCACATCCTCTCCAGCACCTGTTGTTTCCTGACTTTTTAA @@CDFDFDHFHGHIJH:IJJJ(JJE?JDIDEJIB@FGJIGBHJ()HG8(CIICGFFHEH=GI3@&@DD58FADDACHDDHFCD8D,DCC<CEFD<EDDCD XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-48 0 chr1 228133746 37 100M * 0 0 ATCATTGTATGCCACAGAAATAATTAAATTTCCTTGTCAACTGACACATTATTATTAGGCACTCTCACCAGATCTTTACCCATGGCCATTTAAAGTGTGG @>CFFFFFH<<GC1IIDCFJJHIGIHJ(IID7IJ,FJJJHJJJJ)GGBHIJFJFIFIHFE=HEIEE;CA)G0(D()HC@D(:EFDDC@;DDAC95(D?BD XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:44G55
chr1-47 0 chr1 13772988 37 100M * 0 0 TTCAGTAATTCAGAATAACACATGAGGGAATGAATGAATGAATAAATAAAAAAAAACTGAATGAATAAATTACAAAAAATTGTGTTTCAGGGAAGAAAAA CC@F(FFFDFH.HDHIGI(JIIIGGIEEJIIIHJJHHH3IJJIIJ3=EI>JDIGH((IBJCIEHGD>;J@HF+DC)CCCADBDBD+BDDDD5B5DDDE(C XT:A:U NM:i:2 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:56A15A27
chr1-46 16 chr1 37474758 37 100M * 0 0 GGGTCGGGGTCCTGTTCCCCGGTCCGCCGGGCCTCAGGACCCCTCCAACTTTGCCCAAGTTGGGAGAGCCGGGGAAGAGCACCAGGTTCCTGATCGGGAT (5CBACDDD>FBDDDDDEC:CE(CBDFDDHEFH;FGEFHGHDGJJJDIGI:JEHJ=JJJJJH8CI?JJJG9JIII>IJIIGJ=EIJGAHHHHFFDFDCC? XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:22C77
chr1-45 0 chr1 29056657 37 100M * 0 0 CTGGGATTACAGGTGCCCGCCACCATGCCCAGCTAATTTTTGTATTTTTGGTAGAGACAAGGTTTCACCATGTTGGCCGGGATTGTCTCGAACTCCTGAT B@@FFFFFHHG)HIJJJJBJIJCJHGJIBFJJI3IIHDF@JIAJ9JJJIJJBIJJ?BJID8F:HFHA(+D>J>CG>7D=DDFF@EDC3D<CDDDC@BD@B XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-44 0 chr1 49993893 37 100M * 0 0 CAATTTAGCCAAAACTGGCTAATCGTTTTACCAGAATCATTCCCATTGTTCAAGACCTATTTTAAGCTCCACTATCACCATAAAACTTTCCCGATCAGTT C@CFFFFDHHHHH<IEJJ@JJJHI)IDIBIJA:HJHFJJJIJGGJJIIIIHGGJJGH<(IIIJI?ICDG;CDHFHCDCCB?FDDED:CD:>DD5C&DDCD XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:24C75
chr1-43 16 chr1 194714506 37 100M * 0 0 AATATGTTTTAATAATATCATATTTAAATTTGATGATACTTTAAAAATGGTTCCATGTGTGTTCTCTTGGGTTATTTCACAATCAATAAAAGGTCTGCAA CCCCDDC@E>CDCDDC>D9CD=C)CGC>E@7.HF)DIBJBJJ.JEJEJ@JJIIIIGD?<IHH)FJJIIH*DJIBIIJHJIHHJFHIGHHHHDFDFDFC@B XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-42 0 chr1 35706203 37 100M * 0 0 CAGGTTCAAGCGATTCTCCTGCCTCAGCCTCCTGAGTAGCTGGGATTACAGGCACGTGCCACCATGCCTGGCAATTTTTGTATTTTTAGTACAGATGGGG CC@FDFFAHHGFHJIHJFJJII=@JEHIJIIJIJEJIJJHHGIJBBFJG6JJHJJG<F3JJHIFG(DCJDHDFDCHDDF7DDHBDFDDCDDCDCD;CBCB XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-41 16 chr1 156482338 37 100M * 0 0 GTGTGTGCATAGGCAGGTCTGCGTGTACATGCAACGTGGGCACGTGTCCATGTGGATGCAGGCGGGGGTATATCCTGGTGCCTGTGTGTATGGGCCCACC D;CCDDCDCDDDCD:EDA@<C<E(GDDDGDJDDHJ@CJJI,=FHJJIGJ7GEC?IGJJIFBBICHJEIJJHHAIJIJI.IJGJJGJJHHGHHFFFFFB=B XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-40 16 chr1 221779284 37 100M * 0 0 CATGGCACATAGCACTTTGGTGATGGGGACTGCTTTGCTAATGTCAGGGTCAAGGGGTGCATGGACCATGGGCAGAGTGCTGGGCTCAGCCAAATGGTTC DDBCDCDDDDDDD25F?DD@4I5HED?CAHGA?JJIIJB)IHFJJFCJII?@<HIIFGIJIJFG?JIJCIIJJ)IJJIJIJJIIEGJFHHHFFDFDFC@@ XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:39G60
chr1-39 16 chr1 3895605 37 100M * 0 0 GTCCTCTCCGGATTGACAGGAGTCAAAACATGAGATCGGCTTAGCTTCAGTTTCGTCATGGATTAACCACCTCCAAGGTGTCAACTCCAAAATGTCAAGA DD5CCAD&8DAD>D&FDDDCDBDD?6DD.FHDDIFE?@IDEGIBCGD?JFJ>JGBI,IJIF.JJIHJJJEIEGFJ=JJHJHHJFHIJHHHHHFFAFFC@B XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-38 16 chr1 33174926 37 100M * 0 0 CACACATACATATATGTGTGTATATATATATATATATATATATACACACATATACATATATATGCACACACACATGTATGTATATGTATATGTATATGTG CDDC(FBDC(AACBDDCBDDECEC5@H;HFDJFH>=FCHAHJFJ'H3JG9JFEHIJFDJJ9IJHEJIGJIJJJJJC;J?AJFJGEHFHDC<HDFFFDCC# XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:99A0
chr1-37 16 chr1 206124777 37 100M * 0 0 TCAGTCAATAGACATTTGGGTTGTTTCCACCTTGGGCAGGTTACAAATAATGCTGCTAGTGAACATTCATGTGCAAGTTTTTGTGTGGACATACGTTGTT CBCD8DDADC@DDDDB?CCHD;@AHECEEJIHAEII?E05GFJHHDJCJEJDBHJE7GJJJGJGGGJ=JIC(JIJHHIIIAGGJIIICHGHGFFFDF?CC XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-36 0 chr1 181673626 37 100M * 0 0 TCCACTGCCCAGAAAGAGGACATCCCTTATAGGACCAGCGGATGGAAGCCATGGGCTGGGCAGGACATTCCTGTCCCAACCCACATGGCAGCTAGAGTCC @@3DFFFFHHHFHJJIGGJHHFJJJIIJJJJDH*GJJJGJIJ6AIJIJDFDII=HFI2AH1AIEAAC?JEIEDJF.HDH@FAFDCDE2D:DDBD0DDBDD XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:33G66
chr1-35 16 chr1 152104780 37 100M * 0 0 ACCTCTATAAATACAATATCTTCAAATATGATTACATTCTGAGGTACTGAGGGTTAAGACTTCAACACGTGAACCTCTGTGGGGGTTGGGAGGTCACAAC +@?D>>D?DDBDFB)DDDDDC5(9>F;G)FB84/AJE3JJIJIGIGJBBIGCJCJGJGHJIDJ>IB7IGJGEGCIIGFJJJEFHIIJHHF=HFF8=F??= XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:14C85
chr1-34 16 chr1 12934213 0 100M * 0 0 TTTTGATACTTTTGATGTGGCCAAAGGTTCTCCAATAAAGATACCATATATAAATATATGTATTTCTAATGTCTGAAACAGATTAAAACCTTCCCTGTAT D@CB?DEDCEDDD(DC>F>DEHE>HEDE@HDD.IDD3'5I8IBFJHDI=CIIJ8JFHIBJJI0IJGFFJGIIJJABH<)IFJJJIEDHHHHAFFFFFCC@ XT:A:R NM:i:0 X0:i:2 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100 XA:Z:chr1,+13267477,100M,0;
chr1-33 0 chr1 48968233 37 100M * 0 0 TAATAGTAGGCAATAAACAAAGAGAGCAACTTAGGAGCCAGATCACATGTGGCCGCTCGAGCAATATGGTAAAAGTTCTGGACTTCATTCTAGGTGAATG 1CCB=FFFHHHHHEDHJIIAFG4JIFJIJB)JJI?(&JJIJJEE)HIJJBJ?HJ(=B(I@?I?8DC8C>JHJH>@EDFDD5DDDDDDDDCFD:=DCC(DD XT:A:U NM:i:2 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:0G53T45
chr1-32 0 chr1 88980623 37 100M * 0 0 TAGTTCAGTAAACTATTTATCAAACAGGTGTCAGGTCATTTTAACATACTCCTTGCTTTGAACAATATTCATTCATACTTGGTACAAACTCTATATCCTA B?CFDFFFHHFH3JIJJJIGJJJJJFJDEJGJ(EHFI>E=JIJ(GGJDFCH>>GJ=IHDJEHHDI>GEBJE@DD@HH'AA@ECC@BDEDDD@CDDADBDD XT:A:U NM:i:2 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:32T44T22
chr1-31 16 chr1 227005594 37 100M * 0 0 TCACCAGGCATCTTTACTGACTCACACCAATAGTAGTACTGGGATTAGAAATAAGACGCTGCAATACTCACAACCTAGGTGAAGTTAGTTAATTTGGGAA D@D5B=DACDDDDDBEFECBFDC5BCDDDCDFIDC8ICEIJ=DHIGHIJIJJB0HJJCDJHJGJIJI9GGHGGJ3@IJJAIJGGBGJ7HFHHDEBFF@CC XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-30 0 chr1 9852129 37 100M * 0 0 TGTGAAATGGAGTCAGCAGAGTGAGCCGGCCTCCACTCAGTGAGCCGGGTCTCCCCCACAGCCGGCATGTGCTGACCTCCTTCCAACTGCTCTACCAAGA CBCDDFFDHGHHHIEJ+J<EFJIJIIJI><J(IIFJG)0JGIJ?8J5;J?D@9IJHDI=DI)DDHG@3FAI5FF?EDAHDC@DDGD3AA>D+?ECDDDDB XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-29 16 chr1 156397431 37 100M * 0 0 TCAGCCTCCCGAGTAGCTGGGATTACAGGAACCTGCCACCACGCCCGGCTAATTTTTGTATTTTCAGTTGAGACGGGGTTTCACCATGTTGCCCAGGCTG D1D(@9DDDC@D0C3=CDDJ;FDHDD@H2BDHIDAGDDDCDIFJ9GIFGIG@?)JJHJGFGJIB7JG>'IJIJJGJ+JJGIIHFIJIDHHHFFFFFFC@B XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:68A31
chr1-28 16 chr1 56986638 37 100M * 0 0 ACTCAGAACAGGTCTCCTTGTGGAACCATGGCCTTCCTTTTGGATCCTGGCCATGAGAGCCCATTCTTAGGAACCATGTTTCAATTCCAGTAGGTGATGT DD)DC@C<EDDD+DC0BDDBDDCECFDIJJ@)?HDACJGDFI?JGJH)JJJJJIEIJIIJJGIJIHIJHJCIGDHI>J@A)GIFJJJHHFFHFDDFFCCC XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-27 16 chr1 172015198 37 100M * 0 0 AGGTGTCAGTCCTCCAGCTTTGTTCTTCTTTTATATTGTGTTGGCTATCCTGGGCTCTTTGCTTCTCCATACAAAACTTAGAATCAGTTTGTTGATATCC B8BD>/D<BED@CCEBBEBCH,F?CCD.E;HGJBJ)IGD7HED5@6JJJCHIGHJIJFDJCIJJHGJIJJJIEF:FEJHBJ.JJJIGHBHCF2DDFFCC@ XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-26 0 chr1 233336763 37 100M * 0 0 AGATATACAGCAAAGTTTGAAAGCTACAGTTCTGAGGACCATATTTATGGATTCCTTCTTATATGTTATCTGGGTTGATATAGAAATTCTTCCATGGCTA CBCFDFF<H<G?AIBJJJGIEJIJIIJIIJJEGHEIJIIGI)GJHIJGF8JIIHED=DJH?IJFB;>;HHDDHHB?C9DE?DCE@D?B&5E>DDD7DD?D XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:41G58
chr1-25 0 chr1 105787069 0 100M * 0 0 GCCATTCTAACTGGTGTGAGATGGTATCTCATTGTGGTTTTGATTTGCATTTCTCTGATGGCCAGTGATGGTGAGCATTTTTTCATGTGTTTTTTGGCTG CCCDFFFFBGHHHHHCIGFJ:JAIGIJIJG)HCIIJGIHHJJJGEDHIHJHIII3J>JHJ?GDD?:;EFE(EDIJD?DDEAHCEDCDD?CDCF6D=>DDD XT:A:R NM:i:0 X0:i:52 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-24 0 chr1 235841969 37 100M * 0 0 GTTGGCTACTAGCTTAGCAGAGGTGCAAAACCATGAATTTCTGGTGGTATGGATTTTTTCAGCTATTTCAGATTCACCAGCAGGATCCAGCTGCTTGGGT CCCFF?FFFHHHFGI,JEJIIG<JJ)I1GJG=ICJJEGJIJF<@IIBDJJIFDIEAIJB;JGADHJD,CBD@DEC;?DDHD<BEED&DD@DCDEDDDAD? XT:A:U NM:i:2 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:25G60T13
chr1-23 0 chr1 96545358 37 100M * 0 0 AGTGAAAAAGGCTGGCTGCCCTTCAATATCATCTTCAAATGTTAACAACACTGAATATTAATAAATTTCCTTTAGCGAATAATGAATCCAGCCTTCCTTA C@CF+FFFGGDHGJIBDJI2JGJIHHJJII?GJJJJGIJIJJGFJJG)IJ0HD0JIFJDJDFC;D7JGFFCEDFHADCDCCDEDDEAHDDD+9?<CA2:D XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-22 0 chr1 80270679 37 100M * 0 0 TTGTACACCCTATTTCTGACCAGAAGAAGGAGCATTTTGCTTTTTGCCAAATGAGAAGTGCATTCTGGAAACACTTGATGCCTGCACCACACCTCGAGTT ?@CFDDFFHFHHHJJJGC7J(GI8IJJJE?HHI>BJG*IJFJIDJHD0IEJIHDI>@H=EHGIAHJ33(EJCDEDA?FDG<ADDDCDDEF9DDDD@DDBD XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-21 0 chr1 35923261 37 100M * 0 0 CTAAGCAGCAGTGTTTTTGGATACTTTTTTTTTCTGTTTGTGAATAAGGCCAGCACTCAAGATGGGCAGCCAAGGGTGCACTGACTATTAGCTGGCCCAT =@@DFDFEHGHHH8JIJGJH1JJHHJIHJGH?IIFEJIIG87JI=IAJJJBJIJD(IIFI8JIHF=JDHEJHEHDDCEDCDEACDDCCAD<BDE+B8(DD XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-20 16 chr1 112489190 37 100M * 0 0 AGGGAATGAACTATGCACATCTATATAGTAACAGGGACAGATTTTTTTTTAACATGAGAGTGTAAAAAAAAGAAAAAGAAAAAAAAAGGCCAGGCACAGT DACDABD@DDDDDA7DDDC8GHI@EI(DC?FG'+8.FBDJIHIEGG=IIG=I@*DFIJJIBIIJIJIIHJCHBGFJJJI@F>HJIIIHHAHFAFDFFC@1 XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:33A66
chr1-19 0 chr1 160371244 37 100M * 0 0 AGGCCCTGGGCACAGGCAGAGAGCCCACCGGCTGGTCATGAGGGCCTCTTCCTTTCTCTGACCCAGGCACCTCGAGGGCTCTTCTCCTGGGTTCCTTCCG @@:FDDFFCHHAHI:GEJFJGF@JJJFIC9JIIJJJ?IIEFHGJ'G?BFFBIIDIG,J)AJIHEGFBHCI&ECCD@EDD?)DED(D>3C?ABEEDDD4BD XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:80A19
chr1-18 16 chr1 179855835 37 100M * 0 0 AGCAATTAAAATAAATTAGGGTATCTTTAAAAGTTGTAAAATTATAGCAGTGAAGTACTGTTGACCAGGCACAGTGGCTCACACCTGTAATACCAGCACT DCEDBBDD/DD9DDD@DDFB(DDDHCHDF;C?;FJGC/IJ8DHEJ:DFGGIGHBIGIJDI(JDHGJJGJIHJII@HJJJ3JIJDIJBBHHFHFDFFF@C@ XT:A:U NM:i:2 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:20T39C39
chr1-17 16 chr1 207455995 37 100M * 0 0 GGTTCTTATGATTGGAAAGGTTAAAGAGTGACCTATAGGTCACTTTCCAATTATGAAAACAAAAAATTAAGAAATATATATATTTTCATTATTTCACTCC <DC>CBDDDCDD:&DDCFCFDDHDEJEDCFDJ;;EHGCD;CG?DIHGGCIJJJJ-GIJ7GIFHHHCGI)JJJJIJEGJIGJJJIH<GFGHHBFD@FDCC1 XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-16 0 chr1 114154603 37 100M * 0 0 TGTATCTTTCTGCTAAGCATAACAAGAAAGACAGAAAGCTCAACGGGAGGATTGAGGCTAGACTTAAAGTAGAGATCCCCTCAGAAACTGTGGAGTGAGG CCCF8FFDHHHH4JIJIGIJIIFJHJJ?JEDI9BG?I>GHJ7FJJJIF67EIIHD2C>?>DDHDE8E7@JEJ(IFDDC;EDCC:FD>@DBC>D5D>=<AB XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-15 16 chr1 169767580 37 100M * 0 0 GGTGGGGGAGAGGAAAGGAAACGAGGGAGGAAAGGCCCTAATAGGGAGGATTTTGGAGTTTAGATTTTAAAATGATAAAGGTTGTTTGACACTCTAGGCA DEDD9DDD@DD4DDDAEDDDC@D7=D;DA)7;IIJFD(J?JJDGI(IDGD7D'3JIE;H?AC@EHJJE?JJHDFJIIIECG)GGJJECHFHHFDFDFC@C XT:A:U NM:i:2 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:45A35A18
chr1-14 0 chr1 117644126 37 100M * 0 0 GCATTTCATTGTGGACTAATTTTCCCCCACTATTGAGGGAAGACCCTTTTGAGTACTCTATCTGATGCCCCATGAATGATAAAGTTTTATACTCTGGCTG C?CFBDFFGHHH<JAGCIJJIIIGJJI8:JIJC(JGGJBJH-GIJIJJ;IIH;>JI5CJ=CD9DC-HGIJDCJHHDBEDDCC&DDDBD39DBCDDDDDCD XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-13 0 chr1 104996994 37 100M * 0 0 TTCTTGCTGGAACACATGTTTTCACCTTTACCTTCACCCACAGCCCAATGTGCATCAATATGGAGATAATGCAGTTCCATTTATACCTCTTTGTGGTTCA =@?FFFDDHBHHHJHJIJ)JIJ<IJII++>HBHIJ*G:CJCJJJI?G)>GJI;JD3FJ8FJFGD;DDDDFBED7C<?E@C>7&A(ABC9CD+DCC&DDCA XT:A:U NM:i:3 X0:i:1 X1:i:0 XM:i:3 XO:i:0 XG:i:0 MD:Z:18G63G12A4
chr1-12 16 chr1 108617705 37 100M * 0 0 AGGTCGGGGAGATTGGGAAGAAGAATGAGCAAAGAAACCACCAGTGTGATCAGAGGAGGAAAGCAAAGCAGAGTCCTGTCCTGAAAACCAAATGAAGAAA :=>+D(DCEC=@GHB(CDDDDHABDD+HBJJ9F?A35DDIE?JJHIHJJIEE?JFJ?7JBGJJI>JJGJBJIIBIJJJIIIIJGJGJHHDFHF3FFFCB@ XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-11 16 chr1 72085324 37 100M * 0 0 TACTAGCCTTGAAAATGTTTAAAATAATATTCCAGAGTTAATATTGTTGTCCCTGGTATGTTAAAGAGTATTTGTTATCATAGCCAATTCTTGAGTCTGC 8@DDCD4D>D?C3DF(DCCHDDDA;HDEIBFCHGHHHFFIFEG1JHIJIJCGEJIHJG)IH(IJ)BDJ??FHHJHCJJIFJHJJJGIGH)2HFFFA=<CC XT:A:U NM:i:2 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:61G27A10
chr1-10 16 chr1 214311330 37 100M * 0 0 GGTGATCCCTATTTGGTCCACTTTTGTTGGTAGTCTTCAAGCTTGATATCTGATTATCACTGTTGGAAGGTGTAAACTCACAGACTCAGAATTCTGGACT D&++DDEDDC3+CB(@D8DDEHD;?FDF?DBI9A7@JFHJ(I(AIJJ@DHJIIJHII>HFG1JJFII<IJJJ(<HIJGHIJJE=FH@FHHHHFDDDFC@C XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:1A98
chr1-9 0 chr1 152012629 37 100M * 0 0 AAGCAATTCTCTTGCTTTAGCCTCCCGAGAAGCTCGGATTACAGGCATGTCCACCACACCCAGCTAATTCTTTTGTATTTTTAGTAGACATGGGGTTTTG BC:FFDFFHHHFHJ9HIJHIJJJFIGJCI=H/IHH@IGJIJIJIGJJJJEEIB'JJDJJIJDIGICHEFD@D3:0A/(BECDDDDCBE>BD8DDDDDC8C XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-8 0 chr1 79478960 37 100M * 0 0 GGCAACACTTGAGAACACAAAGTGAGTTCTCACTTTGGGCGGTGGTTTCAGGCTTCAGGGTGGAGTTTTGTCAGGAACCCAACCTTTTCTGCCTAGAATT @CCFDFDADHHAHFIH@ICJJHI5?JIJ)GCFEIJHG=II)HIGI9JJIJGHEJHFI8EIDG)GCI4FJF?I8HCDH;DD0&3CFDDDD@C4DCD6ADD> XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:81C18
chr1-7 0 chr1 178190761 37 100M * 0 0 GTAGCCGGAATAAACAGTCACTGTGAGTTGTCCATTTTAGAGCATAGGTTTTCAGGTGGTGAAGACCTGTCCTTAGTTGAATTTGTATGTGAATTAAACT B?<;FDDFHHHHHJEGJFCJJIFJJJA=JHHGIIGJJIIGIGJ(D:DAFG7)&DJID9J)FCD/HHJEDFIJ<FJAF@D@DDADF?C@A@ADCDD@CDDD XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:52T47
chr1-6 0 chr1 42572411 37 100M * 0 0 AACCCTTTATCAGGTATGTATTATAAACATCGACTCTGTGGCTTGCATTTTCATTCTCCTTATATATCTTTTGATGAATCAAAGTTTTTAATTTGAATAT BCCFFFFDAH)HHJJHIGG,HFH2JIJJ4IDI93IJJ<=JJ>IH7IJIJBIBG)CFH7DHHFAHFHEDIFBEFH;EBICA?3DD5D(DDBACDC(BADD: XT:A:U NM:i:1 X0:i:1 X1:i:0 XM:i:1 XO:i:0 XG:i:0 MD:Z:94T5
chr1-5 0 chr1 153186635 37 100M * 0 0 GTCTTGACTCTTTATCCACTTTGCCAGTCTGTGTCTTGTAATTGGGGCATTTAGCCTATTTACATTTAAGGTTAATATTGTTATGTGTGAATTTGATCCT C1@?FF=FHHHHHI?JJEJFIIIHG:.?>EEJEI(JG9J'IIHIJIHIJGJGJFJ9FJAG4EEC:DADE8DAEJFCCBBCDAEDDDD-DDDD@+DBC8D+ XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
chr1-4 0 chr1 145038405 0 100M * 0 0 AGTGGAAATAATACTCGTCAACATATGCCTTTCAAAAAAATTTTTTTTCATATTTTAAATTTACCTTTACTACCTATTTATTTGGTTCAAGGCTCCATTT C:CFFFDDHFHDHJIIJJJJ29CJ+JJJIIJIIFIG?JI08?CJJIFIFDEFDGBD>JAIDJDJ>JCBG(CG=DE5?(EDB3HDD>ED2:CCHDB<DDCC XT:A:R NM:i:0 X0:i:2 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100 XA:Z:chr1,-121241807,100M,0;
chr1-3 0 chr1 84685021 37 100M * 0 0 TATTATTAAAACTATAAATGGACCAATTAAACAAACGTGTCATGAGCCAAGGAATATAAACTAATTCTTTACACCTGAAGTCCTTTAAAATGATTTAATT CCCDFFB=HHHFHJIIJ:E@J>JA2C<IEI2DGJHJGI8FJJIJJAHJCIJJJJ*JJ;F?HFDJCIJFDJ'CHECDEFD,DDFBCH<A7DCD-DDBDD(E XT:A:U NM:i:2 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:92C5A1
chr1-2 0 chr1 62477842 37 100M * 0 0 TAGGAAAATGGAGAAACTTTAATATGAAATCTTCCTGTTTTTCACATTATGTTTAGATTGTTACAGCATAAAATTTCAGAAACATTGCAAAAAGTTTTAA @C=FFDFDHHHH>GJ@IEJGJIIJJJJF@JHHIIGGJ<IHJIG/J*?GD>ICIAJFJIH)H7E?GHEDI>HHFAHC@)(D>DDDED<DCDD=DBBC5DDE XT:A:U NM:i:2 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:77C0A21
chr1-1 0 chr1 11355150 37 100M * 0 0 ATTTATTGGCTGTCTTTCAGGCACATTTTAGCTGTCATCCAACATTCTCAACCTTAGTCCCCTTCTCTGGGCTAAGGGGAGAATGATGGTCCTACCCCAG BC?DFFFFHH<FFJG(ICJIGJJIJJGIFJIJGJ(7FJJJJJFDHID)JCH=3DIJ5JGDI8@@I@A=>3<:IDCA9DDFFI(FADEBDCDCCDDB(DC> XT:A:U NM:i:0 X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:100
hadoop@Master:~/cloud/adam/xubo/data/GRCH38Sub/cs-bwamem$ cat G38L100c50Nhs20.aln
##ART_Illumina read_length 100
@CM art_illumina -ss HS20 -i GRCH38chr1L3556522.fna -l 100 -c 50 -o G38L100c50Nhs20 -rs 1464918709
@SQ chr1 AC:CM000663.2 gi:568336023 LN:248956422 rl:Chromosome M5:6aef897c3d6ff0c78aff06ac189178dd AS:GRCh38 248956422
##Header End
>chr1 chr1-50 93465784 +
TTCCACAATAGTTGAACTAATTTACAGTCCCACCAACAGTGTAAAAGTGTTCCTATTTCTCCACATCCTCTCCAGCACCTGTTGTTTCCTGACTTTTTAA
TTCCACAATAGTTGAACTAATTTACAGTCCCACCAACAGTGTAAAAGTGTTCCTATTTCTCCACATCCTCTCCAGCACCTGTTGTTTCCTGACTTTTTAA
>chr1 chr1-48 228133745 +
ATCATTGTATGCCACAGAAATAATTAAATTTCCTTGTCAACTGAGACATTATTATTAGGCACTCTCACCAGATCTTTACCCATGGCCATTTAAAGTGTGG
ATCATTGTATGCCACAGAAATAATTAAATTTCCTTGTCAACTGACACATTATTATTAGGCACTCTCACCAGATCTTTACCCATGGCCATTTAAAGTGTGG
>chr1 chr1-47 13772987 +
TTCAGTAATTCAGAATAACACATGAGGGAATGAATGAATGAATAAATAAAAAAAAAATGAATGAATAAATTAAAAAAAATTGTGTTTCAGGGAAGAAAAA
TTCAGTAATTCAGAATAACACATGAGGGAATGAATGAATGAATAAATAAAAAAAAACTGAATGAATAAATTACAAAAAATTGTGTTTCAGGGAAGAAAAA
>chr1 chr1-46 211481565 -
ATCCCGATCAGGAACCTGGTGCTCTTCCCCGGCTCTCCCAACTTGGGCAAAGTTGGAGGGGTCCTGAGGCCCGGCGGGCCGGGGAACAGGACCCCGACCC
ATCCCGATCAGGAACCTGGTGCTCTTCCCCGGCTCTCCCAACTTGGGCAAAGTTGGAGGGGTCCTGAGGCCCGGCGGACCGGGGAACAGGACCCCGACCC
>chr1 chr1-45 29056656 +
CTGGGATTACAGGTGCCCGCCACCATGCCCAGCTAATTTTTGTATTTTTGGTAGAGACAAGGTTTCACCATGTTGGCCGGGATTGTCTCGAACTCCTGAT
CTGGGATTACAGGTGCCCGCCACCATGCCCAGCTAATTTTTGTATTTTTGGTAGAGACAAGGTTTCACCATGTTGGCCGGGATTGTCTCGAACTCCTGAT
>chr1 chr1-44 49993892 +
CAATTTAGCCAAAACTGGCTAATCCTTTTACCAGAATCATTCCCATTGTTCAAGACCTATTTTAAGCTCCACTATCACCATAAAACTTTCCCGATCAGTT
CAATTTAGCCAAAACTGGCTAATCGTTTTACCAGAATCATTCCCATTGTTCAAGACCTATTTTAAGCTCCACTATCACCATAAAACTTTCCCGATCAGTT
>chr1 chr1-43 54241817 -
TTGCAGACCTTTTATTGATTGTGAAATAACCCAAGAGAACACACATGGAACCATTTTTAAAGTATCATCAAATTTAAATATGATATTATTAAAACATATT
TTGCAGACCTTTTATTGATTGTGAAATAACCCAAGAGAACACACATGGAACCATTTTTAAAGTATCATCAAATTTAAATATGATATTATTAAAACATATT
>chr1 chr1-42 35706202 +
CAGGTTCAAGCGATTCTCCTGCCTCAGCCTCCTGAGTAGCTGGGATTACAGGCACGTGCCACCATGCCTGGCAATTTTTGTATTTTTAGTACAGATGGGG
CAGGTTCAAGCGATTCTCCTGCCTCAGCCTCCTGAGTAGCTGGGATTACAGGCACGTGCCACCATGCCTGGCAATTTTTGTATTTTTAGTACAGATGGGG
>chr1 chr1-41 92473985 -
GGTGGGCCCATACACACAGGCACCAGGATATACCCCCGCCTGCATCCACATGGACACGTGCCCACGTTGCATGTACACGCAGACCTGCCTATGCACACAC
GGTGGGCCCATACACACAGGCACCAGGATATACCCCCGCCTGCATCCACATGGACACGTGCCCACGTTGCATGTACACGCAGACCTGCCTATGCACACAC
>chr1 chr1-40 27177039 -
GAACCATTTGGCTGAGCCCAGCACTCTGCCCATGGTCCATGCACCCCTTGACCCTGACATCAGCAAAGCAGTCCCCATCACCAAAGTGCTATGTGCCATG
GAACCATTTGGCTGAGCCCAGCACTCTGCCCATGGTCCATGCACCCCTTGACCCTGACATTAGCAAAGCAGTCCCCATCACCAAAGTGCTATGTGCCATG
>chr1 chr1-39 245060718 -
TCTTGACATTTTGGAGTTGACACCTTGGAGGTGGTTAATCCATGACGAAACTGAAGCTAAGCCGATCTCATGTTTTGACTCCTGTCAATCCGGAGAGGAC
TCTTGACATTTTGGAGTTGACACCTTGGAGGTGGTTAATCCATGACGAAACTGAAGCTAAGCCGATCTCATGTTTTGACTCCTGTCAATCCGGAGAGGAC
>chr1 chr1-38 215781397 -
TACATATACATATACATATACATACATGTGTGTGTGCATATATATGTATATGTGTGTATATATATATATATATATATATACACACATATATGTATGTGTG
CACATATACATATACATATACATACATGTGTGTGTGCATATATATGTATATGTGTGTATATATATATATATATATATATACACACATATATGTATGTGTG
>chr1 chr1-37 42831546 -
AACAACGTATGTCCACACAAAAACTTGCACATGAATGTTCACTAGCAGCATTATTTGTAACCTGCCCAAGGTGGAAACAACCCAAATGTCTATTGACTGA
AACAACGTATGTCCACACAAAAACTTGCACATGAATGTTCACTAGCAGCATTATTTGTAACCTGCCCAAGGTGGAAACAACCCAAATGTCTATTGACTGA
>chr1 chr1-36 181673625 +
TCCACTGCCCAGAAAGAGGACATCCCTTATAGGGCCAGCGGATGGAAGCCATGGGCTGGGCAGGACATTCCTGTCCCAACCCACATGGCAGCTAGAGTCC
TCCACTGCCCAGAAAGAGGACATCCCTTATAGGACCAGCGGATGGAAGCCATGGGCTGGGCAGGACATTCCTGTCCCAACCCACATGGCAGCTAGAGTCC
>chr1 chr1-35 96851543 -
GTTGTGACCTCCCAACCCCCACAGAGGTTCACGTGTTGAAGTCTTAACCCTCAGTACCTCAGAATGTAATCATATTTGAAGATATGGTATTTATAGAGGT
GTTGTGACCTCCCAACCCCCACAGAGGTTCACGTGTTGAAGTCTTAACCCTCAGTACCTCAGAATGTAATCATATTTGAAGATATTGTATTTATAGAGGT
>chr1 chr1-34 13267476 +
ATACAGGGAAGGTTTTAATCTGTTTCAGACATTAGAAATACATATATTTATATATGGTATCTTTATTGGAGAACCTTTGGCCACATCAAAAGTATCAAAA
ATACAGGGAAGGTTTTAATCTGTTTCAGACATTAGAAATACATATATTTATATATGGTATCTTTATTGGAGAACCTTTGGCCACATCAAAAGTATCAAAA
>chr1 chr1-33 48968232 +
GAATAGTAGGCAATAAACAAAGAGAGCAACTTAGGAGCCAGATCACATGTGGCCTCTCGAGCAATATGGTAAAAGTTCTGGACTTCATTCTAGGTGAATG
TAATAGTAGGCAATAAACAAAGAGAGCAACTTAGGAGCCAGATCACATGTGGCCGCTCGAGCAATATGGTAAAAGTTCTGGACTTCATTCTAGGTGAATG
>chr1 chr1-32 88980622 +
TAGTTCAGTAAACTATTTATCAAACAGGTGTCTGGTCATTTTAACATACTCCTTGCTTTGAACAATATTCATTCATATTTGGTACAAACTCTATATCCTA
TAGTTCAGTAAACTATTTATCAAACAGGTGTCAGGTCATTTTAACATACTCCTTGCTTTGAACAATATTCATTCATACTTGGTACAAACTCTATATCCTA
>chr1 chr1-31 21950729 -
TTCCCAAATTAACTAACTTCACCTAGGTTGTGAGTATTGCAGCGTCTTATTTCTAATCCCAGTACTACTATTGGTGTGAGTCAGTAAAGATGCCTGGTGA
TTCCCAAATTAACTAACTTCACCTAGGTTGTGAGTATTGCAGCGTCTTATTTCTAATCCCAGTACTACTATTGGTGTGAGTCAGTAAAGATGCCTGGTGA
>chr1 chr1-30 9852128 +
TGTGAAATGGAGTCAGCAGAGTGAGCCGGCCTCCACTCAGTGAGCCGGGTCTCCCCCACAGCCGGCATGTGCTGACCTCCTTCCAACTGCTCTACCAAGA
TGTGAAATGGAGTCAGCAGAGTGAGCCGGCCTCCACTCAGTGAGCCGGGTCTCCCCCACAGCCGGCATGTGCTGACCTCCTTCCAACTGCTCTACCAAGA
>chr1 chr1-29 92558892 -
CAGCCTGGGCAACATGGTGAAACCCCGTCTCTACTGAAAATACAAAAATTAGCCGGGCGTGGTGGCAGGTTCCTGTAATCCCAGCTACTCGGGAGGCTGA
CAGCCTGGGCAACATGGTGAAACCCCGTCTCAACTGAAAATACAAAAATTAGCCGGGCGTGGTGGCAGGTTCCTGTAATCCCAGCTACTCGGGAGGCTGA
>chr1 chr1-28 191969685 -
ACATCACCTACTGGAATTGAAACATGGTTCCTAAGAATGGGCTCTCATGGCCAGGATCCAAAAGGAAGGCCATGGTTCCACAAGGAGACCTGTTCTGAGT
ACATCACCTACTGGAATTGAAACATGGTTCCTAAGAATGGGCTCTCATGGCCAGGATCCAAAAGGAAGGCCATGGTTCCACAAGGAGACCTGTTCTGAGT
>chr1 chr1-27 76941125 -
GGATATCAACAAACTGATTCTAAGTTTTGTATGGAGAAGCAAAGAGCCCAGGATAGCCAACACAATATAAAAGAAGAACAAAGCTGGAGGACTGACACCT
GGATATCAACAAACTGATTCTAAGTTTTGTATGGAGAAGCAAAGAGCCCAGGATAGCCAACACAATATAAAAGAAGAACAAAGCTGGAGGACTGACACCT
>chr1 chr1-26 233336762 +
AGATATACAGCAAAGTTTGAAAGCTACAGTTCTGAGGACCAGATTTATGGATTCCTTCTTATATGTTATCTGGGTTGATATAGAAATTCTTCCATGGCTA
AGATATACAGCAAAGTTTGAAAGCTACAGTTCTGAGGACCATATTTATGGATTCCTTCTTATATGTTATCTGGGTTGATATAGAAATTCTTCCATGGCTA
>chr1 chr1-25 96853884 +
GCCATTCTAACTGGTGTGAGATGGTATCTCATTGTGGTTTTGATTTGCATTTCTCTGATGGCCAGTGATGGTGAGCATTTTTTCATGTGTTTTTTGGCTG
GCCATTCTAACTGGTGTGAGATGGTATCTCATTGTGGTTTTGATTTGCATTTCTCTGATGGCCAGTGATGGTGAGCATTTTTTCATGTGTTTTTTGGCTG
>chr1 chr1-24 235841968 +
GTTGGCTACTAGCTTAGCAGAGGTGGAAAACCATGAATTTCTGGTGGTATGGATTTTTTCAGCTATTTCAGATTCACCAGCAGGATTCAGCTGCTTGGGT
GTTGGCTACTAGCTTAGCAGAGGTGCAAAACCATGAATTTCTGGTGGTATGGATTTTTTCAGCTATTTCAGATTCACCAGCAGGATCCAGCTGCTTGGGT
>chr1 chr1-23 96545357 +
AGTGAAAAAGGCTGGCTGCCCTTCAATATCATCTTCAAATGTTAACAACACTGAATATTAATAAATTTCCTTTAGCGAATAATGAATCCAGCCTTCCTTA
AGTGAAAAAGGCTGGCTGCCCTTCAATATCATCTTCAAATGTTAACAACACTGAATATTAATAAATTTCCTTTAGCGAATAATGAATCCAGCCTTCCTTA
>chr1 chr1-22 80270678 +
TTGTACACCCTATTTCTGACCAGAAGAAGGAGCATTTTGCTTTTTGCCAAATGAGAAGTGCATTCTGGAAACACTTGATGCCTGCACCACACCTCGAGTT
TTGTACACCCTATTTCTGACCAGAAGAAGGAGCATTTTGCTTTTTGCCAAATGAGAAGTGCATTCTGGAAACACTTGATGCCTGCACCACACCTCGAGTT
>chr1 chr1-21 35923260 +
CTAAGCAGCAGTGTTTTTGGATACTTTTTTTTTCTGTTTGTGAATAAGGCCAGCACTCAAGATGGGCAGCCAAGGGTGCACTGACTATTAGCTGGCCCAT
CTAAGCAGCAGTGTTTTTGGATACTTTTTTTTTCTGTTTGTGAATAAGGCCAGCACTCAAGATGGGCAGCCAAGGGTGCACTGACTATTAGCTGGCCCAT
>chr1 chr1-20 136467133 -
ACTGTGCCTGGCCTTTTTTTTTCTTTTTCTTTTTTTTACACTCTCATGTTAAAAAAAAATCTGTCCTTGTTACTATATAGATGTGCATAGTTCATTCCCT
ACTGTGCCTGGCCTTTTTTTTTCTTTTTCTTTTTTTTACACTCTCATGTTAAAAAAAAATCTGTCCCTGTTACTATATAGATGTGCATAGTTCATTCCCT
>chr1 chr1-19 160371243 +
AGGCCCTGGGCACAGGCAGAGAGCCCACCGGCTGGTCATGAGGGCCTCTTCCTTTCTCTGACCCAGGCACCTCGAGGGCTATTCTCCTGGGTTCCTTCCG
AGGCCCTGGGCACAGGCAGAGAGCCCACCGGCTGGTCATGAGGGCCTCTTCCTTTCTCTGACCCAGGCACCTCGAGGGCTCTTCTCCTGGGTTCCTTCCG
>chr1 chr1-18 69100488 -
AGTGCTGGTATTACAGGTGTGAGCCACTGTGCCTGGTCAGCAGTACTTCACTGCTATAATTTTACAACTTTTAAAGATAACCTAATTTATTTTAATTGCT
AGTGCTGGTATTACAGGTGTGAGCCACTGTGCCTGGTCAACAGTACTTCACTGCTATAATTTTACAACTTTTAAAGATACCCTAATTTATTTTAATTGCT
>chr1 chr1-17 41500328 -
GGAGTGAAATAATGAAAATATATATATTTCTTAATTTTTTGTTTTCATAATTGGAAAGTGACCTATAGGTCACTCTTTAACCTTTCCAATCATAAGAACC
GGAGTGAAATAATGAAAATATATATATTTCTTAATTTTTTGTTTTCATAATTGGAAAGTGACCTATAGGTCACTCTTTAACCTTTCCAATCATAAGAACC
>chr1 chr1-16 114154602 +
TGTATCTTTCTGCTAAGCATAACAAGAAAGACAGAAAGCTCAACGGGAGGATTGAGGCTAGACTTAAAGTAGAGATCCCCTCAGAAACTGTGGAGTGAGG
TGTATCTTTCTGCTAAGCATAACAAGAAAGACAGAAAGCTCAACGGGAGGATTGAGGCTAGACTTAAAGTAGAGATCCCCTCAGAAACTGTGGAGTGAGG
>chr1 chr1-15 79188743 -
TGCCTAGAGTGTCAAACATCCTTTATCATTTTAAAATCTAAACTCCAAAATCCTTCCTATTAGGGCCTTTCCTCCCTCGTTTCCTTTCCTCTCCCCCACC
TGCCTAGAGTGTCAAACAACCTTTATCATTTTAAAATCTAAACTCCAAAATCCTCCCTATTAGGGCCTTTCCTCCCTCGTTTCCTTTCCTCTCCCCCACC
>chr1 chr1-14 117644125 +
GCATTTCATTGTGGACTAATTTTCCCCCACTATTGAGGGAAGACCCTTTTGAGTACTCTATCTGATGCCCCATGAATGATAAAGTTTTATACTCTGGCTG
GCATTTCATTGTGGACTAATTTTCCCCCACTATTGAGGGAAGACCCTTTTGAGTACTCTATCTGATGCCCCATGAATGATAAAGTTTTATACTCTGGCTG
>chr1 chr1-13 104996993 +
TTCTTGCTGGAACACATGGTTTCACCTTTACCTTCACCCACAGCCCAATGTGCATCAATATGGAGATAATGCAGTTCCATTTGTACCTCTTTGTGATTCA
TTCTTGCTGGAACACATGTTTTCACCTTTACCTTCACCCACAGCCCAATGTGCATCAATATGGAGATAATGCAGTTCCATTTATACCTCTTTGTGGTTCA
>chr1 chr1-12 140338618 -
TTTCTTCATTTGGTTTTCAGGACAGGACTCTGCTTTGCTTTCCTCCTCTGATCACACTGGTGGTTTCTTTGCTCATTCTTCTTCCCAATCTCCCCGACCT
TTTCTTCATTTGGTTTTCAGGACAGGACTCTGCTTTGCTTTCCTCCTCTGATCACACTGGTGGTTTCTTTGCTCATTCTTCTTCCCAATCTCCCCGACCT
>chr1 chr1-11 176870999 -
GCAGACTCAATAATTGGCTATGATAACAAATACTCTTTCACATACCAGGGACAACAATATTAACTCTGGAATATTATTTTAAACATTTTCAAGGCTAGTA
GCAGACTCAAGAATTGGCTATGATAACAAATACTCTTTAACATACCAGGGACAACAATATTAACTCTGGAATATTATTTTAAACATTTTCAAGGCTAGTA
>chr1 chr1-10 34644993 -
AGTCCAGAATTCTGAGTCTGTGAGTTTACACCTTCCAACAGTGATAATCAGATATCAAGCTTGAAGACTACCAACAAAAGTGGACCAAATAGGGATCATC
AGTCCAGAATTCTGAGTCTGTGAGTTTACACCTTCCAACAGTGATAATCAGATATCAAGCTTGAAGACTACCAACAAAAGTGGACCAAATAGGGATCACC
>chr1 chr1-9 152012628 +
AAGCAATTCTCTTGCTTTAGCCTCCCGAGAAGCTCGGATTACAGGCATGTCCACCACACCCAGCTAATTCTTTTGTATTTTTAGTAGACATGGGGTTTTG
AAGCAATTCTCTTGCTTTAGCCTCCCGAGAAGCTCGGATTACAGGCATGTCCACCACACCCAGCTAATTCTTTTGTATTTTTAGTAGACATGGGGTTTTG
>chr1 chr1-8 79478959 +
GGCAACACTTGAGAACACAAAGTGAGTTCTCACTTTGGGCGGTGGTTTCAGGCTTCAGGGTGGAGTTTTGTCAGGAACCCACCCTTTTCTGCCTAGAATT
GGCAACACTTGAGAACACAAAGTGAGTTCTCACTTTGGGCGGTGGTTTCAGGCTTCAGGGTGGAGTTTTGTCAGGAACCCAACCTTTTCTGCCTAGAATT
>chr1 chr1-7 178190760 +
GTAGCCGGAATAAACAGTCACTGTGAGTTGTCCATTTTAGAGCATAGGTTTTTAGGTGGTGAAGACCTGTCCTTAGTTGAATTTGTATGTGAATTAAACT
GTAGCCGGAATAAACAGTCACTGTGAGTTGTCCATTTTAGAGCATAGGTTTTCAGGTGGTGAAGACCTGTCCTTAGTTGAATTTGTATGTGAATTAAACT
>chr1 chr1-6 42572410 +
AACCCTTTATCAGGTATGTATTATAAACATCGACTCTGTGGCTTGCATTTTCATTCTCCTTATATATCTTTTGATGAATCAAAGTTTTTAATTTTAATAT
AACCCTTTATCAGGTATGTATTATAAACATCGACTCTGTGGCTTGCATTTTCATTCTCCTTATATATCTTTTGATGAATCAAAGTTTTTAATTTGAATAT
>chr1 chr1-5 153186634 +
GTCTTGACTCTTTATCCACTTTGCCAGTCTGTGTCTTGTAATTGGGGCATTTAGCCTATTTACATTTAAGGTTAATATTGTTATGTGTGAATTTGATCCT
GTCTTGACTCTTTATCCACTTTGCCAGTCTGTGTCTTGTAATTGGGGCATTTAGCCTATTTACATTTAAGGTTAATATTGTTATGTGTGAATTTGATCCT
>chr1 chr1-4 127714516 -
AGTGGAAATAATACTCGTCAACATATGCCTTTCAAAAAAATTTTTTTTCATATTTTAAATTTACCTTTACTACCTATTTATTTGGTTCAAGGCTCCATTT
AGTGGAAATAATACTCGTCAACATATGCCTTTCAAAAAAATTTTTTTTCATATTTTAAATTTACCTTTACTACCTATTTATTTGGTTCAAGGCTCCATTT
>chr1 chr1-3 84685020 +
TATTATTAAAACTATAAATGGACCAATTAAACAAACGTGTCATGAGCCAAGGAATATAAACTAATTCTTTACACCTGAAGTCCTTTAAAATGCTTTAAAT
TATTATTAAAACTATAAATGGACCAATTAAACAAACGTGTCATGAGCCAAGGAATATAAACTAATTCTTTACACCTGAAGTCCTTTAAAATGATTTAATT
>chr1 chr1-2 62477841 +
TAGGAAAATGGAGAAACTTTAATATGAAATCTTCCTGTTTTTCACATTATGTTTAGATTGTTACAGCATAAAATTTCCAAAACATTGCAAAAAGTTTTAA
TAGGAAAATGGAGAAACTTTAATATGAAATCTTCCTGTTTTTCACATTATGTTTAGATTGTTACAGCATAAAATTTCAGAAACATTGCAAAAAGTTTTAA
>chr1 chr1-1 11355149 +
ATTTATTGGCTGTCTTTCAGGCACATTTTAGCTGTCATCCAACATTCTCAACCTTAGTCCCCTTCTCTGGGCTAAGGGGAGAATGATGGTCCTACCCCAG
ATTTATTGGCTGTCTTTCAGGCACATTTTAGCTGTCATCCAACATTCTCAACCTTAGTCCCCTTCTCTGGGCTAAGGGGAGAATGATGGTCCTACCCCAG