Converting FASTQ to FASTA with SED/AWK
Posted by neversaint on Stack Overflow
Published on 2009-10-09T07:22:51Z
I have data that always comes in blocks of four lines, in the following format (called FASTQ):
@SRR018006.2016 GA2:6:1:20:650 length=36
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGN
+SRR018006.2016 GA2:6:1:20:650 length=36
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!+!
@SRR018006.19405469 GA2:6:100:1793:611 length=36
ACCCGCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
+SRR018006.19405469 GA2:6:100:1793:611 length=36
7);;).;);;/;*.2>/@@7;@77<..;)58)5/>/
Is there a simple sed/awk/bash way to convert them into this format (called FASTA):
>SRR018006.2016 GA2:6:1:20:650 length=36
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGN
>SRR018006.19405469 GA2:6:100:1793:611 length=36
ACCCGCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
In principle, we want to extract the first two lines of each block of four and replace the leading @ on the header line with >.
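One possible way to do this with awk is sketched below. It assumes every record is exactly four lines (no wrapped sequences, no blank lines between records), and the function name fastq_to_fasta is only an illustrative choice, not part of any existing tool. Lines are selected by position rather than by matching a leading @, since a quality line may itself begin with @.

```shell
# Minimal sketch: convert FASTQ to FASTA, assuming strictly 4 lines per record.
# fastq_to_fasta is a hypothetical helper name chosen for this example.
fastq_to_fasta() {
    awk '
        NR % 4 == 1 { print ">" substr($0, 2) }  # header line: swap leading @ for >
        NR % 4 == 2 { print }                    # sequence line: keep unchanged
    ' "$@"                                       # reads the named files, or stdin if none
}

# Example usage (reads.fastq is a placeholder file name):
#   fastq_to_fasta reads.fastq > reads.fasta
```

With GNU sed, the same idea can probably be written as sed -n '1~4s/^@/>/p;2~4p', but the first~step address form is a GNU extension and may not work with other sed implementations.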