Generative Adversarial Networks for DNA Storage Channel Simulator
  • Kang, Sanghoon
  • Gao, Yunfei
  • Jeong, Jaeho
  • Park, Seong-Joon
  • Kim, Jae-Won
  • and 6 others
Citations
Web of Science: 7
Scopus: 8

Abstract

DNA data storage systems have developed rapidly, with novel error-correcting techniques, random access algorithms, and query systems. However, designing an algorithm for DNA storage systems is challenging, mainly because the errors are unpredictable and experiments are extremely expensive. A simulator that can imitate the error statistics of a DNA storage system and replace the experiments during development is therefore of interest. We introduce novel generative adversarial networks that learn DNA storage channel statistics. Our simulator takes oligos (DNA sequences to write) as input and generates a FASTQ file that includes output DNA reads and quality scores, as if the oligos had been synthesized and sequenced. We trained the proposed simulator with data from a single experiment consisting of 14,400 input oligo strands and 12,108,573 output reads. The error statistics between the input and the output of the trained generator match the actual error statistics, including the error rate at each position, the number of errors for each nucleotide, and high-order statistics.
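
The two concrete artifacts the abstract mentions, a FASTQ record and the per-position error rate, can be illustrated with a minimal Python sketch. This is not the paper's GAN or its evaluation code: the function names are hypothetical, the quality encoding is assumed to be Phred+33, and reads are compared to the oligo position by position without alignment, so insertions and deletions are ignored.

```python
from collections import Counter


def fastq_record(read_id, seq, quals):
    """Format one FASTQ record (assumes Phred+33 quality encoding)."""
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    qual_str = "".join(chr(q + 33) for q in quals)
    return f"@{read_id}\n{seq}\n+\n{qual_str}\n"


def positional_mismatch_rate(oligo, reads):
    """Fraction of reads that differ from the oligo at each position.

    Simplification: no alignment is performed, so only reads with the
    same length as the oligo are counted.
    """
    same_len = [r for r in reads if len(r) == len(oligo)]
    if not same_len:
        return []
    return [
        sum(r[i] != base for r in same_len) / len(same_len)
        for i, base in enumerate(oligo)
    ]


def substitution_counts(oligo, reads):
    """Count (written base, read base) substitutions over equal-length reads."""
    counts = Counter()
    for r in reads:
        if len(r) != len(oligo):
            continue
        for written, read in zip(oligo, r):
            if written != read:
                counts[(written, read)] += 1
    return counts
```

For example, `positional_mismatch_rate("ACGT", ["ACGT", "ACGA", "TCGT"])` reports a nonzero rate only at the first and last positions. A realistic comparison would first align each read to its oligo so that insertion and deletion errors are also counted.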

Keywords

DNA; Generators; Sequential analysis; Generative adversarial networks; Transformers; Hidden Markov models; Error analysis; Recurrent neural networks; Channel simulator; DNA storage; generative adversarial networks; recurrent neural networks; transformer; DIGITAL INFORMATION; SEQUENCE; CHALLENGES; ROBUST
Title
Generative Adversarial Networks for DNA Storage Channel Simulator
Authors
Kang, Sanghoon; Gao, Yunfei; Jeong, Jaeho; Park, Seong-Joon; Kim, Jae-Won; No, Jong-Seon; Jeon, Hahyeon; Lee, Jeong Wook; Kim, Sunghwan; Park, Hosung; No, Albert
DOI
10.1109/ACCESS.2023.3235201
Publication date
2023-01
Type
Article
Journal
IEEE Access, vol. 11
Pages
3781-3793