Design of synthetic promoters for cyanobacteria with generative deep-learning model. 2023

Euijin Seo, Yun-Nam Choi, Ye Rim Shin, Donghyuk Kim, and Jeong Wook Lee
Department of Chemical Engineering, Pohang University of Science and Technology (POSTECH), 77 Cheongam-Ro, Nam-Gu, Pohang, Gyeongbuk 37673, Korea.

Deep generative models, which can approximate complex data distributions from large datasets, are widely used to analyze biological datasets. In particular, they can identify and unravel hidden traits encoded within complicated nucleotide sequences, allowing us to design genetic parts accurately. Here, we provide a generic deep-learning-based framework to design and evaluate synthetic promoters for cyanobacteria using generative models, which we validated with a cell-free transcription assay. We developed a deep generative model based on a variational autoencoder and a predictive model based on a convolutional neural network. Using native promoter sequences of the model unicellular cyanobacterium Synechocystis sp. PCC 6803 as a training dataset, we generated 10 000 synthetic promoter sequences and predicted their strengths. Position weight matrix and k-mer analyses confirmed that our model captured valid features of cyanobacterial promoters from the dataset. Furthermore, critical subregion identification analysis consistently revealed the importance of the -10 box sequence motif in cyanobacterial promoters. Finally, we validated in a cell-free transcription assay that a generated promoter sequence can efficiently drive transcription. This approach, combining in silico and in vitro studies, will provide a foundation for the rapid design and validation of synthetic promoters, especially for non-model organisms.
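The abstract names position weight matrix and k-mer analyses without detailing them. As an illustration only, the two analyses can be sketched in Python as below. This is a minimal sketch, not the authors' implementation: the function names, the pseudocount, the uniform background frequency of 0.25, and the toy sequences are all assumptions made for the example.

```python
from collections import Counter
import math

BASES = "ACGT"

def kmer_frequencies(seqs, k):
    """Relative frequency of every k-mer observed across the sequences."""
    counts = Counter(
        seq[i:i + k] for seq in seqs for i in range(len(seq) - k + 1)
    )
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}

def position_weight_matrix(seqs, pseudocount=1.0, background=0.25):
    """Log2-odds PWM for equal-length sequences, one dict per position.

    The pseudocount and uniform background are assumptions of this sketch.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    denom = len(seqs) + pseudocount * len(BASES)
    pwm = []
    for pos in range(length):
        column = Counter(s[pos] for s in seqs)
        pwm.append({
            b: math.log2(((column[b] + pseudocount) / denom) / background)
            for b in BASES
        })
    return pwm

# Hypothetical toy sequences resembling a -10 box; not data from the paper.
toy = ["TATAAT", "TATACT", "TAAAAT", "TATAAT"]
freqs = kmer_frequencies(toy, 2)
pwm = position_weight_matrix(toy)
# Highest-scoring base at each position gives a consensus sequence.
consensus = "".join(max(col, key=col.get) for col in pwm)
```

Comparing such k-mer frequencies and PWM columns between native and generated sequences is one plausible way to check whether a generative model has reproduced motifs such as the -10 box.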

UI | MeSH Term | Description | Entries
D011401 | Promoter Regions, Genetic | DNA sequences which are recognized (directly or indirectly) and bound by a DNA-dependent RNA polymerase during the initiation of transcription. Highly conserved sequences within the promoter include the Pribnow box in bacteria and the TATA BOX in eukaryotes. | rRNA Promoter,Early Promoters, Genetic,Late Promoters, Genetic,Middle Promoters, Genetic,Promoter Regions,Promoter, Genetic,Promotor Regions,Promotor, Genetic,Pseudopromoter, Genetic,Early Promoter, Genetic,Genetic Late Promoter,Genetic Middle Promoters,Genetic Promoter,Genetic Promoter Region,Genetic Promoter Regions,Genetic Promoters,Genetic Promotor,Genetic Promotors,Genetic Pseudopromoter,Genetic Pseudopromoters,Late Promoter, Genetic,Middle Promoter, Genetic,Promoter Region,Promoter Region, Genetic,Promoter, Genetic Early,Promoter, rRNA,Promoters, Genetic,Promoters, Genetic Middle,Promoters, rRNA,Promotor Region,Promotors, Genetic,Pseudopromoters, Genetic,Region, Genetic Promoter,Region, Promoter,Region, Promotor,Regions, Genetic Promoter,Regions, Promoter,Regions, Promotor,rRNA Promoters
D000077321 | Deep Learning | Supervised or unsupervised machine learning methods that use multiple layers of data representations generated by nonlinear transformations, instead of individual task-specific ALGORITHMS, to build and train neural network models. | Hierarchical Learning,Learning, Deep,Learning, Hierarchical
D016571 | Neural Networks, Computer | A computer architecture, implementable in either hardware or software, modeled after biological neural networks. Like the biological system in which the processing capability is a result of the interconnection strengths between arrays of nonlinear processing nodes, computerized neural networks, often called perceptrons or multilayer connectionist models, consist of neuron-like units. A homogeneous group of units makes up a layer. These networks are good at pattern recognition. They are adaptive, performing tasks by example, and thus are better for decision-making than are linear learning machines or cluster analysis. They do not require explicit programming. | Computational Neural Networks,Connectionist Models,Models, Neural Network,Neural Network Models,Neural Networks (Computer),Perceptrons,Computational Neural Network,Computer Neural Network,Computer Neural Networks,Connectionist Model,Model, Connectionist,Model, Neural Network,Models, Connectionist,Network Model, Neural,Network Models, Neural,Network, Computational Neural,Network, Computer Neural,Network, Neural (Computer),Networks, Computational Neural,Networks, Computer Neural,Networks, Neural (Computer),Neural Network (Computer),Neural Network Model,Neural Network, Computational,Neural Network, Computer,Neural Networks, Computational,Perceptron
D046939 | Synechocystis | A form-genus of unicellular CYANOBACTERIA in the order Chroococcales. None of the strains fix NITROGEN, there are no gas vacuoles, and sheath layers are never produced. |
