vtphan/multigenome-sim
Tools used to generate / simulate multi-genomes, reads

==========================================================================
eval_alignment.py

This program evaluates aligned reads. A read can be aligned to multiple locations.

usage: eval_alignment.py [-h] [-g G] reads alignment

Evaluate aligned reads.

positional arguments:
  reads       file containing all reads.
  alignment   file containing aligned reads.

optional arguments:
  -h, --help  show this help message and exit
  -g G        Maximally allowable gap distance between positions of a read
              and aligned read. Default=20

Test data: reads.txt, alignment0.txt, alignment1.txt

==========================================================================
verify_reads.go

Usage: go run verify_reads.go -s genome.fasta -r reads.txt

genome.fasta is the genome from which the reads (reads.txt) are generated
(no mutation, only sequencing errors).

==========================================================================
generate_reads.go

This program generates reads given a sequence and a sequencing error rate.
A read can occur at multiple locations. Every read has at least one
A, C, T, or G; no read consists only of N's.

Usage:
  (1) generate index
      go run generate_reads.go -seq sequence_file
  (2) generate actual reads
      go run generate_reads.go -seq sequence_file -reads N -len M -erate E

Help: go run generate_reads.go --help
  -c=2: Coverage
  -debug=false: Turn on debug mode.
  -e=0.01: Error rate.
  -l=100: Read length.
  -s="": Specify a file containing the sequence.

Test data: reference.fasta

==========================================================================
generate_genomes.py

Given a genome, generate other genomes with SNPs (including indels).

Usage: generate_genomes.py [-h] [-m MUTATION_RATE] [-p FIRST_PROB]
                           [-i INDEL_FRAC] [-ie INDEL_EXT] [-n N] [--debug]
                           file_name

Generate multiple genomes based on a reference genome.

positional arguments:
  file_name             genome_file.fasta

optional arguments:
  -h, --help            show this help message and exit
  -m MUTATION_RATE, --mutation_rate MUTATION_RATE
                        Base mutation rate (default 0.001)
  -p FIRST_PROB, --first_prob FIRST_PROB
                        Probability of first base in a SNP profile
                        (default 0.75)
  -i INDEL_FRAC, --indel_frac INDEL_FRAC
                        Indel fraction of SNP (default 0.1111)
  -ie INDEL_EXT, --indel_ext INDEL_EXT
                        Indel extension probability (default 0.3)
  -n N                  Number of genomes (default 10)
  --debug               Turn on debug mode (default False)

Test data: reference.fasta

==========================================================================
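The gap criterion described for eval_alignment.py can be illustrated with a short sketch. This is not the code in eval_alignment.py. It assumes that each read has one or more true positions, that an aligner reports zero or more positions per read, and that an alignment counts as correct when some reported position lies within the gap distance (inclusive) of some true position. The names `is_hit` and `evaluate`, the dictionary inputs, and the three result categories are hypothetical; the actual formats of reads.txt and the alignment files are not specified here.

```python
from typing import Dict, Iterable, List

DEFAULT_MAX_GAP = 20  # mirrors the documented default of the -g option


def is_hit(true_positions: Iterable[int],
           aligned_positions: Iterable[int],
           max_gap: int = DEFAULT_MAX_GAP) -> bool:
    """Return True if any aligned position is within max_gap of any true position.

    Assumption: the comparison is inclusive (distance <= max_gap).
    """
    true_list = list(true_positions)
    return any(abs(a - t) <= max_gap
               for a in aligned_positions
               for t in true_list)


def evaluate(truth: Dict[str, List[int]],
             aligned: Dict[str, List[int]],
             max_gap: int = DEFAULT_MAX_GAP) -> Dict[str, int]:
    """Count correct, incorrect, and unaligned reads (hypothetical categories)."""
    counts = {"correct": 0, "incorrect": 0, "unaligned": 0}
    for read_id, true_positions in truth.items():
        reported = aligned.get(read_id, [])
        if not reported:
            counts["unaligned"] += 1
        elif is_hit(true_positions, reported, max_gap):
            counts["correct"] += 1
        else:
            counts["incorrect"] += 1
    return counts


if __name__ == "__main__":
    # Toy data: r1 occurs at two locations, r3 is never aligned.
    truth = {"r1": [100, 5000], "r2": [250], "r3": [900]}
    aligned = {"r1": [5010], "r2": [400]}
    print(evaluate(truth, aligned))
```

Because a read can occur at multiple locations, the sketch accepts an alignment that matches any one of them. Raising `max_gap` makes the check more tolerant of small position shifts, such as those an indel might cause.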