analyzeAlignments
Estimating sequence diversity from sequence alignments
BayesicSpace Namespace Reference

Classes

struct  AlignmentStatistics
 Collection of alignment statistics.
 
class  ParseFASTA
 FASTA alignment parser.
 

Functions

void parseCL (int &argc, char **argv, std::unordered_map< std::string, std::string > &cli)
 Command line parser.
 
void extractCLinfo (const std::unordered_map< std::string, std::string > &parsedCLI, std::unordered_map< std::string, int > &intVariables, std::unordered_map< std::string, std::string > &stringVariables)
 Extract parameters from parsed command line interface flags.
 
void saveDiversityTable (const std::vector< std::pair< size_t, std::vector< uint32_t > > > &diversityTable, std::fstream &outFile)
 Save the diversity table.
 
void saveUniqueSequences (const std::unordered_map< std::string, uint32_t > &uniqueSequences, const std::string &consensus, const std::string &fileType, std::fstream &outFile)
 Save unique sequences.
 
void saveUniqueSequences (const std::vector< std::pair< std::string, uint32_t > > &uniqueSequences, const std::string &consensus, const std::string &fileType, std::fstream &outFile)
 Save sorted unique sequences.
 
void saveUniqueSequences (const std::unordered_map< std::string, uint32_t > &uniqueSequences, const std::string &consensus, const AlignmentStatistics &alignStats, const std::string &query, const std::string &fileType, std::fstream &outFile)
 Save unique sequences with query.
 
void saveUniqueSequences (const std::vector< std::pair< std::string, uint32_t > > &uniqueSequences, const std::string &consensus, const AlignmentStatistics &alignStats, const std::string &query, const std::string &fileType, std::fstream &outFile)
 Save sorted unique sequences with query.
 

Function Documentation

◆ extractCLinfo()

void BayesicSpace::extractCLinfo ( const std::unordered_map< std::string, std::string > &  parsedCLI,
std::unordered_map< std::string, int > &  intVariables,
std::unordered_map< std::string, std::string > &  stringVariables 
)

Extract parameters from parsed command line interface flags.

Extracts the variable values needed by main(), indexed by variable names encoded as std::string.

Parameters
[in]   parsedCLI        flag values parsed from the command line
[out]  intVariables     indexed int variables for use by main()
[out]  stringVariables  indexed std::string variables for use by main()
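
The split into int and std::string variables described above can be illustrated with a minimal sketch. It is not the BayesicSpace implementation: the helper name sketchExtractCLinfo and the extra intNames parameter (the list of flags to treat as integers) are assumptions made for illustration, since this page does not say how extractCLinfo() decides which flags are numeric.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper, not part of BayesicSpace.
// Assumption: the caller supplies the names of the flags that hold integers.
void sketchExtractCLinfo(const std::unordered_map<std::string, std::string> &parsedCLI,
                         const std::vector<std::string> &intNames,
                         std::unordered_map<std::string, int> &intVariables,
                         std::unordered_map<std::string, std::string> &stringVariables) {
    intVariables.clear();
    stringVariables.clear();
    for (const auto &flag : parsedCLI) {
        const bool isInt = std::find(intNames.cbegin(), intNames.cend(), flag.first) != intNames.cend();
        if (isInt) {
            // std::stoi throws std::invalid_argument if the text is not a number
            intVariables[flag.first] = std::stoi(flag.second);
        } else {
            stringVariables[flag.first] = flag.second;
        }
    }
    // Every parsed flag ends up in exactly one of the two output maps.
    assert(intVariables.size() + stringVariables.size() == parsedCLI.size());
}
```

Passing the integer flag names explicitly keeps the sketch free of guesses about the real flag set; the actual function may hard-code them or apply defaults.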

◆ parseCL()

void BayesicSpace::parseCL ( int &  argc,
char **  argv,
std::unordered_map< std::string, std::string > &  cli 
)

Command line parser.

Maps flags to values. Flags are assumed to have the form --flag-name value.

Parameters
[in]   argc  size of the argv array
[in]   argv  command line input array
[out]  cli   map of flags to values
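
The --flag-name value convention can be sketched as follows. This is a hedged illustration rather than the parseCL() source: the helper name sketchParseFlags is hypothetical, and two behaviors are assumptions not confirmed by this page, namely that keys are stored without the leading dashes and that a trailing flag with no value is ignored.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical helper, not the BayesicSpace::parseCL implementation.
std::unordered_map<std::string, std::string> sketchParseFlags(int argc, const char *const *argv) {
    assert( (argc == 0) || (argv != nullptr) );
    std::unordered_map<std::string, std::string> cli;
    // argv[0] is the program name, so scanning starts at index 1.
    for (int i = 1; i + 1 < argc; ++i) {
        const std::string token(argv[i]);
        // Assumption: a flag is a token longer than two characters that starts with "--".
        if ( (token.size() > 2) && (token.compare(0, 2, "--") == 0) ) {
            cli[token.substr(2)] = argv[i + 1]; // assumption: key stored without the dashes
            ++i;                                 // skip the value just consumed
        }
    }
    return cli;
}
```

Returning the map by value is a choice made for the sketch; the documented function fills an output parameter instead.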

◆ saveDiversityTable()

void BayesicSpace::saveDiversityTable ( const std::vector< std::pair< size_t, std::vector< uint32_t > > > &  diversityTable,
std::fstream &  outFile 
)

Save the diversity table.

The output file has two columns: (1) the window start position, repeated for every unique sequence, and (2) the number of occurrences of each unique sequence.

Parameters
[in]      diversityTable  the diversity table data
[in,out]  outFile         output file stream
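
The two-column layout described above can be sketched as below. The names sketchSaveDiversityTable, sketchDiversityTableText, and the DiversityTable alias are hypothetical. The sketch assumes a tab separator and no header row, neither of which is stated on this page, and it writes to a generic std::ostream rather than the std::fstream the real function takes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Alias introduced for the sketch only; the element type matches the documented parameter.
using DiversityTable = std::vector<std::pair<std::size_t, std::vector<std::uint32_t>>>;

// Hypothetical helper: one output line per unique-sequence count,
// with the window start position repeated in the first column.
void sketchSaveDiversityTable(const DiversityTable &diversityTable, std::ostream &out) {
    assert(out.good());
    for (const auto &window : diversityTable) {
        for (const std::uint32_t count : window.second) {
            out << window.first << '\t' << count << '\n'; // assumption: tab-separated, no header
        }
    }
}

// Convenience wrapper that renders the table to a string, handy for inspection.
std::string sketchDiversityTableText(const DiversityTable &diversityTable) {
    std::ostringstream buffer;
    sketchSaveDiversityTable(diversityTable, buffer);
    return buffer.str();
}
```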

◆ saveUniqueSequences() [1/4]

void BayesicSpace::saveUniqueSequences ( const std::unordered_map< std::string, uint32_t > &  uniqueSequences,
const std::string &  consensus,
const AlignmentStatistics &  alignStats,
const std::string &  query,
const std::string &  fileType,
std::fstream &  outFile 
)

Save unique sequences with query.

Save unique sequences in an alignment window. In FASTA format, the number of times each sequence appears in the alignment is given in the header. In TAB format, each sequence and its number of occurrences are on the same line, separated by a tab. The query sequence is displayed on the top line; it may differ in length from the other sequences if there are insertions or deletions. The consensus is displayed on the second line, marked by "C" in the TAB format. The start position and length of the window are also included: they are described explicitly in the consensus FASTA header, or included with a "|" delimiter in the TAB format. Nucleotides identical to the consensus are displayed as '.'; only the differing residues are shown.

Parameters
[in]      uniqueSequences  table of unique sequences and their counts
[in]      consensus        consensus sequence for the window
[in]      alignStats       alignment statistics
[in]      query            query sequence
[in]      fileType         TAB or FASTA, otherwise throws
[in,out]  outFile          output stream

◆ saveUniqueSequences() [2/4]

void BayesicSpace::saveUniqueSequences ( const std::unordered_map< std::string, uint32_t > &  uniqueSequences,
const std::string &  consensus,
const std::string &  fileType,
std::fstream &  outFile 
)

Save unique sequences.

Save unique sequences in an alignment window. In FASTA format, the number of times each sequence appears in the alignment is given in the header. In TAB format, each sequence and its number of occurrences are on the same line, separated by a tab. The consensus is displayed on the top line. Nucleotides identical to the consensus are displayed as '.'; only the differing residues are shown.

Parameters
[in]      uniqueSequences  table of unique sequences and their counts
[in]      consensus        consensus sequence for the window
[in]      fileType         TAB or FASTA, otherwise throws
[in,out]  outFile          output stream
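
The '.' masking against the consensus can be sketched in a few lines. The helper name sketchMaskAgainstConsensus is hypothetical, and the sketch assumes that the sequence and the consensus have the same length and that the comparison is case-sensitive; the library's actual handling of either case is not documented here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of BayesicSpace: replaces every residue that
// matches the consensus with '.', leaving only the differing residues visible.
std::string sketchMaskAgainstConsensus(const std::string &sequence, const std::string &consensus) {
    assert(sequence.size() == consensus.size()); // assumption: equal-length, aligned inputs
    std::string masked(sequence);
    for (std::size_t i = 0; i < masked.size(); ++i) {
        if (sequence[i] == consensus[i]) {
            masked[i] = '.';
        }
    }
    return masked;
}
```

For example, masking ACCTA against the consensus ACGTA leaves only the third residue visible.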

◆ saveUniqueSequences() [3/4]

void BayesicSpace::saveUniqueSequences ( const std::vector< std::pair< std::string, uint32_t > > &  uniqueSequences,
const std::string &  consensus,
const AlignmentStatistics &  alignStats,
const std::string &  query,
const std::string &  fileType,
std::fstream &  outFile 
)

Save sorted unique sequences with query.

Save unique sequences in an alignment window. In FASTA format, the number of times each sequence appears in the alignment is given in the header. In TAB format, each sequence and its number of occurrences are on the same line, separated by a tab. The query sequence is displayed on the top line; it may differ in length from the other sequences if there are insertions or deletions. The consensus is displayed on the second line, marked by "C" in the TAB format. The start position and length of the window are also included: they are described explicitly in the consensus FASTA header, or included with a "|" delimiter in the TAB format. Nucleotides identical to the consensus are displayed as '.'; only the differing residues are shown. Sequences are sorted by the number of occurrences in descending order.

Parameters
[in]      uniqueSequences  table of unique sequences and their counts
[in]      consensus        consensus sequence for the window
[in]      alignStats       alignment statistics
[in]      query            query sequence
[in]      fileType         TAB or FASTA, otherwise throws
[in,out]  outFile          output stream

◆ saveUniqueSequences() [4/4]

void BayesicSpace::saveUniqueSequences ( const std::vector< std::pair< std::string, uint32_t > > &  uniqueSequences,
const std::string &  consensus,
const std::string &  fileType,
std::fstream &  outFile 
)

Save sorted unique sequences.

Save unique sequences in an alignment window. In FASTA format, the number of times each sequence appears in the alignment is given in the header. In TAB format, each sequence and its number of occurrences are on the same line, separated by a tab. The consensus is displayed on the top line. Nucleotides identical to the consensus are displayed as '.'; only the differing residues are shown. Sequences are sorted by the number of occurrences in descending order.

Parameters
[in]      uniqueSequences  table of unique sequences and their counts
[in]      consensus        consensus sequence for the window
[in]      fileType         TAB or FASTA, otherwise throws
[in,out]  outFile          output stream
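
The sorted overloads take a std::vector of sequence/count pairs ordered by descending count. One way a caller might build such a vector from the unordered_map form is sketched below. The helper name sketchSortByCount and the SeqCount alias are hypothetical, and breaking ties alphabetically by sequence is an assumption added to make the order deterministic; this page does not specify tie handling.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Alias introduced for the sketch only.
using SeqCount = std::pair<std::string, std::uint32_t>;

// Hypothetical helper, not part of BayesicSpace: copies the map entries into a
// vector and orders them by descending occurrence count.
std::vector<SeqCount> sketchSortByCount(const std::unordered_map<std::string, std::uint32_t> &uniqueSequences) {
    std::vector<SeqCount> sorted(uniqueSequences.cbegin(), uniqueSequences.cend());
    std::sort(sorted.begin(), sorted.end(), [](const SeqCount &a, const SeqCount &b) {
        if (a.second != b.second) {
            return a.second > b.second; // larger counts first
        }
        return a.first < b.first;       // assumption: ties broken alphabetically
    });
    assert(sorted.size() == uniqueSequences.size());
    return sorted;
}
```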