FASTA alignment parser.
#include <fastaParser.hpp>
Reads a FASTA alignment file, separates the sequences from their headers, and provides analysis methods. The entire alignment is held in memory, so users should make sure the input file fits in available RAM.
◆ ParseFASTA() [1/4]
BayesicSpace::ParseFASTA::ParseFASTA()  [default]
◆ ParseFASTA() [2/4]
ParseFASTA::ParseFASTA(const std::string & fastaFileName)
Constructor from FASTA file.
Read data from a FASTA file.
- Parameters
-
[in] | fastaFileName | input FASTA file name |
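As an illustration of what reading a FASTA file involves, here is a minimal, self-contained sketch that separates headers from sequences. It is not the class's implementation: `FastaRecords`, `readFasta`, and `readFastaText` are hypothetical names, and the sketch assumes that header lines start with `>` and that a sequence may span several lines.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container; the real class may store its data differently.
struct FastaRecords {
    std::vector<std::string> headers;
    std::vector<std::string> sequences;
};

// A line starting with '>' opens a record; the lines that follow
// are concatenated into that record's sequence.
FastaRecords readFasta(std::istream &input) {
    FastaRecords records;
    std::string line;
    while (std::getline(input, line)) {
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();  // tolerate Windows line endings
        }
        if (line.empty()) {
            continue;
        }
        if (line.front() == '>') {
            records.headers.push_back(line.substr(1));
            records.sequences.emplace_back();
        } else if (!records.sequences.empty()) {
            records.sequences.back() += line;
        }
        // Sequence lines that precede the first header are ignored.
    }
    // Invariant of this sketch: exactly one sequence per header.
    assert(records.headers.size() == records.sequences.size());
    return records;
}

// Convenience wrapper for in-memory text.
FastaRecords readFastaText(const std::string &text) {
    std::istringstream input(text);
    return readFasta(input);
}
```

Passing a `std::ifstream` opened on the file name to `readFasta` would give the file-based equivalent.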
◆ ParseFASTA() [3/4]
ParseFASTA::ParseFASTA(const ParseFASTA & toCopy)
Copy constructor.
- Parameters
-
[in] | toCopy | object to copy |
◆ ParseFASTA() [4/4]
Move constructor.
◆ ~ParseFASTA()
BayesicSpace::ParseFASTA::~ParseFASTA()  [default]
◆ alignmentLength()
size_t BayesicSpace::ParseFASTA::alignmentLength() const  [inline]
Alignment length.
- Returns
- alignment length
◆ diversityInWindows()
std::vector< std::pair< size_t, std::vector< uint32_t > > > ParseFASTA::diversityInWindows(const size_t & windowSize, const size_t & stepSize) const
Sequence diversity in windows.
Calculates the number of different sequences in a window sliding along a sequence alignment. Reports the number of times each unique sequence occurs at each window position.
- Parameters
-
[in] | windowSize | window size in base pairs |
[in] | stepSize | window movement steps in base pairs |
- Returns
- vector of pairs that contain window start positions and unique sequence counts
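The sliding-window calculation can be sketched with standard containers. This is an illustration, not the library's code: `slidingWindowDiversity` is a hypothetical free function, and the sketch assumes equal-length sequences, zero-based window starts, that a trailing partial window is dropped, and that the counts for each window are listed in descending order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: `alignment` holds equal-length aligned sequences.
std::vector<std::pair<std::size_t, std::vector<std::uint32_t>>>
slidingWindowDiversity(const std::vector<std::string> &alignment,
                       std::size_t windowSize, std::size_t stepSize) {
    assert(windowSize > 0 && stepSize > 0);
    std::vector<std::pair<std::size_t, std::vector<std::uint32_t>>> result;
    if (alignment.empty() || alignment.front().size() < windowSize) {
        return result;
    }
    const std::size_t length = alignment.front().size();
    for (std::size_t start = 0; start + windowSize <= length; start += stepSize) {
        // Count how often each distinct window sequence occurs.
        std::unordered_map<std::string, std::uint32_t> counts;
        for (const auto &sequence : alignment) {
            ++counts[sequence.substr(start, windowSize)];
        }
        // Keep only the occurrence counts, most frequent first.
        std::vector<std::uint32_t> occurrences;
        occurrences.reserve(counts.size());
        for (const auto &entry : counts) {
            occurrences.push_back(entry.second);
        }
        std::sort(occurrences.begin(), occurrences.end(),
                  std::greater<std::uint32_t>());
        result.emplace_back(start, std::move(occurrences));
    }
    return result;
}
```

The number of entries in each inner vector is the number of distinct sequences in that window.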
◆ extractConsensusWindow()
std::string ParseFASTA::extractConsensusWindow(const size_t & startIdx, const size_t & windowLength) const
Extract a consensus region.
Extract a window of the consensus sequence.
- Parameters
-
[in] | startIdx | index of the window start |
[in] | windowLength | number of nucleotides in the window |
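How the consensus is derived is not stated here. One common choice, shown below purely as an assumption, is a per-column majority vote. `consensusWindow` is a hypothetical helper; it assumes equal-length sequences, a zero-based `startIdx`, and that ties go to the symbol with the lowest character code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: majority-vote consensus over the columns
// [startIdx, startIdx + windowLength) of an alignment.
std::string consensusWindow(const std::vector<std::string> &alignment,
                            std::size_t startIdx, std::size_t windowLength) {
    assert(!alignment.empty());
    assert(startIdx + windowLength <= alignment.front().size());
    std::string consensus;
    consensus.reserve(windowLength);
    for (std::size_t column = startIdx; column < startIdx + windowLength; ++column) {
        // Tally every symbol that appears in this column.
        std::array<std::size_t, 256> counts{};
        for (const auto &sequence : alignment) {
            ++counts[static_cast<unsigned char>(sequence[column])];
        }
        // Pick the most frequent symbol; the first maximum wins ties.
        std::size_t best = 0;
        for (std::size_t symbol = 1; symbol < counts.size(); ++symbol) {
            if (counts[symbol] > counts[best]) {
                best = symbol;
            }
        }
        consensus.push_back(static_cast<char>(best));
    }
    return consensus;
}
```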
◆ extractSequence()
Extract a region matching a sequence.
Reports all unique sequences (and their counts) matching the query sequence. Matching is performed using striped Smith-Waterman alignment.
- Parameters
-
[in] | querySequence | the query sequence |
- Returns
- matching window start and length
◆ extractWindow()
std::unordered_map< std::string, uint32_t > ParseFASTA::extractWindow(const size_t & windowStartPosition, const size_t & windowSize) const
Extract an alignment window.
Calculates the number of different sequences in a window. Reports the number of times each unique sequence occurs in the provided window.
- Parameters
-
[in] | windowStartPosition | window start |
[in] | windowSize | window size in base pairs |
- Returns
- map from each unique sequence to the number of times it occurs in the window
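The counting step for a single window can be sketched as follows. `countWindowSequences` is a hypothetical stand-alone function, not the member itself, and it assumes a zero-based start position and a window that lies fully inside every sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: tally the distinct substrings that the aligned
// sequences show in the window [windowStartPosition, windowStartPosition + windowSize).
std::unordered_map<std::string, std::uint32_t>
countWindowSequences(const std::vector<std::string> &alignment,
                     std::size_t windowStartPosition, std::size_t windowSize) {
    std::unordered_map<std::string, std::uint32_t> counts;
    for (const auto &sequence : alignment) {
        // The window must fit inside each sequence.
        assert(windowStartPosition + windowSize <= sequence.size());
        ++counts[sequence.substr(windowStartPosition, windowSize)];
    }
    return counts;
}
```

The size of the returned map is the number of distinct sequences in the window, and the mapped values sum to the number of sequences in the alignment.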
◆ extractWindowSorted()
std::vector< std::pair< std::string, uint32_t > > ParseFASTA::extractWindowSorted(const size_t & windowStartPosition, const size_t & windowSize) const
Extract an alignment window and sort.
Calculates the number of different sequences in a window. Reports the number of times each unique sequence occurs in the provided window. The output is sorted by the number of times a sequence is present, in descending order.
- Parameters
-
[in] | windowStartPosition | window start |
[in] | windowSize | window size in base pairs |
- Returns
- vector of (sequence, count) pairs for the window, sorted by count in descending order
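Sorting per-sequence counts in descending order could look like the sketch below. `sortByCount` is a hypothetical helper, and breaking ties alphabetically is an assumption added here to make the output deterministic; the documented method does not say how ties are ordered.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using SequenceCount = std::pair<std::string, std::uint32_t>;

// Hypothetical helper: copy the map entries into a vector and order
// them from the most to the least frequent sequence.
std::vector<SequenceCount>
sortByCount(const std::unordered_map<std::string, std::uint32_t> &counts) {
    std::vector<SequenceCount> sorted(counts.begin(), counts.end());
    std::sort(sorted.begin(), sorted.end(),
              [](const SequenceCount &left, const SequenceCount &right) {
                  if (left.second != right.second) {
                      return left.second > right.second;  // higher count first
                  }
                  return left.first < right.first;  // assumed tie-break
              });
    assert(sorted.size() == counts.size());
    return sorted;
}
```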
◆ imputeMissing()
void ParseFASTA::imputeMissing()
Impute missing values.
Replaces missing or ambiguous nucleotides (N or other ambiguity codes, e.g. Y or S) with the consensus value.
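A minimal sketch of consensus-based imputation, under assumptions that may differ from the actual method: `imputeWithConsensus` is a hypothetical function, the consensus is the most frequent of the upper-case bases A, C, G, and T in each column, gap characters (`-`) are left untouched, and a column with no unambiguous base is left unchanged.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: replace every symbol that is neither A, C, G, T
// nor a gap with the most frequent unambiguous base in its column.
void imputeWithConsensus(std::vector<std::string> &alignment) {
    if (alignment.empty()) {
        return;
    }
    const std::string bases = "ACGT";
    const std::size_t length = alignment.front().size();
    for (std::size_t column = 0; column < length; ++column) {
        std::array<std::size_t, 4> counts{};
        for (const auto &sequence : alignment) {
            assert(sequence.size() == length);  // aligned sequences share a length
            const std::size_t baseIdx = bases.find(sequence[column]);
            if (baseIdx != std::string::npos) {
                ++counts[baseIdx];
            }
        }
        std::size_t best = 0;
        for (std::size_t idx = 1; idx < counts.size(); ++idx) {
            if (counts[idx] > counts[best]) {
                best = idx;
            }
        }
        if (counts[best] == 0) {
            continue;  // no unambiguous base to impute from
        }
        for (auto &sequence : alignment) {
            const char symbol = sequence[column];
            if (symbol != '-' && bases.find(symbol) == std::string::npos) {
                sequence[column] = bases[best];
            }
        }
    }
}
```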
◆ operator=() [1/2]
Copy assignment operator.
◆ operator=() [2/2]
Move assignment operator.
◆ sequenceNumber()
size_t BayesicSpace::ParseFASTA::sequenceNumber() const  [inline, noexcept]
Number of sequences in alignment.
- Returns
- number of sequences in the alignment