flowcraft.templates.process_abricate module
Purpose
This module is intended to parse the results of Abricate for one or more samples.
Expected input
The following variables are expected whether using NextFlow or the main() executor.
abricate_files : Path to abricate output file.
- e.g.: 'abr_resfinder.tsv'
Generated output
None
Code documentation
-
class flowcraft.templates.process_abricate.Abricate(fls)
Bases: object
Main parser for Abricate output files.
This class parses one or more output files from Abricate, usually from different databases. In addition to the parsing methods, it also provides a flexible method to filter and re-format the content of the abricate files.
Parameters: - fls : list
List of paths to Abricate output files.
Methods
get_filter(*args, **kwargs): Wrapper of the iter_filter method that returns a list with results.
iter_filter(filters[, databases, fields, …]): General purpose filter iterator.
parse_files(fls): Public method for parsing abricate output files.
-
storage = None
dict: Main storage of Abricate's file content. Each entry corresponds to a single line and contains the keys:
- log_file: Name of the summary log file containing abricate results.
- infile: Input file of Abricate.
- reference: Reference of the query sequence.
- seq_range: Range of the query sequence in the database sequence.
- gene: AMR gene name.
- accession: The genomic source of the sequence.
- database: The database the sequence came from.
- coverage: Proportion of gene covered.
- identity: Proportion of exact nucleotide matches.
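To make the layout above concrete, a single storage entry might look like the following. This is only an illustration: every value is a made-up placeholder, and both the integer entry key and the tuple form of seq_range are assumptions that this documentation does not state.

```python
# Hypothetical storage content. All values are invented placeholders;
# the integer key and the (start, end) tuple are assumptions.
storage = {
    0: {
        "log_file": "abr_resfinder.tsv",
        "infile": "sample_1.fasta",
        "reference": "contig_1",
        "seq_range": (10, 500),
        "gene": "geneA",
        "accession": "ACC0001",
        "database": "resfinder",
        "coverage": 100.0,
        "identity": 99.5,
    }
}

# The nine keys listed in the documentation of the storage attribute.
expected_keys = {
    "log_file", "infile", "reference", "seq_range", "gene",
    "accession", "database", "coverage", "identity",
}
```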
-
parse_files(fls)
Public method for parsing abricate output files.
This method is called at class instantiation for the provided output files. Additional abricate output files can be added using this method after the class instantiation.
Parameters: - fls : list
List of paths to Abricate files
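The idea of parsing one file and appending its lines to a shared storage can be sketched as below. This is not the flowcraft implementation: parse_abricate is a hypothetical helper, and it assumes a tab-separated Abricate file whose header row contains the columns #FILE, SEQUENCE, START, END, GENE, %COVERAGE, %IDENTITY, DATABASE and ACCESSION.

```python
import csv
import io


def parse_abricate(handle, log_file, storage=None):
    """Append one dict per result line of an Abricate TSV to `storage`.

    Hypothetical helper; the column names are an assumption about the
    Abricate header row, not something stated in this documentation.
    """
    storage = {} if storage is None else storage
    next_key = len(storage)
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        storage[next_key] = {
            "log_file": log_file,
            "infile": row["#FILE"],
            "reference": row["SEQUENCE"],
            "seq_range": (int(row["START"]), int(row["END"])),
            "gene": row["GENE"],
            "accession": row["ACCESSION"],
            "database": row["DATABASE"],
            "coverage": float(row["%COVERAGE"]),
            "identity": float(row["%IDENTITY"]),
        }
        next_key += 1
    return storage


# Placeholder file content, built in memory for the example.
HEADER = ("#FILE\tSEQUENCE\tSTART\tEND\tGENE\t%COVERAGE\t%IDENTITY"
          "\tDATABASE\tACCESSION\n")
first = io.StringIO(
    HEADER + "sample.fasta\tcontig_1\t10\t500\tgeneA\t100.00\t99.50"
             "\tresfinder\tACC0001\n")
second = io.StringIO(
    HEADER + "sample.fasta\tcontig_2\t5\t90\trepA\t85.00\t97.00"
             "\tplasmidfinder\tACC0002\n")

storage = parse_abricate(first, "abr_resfinder.tsv")
# A later call with the same storage adds entries instead of replacing them.
storage = parse_abricate(second, "abr_plasmidfinder.tsv", storage)
```

Passing the existing storage into a second call mirrors the behaviour described above, where further files can be added after instantiation.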
-
iter_filter(filters, databases=None, fields=None, filter_behavior='and')
General purpose filter iterator.
This general filter iterator allows the filtering of entries based on one or more custom filters. Each filter must contain a key of the storage entries, a comparison operator, and the test value. For example, to filter out entries with coverage below 80:
my_filter = ["coverage", ">=", 80]
Filters should always be provided as a list of lists:
iter_filter([["coverage", ">=", 80]])
# or
my_filters = [["coverage", ">=", 80], ["identity", ">=", 50]]
iter_filter(my_filters)
As a convenience, a list of the desired databases can be directly specified using the databases argument, which will only report entries for the specified databases:
iter_filter(my_filters, databases=["plasmidfinder"])
By default, this method will yield the complete entry record. However, the returned fields can be specified using the fields option:
iter_filter(my_filters, fields=["reference", "coverage"])
Parameters: - filters : list
List of lists with the custom filters. Each list should have three elements: (1) the key from the entry to be compared; (2) the comparison operator; (3) the test value. Example: [["identity", ">", 80]].
- databases : list
List of databases that should be reported.
- fields : list
List of fields from each individual entry that are yielded.
- filter_behavior : str
Options: 'and', 'or'. Sets the behaviour of the filters when multiple filters have been provided. By default it is set to 'and', which means that an entry has to pass all filters. It can be set to 'or', in which case only one of the filters has to pass.
Yields: - dic : dict
Dictionary object containing an Abricate.storage entry that passed the filters.
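The filtering behaviour described for iter_filter can be approximated with a small standalone generator. This is a sketch under assumptions, not the flowcraft code: iter_filter_sketch and OPERATORS are hypothetical names, the set of supported operator strings is a guess, and the storage is passed in explicitly instead of being read from a class attribute.

```python
import operator

# Assumed set of comparison operators; the real method may support others.
OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
}


def iter_filter_sketch(storage, filters, databases=None, fields=None,
                       filter_behavior="and"):
    """Yield the storage entries that pass the given filters."""
    if filter_behavior not in ("and", "or"):
        raise ValueError("filter_behavior must be 'and' or 'or'")
    # 'and' requires every filter to pass, 'or' requires at least one.
    combine = all if filter_behavior == "and" else any

    for entry in storage.values():
        # Skip entries from databases that were not requested.
        if databases and entry["database"] not in databases:
            continue
        passed = combine(
            OPERATORS[op](entry[key], value) for key, op, value in filters
        )
        if not passed:
            continue
        if fields:
            # Report only the requested fields of the entry.
            yield {k: v for k, v in entry.items() if k in fields}
        else:
            yield entry


# Placeholder entries, reduced to the keys used in this example.
storage = {
    0: {"gene": "geneA", "database": "resfinder",
        "coverage": 100.0, "identity": 99.5},
    1: {"gene": "geneB", "database": "resfinder",
        "coverage": 60.0, "identity": 95.0},
    2: {"gene": "repC", "database": "plasmidfinder",
        "coverage": 85.0, "identity": 40.0},
}
my_filters = [["coverage", ">=", 80], ["identity", ">=", 50]]

strict = list(iter_filter_sketch(storage, my_filters))
loose = list(iter_filter_sketch(storage, my_filters, filter_behavior="or"))
subset = list(iter_filter_sketch(storage, [["coverage", ">=", 80]],
                                 databases=["plasmidfinder"],
                                 fields=["gene", "coverage"]))
```

Wrapping the generator in list(), as done for strict, loose and subset, is the same idea as the get_filter wrapper listed among the methods.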
-
class flowcraft.templates.process_abricate.AbricateReport(*args, **kwargs)
Bases: flowcraft.templates.process_abricate.Abricate
Report generator for single Abricate output files
This class is intended to parse an Abricate output file from a single sample and database and generate a JSON report for the report webpage.
Parameters: - fls : list
List of paths to Abricate output files.
- database : (optional) str
Name of the database for the current report. If not provided, it will be inferred based on the first entry of the Abricate file.
Methods
get_filter(*args, **kwargs): Wrapper of the iter_filter method that returns a list with results.
get_plot_data(): Generates the JSON report to plot the gene boxes.
get_table_data()
iter_filter(filters[, databases, fields, …]): General purpose filter iterator.
parse_files(fls): Public method for parsing abricate output files.
write_report_data(): Writes the JSON report to a json file.
-
get_plot_data()
Generates the JSON report to plot the gene boxes
Following the convention of the reports platform, this method returns a list of JSON/dict objects with the information about each entry in the abricate file. The information contained in this JSON is:
{contig_id: <str>, seqRange: [<int>, <int>], gene: <str>, accession: <str>, coverage: <float>, identity: <float> }
Note that the seqRange entry contains the position in the corresponding contig, not the absolute position in the whole assembly.
Returns: - json_dic : list
List of JSON/dict objects with the report data.
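As a rough sketch of how storage entries could be turned into the structure shown above: plot_data_sketch is a hypothetical function, and mapping the reference key to contig_id is an assumption made for this example, since the documentation does not say how contig_id is derived.

```python
import json


def plot_data_sketch(storage):
    """Build one JSON-serialisable dict per storage entry.

    Hypothetical function; taking contig_id from the 'reference' key
    is an assumption, not documented behaviour.
    """
    json_dic = []
    for entry in storage.values():
        start, end = entry["seq_range"]
        json_dic.append({
            "contig_id": entry["reference"],
            "seqRange": [start, end],
            "gene": entry["gene"],
            "accession": entry["accession"],
            "coverage": entry["coverage"],
            "identity": entry["identity"],
        })
    return json_dic


# Placeholder entry with invented values.
storage = {
    0: {
        "log_file": "abr_resfinder.tsv",
        "infile": "sample_1.fasta",
        "reference": "contig_1",
        "seq_range": (10, 500),
        "gene": "geneA",
        "accession": "ACC0001",
        "database": "resfinder",
        "coverage": 100.0,
        "identity": 99.5,
    }
}

report = plot_data_sketch(storage)
# The result contains only lists, strings and numbers, so it can be
# serialised directly.
serialised = json.dumps(report)
```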