Bio.SeqIO.FastaIO module

Bio.SeqIO support for the "fasta" (also known as FastA or Pearson) file format.

You are expected to use this module via the Bio.SeqIO functions.

Bio.SeqIO.FastaIO.SimpleFastaParser(handle)

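Iterate over Fasta records as string tuples.

Arguments

handle - input stream opened in text mode

For each record a tuple of two strings is returned: the FASTA title line (without the leading ">" character) and the sequence (with all whitespace removed). The title line is not split into an identifier (the first word) and a comment or description.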

>>> with open("Fasta/dups.fasta") as handle:
...     for values in SimpleFastaParser(handle):
...         print(values)
...
('alpha', 'ACGTA')
('beta', 'CGTC')
('gamma', 'CCGCC')
('alpha (again - this is a duplicate entry to test the indexing code)', 'ACGTA')
('delta', 'CGCGC')
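The tuple-based behaviour described above can be sketched in plain Python. This is a minimal illustration only, not Biopython's implementation: the helper name fasta_tuples is hypothetical, and the choice to ignore any text before the first ">" line is an assumption of the sketch.

```python
import io


def fasta_tuples(handle):
    """Yield (title, sequence) string tuples from a FASTA text stream.

    Hypothetical helper for illustration; not part of Biopython.
    """
    title = None
    chunks = []
    for line in handle:
        if line.startswith(">"):
            if title is not None:
                yield title, "".join(chunks)
            # Keep the whole title line, minus the leading ">".
            title = line[1:].rstrip()
            chunks = []
        elif title is not None:
            # Dropping every whitespace character also joins wrapped lines.
            chunks.append("".join(line.split()))
    if title is not None:
        yield title, "".join(chunks)


example = io.StringIO(">alpha first record\nACG TA\nCCG\n>beta\nCGTC\n")
for values in fasta_tuples(example):
    print(values)
```

With Biopython installed, SimpleFastaParser(handle) is the call to use in practice; the sketch only shows why the title arrives as one unsplit string.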

Bio.SeqIO.FastaIO.FastaTwoLineParser(handle)

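Iterate over no-wrapping Fasta records as string tuples.

Arguments

handle - input stream opened in text mode

Functionally the same as SimpleFastaParser, but with a strict interpretation of the FASTA format: exactly two lines per record, namely the greater-than-sign identifier with description, followed by the sequence with no line wrapping.

Any line wrapping raises an exception, as do excess blank lines (other than the special case of a zero-length sequence as the second line of a record).

Examples

This file uses two lines per FASTA record: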

>>> with open("Fasta/aster_no_wrap.pro") as handle:
...     for title, seq in FastaTwoLineParser(handle):
...         print("%s = %s..." % (title, seq[:3]))
...
gi|3298468|dbj|BAA31520.1| SAMIPF = GGH...

This equivalent file uses line wrapping:

>>> with open("Fasta/aster.pro") as handle:
...     for title, seq in FastaTwoLineParser(handle):
...         print("%s = %s..." % (title, seq[:3]))
...
Traceback (most recent call last):
   ...
ValueError: Expected FASTA record starting with '>' character. Perhaps this file is using FASTA line wrapping? Got: 'MTFGLVYTVYATAIDPKKGSLGTIAPIAIGFIVGANI'
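The strict two-line rule can be illustrated without any input files. The following is a minimal sketch under stated assumptions: two_line_tuples is a hypothetical helper, not Biopython's code, and its error messages are only modelled on the one shown above.

```python
import io


def two_line_tuples(handle):
    """Yield (title, sequence) tuples, insisting on exactly two lines per record.

    Hypothetical helper for illustration; not part of Biopython.
    """
    lines = iter(handle)
    for title_line in lines:
        if not title_line.startswith(">"):
            # A wrapped sequence line or a stray blank line ends up here.
            raise ValueError(
                "Expected FASTA record starting with '>' character. Got: %r"
                % title_line.rstrip("\n")
            )
        seq_line = next(lines, None)
        if seq_line is None:
            raise ValueError("Missing sequence line at end of file")
        if seq_line.startswith(">"):
            raise ValueError("Two '>' lines in a row. Missing sequence line?")
        # An empty second line is allowed: it is a zero-length sequence.
        yield title_line[1:].strip(), seq_line.strip()


strict = io.StringIO(">alpha\nACGTA\n>empty\n\n")
print(list(two_line_tuples(strict)))

wrapped = io.StringIO(">alpha\nACG\nTA\n")
try:
    list(two_line_tuples(wrapped))
except ValueError as error:
    print("rejected:", error)
```

The second input is rejected because its third line is a continuation of the sequence rather than a new ">" title line.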

class Bio.SeqIO.FastaIO.FastaIterator(source: IO[str] | PathLike | str | bytes, alphabet: None = None)

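Bases: SequenceIterator

Parser for Fasta files.

__init__(source: IO[str] | PathLike | str | bytes, alphabet: None = None) → None

Iterate over Fasta records as SeqRecord objects.

Arguments

source - input stream opened in text mode, or a path to a file
alphabet - optional alphabet, not used. Leave as None.

By default this acts like calling Bio.SeqIO.parse(handle, "fasta") with no custom handling of the title lines: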

>>> with open("Fasta/dups.fasta") as handle:
...     for record in FastaIterator(handle):
...         print(record.id)
...
alpha
beta
gamma
alpha
delta
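The repeated "alpha" above follows from the default title handling: the first word of the title line becomes the record's id and name, while the description keeps the full title line. A minimal sketch of that mapping, where title_to_fields is a hypothetical helper and not a Biopython function:

```python
def title_to_fields(title):
    """Map a FASTA title line to id, name and description values.

    Hypothetical helper for illustration; not part of Biopython.
    """
    parts = title.split(None, 1)
    # An empty title line gives an empty identifier.
    first_word = parts[0] if parts else ""
    return {"id": first_word, "name": first_word, "description": title}


fields = title_to_fields(
    "alpha (again - this is a duplicate entry to test the indexing code)"
)
print(fields["id"])
print(fields["description"])
```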

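If you want to modify the records before writing them, for example to change the ID of each record, you can use a generator function as follows: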

>>> def modify_records(records):
...     for record in records:
...         record.id = record.id.upper()
...         yield record
...
>>> with open('Fasta/dups.fasta') as handle:
...     for record in modify_records(FastaIterator(handle)):
...         print(record.id)
...
ALPHA
BETA
GAMMA
ALPHA
DELTA

parse(handle): Start parsing the file and return a SeqRecord generator.

iterate(handle): Parse the file and generate SeqRecord objects.

__abstractmethods__ = frozenset({})

__parameters__ = ()

class Bio.SeqIO.FastaIO.FastaTwoLineIterator(source)

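Bases: SequenceIterator

Parser for Fasta files with exactly two lines per record.

__init__(source)

Iterate over two-line Fasta records (as SeqRecord objects).

Arguments

source - input stream opened in text mode, or a path to a file

This uses a strict interpretation of the FASTA format, requiring exactly two lines per record (no line wrapping).

Only the default title to ID/name/description parsing offered by the relaxed FASTA parser is available.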

parse(handle): Start parsing the file and return a SeqRecord generator.

iterate(handle): Parse the file and generate SeqRecord objects.

__abstractmethods__ = frozenset({})

__parameters__ = ()

class Bio.SeqIO.FastaIO.FastaWriter(target, wrap=60, record2title=None)

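Bases: SequenceWriter

Class to write Fasta format files (OBSOLETE).

Please use the as_fasta function instead, or the top-level Bio.SeqIO.write() function with format="fasta".

__init__(target, wrap=60, record2title=None)

Create a Fasta writer (OBSOLETE).

Arguments

target - output stream opened in text mode, or a path to a file.
wrap - optional line length used to wrap sequence lines. By default the sequence is wrapped at 60 characters. Use zero (or None) for no wrapping, which gives a single long line for the sequence.
record2title - optional function returning the text to use for the title line of each record. By default a combination of record.id and record.description is used. If record.description starts with record.id, only record.description is used.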

You can either use:

# filename and myRecords are placeholders for your own output path and records
with open(filename, "w") as handle:
    writer = FastaWriter(handle)
    writer.write_file(myRecords)

Or, follow the sequential file writer system, for example:

handle = open(filename, "w")
writer = FastaWriter(handle)
writer.write_header()  # does nothing for Fasta files
# ...
# multiple writer.write_record() and/or writer.write_records() calls
# ...
writer.write_footer()  # does nothing for Fasta files
handle.close()

write_record(record): Write a single Fasta record to the file.
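The wrap and default title rules described for this writer can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: format_fasta_record is a hypothetical helper, and the sketch treats "description starts with the id" as "the first word of the description equals the id".

```python
def format_fasta_record(identifier, description, sequence, wrap=60):
    """Return one FASTA record as text, following the default title rule.

    Hypothetical helper for illustration; not part of Biopython.
    """
    # Assumption: compare the first word of the description with the id.
    if description and description.split(None, 1)[0] == identifier:
        title = description
    elif description:
        title = identifier + " " + description
    else:
        title = identifier
    lines = [">" + title]
    if wrap:
        # Split the sequence into chunks of at most `wrap` characters.
        lines.extend(sequence[i:i + wrap] for i in range(0, len(sequence), wrap))
    else:
        # Zero or None: one long line, even when the sequence is empty.
        lines.append(sequence)
    return "\n".join(lines) + "\n"


print(format_fasta_record("alpha", "alpha test record", "ACGTACGT", wrap=4), end="")
print(format_fasta_record("beta", "second record", "ACGTACGT", wrap=0), end="")
```

Note that with wrapping disabled, an empty sequence still produces an empty sequence line, which is the two-lines-per-record convention.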

class Bio.SeqIO.FastaIO.FastaTwoLineWriter(handle, record2title=None)

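Bases: FastaWriter

Class to write two-lines-per-record Fasta format files (OBSOLETE).

This means the sequence information is written without line wrapping, and a blank line is always written for an empty sequence.

Please use the as_fasta_2line function instead, or the top-level Bio.SeqIO.write() function with format="fasta-2line".

__init__(handle, record2title=None)

Create a two-lines-per-record Fasta writer (OBSOLETE).

Arguments

handle - handle to the output file, for example as returned by open(filename, "w").
record2title - optional function returning the text to use for the title line of each record. By default a combination of record.id and record.description is used. If record.description starts with record.id, only record.description is used.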

You can either use:

handle = open(filename, "w")
writer = FastaTwoLineWriter(handle)
writer.write_file(myRecords)
handle.close()

Or, follow the sequential file writer system, for example:

handle = open(filename, "w")
writer = FastaTwoLineWriter(handle)
writer.write_header()  # does nothing for Fasta files
# ...
# multiple writer.write_record() and/or writer.write_records() calls
# ...
writer.write_footer()  # does nothing for Fasta files
handle.close()

Bio.SeqIO.FastaIO.as_fasta(record)

Turn a SeqRecord into a FASTA formatted string.

This is used internally by the SeqRecord's .format("fasta") method and by the SeqIO.write(..., ..., "fasta") function.

Bio.SeqIO.FastaIO.as_fasta_2line(record)

Turn a SeqRecord into a two-line FASTA formatted string.

This is used internally by the SeqRecord's .format("fasta-2line") method and by the SeqIO.write(..., ..., "fasta-2line") function.
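As a short usage sketch, the two functions can be compared on one in-memory record. It assumes Biopython is installed; the record contents (the id "alpha", the description, and the 80-letter sequence) are made up for illustration.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqIO.FastaIO import as_fasta, as_fasta_2line
from Bio.SeqRecord import SeqRecord

# An illustrative 80-letter record, long enough to trigger wrapping.
record = SeqRecord(Seq("ACGT" * 20), id="alpha", description="example record")

wrapped = as_fasta(record)          # sequence wrapped at 60 characters
two_line = as_fasta_2line(record)   # sequence on a single line

# The top-level writer gives the same text as as_fasta for format "fasta".
buffer = StringIO()
count = SeqIO.write([record], buffer, "fasta")

print(wrapped, end="")
print(two_line, end="")
print(count, buffer.getvalue() == wrapped)
```

For new code, SeqIO.write() with "fasta" or "fasta-2line" is the replacement for the obsolete writer classes documented above.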