Names

Utilities for modifying lines in PDB files, including replacing atom and residue names.

This module provides a set of utility functions to manipulate the content of Protein Data Bank (PDB) files. It includes functions to perform general line modifications based on filtering criteria, as well as specific functions for replacing atom names and residue names within the ATOM and HETATM records of a PDB file. Additionally, it offers a function to standardize the atom names of water molecules.

`modify_lines(pdb_lines, fn_process, fn_args, fn_filter=None, include=None, exclude=None)`

General function to modify specific lines in a PDB file based on filtering.

This function iterates through a list of PDB lines and applies a processing function (fn_process) to lines that meet certain criteria defined by an optional filter function (fn_filter) and inclusion/exclusion lists.

PARAMETER	DESCRIPTION
`pdb_lines`	An iterable of strings, where each string represents a line from a PDB file. TYPE: `Iterable[str]`
`fn_process`	A callable function that takes a PDB line as its first argument, followed by the elements of `fn_args`, and returns a modified PDB line. This function is responsible for the actual modification of the line. TYPE: `Callable[[str, str, str, int \| None, int], str]`
`fn_args`	An iterable containing additional arguments to be passed to the `fn_process` function after the PDB line itself. TYPE: `Iterable[Any]`
`fn_filter`	An optional callable function that takes a PDB line as input and returns a string. This string is then used to check against the `include` and `exclude` lists. If `None`, all ATOM and HETATM lines are processed. TYPE: `Callable[[str], str] \| None` DEFAULT: `None`
`include`	An optional list of strings. If `fn_filter` is provided, only lines for which the result of `fn_filter` is present in this list will be processed by `fn_process`. TYPE: `list[str] \| None` DEFAULT: `None`
`exclude`	An optional list of strings. If `fn_filter` is provided, lines for which the result of `fn_filter` is present in this list will not be processed by `fn_process`. TYPE: `list[str] \| None` DEFAULT: `None`

RETURNS	DESCRIPTION
`list[str]`	A list of modified PDB lines. Lines that did not meet the filtering criteria or were not ATOM or HETATM records are returned unchanged.

Notes

The `fn_filter` function should extract a single piece of information from the PDB line (e.g., residue name or atom name) that can be compared against `include` and `exclude`.
If a filtered value appears in both `include` and `exclude`, `include` takes precedence and the line is processed. If only `exclude` is provided, lines whose filtered value appears in `exclude` are skipped and all other lines are processed.
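
The precedence rules in these notes can be sketched as a small predicate. This is only one possible reading, not the module's actual code: `should_process` is a hypothetical helper, and the assumption that a value missing from a provided `include` list is skipped is ours.

```python
from typing import Callable, List, Optional


def should_process(
    line: str,
    fn_filter: Optional[Callable[[str], str]] = None,
    include: Optional[List[str]] = None,
    exclude: Optional[List[str]] = None,
) -> bool:
    """Hypothetical helper: decide whether a line would be modified."""
    # Only coordinate records are candidates for modification.
    if not line.startswith(("ATOM", "HETATM")):
        return False
    # Without a filter, every ATOM/HETATM line is processed.
    if fn_filter is None:
        return True
    value = fn_filter(line)
    # Assumption: when include is given, membership in it decides the outcome,
    # so a value listed in both include and exclude is still processed.
    if include is not None:
        return value in include
    # With only an exclude list, listed values are skipped.
    if exclude is not None:
        return value not in exclude
    return True


def resname(line: str) -> str:
    # Assumes the residue name occupies columns 18-20 of the record.
    return line[17:20].strip()


gly = "ATOM      1  CA  GLY A   1       ..."
print(should_process(gly, resname, include=["GLY"], exclude=["GLY"]))
print(should_process(gly, resname, exclude=["GLY"]))
```

Returning a plain boolean keeps the filtering decision separate from the line editing, so either part can be tested on its own.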

Examples:

To replace "CA" atom names with "CB" only in residues named "GLY":

>>> pdb_lines = [
...     "ATOM      1  CA  GLY A   1       ...",
...     "ATOM      2  CB  ALA A   2       ...",
... ]
>>> def get_resname(line):
...     return parse_resname(line).strip()
>>> modified = modify_lines(
...     pdb_lines,
...     replace_in_pdb_line,
...     ("CA ", "CB ", 13, 17),
...     fn_filter=get_resname,
...     include=["GLY"],
... )
>>> for line in modified:
...     print(line)
ATOM      1  CB  GLY A   1       ...
ATOM      2  CB  ALA A   2       ...
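
The example above relies on `replace_in_pdb_line`, which is not documented in this section. As a rough illustration of what a compatible `fn_process` could look like, the sketch below defines a hypothetical `replace_in_columns` with the same argument shape (line, original text, new text, start, stop) and a hypothetical `apply_to_records` loop. The 0-based, end-exclusive column convention is an assumption of this sketch.

```python
from typing import Any, Callable, Iterable, List, Optional

RECORDS = ("ATOM", "HETATM")


def replace_in_columns(
    line: str, orig: str, new: str, start: Optional[int], stop: int
) -> str:
    """Hypothetical fn_process: replace `orig` with `new` inside line[start:stop]."""
    begin = 0 if start is None else start
    field = line[begin:stop]
    if orig not in field:
        return line
    # Pad or truncate so the fixed-width layout of the record is preserved.
    width = len(field)
    new_field = field.replace(orig, new, 1).ljust(width)[:width]
    return line[:begin] + new_field + line[stop:]


def apply_to_records(
    lines: Iterable[str], fn_process: Callable[..., str], fn_args: Iterable[Any]
) -> List[str]:
    """Call fn_process(line, *fn_args) on ATOM/HETATM lines; keep the rest as-is."""
    args = tuple(fn_args)
    return [
        fn_process(line, *args) if line.startswith(RECORDS) else line
        for line in lines
    ]


lines = [
    "REMARK   CA is mentioned here but this is not a coordinate record",
    "ATOM      1  CA  GLY A   1       ...",
]
for out in apply_to_records(lines, replace_in_columns, ("CA ", "CB ", 12, 16)):
    print(out)
```

Restricting the replacement to a column window avoids touching text that merely looks like an atom name elsewhere in the line.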

`replace_atom_names(pdb_lines, orig_atom_name, new_atom_name)`

Replaces all occurrences of a specified original atom name with a new atom name in a list of PDB lines.

This function iterates through the provided PDB lines and, for each ATOM or HETATM record, it checks if the atom name matches the orig_atom_name. If it does, the atom name is replaced with the new_atom_name. The atom names are stripped of leading/trailing whitespace and left-justified to a length of 4 characters to ensure proper formatting in the PDB file.

PARAMETER	DESCRIPTION
`pdb_lines`	An iterable of strings, where each string represents a line from a PDB file. TYPE: `Iterable[str]`
`orig_atom_name`	The original atom name to be replaced. TYPE: `str`
`new_atom_name`	The new atom name to replace the original one. TYPE: `str`

RETURNS	DESCRIPTION
`list[str]`	A list of PDB lines with the specified atom names replaced.

Examples:

>>> pdb_lines = [
...     "ATOM      1  CA  ALA A   1       ...",
...     "ATOM      2  CB  ALA A   1       ...",
... ]
>>> modified_lines = replace_atom_names(pdb_lines, "CA", "CB")
>>> for line in modified_lines:
...     print(line)
ATOM      1  CB  ALA A   1       ...
ATOM      2  CB  ALA A   1       ...
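
The stripping and left-justification described for this function can be illustrated in isolation. The helpers `pad_name` and `field_matches` below are hypothetical and are not part of the module; they only show why normalizing to a fixed width keeps a short name such as "C" from matching a longer one such as "CA".

```python
def pad_name(name: str, width: int = 4) -> str:
    """Strip surrounding whitespace and left-justify to a fixed width."""
    return name.strip().ljust(width)


def field_matches(field: str, name: str) -> bool:
    """Compare a fixed-width field with a name after normalizing both."""
    width = len(field)
    return pad_name(field, width) == pad_name(name, width)


print(repr(pad_name("CA")))
print(field_matches(" CA ", "CA"))
print(field_matches(" CA ", "C"))
```

Comparing padded values rather than using a substring test is the design choice being sketched here; whether the module compares names in exactly this way is not stated in this section.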

`replace_residue_names(pdb_lines, orig_resname, new_resname, fn_filter=None, include=None, exclude=None)`

Replaces all occurrences of a specified original residue name with a new residue name in a list of PDB lines.

This function iterates through the provided PDB lines and, for each ATOM or HETATM record, it checks if the residue name matches the orig_resname. If it does, the residue name is replaced with the new_resname. The residue names are stripped of leading/trailing whitespace and left-justified to a length of 4 characters to ensure proper formatting in the PDB file. Optionally, a filter function and inclusion/exclusion lists can be used to control which lines are processed.

PARAMETER	DESCRIPTION
`pdb_lines`	An iterable of strings, where each string represents a line from a PDB file. TYPE: `Iterable[str]`
`orig_resname`	The original residue name to be replaced. TYPE: `str`
`new_resname`	The new residue name to replace the original one. TYPE: `str`
`fn_filter`	An optional callable function that takes a PDB line as input and returns a string (e.g., residue ID) for filtering. TYPE: `Callable[[str], str] \| None` DEFAULT: `None`
`include`	An optional list of strings. Only lines where the result of `fn_filter` is in this list will have their residue names replaced. TYPE: `list[str] \| None` DEFAULT: `None`
`exclude`	An optional list of strings. Lines where the result of `fn_filter` is in this list will not have their residue names replaced. TYPE: `list[str] \| None` DEFAULT: `None`

RETURNS	DESCRIPTION
`list[str]`	A list of PDB lines with the specified residue names replaced, subject to any provided filtering.

Examples:

To replace all "MET" residues with "ALA":

>>> pdb_lines = [
...     "ATOM      1  N   MET A   1       ...",
...     "ATOM      2  CA  MET A   1       ...",
... ]
>>> modified_lines = replace_residue_names(pdb_lines, "MET", "ALA")
>>> for line in modified_lines:
...     print(line)
ATOM      1  N   ALA A   1       ...
ATOM      2  CA  ALA A   1       ...

To replace "MET" with "ALA" only in residue ID "1":

>>> pdb_lines = [
...     "ATOM      1  N   MET A   1       ...",
...     "ATOM      2  CA  MET A   1       ...",
...     "ATOM      3  C   MET A   2       ...",
... ]
>>> def get_resid(line):
...     return parse_resid(line).strip()
>>> modified_lines = replace_residue_names(
...     pdb_lines, "MET", "ALA", fn_filter=get_resid, include=["1"]
... )
>>> for line in modified_lines:
...     print(line)
ATOM      1  N   ALA A   1       ...
ATOM      2  CA  ALA A   1       ...
ATOM      3  C   MET A   2       ...
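
The filter in the example above calls `parse_resid`, which is not defined in this section. A minimal stand-in, assuming the standard fixed-column PDB layout (chain identifier in column 22, residue sequence number in columns 23-26), could look like the sketch below. Both function names are hypothetical. The second one shows that a filter may return any string key, here a combined chain and residue ID.

```python
def resid_from_columns(line: str) -> str:
    """Hypothetical filter: residue sequence number from columns 23-26."""
    return line[22:26].strip()


def chain_and_resid(line: str) -> str:
    """Hypothetical filter: key such as 'A:1' built from chain ID and residue number."""
    return f"{line[21:22]}:{line[22:26].strip()}"


pdb_lines = [
    "ATOM      1  N   MET A   1       ...",
    "ATOM      2  CA  MET A   1       ...",
    "ATOM      3  C   MET A   2       ...",
]
for line in pdb_lines:
    print(resid_from_columns(line), chain_and_resid(line))
```

A combined key is useful when the same residue number occurs in several chains, since `include=["A:1"]` would then select only chain A.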

`run_replace_resnames(pdb_path, resname_map, output_path=None, fn_filter=None, include=None, exclude=None)`

Replaces multiple residue names in a PDB file based on a provided mapping.

This function reads a PDB file, iterates through a dictionary that maps original residue names to new residue names, and applies the replacement using the replace_residue_names function for each mapping. The modified PDB lines are then either returned or written to a new file. Optional filtering based on a function and inclusion/exclusion lists can be applied during the replacement process for each residue name in the map.

PARAMETER	DESCRIPTION
`pdb_path`	The path to the input PDB file. TYPE: `str`
`resname_map`	A dictionary where the keys are the original residue names to be replaced, and the values are the corresponding new residue names. TYPE: `dict[str, str]`
`output_path`	The path to save the new PDB file with the replaced residue names. If `None`, no file is written, and the modified PDB lines are returned. TYPE: `str \| None` DEFAULT: `None`
`fn_filter`	An optional callable function that takes a PDB line as input and returns a string for filtering during each residue name replacement. TYPE: `Callable[[str], str] \| None` DEFAULT: `None`
`include`	An optional list of strings. Only lines where the result of `fn_filter` is in this list will have their residue names replaced for each mapping in `resname_map`. TYPE: `list[str] \| None` DEFAULT: `None`
`exclude`	An optional list of strings. Lines where the result of `fn_filter` is in this list will not have their residue names replaced for each mapping in `resname_map`. TYPE: `list[str] \| None` DEFAULT: `None`

RETURNS	DESCRIPTION
`list[str]`	A list of PDB lines with the residue names replaced according to the `resname_map`, subject to any provided filtering.

RAISES	DESCRIPTION
`FileNotFoundError`	If the specified `pdb_path` does not exist.
`IOError`	If there is an error reading the PDB file or writing to the output file.

Examples:

To replace all "MET" residues with "ALA" and all "GLU" residues with "ASP" in "input.pdb" and save the result to "output.pdb":

>>> resname_mapping = {"MET": "ALA", "GLU": "ASP"}
>>> modified_lines = run_replace_resnames(
...     "input.pdb", resname_mapping, output_path="output.pdb"
... )

To perform the same replacement but only for residues with ID "1":

>>> def get_resid(line):
...     return parse_resid(line).strip()
>>> resname_mapping = {"MET": "ALA", "GLU": "ASP"}
>>> modified_lines = run_replace_resnames(
...     "input.pdb",
...     resname_mapping,
...     output_path="filtered_output.pdb",
...     fn_filter=get_resid,
...     include=["1"],
... )
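
Because the description says the map is applied one entry at a time, a map whose values also appear as keys could cascade. The sketch below illustrates that possibility with hypothetical stand-ins (`rename_residues`, `apply_resname_map`, `apply_resname_map_once`). It assumes a four-character residue field starting at column 18, and it is not a statement about how the module itself behaves.

```python
from typing import Dict, Iterable, List

RECORDS = ("ATOM", "HETATM")


def rename_residues(lines: Iterable[str], orig: str, new: str) -> List[str]:
    """Stand-in for a single residue-name replacement pass."""
    out = []
    for line in lines:
        if line.startswith(RECORDS) and line[17:21].strip() == orig.strip():
            line = line[:17] + new.strip().ljust(4)[:4] + line[21:]
        out.append(line)
    return out


def apply_resname_map(lines: Iterable[str], resname_map: Dict[str, str]) -> List[str]:
    """Apply each mapping in insertion order, one full pass per entry."""
    result = list(lines)
    for orig, new in resname_map.items():
        result = rename_residues(result, orig, new)
    return result


def apply_resname_map_once(lines: Iterable[str], resname_map: Dict[str, str]) -> List[str]:
    """Alternative: look up each line's original name once, so renames cannot chain."""
    out = []
    for line in lines:
        name = line[17:21].strip()
        if line.startswith(RECORDS) and name in resname_map:
            line = line[:17] + resname_map[name].ljust(4)[:4] + line[21:]
        out.append(line)
    return out


lines = [
    "ATOM      1  N   MET A   1       ...",
    "ATOM      2  N   ALA A   2       ...",
]
chained = {"MET": "ALA", "ALA": "GLY"}
print(apply_resname_map(lines, chained))
print(apply_resname_map_once(lines, chained))
```

With sequential passes, the first residue in this sketch ends up as GLY rather than ALA, so maps with overlapping keys and values are worth checking before use.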

`run_unify_water_labels(pdb_path, atom_map=None, water_resname='WAT', water_atomnames=None, output_path=None)`

Ensures that water molecule atom names are consistently labeled as 'O', 'H1', and 'H2'.

This function processes a PDB file to standardize the atom names of water molecules. It identifies water residues based on the water_resname and then renames their atoms to 'O' for oxygen and 'H1' and 'H2' for the two hydrogen atoms. The hydrogen atoms are assigned 'H1' and 'H2' based on their sequential appearance within each water residue in the PDB file.

PARAMETER	DESCRIPTION
`pdb_path`	The path to the input PDB file. TYPE: `str`
`atom_map`	A dictionary mapping the standard water atom names ('O', 'H1', 'H2') to the desired names. If `None`, it defaults to `{'O': 'O', 'H1': 'H1', 'H2': 'H2'}`. This allows for customization of the output atom names if needed. TYPE: `dict[str, str] \| None` DEFAULT: `None`
`water_resname`	The residue name used to identify water molecules in the PDB file. TYPE: `str` DEFAULT: `'WAT'`
`water_atomnames`	A dictionary specifying the original atom names that should be considered as oxygen and hydrogen atoms of water. The keys should be 'O' and 'H', and the values should be iterables of possible atom names. If `None`, it defaults to `{'O': ['OW'], 'H': ['HW']}`. TYPE: `dict[str, Iterable[str]] \| None` DEFAULT: `None`
`output_path`	The path to save the new PDB file with the unified water atom labels. If `None`, no file is written, and the modified PDB lines are returned. TYPE: `str \| None` DEFAULT: `None`

RETURNS	DESCRIPTION
`Iterable[str]`	An iterable of PDB lines with the unified water atom labels.

RAISES	DESCRIPTION
`FileNotFoundError`	If the specified `pdb_path` does not exist.
`IOError`	If there is an error reading the PDB file or writing to the output file.

Warning

This function has not been thoroughly tested and might not handle all edge cases correctly. Use with caution.

Examples:

To unify water atom labels in "input.pdb" using the default settings and save to "unified_water.pdb":

>>> modified_lines = run_unify_water_labels(
...     "input.pdb", output_path="unified_water.pdb"
... )

To specify a different water residue name and atom name mapping:

>>> atom_mapping = {"O": "OXT", "H1": "HT1", "H2": "HT2"}
>>> original_water_names = {"O": ["SOL"], "H": ["HY"]}
>>> modified_lines = run_unify_water_labels(
...     "input.pdb",
...     atom_map=atom_mapping,
...     water_resname="SOL",
...     water_atomnames=original_water_names,
...     output_path="custom_water.pdb",
... )
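
The sequential hydrogen numbering described for this function can be sketched on a list of lines. Everything below is a hypothetical illustration, not the module's implementation: `unify_water_labels` is our own name, original atom names are matched by prefix (an assumption), residues are told apart by chain ID plus residue number, and the residue name is read from a four-character field starting at column 18.

```python
from typing import Dict, Iterable, List, Optional


def _atom_field(name: str) -> str:
    # Names shorter than four characters start in the second column of the field.
    return name[:4] if len(name) >= 4 else f" {name:<3}"


def unify_water_labels(
    lines: Iterable[str],
    atom_map: Optional[Dict[str, str]] = None,
    water_resname: str = "WAT",
    water_atomnames: Optional[Dict[str, Iterable[str]]] = None,
) -> List[str]:
    """Hypothetical sketch: relabel water atoms as O, H1, H2 in order of appearance."""
    atom_map = atom_map or {"O": "O", "H1": "H1", "H2": "H2"}
    names = water_atomnames or {"O": ["OW"], "H": ["HW"]}
    o_prefixes, h_prefixes = tuple(names["O"]), tuple(names["H"])
    out, current, h_count = [], None, 0
    for line in lines:
        is_record = line.startswith(("ATOM", "HETATM"))
        if not is_record or line[17:21].strip() != water_resname:
            out.append(line)
            continue
        key = (line[21:22], line[22:26])  # chain ID and residue number
        if key != current:
            current, h_count = key, 0  # new water residue: restart numbering
        atom = line[12:16].strip()
        if atom.startswith(o_prefixes):
            label = "O"
        elif atom.startswith(h_prefixes):
            h_count += 1
            label = f"H{h_count}"
        else:
            out.append(line)  # unrecognized atom name: leave untouched
            continue
        label = atom_map.get(label, label)
        out.append(line[:12] + _atom_field(label) + line[16:])
    return out


water = [
    "HETATM    1  OW  WAT A   1       ...",
    "HETATM    2  HW1 WAT A   1       ...",
    "HETATM    3  HW2 WAT A   1       ...",
]
for line in unify_water_labels(water):
    print(line)
```

This sketch does not guard against residues with more than two matching hydrogens; a third one would simply be labeled from the counter, which is one of the edge cases a production version would need to handle.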
