Main

Standardizes residue ID numbering in PDB files.

`run_unify_numbering(pdb_path, output_path=None, reset_initial_resid=True)`

Unifies the atom and residue numbering within a PDB file, ensuring sequential and consistent IDs.

This function reads a PDB file and renumbers the atom serial numbers and residue IDs so that they are sequential, by default starting from 1 for the first atom and the first residue encountered. It also handles chain identifiers, incrementing them at each "TER" record if `reset_initial_resid` is `True`. Duplicate "TER" records are removed, and each "ENDMDL" record resets the atom numbering, the residue numbering, and the chain identifier.

PARAMETER	DESCRIPTION
`pdb_path`	The path to the input PDB file. TYPE: `str`
`output_path`	The path to save the new PDB file with unified numbering. If `None`, no file is written, and the modified PDB lines are returned. TYPE: `str | None` DEFAULT: `None`
`reset_initial_resid`	If `True` (default), the residue numbering will start from 1 for the first residue in each chain. If `False`, the initial residue ID will be based on the original numbering in the PDB file for the first chain, and subsequent chains will continue sequentially. TYPE: `bool` DEFAULT: `True`

RETURNS	DESCRIPTION
`Iterable[str]`	An iterable of strings, where each string is a line from the PDB file with the unified atom and residue numbering.

RAISES	DESCRIPTION
`FileNotFoundError`	If the specified `pdb_path` does not exist.
`IOError`	If there is an error reading the PDB file or writing to the output file.

Notes

The function iterates through the PDB lines, tracking the current residue and chain IDs.
When a "TER" record is encountered, it marks the end of a chain, and the chain ID is incremented if `reset_initial_resid` is `True`.
"ENDMDL" records mark the end of a model, so all numbering is reset before the next model begins.
Atom serial numbers are simply incremented sequentially.
Residue IDs are unified within each chain, potentially resetting to 1 at the start of a new chain.
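
The steps in these notes can be illustrated with a simplified, self-contained sketch. It is not the implementation of `run_unify_numbering`: the name `renumber_sketch` is hypothetical, chain identifiers are left untouched, "TER" records are passed through without renumbering, and residue numbering is assumed to restart at 1 after every "TER" (the behavior described for `reset_initial_resid=True`). Column positions follow the fixed-width PDB format (atom serial in columns 7-11, chain ID in column 22, residue number in columns 23-26, insertion code in column 27).

```python
from typing import Iterable, Iterator


def renumber_sketch(lines: Iterable[str]) -> Iterator[str]:
    """Illustrative only: renumber atom serials and residue IDs sequentially."""
    atom_serial = 0
    new_resid = 0
    last_res_key = None  # (chain ID, original residue number + insertion code)
    prev_was_ter = False

    for line in lines:
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            prev_was_ter = False
            res_key = (line[21], line[22:27])
            if res_key != last_res_key:
                # A new residue starts whenever the original key changes.
                new_resid += 1
                last_res_key = res_key
            atom_serial += 1
            # Rewrite the serial and residue number; blank the insertion code.
            yield (
                f"{line[:6]}{atom_serial:>5}{line[11:22]}"
                f"{new_resid:>4} {line[27:]}"
            )
        elif record == "TER":
            if prev_was_ter:
                continue  # drop a duplicate TER record
            prev_was_ter = True
            # Assumption: residue numbering restarts for the next chain.
            new_resid = 0
            last_res_key = None
            yield line
        elif record == "ENDMDL":
            # End of a model: reset all counters for the next model.
            prev_was_ter = False
            atom_serial = 0
            new_resid = 0
            last_res_key = None
            yield line
        else:
            yield line
```

Because the sketch is a generator, it processes the file line by line and can be consumed lazily, in the same way as the `Iterable[str]` that the documented function returns.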

Examples:

To unify the numbering in "input.pdb" and save it to "output.pdb":

>>> unified_lines = run_unify_numbering("input.pdb", output_path="output.pdb")

To unify the numbering but keep the initial residue ID of the first chain:

>>> unified_lines = run_unify_numbering(
...     "input.pdb", output_path="output.pdb", reset_initial_resid=False
... )

To unify the numbering and only get the lines without saving to a file:

>>> unified_lines = run_unify_numbering("input.pdb")
>>> for line in unified_lines:
...     print(line.strip())
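
When working with the returned lines directly, it can help to extract just the numbering fields to confirm the result. The helper below is a hypothetical sketch (the name `extract_numbering` is not part of the library). It assumes standard fixed-width ATOM/HETATM records with decimal serial numbers, and it runs here on hand-written sample lines rather than on real output of `run_unify_numbering`.

```python
from typing import Iterable, List, Tuple


def extract_numbering(lines: Iterable[str]) -> List[Tuple[int, str, int]]:
    """Collect (atom serial, chain ID, residue ID) from ATOM/HETATM lines."""
    numbering = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            serial = int(line[6:11])   # columns 7-11
            chain = line[21]           # column 22
            resid = int(line[22:26])   # columns 23-26
            numbering.append((serial, chain, resid))
    return numbering


# Hand-written sample lines standing in for the function's output.
sample_lines = [
    "ATOM      1  N   ALA A   1      11.104   6.134  -6.504\n",
    "ATOM      2  CA  ALA A   1      11.639   6.071  -5.147\n",
    "ATOM      3  N   GLY A   2      12.866   5.329  -5.069\n",
    "TER\n",
]

print(extract_numbering(sample_lines))
# [(1, 'A', 1), (2, 'A', 1), (3, 'A', 2)]
```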
