API Reference

This section provides detailed documentation for the classes and methods available in the redeem_properties package.

redeem_properties

Python bindings for the redeem-properties Rust crate, exposing peptide property prediction models for retention time (RT), collisional cross-section (CCS), and MS2 fragment intensities.

All three model classes delegate inference to the compiled Rust extension (_lib). The predict_df convenience methods additionally return results as a pandas or polars DataFrame.

Quick start

>>> import redeem_properties as rp
>>>
>>> # Download pretrained models (only needed once)
>>> rp.download_pretrained_models()
>>>
>>> rt_model  = rp.RTModel.from_pretrained("rt")
>>> ccs_model = rp.CCSModel.from_pretrained("ccs")
>>> ms2_model = rp.MS2Model.from_pretrained("ms2")
>>>
>>> # numpy arrays / list[dict]
>>> rt_values   = rt_model.predict(["PEPTIDE", "SEQU[+42.0106]ENCE"])
>>> ccs_results = ccs_model.predict(["PEPTIDE"], charges=[2])
>>> ms2_results = ms2_model.predict(["PEPTIDE"], charges=[2], nces=[20])
>>>
>>> # pandas DataFrames
>>> rt_df  = rt_model.predict_df(["PEPTIDE", "SEQU[+42.0106]ENCE"])
>>> ccs_df = ccs_model.predict_df(["PEPTIDE"], charges=[2])
>>> ms2_df = ms2_model.predict_df(["PEPTIDE"], charges=[2], nces=[20])

class redeem_properties.CCSModel(model_path: str, arch: str, constants_path: str | None = None, use_cuda: bool = False)

Bases: object

Collisional cross-section prediction model.

Parameters:
  • model_path – Path to the .pth model weights file.

  • arch – Model architecture string (e.g. "ccs_cnn_lstm").

  • constants_path – Path to the .yaml constants file. Required for this model, even though the signature defaults to None.

  • use_cuda – Whether to run inference on GPU. Default False.

classmethod from_pretrained(name: str, use_cuda: bool = False) → CCSModel

Load a CCSModel from the shipped pretrained weights.

Accepted name values (case-insensitive): "ccs", "alphapeptdeep-ccs-cnn-lstm", "redeem-ccs-cnn-tf".

param_count() → int

Return total number of parameters in the loaded model (if available).

predict(peptides: list[str], charges: int | list[int])

Predict CCS values for a list of peptides.

Parameters:
  • peptides – List of peptide sequences (inline modifications supported).

  • charges – Charge state per peptide. If a single integer is provided, it is broadcast to all peptides. If a list of charges is provided and its length differs from the number of peptides, a Cartesian product is performed (predicting each peptide at each charge state).

Returns:

One dict per peptide with keys:

  • "ccs" – predicted CCS value (Ų).

  • "charge" – charge state used for the prediction.

Return type:

list[dict]
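
The charge-handling rules above can be sketched in plain Python. This is an illustrative reading of the documented behaviour, not the library's code: expand_charges is a hypothetical helper, and the peptide-major ordering of the Cartesian product is an assumption.

```python
from itertools import product


def expand_charges(peptides, charges):
    """Hypothetical helper: pair peptides with charges as the docs describe."""
    if isinstance(charges, int):
        # A single integer is broadcast to every peptide.
        return [(pep, charges) for pep in peptides]
    if len(charges) == len(peptides):
        # Equal lengths: charges are matched to peptides one-to-one.
        return list(zip(peptides, charges))
    # Different lengths: every peptide is paired with every charge.
    # Peptide-major ordering is an assumption of this sketch.
    return list(product(peptides, charges))


pairs = expand_charges(["PEPTIDE", "SEQUENCE"], [2, 3, 4])
print(len(pairs))  # 2 peptides x 3 charges
```

Under this reading, two peptides with three charges yield six predictions, while two peptides with two charges yield only two.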

predict_df(peptides: list[str], charges: int | list[int], annotate_mobility: bool = False, framework: str = 'pandas')

Predict CCS values and return the result as a DataFrame.

Parameters:
  • peptides – List of peptide sequences (inline modifications supported).

  • charges – Charge state per peptide. If a single integer is provided, it is broadcast to all peptides. If a list of charges is provided and its length differs from the number of peptides, a Cartesian product is performed (predicting each peptide at each charge state).

  • annotate_mobility – If True, compute and append an ion_mobility column converted from the predicted CCS value. Default False.

  • framework – 'pandas' (default) or 'polars'.

Returns:

Columns: peptide (str), ccs (float32), charge (int), and optionally ion_mobility (float).

Return type:

pandas.DataFrame or polars.DataFrame

summary() → str

Return a model summary string delegated to the Rust extension.

Prefers the pretty hierarchical summary when available.

class redeem_properties.MS2Model(model_path: str, arch: str, constants_path: str | None = None, use_cuda: bool = False)

Bases: object

MS2 fragment intensity prediction model.

Parameters:
  • model_path – Path to the .pth model weights file.

  • arch – Model architecture string (e.g. "ms2_bert").

  • constants_path – Path to the .yaml constants file. Required for this model, even though the signature defaults to None.

  • use_cuda – Whether to run inference on GPU. Default False.

classmethod from_pretrained(name: str, use_cuda: bool = False) → MS2Model

Load an MS2Model from the shipped pretrained weights.

Accepted name values (case-insensitive): "ms2", "alphapeptdeep-ms2-bert".

param_count() → int

Return total number of parameters in the loaded model (if available).

predict(peptides: list[str], charges: int | list[int], nces: int | float | list[int] | list[float], instruments: str | list[str | None] | None = None, multiplier: float = 10000.0)

Predict MS2 fragment intensities for a list of peptides.

Parameters:
  • peptides – List of peptide sequences (inline modifications supported).

  • charges – Charge state per peptide. If a single integer is provided, it is broadcast to all peptides. If a list of charges is provided and its length differs from the number of peptides, a Cartesian product is performed (predicting each peptide at each charge state).

  • nces – Normalized collision energy per peptide. Can be a single value (broadcast to all) or a list matching the expanded length.

  • instruments – Instrument name per peptide (optional). Can be a single string (broadcast to all) or a list matching the expanded length.

  • multiplier – Scalar by which predicted intensities are multiplied (default 10000.0). The default scales normalized outputs into a typical intensity range.

Returns:

One dict per peptide with keys:

  • "intensities" – 2-D float32 array (n_positions, 8).

  • "ion_types" – list of 8 ion-type strings.

  • "ion_charges" – list of 8 fragment charge integers.

  • "b_ordinals" – 1-D int array [1, …, n_positions].

  • "y_ordinals" – 1-D int array [n_positions, …, 1].

Return type:

list[dict]
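
The dict layout described above can be flattened into one row per fragment, which is the shape predict_df reports. The sketch below uses a mock result with made-up values. The ion-type labels, and the rule that b-series columns take b_ordinals while y-series columns take y_ordinals, are assumptions rather than documented behaviour.

```python
def to_long_rows(peptide, result, exclude_zeros=True):
    """Flatten one prediction dict into (peptide, ion_type, charge, ordinal, intensity) rows."""
    rows = []
    for pos, row in enumerate(result["intensities"]):
        for col, value in enumerate(row):
            ion_type = result["ion_types"][col]
            # Assumption: b-series columns use b_ordinals, y-series use y_ordinals.
            if ion_type.startswith("b"):
                ordinals = result["b_ordinals"]
            else:
                ordinals = result["y_ordinals"]
            if exclude_zeros and value == 0:
                continue
            rows.append((peptide, ion_type, result["ion_charges"][col], ordinals[pos], value))
    return rows


# Mock result with n_positions = 2; every value here is invented for illustration.
mock = {
    "intensities": [[0.5, 0.0, 0.9, 0.0, 0.0, 0.0, 0.0, 0.0],
                    [0.2, 0.0, 0.7, 0.1, 0.0, 0.0, 0.0, 0.0]],
    "ion_types": ["b", "b", "y", "y", "b_nl", "b_nl", "y_nl", "y_nl"],
    "ion_charges": [1, 2, 1, 2, 1, 2, 1, 2],
    "b_ordinals": [1, 2],
    "y_ordinals": [2, 1],
}
rows = to_long_rows("PEK", mock)
```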

predict_df(peptides: list[str], charges: int | list[int], nces: int | float | list[int] | list[float], instruments: str | list[str | None] | None = None, multiplier: float = 10000.0, exclude_zeros: bool = True, annotate_mz: bool = False, framework: str = 'pandas')

Predict MS2 fragment intensities and return a long-format DataFrame.

Each row represents one (peptide, ion_type, fragment_charge, ordinal) combination.

Parameters:
  • peptides – List of peptide sequences (inline modifications supported).

  • charges – Precursor charge state per peptide. If a single integer is provided, it is broadcast to all peptides. If a list of charges is provided and its length differs from the number of peptides, a Cartesian product is performed (predicting each peptide at each charge state).

  • nces – Normalized collision energy per peptide. Can be a single value (broadcast to all) or a list matching the expanded length.

  • instruments – Instrument name per peptide (optional). Can be a single string (broadcast to all) or a list matching the expanded length.

  • multiplier – Scalar by which predicted intensities are multiplied (default 10000.0). The default scales normalized outputs into a typical intensity range.

  • exclude_zeros – If True, drop rows whose predicted intensity is zero. Default True.

  • annotate_mz – If True, append a mz column with the theoretical monoisotopic m/z for each fragment ion (computed via rustyms). Neutral-loss ions (b_nl, y_nl) receive NaN. Default False.

  • framework – 'pandas' (default) or 'polars'.

Returns:

Columns: peptide, ion_type, fragment_charge, ordinal, intensity, and optionally mz.

Return type:

pandas.DataFrame or polars.DataFrame

Example

>>> df = ms2_model.predict_df(
...     ["AGHCEWQMKYR"],
...     charges=[2], nces=[20], instruments=["QE"],
... )
>>> df.head()
       peptide ion_type  fragment_charge  ordinal  intensity
0  AGHCEWQMKYR        b                1        1      0.123
1  AGHCEWQMKYR        b                2        1      0.045
...

summary() → str

Return a model summary string delegated to the Rust extension.

Prefers the pretty hierarchical summary when available.

class redeem_properties.PropertyPrediction(rt_model: RTModel | None = None, ccs_model: CCSModel | None = None, ms2_model: MS2Model | None = None, *, predict_rt: bool = True, predict_ccs: bool = True, predict_ms2: bool = True, use_cuda: bool = False)

Bases: object

Unified peptide property predictor combining RT, CCS, and MS2 models.

Each model is optional. By default the constructor loads the shipped pretrained weights for all three models. Pass predict_rt=False, predict_ccs=False, or predict_ms2=False to skip a model entirely; the columns of a skipped model are omitted from the output.

Parameters:
  • rt_model – An RTModel instance. When None (the default), the shipped pretrained RT model is loaded. Ignored when predict_rt is False.

  • ccs_model – A CCSModel instance. When None (the default), the shipped pretrained CCS model is loaded. Ignored when predict_ccs is False.

  • ms2_model – An MS2Model instance. When None (the default), the shipped pretrained MS2 model is loaded. Ignored when predict_ms2 is False.

  • predict_rt – Whether to include retention-time predictions. Default True.

  • predict_ccs – Whether to include CCS predictions. Default True.

  • predict_ms2 – Whether to include MS2 fragment-intensity predictions. Default True.

  • use_cuda – Forwarded to from_pretrained when constructing default models. Default False.

Examples

>>> import redeem_properties as rp
>>> prop = rp.PropertyPrediction()          # all three pretrained models
>>> df = prop.predict_df(
...     ["PEPTIDE", "AGHCEWQMKYR"],
...     charges=[2, 2], nces=[20, 20], instruments=["QE", "QE"],
... )
>>> df.columns.tolist()
['peptide', 'charge', 'nce', 'instrument', 'rt', 'ccs',
 'ion_type', 'fragment_charge', 'ordinal', 'intensity']

Only RT + CCS (skip MS2):

>>> prop = rp.PropertyPrediction(predict_ms2=False)
>>> df = prop.predict_df(["PEPTIDE"], charges=[2])

predict(peptides: list[str], charges: int | list[int] | None = None, nces: int | float | list[int] | list[float] | None = None, instruments: str | list[str | None] | None = None, multiplier: float = 10000.0) → dict

Run enabled models and return raw results in a dict.

Parameters:
  • peptides – List of peptide sequences (inline modifications supported).

  • charges – Charge state per peptide (required for CCS and MS2). If a single integer is provided, it is broadcast to all peptides. If a list of charges is provided and its length differs from the number of peptides, a Cartesian product is performed.

  • nces – Normalized collision energy per peptide (required for MS2). Can be a single value (broadcast to all) or a list matching the expanded length.

  • instruments – Instrument name per peptide (optional, used by MS2). Can be a single string (broadcast to all) or a list matching the expanded length.

  • multiplier – Scalar applied to MS2 predicted intensities (default 10000.0).

Returns:

Keys that may be present: "rt" (1-D ndarray), "ccs" (list[dict]), "ms2" (list[dict]).

Return type:

dict

predict_df(peptides: list[str], charges: int | list[int] | None = None, nces: int | float | list[int] | list[float] | None = None, instruments: str | list[str | None] | None = None, multiplier: float = 10000.0, exclude_zeros: bool = True, annotate_mz: bool = True, annotate_mobility: bool = False, framework: str = 'pandas')

Predict all enabled properties and return a single long-format DataFrame.

When MS2 is enabled every fragment row is emitted; the scalar RT and CCS values are broadcast (repeated) across those rows so that each row is fully self-contained.

When MS2 is disabled the DataFrame contains one row per peptide with only the scalar columns that are enabled.

Parameters:
  • peptides – List of peptide sequences (inline modifications supported).

  • charges – Charge state per peptide (required for CCS and MS2). If a single integer is provided, it is broadcast to all peptides. If a list of charges is provided and its length differs from the number of peptides, a Cartesian product is performed.

  • nces – Normalized collision energy per peptide (required for MS2). Can be a single value (broadcast to all) or a list matching the expanded length.

  • instruments – Instrument name per peptide (optional, used by MS2). Can be a single string (broadcast to all) or a list matching the expanded length.

  • multiplier – Scalar applied to MS2 predicted intensities (default 10000.0).

  • exclude_zeros – If True, individual zero-intensity fragment rows are dropped.

  • annotate_mz – If True (default), compute and add m/z columns. When MS2 is enabled a precursor_mz column and a per-fragment mz column are added. When MS2 is disabled only precursor_mz is added. Requires charges to be provided.

  • annotate_mobility – If True, compute and append an ion_mobility column converted from the predicted CCS value. Requires charges and the CCS model to be enabled. Default False.

  • framework – 'pandas' (default) or 'polars'.

Returns:

Possible columns (depending on which models are enabled): peptide, charge, nce, instrument, rt, ccs, ion_mobility, precursor_mz, ion_type, fragment_charge, ordinal, intensity, mz.

Return type:

pandas.DataFrame or polars.DataFrame
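
The row layout described for this method (scalars repeated on every fragment row when MS2 is enabled, one row per peptide otherwise) can be illustrated with plain dicts. combine is a hypothetical helper and all values are invented; the real method builds a DataFrame and may order or name things differently.

```python
def combine(peptides, rt_values, ccs_values, fragments=None):
    """Hypothetical merge; fragments maps a peptide index to its fragment dicts."""
    rows = []
    for i, pep in enumerate(peptides):
        scalars = {"peptide": pep, "rt": rt_values[i], "ccs": ccs_values[i]}
        if fragments is None:
            # MS2 disabled: one row per peptide with scalar columns only.
            rows.append(scalars)
        else:
            # MS2 enabled: repeat the scalars on every fragment row.
            rows.extend({**scalars, **frag} for frag in fragments[i])
    return rows


frags = {
    0: [{"ion_type": "b", "ordinal": 1, "intensity": 10.0},
        {"ion_type": "y", "ordinal": 1, "intensity": 25.0}],
    1: [{"ion_type": "y", "ordinal": 2, "intensity": 5.0}],
}
long_rows = combine(["PEPTIDE", "SEQUENCE"], [12.5, 30.1], [350.0, 410.0], frags)
wide_rows = combine(["PEPTIDE", "SEQUENCE"], [12.5, 30.1], [350.0, 410.0])
```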

class redeem_properties.RTModel(model_path: str, arch: str, constants_path: str | None = None, use_cuda: bool = False)

Bases: object

Retention time prediction model.

Parameters:
  • model_path – Path to the .pth model weights file.

  • arch – Model architecture string (e.g. "rt_cnn_lstm").

  • constants_path – Optional path to the .yaml constants file.

  • use_cuda – Whether to run inference on GPU (requires CUDA build). Default False.

classmethod from_pretrained(name: str, use_cuda: bool = False) → RTModel

Load an RTModel from the shipped pretrained weights.

Accepted name values (case-insensitive): "rt", "alphapeptdeep-rt-cnn-lstm", "redeem-rt-cnn-tf".

Parameters:
  • name – Pretrained model identifier.

  • use_cuda – Whether to run inference on GPU. Default False.

param_count() → int

Return total number of parameters in the loaded model (if available).

This delegates to the compiled Rust extension. If the underlying extension does not expose a param_count method, an AttributeError is raised.

predict(peptides: list[str])

Predict retention times for a list of peptides.

Peptides may contain inline modification annotations ([+X.X] mass-shift or (UniMod:N) notation).

Parameters:

peptides – List of peptide sequences.

Returns:

1-D float32 array of predicted RT values, one per peptide.

Return type:

numpy.ndarray

predict_df(peptides: list[str], framework: str = 'pandas')

Predict retention times and return the result as a DataFrame.

Parameters:
  • peptides – List of peptide sequences (inline modifications supported).

  • framework – 'pandas' (default) or 'polars'.

Returns:

Columns: peptide (str), rt (float32).

Return type:

pandas.DataFrame or polars.DataFrame

summary() → str

Return a model summary string delegated to the Rust extension.

Prefers the detailed Rust-side summary when available and falls back to the compact one otherwise.

redeem_properties.ccs_to_mobility(ccs_value, charge, precursor_mz)

Convert CCS to ion mobility for Bruker (timsTOF) instruments.
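
The conversion formula is not spelled out in this reference. One common choice for timsTOF data is the Mason-Schamp relation in the form used by AlphaPeptDeep-style tooling. The sketch below assumes nitrogen drift gas (28 Da) and the coefficient 1059.62245; whether redeem_properties uses the same constants is an assumption.

```python
import math

CCS_IM_COEF = 1059.62245  # assumed Mason-Schamp coefficient
GAS_MASS = 28.0           # assumed drift gas: N2, in Da


def ccs_to_mobility_sketch(ccs_value, charge, precursor_mz):
    """Convert CCS (Å²) to reduced ion mobility 1/K0 under the stated assumptions."""
    ion_mass = precursor_mz * charge
    reduced_mass = ion_mass * GAS_MASS / (ion_mass + GAS_MASS)
    return ccs_value * math.sqrt(reduced_mass) / (charge * CCS_IM_COEF)


one_over_k0 = ccs_to_mobility_sketch(400.0, 2, 600.0)
```

With these constants the mobility scales linearly with CCS and inversely with charge, so doubling the CCS at fixed charge and m/z doubles 1/K0.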

redeem_properties.compute_fragment_mzs(proforma_sequence, max_fragment_charge)

Compute theoretical product (fragment) ion m/z values for a peptide.

Generates b and y ion m/z values (without neutral losses) up to the specified maximum fragment charge.

Parameters:
  • proforma_sequence (str) – Peptide sequence in ProForma notation.

  • max_fragment_charge (int) – Maximum fragment ion charge state to generate.

Returns:

A dictionary with keys:

  • "ion_types" – list of str ("b" or "y").

  • "charges" – list of int.

  • "ordinals" – list of int (1-based series number).

  • "mzs" – list of float (monoisotopic m/z values).

Return type:

dict
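
For orientation, the standard b/y arithmetic can be written out for an unmodified sequence. This is a hand-rolled sketch, not the rustyms-backed implementation: fragment_mzs_sketch is a hypothetical name, modifications are not parsed, the residue table covers only the letters in the example, and the output ordering is an assumption.

```python
PROTON = 1.007276
WATER = 18.010565
# Monoisotopic residue masses (Da) for the residues used in the example.
RESIDUE = {"P": 97.052764, "E": 129.042593, "T": 101.047679,
           "I": 113.084064, "D": 115.026943}


def fragment_mzs_sketch(sequence, max_fragment_charge):
    """b/y m/z values for an unmodified sequence, keyed like the documented dict."""
    out = {"ion_types": [], "charges": [], "ordinals": [], "mzs": []}
    masses = [RESIDUE[aa] for aa in sequence]
    for z in range(1, max_fragment_charge + 1):
        for n in range(1, len(masses)):
            b_neutral = sum(masses[:n])           # N-terminal residues only
            y_neutral = sum(masses[-n:]) + WATER  # C-terminal residues plus water
            for ion, neutral in (("b", b_neutral), ("y", y_neutral)):
                out["ion_types"].append(ion)
                out["charges"].append(z)
                out["ordinals"].append(n)
                out["mzs"].append((neutral + z * PROTON) / z)
    return out


frags = fragment_mzs_sketch("PEPTIDE", 2)
```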

redeem_properties.compute_peptide_mz_info(proforma_sequence, charge, max_fragment_charge)

Compute both precursor and fragment m/z values for a peptide.

Parameters:
  • proforma_sequence (str) – Peptide sequence in ProForma notation.

  • charge (int) – Precursor charge state.

  • max_fragment_charge (int) – Maximum fragment ion charge state to generate.

Returns:

A dictionary with keys:

  • "precursor_mz" – float.

  • "ion_types" – list of str.

  • "charges" – list of int.

  • "ordinals" – list of int.

  • "mzs" – list of float.

Return type:

dict

redeem_properties.compute_precursor_mz(proforma_sequence, charge)

Compute the precursor m/z for a peptide in ProForma notation.

Parameters:
  • proforma_sequence (str) – Peptide sequence in ProForma notation (e.g., “PEPTM[+15.9949]IDE”).

  • charge (int) – Precursor charge state (must be > 0).

Returns:

The monoisotopic precursor m/z value.

Return type:

float
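
The monoisotopic precursor m/z follows from the neutral peptide mass M and the charge z as (M + z · 1.007276) / z. A minimal sketch for an unmodified sequence follows. precursor_mz_sketch is a hypothetical name, the residue table covers only the example, and ProForma modifications are not parsed here.

```python
PROTON = 1.007276
WATER = 18.010565
# Monoisotopic residue masses (Da) for the residues used in the example.
RESIDUE = {"P": 97.052764, "E": 129.042593, "T": 101.047679,
           "I": 113.084064, "D": 115.026943}


def precursor_mz_sketch(sequence, charge):
    """Monoisotopic m/z of an unmodified peptide at the given positive charge."""
    if charge <= 0:
        raise ValueError("charge must be > 0")
    neutral_mass = sum(RESIDUE[aa] for aa in sequence) + WATER
    return (neutral_mass + charge * PROTON) / charge


mz = precursor_mz_sketch("PEPTIDE", 2)
print(round(mz, 4))
```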

redeem_properties.download_pretrained_models()

Download pretrained models from the GitHub release and extract them locally.

This fetches the pretrained model archive from the redeem GitHub releases page, validates the download, and extracts the model files so they can be used by RTModel.from_pretrained(), CCSModel.from_pretrained(), and MS2Model.from_pretrained().

The models are downloaded to a stable user-local directory (or REDEEM_PRETRAINED_MODELS_DIR when set), so they are reusable across working directories. If the models already exist, this function returns immediately without re-downloading.

Returns:

The absolute path to the extracted pretrained models directory.

Return type:

str

Raises:

RuntimeError – If the download fails, the archive is corrupted, or extraction fails.

Example

>>> import redeem_properties as rp
>>> models_dir = rp.download_pretrained_models()
>>> print(f"Models available at: {models_dir}")
>>> rt_model = rp.RTModel.from_pretrained("rt")

redeem_properties.locate_pretrained(name)

Locate the absolute path to a pretrained model file on disk.

This function searches the pretrained model registry for the given model name. It checks the following locations in order:

  1. The directory specified by the REDEEM_PRETRAINED_MODELS_DIR environment variable.

  2. The user’s local data directory (e.g., ~/.local/share/redeem/pretrained_models/ on Linux).

  3. The data/pretrained_models/ directory in development locations.

Parameters:

name (str) – The identifier of the pretrained model (e.g., “rt”, “ccs”, “ms2”).

Returns:

The absolute path to the located model file.

Return type:

str

Raises:

RuntimeError – If the model name is invalid or the file cannot be found.
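
The documented search order can be sketched as a first-match scan over candidate directories. locate_sketch is a hypothetical stand-in: it takes a file name and explicit directories instead of consulting the model registry, and only the environment variable name comes from this reference.

```python
import os
from pathlib import Path


def locate_sketch(filename, user_dir, dev_dir):
    """Return the first existing candidate, mirroring the documented search order."""
    candidates = []
    env_dir = os.environ.get("REDEEM_PRETRAINED_MODELS_DIR")
    if env_dir:
        candidates.append(Path(env_dir))  # 1. environment override
    candidates.append(Path(user_dir))     # 2. user-local data directory
    candidates.append(Path(dev_dir))      # 3. development checkout
    for directory in candidates:
        path = directory / filename
        if path.is_file():
            return str(path.resolve())
    raise RuntimeError(f"pretrained model file not found: {filename}")
```

Because the environment variable is checked first, setting it overrides a copy that also exists in the user-local directory.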

redeem_properties.match_fragment_mzs(proforma_sequence, max_fragment_charge, predicted_ion_types, predicted_charges, predicted_ordinals)

Match theoretical product ion m/z values to predicted fragment annotations.

Given the predicted ion types, charges, and ordinals from the MS2 model, look up the corresponding theoretical m/z for each.

Parameters:
  • proforma_sequence (str) – Peptide sequence in ProForma notation.

  • max_fragment_charge (int) – Maximum fragment charge used to generate theoretical fragments.

  • predicted_ion_types (list of str) – Ion types from MS2 prediction (e.g., [“b”, “y”, “b”, “y”]).

  • predicted_charges (list of int) – Charges from MS2 prediction.

  • predicted_ordinals (list of int) – Ordinals (series numbers) from MS2 prediction.

Returns:

m/z values aligned with the predicted arrays. NaN if no match found.

Return type:

list of float
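
The matching step amounts to a keyed lookup on (ion type, charge, ordinal). The sketch below assumes the theoretical fragments arrive in the dict layout returned by compute_fragment_mzs; match_sketch is a hypothetical name and the m/z values are rounded illustrations.

```python
import math


def match_sketch(theoretical, ion_types, charges, ordinals):
    """Align theoretical m/z values to predicted annotations; NaN where unmatched."""
    keys = zip(theoretical["ion_types"], theoretical["charges"], theoretical["ordinals"])
    lookup = dict(zip(keys, theoretical["mzs"]))
    return [lookup.get(key, math.nan) for key in zip(ion_types, charges, ordinals)]


theoretical = {
    "ion_types": ["b", "y"],
    "charges": [1, 1],
    "ordinals": [1, 1],
    "mzs": [98.06, 148.06],
}
# A neutral-loss ion has no entry in the b/y table, so it maps to NaN.
matched = match_sketch(theoretical, ["y", "b_nl"], [1, 1], [1, 1])
```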