API Reference
This section provides detailed documentation for the classes and methods available in the redeem_properties package.
redeem_properties
Python bindings for the redeem-properties Rust crate, exposing peptide property prediction models for retention time (RT), collisional cross-section (CCS), and MS2 fragment intensities.
All three model classes delegate inference to the compiled Rust extension (_lib). The predict_df convenience methods return the same predictions as a pandas or polars DataFrame.
Quick start
>>> import redeem_properties as rp
>>>
>>> # Download pretrained models (only needed once)
>>> rp.download_pretrained_models()
>>>
>>> rt_model = rp.RTModel.from_pretrained("rt")
>>> ccs_model = rp.CCSModel.from_pretrained("ccs")
>>> ms2_model = rp.MS2Model.from_pretrained("ms2")
>>>
>>> # numpy arrays / list[dict]
>>> rt_values = rt_model.predict(["PEPTIDE", "SEQU[+42.0106]ENCE"])
>>> ccs_results = ccs_model.predict(["PEPTIDE"], charges=[2])
>>> ms2_results = ms2_model.predict(["PEPTIDE"], charges=[2], nces=[20])
>>>
>>> # pandas DataFrames
>>> rt_df = rt_model.predict_df(["PEPTIDE", "SEQU[+42.0106]ENCE"])
>>> ccs_df = ccs_model.predict_df(["PEPTIDE"], charges=[2])
>>> ms2_df = ms2_model.predict_df(["PEPTIDE"], charges=[2], nces=[20])
- class redeem_properties.CCSModel(model_path: str, arch: str, constants_path: str | None = None, use_cuda: bool = False)
Bases: object
Collisional cross-section prediction model.
- Parameters:
model_path – Path to the .pth model weights file.
arch – Model architecture string (e.g. "ccs_cnn_lstm").
constants_path – Path to the .yaml constants file (required).
use_cuda – Whether to run inference on GPU. Default False.
- classmethod from_pretrained(name: str, use_cuda: bool = False) -> CCSModel
Load a CCSModel from the shipped pretrained weights.
Accepted name values (case-insensitive): "ccs", "alphapeptdeep-ccs-cnn-lstm", "redeem-ccs-cnn-tf".
- predict(peptides: list[str], charges: int | list[int])
Predict CCS values for a list of peptides.
- Parameters:
peptides – List of peptide sequences (inline modifications supported).
charges – Charge state per peptide. If a single integer is provided, it is broadcast to all peptides. If a list of charges is provided and its length differs from the number of peptides, a Cartesian product is performed (predicting each peptide at each charge state).
- Returns:
One dict per peptide with keys:
"ccs" – predicted CCS value (Ų).
"charge" – charge state used for the prediction.
- Return type:
list[dict]
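The charge-handling rules above can be sketched in plain Python. This is a minimal illustration, not the package's implementation: expand_inputs is a hypothetical helper, and the peptide-major ordering of the Cartesian product is an assumption.

```python
from itertools import product


def expand_inputs(peptides, charges):
    """Pair peptides with charges following the documented rules.

    Hypothetical helper for illustration; not part of redeem_properties.
    """
    if isinstance(charges, int):
        # A single integer is broadcast to every peptide.
        return [(pep, charges) for pep in peptides]
    if len(charges) == len(peptides):
        # Equal lengths: one charge per peptide, paired positionally.
        return list(zip(peptides, charges))
    # Different lengths: every peptide at every charge state.
    # Peptide-major ordering is an assumption of this sketch.
    return list(product(peptides, charges))


pairs_broadcast = expand_inputs(["PEPTIDE", "SEQUENCE"], 2)
pairs_zipped = expand_inputs(["PEPTIDE", "SEQUENCE"], [2, 3])
pairs_product = expand_inputs(["PEPTIDE"], [2, 3, 4])
```

One consequence worth keeping in mind: a charge list whose length happens to equal the number of peptides is paired positionally, so it does not trigger the Cartesian product.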
- predict_df(peptides: list[str], charges: int | list[int], annotate_mobility: bool = False, framework: str = 'pandas')
Predict CCS values and return the result as a DataFrame.
- Parameters:
peptides – List of peptide sequences (inline modifications supported).
charges – Charge state per peptide. If a single integer is provided, it is broadcast to all peptides. If a list of charges is provided and its length differs from the number of peptides, a Cartesian product is performed (predicting each peptide at each charge state).
annotate_mobility – If True, compute and append an ion_mobility column converted from the predicted CCS value. Default False.
framework – 'pandas' (default) or 'polars'.
- Returns:
Columns: peptide (str), ccs (float32), charge (int), and optionally ion_mobility (float).
- Return type:
pandas.DataFrame or polars.DataFrame
- class redeem_properties.MS2Model(model_path: str, arch: str, constants_path: str | None = None, use_cuda: bool = False)
Bases: object
MS2 fragment intensity prediction model.
- Parameters:
model_path – Path to the .pth model weights file.
arch – Model architecture string (e.g. "ms2_bert").
constants_path – Path to the .yaml constants file (required).
use_cuda – Whether to run inference on GPU. Default False.
- classmethod from_pretrained(name: str, use_cuda: bool = False) -> MS2Model
Load an MS2Model from the shipped pretrained weights.
Accepted name values (case-insensitive): "ms2", "alphapeptdeep-ms2-bert".
- predict(peptides: list[str], charges: int | list[int], nces: int | float | list[int] | list[float], instruments: str | list[str | None] | None = None, multiplier: float = 10000.0)
Predict MS2 fragment intensities for a list of peptides.
- Parameters:
peptides – List of peptide sequences (inline modifications supported).
charges – Charge state per peptide. If a single integer is provided, it is broadcast to all peptides. If a list of charges is provided and its length differs from the number of peptides, a Cartesian product is performed (predicting each peptide at each charge state).
nces – Normalized collision energy per peptide. Can be a single value (broadcast to all) or a list matching the expanded length.
instruments – Instrument name per peptide (optional). Can be a single string (broadcast to all) or a list matching the expanded length.
multiplier – Scalar to multiply predicted intensities by (default 10000.0), e.g. to scale normalized outputs into typical intensity ranges.
- Returns:
One dict per peptide with keys:
"intensities" – 2-D float32 array (n_positions, 8).
"ion_types" – list of 8 ion-type strings.
"ion_charges" – list of 8 fragment charge integers.
"b_ordinals" – 1-D int array [1, …, n_positions].
"y_ordinals" – 1-D int array [n_positions, …, 1].
- Return type:
list[dict]
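To make the layout of one returned dict concrete, the sketch below walks a hand-written mock result and flattens it into long-format rows, similar in spirit to what predict_df reports. The mock values, the column order of ion_types, the helper name to_long_rows, and the rule that b-series columns read from b_ordinals while y-series columns read from y_ordinals are all assumptions for illustration. Nested lists stand in for the numpy arrays.

```python
def to_long_rows(peptide, result, exclude_zeros=True):
    """Flatten one predict()-style dict into long-format row dicts.

    Hypothetical helper; not part of redeem_properties.
    """
    rows = []
    for pos, position_row in enumerate(result["intensities"]):
        for col, intensity in enumerate(position_row):
            if exclude_zeros and intensity == 0.0:
                continue
            ion_type = result["ion_types"][col]
            # Assumption: b-series columns use b_ordinals, y-series use y_ordinals.
            ordinals = (
                result["b_ordinals"]
                if ion_type.startswith("b")
                else result["y_ordinals"]
            )
            rows.append(
                {
                    "peptide": peptide,
                    "ion_type": ion_type,
                    "fragment_charge": result["ion_charges"][col],
                    "ordinal": ordinals[pos],
                    "intensity": intensity,
                }
            )
    return rows


# Mock result for a 3-residue peptide (2 cleavage positions); values are made up.
mock_result = {
    "intensities": [
        [0.5, 0.0, 0.9, 0.0, 0.0, 0.0, 0.0, 0.0],
        [0.2, 0.0, 0.4, 0.1, 0.0, 0.0, 0.0, 0.0],
    ],
    "ion_types": ["b", "b", "y", "y", "b_nl", "b_nl", "y_nl", "y_nl"],
    "ion_charges": [1, 2, 1, 2, 1, 2, 1, 2],
    "b_ordinals": [1, 2],
    "y_ordinals": [2, 1],
}

rows = to_long_rows("ACK", mock_result)
all_rows = to_long_rows("ACK", mock_result, exclude_zeros=False)
```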
- predict_df(peptides: list[str], charges: int | list[int], nces: int | float | list[int] | list[float], instruments: str | list[str | None] | None = None, multiplier: float = 10000.0, exclude_zeros: bool = True, annotate_mz: bool = False, framework: str = 'pandas')
Predict MS2 fragment intensities and return a long-format DataFrame.
Each row represents one (peptide, ion_type, fragment_charge, ordinal) combination.
- Parameters:
peptides – List of peptide sequences (inline modifications supported).
charges – Precursor charge state per peptide. If a single integer is provided, it is broadcast to all peptides. If a list of charges is provided and its length differs from the number of peptides, a Cartesian product is performed (predicting each peptide at each charge state).
nces – Normalized collision energy per peptide. Can be a single value (broadcast to all) or a list matching the expanded length.
instruments – Instrument name per peptide (optional). Can be a single string (broadcast to all) or a list matching the expanded length.
multiplier – Scalar to multiply predicted intensities by (default 10000.0), e.g. to scale normalized outputs into typical intensity ranges.
exclude_zeros – If True, exclude rows where all predicted intensities are zero. Default True.
annotate_mz – If True, append an mz column with the theoretical monoisotopic m/z for each fragment ion (computed via rustyms). Neutral-loss ions (b_nl, y_nl) receive NaN. Default False.
framework – 'pandas' (default) or 'polars'.
- Returns:
Columns: peptide, ion_type, fragment_charge, ordinal, intensity, and optionally mz.
- Return type:
pandas.DataFrame or polars.DataFrame
Example
>>> df = ms2_model.predict_df(
...     ["AGHCEWQMKYR"],
...     charges=[2], nces=[20], instruments=["QE"],
... )
>>> df.head()
       peptide ion_type  fragment_charge  ordinal  intensity
0  AGHCEWQMKYR        b                1        1      0.123
1  AGHCEWQMKYR        b                2        1      0.045
...
- class redeem_properties.PropertyPrediction(rt_model: RTModel | None = None, ccs_model: CCSModel | None = None, ms2_model: MS2Model | None = None, *, predict_rt: bool = True, predict_ccs: bool = True, predict_ms2: bool = True, use_cuda: bool = False)
Bases: object
Unified peptide property predictor combining RT, CCS, and MS2 models.
Each model is optional. When a model is None its columns are omitted from the output. By default the constructor loads the shipped pretrained weights for all three models; pass predict_rt=False, predict_ccs=False, or predict_ms2=False to skip a model entirely.
- Parameters:
rt_model – An RTModel instance, or None to skip RT prediction. Ignored when predict_rt is False.
ccs_model – A CCSModel instance, or None to skip CCS prediction. Ignored when predict_ccs is False.
ms2_model – An MS2Model instance, or None to skip MS2 prediction. Ignored when predict_ms2 is False.
predict_rt – Whether to include retention-time predictions. Default True.
predict_ccs – Whether to include CCS predictions. Default True.
predict_ms2 – Whether to include MS2 fragment-intensity predictions. Default True.
use_cuda – Forwarded to from_pretrained when constructing default models. Default False.
Examples
>>> import redeem_properties as rp
>>> prop = rp.PropertyPrediction()  # all three pretrained models
>>> df = prop.predict_df(
...     ["PEPTIDE", "AGHCEWQMKYR"],
...     charges=[2, 2], nces=[20, 20], instruments=["QE", "QE"],
... )
>>> df.columns.tolist()
['peptide', 'charge', 'nce', 'instrument', 'rt', 'ccs', 'ion_type', 'fragment_charge', 'ordinal', 'intensity']
Only RT + CCS (skip MS2):
>>> prop = rp.PropertyPrediction(predict_ms2=False)
>>> df = prop.predict_df(["PEPTIDE"], charges=[2])
- predict(peptides: list[str], charges: int | list[int] | None = None, nces: int | float | list[int] | list[float] | None = None, instruments: str | list[str | None] | None = None, multiplier: float = 10000.0) -> dict
Run enabled models and return raw results in a dict.
- Parameters:
peptides – List of peptide sequences (inline modifications supported).
charges – Charge state per peptide (required for CCS and MS2). If a single integer is provided, it is broadcast to all peptides. If a list of charges is provided and its length differs from the number of peptides, a Cartesian product is performed.
nces – Normalized collision energy per peptide (required for MS2). Can be a single value (broadcast to all) or a list matching the expanded length.
instruments – Instrument name per peptide (optional, used by MS2). Can be a single string (broadcast to all) or a list matching the expanded length.
multiplier – Scalar applied to MS2 predicted intensities (default 10 000).
- Returns:
Keys that may be present: "rt" (1-D ndarray), "ccs" (list[dict]), "ms2" (list[dict]).
- Return type:
dict
- predict_df(peptides: list[str], charges: int | list[int] | None = None, nces: int | float | list[int] | list[float] | None = None, instruments: str | list[str | None] | None = None, multiplier: float = 10000.0, exclude_zeros: bool = True, annotate_mz: bool = True, annotate_mobility: bool = False, framework: str = 'pandas')
Predict all enabled properties and return a single long-format DataFrame.
When MS2 is enabled every fragment row is emitted; the scalar RT and CCS values are broadcast (repeated) across those rows so that each row is fully self-contained.
When MS2 is disabled the DataFrame contains one row per peptide with only the scalar columns that are enabled.
- Parameters:
peptides – List of peptide sequences (inline modifications supported).
charges – Charge state per peptide (required for CCS and MS2). If a single integer is provided, it is broadcast to all peptides. If a list of charges is provided and its length differs from the number of peptides, a Cartesian product is performed.
nces – Normalized collision energy per peptide (required for MS2). Can be a single value (broadcast to all) or a list matching the expanded length.
instruments – Instrument name per peptide (optional, used by MS2). Can be a single string (broadcast to all) or a list matching the expanded length.
multiplier – Scalar applied to MS2 predicted intensities (default 10 000).
exclude_zeros – If True, individual zero-intensity fragment rows are dropped.
annotate_mz – If True (default), compute and add m/z columns. When MS2 is enabled a precursor_mz column and a per-fragment mz column are added. When MS2 is disabled only precursor_mz is added. Requires charges to be provided.
annotate_mobility – If True, compute and append an ion_mobility column converted from the predicted CCS value. Requires charges and the CCS model to be enabled. Default False.
framework – 'pandas' (default) or 'polars'.
- Returns:
Possible columns (depending on which models are enabled): peptide, charge, nce, instrument, rt, ccs, ion_mobility, precursor_mz, ion_type, fragment_charge, ordinal, intensity, mz.
- Return type:
pandas.DataFrame or polars.DataFrame
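The broadcasting of scalar RT and CCS values across fragment rows can be pictured with plain dicts. This is only a sketch of the documented row layout: broadcast_scalars is a hypothetical helper, the numbers are made up, and it assumes one charge per peptide so that the peptide string alone identifies a precursor.

```python
def broadcast_scalars(peptides, rt_values, ccs_values, fragment_rows=None):
    """Attach per-peptide RT/CCS values to every fragment row.

    Hypothetical helper; not part of redeem_properties.
    """
    scalars = {
        pep: {"rt": rt, "ccs": ccs}
        for pep, rt, ccs in zip(peptides, rt_values, ccs_values)
    }
    if fragment_rows is None:
        # MS2 disabled: one row per peptide with only the scalar columns.
        return [{"peptide": pep, **scalars[pep]} for pep in peptides]
    # MS2 enabled: repeat the scalars on each fragment row of that peptide.
    return [{**row, **scalars[row["peptide"]]} for row in fragment_rows]


peptides = ["PEPTIDE", "ACK"]
rt_values = [12.5, 3.1]       # made-up RT predictions
ccs_values = [350.0, 280.0]   # made-up CCS predictions
fragment_rows = [
    {"peptide": "ACK", "ion_type": "b", "ordinal": 1, "intensity": 0.5},
    {"peptide": "ACK", "ion_type": "y", "ordinal": 2, "intensity": 0.9},
]

with_ms2 = broadcast_scalars(peptides, rt_values, ccs_values, fragment_rows)
without_ms2 = broadcast_scalars(peptides, rt_values, ccs_values)
```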
- class redeem_properties.RTModel(model_path: str, arch: str, constants_path: str | None = None, use_cuda: bool = False)
Bases: object
Retention time prediction model.
- Parameters:
model_path – Path to the .pth model weights file.
arch – Model architecture string (e.g. "rt_cnn_lstm").
constants_path – Optional path to the .yaml constants file.
use_cuda – Whether to run inference on GPU (requires CUDA build). Default False.
- classmethod from_pretrained(name: str, use_cuda: bool = False) -> RTModel
Load an RTModel from the shipped pretrained weights.
Accepted name values (case-insensitive): "rt", "alphapeptdeep-rt-cnn-lstm", "redeem-rt-cnn-tf".
- Parameters:
name – Pretrained model identifier.
use_cuda – Whether to run inference on GPU. Default False.
- param_count() -> int
Return the total number of parameters in the loaded model (if available).
This delegates to the compiled Rust extension when present. If the underlying extension does not expose a param_count method, an AttributeError is raised.
- predict(peptides: list[str])
Predict retention times for a list of peptides.
Peptides may contain inline modification annotations ([+X.X] mass-shift or (UniMod:N) notation).
- Parameters:
peptides – List of peptide sequences.
- Returns:
1-D float32 array of predicted RT values, one per peptide.
- Return type:
numpy.ndarray
- predict_df(peptides: list[str], framework: str = 'pandas')
Predict retention times and return the result as a DataFrame.
- Parameters:
peptides – List of peptide sequences (inline modifications supported).
framework – 'pandas' (default) or 'polars'.
- Returns:
Columns: peptide (str), rt (float32).
- Return type:
pandas.DataFrame or polars.DataFrame
- redeem_properties.ccs_to_mobility(ccs_value, charge, precursor_mz)
Convert CCS to ion mobility for Bruker (timsTOF) instruments.
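For orientation, a Mason-Schamp style conversion of this kind might look like the sketch below. It is not taken from the package: the constants follow the convention used in AlphaPeptDeep/alphabase (nitrogen drift gas, a coefficient of 1059.62245), and whether the Rust extension uses exactly these values is an assumption. The function names are hypothetical.

```python
import math

# Assumed constants (AlphaPeptDeep/alphabase convention), not confirmed
# to be the values used inside redeem_properties.
CCS_IM_COEF = 1059.62245  # Mason-Schamp coefficient at roughly 305 K
IM_GAS_MASS = 28.0        # nitrogen drift gas mass in Da


def reduced_mass_term(precursor_mz, charge):
    """Square root of the reduced mass of the ion / drift-gas pair."""
    ion_mass = precursor_mz * charge
    return math.sqrt(ion_mass * IM_GAS_MASS / (ion_mass + IM_GAS_MASS))


def ccs_to_mobility_sketch(ccs_value, charge, precursor_mz):
    """Convert CCS (square angstroms) to 1/K0 under the assumptions above."""
    return ccs_value * reduced_mass_term(precursor_mz, charge) / charge / CCS_IM_COEF


def mobility_to_ccs_sketch(mobility, charge, precursor_mz):
    """Inverse of ccs_to_mobility_sketch, used here as a sanity check."""
    return mobility * charge * CCS_IM_COEF / reduced_mass_term(precursor_mz, charge)


mobility = ccs_to_mobility_sketch(350.0, 2, 400.687)
round_trip = mobility_to_ccs_sketch(mobility, 2, 400.687)
```

At a fixed CCS and precursor mass, a higher charge gives a lower 1/K0 in this formulation, which is why the charge and precursor m/z are required alongside the CCS value.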
- redeem_properties.compute_fragment_mzs(proforma_sequence, max_fragment_charge)
Compute theoretical product (fragment) ion m/z values for a peptide.
Generates b and y ion m/z values (without neutral losses) up to the specified maximum fragment charge.
- Parameters:
proforma_sequence (str) – Peptide sequence in ProForma notation.
max_fragment_charge (int) – Maximum fragment ion charge state to generate.
- Returns:
A dictionary with keys:
"ion_types": list of str ("b" or "y")
"charges": list of int
"ordinals": list of int (1-based series number)
"mzs": list of float (monoisotopic m/z values)
- Return type:
dict
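As a rough picture of what such a result contains, the sketch below computes b and y ion m/z values for an unmodified peptide from standard monoisotopic residue masses and returns them in the same four-key layout. It is a stand-in, not the rustyms-backed implementation: fragment_mzs_sketch is a hypothetical name, modifications are not handled, and the ordering of the output lists is an assumption.

```python
# Standard monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565


def fragment_mzs_sketch(sequence, max_fragment_charge):
    """b/y m/z values for an unmodified sequence (illustrative only)."""
    masses = [MONO[aa] for aa in sequence]
    n = len(masses)
    out = {"ion_types": [], "charges": [], "ordinals": [], "mzs": []}
    for ion_type in ("b", "y"):
        for charge in range(1, max_fragment_charge + 1):
            for ordinal in range(1, n):
                if ion_type == "b":
                    # b ions: N-terminal residues.
                    neutral = sum(masses[:ordinal])
                else:
                    # y ions: C-terminal residues plus water.
                    neutral = sum(masses[n - ordinal:]) + WATER
                out["ion_types"].append(ion_type)
                out["charges"].append(charge)
                out["ordinals"].append(ordinal)
                out["mzs"].append((neutral + charge * PROTON) / charge)
    return out


frags = fragment_mzs_sketch("PEPTIDE", 2)
```

For a 7-residue peptide and a maximum fragment charge of 2, this yields 6 ordinals for each of 2 ion types at 2 charges, so 24 entries per list.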
- redeem_properties.compute_peptide_mz_info(proforma_sequence, charge, max_fragment_charge)
Compute both precursor and fragment m/z values for a peptide.
- Parameters:
proforma_sequence (str) – Peptide sequence in ProForma notation.
charge (int) – Precursor charge state.
max_fragment_charge (int) – Maximum fragment ion charge state to generate.
- Returns:
A dictionary with keys:
"precursor_mz": float
"ion_types": list of str
"charges": list of int
"ordinals": list of int
"mzs": list of float
- Return type:
dict
- redeem_properties.compute_precursor_mz(proforma_sequence, charge)
Compute the precursor m/z for a peptide in ProForma notation.
- Parameters:
proforma_sequence (str) – Peptide sequence in ProForma notation (e.g., “PEPTM[+15.9949]IDE”).
charge (int) – Precursor charge state (must be > 0).
- Returns:
The monoisotopic precursor m/z value.
- Return type:
float
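The arithmetic behind a monoisotopic precursor m/z can be sketched as follows: sum the residue masses and any bracketed mass shifts, add one water, then add one proton per charge and divide by the charge. The parser below only understands the [+X.X] / [-X.X] mass-shift form, not full ProForma, and precursor_mz_sketch is a hypothetical name rather than the package's function.

```python
import re

# Standard monoisotopic residue masses in Da.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565

# One residue letter, optionally followed by a signed mass shift in brackets.
TOKEN = re.compile(r"([A-Z])(?:\[([+-]\d+(?:\.\d+)?)\])?")


def precursor_mz_sketch(sequence, charge):
    """Monoisotopic precursor m/z for a mass-shift annotated sequence."""
    if charge <= 0:
        raise ValueError("charge must be > 0")
    tokens = list(TOKEN.finditer(sequence))
    if "".join(t.group(0) for t in tokens) != sequence:
        raise ValueError(f"unsupported notation in {sequence!r}")
    neutral = WATER
    for token in tokens:
        residue, shift = token.groups()
        neutral += MONO[residue]
        if shift is not None:
            neutral += float(shift)
    return (neutral + charge * PROTON) / charge


mz_plain = precursor_mz_sketch("PEPTIDE", 2)
mz_oxidized = precursor_mz_sketch("PEPTM[+15.9949]IDE", 2)
```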
- redeem_properties.download_pretrained_models()
Download pretrained models from the GitHub release and extract them locally.
This fetches the pretrained model archive from the redeem GitHub releases page, validates the download, and extracts the model files so they can be used by RTModel.from_pretrained(), CCSModel.from_pretrained(), and MS2Model.from_pretrained().
The models are downloaded to a stable user-local directory (or REDEEM_PRETRAINED_MODELS_DIR when set), so they are reusable across working directories. If the models already exist, this function returns immediately without re-downloading.
- Returns:
The absolute path to the extracted pretrained models directory.
- Return type:
str
- Raises:
RuntimeError – If the download fails, the archive is corrupted, or extraction fails.
Example
>>> import redeem_properties as rp
>>> models_dir = rp.download_pretrained_models()
>>> print(f"Models available at: {models_dir}")
>>> rt_model = rp.RTModel.from_pretrained("rt")
- redeem_properties.locate_pretrained(name)
Locate the absolute path to a pretrained model file on disk.
This function searches the pretrained model registry for the given model name. It checks the following locations in order:
1. The directory specified by the REDEEM_PRETRAINED_MODELS_DIR environment variable.
2. The user’s local data directory (e.g., ~/.local/share/redeem/pretrained_models/ on Linux).
3. The data/pretrained_models/ directory in development locations.
- Parameters:
name (str) – The identifier of the pretrained model (e.g., “rt”, “ccs”, “ms2”).
- Returns:
The absolute path to the located model file.
- Return type:
str
- Raises:
RuntimeError – If the model name is invalid or the file cannot be found.
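The documented search order can be sketched with the standard library. This is a simplified illustration: candidate_dirs and locate_sketch are hypothetical helpers, the user data directory is hard-coded to the Linux example given above, the development location is reduced to a relative data/pretrained_models path, and the mapping from a model name to a file name is left out.

```python
import os
from pathlib import Path


def candidate_dirs(env=None):
    """Directories to search, in the documented priority order."""
    env = os.environ if env is None else env
    dirs = []
    override = env.get("REDEEM_PRETRAINED_MODELS_DIR")
    if override:
        # 1. Explicit override through the environment variable.
        dirs.append(Path(override))
    # 2. User-local data directory (Linux example; other platforms differ).
    dirs.append(Path.home() / ".local" / "share" / "redeem" / "pretrained_models")
    # 3. Development checkout location, simplified to a relative path.
    dirs.append(Path("data") / "pretrained_models")
    return dirs


def locate_sketch(filename, env=None):
    """Return the first existing match, or raise RuntimeError."""
    for directory in candidate_dirs(env):
        candidate = directory / filename
        if candidate.is_file():
            return str(candidate.resolve())
    raise RuntimeError(f"could not locate {filename!r}")


search_path = candidate_dirs({"REDEEM_PRETRAINED_MODELS_DIR": "/tmp/redeem-models"})
```

Passing the environment as a plain dict, as in the last line, keeps the sketch easy to exercise without modifying the real process environment.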
- redeem_properties.match_fragment_mzs(proforma_sequence, max_fragment_charge, predicted_ion_types, predicted_charges, predicted_ordinals)
Match theoretical product ion m/z values to predicted fragment annotations.
Given the predicted ion types, charges, and ordinals from the MS2 model, look up the corresponding theoretical m/z for each.
- Parameters:
proforma_sequence (str) – Peptide sequence in ProForma notation.
max_fragment_charge (int) – Maximum fragment charge used to generate theoretical fragments.
predicted_ion_types (list of str) – Ion types from MS2 prediction (e.g., [“b”, “y”, “b”, “y”]).
predicted_charges (list of int) – Charges from MS2 prediction.
predicted_ordinals (list of int) – Ordinals (series numbers) from MS2 prediction.
- Returns:
m/z values aligned with the predicted arrays. NaN if no match found.
- Return type:
list of float
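The matching step described above amounts to a keyed lookup. The sketch below assumes the theoretical fragments arrive in the four-key dict layout returned by compute_fragment_mzs and uses a hand-written two-entry table; match_mzs_sketch is a hypothetical helper, not the package's implementation.

```python
import math


def match_mzs_sketch(theoretical, ion_types, charges, ordinals):
    """Align theoretical m/z values with predicted fragment annotations."""
    lookup = {
        (ion_type, charge, ordinal): mz
        for ion_type, charge, ordinal, mz in zip(
            theoretical["ion_types"],
            theoretical["charges"],
            theoretical["ordinals"],
            theoretical["mzs"],
        )
    }
    # Annotations without a theoretical counterpart map to NaN.
    return [lookup.get(key, math.nan) for key in zip(ion_types, charges, ordinals)]


# Hand-written theoretical table: b1 and y1 of PEPTIDE at charge 1.
theoretical = {
    "ion_types": ["b", "y"],
    "charges": [1, 1],
    "ordinals": [1, 1],
    "mzs": [98.060036, 148.060431],
}

matched = match_mzs_sketch(
    theoretical,
    ion_types=["b", "y", "b_nl"],
    charges=[1, 1, 1],
    ordinals=[1, 1, 1],
)
```

The neutral-loss annotation b_nl has no entry in the theoretical table, so its slot comes back as NaN, in line with the NaN behaviour described for neutral-loss ions.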