rdkit.Chem.rdRGroupDecomposition module
Module containing RGroupDecomposition classes and functions.
- class rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment
Bases: enum
- MCS = rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.MCS
- NoAlignment = rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.NoAlignment
- None = rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.None
- names = {'MCS': rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.MCS, 'NoAlignment': rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.NoAlignment, 'None': rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.None}
- values = {0: rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.NoAlignment, 1: rdkit.Chem.rdRGroupDecomposition.RGroupCoreAlignment.MCS}
- rdkit.Chem.rdRGroupDecomposition.RGroupDecompose((AtomPairsParameters)cores, (AtomPairsParameters)mols[, (bool)asSmiles=False[, (bool)asRows=True[, (RGroupDecompositionParameters)options=<rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters object at 0x7802cabfdd80>]]]) object :
- Decompose a collection of molecules into their Rgroups
- ARGUMENTS:
- cores: a set of cores, ordered from most to least specific.
See RGroupDecompositionParameters for more details on how the cores can be labelled
- mols: the molecules to be decomposed
- asSmiles: if True return smiles strings, otherwise return molecules [default: False]
- asRows: if True return the results as rows, otherwise return columns [default: True]
- RETURNS: row_or_column_results, unmatched
- Row structure:
rows[idx] = {rgroup_label: molecule_or_smiles}
- Column structure:
columns[rgroup_label] = [ mols_or_smiles ]
unmatched holds the indices of the input mols that were not matched.
- C++ signature :
boost::python::api::object RGroupDecompose(boost::python::api::object,boost::python::api::object [,bool=False [,bool=True [,RDKit::RGroupDecompositionParameters=<rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters object at 0x7802cabfdd80>]]])
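The call described above can be sketched as follows. This is a minimal illustration rather than a canonical recipe: the benzene core and the three input SMILES are made up for the example, and the exact SMILES text of each R-group may differ between RDKit versions, so none is shown.

```python
from rdkit import Chem
from rdkit.Chem import rdRGroupDecomposition as rgd

# Illustrative, unlabelled benzene core; R-group positions are assigned automatically.
core = Chem.MolFromSmiles("c1ccccc1")

# Two monosubstituted benzenes and one molecule that does not contain the core.
smiles = ["Cc1ccccc1", "Oc1ccccc1", "CCO"]
mols = [Chem.MolFromSmiles(s) for s in smiles]

# Row-wise result: one dict per matched molecule, keyed by R-group label.
rows, unmatched = rgd.RGroupDecompose([core], mols, asSmiles=True)
for row in rows:
    print(row)
print("unmatched input indices:", list(unmatched))

# Column-wise result: one list per R-group label.
columns, _ = rgd.RGroupDecompose([core], mols, asSmiles=True, asRows=False)
print(sorted(columns))
```

Ethanol has no benzene ring, so its index appears in unmatched instead of in the rows.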
- class rdkit.Chem.rdRGroupDecomposition.RGroupDecomposition((object)self, (AtomPairsParameters)cores)
Bases: instance
RGroupDecompositionParameters controls how the RGroupDecomposition sets labelling and matches structures.
OPTIONS:
- RGroupCoreAlignment: can be one of RGroupCoreAlignment.None_ or RGroupCoreAlignment.MCS
If set to MCS, core labels are mapped to each other using their maximum common substructure overlap.
- RGroupLabels: optionally set where the rgroup labels to use are encoded.
  - RGroupLabels.IsotopeLabels - labels are stored on isotopes
  - RGroupLabels.AtomMapLabels - labels are stored on atommaps
  - RGroupLabels.MDLRGroupLabels - labels are stored on MDL R-groups
  - RGroupLabels.DummyAtomLabels - labels are stored on dummy atoms
  - RGroupLabels.AtomIndexLabels - use the atom index as the label
  - RGroupLabels.RelabelDuplicateLabels - fix any duplicate labels
  - RGroupLabels.AutoDetect - auto detect the label [default]
- Note: in all cases, any rgroups found on unlabelled atoms will be automatically labelled.
- RGroupLabelling: choose where the rlabels are stored on the decomposition
  - RGroupLabelling.AtomMap - store rgroups as atom maps (for smiles)
  - RGroupLabelling.Isotope - store rgroups on the isotope
  - RGroupLabelling.MDLRGroup - store rgroups as mdl rgroups (for molblocks)
  default: AtomMap | MDLRGroup
- onlyMatchAtRGroups: only allow rgroup decomposition at the specified rgroups
- removeAllHydrogenRGroups: remove all user-defined rgroups that only have hydrogens
- removeAllHydrogenRGroupsAndLabels: remove all user-defined rgroups that only have hydrogens, and also remove the corresponding labels from the core
- removeHydrogensPostMatch: remove all hydrogens from the output molecules
- allowNonTerminalRGroups: allow labelled Rgroups of degree 2 or more
- doTautomers: match all tautomers of a core against each input structure
- doEnumeration: expand input cores into enumerated mol bundles
- allowMultipleRGroupsOnUnlabelled: permit more than one rgroup to be attached to an unlabelled core atom
Construct from a molecule or sequence of molecules
- C++ signature :
void __init__(_object*,boost::python::api::object)
- __init__( (object)self, (AtomPairsParameters)cores, (RGroupDecompositionParameters)params) -> None :
Construct from a molecule or sequence of molecules and a parameters object
- C++ signature :
void __init__(_object*,boost::python::api::object,RDKit::RGroupDecompositionParameters)
- Add((RGroupDecomposition)self, (Mol)mol) int :
- C++ signature :
int Add(RDKit::RGroupDecompositionHelper {lvalue},RDKit::ROMol)
- GetMatchingCoreIdx((RGroupDecomposition)self, (Mol)mol[, (AtomPairsParameters)matches=None]) int :
- C++ signature :
int GetMatchingCoreIdx(RDKit::RGroupDecompositionHelper {lvalue},RDKit::ROMol [,boost::python::api::object {lvalue}=None])
- GetRGroupLabels((RGroupDecomposition)self) list :
Return the current list of found rgroups. Note: Process() should be called first.
- C++ signature :
boost::python::list GetRGroupLabels(RDKit::RGroupDecompositionHelper {lvalue})
- GetRGroupsAsColumns((RGroupDecomposition)self[, (bool)asSmiles=False]) dict :
- Return the rgroups as columns (note: can be fed directly into a pandas DataFrame)
- ARGUMENTS:
asSmiles: if True return smiles strings, otherwise return molecules [default: False]
- Column structure:
columns[rgroup_label] = [ mols_or_smiles ]
- C++ signature :
boost::python::dict GetRGroupsAsColumns(RDKit::RGroupDecompositionHelper {lvalue} [,bool=False])
- GetRGroupsAsRows((RGroupDecomposition)self[, (bool)asSmiles=False]) list :
- Return the rgroups as rows (note: can be fed directly into a pandas DataFrame)
- ARGUMENTS:
asSmiles: if True return smiles strings, otherwise return molecules [default: False]
- Row structure:
rows[idx] = {rgroup_label: molecule_or_smiles}
- C++ signature :
boost::python::list GetRGroupsAsRows(RDKit::RGroupDecompositionHelper {lvalue} [,bool=False])
- Process((RGroupDecomposition)self) bool :
Process the rgroups (must be done prior to GetRGroupsAsRows/Columns and GetRGroupLabels)
- C++ signature :
bool Process(RDKit::RGroupDecompositionHelper {lvalue})
- ProcessAndScore((RGroupDecomposition)self) tuple :
Process the rgroups and return the score (must be done prior to GetRGroupsAsRows/Columns and GetRGroupLabels)
- C++ signature :
boost::python::tuple ProcessAndScore(RDKit::RGroupDecompositionHelper {lvalue})
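The incremental workflow implied by the methods above (Add, then Process, then the getters) might look like the sketch below. The core and input SMILES are illustrative. The reference above does not document the return value of Add; this sketch assumes it returns -1 when no core matches and a non-negative index otherwise.

```python
from rdkit import Chem
from rdkit.Chem import rdRGroupDecomposition as rgd

# Illustrative core; a single molecule or a sequence of molecules is accepted.
core = Chem.MolFromSmiles("c1ccccc1")
decomp = rgd.RGroupDecomposition(core)

added = {}
for smi in ["Cc1ccccc1", "Oc1ccccc1", "CCO"]:
    mol = Chem.MolFromSmiles(smi)
    # Assumption: Add() returns -1 if the molecule matches none of the cores.
    added[smi] = decomp.Add(mol)

# Process() must run before labels, rows, or columns are read.
decomp.Process()

labels = list(decomp.GetRGroupLabels())
rows = decomp.GetRGroupsAsRows(asSmiles=True)
columns = decomp.GetRGroupsAsColumns(asSmiles=True)

print(added)
print(labels)
print(rows)
```

Compared with RGroupDecompose, this form lets the caller decide per molecule what to do when Add reports no match.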
- class rdkit.Chem.rdRGroupDecomposition.RGroupDecompositionParameters((object)self)
Bases: instance
RGroupDecompositionParameters controls how the RGroupDecomposition sets labelling and matches structures.
OPTIONS:
- RGroupCoreAlignment: can be one of RGroupCoreAlignment.None_ or RGroupCoreAlignment.MCS
If set to MCS, core labels are mapped to each other using their maximum common substructure overlap.
- RGroupLabels: optionally set where the rgroup labels to use are encoded.
  - RGroupLabels.IsotopeLabels - labels are stored on isotopes
  - RGroupLabels.AtomMapLabels - labels are stored on atommaps
  - RGroupLabels.MDLRGroupLabels - labels are stored on MDL R-groups
  - RGroupLabels.DummyAtomLabels - labels are stored on dummy atoms
  - RGroupLabels.AtomIndexLabels - use the atom index as the label
  - RGroupLabels.RelabelDuplicateLabels - fix any duplicate labels
  - RGroupLabels.AutoDetect - auto detect the label [default]
- Note: in all cases, any rgroups found on unlabelled atoms will be automatically labelled.
- RGroupLabelling: choose where the rlabels are stored on the decomposition
  - RGroupLabelling.AtomMap - store rgroups as atom maps (for smiles)
  - RGroupLabelling.Isotope - store rgroups on the isotope
  - RGroupLabelling.MDLRGroup - store rgroups as mdl rgroups (for molblocks)
  default: AtomMap | MDLRGroup
- onlyMatchAtRGroups: only allow rgroup decomposition at the specified rgroups
- removeAllHydrogenRGroups: remove all user-defined rgroups that only have hydrogens
- removeAllHydrogenRGroupsAndLabels: remove all user-defined rgroups that only have hydrogens, and also remove the corresponding labels from the core
- removeHydrogensPostMatch: remove all hydrogens from the output molecules
- allowNonTerminalRGroups: allow labelled Rgroups of degree 2 or more
- doTautomers: match all tautomers of a core against each input structure
- doEnumeration: expand input cores into enumerated mol bundles
- allowMultipleRGroupsOnUnlabelled: permit more than one rgroup to be attached to an unlabelled core atom
Constructor, takes no arguments
- C++ signature :
void __init__(_object*)
- property alignment
- property allowMultipleRGroupsOnUnlabelled
- property allowNonTerminalRGroups
- property chunkSize
- property doEnumeration
- property doTautomers
- property gaMaximumOperations
- property gaNumberOperationsWithoutImprovement
- property gaNumberRuns
- property gaParallelRuns
- property gaPopulationSize
- property gaRandomSeed
- property includeTargetMolInResults
- property labels
- property matchingStrategy
- property onlyMatchAtRGroups
- property removeAllHydrogenRGroups
- property removeAllHydrogenRGroupsAndLabels
- property removeHydrogensPostMatch
- property rgroupLabelling
- property scoreMethod
- property substructMatchParams
- property timeout
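As a sketch of how these properties are used, the example below sets onlyMatchAtRGroups on a parameters object and passes it to RGroupDecomposition. The labelled core and the two input molecules are illustrative, and the -1 return from Add for a rejected molecule is an assumption not stated in this reference.

```python
from rdkit import Chem
from rdkit.Chem import rdRGroupDecomposition as rgd

params = rgd.RGroupDecompositionParameters()
# Only decompose at the labelled position on the core.
params.onlyMatchAtRGroups = True
# Strip hydrogens from the output molecules.
params.removeHydrogensPostMatch = True

# Illustrative core with a single labelled attachment point (atom map 1).
core = Chem.MolFromSmiles("[*:1]c1ccccc1")
decomp = rgd.RGroupDecomposition(core, params)

# Toluene has one substituent, which fits the single labelled position.
toluene_idx = decomp.Add(Chem.MolFromSmiles("Cc1ccccc1"))
# o-Xylene has a second substituent at an unlabelled position, so it is
# rejected (assumption: Add() signals this by returning -1).
xylene_idx = decomp.Add(Chem.MolFromSmiles("Cc1ccccc1C"))

decomp.Process()
rows = decomp.GetRGroupsAsRows(asSmiles=True)
print(toluene_idx, xylene_idx, rows)
```

With onlyMatchAtRGroups left at False, the second methyl group would instead be assigned an automatically generated R-group label.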
- class rdkit.Chem.rdRGroupDecomposition.RGroupLabelling
Bases: enum
- AtomMap = rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.AtomMap
- Isotope = rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.Isotope
- MDLRGroup = rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.MDLRGroup
- names = {'AtomMap': rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.AtomMap, 'Isotope': rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.Isotope, 'MDLRGroup': rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.MDLRGroup}
- values = {1: rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.AtomMap, 2: rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.Isotope, 4: rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.MDLRGroup}
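The values 1, 2 and 4 are bit flags, which is why the documented default is written as AtomMap | MDLRGroup. A small sketch of combining them, assuming the rgroupLabelling property accepts the resulting plain integer:

```python
from rdkit.Chem import rdRGroupDecomposition as rgd

# Enum members are integer-valued, so a bitwise OR combines them.
combined = rgd.RGroupLabelling.AtomMap | rgd.RGroupLabelling.MDLRGroup
print(int(combined))  # 1 | 4

params = rgd.RGroupDecompositionParameters()
# Assumption: the property stores the combined flags as an integer.
params.rgroupLabelling = combined
print(int(params.rgroupLabelling))

# The names and values mappings allow lookup in either direction.
print(rgd.RGroupLabelling.names["Isotope"], rgd.RGroupLabelling.values[4])
```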
- class rdkit.Chem.rdRGroupDecomposition.RGroupLabels
Bases: enum
- AtomIndexLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomIndexLabels
- AtomMapLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomMapLabels
- AutoDetect = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AutoDetect
- DummyAtomLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.DummyAtomLabels
- IsotopeLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.IsotopeLabels
- MDLRGroupLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.MDLRGroupLabels
- RelabelDuplicateLabels = rdkit.Chem.rdRGroupDecomposition.RGroupLabels.RelabelDuplicateLabels
- names = {'AtomIndexLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomIndexLabels, 'AtomMapLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomMapLabels, 'AutoDetect': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AutoDetect, 'DummyAtomLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.DummyAtomLabels, 'IsotopeLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.IsotopeLabels, 'MDLRGroupLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.MDLRGroupLabels, 'RelabelDuplicateLabels': rdkit.Chem.rdRGroupDecomposition.RGroupLabels.RelabelDuplicateLabels}
- values = {1: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.IsotopeLabels, 2: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomMapLabels, 4: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AtomIndexLabels, 8: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.RelabelDuplicateLabels, 16: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.MDLRGroupLabels, 32: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.DummyAtomLabels, 255: rdkit.Chem.rdRGroupDecomposition.RGroupLabels.AutoDetect}
- class rdkit.Chem.rdRGroupDecomposition.RGroupMatching
Bases: enum
- Exhaustive = rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Exhaustive
- GA = rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GA
- Greedy = rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Greedy
- GreedyChunks = rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GreedyChunks
- NoSymmetrization = rdkit.Chem.rdRGroupDecomposition.RGroupMatching.NoSymmetrization
- names = {'Exhaustive': rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Exhaustive, 'GA': rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GA, 'Greedy': rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Greedy, 'GreedyChunks': rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GreedyChunks, 'NoSymmetrization': rdkit.Chem.rdRGroupDecomposition.RGroupMatching.NoSymmetrization}
- values = {1: rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Greedy, 2: rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GreedyChunks, 4: rdkit.Chem.rdRGroupDecomposition.RGroupMatching.Exhaustive, 8: rdkit.Chem.rdRGroupDecomposition.RGroupMatching.NoSymmetrization, 16: rdkit.Chem.rdRGroupDecomposition.RGroupMatching.GA}
- class rdkit.Chem.rdRGroupDecomposition.RGroupScore
Bases: enum
- FingerprintVariance = rdkit.Chem.rdRGroupDecomposition.RGroupScore.FingerprintVariance
- Match = rdkit.Chem.rdRGroupDecomposition.RGroupScore.Match
- names = {'FingerprintVariance': rdkit.Chem.rdRGroupDecomposition.RGroupScore.FingerprintVariance, 'Match': rdkit.Chem.rdRGroupDecomposition.RGroupScore.Match}
- values = {1: rdkit.Chem.rdRGroupDecomposition.RGroupScore.Match, 4: rdkit.Chem.rdRGroupDecomposition.RGroupScore.FingerprintVariance}
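The RGroupMatching and RGroupScore members are assigned to the matchingStrategy and scoreMethod properties of RGroupDecompositionParameters. The sketch below picks Greedy and FingerprintVariance purely for illustration; this reference does not describe how the strategies differ, so the choice carries no recommendation. The core and input SMILES are also made up.

```python
from rdkit import Chem
from rdkit.Chem import rdRGroupDecomposition as rgd

params = rgd.RGroupDecompositionParameters()
# Arbitrary, illustrative choices of matching strategy and scoring method.
params.matchingStrategy = rgd.RGroupMatching.Greedy
params.scoreMethod = rgd.RGroupScore.FingerprintVariance

core = Chem.MolFromSmiles("c1ccccc1")
mols = [Chem.MolFromSmiles(s) for s in ("Cc1ccccc1", "Oc1ccccc1", "Nc1ccccc1")]

# The parameters object is passed through the options keyword.
rows, unmatched = rgd.RGroupDecompose([core], mols, asSmiles=True, options=params)
print(len(rows), list(unmatched))
```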
- rdkit.Chem.rdRGroupDecomposition.RelabelMappedDummies((Mol)mol[, (int)inputLabels=rdkit.Chem.rdRGroupDecomposition.RGroupLabelling(7)[, (int)outputLabels=rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.MDLRGroup]]) None :
Relabel dummy atoms bearing an R-group mapping (as atom map number, isotope or MDLRGroup label) such that they will be displayed by the rendering code as R# rather than #*, :#, #:#, etc. By default, only the MDLRGroup label is retained on output; this may be configured through the outputLabels parameter. If there are multiple potential R-group mappings, the priority on input is Atom map number > Isotope > MDLRGroup. The inputLabels parameter configures which mappings are taken into consideration.
- C++ signature :
void RelabelMappedDummies(RDKit::ROMol {lvalue} [,unsigned int=rdkit.Chem.rdRGroupDecomposition.RGroupLabelling(7) [,unsigned int=rdkit.Chem.rdRGroupDecomposition.RGroupLabelling.MDLRGroup]])