RDKit
Open-source cheminformatics and machine learning.
Classes | |
class | ss_matcher |
Typedefs | |
typedef std::tuple< boost::dynamic_bitset<>, uint32_t, unsigned int > | AccumTuple |
typedef std::map< std::uint32_t, std::vector< std::pair< std::uint32_t, std::uint32_t > > > | BitInfoMap |
Variables | |
RDKIT_FINGERPRINTS_EXPORT std::vector< std::string > | defaultFeatureSmarts |
const std::string | morganConnectivityInvariantVersion = "1.0.0" |
const std::string | morganFeatureInvariantVersion = "0.1.0" |
const std::string | morganFingerprintVersion = "1.0.0" |
typedef std::tuple<boost::dynamic_bitset<>, uint32_t, unsigned int> RDKit::MorganFingerprints::AccumTuple |
Definition at line 99 of file FingerprintUtil.h.
typedef std::map<std::uint32_t, std::vector<std::pair<std::uint32_t, std::uint32_t> > > RDKit::MorganFingerprints::BitInfoMap |
Definition at line 58 of file MorganFingerprints.h.
RDKIT_FINGERPRINTS_EXPORT void RDKit::MorganFingerprints::getConnectivityInvariants(
    const ROMol &mol,
    std::vector<std::uint32_t> &invars,
    bool includeRingMembership = true)
returns the connectivity invariants for a molecule
mol | the molecule to be considered |
invars | used to return the results |
includeRingMembership | if set, each atom's ring membership is included in its invariant |
RDKIT_FINGERPRINTS_EXPORT void RDKit::MorganFingerprints::getFeatureInvariants(
    const ROMol &mol,
    std::vector<std::uint32_t> &invars,
    std::vector<const ROMol *> *patterns = nullptr)
returns the feature invariants for a molecule
mol | the molecule to be considered |
invars | used to return the results |
patterns | if provided, should contain the queries used to assign atom types. If not provided, feature definitions adapted from Gobbi and Poppinger, Biotech. Bioeng. 61 47-54 (1998) will be used for Donor, Acceptor, Aromatic, Halogen, Basic, and Acidic. |
RDKIT_FINGERPRINTS_EXPORT SparseIntVect<std::uint32_t> *RDKit::MorganFingerprints::getFingerprint(
    const ROMol &mol,
    unsigned int radius,
    std::vector<boost::uint32_t> *invariants = nullptr,
    const std::vector<boost::uint32_t> *fromAtoms = nullptr,
    bool useChirality = false,
    bool useBondTypes = true,
    bool useCounts = true,
    bool onlyNonzeroInvariants = false,
    BitInfoMap *atomsSettingBits = nullptr,
    bool includeRedundantEnvironments = false)
returns the Morgan fingerprint for a molecule
These fingerprints are similar to the well-known ECFP or FCFP fingerprints, depending on which invariants are used.
The algorithm used is described in the paper Rogers, D. & Hahn, M. Extended-Connectivity Fingerprints. JCIM 50:742-54 (2010) https://doi.org/10.1021/ci100050t
The original implementation was done using this paper: D. Rogers, R.D. Brown, M. Hahn J. Biomol. Screen. 10:682-6 (2005) and an unpublished technical report: http://www.ics.uci.edu/~welling/teaching/ICS274Bspring06/David%20Rogers%20-%20ECFP%20Manuscript.doc
mol | the molecule to be fingerprinted |
radius | the number of iterations to grow the fingerprint |
invariants | optional pointer to a set of atom invariants to be used. By default ECFP-type invariants are used (calculated by getConnectivityInvariants()) |
fromAtoms | if this is provided, only the atoms in the vector will be used as centers in the fingerprint |
useChirality | if set, additional information will be added to the fingerprint when chiral atoms are discovered. This will cause C[C@H](F)Cl, C[C@@H](F)Cl, and CC(F)Cl to generate different fingerprints. |
useBondTypes | if set, bond types will be included as part of the hash for calculating bits |
useCounts | if set, counts of the features will be used |
onlyNonzeroInvariants | if set, bits will only be set from atoms that have a nonzero invariant. |
atomsSettingBits | if non-null, this will be used to return information about the atoms that set each particular bit. The keys of the map are bit ids; the values are lists of (atomId, radius) pairs. |
includeRedundantEnvironments | if set, the check for redundant atom environments will not be done. |
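The layout of the atomsSettingBits map can be illustrated without linking against RDKit. The sketch below redeclares the BitInfoMap typedef locally and walks it. The helper names describeBitInfo and bitsCenteredOn are hypothetical and are not part of the RDKit API; they only show how the keys (bit ids) and values ((atomId, radius) pairs) fit together.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Local copy of the BitInfoMap typedef documented above, so that this
// sketch compiles without the RDKit headers.
typedef std::map<std::uint32_t,
                 std::vector<std::pair<std::uint32_t, std::uint32_t>>>
    BitInfoMap;

// Hypothetical helper (not an RDKit function): one line of text per
// environment, in the form "bit <id>: atom <atomId>, radius <r>".
std::vector<std::string> describeBitInfo(const BitInfoMap &info) {
  std::vector<std::string> lines;
  for (const auto &entry : info) {        // entry.first is the bit id
    for (const auto &env : entry.second) {  // env is (atomId, radius)
      lines.push_back("bit " + std::to_string(entry.first) + ": atom " +
                      std::to_string(env.first) + ", radius " +
                      std::to_string(env.second));
    }
  }
  return lines;
}

// Hypothetical helper (not an RDKit function): the bit ids for which the
// given atom is the center of at least one environment.
std::vector<std::uint32_t> bitsCenteredOn(const BitInfoMap &info,
                                          std::uint32_t atomId) {
  std::vector<std::uint32_t> bits;
  for (const auto &entry : info) {
    for (const auto &env : entry.second) {
      if (env.first == atomId) {
        bits.push_back(entry.first);
        break;  // report each bit id only once
      }
    }
  }
  return bits;
}
```

In real use the map would be filled by passing its address as atomsSettingBits; here a caller would populate it by hand with made-up bit ids to try the helpers.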
RDKIT_FINGERPRINTS_EXPORT ExplicitBitVect *RDKit::MorganFingerprints::getFingerprintAsBitVect(
    const ROMol &mol,
    unsigned int radius,
    unsigned int nBits,
    std::vector<std::uint32_t> *invariants = nullptr,
    const std::vector<std::uint32_t> *fromAtoms = nullptr,
    bool useChirality = false,
    bool useBondTypes = true,
    bool onlyNonzeroInvariants = false,
    BitInfoMap *atomsSettingBits = nullptr,
    bool includeRedundantEnvironments = false)
returns the Morgan fingerprint for a molecule as a bit vector
see documentation for getFingerprint() for theory/references
mol | the molecule to be fingerprinted |
radius | the number of iterations to grow the fingerprint |
nBits | the number of bits in the final fingerprint |
invariants | optional pointer to a set of atom invariants to be used. By default ECFP-type invariants are used (calculated by getConnectivityInvariants()) |
fromAtoms | if this is provided, only the atoms in the vector will be used as centers in the fingerprint |
useChirality | if set, additional information will be added to the fingerprint when chiral atoms are discovered. This will cause C[C@H](F)Cl, C[C@@H](F)Cl, and CC(F)Cl to generate different fingerprints. |
useBondTypes | if set, bond types will be included as part of the hash for calculating bits |
onlyNonzeroInvariants | if set, bits will only be set from atoms that have a nonzero invariant. |
atomsSettingBits | if non-null, this will be used to return information about the atoms that set each particular bit. The keys of the map are bit ids; the values are lists of (atomId, radius) pairs. |
includeRedundantEnvironments | if set, the check for redundant atom environments will not be done. |
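To make the role of nBits concrete, the following stand-alone sketch folds a list of 32-bit environment ids into a fixed-length bit vector. It assumes folding is done by taking the id modulo nBits; this page does not state how RDKit folds ids, so treat that as an assumption. foldToBits is a hypothetical helper, not an RDKit function, and std::vector<bool> stands in for ExplicitBitVect.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an RDKit function): folds sparse 32-bit ids
// into nBits positions, assuming position = id % nBits.
std::vector<bool> foldToBits(const std::vector<std::uint32_t> &sparseIds,
                             unsigned int nBits) {
  assert(nBits > 0);  // a zero-length fingerprint cannot be indexed
  std::vector<bool> bits(nBits, false);
  for (std::uint32_t id : sparseIds) {
    bits[id % nBits] = true;  // distinct ids may collide on one position
  }
  return bits;
}
```

Under this assumption, a smaller nBits gives a more compact fingerprint at the cost of more collisions, where unrelated environments set the same bit.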
RDKIT_FINGERPRINTS_EXPORT SparseIntVect<std::uint32_t> *RDKit::MorganFingerprints::getHashedFingerprint(
    const ROMol &mol,
    unsigned int radius,
    unsigned int nBits = 2048,
    std::vector<boost::uint32_t> *invariants = nullptr,
    const std::vector<boost::uint32_t> *fromAtoms = nullptr,
    bool useChirality = false,
    bool useBondTypes = true,
    bool onlyNonzeroInvariants = false,
    BitInfoMap *atomsSettingBits = nullptr,
    bool includeRedundantEnvironments = false)
returns the hashed Morgan fingerprint for a molecule
These fingerprints are similar to the well-known ECFP or FCFP fingerprints, depending on which invariants are used.
The algorithm used is described in the paper Rogers, D. & Hahn, M. Extended-Connectivity Fingerprints. JCIM 50:742-54 (2010) https://doi.org/10.1021/ci100050t
The original implementation was done using this paper: D. Rogers, R.D. Brown, M. Hahn J. Biomol. Screen. 10:682-6 (2005) and an unpublished technical report: http://www.ics.uci.edu/~welling/teaching/ICS274Bspring06/David%20Rogers%20-%20ECFP%20Manuscript.doc
mol | the molecule to be fingerprinted |
radius | the number of iterations to grow the fingerprint |
nBits | the number of bits in the final fingerprint |
invariants | optional pointer to a set of atom invariants to be used. By default ECFP-type invariants are used (calculated by getConnectivityInvariants()) |
fromAtoms | if this is provided, only the atoms in the vector will be used as centers in the fingerprint |
useChirality | if set, additional information will be added to the fingerprint when chiral atoms are discovered. This will cause C[C@H](F)Cl, C[C@@H](F)Cl, and CC(F)Cl to generate different fingerprints. |
useBondTypes | if set, bond types will be included as part of the hash for calculating bits |
onlyNonzeroInvariants | if set, bits will only be set from atoms that have a nonzero invariant. |
atomsSettingBits | if non-null, this will be used to return information about the atoms that set each particular bit. The keys of the map are bit ids; the values are lists of (atomId, radius) pairs. |
includeRedundantEnvironments | if set, the check for redundant atom environments will not be done. |
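Because this function returns a sparse integer vector rather than a bit vector, each position holds a count. The sketch below shows one way such counts could be folded, assuming ids are reduced modulo nBits and that counts of colliding ids are summed; neither detail is stated on this page. SparseCounts is a plain std::map stand-in for SparseIntVect<std::uint32_t>, and foldCounts is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Stand-in for SparseIntVect<std::uint32_t>: id -> count.
typedef std::map<std::uint32_t, int> SparseCounts;

// Hypothetical helper (not an RDKit function): folds unhashed counts into
// the range [0, nBits), summing the counts of ids that collide.
SparseCounts foldCounts(const SparseCounts &unhashed, unsigned int nBits) {
  assert(nBits > 0);  // the modulus must be nonzero
  SparseCounts folded;
  for (const auto &kv : unhashed) {
    folded[kv.first % nBits] += kv.second;
  }
  return folded;
}
```

With nBits = 2048, for example, ids 5 and 2053 both land on position 5, so their counts would be added together.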
RDKIT_FINGERPRINTS_EXPORT std::vector<std::string> RDKit::MorganFingerprints::defaultFeatureSmarts | extern |
const std::string RDKit::MorganFingerprints::morganConnectivityInvariantVersion = "1.0.0" |
Definition at line 114 of file FingerprintUtil.h.
const std::string RDKit::MorganFingerprints::morganFeatureInvariantVersion = "0.1.0" |
Definition at line 132 of file FingerprintUtil.h.
const std::string RDKit::MorganFingerprints::morganFingerprintVersion = "1.0.0" |
Definition at line 60 of file MorganFingerprints.h.