RDKit
Open-source cheminformatics and machine learning.
RDKit::MorganFingerprints Namespace Reference

Classes

class  ss_matcher
 

Typedefs

typedef std::tuple< boost::dynamic_bitset<>, uint32_t, unsigned int > AccumTuple
 
typedef std::map< std::uint32_t, std::vector< std::pair< std::uint32_t, std::uint32_t > > > BitInfoMap
 

Functions

RDKIT_FINGERPRINTS_EXPORT void getConnectivityInvariants (const ROMol &mol, std::vector< std::uint32_t > &invars, bool includeRingMembership=true)
 returns the connectivity invariants for a molecule
 
RDKIT_FINGERPRINTS_EXPORT void getFeatureInvariants (const ROMol &mol, std::vector< std::uint32_t > &invars, std::vector< const ROMol * > *patterns=nullptr)
 returns the feature invariants for a molecule
 
RDKIT_FINGERPRINTS_EXPORT SparseIntVect< std::uint32_t > * getFingerprint (const ROMol &mol, unsigned int radius, std::vector< boost::uint32_t > *invariants=nullptr, const std::vector< boost::uint32_t > *fromAtoms=nullptr, bool useChirality=false, bool useBondTypes=true, bool useCounts=true, bool onlyNonzeroInvariants=false, BitInfoMap *atomsSettingBits=nullptr, bool includeRedundantEnvironments=false)
 returns the Morgan fingerprint for a molecule
 
RDKIT_FINGERPRINTS_EXPORT SparseIntVect< std::uint32_t > * getHashedFingerprint (const ROMol &mol, unsigned int radius, unsigned int nBits=2048, std::vector< boost::uint32_t > *invariants=nullptr, const std::vector< boost::uint32_t > *fromAtoms=nullptr, bool useChirality=false, bool useBondTypes=true, bool onlyNonzeroInvariants=false, BitInfoMap *atomsSettingBits=nullptr, bool includeRedundantEnvironments=false)
 returns the hashed Morgan fingerprint for a molecule
 
RDKIT_FINGERPRINTS_EXPORT ExplicitBitVect * getFingerprintAsBitVect (const ROMol &mol, unsigned int radius, unsigned int nBits, std::vector< std::uint32_t > *invariants=nullptr, const std::vector< std::uint32_t > *fromAtoms=nullptr, bool useChirality=false, bool useBondTypes=true, bool onlyNonzeroInvariants=false, BitInfoMap *atomsSettingBits=nullptr, bool includeRedundantEnvironments=false)
 returns the Morgan fingerprint for a molecule as a bit vector
 

Variables

RDKIT_FINGERPRINTS_EXPORT std::vector< std::string > defaultFeatureSmarts
 
const std::string morganConnectivityInvariantVersion = "1.0.0"
 
const std::string morganFeatureInvariantVersion = "0.1.0"
 
const std::string morganFingerprintVersion = "1.0.0"
 

Typedef Documentation

◆ AccumTuple

typedef std::tuple<boost::dynamic_bitset<>, uint32_t, unsigned int> RDKit::MorganFingerprints::AccumTuple

Definition at line 99 of file FingerprintUtil.h.

◆ BitInfoMap

typedef std::map<std::uint32_t, std::vector<std::pair<std::uint32_t, std::uint32_t> > > RDKit::MorganFingerprints::BitInfoMap

Definition at line 58 of file MorganFingerprints.h.
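As the atomsSettingBits parameter descriptions on this page state, the keys of a BitInfoMap are bit ids and the values are lists of (atomId, radius) pairs. The following is a minimal sketch of walking such a map. It redeclares the typedef locally so that it compiles without RDKit, and describeBitInfo is a hypothetical helper name, not part of the library.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Local copy of the typedef shown above, so the sketch compiles without RDKit.
using BitInfoMap =
    std::map<std::uint32_t,
             std::vector<std::pair<std::uint32_t, std::uint32_t>>>;

// Hypothetical helper: produces one line of text per (bit, atom, radius) entry.
std::vector<std::string> describeBitInfo(const BitInfoMap &info) {
  std::vector<std::string> lines;
  for (const auto &entry : info) {
    const std::uint32_t bitId = entry.first;
    for (const auto &env : entry.second) {
      // env.first is the central atom id, env.second is the radius.
      lines.push_back("bit " + std::to_string(bitId) + ": atom " +
                      std::to_string(env.first) + ", radius " +
                      std::to_string(env.second));
    }
  }
  return lines;
}
```

Because std::map keeps its keys sorted, the lines come out ordered by bit id.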

Function Documentation

◆ getConnectivityInvariants()

RDKIT_FINGERPRINTS_EXPORT void RDKit::MorganFingerprints::getConnectivityInvariants ( const ROMol &  mol,
std::vector< std::uint32_t > &  invars,
bool  includeRingMembership = true 
)

returns the connectivity invariants for a molecule

Parameters
mol: the molecule to be considered
invars: used to return the results
includeRingMembership: if set, whether or not the atom is in a ring will be used in the invariant list.

◆ getFeatureInvariants()

RDKIT_FINGERPRINTS_EXPORT void RDKit::MorganFingerprints::getFeatureInvariants ( const ROMol &  mol,
std::vector< std::uint32_t > &  invars,
std::vector< const ROMol * > *  patterns = nullptr 
)

returns the feature invariants for a molecule

Parameters
mol: the molecule to be considered
invars: used to return the results
patterns: if provided, should contain the queries used to assign atom types. If not provided, feature definitions adapted from Gobbi and Poppinger, Biotech. Bioeng. 61:47-54 (1998) will be used for Donor, Acceptor, Aromatic, Halogen, Basic, and Acidic.

◆ getFingerprint()

RDKIT_FINGERPRINTS_EXPORT SparseIntVect< std::uint32_t > * RDKit::MorganFingerprints::getFingerprint ( const ROMol &  mol,
unsigned int  radius,
std::vector< boost::uint32_t > *  invariants = nullptr,
const std::vector< boost::uint32_t > *  fromAtoms = nullptr,
bool  useChirality = false,
bool  useBondTypes = true,
bool  useCounts = true,
bool  onlyNonzeroInvariants = false,
BitInfoMap *  atomsSettingBits = nullptr,
bool  includeRedundantEnvironments = false 
)

returns the Morgan fingerprint for a molecule

These fingerprints are similar to the well-known ECFP or FCFP fingerprints, depending on which invariants are used.

The algorithm used is described in the paper Rogers, D. & Hahn, M. Extended-Connectivity Fingerprints. JCIM 50:742-54 (2010) https://doi.org/10.1021/ci100050t

The original implementation was done using this paper: D. Rogers, R.D. Brown, M. Hahn J. Biomol. Screen. 10:682-6 (2005) and an unpublished technical report: http://www.ics.uci.edu/~welling/teaching/ICS274Bspring06/David%20Rogers%20-%20ECFP%20Manuscript.doc

Parameters
mol: the molecule to be fingerprinted
radius: the number of iterations to grow the fingerprint
invariants: optional pointer to a set of atom invariants to be used. By default ECFP-type invariants are used (calculated by getConnectivityInvariants())
fromAtoms: if this is provided, only the atoms in the vector will be used as centers in the fingerprint
useChirality: if set, additional information will be added to the fingerprint when chiral atoms are discovered. This will cause C[C@H](F)Cl, C[C@@H](F)Cl, and CC(F)Cl to generate different fingerprints.
useBondTypes: if set, bond types will be included as part of the hash for calculating bits
useCounts: if set, counts of the features will be used
onlyNonzeroInvariants: if set, bits will only be set from atoms that have a nonzero invariant.
atomsSettingBits: if non-null, this will be used to return information about the atoms that set each particular bit. The keys of the map are bit ids; the values are lists of (atomId, radius) pairs.
includeRedundantEnvironments: if set, the check for redundant atom environments will not be done.
Returns
a pointer to the fingerprint. The client is responsible for calling delete on this.
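Since the caller owns the returned pointer, one common way to avoid a leak is to hand it to a std::unique_ptr immediately. The sketch below illustrates only that ownership pattern. DemoFingerprint and makeDemoFingerprint are hypothetical stand-ins, used so the example compiles without RDKit; they are not library types.

```cpp
#include <memory>

// Hypothetical stand-in for a heap-allocated fingerprint. It counts live
// instances so the effect of deletion can be observed.
struct DemoFingerprint {
  static int live;
  DemoFingerprint() { ++live; }
  ~DemoFingerprint() { --live; }
};
int DemoFingerprint::live = 0;

// Hypothetical factory mimicking an API that returns an owning raw pointer.
DemoFingerprint *makeDemoFingerprint() { return new DemoFingerprint(); }

// Takes ownership immediately, so the object is deleted on every exit path.
int liveCountWhileOwned() {
  std::unique_ptr<DemoFingerprint> fp(makeDemoFingerprint());
  return DemoFingerprint::live;  // fp is destroyed when this function returns
}
```

With RDKit itself, the same pattern would wrap the result of getFingerprint() in a std::unique_ptr< SparseIntVect< std::uint32_t > >.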

◆ getFingerprintAsBitVect()

RDKIT_FINGERPRINTS_EXPORT ExplicitBitVect * RDKit::MorganFingerprints::getFingerprintAsBitVect ( const ROMol &  mol,
unsigned int  radius,
unsigned int  nBits,
std::vector< std::uint32_t > *  invariants = nullptr,
const std::vector< std::uint32_t > *  fromAtoms = nullptr,
bool  useChirality = false,
bool  useBondTypes = true,
bool  onlyNonzeroInvariants = false,
BitInfoMap *  atomsSettingBits = nullptr,
bool  includeRedundantEnvironments = false 
)

returns the Morgan fingerprint for a molecule as a bit vector

see documentation for getFingerprint() for theory/references

Parameters
mol: the molecule to be fingerprinted
radius: the number of iterations to grow the fingerprint
nBits: the number of bits in the final fingerprint
invariants: optional pointer to a set of atom invariants to be used. By default ECFP-type invariants are used (calculated by getConnectivityInvariants())
fromAtoms: if this is provided, only the atoms in the vector will be used as centers in the fingerprint
useChirality: if set, additional information will be added to the fingerprint when chiral atoms are discovered. This will cause C[C@H](F)Cl, C[C@@H](F)Cl, and CC(F)Cl to generate different fingerprints.
useBondTypes: if set, bond types will be included as part of the hash for calculating bits
onlyNonzeroInvariants: if set, bits will only be set from atoms that have a nonzero invariant.
atomsSettingBits: if non-null, this will be used to return information about the atoms that set each particular bit. The keys of the map are bit ids; the values are lists of (atomId, radius) pairs.
includeRedundantEnvironments: if set, the check for redundant atom environments will not be done.
Returns
a pointer to the fingerprint. The client is responsible for calling delete on this.

◆ getHashedFingerprint()

RDKIT_FINGERPRINTS_EXPORT SparseIntVect< std::uint32_t > * RDKit::MorganFingerprints::getHashedFingerprint ( const ROMol &  mol,
unsigned int  radius,
unsigned int  nBits = 2048,
std::vector< boost::uint32_t > *  invariants = nullptr,
const std::vector< boost::uint32_t > *  fromAtoms = nullptr,
bool  useChirality = false,
bool  useBondTypes = true,
bool  onlyNonzeroInvariants = false,
BitInfoMap *  atomsSettingBits = nullptr,
bool  includeRedundantEnvironments = false 
)

returns the hashed Morgan fingerprint for a molecule

These fingerprints are similar to the well-known ECFP or FCFP fingerprints, depending on which invariants are used.

The algorithm used is described in the paper Rogers, D. & Hahn, M. Extended-Connectivity Fingerprints. JCIM 50:742-54 (2010) https://doi.org/10.1021/ci100050t

The original implementation was done using this paper: D. Rogers, R.D. Brown, M. Hahn J. Biomol. Screen. 10:682-6 (2005) and an unpublished technical report: http://www.ics.uci.edu/~welling/teaching/ICS274Bspring06/David%20Rogers%20-%20ECFP%20Manuscript.doc

Parameters
mol: the molecule to be fingerprinted
radius: the number of iterations to grow the fingerprint
nBits: the number of bits in the final fingerprint
invariants: optional pointer to a set of atom invariants to be used. By default ECFP-type invariants are used (calculated by getConnectivityInvariants())
fromAtoms: if this is provided, only the atoms in the vector will be used as centers in the fingerprint
useChirality: if set, additional information will be added to the fingerprint when chiral atoms are discovered. This will cause C[C@H](F)Cl, C[C@@H](F)Cl, and CC(F)Cl to generate different fingerprints.
useBondTypes: if set, bond types will be included as part of the hash for calculating bits
onlyNonzeroInvariants: if set, bits will only be set from atoms that have a nonzero invariant.
atomsSettingBits: if non-null, this will be used to return information about the atoms that set each particular bit. The keys of the map are bit ids; the values are lists of (atomId, radius) pairs.
includeRedundantEnvironments: if set, the check for redundant atom environments will not be done.
Returns
a pointer to the fingerprint. The client is responsible for calling delete on this.
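The description above does not spell out how the nBits argument relates to the unhashed fingerprint. One plausible reading, stated here as an assumption rather than as documented behavior, is that each bit id is reduced modulo nBits and the counts of colliding ids are summed. The sketch below illustrates that folding idea only. SparseCounts is a std::map stand-in for SparseIntVect, and foldCounts is a hypothetical helper, not an RDKit function.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Stand-in for SparseIntVect<std::uint32_t>: maps a bit id to its count.
using SparseCounts = std::map<std::uint32_t, int>;

// Hypothetical helper: folds unhashed bit ids into the range [0, nBits)
// by taking each id modulo nBits (an assumption about the folding scheme).
SparseCounts foldCounts(const SparseCounts &unhashed, unsigned int nBits) {
  assert(nBits > 0 && "nBits must be positive");
  SparseCounts folded;
  for (const auto &entry : unhashed) {
    // Ids that collide after folding have their counts added together.
    folded[entry.first % nBits] += entry.second;
  }
  return folded;
}
```

Under this assumption, a smaller nBits gives a more compact fingerprint at the cost of more collisions between distinct atom environments.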

Variable Documentation

◆ defaultFeatureSmarts

RDKIT_FINGERPRINTS_EXPORT std::vector<std::string> RDKit::MorganFingerprints::defaultFeatureSmarts
extern

◆ morganConnectivityInvariantVersion

const std::string RDKit::MorganFingerprints::morganConnectivityInvariantVersion = "1.0.0"

Definition at line 114 of file FingerprintUtil.h.

◆ morganFeatureInvariantVersion

const std::string RDKit::MorganFingerprints::morganFeatureInvariantVersion = "0.1.0"

Definition at line 132 of file FingerprintUtil.h.

◆ morganFingerprintVersion

const std::string RDKit::MorganFingerprints::morganFingerprintVersion = "1.0.0"

Definition at line 60 of file MorganFingerprints.h.