RDKit
Open-source cheminformatics and machine learning.
class for reading and searching FPB files
#include <FPBReader.h>
Public Member Functions | |
FPBReader () | |
FPBReader (const char *fname, bool lazyRead=false) | |
ctor for reading from a named file | |
FPBReader (const std::string &fname, bool lazyRead=false) | |
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. | |
FPBReader (std::istream *inStream, bool takeOwnership=true, bool lazyRead=false) | |
ctor for reading from an open istream | |
~FPBReader () | |
void | init () |
Read the data from the file and initialize internal data structures. | |
void | cleanup () |
cleanup | |
boost::shared_ptr< ExplicitBitVect > | getFP (unsigned int idx) const |
returns the requested fingerprint as an ExplicitBitVect | |
boost::shared_array< std::uint8_t > | getBytes (unsigned int idx) const |
returns the requested fingerprint as an array of bytes | |
std::string | getId (unsigned int idx) const |
returns the id of the requested fingerprint | |
std::pair< boost::shared_ptr< ExplicitBitVect >, std::string > | operator[] (unsigned int idx) const |
returns the fingerprint and id of the requested fingerprint | |
std::pair< unsigned int, unsigned int > | getFPIdsInCountRange (unsigned int minCount, unsigned int maxCount) |
unsigned int | length () const |
returns the number of fingerprints | |
unsigned int | nBits () const |
returns the number of bits in our fingerprints | |
double | getTanimoto (unsigned int idx, const std::uint8_t *bv) const |
double | getTanimoto (unsigned int idx, boost::shared_array< std::uint8_t > bv) const |
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. | |
double | getTanimoto (unsigned int idx, const ExplicitBitVect &ebv) const |
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. | |
std::vector< std::pair< double, unsigned int > > | getTanimotoNeighbors (const std::uint8_t *bv, double threshold=0.7, bool usePopcountScreen=true) const |
returns Tanimoto neighbors that are within a similarity threshold | |
std::vector< std::pair< double, unsigned int > > | getTanimotoNeighbors (boost::shared_array< std::uint8_t > bv, double threshold=0.7, bool usePopcountScreen=true) const |
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. | |
std::vector< std::pair< double, unsigned int > > | getTanimotoNeighbors (const ExplicitBitVect &ebv, double threshold=0.7, bool usePopcountScreen=true) const |
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. | |
double | getTversky (unsigned int idx, const std::uint8_t *bv, double ca, double cb) const |
double | getTversky (unsigned int idx, boost::shared_array< std::uint8_t > bv, double ca, double cb) const |
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. | |
double | getTversky (unsigned int idx, const ExplicitBitVect &ebv, double ca, double cb) const |
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. | |
std::vector< std::pair< double, unsigned int > > | getTverskyNeighbors (const std::uint8_t *bv, double ca, double cb, double threshold=0.7, bool usePopcountScreen=true) const |
returns Tversky neighbors that are within a similarity threshold | |
std::vector< std::pair< double, unsigned int > > | getTverskyNeighbors (boost::shared_array< std::uint8_t > bv, double ca, double cb, double threshold=0.7, bool usePopcountScreen=true) const |
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. | |
std::vector< std::pair< double, unsigned int > > | getTverskyNeighbors (const ExplicitBitVect &ebv, double ca, double cb, double threshold=0.7, bool usePopcountScreen=true) const |
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. | |
std::vector< unsigned int > | getContainingNeighbors (const std::uint8_t *bv) const |
returns indices of all fingerprints that completely contain this one | |
std::vector< unsigned int > | getContainingNeighbors (boost::shared_array< std::uint8_t > bv) const |
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. | |
std::vector< unsigned int > | getContainingNeighbors (const ExplicitBitVect &ebv) const |
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts. | |
class for reading and searching FPB files
basic usage:
Note: this functionality is experimental and the API may change in future releases.
Note on thread safety: Operations that involve reading from the FPB file are not thread safe. This means that the init() method is not thread safe, and none of the search operations are thread safe when an FPBReader is initialized in lazyRead mode.
Definition at line 58 of file FPBReader.h.
inline
Definition at line 60 of file FPBReader.h.
ctor for reading from a named file
fname | the name of the file to read |
lazyRead | if set to false, all fingerprints from the file will be read into memory when init() is called. |
Definition at line 68 of file FPBReader.h.
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
Definition at line 72 of file FPBReader.h.
inline
ctor for reading from an open istream
inStream | the stream to read from |
takeOwnership | if set, we will take over ownership of the stream pointer |
lazyRead | if set to false, all fingerprints from the file will be read into memory when init() is called. |
Some additional notes:
Definition at line 87 of file FPBReader.h.
inline
Definition at line 94 of file FPBReader.h.
inline
cleanup
Cleans up whatever memory was allocated during init()
Definition at line 119 of file FPBReader.h.
returns the requested fingerprint as an array of bytes
inline
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
Definition at line 242 of file FPBReader.h.
std::vector< unsigned int > RDKit::FPBReader::getContainingNeighbors (const ExplicitBitVect &ebv) const
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
std::vector< unsigned int > RDKit::FPBReader::getContainingNeighbors (const std::uint8_t *bv) const
returns indices of all fingerprints that completely contain this one
(i.e. where all the bits set in the query are also set in the db molecule)
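The containment test described above can be illustrated on raw byte buffers. The following is a minimal standalone sketch, not the RDKit implementation: the function names `containsAllBits` and `containingNeighbors` and the in-memory `db` collection are hypothetical, and a simple linear scan is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns true when every bit set in `query` is also set in `candidate`.
// Both buffers are assumed to hold `nBytes` bytes of packed fingerprint bits.
bool containsAllBits(const std::uint8_t *query, const std::uint8_t *candidate,
                     std::size_t nBytes) {
  assert(nBytes == 0 || (query != nullptr && candidate != nullptr));
  for (std::size_t i = 0; i < nBytes; ++i) {
    // A bit present in the query but absent in the candidate breaks containment.
    if ((query[i] & candidate[i]) != query[i]) {
      return false;
    }
  }
  return true;
}

// Hypothetical linear scan over an in-memory collection of fingerprints;
// returns the indices of all fingerprints that completely contain the query.
std::vector<unsigned int> containingNeighbors(
    const std::vector<std::vector<std::uint8_t>> &db,
    const std::vector<std::uint8_t> &query) {
  std::vector<unsigned int> res;
  for (unsigned int i = 0; i < db.size(); ++i) {
    if (db[i].size() == query.size() &&
        containsAllBits(query.data(), db[i].data(), query.size())) {
      res.push_back(i);
    }
  }
  return res;
}
```

With a query of `{0x05, 0x01}`, a stored fingerprint `{0x0F, 0x01}` qualifies, while `{0x03, 0x00}` does not because bit `0x04` of the first byte is missing.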
boost::shared_ptr< ExplicitBitVect > RDKit::FPBReader::getFP (unsigned int idx) const
returns the requested fingerprint as an ExplicitBitVect
std::pair< unsigned int, unsigned int > RDKit::FPBReader::getFPIdsInCountRange (unsigned int minCount, unsigned int maxCount)
returns beginning and end indices of fingerprints having on-bit counts within the range (including end points)
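A lookup of this kind is cheap when the fingerprints are stored in order of increasing on-bit count. The sketch below assumes exactly that, and it assumes a half-open `[begin, end)` result; neither the storage order nor the convention for the returned end index is stated on this page. The name `idsInCountRange` and the `sortedPopcounts` vector are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Given on-bit counts sorted in increasing order, returns the half-open index
// range [begin, end) of entries whose count lies in [minCount, maxCount].
// An empty range (begin == end) is returned when nothing matches.
std::pair<unsigned int, unsigned int> idsInCountRange(
    const std::vector<unsigned int> &sortedPopcounts, unsigned int minCount,
    unsigned int maxCount) {
  assert(std::is_sorted(sortedPopcounts.begin(), sortedPopcounts.end()));
  // First entry with count >= minCount.
  auto first = std::lower_bound(sortedPopcounts.begin(), sortedPopcounts.end(),
                                minCount);
  // First entry after `first` with count > maxCount.
  auto last = std::upper_bound(first, sortedPopcounts.end(), maxCount);
  return {static_cast<unsigned int>(first - sortedPopcounts.begin()),
          static_cast<unsigned int>(last - sortedPopcounts.begin())};
}
```

Both bounds are found by binary search, so the cost grows logarithmically with the number of fingerprints rather than linearly.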
returns the id of the requested fingerprint
inline
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
Definition at line 153 of file FPBReader.h.
double RDKit::FPBReader::getTanimoto (unsigned int idx, const ExplicitBitVect &ebv) const
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
returns the Tanimoto similarity between the specified fingerprint and the provided fingerprint
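For reference, the Tanimoto similarity of two bit vectors is the number of bits set in both divided by the number of bits set in either. A minimal standalone sketch on byte buffers follows; the helper names `popcount8` and `tanimoto` are hypothetical, and returning 0 for two all-zero fingerprints is an assumed convention, not something this page specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Counts the set bits in a single byte by repeatedly clearing the lowest one.
unsigned int popcount8(std::uint8_t b) {
  unsigned int n = 0;
  while (b) {
    b &= static_cast<std::uint8_t>(b - 1);
    ++n;
  }
  return n;
}

// Tanimoto similarity of two packed fingerprints of `nBytes` bytes each:
// |A & B| / |A | B|.
double tanimoto(const std::uint8_t *a, const std::uint8_t *b,
                std::size_t nBytes) {
  assert(nBytes == 0 || (a != nullptr && b != nullptr));
  unsigned int both = 0;
  unsigned int either = 0;
  for (std::size_t i = 0; i < nBytes; ++i) {
    both += popcount8(static_cast<std::uint8_t>(a[i] & b[i]));
    either += popcount8(static_cast<std::uint8_t>(a[i] | b[i]));
  }
  // Assumed convention: two empty fingerprints have similarity 0.
  if (either == 0) {
    return 0.0;
  }
  return static_cast<double>(both) / either;
}
```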
inline
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
Definition at line 176 of file FPBReader.h.
std::vector< std::pair< double, unsigned int > > RDKit::FPBReader::getTanimotoNeighbors (const ExplicitBitVect &ebv, double threshold=0.7, bool usePopcountScreen=true) const
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
std::vector< std::pair< double, unsigned int > > RDKit::FPBReader::getTanimotoNeighbors (const std::uint8_t *bv, double threshold=0.7, bool usePopcountScreen=true) const
returns Tanimoto neighbors that are within a similarity threshold
The result vector of (similarity, index) pairs is sorted in order of decreasing similarity.
bv | the query fingerprint |
threshold | the minimum similarity to return |
usePopcountScreen | if this is true (the default), the popcount of the neighbors will be used to reduce the number of calculations that need to be done |
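The popcount screen rests on a simple bound: for fingerprints with q and d bits set, the Tanimoto similarity can never exceed min(q, d) / max(q, d), so any candidate whose bound falls below the threshold can be skipped without comparing bits. The sketch below illustrates the idea with a linear scan over an in-memory collection. It is not the RDKit implementation, and the names `Fingerprint`, `bitsInByte`, `countBits`, `tanimotoFp`, and `tanimotoNeighbors` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

using Fingerprint = std::vector<std::uint8_t>;

// Counts the set bits in one byte.
unsigned int bitsInByte(std::uint8_t b) {
  unsigned int n = 0;
  while (b) {
    b &= static_cast<std::uint8_t>(b - 1);
    ++n;
  }
  return n;
}

// Counts the set bits in a whole fingerprint.
unsigned int countBits(const Fingerprint &fp) {
  unsigned int n = 0;
  for (std::uint8_t byte : fp) {
    n += bitsInByte(byte);
  }
  return n;
}

// Tanimoto similarity of two equal-length fingerprints (0 if both are empty).
double tanimotoFp(const Fingerprint &a, const Fingerprint &b) {
  assert(a.size() == b.size());
  unsigned int both = 0;
  unsigned int either = 0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    both += bitsInByte(static_cast<std::uint8_t>(a[i] & b[i]));
    either += bitsInByte(static_cast<std::uint8_t>(a[i] | b[i]));
  }
  return either ? static_cast<double>(both) / either : 0.0;
}

// Returns (similarity, index) pairs with similarity >= threshold,
// sorted by decreasing similarity (ties: larger index first).
std::vector<std::pair<double, unsigned int>> tanimotoNeighbors(
    const std::vector<Fingerprint> &db, const Fingerprint &query,
    double threshold = 0.7, bool usePopcountScreen = true) {
  std::vector<std::pair<double, unsigned int>> res;
  const unsigned int qCount = countBits(query);
  for (unsigned int i = 0; i < db.size(); ++i) {
    if (usePopcountScreen) {
      const unsigned int dCount = countBits(db[i]);
      const unsigned int lo = std::min(qCount, dCount);
      const unsigned int hi = std::max(qCount, dCount);
      // Upper bound on the similarity; skip the full comparison if too low.
      if (hi > 0 && static_cast<double>(lo) / hi < threshold) {
        continue;
      }
    }
    const double sim = tanimotoFp(query, db[i]);
    if (sim >= threshold) {
      res.emplace_back(sim, i);
    }
  }
  std::sort(res.begin(), res.end(),
            std::greater<std::pair<double, unsigned int>>());
  return res;
}
```

The screen never changes the result, only the amount of work: calling the function with `usePopcountScreen` set to `false` returns the same pairs.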
inline
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
Definition at line 199 of file FPBReader.h.
double RDKit::FPBReader::getTversky (unsigned int idx, const ExplicitBitVect &ebv, double ca, double cb) const
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
double RDKit::FPBReader::getTversky (unsigned int idx, const std::uint8_t *bv, double ca, double cb) const
returns the Tversky similarity between the specified fingerprint and the provided fingerprint
idx | the fingerprint to compare to |
bv | the query fingerprint |
ca | the Tversky a coefficient |
cb | the Tversky b coefficient |
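The Tversky index generalizes Tanimoto by weighting the bits unique to each fingerprint separately: with c bits in common, x bits only in the first fingerprint and y bits only in the second, the value is c / (c + ca·x + cb·y). The standalone sketch below uses that textbook form. Which of the stored and query fingerprints ca and cb apply to in getTversky() is not stated on this page, so the argument order here is an assumption, and the names `bitCount` and `tversky` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Counts the set bits in one byte.
unsigned int bitCount(std::uint8_t b) {
  unsigned int n = 0;
  while (b) {
    b &= static_cast<std::uint8_t>(b - 1);
    ++n;
  }
  return n;
}

// Tversky index of two packed fingerprints of `nBytes` bytes each.
// `ca` weights bits set only in `x`; `cb` weights bits set only in `y`.
double tversky(const std::uint8_t *x, const std::uint8_t *y,
               std::size_t nBytes, double ca, double cb) {
  assert(ca >= 0.0 && cb >= 0.0);
  unsigned int common = 0;
  unsigned int onlyX = 0;
  unsigned int onlyY = 0;
  for (std::size_t i = 0; i < nBytes; ++i) {
    common += bitCount(static_cast<std::uint8_t>(x[i] & y[i]));
    onlyX += bitCount(static_cast<std::uint8_t>(x[i] & ~y[i]));
    onlyY += bitCount(static_cast<std::uint8_t>(~x[i] & y[i]));
  }
  const double denom = common + ca * onlyX + cb * onlyY;
  // Assumed convention: return 0 when the denominator vanishes.
  if (denom == 0.0) {
    return 0.0;
  }
  return common / denom;
}
```

Setting ca = cb = 1 reproduces the Tanimoto similarity, and ca = cb = 0.5 gives the Dice similarity.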
inline
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
Definition at line 225 of file FPBReader.h.
std::vector< std::pair< double, unsigned int > > RDKit::FPBReader::getTverskyNeighbors (const ExplicitBitVect &ebv, double ca, double cb, double threshold=0.7, bool usePopcountScreen=true) const
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
std::vector< std::pair< double, unsigned int > > RDKit::FPBReader::getTverskyNeighbors (const std::uint8_t *bv, double ca, double cb, double threshold=0.7, bool usePopcountScreen=true) const
returns Tversky neighbors that are within a similarity threshold
The result vector of (similarity, index) pairs is sorted in order of decreasing similarity.
bv | the query fingerprint |
ca | the Tversky a coefficient |
cb | the Tversky b coefficient |
threshold | the minimum similarity to return |
usePopcountScreen | if this is true (the default), the popcount of the neighbors will be used to reduce the number of calculations that need to be done |
void RDKit::FPBReader::init ()
Read the data from the file and initialize internal data structures.
This must be called before most of the other methods of this class.
Some notes:
If lazyRead is not set, all fingerprints will be read into memory. This can require substantial amounts of memory for large files.
If lazyRead and takeOwnership are both false, it is safe to close and delete inStream after calling init().
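The ownership rule in the notes above follows from eager reading: once everything has been copied into memory, a reader that does not own the stream has no further need for it. The sketch below illustrates this with a hypothetical `EagerReader` class and helper `readThenRelease`. It mirrors only the `takeOwnership` idea; it does not parse FPB data and is not the RDKit implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical reader that copies the whole stream into memory in init().
class EagerReader {
 public:
  explicit EagerReader(std::istream *inStream, bool takeOwnership = true)
      : d_stream(inStream), d_owner(takeOwnership) {}
  ~EagerReader() {
    // Only delete the stream if ownership was transferred to us.
    if (d_owner) {
      delete d_stream;
    }
  }
  EagerReader(const EagerReader &) = delete;
  EagerReader &operator=(const EagerReader &) = delete;

  // Reads all remaining bytes; afterwards a non-owned stream is never touched.
  void init() {
    assert(d_stream != nullptr);
    d_data.assign(std::istreambuf_iterator<char>(*d_stream),
                  std::istreambuf_iterator<char>());
    if (!d_owner) {
      d_stream = nullptr;
    }
  }

  std::size_t size() const { return d_data.size(); }

 private:
  std::istream *d_stream;
  bool d_owner;
  std::string d_data;
};

// Demonstrates that the caller may delete a non-owned stream after init().
std::size_t readThenRelease(const std::string &payload) {
  std::istringstream *stream = new std::istringstream(payload);
  EagerReader reader(stream, /*takeOwnership=*/false);
  reader.init();
  delete stream;  // safe: the reader holds its own copy of the data
  return reader.size();
}
```

A lazily reading variant would have to keep the stream open for as long as searches are performed, which is why the guarantee applies only when lazy reading is off.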
Referenced by RDKit::MultiFPBReader::addReader().
inline
returns the fingerprint and id of the requested fingerprint
Definition at line 134 of file FPBReader.h.
References RDKit::getFP().