RDKit
Open-source cheminformatics and machine learning.
#include <InfoBitRanker.h>
Public Types
enum InfoType { ENTROPY = 1, BIASENTROPY = 2, CHISQUARE = 3, BIASCHISQUARE = 4 }
The type of measure for information.
Public Member Functions
InfoBitRanker (unsigned int nBits, unsigned int nClasses, InfoType infoType=InfoBitRanker::ENTROPY)
Constructor.
~InfoBitRanker ()
void accumulateVotes (const ExplicitBitVect &bv, unsigned int label)
Accumulate the votes for all the bits turned on in a bit vector.
void accumulateVotes (const SparseBitVect &bv, unsigned int label)
double * getTopN (unsigned int num)
Returns the top n bits ranked by the information metric.
unsigned int getNumInstances () const
Return the number of labelled instances (examples) or fingerprints seen so far.
unsigned int getNumClasses () const
Return the number of classes.
void setBiasList (RDKit::INT_VECT &classList)
Set the classes to which the entropy calculation should be biased.
void setMaskBits (RDKit::INT_VECT &maskBits)
Set the bits to be used as a mask.
void writeTopBitsToStream (std::ostream *outStream) const
Write the top N bits to a stream.
void writeTopBitsToFile (const std::string &fileName) const
Write the top bits to a file.
Definition at line 87 of file InfoBitRanker.h.
enum RDInfoTheory::InfoBitRanker::InfoType
The type of measure for information.
Enumerator: ENTROPY, BIASENTROPY, CHISQUARE, BIASCHISQUARE
Definition at line 92 of file InfoBitRanker.h.
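RDInfoTheory::InfoBitRanker::InfoBitRanker (unsigned int nBits, unsigned int nClasses, InfoType infoType = InfoBitRanker::ENTROPY) [inline]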
Constructor.
ARGUMENTS:
Definition at line 111 of file InfoBitRanker.h.
RDInfoTheory::InfoBitRanker::~InfoBitRanker () [inline]
Definition at line 128 of file InfoBitRanker.h.
void RDInfoTheory::InfoBitRanker::accumulateVotes (const ExplicitBitVect &bv, unsigned int label)
Accumulate the votes for all the bits turned on in a bit vector.
ARGUMENTS:
void RDInfoTheory::InfoBitRanker::accumulateVotes (const SparseBitVect &bv, unsigned int label)
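The vote accumulation described above can be pictured as a per-bit, per-class tally. The following is a minimal sketch, not RDKit's implementation: `VoteTable`, its flat `counts` layout, and the use of `std::vector<bool>` in place of `ExplicitBitVect` or `SparseBitVect` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the ranker's internal tally (not RDKit's layout).
struct VoteTable {
  unsigned int nBits;
  unsigned int nClasses;
  std::vector<unsigned int> counts;      // counts[bit * nClasses + label]
  std::vector<unsigned int> classSizes;  // fingerprints seen per class

  VoteTable(unsigned int bits, unsigned int classes)
      : nBits(bits),
        nClasses(classes),
        counts(static_cast<std::size_t>(bits) * classes, 0u),
        classSizes(classes, 0u) {}

  // std::vector<bool> is an assumed substitute for an RDKit bit vector.
  void accumulateVotes(const std::vector<bool> &bv, unsigned int label) {
    assert(bv.size() == nBits);
    assert(label < nClasses);
    ++classSizes[label];
    for (unsigned int bit = 0; bit < nBits; ++bit) {
      if (bv[bit]) {
        // One vote for (bit, label) per fingerprint that has the bit on.
        ++counts[static_cast<std::size_t>(bit) * nClasses + label];
      }
    }
  }

  // Total number of labelled fingerprints seen so far.
  unsigned int numInstances() const {
    unsigned int total = 0;
    for (unsigned int n : classSizes) total += n;
    return total;
  }
};
```

Keeping the counts in one flat array indexed by `bit * nClasses + label` is only one possible layout; it makes the per-bit class counts contiguous, which is convenient when each bit is scored independently.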
unsigned int RDInfoTheory::InfoBitRanker::getNumClasses () const [inline]
Return the number of classes.
Definition at line 169 of file InfoBitRanker.h.
unsigned int RDInfoTheory::InfoBitRanker::getNumInstances () const [inline]
Return the number of labelled instances (examples) or fingerprints seen so far.
Definition at line 164 of file InfoBitRanker.h.
double * RDInfoTheory::InfoBitRanker::getTopN (unsigned int num)
Returns the top n bits ranked by the information metric.
This is the function where most of the ranking work happens.
ARGUMENTS: num - the number of top-ranked bits required
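To illustrate what ranking bits by an information metric can look like, here is a minimal sketch that scores each bit by the standard information gain, H(class) - H(class | bit), and keeps the best `num` bits. It assumes the ENTROPY measure behaves like this textbook definition; the helper names `entropy`, `infoGain`, and `topN` are hypothetical, and the return type differs from the `double *` that `getTopN` returns.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

typedef std::pair<unsigned int, double> BitScore;  // (bit id, gain)

// Shannon entropy, in bits, of a vector of class counts.
double entropy(const std::vector<unsigned int> &counts) {
  double total = 0.0;
  for (unsigned int c : counts) total += c;
  if (total == 0.0) return 0.0;
  double h = 0.0;
  for (unsigned int c : counts) {
    if (c > 0) {
      double p = c / total;
      h -= p * std::log2(p);
    }
  }
  return h;
}

// Information gain of one bit, given how many fingerprints of each class
// have the bit on (onCounts) and how many fingerprints each class has.
double infoGain(const std::vector<unsigned int> &onCounts,
                const std::vector<unsigned int> &classSizes) {
  assert(onCounts.size() == classSizes.size());
  std::vector<unsigned int> offCounts(classSizes.size());
  double nOn = 0.0, nTotal = 0.0;
  for (std::size_t i = 0; i < classSizes.size(); ++i) {
    assert(onCounts[i] <= classSizes[i]);
    offCounts[i] = classSizes[i] - onCounts[i];
    nOn += onCounts[i];
    nTotal += classSizes[i];
  }
  if (nTotal == 0.0) return 0.0;
  double pOn = nOn / nTotal;
  return entropy(classSizes) - pOn * entropy(onCounts) -
         (1.0 - pOn) * entropy(offCounts);
}

// Scores every bit and returns the `num` highest-gain bits, best first.
std::vector<BitScore> topN(
    const std::vector<std::vector<unsigned int> > &onCountsPerBit,
    const std::vector<unsigned int> &classSizes, unsigned int num) {
  std::vector<BitScore> ranked;
  for (unsigned int bit = 0; bit < onCountsPerBit.size(); ++bit) {
    ranked.push_back(BitScore(bit, infoGain(onCountsPerBit[bit], classSizes)));
  }
  std::stable_sort(ranked.begin(), ranked.end(),
                   [](const BitScore &a, const BitScore &b) {
                     return a.second > b.second;
                   });
  if (ranked.size() > num) ranked.resize(num);
  return ranked;
}
```

A bit that is on for every member of one class and off for every member of the other receives the maximum gain, while a bit that is equally frequent in all classes receives a gain of zero.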
void RDInfoTheory::InfoBitRanker::setBiasList (RDKit::INT_VECT &classList)
Set the classes to which the entropy calculation should be biased.
This list contains the class ids used in the BIASENTROPY mode of ranking bits. In this mode, a bit must be more strongly correlated with one of the biased classes than with all the other classes. For example, in a two-class problem with actives and inactives and a bias toward actives, the fraction of actives that hit the bit has to be greater than the fraction of inactives that hit the bit.
ARGUMENTS: classList - list of class ids that we want a bias towards
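The bias condition described above can be sketched as a comparison of per-class hit fractions. This is an illustrative reading of the text, not RDKit's code: `passesBias` is a hypothetical helper, and `std::vector<int>` is used for the class list on the assumption that it matches `RDKit::INT_VECT`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of RDKit. Returns true when the highest hit
// fraction among the biased classes is strictly greater than the highest hit
// fraction among the remaining classes.
bool passesBias(const std::vector<unsigned int> &onCounts,
                const std::vector<unsigned int> &classSizes,
                const std::vector<int> &biasList) {
  assert(onCounts.size() == classSizes.size());
  double bestBiased = 0.0;
  double bestOther = 0.0;
  for (std::size_t cls = 0; cls < classSizes.size(); ++cls) {
    // Fraction of this class's fingerprints that have the bit set.
    double frac = classSizes[cls] > 0
                      ? static_cast<double>(onCounts[cls]) / classSizes[cls]
                      : 0.0;
    bool biased = false;
    for (int id : biasList) {
      if (id >= 0 && static_cast<std::size_t>(id) == cls) biased = true;
    }
    if (biased) {
      if (frac > bestBiased) bestBiased = frac;
    } else {
      if (frac > bestOther) bestOther = frac;
    }
  }
  return bestBiased > bestOther;
}
```

With 10 actives (class 0) and 100 inactives (class 1), a bit set in 5 actives and 20 inactives passes a bias toward class 0, since 0.5 exceeds 0.2, even though more inactives than actives hit the bit in absolute terms.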
void RDInfoTheory::InfoBitRanker::setMaskBits (RDKit::INT_VECT &maskBits)
Set the bits to be used as a mask.
If this function is called, only the bits which are present in the maskBits list will be used.
ARGUMENTS: maskBits - the bits to be considered
void RDInfoTheory::InfoBitRanker::writeTopBitsToFile (const std::string &fileName) const
Write the top bits to a file.
void RDInfoTheory::InfoBitRanker::writeTopBitsToStream (std::ostream *outStream) const
Write the top N bits to a stream.