RDKit
Open-source cheminformatics and machine learning.
Classes
  class BitCorrMatGenerator
  class InfoBitRanker
Typedefs
  typedef std::vector< RDKit::USHORT > USHORT_VECT
  typedef std::vector< USHORT_VECT > VECT_USHORT_VECT
Functions
  template<class T> double ChiSquare (T *dMat, long int dim1, long int dim2)
  template<class T> double InfoEntropy (T *tPtr, long int dim)
  template<class T> double InfoEntropyGain (T *dMat, long int dim1, long int dim2)
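Class used to rank bits based on a specified measure of information.
Basically a primitive mimic of the CombiChem "signal" functionality. To use it, construct an InfoBitRanker, call AccumulateVotes() once per bit vector with its class label, then call GetTopN() to retrieve the top-ranked bits.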
Sample usage and results from the Python wrapper. Here's a small set of vectors:
    >>> for i, bv in enumerate(bvs):
    ...     print(bv.ToBitString(), acts[i])
    ...
    0001 0
    0101 0
    0010 1
    1110 1
Default ranker, using infogain:
    >>> ranker = InfoBitRanker(4, 2)
    >>> for i, bv in enumerate(bvs):
    ...     ranker.AccumulateVotes(bv, acts[i])
    ...
    >>> for bit, gain, n0, n1 in ranker.GetTopN(3):
    ...     print(int(bit), '%.3f' % gain, int(n0), int(n1))
    ...
    3 1.000 2 0
    2 1.000 0 2
    0 0.311 0 1
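The ranking above can be reproduced without RDKit to see where the numbers come from. What follows is a minimal pure-Python sketch, not the library's implementation: `entropy`, `info_gain`, and `rank_bits` are hypothetical helper names, and the sketch assumes that the gain is the entropy of the class totals minus the count-weighted entropy of the bit-on and bit-off groups, which is consistent with the values printed above. The tie-break between equally scored bits is chosen only to match that listing.

```python
from math import log2


def entropy(counts):
    """Shannon entropy, in bits, of a list of non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def info_gain(table):
    """Information gain of a contingency table given as a list of rows.

    Rows are the values of the variable (here: bit on, bit off);
    columns are the classes.
    """
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(col_totals)
    if n == 0:
        return 0.0
    weighted = sum(sum(row) / n * entropy(row) for row in table)
    return entropy(col_totals) - weighted


def rank_bits(bit_strings, acts, n_classes, top_n):
    """Hypothetical stand-in for the accumulate-then-rank workflow.

    Returns tuples of (bit index, gain, on-count per class...).
    """
    n_bits = len(bit_strings[0])
    class_counts = [0] * n_classes
    on_counts = [[0] * n_classes for _ in range(n_bits)]
    for bits, act in zip(bit_strings, acts):
        class_counts[act] += 1
        for i, ch in enumerate(bits):
            if ch == "1":
                on_counts[i][act] += 1
    scored = []
    for i, on in enumerate(on_counts):
        off = [t - c for t, c in zip(class_counts, on)]
        scored.append((i, info_gain([on, off]), *on))
    # Highest gain first; ties go to the higher bit index (an assumption
    # made only so the order matches the listing above).
    scored.sort(key=lambda r: (-r[1], -r[0]))
    return scored[:top_n]


bvs = ["0001", "0101", "0010", "1110"]
acts = [0, 0, 1, 1]
for bit, gain, n0, n1 in rank_bits(bvs, acts, 2, 3):
    print(bit, "%.3f" % gain, n0, n1)
```

Bit 3 is set only in class 0 and bit 2 only in class 1, so each separates the classes perfectly and scores a full bit of gain; bit 0 is set in just one of the four vectors and therefore scores lower.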
Using the biased infogain:
    >>> ranker = InfoBitRanker(4, 2, InfoTheory.InfoType.BIASENTROPY)
    >>> ranker.SetBiasList((1,))
    >>> for i, bv in enumerate(bvs):
    ...     ranker.AccumulateVotes(bv, acts[i])
    ...
    >>> for bit, gain, n0, n1 in ranker.GetTopN(3):
    ...     print(int(bit), '%.3f' % gain, int(n0), int(n1))
    ...
    2 1.000 0 2
    0 0.311 0 1
    1 0.000 1 1
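In the biased listing, bit 3 (set only in class 0) drops out once the bias list is `(1,)`. The sketch below shows one filtering rule that is consistent with that output; it is an assumption, not a statement of how the library decides. The hypothetical `passes_bias` keeps a bit only if no non-bias class has a higher fraction of its members with the bit set than the best bias class. The input rows are the per-bit scores and counts taken from the listings above.

```python
def passes_bias(on_counts, class_counts, bias_classes):
    """Assumed rule: keep the bit unless some non-bias class has a higher
    on-fraction than the best bias class."""
    fracs = [c / t if t else 0.0 for c, t in zip(on_counts, class_counts)]
    best_bias = max(fracs[k] for k in bias_classes)
    return all(f <= best_bias
               for k, f in enumerate(fracs) if k not in bias_classes)


# (bit, infogain, n0, n1) as printed in the listings above.
rows = [(3, 1.000, 2, 0), (2, 1.000, 0, 2), (0, 0.311, 0, 1), (1, 0.000, 1, 1)]
class_counts = [2, 2]  # two vectors in each class

kept = [r for r in rows if passes_bias(r[2:], class_counts, bias_classes=(1,))]
kept.sort(key=lambda r: -r[1])  # highest score first
for bit, score, n0, n1 in kept[:3]:
    print(bit, "%.3f" % score, n0, n1)
```

Under this rule bit 1, which is set equally often in both classes, survives the filter with a score of zero, as it does in the listing.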
A chi squared ranker is also available:
    >>> ranker = InfoBitRanker(4, 2, InfoTheory.InfoType.CHISQUARE)
    >>> for i, bv in enumerate(bvs):
    ...     ranker.AccumulateVotes(bv, acts[i])
    ...
    >>> for bit, gain, n0, n1 in ranker.GetTopN(3):
    ...     print(int(bit), '%.3f' % gain, int(n0), int(n1))
    ...
    3 4.000 2 0
    2 4.000 0 2
    0 1.333 0 1
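The chi squared scores above match the ordinary Pearson statistic computed on each bit's 2x2 table of (bit on, bit off) by (class 0, class 1). The following is a small pure-Python sketch under that assumption; `chi_square` is a hypothetical helper, not the library function, and the per-bit counts are copied from the listing.

```python
def chi_square(table):
    """Pearson chi-square statistic of a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if n == 0:
        return 0.0
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip empty rows or columns
                stat += (observed - expected) ** 2 / expected
    return stat


class_counts = [2, 2]  # two vectors in each class
on_counts = {3: [2, 0], 2: [0, 2], 0: [0, 1]}  # bit -> on-count per class
for bit, on in on_counts.items():
    off = [t - c for t, c in zip(class_counts, on)]
    print(bit, "%.3f" % chi_square([on, off]), *on)
```

With four vectors, a bit that splits the classes perfectly reaches the maximum possible value of 4.0, which is why bits 3 and 2 tie at the top.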
As is a biased chi squared:
    >>> ranker = InfoBitRanker(4, 2, InfoTheory.InfoType.BIASCHISQUARE)
    >>> ranker.SetBiasList((1,))
    >>> for i, bv in enumerate(bvs):
    ...     ranker.AccumulateVotes(bv, acts[i])
    ...
    >>> for bit, gain, n0, n1 in ranker.GetTopN(3):
    ...     print(int(bit), '%.3f' % gain, int(n0), int(n1))
    ...
    2 4.000 0 2
    0 1.333 0 1
    1 0.000 1 1
typedef std::vector<RDKit::USHORT> RDInfoTheory::USHORT_VECT
Definition at line 84 of file InfoBitRanker.h.
typedef std::vector<USHORT_VECT> RDInfoTheory::VECT_USHORT_VECT
Definition at line 85 of file InfoBitRanker.h.
double RDInfoTheory::ChiSquare (T *dMat, long int dim1, long int dim2)
Definition at line 15 of file InfoGainFuncs.h.
double RDInfoTheory::InfoEntropy (T *tPtr, long int dim)
Definition at line 68 of file InfoGainFuncs.h.
Referenced by InfoEntropyGain().
double RDInfoTheory::InfoEntropyGain (T *dMat, long int dim1, long int dim2)
Definition at line 89 of file InfoGainFuncs.h.
References InfoEntropy().