SigFinder

SigFinder is a tool that exploits the statistical significance of substructures in a given compound dataset. It is based on the GraphSig technology and can be used for two purposes:

  1. Identification of significant substructures
  2. Classification of the dataset into categories on the basis of blood-brain barrier (BBB) permeability, toxicity, or ADME properties.

[Figure: GraphSig idea]

Features and Capabilities

  • Given a compound dataset, this tool mines statistically significant substructures: substructures that are representative of the dataset because they are overrepresented or underrepresented in it. This can help in the design of new drugs with similar capabilities.

[Figure: GraphSig concept]

  • It can mine substructures that a simple frequent-substructure search on the dataset would not surface. Some of the substructures found have less than 0.1% support in our test datasets.
  • It can be used as an in silico tool to predict BBB permeability or toxicity, properties that depend not only on a molecule's pharmacokinetic properties but also on the arrangement and interaction of its topological fragments.
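
To make the idea of a rare but significant substructure concrete, here is a minimal Python sketch. It assumes significance is measured as a binomial tail probability: the chance of seeing a substructure in at least the observed number of molecules if each molecule contained it independently with some prior probability. The function name, the prior, and the example counts are illustrative assumptions, not SigFinder's actual interface or data, and the tool's own significance model may differ.

```python
from math import exp, lgamma, log


def upper_tail_pvalue(n_molecules: int, observed: int, prior: float) -> float:
    """P(X >= observed) for X ~ Binomial(n_molecules, prior).

    Computed in log space so that large datasets do not overflow.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    if observed <= 0:
        return 1.0  # every outcome has at least zero occurrences

    total = 0.0
    for k in range(observed, n_molecules + 1):
        # log of C(n, k) * prior^k * (1 - prior)^(n - k)
        log_pmf = (
            lgamma(n_molecules + 1)
            - lgamma(k + 1)
            - lgamma(n_molecules - k + 1)
            + k * log(prior)
            + (n_molecules - k) * log(1.0 - prior)
        )
        total += exp(log_pmf)
    return min(total, 1.0)


# Hypothetical numbers: a substructure seen in 8 of 10,000 molecules
# (0.08% support) whose assumed chance of occurring in a molecule is 1e-5.
p_value = upper_tail_pvalue(n_molecules=10_000, observed=8, prior=1e-5)
print(f"p-value = {p_value:.3e}")
```

A frequency threshold alone would discard this substructure because its support is so low, whereas the tail probability shows it occurs far more often than the assumed prior predicts. Underrepresentation can be tested the same way with the lower tail, P(X <= observed).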

Method

The workflow for identifying significant substructures with SigFinder is as follows:

[Figure: SigFinder workflow for identifying significant substructures]

 

The workflow for classifying chemical compounds with SigFinder is as follows:

[Figure: SigFinder workflow for classifying chemical compounds]
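
As a rough illustration of how significant substructures could drive classification, the Python sketch below scores a molecule against each class by summing -log10(p) over that class's significant substructures that the molecule contains. This is a toy voting scheme under strong assumptions, not SigFinder's actual classifier: molecules and substructures are reduced to sets of hypothetical fragment labels, containment is a plain subset test rather than subgraph matching, and every label and p-value in `SIGNATURES` is invented for the example.

```python
from math import log10
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

# (fragment labels of a substructure, its p-value); all values are made up.
Signature = Tuple[FrozenSet[str], float]

SIGNATURES: Dict[str, List[Signature]] = {
    "permeable": [
        (frozenset({"aromatic_ring", "tertiary_amine"}), 1e-6),
        (frozenset({"halogen"}), 1e-3),
    ],
    "non_permeable": [
        (frozenset({"carboxylic_acid"}), 1e-4),
        (frozenset({"sulfonate"}), 1e-5),
    ],
}


def class_score(fragments: Set[str], signatures: List[Signature]) -> float:
    """Sum -log10(p) over the substructures fully contained in the molecule."""
    return sum(-log10(p) for sub, p in signatures if sub <= fragments)


def classify(fragments: Set[str],
             signatures: Dict[str, List[Signature]]) -> Optional[str]:
    """Return the class with the highest score, or None if nothing matches."""
    scores = {label: class_score(fragments, subs)
              for label, subs in signatures.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


molecule = {"aromatic_ring", "tertiary_amine", "methyl"}
print(classify(molecule, SIGNATURES))
```

Weighting each match by -log10(p) lets a single highly significant substructure outweigh several weakly significant ones, which is one simple way to turn mined substructures into class evidence.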

 

Validation

  • We discovered the following significant substructures among the ACE inhibitors in the MDDR database.

[Figure: significant substructures found in ACE inhibitors]

  • We applied SigFinder to discover significant substructures in a dataset of compounds labeled as blood-brain barrier permeable or non-permeable. Two examples are shown below.

[Figure: significant substructures for BBB permeability]

  • We applied SigFinder to classify the anti-cancer screen dataset into actives and inactives. The BEDROC scores comparing our performance with that of Daylight are shown below.

[Table: BEDROC scores for SigFinder and Daylight]
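
For readers unfamiliar with BEDROC, the Python sketch below shows how the score can be computed from a ranked list of active/inactive labels, following the definition by Truchon and Bayly in its normalized form (RIE rescaled between its minimum and maximum). The function name and the choice of alpha = 20 are assumptions made for illustration; the source does not state which alpha or implementation was used for the reported scores.

```python
from math import exp
from typing import Sequence


def bedroc(ranked_is_active: Sequence[bool], alpha: float = 20.0) -> float:
    """BEDROC for a list ordered from best-scored to worst-scored compound.

    Returns a value in [0, 1]; higher means actives are ranked earlier.
    """
    n_total = len(ranked_is_active)
    n_actives = sum(1 for flag in ranked_is_active if flag)
    if n_actives == 0 or n_actives == n_total:
        raise ValueError("need at least one active and one inactive compound")

    ratio = n_actives / n_total

    # Exponentially weighted sum over the 1-based ranks of the actives.
    weighted = sum(
        exp(-alpha * rank / n_total)
        for rank, flag in enumerate(ranked_is_active, start=1)
        if flag
    )
    # Expected weight per active under a uniformly random ranking.
    random_weight = (1.0 / n_total) * (1.0 - exp(-alpha)) / (exp(alpha / n_total) - 1.0)
    rie = weighted / (n_actives * random_weight)

    # Extremes of RIE: all actives ranked first, and all actives ranked last.
    rie_max = (1.0 - exp(-alpha * ratio)) / (ratio * (1.0 - exp(-alpha)))
    rie_min = (1.0 - exp(alpha * ratio)) / (ratio * (1.0 - exp(alpha)))
    return (rie - rie_min) / (rie_max - rie_min)


# Hypothetical ranking of ten compounds, two of them active.
ranking = [True, False, True, False, False, False, False, False, False, False]
print(f"BEDROC = {bedroc(ranking):.3f}")
```

Because the exponential weight decays with rank, BEDROC rewards methods that place actives near the top of the list, which is why it is often preferred over plain ROC AUC for virtual screening comparisons.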

More details on this technology can be found here.

References

  1. Huahai He; Ambuj K. Singh. GraphRank: Statistical Modeling and Mining of Significant Subgraphs in the Feature Space. Proceedings of the 6th IEEE International Conference on Data Mining (ICDM), December 2006. DOI: 10.1109/ICDM.2006.79
  2. Sayan Ranu; Ambuj Singh. Mining Statistically Significant Molecular Substructures for Efficient Molecular Classification. J. Chem. Inf. Model., 2009, 49(11), pp 2537–2550. DOI: 10.1021/ci900035z
 

 

 

Click here to download the SigFinder brochure