Compound statistics
Distribution, overlap, and identifier coverage for unique compounds (ns_id).
Top 10 compounds by list count
Most shared compounds across lists.
Compound overlap distribution
How many compounds appear in 1, 2, 3–5, … lists.
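The bucketing behind this chart can be sketched as follows. This is a minimal illustration, not the dashboard's actual query: the function name `overlap_distribution` is hypothetical, and because the caption does not state the buckets beyond 3–5, everything above the last listed bucket goes into a single open-ended bucket as an assumption.

```python
def overlap_distribution(list_counts, buckets=((1, 1), (2, 2), (3, 5))):
    """Count compounds per overlap bucket.

    list_counts: iterable with one integer per compound (number of lists it is in).
    buckets: inclusive (low, high) ranges; only 1, 2 and 3-5 come from the caption.
    """
    labels = [str(lo) if lo == hi else f"{lo}–{hi}" for lo, hi in buckets]
    overflow = f">{buckets[-1][1]}"  # assumed catch-all bucket, not from the source
    dist = {label: 0 for label in labels + [overflow]}
    for n in list_counts:
        if n < 1:
            continue  # a compound in no list is outside the chart's scope
        for (lo, hi), label in zip(buckets, labels):
            if lo <= n <= hi:
                dist[label] += 1
                break
        else:
            dist[overflow] += 1
    return dist
```

Passing a different `buckets` tuple changes the grouping without touching the counting logic.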
Identifier coverage (unique compounds)
Counts of compounds having each identifier in mv_compound_cards.
Missing values in compounds table (%)
Percent of rows where each column is NULL/blank (based on public.compounds row count).
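As a rough sketch of this metric, the percentage can be computed per column as the number of NULL or blank values divided by the total row count. The function name `missing_percent` and the example column names are hypothetical; rows are assumed to be dictionaries already fetched from public.compounds.

```python
def missing_percent(rows, columns):
    """Return {column: percent of rows where the value is None or blank}."""
    rows = list(rows)
    total = len(rows)
    result = {}
    for col in columns:
        missing = 0
        for row in rows:
            value = row.get(col)
            # Treat None and whitespace-only strings as missing.
            if value is None or str(value).strip() == "":
                missing += 1
        result[col] = 100.0 * missing / total if total else 0.0
    return result
```

For example, with four rows where one `inchikey` is blank and two `smiles` values are empty or NULL, the result is 25.0 and 50.0 respectively.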
DTX vs PubChem match coverage
Unique compounds grouped by whether we have DTX (dtx_id) and/or PubChem (pc_id) enrichment.
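The grouping can be illustrated with a small sketch. The four group labels and the function names are assumptions for illustration; only the `dtx_id` and `pc_id` field names come from the caption.

```python
from collections import Counter

def _present(value):
    """True when an identifier is neither None nor blank."""
    return value is not None and str(value).strip() != ""

def match_group(dtx_id, pc_id):
    """Classify one compound by which enrichment sources matched."""
    has_dtx, has_pc = _present(dtx_id), _present(pc_id)
    if has_dtx and has_pc:
        return "DTX + PubChem"
    if has_dtx:
        return "DTX only"
    if has_pc:
        return "PubChem only"
    return "Neither"

def match_coverage(compounds):
    """Count unique compounds per match group (one dict per compound)."""
    return Counter(match_group(c.get("dtx_id"), c.get("pc_id")) for c in compounds)
```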
Identifier coverage (radar)
Same data as the identifier coverage donut chart, shown as percentages of unique compounds.
Identifier completeness score
Distribution of how many identifiers are available per compound (0–6).
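One way to picture the score: count the non-empty identifiers per compound, then tally how many compounds land on each score from 0 to 6. The caption does not name the six identifiers, so the `ID_FIELDS` tuple below is a hypothetical placeholder (only `dtx_id` and `pc_id` appear elsewhere on this page).

```python
from collections import Counter

# Hypothetical identifier fields; the real six are not listed in the source.
ID_FIELDS = ("dtx_id", "pc_id", "cas", "inchikey", "inchi", "smiles")

def completeness_score(compound, fields=ID_FIELDS):
    """Number of identifier fields that are present and non-empty."""
    return sum(1 for f in fields if compound.get(f) not in (None, ""))

def score_distribution(compounds, fields=ID_FIELDS):
    """Return a list where index k holds the number of compounds with score k."""
    counts = Counter(completeness_score(c, fields) for c in compounds)
    return [counts.get(k, 0) for k in range(len(fields) + 1)]
```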
Mass distribution (unique compounds)
Histogram of numeric mass values from mv_compound_cards (0–1000 Da, 50-Da bins).
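A minimal sketch of the binning, assuming half-open bins of the form [low, low + 50) and that values outside 0–1000 Da are simply dropped; the source does not say how the upper edge or out-of-range values are handled, and the function name `mass_histogram` is hypothetical.

```python
def mass_histogram(masses, lo=0.0, hi=1000.0, width=50.0):
    """Count masses per fixed-width bin; returns 20 counts for the defaults."""
    n_bins = int((hi - lo) / width)
    counts = [0] * n_bins
    for m in masses:
        # Skip missing values and anything outside [lo, hi).
        if m is None or not (lo <= m < hi):
            continue
        counts[int((m - lo) // width)] += 1
    return counts
```

With these defaults, a mass of 228.29 Da falls into the fifth bin (200–250 Da), while 5803.64 Da is excluded from the chart.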
Top lists by unique compounds
The 20 largest lists, ranked by the number of DISTINCT ns_id values per list.
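The distinct count matters because a compound may occur more than once in the same list. A small sketch, assuming memberships arrive as (list identifier, ns_id) pairs; the function name `top_lists` and the tie-breaking rule (alphabetical by list identifier) are assumptions.

```python
from collections import defaultdict

def top_lists(memberships, n=20):
    """Rank lists by number of distinct ns_id values, largest first."""
    per_list = defaultdict(set)
    for list_id, ns_id in memberships:
        per_list[list_id].add(ns_id)  # the set removes repeated ns_id values
    sizes = [(list_id, len(ids)) for list_id, ids in per_list.items()]
    # Sort by size descending, then by list identifier for stable ties.
    return sorted(sizes, key=lambda item: (-item[1], str(item[0])))[:n]
```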
H-bond donors vs monoisotopic mass
Bubble bins built from PubChem enrichment (pc_chem): size ∝ √count, color ∝ log(1+count).
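The two scalings compress large counts so that a few crowded bins do not dominate the chart. A minimal sketch, assuming both values are normalized against the largest bin; `bubble_style`, `max_radius`, and the normalization are illustrative choices, not taken from the dashboard.

```python
import math

def bubble_style(count, max_count, max_radius=20.0):
    """Return (radius, color_value) for one bin.

    radius grows with sqrt(count); color_value in [0, 1] grows with log(1 + count).
    """
    if max_count <= 0:
        raise ValueError("max_count must be positive")
    radius = max_radius * math.sqrt(count / max_count)
    color_value = math.log1p(count) / math.log1p(max_count)
    return radius, color_value
```

Under this normalization, a bin holding a quarter of the maximum count is drawn at half the maximum radius, since √0.25 = 0.5.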
Top compounds
Most shared: Bisphenol A (50 lists)
| Compound | Lists |
|---|---|
| Bisphenol A | 50 |
| Dibutyl phthalate | 49 |
| PFOA | 49 |
| Acetaminophen | 48 |
| Perfluorooctanesulfonic acid | 47 |
| Diuron | 46 |
| Caffeine | 44 |
| Thiabendazol | 43 |
| Triclosan | 43 |
| Perfluorobutanesulfonic acid | 42 |
Heaviest compounds
Top mass values from mv_compound_cards (numeric).
| Compound | Mass (Da) |
|---|---|
| Eptotermin alfa | 15664.52 |
| n.a. | 11002.28 |
| CID 168266388 | 10417.74 |
| Mekasermin | 7643.59 |
| LEPIRUDIN | 6981.00 |
| ISIS 2302 | 6363.61 |
| Inulin | 6176.02 |
| MURODERMIN | 6035.54 |
| insulin (human) | 5803.64 |
| Insulin (ox), 8A-l-threonine-10A-l-isoleucine- | 5773.63 |