CATEGORY
Transform
SOURCE
RDKit
DESCRIPTION
Two cells performing molecular standardisation using the MolVS and Flatkinson standardisers.
INPUTS
A Dataset of Molecules
OUTPUTS
A Dataset of Molecules
OPTIONS
RDKit version | Which version of RDKit to use
ADDITIONAL INFO
Implemented in RDKit using sanifier.py from the Pipelines project.
MolVStandardiser is based on code developed by Matt Swain and can be found here.
FlatkinsonStandardiser is based on code developed by Francis Atkinson and can be found here.
The aim is to provide alternative standardisers suited to different purposes. It is likely that we will also provide multiple configurations of them.
Output
Each input molecule is transformed according to the standardiser’s rules. The output is the transformed molecule, or the input molecule if no transformations are applied. For each input molecule one molecule is output.
A field named “Standardised” is added to each output molecule to indicate what was done. It takes one of 3 values:
- True - the molecule was transformed
- False - the molecule was not transformed
- Error - an error occurred during standardisation and the input molecule is returned without any changes
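The tagging behaviour described above can be sketched in plain Python. This is only an illustration of the one-in, one-out rule and the three “Standardised” values, not the code used by the cells. The record layout (a dict with a `smiles` key), the `tag_standardised` helper and the toy `strip_sodium` standardiser are all hypothetical, and comparing SMILES strings to detect a change is a simplifying assumption.

```python
from typing import Callable, Dict, Iterable, Iterator


def tag_standardised(
    records: Iterable[Dict[str, str]],
    standardise_fn: Callable[[str], str],
) -> Iterator[Dict[str, str]]:
    """Yield exactly one output record per input record, with a 'Standardised' field."""
    for record in records:
        out = dict(record)  # copy so the input record is left untouched
        try:
            result = standardise_fn(record["smiles"])
        except Exception:
            # Standardisation failed: return the input molecule unchanged.
            out["Standardised"] = "Error"
        else:
            # Assumption: a change is detected by comparing the SMILES strings.
            changed = result != record["smiles"]
            out["smiles"] = result
            out["Standardised"] = "True" if changed else "False"
        yield out


def strip_sodium(smiles: str) -> str:
    """Hypothetical stand-in for a real standardiser: drops a sodium counter-ion."""
    if not smiles:
        raise ValueError("empty molecule")
    return ".".join(part for part in smiles.split(".") if part != "[Na+]")


molecules = [{"smiles": "CC(=O)[O-].[Na+]"}, {"smiles": "CCO"}, {"smiles": ""}]
results = list(tag_standardised(molecules, strip_sodium))
for r in results:
    print(r)
```

In this sketch the first molecule is tagged True because the counter-ion is removed, the second is tagged False because nothing changes, and the third is tagged Error and passed through as it was received.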