Assignment of EC Numbers to Enzymatic Reactions with Reaction Difference Fingerprints
EC numbers represent enzymes and enzyme genes (genomic information), but they are also used as identifiers of enzymatic reactions (chemical information). In the present work (ECAssigner), our newly proposed reaction difference fingerprints (RDF) are applied to assign EC numbers to enzymatic reactions. Subtracting the fingerprints of the product molecules from the fingerprints of the reactant molecules yields the reaction difference fingerprint, which is then used to calculate the reaction Euclidean distance, a measure of reaction similarity, between two reactions. The EC number of the most similar training reaction is assigned to an input reaction. For 5120 balanced enzymatic reactions, the RDF with a fingerprint length of 3 achieved cross-validation accuracies of 83.1%, 86.7%, and 92.6% at the sub-subclass, subclass, and main class levels, respectively. Compared with three published methods, ECAssigner is the first fully automatic server for EC number assignment. The EC assignment system (ECAssigner) is freely available at http://cadd.whu.edu.cn/ecassigner/.
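The assignment procedure described above can be illustrated with a minimal sketch. It assumes each molecular fingerprint is a mapping from a structural feature to its count; the abstract does not specify the fingerprint type, so the feature names, the function names, and the training data below are hypothetical and chosen only for illustration. This is not the ECAssigner implementation.

```python
from collections import Counter
from math import sqrt


def reaction_difference_fingerprint(reactants, products):
    """Summed reactant fingerprints minus summed product fingerprints.

    Each fingerprint is assumed to be a dict of feature -> count.
    Features that cancel out (difference of zero) are dropped.
    """
    diff = Counter()
    for fp in reactants:
        for feature, count in fp.items():
            diff[feature] += count
    for fp in products:
        for feature, count in fp.items():
            diff[feature] -= count
    return {f: c for f, c in diff.items() if c != 0}


def euclidean_distance(rdf_a, rdf_b):
    """Euclidean distance between two sparse difference fingerprints."""
    features = set(rdf_a) | set(rdf_b)
    return sqrt(sum((rdf_a.get(f, 0) - rdf_b.get(f, 0)) ** 2 for f in features))


def assign_ec(query_rdf, training):
    """Return the EC label of the nearest training reaction.

    `training` is a list of (ec_label, rdf) pairs.
    """
    ec_label, _ = min(training, key=lambda item: euclidean_distance(query_rdf, item[1]))
    return ec_label


# Toy training set with made-up feature counts (assumption, not real data).
training = [
    ("1.1.1", {"C-O-H": 1, "C=O": -1}),
    ("3.1.1", {"C-O-C": 1, "O-H": -2}),
]

# Toy query: one reactant and one product; the shared "C-H" feature cancels.
query = reaction_difference_fingerprint(
    reactants=[{"C-O-H": 1, "C-H": 3}],
    products=[{"C=O": 1, "C-H": 3}],
)
print(assign_ec(query, training))
```

Because features shared by reactants and products cancel in the subtraction, the difference fingerprint keeps only the features that change during the reaction, which is why a simple nearest-neighbour search over these vectors can group reactions by transformation type.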