Thank you, Noel, for looking into this.
for an intermediate file.
regardless of the input format.
another subject. As a related note, we have tested several atom typing
analyzing 3600 structures in the PDBbind database.
When I convert the molecules as given with obabel, you're right - you
run into a bug that's been fixed on the development branch -
aromaticity is perceived differently depending on the presence/absence
obabel 3rlb_ligand.* -osmi
Cc1nc(N)c(Cn2csc(CCO)c2C)cn1 3rlb_ligand
Cc1nc(N)c(CN2CSC(=C2C)CCO)cn1 ./3rlb_ligand.pdb
If you delete the explicit Hs first, you can get the same aromaticity
obabel 3rlb_ligand.* -d -O tmp.sdf
obabel tmp.sdf -osmi
Cc1nc(N)c(CN2=CSC(=C2C)CCO)cn1 3rlb_ligand
Cc1nc(N)c(CN2CSC(=C2C)CCO)cn1 ./3rlb_ligand.pdb
If you paste these SMILES into MarvinSketch you can see the
difference. The MOL2 file contains an extra double bond to a nitrogen.
So what's going on?...
I'm guessing that the correct structure is in the MOL2 file, but it
was read incorrectly by Open Babel and so is missing the charge on the
4-valent nitrogen. MOL2 is a horrible format but we should do a better
job. I note in passing that MarvinSketch interprets it the same as
Open Babel but that's no excuse.
The PDB file of course does not contain any bond orders and so we
guess them. We do an okay job - this is an example where we miss the
bond. If you removed these bond orders from the MOL2 file you would
get the same wrong structure too.
- Noel
Here is one example from the PDBbind refined data set.
Please find below the code and the output; the mol2 and pdb input files
are attached.
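#include <iostream>
#include <string>
#include <openbabel/obconversion.h>
#include <openbabel/obiter.h>
#include <openbabel/mol.h>
#include <openbabel/atom.h>

int main(int argc, char **argv)
{
    if (argc < 2) {
        std::cerr << "usage: " << argv[0] << " <structure file>" << std::endl;
        return 1;
    }
    OpenBabel::OBConversion conv;
    OpenBabel::OBMol mol;
    std::string filename = argv[1];
    // The input format is deduced from the file extension.
    if (!conv.ReadFile(&mol, filename)) {
        std::cerr << "could not read " << filename << std::endl;
        return 1;
    }
    // Remove hydrogens, then perceive connectivity and bond orders
    // from the coordinates.
    mol.DeleteHydrogens();
    mol.ConnectTheDots();
    mol.PerceiveBondOrders();
    // Clear the aromaticity-perceived flag so that it is perceived again.
    mol.UnsetAromaticPerceived();
    // Print one aromaticity flag (0 or 1) per atom.
    FOR_ATOMS_OF_MOL(atom, mol) {
        std::cout << atom->IsAromatic();
    }
    std::cout << std::endl;
    return 0;
}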
000000111110000000 (mol2)
000000000000111111 (pdb)
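The two flag strings above cannot be compared position by position, because the atom order differs between the two files. A minimal sketch of an order-independent check is to count the aromatic flags in each string. The helper name `countAromatic` is hypothetical and is not part of Open Babel.

```cpp
// Count the '1' characters in a per-atom aromaticity flag string,
// such as the output printed by the program above.
// Hypothetical helper for illustration; not an Open Babel function.
constexpr int countAromatic(const char* flags) {
    int n = 0;
    for (; *flags != '\0'; ++flags) {
        if (*flags == '1') {
            ++n;
        }
    }
    return n;
}
```

For the output shown, the mol2 string gives 5 aromatic atoms and the pdb string gives 6, so the two perceptions differ in more than atom ordering.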
Post by Noel O'Boyle
Maybe if you can give an example of the problem with aromaticity, we
can help? The only information that is used by that function is the
structure, so it was probably wrong at that point.
Dear Noel, thank you for your answer. Please see my comments below.
Post by Noel O'Boyle
In other words, you want to assign atom types based on the structure.
Yes, that's right.
Post by Noel O'Boyle
The source of the structure is immaterial except in so far as it
introduces noise. For example, to read a PDB file you need to guess
various things. To read a MOL file, you don't need to guess anything.
That noise is what we are trying to avoid by always calculating (guessing)
things with the same algorithm.
Post by Noel O'Boyle
Regarding your code, you should never throw away information and then
try to guess it.
Well, that depends on your faith in the quality of the information put in
the input format.
One can always set a flag to keep the input information if it is considered
accurate enough, but if you want consistency regardless of the input file format,
I don't see any other way but to strip off all the information in the input and
recalculate it.
Post by Noel O'Boyle
Also, I note in passing that DeleteHydrogens()
doesn't delete anything, it just suppresses any explicit hydrogens.
I'm a bit unclear why you are using the internal Open Babel atom
types. Personally, I would avoid this as the atom types may not be
suitable. Instead, just implement your own atom type function to suit
your needs. Any atom typing can be implemented as a function that
takes an OBAtom* and returns the type, perhaps as an enum.
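As a rough illustration of the suggestion above, here is a minimal sketch of an atom typing function that returns an enum. To keep it independent of any Open Babel version, it works on a small hypothetical struct, `AtomInfo`, rather than on an OBAtom* directly. In a real program the fields would be filled from the atom, for example with atom->GetAtomicNum() and atom->IsAromatic(). The type names and the classification rules are assumptions chosen for illustration only.

```cpp
// Hypothetical snapshot of the atom properties used for typing.
// In an Open Babel program these would be copied from an OBAtom*.
struct AtomInfo {
    unsigned atomicNum;  // element number, e.g. 6 for carbon
    bool aromatic;       // result of aromaticity perception
};

// Illustrative set of atom types; adapt to the application's needs.
enum class AtomType {
    AromaticCarbon,
    AliphaticCarbon,
    AromaticNitrogen,
    AliphaticNitrogen,
    Oxygen,
    Sulfur,
    Other
};

// Map an atom to a type using only its element and aromaticity.
constexpr AtomType classify(const AtomInfo& a) {
    switch (a.atomicNum) {
        case 6:
            return a.aromatic ? AtomType::AromaticCarbon
                              : AtomType::AliphaticCarbon;
        case 7:
            return a.aromatic ? AtomType::AromaticNitrogen
                              : AtomType::AliphaticNitrogen;
        case 8:
            return AtomType::Oxygen;
        case 16:
            return AtomType::Sulfur;
        default:
            return AtomType::Other;
    }
}
```

A function like this still depends on the perceived structure: if aromaticity is perceived differently for the mol2 and pdb inputs, the same atom gets different types from the two files.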
Are you referring to functions like "IsAmideNitrogen"? We used these
functions, and they worked just fine for our needs.
The problem we faced was with "IsAromatic", which we couldn't make
input-format agnostic. Our guess is that some information from the input
format always remains when calling it, even if
UnsetAromaticPerceived and the like were called beforehand.
This led us to try the route of putting all the atom types in internal
Open Babel types and building upon them.
Post by Noel O'Boyle
- Noel
Post by Marcos Villarreal
Hello,
For an application we are developing, we would like to get an atom typing
independent of the input format.
For example, a mol2 file with all hydrogen atoms and a pdb file without
hydrogens of the same molecule (i.e. identical heavy atom coordinates) should
get the same atom types.
The attached program is our attempt in that direction, but unfortunately
without success. How could one get rid of all the input information and let
babel do all the new calculations of atom types?
Thank you in advance.
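#include <iostream>
#include <string>
#include <openbabel/obconversion.h>
#include <openbabel/obiter.h>
#include <openbabel/mol.h>
#include <openbabel/atom.h>

int main(int argc, char **argv)
{
    if (argc < 2) {
        std::cerr << "usage: " << argv[0] << " <structure file>" << std::endl;
        return 1;
    }
    OpenBabel::OBConversion conv;
    OpenBabel::OBMol mol;
    std::string filename = argv[1];
    conv.ReadFile(&mol, filename);  // format deduced from the file extension
    mol.DeleteHydrogens();
    mol.ConnectTheDots();
    mol.PerceiveBondOrders();
    // Print the 1-based atom index and the internal Open Babel atom type.
    int i = 0;
    FOR_ATOMS_OF_MOL(atom, mol) {
        i++;
        std::cout << i << ": " << atom->GetType() << std::endl;
    }
    return 0;
}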
--
Marcos Villarreal
Dpto. de Química Teórica y Computacional
Facultad de Ciencias Químicas
Universidad Nacional de Córdoba
Argentina.