BaseMolecule

Files:	`BaseMolecule.h, BaseMolecule.C`
Derived from:	Animation
Global instance (if any):	none
Used in optional component:	Part of main VMD code

Description

BaseMolecule is the second-highest level in the class hierarchy for Molecule objects. It is derived from Animation and stores the static information for the molecule: the basic information about the structure and contents of the system that does not change with time. BaseMolecule cannot read in this information itself; instead, it provides routines which derived classes call to add new atoms, bonds, etc. to the molecule. BaseMolecule is not responsible for graphically displaying the molecule either; that is left to the DrawMolecule class, which is derived from BaseMolecule and Displayable.

You never create a BaseMolecule instance directly; instead, you create an instance of a class derived from Molecule, for which BaseMolecule is a base class (among others). When initially created, a new BaseMolecule is empty, with zero atoms and zero bonds. The derived classes contain the actual code to read in the molecular structure from a file or from a network connection, and they add the components to the internal storage via routines in BaseMolecule. Once the structure has been completely read in, routines in BaseMolecule are called to analyze the structure and calculate such things as which atoms are in which residues, how many residues there are, which bonds are backbone bonds, etc. A molecule contains the following structural features, which are either added directly to BaseMolecule or calculated by BaseMolecule after the basic structure is read in:

- N atoms, which are added to the system as they are read in from some source. Each atom has several names associated with it, which help to distinguish the atoms and make it possible for the VMD atom selection mechanism to choose subsets of atoms. These names are:
  - The atom name, which is usually a standard chemical nomenclature name. For example, alpha carbons in proteins have the name CA.
  - The atom type, which for some molecular data files is a name from a much smaller set of names, used to classify atoms into small, more manageable sets. For example, PSF files use atom types to simplify the parameterization of the atoms for molecular dynamics simulations. If the atom type is not known from the input files, it is set to be the same as the atom name.
  - The residue name, a three-letter code which is usually quite standard. All glycine amino acids, for example, are in residues with the name GLY.
  - The residue ID, a numeric value assigned to the residues in a molecule, quite often in increasing order from one end of a linear chain to the other. This is particularly useful in proteins, which are unbranched polymers.
  - The chain ID, a single-letter code used to distinguish atoms among different subcomponents of a molecule. If it is unknown for an atom, it is given a default value of 'X'.
  - The segment name, similar to the chain ID but allowed to be up to four characters. This is not as standard as the chain ID, and if it is unknown it is given a default value of MAIN.
- N bonds, which are added to the system as they are read in or calculated by a derived class. The bonds are not stored directly in a list, however; instead, each atom stores a list of the bonds it participates in, which makes it much faster to display the molecule.
- N protein backbone bonds, and N nucleic-acid backbone bonds, which are determined by BaseMolecule after the atoms and bonds are added.
- N residues, where each residue is a collection of atoms and bonds which form some subunit.
- N segments, which each consist of a collection of atoms in a functional substructure within the molecule. For example, quite often a protein is one segment, and the surrounding water molecules are another segment.
- N fragments, where each fragment is a collection of connected residues. If a system consists of three disconnected alpha helices, for example, then each helix would be a separate fragment. There are lists of protein fragments as well as nucleic acid fragments.
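The per-atom naming defaults and the per-atom bond storage described in this list can be illustrated with a small sketch. The SketchAtom and SketchMolecule types below, and all of their field names, are hypothetical stand-ins and are not the VMD Atom or BaseMolecule classes; empty strings are assumed here to mean "unknown".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an atom record; not the VMD Atom class.
struct SketchAtom {
    std::string name;       // atom name, e.g. "CA"
    std::string type;       // atom type; falls back to the atom name
    std::string resName;    // residue name, e.g. "GLY"
    int resId;              // residue ID
    std::string chain;      // chain ID; "X" when unknown
    std::string segName;    // segment name; "MAIN" when unknown
    std::vector<int> bonds; // indices of the atoms this atom is bonded to
};

struct SketchMolecule {
    std::vector<SketchAtom> atoms;
    int nBonds = 0;

    // Add an atom and return its index. An empty string means "unknown",
    // in which case the defaults described in the text are applied.
    int add_atom(const std::string &name, const std::string &type,
                 const std::string &resName, int resId,
                 const std::string &chain, const std::string &segName) {
        SketchAtom a;
        a.name = name;
        a.type = type.empty() ? name : type;
        a.resName = resName;
        a.resId = resId;
        a.chain = chain.empty() ? "X" : chain;
        a.segName = segName.empty() ? "MAIN" : segName;
        atoms.push_back(a);
        return static_cast<int>(atoms.size()) - 1;
    }

    // Record the bond in the bond list of both atoms instead of keeping
    // a separate global bond list. Returns the new bond count.
    int add_bond(int a, int b) {
        const int n = static_cast<int>(atoms.size());
        assert(a >= 0 && a < n && b >= 0 && b < n && a != b);
        atoms[a].bonds.push_back(b);
        atoms[b].bonds.push_back(a);
        return ++nBonds;
    }
};
```

With this layout, drawing the bonds of one atom only needs that atom's own list, which is the speed advantage the list above mentions.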

After the atoms and bonds are added to the BaseMolecule, the connectivity is analyzed and the names of the atoms are used to find the backbone bonds, the residues in the system, and the fragments. Atoms, bonds, residues, and other components are numbered 0 ... N-1 in their respective lists.
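As an illustration of the connectivity analysis, fragments can be viewed as connected components of a graph whose nodes are residues. The sketch below is one generic way to label such components; it is not taken from the VMD sources, and the function name label_fragments and its inputs are assumptions made for this example.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Give every residue (numbered 0 ... nResidues-1) a fragment index so that
// residues joined by at least one bond end up in the same fragment.
// 'links' lists pairs of residue indices that share a bond.
std::vector<int> label_fragments(int nResidues,
                                 const std::vector<std::pair<int, int>> &links) {
    assert(nResidues >= 0);
    std::vector<std::vector<int>> adj(nResidues);
    for (const auto &l : links) {
        assert(l.first >= 0 && l.first < nResidues);
        assert(l.second >= 0 && l.second < nResidues);
        adj[l.first].push_back(l.second);
        adj[l.second].push_back(l.first);
    }

    std::vector<int> frag(nResidues, -1); // -1 means "not visited yet"
    int nFrags = 0;
    for (int start = 0; start < nResidues; ++start) {
        if (frag[start] != -1) continue;
        // Depth-first traversal of everything reachable from 'start'.
        std::vector<int> stack{start};
        frag[start] = nFrags;
        while (!stack.empty()) {
            const int r = stack.back();
            stack.pop_back();
            for (int nb : adj[r]) {
                if (frag[nb] == -1) {
                    frag[nb] = nFrags;
                    stack.push_back(nb);
                }
            }
        }
        ++nFrags;
    }
    return frag;
}
```

Fragment indices produced this way also run from 0 to N-1, matching the numbering convention used for the other component lists.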

One other item which BaseMolecule stores is the unique molecule ID number, which is assigned when the molecule is created. Each new molecule in VMD gets assigned an integer ID. The assigned ID values increase by one as each new system is loaded. The commands used to affect the molecules use these ID numbers to determine which molecule the command should affect. The name of the molecule displayed in the Molecule on-screen menu form has this ID number appended to the end of it.
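A minimal sketch of this ID scheme follows. The class SketchMol, the starting value of the counter, and the ":" separator in the displayed name are all assumptions made for illustration; the text above only states that IDs are integers that increase by one per loaded system and that the ID is appended to the displayed name.

```cpp
#include <cassert>
#include <string>

// Hypothetical class showing one way to hand out increasing integer IDs.
class SketchMol {
public:
    explicit SketchMol(const std::string &name)
        : id_(next_id()), name_(name) {
        assert(id_ >= 0);
    }

    int id() const { return id_; }

    // Name as it might appear in a menu: the molecule name with the ID
    // appended. The ":" separator is an assumption.
    std::string menu_name() const { return name_ + ":" + std::to_string(id_); }

private:
    // Shared counter; each call returns the next unused ID.
    static int next_id() {
        static int counter = 0; // starting value is an assumption
        return counter++;
    }

    int id_;
    std::string name_;
};
```

Commands that act on a molecule can then look it up by the integer returned from id() rather than by name.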

Constructors

BaseMolecule::BaseMolecule(void)

Enumerations, lists or character name arrays

The MoleculeType enumeration lists the different types of molecules which VMD understands. When the structure is analyzed, the type of molecule is determined. The types are:

UAPROTEIN
EHPROTEIN
UAPROTDNA
EHPROTDNA
NUCLEIC
ORGANIC
INORGANIC

Internal data structures

MoleculeType type - type of this molecule (from above list).
int nAtoms - number of atoms in this molecule. Can be zero.
int nBonds - number of bonds in this molecule.
int nBackProtein - number of protein backbone bonds.
int nBackDNA - number of nucleic-acid backbone bonds.
int nResidues - number of residues.
int nSegments - number of segments.
int ID - molecule integer ID number.
int maxAtoms - maximum storage currently allotted to store the atoms (i.e. size of atomList array, which may be larger than the actual number of atoms stored there).
Atom **atomList - array of Atom objects.
NameList<int> atomNames - list of unique atom names in this molecule.
NameList<int> atomTypes - list of unique atom types in this molecule.
NameList<int> resNames - list of unique residue names in this molecule.
NameList<int> resIds - list of unique residue IDs in this molecule.
NameList<int> chainNames - list of unique chain IDs in this molecule.
NameList<int> segNames - list of unique segment names in this molecule.
ResizeArray<Residue *> residueList - list of which residues are connected to which.
ResizeArray<Fragment *> fragList - list of connected residues, which form fragments.
ResizeArray<Fragment *> pfragList - list of connected protein residues which form protein fragments. A protein fragment is a single chain from N to C.
ResizeArray<Fragment *> nfragList - list of connected nucleic acid residues, which form nucleic acid fragments. A nucleic acid fragment is a single chain from 5' to 3'.
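The name lists above each hold the unique names seen in the molecule. The sketch below shows the general idea of such a container: each distinct string is stored once and can be referred to by a small integer index. UniqueNames and its member functions are hypothetical and do not reproduce the interface of the VMD NameList template.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical container that stores each distinct name once.
class UniqueNames {
public:
    // Return the index of 'name', adding it first if it is new.
    int add(const std::string &name) {
        auto it = index_.find(name);
        if (it != index_.end()) return it->second;
        const int idx = static_cast<int>(names_.size());
        names_.push_back(name);
        index_.emplace(name, idx);
        return idx;
    }

    // Return the index of 'name', or -1 if it has never been added.
    int find(const std::string &name) const {
        auto it = index_.find(name);
        return it == index_.end() ? -1 : it->second;
    }

    // Return the name stored at position 'idx'.
    const std::string &name(int idx) const {
        assert(idx >= 0 && idx < num());
        return names_[idx];
    }

    // Number of distinct names stored.
    int num() const { return static_cast<int>(names_.size()); }

private:
    std::vector<std::string> names_;
    std::unordered_map<std::string, int> index_;
};
```

Storing an index per atom instead of a full string keeps the per-atom records small and makes comparisons during atom selection a matter of comparing integers.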

Nonvirtual member functions

void init_atoms(int) - initializes storage for the data of N atoms. This only allocates memory; it does not store anything in that memory. It should be called when constructing a new molecule, once the number of atoms has been determined but before the atom data itself is stored into the BaseMolecule structures.
int add_atom(char *, char *, char *, char *, char *, char *, float *pos, float *extra) - add a new atom to the molecule, with the specified names, and given starting x, y, z position (pos) and given starting extra data (such as beta value and occupancy).
int add_bond(int, int, Atom::BackboneType = Atom::NORMAL) - add a new bond between the atoms specified as the first two arguments, where the bond is of the specified type. See the description of Atom for a list of the different bond types.
int find_backbone(void) - determines which bonds are backbone bonds, and stores this data in the Atom objects stored in the atomList member. Returns the number of backbone bonds found.
int find_residues(void) - find which atoms are in which residues, and store this data. Returns the number of residues found.
int find_waters(void) - Find the waters, based on their residue name, and return the number found.
int find_segments(void) - Find the segments in the molecule, and store this data. Return the number found.
int find_fragments(void) - Find the fragments in the molecule, and store this data. Return the number found.
int find_atom_in_residue(char *nm, int r) - find the index of the first atom in the specified residue with the given name, or return -1 if none is found with that name.
int id(void) - return the ID of the molecule.
Atom *atom(int) - return the Nth Atom for the molecule.
char *atom_full_name(int, char * = NULL) - return a string containing the full name specification for the Nth atom. If the second argument is not NULL, the name will be placed in the given character array. Otherwise, an internal static buffer will be used to hold the name. The name is of the form:
<mol ID>:<atom index>
This name is guaranteed to be unique for each atom.
char *atom_short_name(int, char * = NULL) - the same as for the full name, except the name returned is of the form:
<residue name><residue ID>:<atom name>
This form is nicer to read, but is not generally unique for a given atom.
float default_charge(char *) - returns a default partial charge to use for the specified atom name. Used when this information is not supplied by the source of molecular structure. The following routines also supply default data based on a given atom name.
float default_mass(char *)
float default_radius(char *)
float default_occup(char *)
float default_beta(char *)
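The two name formats returned by atom_full_name and atom_short_name can be sketched as plain string-building helpers. The functions below are illustrative only: they return std::string rather than writing into a caller-supplied or static character buffer, and their names and parameters are assumptions.

```cpp
#include <cassert>
#include <string>

// Build "<mol ID>:<atom index>", which is unique for every atom.
std::string sketch_full_name(int molId, int atomIndex) {
    assert(atomIndex >= 0);
    return std::to_string(molId) + ":" + std::to_string(atomIndex);
}

// Build "<residue name><residue ID>:<atom name>", which is easier to read
// but can repeat, for example for the same residue in two different chains.
std::string sketch_short_name(const std::string &resName, int resId,
                              const std::string &atomName) {
    return resName + std::to_string(resId) + ":" + atomName;
}
```

For example, atom 12 of molecule 0 would have the full name 0:12, while an alpha carbon in glycine residue 5 would have the short name GLY5:CA.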

Virtual member functions

virtual int create(void) - the main virtual routine provided by this class. This is used after a new Molecule subclass has been created (with the required information for reading the molecule given in the constructor). Initially the Molecule is empty; to initialize it, the create() routine is called, which starts the actual process of reading in the data. Each version of create() supplied by the derived classes should, after doing its own creation, call the create() routine in the parent class. This routine returns the success of the creation operation.
virtual float scale_factor(void) - returns (possibly calculating first) the scaling factor required to scale the coordinates for the current timestep to fit in a box from -1 ... 1 in all dimensions.
virtual void cov(float &, float &, float &) - return the position of the center of volume of the current coordinate set.
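The following sketch shows one plausible way to compute the two quantities returned by scale_factor and cov. It rests on assumptions that the text above does not confirm: the center of volume is taken to be the unweighted average of the atom positions, and the scale factor is chosen so that the largest bounding-box extent spans 2 units (the width of a box from -1 to 1). The CoordSummary type and summarize function are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct CoordSummary {
    float cov[3]; // assumed: unweighted mean of all positions
    float scale;  // assumed: 2 / largest bounding-box extent
};

// 'pos' holds coordinates as x0, y0, z0, x1, y1, z1, ...
CoordSummary summarize(const std::vector<float> &pos) {
    assert(pos.size() % 3 == 0);
    CoordSummary s = {{0.0f, 0.0f, 0.0f}, 1.0f};
    const std::size_t n = pos.size() / 3;
    if (n == 0) return s; // empty molecule: keep neutral defaults

    float lo[3], hi[3];
    for (int d = 0; d < 3; ++d) lo[d] = hi[d] = pos[d];

    for (std::size_t i = 0; i < n; ++i) {
        for (int d = 0; d < 3; ++d) {
            const float v = pos[3 * i + d];
            s.cov[d] += v;
            lo[d] = std::min(lo[d], v);
            hi[d] = std::max(hi[d], v);
        }
    }

    float extent = 0.0f;
    for (int d = 0; d < 3; ++d) {
        s.cov[d] /= static_cast<float>(n);
        extent = std::max(extent, hi[d] - lo[d]);
    }
    // A single atom (or coincident atoms) has zero extent; leave scale at 1.
    if (extent > 0.0f) s.scale = 2.0f / extent;
    return s;
}
```

Because both values depend on the coordinates of the current timestep, a real implementation would need to recompute or cache them whenever the displayed frame changes.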

Method of use

A new molecule is first created by using new with the proper subclass of Molecule. (Molecule is the standard class to use for all molecule objects in VMD; classes derived from Molecule are specialized to read in data from different sources, while classes above the Molecule level only deal with some of the information required to store, display, and animate a structure.) After the new instance is assigned to a Molecule pointer, the create() virtual function should be called. This is where all the work is actually done; for example, data files are read or network connections are established. The version of create() in BaseMolecule should be called after the molecule has been read in by the derived classes. It analyzes the structure and finds the backbone bonds, fragments, etc. When create() is finished, the molecule is ready to go. If create() does not return TRUE, however, the creation failed (for example, the files could not be opened), and the new molecule will still be empty.
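The calling pattern described here, where a derived class loads the data and then hands control to the create() routine of its parent, can be sketched as follows. The classes SketchBaseMolecule and SketchFileMolecule are hypothetical, the "readable" flag stands in for a real file or network source, and bool is used in place of the int/TRUE return convention mentioned above.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical base class: stores atoms and analyzes them in create().
class SketchBaseMolecule {
public:
    virtual ~SketchBaseMolecule() = default;

    // Base-level create(): analyze whatever the derived class has loaded.
    virtual bool create() {
        analyzed_ = true; // stands in for finding residues, backbone, etc.
        return true;
    }

    int num_atoms() const { return static_cast<int>(names_.size()); }
    bool analyzed() const { return analyzed_; }

protected:
    // Atoms may only be added before the structure has been analyzed.
    void add_atom(const std::string &name) {
        assert(!analyzed_);
        names_.push_back(name);
    }

private:
    std::vector<std::string> names_;
    bool analyzed_ = false;
};

// Hypothetical derived class: "reads" atoms, then calls the parent create().
class SketchFileMolecule : public SketchBaseMolecule {
public:
    SketchFileMolecule(std::vector<std::string> atomNames, bool readable)
        : source_(std::move(atomNames)), readable_(readable) {}

    bool create() override {
        if (!readable_) return false; // source unavailable: stay empty
        for (const auto &n : source_) add_atom(n);
        return SketchBaseMolecule::create(); // parent analyzes the result
    }

private:
    std::vector<std::string> source_;
    bool readable_;
};
```

A caller would hold the new object through a base-class pointer, call create(), and discard the object if the call reports failure, since the molecule is still empty in that case.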

vmd@ks.uiuc.edu