Tuesday, October 19, 2010

Extensible Markup Language (XML)

Extensible Markup Language (XML) is a set of rules for encoding documents in machine-readable form. It is defined in the XML 1.0 Specification produced by the W3C, and several other related specifications, all gratis open standards.
XML's design goals emphasize simplicity, generality, and usability over the Internet. It is a textual data format with strong support via Unicode for the languages of the world. Although the design of XML focuses on documents, it is widely used for the representation of arbitrary data structures, for example in web services.
Many application programming interfaces (APIs) have been developed that software developers use to process XML data, and several schema systems exist to aid in the definition of XML-based languages.
As of 2009, hundreds of XML-based languages have been developed, including RSS, Atom, SOAP, and XHTML. XML-based formats have become the default for most office-productivity tools, including Microsoft Office (Office Open XML), OpenOffice.org (OpenDocument), and Apple's iWork.
For more information please visit: XML

Example of XML :

XML Table
Computational Methods:
Molecular Mechanics            Semi Empirical            Ab Initio
Very fast speed                Fast speed                Slow speed
Restriction parameters         Good accuracy             Very good accuracy
Very good protein modelling    Good protein modelling    Best protein modelling

XML Tree



XML Document

<?xml version="1.0" encoding="ISO-8859-1"?>
<Computational_Method>

    <molecular_mechanics>

        <speed>Very fast speed</speed>

        <accuracy>Restriction parameters</accuracy>

        <protein_modelling>Very good protein modelling</protein_modelling>

    </molecular_mechanics>

    <semi_empirical>

        <speed>Fast speed</speed>

        <accuracy>Good accuracy</accuracy>

        <protein_modelling>Good protein modelling</protein_modelling>

    </semi_empirical>

    <ab_initio>

        <speed>Slow speed</speed>

        <accuracy>Very good accuracy</accuracy>

        <protein_modelling>Best protein modelling</protein_modelling>

    </ab_initio>

</Computational_Method>
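
As a minimal sketch of how such a document might be read back into a program, the snippet below parses the example with Python's standard xml.etree.ElementTree module and rebuilds the table as a nested dictionary. The names XML_DOC and methods_table are made up for this illustration, and the declaration uses straight quotes, which an XML parser requires.

```python
import xml.etree.ElementTree as ET

# The example document from this post, given as bytes so that the
# ISO-8859-1 encoding declaration is honoured by the parser.
XML_DOC = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<Computational_Method>
    <molecular_mechanics>
        <speed>Very fast speed</speed>
        <accuracy>Restriction parameters</accuracy>
        <protein_modelling>Very good protein modelling</protein_modelling>
    </molecular_mechanics>
    <semi_empirical>
        <speed>Fast speed</speed>
        <accuracy>Good accuracy</accuracy>
        <protein_modelling>Good protein modelling</protein_modelling>
    </semi_empirical>
    <ab_initio>
        <speed>Slow speed</speed>
        <accuracy>Very good accuracy</accuracy>
        <protein_modelling>Best protein modelling</protein_modelling>
    </ab_initio>
</Computational_Method>
"""


def methods_table(xml_bytes):
    """Return {method: {property: text}} for each child of the root element."""
    root = ET.fromstring(xml_bytes)
    return {method.tag: {prop.tag: prop.text for prop in method} for method in root}


table = methods_table(XML_DOC)
for method, props in table.items():
    print(method, props)
```

Each child of the root element becomes one column of the table, and each grandchild becomes one cell, which mirrors the tree view of the document.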

Sunday, October 17, 2010

Protein Data Bank and RasMol.


The Protein Data Bank (PDB) is a repository for the 3-D structural data of large biological molecules, such as proteins and nucleic acids. The data, typically obtained by X-ray crystallography or NMR spectroscopy and submitted by biologists and biochemists from around the world, are freely accessible on the internet. The PDB is overseen by an organization called the Worldwide Protein Data Bank, wwPDB.
The PDB is a key resource in areas of structural biology, such as structural genomics. Most major scientific journals, and some funding agencies, such as the NIH in the USA, now require scientists to submit their structure data to the PDB. If the contents of the PDB are thought of as primary data, then there are hundreds of derived (i.e., secondary) databases that categorize the data differently. For example, both SCOP and CATH categorize structures according to type of structure and assumed evolutionary relations; GO categorizes structures based on genes.
To learn more about the PDB, visit PDB.

Lexa on RasMol



RasMol is a computer program written for molecular graphics visualization intended and used primarily for the depiction and exploration of biological macromolecule structures, such as those found in the Protein Data Bank. It was originally developed by Roger Sayle in the early 90s.
Historically, it was an important tool for molecular biologists since the extremely optimized program allowed the software to run on (then) modestly powerful personal computers. Before RasMol, visualization software ran on graphics workstations that, due to their expense, were less accessible to scholars. RasMol has become an important educational tool as well as continuing to be an important tool for research in structural biology.
RasMol has a complex version history. Starting with the series of 2.7 versions, RasMol is licensed under a dual license (GPL or the custom license RASLIC).
RasMol includes a scripting language (for selecting certain protein chains, changing colors, etc.). Jmol and Sirius have incorporated the RasMol scripting language into their commands.
Protein Data Bank (PDB) files can be downloaded for visualization from the Research Collaboratory for Structural Bioinformatics (RCSB) bank. These have been uploaded by researchers who have characterized the structure of molecules, usually by X-ray crystallography or NMR spectroscopy.
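
To give a feel for what a viewer such as RasMol has to read, here is a rough sketch that pulls atom coordinates out of PDB-format text. It assumes the fixed-column layout of ATOM/HETATM records in the PDB file format; the function names and the sample record are invented for this illustration and do not come from a real PDB entry or from RasMol's own code.

```python
def parse_atom_record(line):
    """Slice one ATOM/HETATM line using fixed PDB column positions."""
    return {
        "serial": int(line[6:11]),           # columns 7-11
        "atom_name": line[12:16].strip(),    # columns 13-16
        "residue": line[17:20].strip(),      # columns 18-20
        "chain": line[21],                   # column 22
        "residue_number": int(line[22:26]),  # columns 23-26
        "x": float(line[30:38]),             # columns 31-38
        "y": float(line[38:46]),             # columns 39-46
        "z": float(line[46:54]),             # columns 47-54
    }


def read_atoms(lines):
    """Keep only coordinate records and parse each one."""
    return [
        parse_atom_record(line)
        for line in lines
        if line.startswith(("ATOM  ", "HETATM"))
    ]


# Made-up record for illustration only, assembled field by field so the
# column widths are explicit.
SAMPLE = (
    "ATOM  "      # record name, columns 1-6
    + "    1"     # atom serial number, columns 7-11
    + " "         # column 12 (blank)
    + " N  "      # atom name, columns 13-16
    + " "         # alternate location, column 17
    + "MET"       # residue name, columns 18-20
    + " "         # column 21 (blank)
    + "A"         # chain identifier, column 22
    + "   1"      # residue sequence number, columns 23-26
    + "    "      # insertion code and blanks, columns 27-30
    + "  27.340"  # x, columns 31-38
    + "  24.430"  # y, columns 39-46
    + "   2.614"  # z, columns 47-54
)

atoms = read_atoms([SAMPLE, "END"])
print(atoms[0])
```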
For more information on RasMol, please visit RasMol.

RasMol Versions and Their Features.


Version Features
RasMol 2.7.1
  • The ability to automatically mark non-bonded atoms in wireframe and stick displays.
  • The ability to use a proportionally spaced font and to draw labels with heavier strokes. 
  • The ability to auto-recognize PDB vs. CIF and mmCIF datasets.
  • Extensive updating to the manual.
  • Updating the canvas title with the PDB ID code and EXPDTA information, so models will be clearly distinguished from experimental data.
  • The ability to report coordinates.
  • Additions to the list of pre-defined colours.
  • Improved accuracy of coordinates in pseudo-PDB output.
  • Fixes to the centring logic.
RasMol 2.7.1.1
  • Introduction of a multilingual structure for RasMol.
  • Population of messages and menu lists for English and Spanish.
  • Upgrade of some of the Windows printer logic.
  • Correction of coordinate handling for Mol2 and XYZ coordinates.
  • Fix to the parsing of D2O.
RasMol 2.7.2
  • RasMol may have difficulty in allocating colours for molecules after the first. The fix for this interacts with some other pending changes, and should be ready for the next release.
  • As has been true for all recent versions, the stereo mode defaults to cross-eyed, which is inconvenient for many users.
  • Printing under Windows is not working for many modern systems.
  • The fixes for several of the bugs reported against RasMol 2.7.1 have not been incorporated into RasMol 2.7.2 yet. 
RasMol 2.7.2.1
  • Adaptation of the multilingual mods from RasMol 2.7.1.1 into RasMol 2.7.2.1.
  • Rewrite of the mouse handling and rotation logic to correct the problems in 2.7.2 and make the feel of 2.7.2.1 closer to that of RasMol 2.7.1.
  • Addition of French menus and messages.
  • Addition of Italian menus and messages.
  • Adoption of picking for selection of atoms, groups or chains from RasTop 1.3.
  • Adoption of backclipping from RasTop 1.3.
  • Adoption of shadepower command for glassy surfaces from RasTop 1.3.
  • Change of the menu stereo option to rotate cross-wall-none.
  • Allow longer atom names (12 characters) in CIFs.
RasMol 2.7.3
  • Adjustment to the mouse handling for a better, more natural feel. Our thanks to C. Chigbo for the suggestion.
  • Correction to cif.c for blanks after an initial quote mark. 
  • Correction to mswin31.c to restore lost initializations of ZRange and DialValue[8..9]. 
  • Correction to vector.c for nested bond rotations.
  • Modifications by Mamoru Yamanishi to Imakefile and rasmol.c to use xforms for GUI file open. This patch needs the open source xforms 1.0.90 library by Steve Lamont.
  • Correction to molecule.c to correct input of xyz files.
  • Revision to CPK colors by C. Chigbo. The new colors are called CPKNEW. The current CPK colors remain available as CPK. 
  • Correction to negative torsion angle monitors and to imprecise distance and angle monitors by C. Chigbo. This patch corrects the display of negative torsion angles caused by use of the unsigned short type, and corrects imprecise distance and angle displays. This extends the original patch which was just for torsion angles (torsion.patch). A side effect of this change is to limit the available range for distance monitors to approximately 327 Ångstroms.
  • Initial code for display of solid Lee-Richards molecular surfaces. This patch adds the basic code for display of Lee-Richards surfaces with a new Molecular Surface menu item, and surface molecule <probe radius> and surface solvent <probe radius> commands. Two other, related, major patches are pending that depend upon this one: code by P. Zhivkov to simulate surfaces efficiently by blurring and code to display surfaces using OpenGL.
  • Corrections of ribbons 0, etc. commands by R. Chachra. With this patch, the wireframe 0, ribbon 0, cartoon 0, backbone 0, strands 0 and trace 0 commands work the same as these commands with off instead of 0.
RasMol 2.7.4
  • Extended language support. Messages and menus in Bulgarian, Chinese, English, Italian, Japanese, Russian and Spanish are now supported on systems with appropriate fonts.
  • Support for maps. On systems with sufficient memory, RasMol now can read maps in CCP4 and CBF map formats and can write maps in CBF map format. Maps of density from pseudo-Gaussian atoms can be generated. Support is provided for generation of surfaces for SAXS bead models.
  • An MS windows installer was proposed by G. A. Pozhvanov, and reimplemented on the open source base of NSIS-2.21.
  • A unix installer script, rasmol_install.sh, and a matching script to select an appropriate binary version to run under unix, rasmol_run.sh, have been added by H. J. Bernstein.

Tuesday, October 12, 2010

SMILES (^_^)

The simplified molecular input line entry specification or SMILES is a specification for unambiguously describing the structure of chemical molecules using short ASCII strings. SMILES strings can be imported by most molecule editors for conversion back into two-dimensional drawings or three-dimensional models of the molecules.

The term SMILES refers to a line notation for encoding molecular structures and specific instances should strictly be called SMILES strings. However, the term SMILES is also commonly used to refer to both a single SMILES string and a number of SMILES strings; the exact meaning is usually apparent from the context. The terms Canonical and Isomeric can lead to some confusion when applied to SMILES. The terms describe different attributes of SMILES strings and are not mutually exclusive.
Typically, a number of equally valid SMILES can be written for a molecule. For example, CCO, OCC and C(O)C all specify the structure of ethanol. Algorithms have been developed to ensure the same SMILES is generated for a molecule regardless of the order of atoms in the structure. This SMILES is unique for each structure, although dependent on the canonicalisation algorithm used to generate it, and is termed the Canonical SMILES. These algorithms first convert the SMILES to an internal representation of the molecular structure and do not simply manipulate strings as is sometimes thought. Various algorithms for generating Canonical SMILES have been developed, including those by Daylight Chemical Information Systems, OpenEye Scientific Software, MEDIT and Chemical Computing Group. A common application of Canonical SMILES is indexing and ensuring uniqueness of molecules in a database.
SMILES notation allows the specification of configuration at tetrahedral centers, and double bond geometry. These are structural features that cannot be specified by connectivity alone and SMILES which encode this information are termed Isomeric SMILES. A notable feature of these rules is that they allow rigorous partial specification of chirality. The term Isomeric SMILES is also applied to SMILES in which isotopes are specified.
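
To illustrate the point that tools work on the structure rather than on the string, the sketch below reads a SMILES string into a list of bonds and shows that CCO, OCC and C(O)C describe the same bonds. It is a toy under stated assumptions, not a canonicalisation algorithm: the function name bond_multiset is made up, and only unbranched or branched chains of organic-subset atoms are handled (no rings, brackets, aromatic atoms or stereochemistry).

```python
import re

# Tokens understood by this sketch: organic-subset atoms, branches, bonds.
TOKEN = re.compile(r"Cl|Br|[BCNOPSFI]|[()=#-]")
BOND_ORDER = {"-": 1, "=": 2, "#": 3}


def bond_multiset(smiles):
    """Sorted list of (element, element, order) bonds for an acyclic SMILES."""
    tokens = TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"unsupported SMILES for this sketch: {smiles!r}")
    bonds = []
    stack = []    # atoms to return to when a branch closes
    prev = None   # element of the atom that the next atom bonds to
    order = 1     # adjacent atoms are single-bonded unless a symbol says otherwise
    for tok in tokens:
        if tok == "(":
            stack.append(prev)
        elif tok == ")":
            prev = stack.pop()
        elif tok in BOND_ORDER:
            order = BOND_ORDER[tok]
        else:
            if prev is not None:
                bonds.append(tuple(sorted((prev, tok))) + (order,))
            prev = tok
            order = 1
    return sorted(bonds)


for s in ("CCO", "OCC", "C(O)C"):
    print(s, bond_multiset(s))
```

All three strings give the same list of bonds. The reverse does not hold: CCCO and CC(O)C are different molecules with the same list here, which is why real Canonical SMILES algorithms label every atom of the full molecular graph instead of comparing a simple summary like this one.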

Some examples of SMILES:

Creating SMILES using ChemSketch

The colourful SMILES

It's so easy!


The Grammar of SMILES :

SMILES Grammar Description
Atoms
Atoms are the nouns of the SMILES grammar. One represents each atom by its chemical symbol. Usually one encloses the symbol in square brackets, like this: [Cl]. However, the following organic subset symbols may appear without the brackets: B, C, N, O, P, S, F, Cl, Br, and I. These include the halogens, which would normally bond to only one other atom in any case, and other atoms that are assumed to be bound to hydrogen if they are not explicitly bound to something else. An atom participating in an aromatic ring structure is listed in lowercase.

The use of the brackets is significant. For example, [S] refers to elemental sulfur, while the symbol S represents hydrogen sulfide, which has two atoms of hydrogen bound to the one of sulfur. (Similarly, Cl-Cl refers to the diatomic molecule of chlorine, while Cl alone refers to hydrogen chloride.)
Charges and positions of atoms
Charge signs (+ and -) and digits giving the multiple of a charge or the position of an atom are the adjectives (and sometimes the adverbs) of SMILES grammar. An ionic valence is a classic application. For example, [Fe+2] is the ferrous or iron(II) ion. Note that SMILES does not require, nor use, superscripts or subscripts.

One does not multiply atoms themselves (except for atoms of hydrogen) by using numbers. Instead, one repeats the atomic symbol as many times as the atom appears.
Bonds
Bonds are the verbs of the SMILES grammar. Single, double, triple, and aromatic bonds are written as -, =, #, and : respectively. To simplify things even further, one may omit the - and : symbols for atoms that are adjacent to one another and have single or aromatic bonds joining them. This is the reason for representing an aromatically bound atom in lowercase instead of in UPPERCASE.

Thus the SMILES for diatomic oxygen is O=O; that for carbon dioxide is O=C=O; for diatomic nitrogen, N#N; for hydrogen cyanide, C#N; for acetylene or ethyne, C#C; for diazene, N=N. (Hydrazine, whose two nitrogen atoms are joined by a single bond, is simply NN.)
Branches
Branches are the subordinating conjunctions of the SMILES grammar. A structure that branches from the main line is enclosed in parentheses. Nesting and stacking of branches is permitted. An atom other than carbon in a linear structure may also be written as a branch. Thus the SMILES for chloromethane (formerly called "methyl chloride") can be written C(Cl), which is equivalent to the unbranched form CCl, and that for tetrachloromethane ("carbon tetrachloride") would be C(Cl)(Cl)(Cl)(Cl).

Carboxylic acids are a common branching structure. The SMILES for acetic acid, for example, is CC(=O)O.
Rings
To write a cyclic or ring structure, you "break" one of the bonds and write the structure as a line having digits following the atoms in the broken bond. Thus the SMILES for cyclohexane is C1CCCCC1. If a given atom is part of more than one ring structure, and you have to break more than one bond, you then use a different digit for each broken bond, in order to convey how to re-join the atoms.

By convention, aromatic ring vertices are written in lowercase. Thus the SMILES for benzene is c1ccccc1 and that for pyridine is n1ccccc1.
Disconnected Structures
A simple dot (.) serves as the most common example of a coordinating conjunction in SMILES. Two structures not having a covalent bond of any kind to join them are considered disconnected, and are joined with a dot. This is the proper method for representing ionic compounds. For example, the SMILES for sodium chloride is [Na+].[Cl-]. The SMILES for sodium acetate is [Na+].CC(=O)[O-]; the charge sits on a bracketed oxygen atom because a pair of brackets encloses only a single atom.
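
The grammar above can be sketched as a small tokenizer. This is only an illustration of the pieces described in this post (bracket atoms, organic-subset atoms, lowercase aromatic atoms, bonds, branches, ring digits and the dot), not a full SMILES parser. The names SMILES_TOKEN, tokenize and looks_well_formed are invented for the example, and the check only looks at balanced parentheses and paired ring digits, not at chemical validity.

```python
import re

SMILES_TOKEN = re.compile(
    r"(?P<bracket>\[[^\[\]]+\])"     # bracket atom, e.g. [Na+] or [Fe+2]
    r"|(?P<atom>Cl|Br|[BCNOPSFI])"   # organic-subset atom
    r"|(?P<aromatic>[bcnops])"       # aromatic atom, written in lowercase
    r"|(?P<bond>[-=#:])"             # single, double, triple, aromatic bond
    r"|(?P<branch>[()])"             # branch open / close
    r"|(?P<ring>[0-9])"              # ring-closure digit
    r"|(?P<dot>\.)"                  # separator for disconnected structures
)


def tokenize(smiles):
    """Split a SMILES string into (kind, text) pairs."""
    tokens = []
    pos = 0
    while pos < len(smiles):
        match = SMILES_TOKEN.match(smiles, pos)
        if match is None:
            raise ValueError(f"unexpected character {smiles[pos]!r} at position {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def looks_well_formed(smiles):
    """True if parentheses balance and every ring digit appears in pairs."""
    depth = 0
    open_rings = set()
    for kind, text in tokenize(smiles):
        if text == "(":
            depth += 1
        elif text == ")":
            depth -= 1
            if depth < 0:
                return False
        elif kind == "ring":
            open_rings ^= {text}   # first use opens the ring, second closes it
    return depth == 0 and not open_rings


for s in ("CC(=O)O", "C1CCCCC1", "c1ccccc1", "[Na+].[Cl-]", "C1CCCCC"):
    print(s, looks_well_formed(s), tokenize(s))
```

Digits inside square brackets, such as the 2 in [Fe+2], stay part of the bracket atom and are not counted as ring closures.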

For more information, come and look at SMILES!

The joys of ACD/ChemSketch! (^_^)

ChemSketch is a PC package that can be used to draw molecules, reactions, and schematic diagrams; it also includes powerful optimization and 3D visualization tools.
ACD/ChemSketch Freeware is a drawing package that allows you to draw chemical structures including organics, organometallics, polymers, and Markush structures. It also includes features such as calculation of molecular properties (e.g., molecular weight, density, molar refractivity etc.), 2D and 3D structure cleaning and viewing, functionality for naming structures (fewer than 50 atoms and 3 rings), and prediction of logP.

Benefits :
  • Visualize chemical structures in 2D or 3D to gain more insight into spatial configurations, and relationships to molecular properties.
  • Create professional reports, working with structures, text, and graphics simultaneously.
  • ACD/ChemSketch also allows you to check other tautomeric forms for your drawn structure. Consideration of tautomeric forms is very important for structure searching, predictions (of physicochemical properties such as pKa), and interpretation (e.g., of NMR, MS, and other analytical data).
Some of its applications :




Features of ChemSketch :
Features Description
Drawing
* Click and drag between two atoms to quickly create bonds
* Create chemical structures from InChI and SMILES codes
* Draw Markush structures (generic view), structures with delocalization, and polymers
* Create special Markush structures with added or removed mass or fragments, to describe metabolic and mass-spectral transformations
* Depict reactions by drawing, importing, mapping atom-atom transformation, and editing reaction conditions
* Adjust the position of hydrogens near each atom
* Apply powerful 3D and 2D rotation, and move/resize features
* Create 3D models from 2D structures with the geometry optimization button
Structure Searching
* Search for chemical structures in various file formats throughout your computer's file systems (SK2; MOL; SDF; SKC; CHM; CDX; RXN; Adobe PDF; Microsoft Office DOC, XLS, PPT; and ACD/Labs databases CUD, HUD, CFD, NDB, ND5, INT)
* Search Microsoft Word documents with structures created in ChemDraw or Symyx (MDL) ISIS
* Search using full or partial structures
Chemistry
* Choose from a wide range of special bond types including aromatic, delocalized, undefined single and double stereo, quadruple, and coordination bonds
* Automatically assign hydrogen atoms and charges to fill valence
* Instantly display chemical formula, molecular weight, percentage composition, and estimated macroscopic properties such as molar refractivity, refractive index, molar volume, density, and parachor
* Look up elements on the Expanded Periodic Table of Elements which includes physical properties, NMR properties, isotope composition, and images of elements in their natural form
* Draw reactions and complex chemical schemes with manual or automatic mapping
* Calculate quantities for chemical reactions and solutions
* View all the suggested tautomeric forms for your structure
Reporting
* Create professional chemistry-related reports and presentations
* Export your ChemSketch files to Adobe Acrobat PDF format
* Cut-and-paste structures and chemical information directly into Windows applications and maintain OLE links
* Convert your work into HTML
* Create templates for generating reports from other ACD/Labs products, defined by rules or company standards
Convenient Interface Design
* Customize toolbars
* Customize display properties such as atom numbering, chemical symbols, and valence
* Send ChemSketch SK2 files or PDF documents as an e-mail attachment directly from the ACD/ChemSketch interface
* Work with structures, text, and graphics simultaneously
* Save and load object styles


For more information, have a look at ACD/ChemSketch.

Monday, October 11, 2010

Microsoft Excel, a useful application.

Microsoft Excel (full name Microsoft Office Excel) is a spreadsheet application written and distributed by Microsoft for Microsoft Windows and Mac OS X. It features calculation, graphing tools, pivot tables and a macro programming language called VBA (Visual Basic for Applications). It has been a very widely applied spreadsheet for these platforms, especially since version 5 in 1993. Excel forms part of Microsoft Office. The current versions are Microsoft Office Excel 2010 for Windows and 2008 for Mac.

Microsoft Excel has the basic features of all spreadsheets, using a grid of cells arranged in numbered rows and letter-named columns to organize data manipulations like arithmetic operations. It has a battery of supplied functions to answer statistical, engineering and financial needs. In addition, it can display data as line graphs, histograms and charts, and with a very limited three-dimensional graphical display. It allows sectioning of data to view its dependencies on various factors from different perspectives (using pivot tables and the scenario manager).
It has a programming aspect, Visual Basic for Applications, allowing the user to employ a wide variety of numerical methods, for example, for solving differential equations of mathematical physics, and then reporting the results back to the spreadsheet. Finally, it has a variety of interactive features allowing user interfaces that can completely hide the spreadsheet from the user, so the spreadsheet presents itself as a so-called application, or decision support system (DSS), via a custom-designed user interface, for example, a stock analyzer, or in general, as a design tool that asks the user questions and provides answers and reports.
In a more elaborate realization, an Excel application can automatically poll external databases and measuring instruments using an update schedule, analyze the results, make a Word report or PowerPoint slide show, and e-mail these presentations on a regular basis to a list of participants.

The view of Microsoft Office Excel

Microsoft Office 2010


For more useful information about Microsoft Office Excel and to learn how it works, please refer to this link

Versions of Microsoft Excel for Windows:
Version                 Year published
Excel 97                1997
Excel 2000              1999
Excel 2002              2001
Excel 2003              2003
Excel 2007              2007
Excel 2010 (latest)     2010