I AM AJ

Tuesday, January 25, 2011

EXCEL

Introduction

Linear Regression

In statistics, linear regression is an approach to modeling the relationship between a scalar variable y and one or more variables denoted X. In linear regression, data are modeled using linear functions, and the unknown model parameters are estimated from the data. Such models are called linear models. Most commonly, linear regression refers to a model in which the conditional mean of y given the value of X is an affine function of X. Less commonly, linear regression could refer to a model in which the median, or some other quantile of the conditional distribution of y given X, is expressed as a linear function of X. Like all forms of regression analysis, linear regression focuses on the conditional probability distribution of y given X, rather than on the joint probability distribution of y and X, which is the domain of multivariate analysis.

Linear regression was the first type of regression analysis to be studied rigorously, and to be used extensively in practical applications. This is because models which depend linearly on their unknown parameters are easier to fit than models which are non-linearly related to their parameters and because the statistical properties of the resulting estimators are easier to determine.

Linear regression has many practical uses. Most applications of linear regression fall into one of the following two broad categories:

If the goal is prediction, or forecasting, linear regression can be used to fit a predictive model to an observed data set of y and X values. After developing such a model, if an additional value of X is then given without its accompanying value of y, the fitted model can be used to make a prediction of the value of y.
Given a variable y and a number of variables X₁, ..., X_p that may be related to y, linear regression analysis can be applied to quantify the strength of the relationship between y and the X_j, to assess which X_j may have no relationship with y at all, and to identify which subsets of the X_j contain redundant information about y, so that once one of them is known, the others are no longer informative.

Linear regression models are often fitted using the least squares approach, but they may also be fitted in other ways, such as by minimizing the “lack of fit” in some other norm, or by minimizing a penalized version of the least squares loss function as in ridge regression. Conversely, the least squares approach can be used to fit models that are not linear models. Thus, while the terms “least squares” and “linear model” are closely linked, they are not synonymous.
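
As a minimal sketch of the least squares approach described above, the Python snippet below fits a straight line y = intercept + slope * x to paired observations and then predicts y for a new x. The function name `fit_line` and the sample data are assumptions made for illustration; they do not come from the original post.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data lying exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
intercept, slope = fit_line(xs, ys)

# Prediction for a new x value that has no observed y.
predicted = intercept + slope * 4.0
print(intercept, slope, predicted)
```

With noisy data the same function returns the line that minimizes the sum of squared vertical distances to the observations.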

Quadratic or Polynomial Regression

In statistics, polynomial regression is a form of linear regression in which the relationship between the independent variable x and the dependent variable y is modeled as an nth-order polynomial. Polynomial regression fits a nonlinear relationship between the value of x and the corresponding conditional mean of y, denoted E(y|x), and has been used to describe nonlinear phenomena such as the growth rate of tissues^[1], the distribution of carbon isotopes in lake sediments^[2], and the progression of disease epidemics^[3]. Although polynomial regression fits a nonlinear model to the data, as a statistical estimation problem it is linear, in the sense that the regression function E(y|x) is linear in the unknown parameters that are estimated from the data. For this reason, polynomial regression is considered to be a special case of multiple linear regression.

Polynomial regression models are usually fit using the method of least squares. The least-squares method minimizes the variance of the unbiased estimators of the coefficients, under the conditions of the Gauss–Markov theorem. The least-squares method was published in 1805 by Legendre and in 1809 by Gauss. The first design of an experiment for polynomial regression appeared in an 1815 paper of Gergonne^[4]^[5]. In the twentieth century, polynomial regression played an important role in the development of regression analysis, with a greater emphasis on issues of design and inference^[6]. More recently, the use of polynomial models has been complemented by other methods, with non-polynomial models having advantages for some classes of problems.
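
The point that polynomial regression is linear in its unknown parameters can be sketched in Python: the coefficients are found by solving a linear system (the normal equations) even though the fitted curve is not a straight line. The function name `fit_polynomial`, the use of Gaussian elimination, and the sample data are assumptions for illustration only.

```python
def fit_polynomial(xs, ys, degree):
    """Least squares polynomial fit; returns coefficients [b0, b1, ..., bd]."""
    m = degree + 1
    if len(xs) != len(ys) or len(xs) < m:
        raise ValueError("need at least degree + 1 paired observations")
    # Normal equations (X^T X) b = X^T y, where column j of X holds x**j.
    a = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    rhs = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("singular system; too few distinct x values")
        a[col], a[pivot] = a[pivot], a[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for row in range(col + 1, m):
            factor = a[row][col] / a[col][col]
            for k in range(col, m):
                a[row][k] -= factor * a[col][k]
            rhs[row] -= factor * rhs[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in range(m - 1, -1, -1):
        tail = sum(a[i][k] * coeffs[k] for k in range(i + 1, m))
        coeffs[i] = (rhs[i] - tail) / a[i][i]
    return coeffs


# Hypothetical data generated from y = 1 - 2x + 3x^2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [1.0 - 2.0 * x + 3.0 * x * x for x in xs]
coeffs = fit_polynomial(xs, ys, degree=2)
print(coeffs)
```

For high degrees or widely spread x values the normal equations become ill-conditioned, so this sketch is meant for small illustrative problems only.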

For more information, visit: http://en.wikipedia.org/wiki/Main_Page

Tuesday, January 11, 2011

SMILES

Who Made SMILES?

Daylight provides enterprise-level cheminformatics software technologies to life science companies. Our superior chemistry, high performance, and open architecture have earned Daylight a reputation for delivering the state-of-the-art in chemical information processing since 1987.

Daylight Chemical Information Systems, Inc. is a privately held company with corporate offices in Aliso Viejo, CA and research offices in Santa Fe, NM and Cambridge, England.

What is SMILES?
SMILES
Simplified Molecular Input Line Entry System

SMILES™ is a simple yet comprehensive chemical language in which molecules and reactions can be specified using ASCII characters representing atom and bond symbols. SMILES™ contains the same information as is found in an extended connection table but with several advantages. A SMILES™ string is human understandable, very compact, and if canonicalized represents a unique string that can be used as a universal identifier for a specific chemical structure. In addition, a chemically correct and comprehensible depiction can be made from any SMILES™ string symbolizing either a molecule or reaction.

SMILES™ development was initiated by David Weininger in the late 1980s using the concept of a graph with nodes as atoms and edges as bonds to represent a molecule. Parentheses are used to indicate branching points and numeric labels designate ring connection points. The basic SMILES™ grammar also includes isotopic information, configuration about double bonds, and chirality, leading to what is known as isomeric SMILES™.

Acknowledgments

Development of SMILES was initiated by the author, David Weininger, at the Environmental Research Laboratory, U.S.E.P.A., Duluth, MN; the design was completed at Pomona College in Claremont, CA. It was embodied in the Daylight Toolkit with the assistance of Cedar River Software.

Introduction

SMILES (Simplified Molecular Input Line Entry System) is a line notation (a typographical method using printable characters) for entering and representing molecules and reactions. Some examples are:
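
A few common SMILES strings are collected in the short Python sketch below, together with a check that each one is a single run of printable characters, as a line notation requires. The dictionary name `EXAMPLES` and the helper `is_line_notation` are assumptions for illustration and are not part of Daylight's software.

```python
import string

# A few widely used SMILES strings and the molecules they denote.
EXAMPLES = {
    "CC": "ethane",
    "O=C=O": "carbon dioxide",
    "C#N": "hydrogen cyanide",
    "CC(=O)O": "acetic acid",
    "C1CCCCC1": "cyclohexane",
    "c1ccccc1": "benzene",
}


def is_line_notation(text):
    """True if text is a non-empty run of printable, non-space ASCII characters."""
    allowed = set(string.ascii_letters + string.digits + string.punctuation)
    return bool(text) and all(ch in allowed for ch in text)


for smiles, name in EXAMPLES.items():
    print(f"{smiles:10s} {name}")
```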

SMILES contains the same information as might be found in an extended connection table. The primary reason SMILES is more useful than a connection table is that it is a linguistic construct, rather than a computer data structure. SMILES is a true language, albeit with a simple vocabulary (atom and bond symbols) and only a few grammar rules. SMILES representations of structure can in turn be used as "words" in the vocabulary of other languages designed for storage of chemical information (information about chemicals) and chemical intelligence (information about chemistry).
Part of the power of SMILES is that unique SMILES exist. With standard SMILES, the name of a molecule is synonymous with its structure; with unique SMILES, the name is universal. Anyone in the world who uses unique SMILES to name a molecule will choose the exact same name.
One other important property of SMILES is that it is quite compact compared to most other methods of representing structure. A typical SMILES will take 50% to 70% less space than an equivalent connection table, even binary connection tables. For example, a database of 23,137 structures, with an average of 20 atoms per structure, uses only 1.6 bytes per atom when represented with SMILES. In addition, ordinary compression of SMILES is extremely effective. The same database cited above was reduced to 27% of its original size by Ziv-Lempel compression (i.e. 0.42 bytes per atom).
These properties open many doors to the chemical information programmer. Examples of uses for SMILES are:

Keys for database access
Mechanism for researchers to exchange chemical information
Entry system for chemical data
Part of languages for artificial intelligence or expert systems in chemistry

The rest of this chapter is a concise exposition of the SMILES encoding rules. For further information, the reader is referred to "SMILES 1. Introduction and Encoding Rules", Weininger, D., J. Chem. Inf. Comput. Sci. 1988, 28, 31.

Branches

Branches are specified by enclosing them in parentheses, and can be nested or stacked. In all cases, the implicit connection to a parenthesized expression (a "branch") is to the left. Examples are:
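
To illustrate stacked and nested branches, the Python sketch below measures how deeply the parentheses of a SMILES string are nested. The three strings are standard examples (triethylamine, isobutyric acid, and 3-propyl-4-isopropyl-1-heptene); the function name `max_branch_depth` is an assumption, and the code only inspects parentheses rather than parsing SMILES in full.

```python
def max_branch_depth(smiles):
    """Return the deepest nesting of parenthesized branches in a SMILES string."""
    depth = 0
    deepest = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == ")":
            depth -= 1
            if depth < 0:
                raise ValueError("closing parenthesis without an opening one")
    if depth != 0:
        raise ValueError("unclosed branch")
    return deepest


triethylamine = "CCN(CC)CC"            # one branch on the nitrogen
isobutyric_acid = "CC(C)C(=O)O"        # two separate branches, neither nested
heptene = "C=CC(CCC)C(C(C)C)CCC"       # a branch nested inside another branch

for s in (triethylamine, isobutyric_acid, heptene):
    print(s, max_branch_depth(s))
```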

Cyclic Structures

Cyclic structures are represented by breaking one bond in each ring. The bonds are numbered in any order, designating ring opening (or ring closure) bonds by a digit immediately following the atomic symbol at each ring closure. This leaves a connected non-cyclic graph which is written as a non-cyclic structure using the three rules described above. Cyclohexane is a typical example:
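
Cyclohexane is written C1CCCCC1: the digit 1 after the first and last carbon marks the two ends of the broken ring bond. The Python sketch below pairs up such ring-closure digits. It is a simplified illustration that assumes single-letter atom symbols, single-digit ring labels, and no bracket atoms; the function name `ring_closure_pairs` is hypothetical.

```python
def ring_closure_pairs(smiles):
    """Pair up ring-closure digits in a simple SMILES string.

    Returns a list of (atom_index, atom_index) tuples, one per ring bond.
    """
    open_rings = {}   # digit -> index of the atom that opened the ring
    pairs = []
    atom_index = -1
    for ch in smiles:
        if ch.isalpha():
            atom_index += 1
        elif ch.isdigit():
            if atom_index < 0:
                raise ValueError("ring digit before any atom")
            if ch in open_rings:
                # Second occurrence of the digit closes the ring.
                pairs.append((open_rings.pop(ch), atom_index))
            else:
                open_rings[ch] = atom_index
    if open_rings:
        raise ValueError("unmatched ring-closure digit")
    return pairs


cyclohexane_pairs = ring_closure_pairs("C1CCCCC1")
print(cyclohexane_pairs)
```

Because a digit is released once its ring is closed, the same digit can be reused for a later ring in the same string.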

Isomeric SMILES

This section describes the SMILES rules used to specify isotopism, configuration about double bonds, and chirality. The term isomeric SMILES collectively refers to SMILES written using these rules. The SMILES isomer specification rules allow chirality to be completely specified for any structure, if it is known. Unlike most existing chemical nomenclatures such as CIP and IUPAC, these rules are also designed to allow rigorous partial specification of chirality. Aside from use in macros, substructure searching, and other pattern matching operations, this is important because much of the world's available chemical information is known for structures with incompletely resolved chiralities (not all possible chiral centers are separated, known, or reported).
All isomer specification rules in SMILES are therefore optional. The absence of a specification for any attribute implies that the value of that attribute is unspecified.

Aromaticity

Aromaticity must be deduced in a system such as SMILES which generates an unambiguous chemical nomenclature because of the fundamental requirement to characterize the symmetry of a molecule. Given effective aromaticity-detection algorithms, it is not necessary to enter any structure as aromatic if the user prefers to enter an aliphatic (Kekulé-like) structure. Entering structures as aromatic directly (i.e., by using lower case atomic symbols) provides a shortcut to accurate chemical specification and is closer to the mental molecular model used by most chemists. The SMILES algorithm uses an extended version of Hueckel's rule to identify aromatic molecules and ions. To qualify as aromatic, all atoms in the ring must be sp² hybridized and the number of available "excess" p-electrons must satisfy Hueckel's 4N+2 criterion. As an example, benzene is written c1ccccc1, but an entry of C1=CC=CC=C1 - cyclohexatriene, the Kekulé form - leads to detection of aromaticity and results in an internal structural conversion to aromatic representation. Conversely, entries of c1ccc1 and c1ccccccc1 will produce the correct anti-aromatic structures for cyclobutadiene and cyclooctatetraene, C1=CC=C1 and C1=CC=CC=CC=C1. In such cases the SMILES system looks for a structure that preserves the implied sp² hybridization, the implied hydrogen count, and the specified formal charge, if any. Some inputs, however, may not only be incorrect but also impossible, such as c1cccc1. Here c1cccc1 cannot be converted to C1=CCC=C1 since one of the carbon atoms would be sp³ with two attached hydrogens. In such a structure alternating single and double bond assignments cannot be made. The SMILES system will flag this as an "impossible" input. Please note that only atoms on the following list can be considered aromatic: C, N, O, P, S, As, Se, and * (wildcard). In addition, exocyclic double bonds do not break aromaticity.
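
The electron-counting part of this rule can be sketched in a few lines of Python. The snippet only tests the 4N+2 criterion for a given electron count; it does not check hybridization, hydrogen counts, or charges as the SMILES system does, and the function name `satisfies_huckel` is an assumption for illustration.

```python
def satisfies_huckel(pi_electrons):
    """True if the electron count equals 4N + 2 for some integer N >= 0."""
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


# Electron counts for the rings discussed above.
rings = {
    "benzene": 6,             # c1ccccc1
    "cyclobutadiene": 4,      # C1=CC=C1
    "cyclooctatetraene": 8,   # C1=CC=CC=CC=C1
}

for name, electrons in rings.items():
    print(name, satisfies_huckel(electrons))
```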

Hydrogens

Hydrogens in reactions are handled as with molecules; they are suppressed unless "special". Recall that for molecules, hydrogens are special if they are: charged, isotopic, bonded to another hydrogen, or multiply bonded. With reactions, there is an additional case which will make a hydrogen special. It is often desirable (e.g., for a 1,5-hydride shift) to store information about the location of hydrogens as part of the atom map of a reaction. Hydrogens with a supplied atom map are considered "special" and these hydrogens are not suppressed. These mapped hydrogens appear explicitly in Absolute SMILES for reactions. Otherwise, atom-mapped hydrogens do not appear in Unique SMILES.

For More Information on SMILES, visit
http://www.daylight.com/

Tuesday, January 4, 2011

Protein Data Bank

What is Protein Data Bank?

A repository for 3-D biological macromolecular structure
All data are available to the public
It includes proteins, nucleic acids and viruses
Obtained by X-ray crystallography (80%) or NMR spectroscopy (16%)
Submitted by biologists and biochemists from around the world

History of Protein Data Bank

Founded in 1971 by Brookhaven National Laboratory, New York
The first data sets were entered on punched cards, and later on magnetic tapes
Transferred to the Research Collaboratory for Structural Bioinformatics (RCSB) in 1998
Currently it holds 29,000 released structures

FtsH peptidase

The signal recognition particle (SRP) is a multimeric protein, which along with its conjugate receptor (SR), is involved in targeting secretory proteins to the rough endoplasmic reticulum (RER) membrane in eukaryotes, or to the plasma membrane in prokaryotes PUBMED:17622352, PUBMED:16469117. SRP recognises the signal sequence of the nascent polypeptide on the ribosome, retards its elongation, and docks the SRP-ribosome-polypeptide complex to the RER membrane via the SR receptor. SRP consists of six polypeptides (SRP9, SRP14, SRP19, SRP54, SRP68 and SRP72) and a single 300 nucleotide 7S RNA molecule. The RNA component catalyses the interaction of SRP with its SR receptor PUBMED:17507650. In higher eukaryotes, the SRP complex consists of the Alu domain and the S domain linked by the SRP RNA. The Alu domain consists of a heterodimer of SRP9 and SRP14 bound to the 5' and 3' terminal sequences of SRP RNA. This domain is necessary for retarding the elongation of the nascent polypeptide chain, which gives SRP time to dock the ribosome-polypeptide complex to the RER membrane.
This entry represents the N-terminal helical bundle domain of the 54 kDa SRP54 component, a GTP-binding protein that interacts with the signal sequence when it emerges from the ribosome. SRP54 of the signal recognition particle has a three-domain structure: an N-terminal helical bundle domain, a GTPase domain, and the M-domain that binds the 7S RNA and also binds the signal sequence. The extreme C-terminal region is glycine-rich, lower in complexity, and poorly conserved between species.
These proteins include Escherichia coli and Bacillus subtilis ffh protein (P48), which seems to be the prokaryotic counterpart of SRP54; signal recognition particle receptor alpha subunit (docking protein), an integral membrane GTP-binding protein which ensures, in conjunction with SRP, the correct targeting of nascent secretory proteins to the endoplasmic reticulum membrane; bacterial FtsY protein, which is believed to play a similar role to that of the docking protein in eukaryotes; the pilA protein from Neisseria gonorrhoeae, the homologue of ftsY; and bacterial flagellar biosynthesis protein flhF.

Primary Citation

Cryo-EM Structure of the E. coli Translating Ribosome in Complex with SRP and Its Receptor.
Authors: Estrozi, L.F., Boehringer, D., Shan, S.-O., Ban, N., Schaffitzel, C.
Journal: (2010) Nat.Struct.Mol.Biol.
Not in PubMed

Molecular Description

Classification:	Protein Transport
Structure Weight:	110291.10

Molecule: SIGNAL RECOGNITION PARTICLE PROTEIN
Polymer:
Type: polypeptide(L)
Length: 294
Chains:
EC#: 3.6.5.4
Fragment: NG DOMAIN, RESIDUES 1-294

Molecule: 4.5S RNA
Polymer:
Type: polyribonucleotide
Length: 114
Chains:
Other Details: ONLY THE PART OF THE 4.5S RNA THAT IS VISIBLE IN THE EM RECONSTRUCTION IS INCLUDED

Molecule: SIGNAL RECOGNITION PARTICLE PROTEIN
Polymer:
Type: polypeptide(L)
Length:
Chains:
Fragment: M DOMAIN, RESIDUES 329-430
Other Details: ONLY THE PART OF THE M DOMAIN THAT IS VISIBLE IN THE EM RECONSTRUCTION IS INCLUDED

Molecule: CELL DIVISION PROTEIN FTSY
Polymer:
Type: polypeptide(L)
Length: 303
Chains:

Source

Polymer: 1
Scientific Name: Escherichia coli
Expression System: Escherichia coli

Polymer: 2
Scientific Name: Escherichia coli
Expression System: Escherichia coli

Polymer: 3
Scientific Name: Escherichia coli

Polymer: 4
Scientific Name: Escherichia coli
Expression System: Escherichia coli

Experiment Details

Method: ELECTRON MICROSCOPY

Resolution [Å]: 13.5
Aggregation State: PARTICLE
Reconstruction Method: SINGLE PARTICLE
Specimen Type: VITREOUS ICE

Gene Ontology

Type	Synonym
narrow:	protein folding chaperone
related:	protein tagging activity
related:	protein degradation tagging activity
exact:	protein amino acid binding
related:	alpha-2 macroglobulin receptor-associated protein activity

Thermolysin

Metalloproteases are the most diverse of the four main types of protease, with more than 50 families identified to date. In these enzymes, a divalent cation, usually zinc, activates the water molecule. The metal ion is held in place by amino acid ligands, usually three in number. The known metal ligands are His, Glu, Asp or Lys, and at least one other residue is required for catalysis, which may play an electrophilic role. Of the known metalloproteases, around half contain an HEXXH motif, which has been shown in crystallographic studies to form part of the metal-binding site PUBMED:7674922. The HEXXH motif is relatively common, but can be more stringently defined for metalloproteases as 'abXHEbbHbc', where 'a' is most often valine or threonine and forms part of the S1' subsite in thermolysin and neprilysin, 'b' is an uncharged residue, and 'c' a hydrophobic residue. Proline is never found in this site, possibly because it would break the helical structure adopted by this motif in metalloproteases PUBMED:7674922.
In the MEROPS database peptidases and peptidase homologues are grouped into clans and families. Clans are groups of families for which there is evidence of common ancestry based on a common structural fold:

Each clan is identified with two letters, the first representing the catalytic type of the families included in the clan (with the letter 'P' being used for a clan containing families of more than one of the catalytic types serine, threonine and cysteine). Some families cannot yet be assigned to clans, and when a formal assignment is required, such a family is described as belonging to clan A-, C-, M-, S-, T- or U-, according to the catalytic type. Some clans are divided into subclans because there is evidence of a very ancient divergence within the clan, for example MA(E), the gluzincins, and MA(M), the metzincins.
Peptidase families are grouped by their catalytic type, the first character representing the catalytic type: A, aspartic; C, cysteine; G, glutamic acid; M, metallo; S, serine; T, threonine; and U, unknown. The serine, threonine and cysteine peptidases utilise the amino acid as a nucleophile and form an acyl intermediate - these peptidases can also readily act as transferases. In the case of aspartic, glutamic and metallopeptidases, the nucleophile is an activated water molecule.

In many instances the structural protein fold that characterises the clan or family may have lost its catalytic activity, yet retain its function in protein recognition and binding.
This group of metallopeptidases constitutes the MEROPS peptidase family M4 (thermolysin family, clan MA(E)). The protein fold of the peptidase domain of thermolysin is the type example for members of the clan MA. The thermolysin family is composed only of secreted eubacterial endopeptidases. The zinc-binding residues are H-142, H-146 and E-166, with E-143 acting as the catalytic residue. Thermolysin also contains 4 calcium-binding sites, which contribute to its unusual thermostability. The family also includes enzymes from a number of pathogens, including Legionella and Listeria, and the protein pseudolysin, all with a substrate specificity for an aromatic residue in the P1' position. Three-dimensional structure analysis has shown that the enzymes undergo a hinge-bend motion during catalysis. Pseudolysin has a broader specificity, acting on large molecules such as elastin and collagen, possibly due to its wider active site cleft PUBMED:7674922.

Authors: Juers, D.H., Weik, M.

Experiment Details
Method:   X-RAY DIFFRACTION
Exp. Data:
Structure Factors
EDS

Unit Cell:
	Length [Å]	Angles [°]
	a = 93.26	α = 90.00
	b = 93.26	β = 90.00
	c = 128.69	γ = 120.00

Gene Ontology

Type	Synonym
exact :	metalloendoproteinase activity
exact :	metalloendoprotease activity

Primary Citation

Radiation damage study of thermolysin - 100K structure B (2.5 MGy)
Authors: Juers, D.H., Weik, M.
Journal: To be Published
Not in PubMed

Molecular Description

Classification:	Hydrolase
Structure Weight:	34588.30

Molecule: Thermolysin
Polymer:
Type: polypeptide(L)
Length: 316
Chains:
EC#: 3.4.24.27
Fragment: UNP residues 233-548

Source

Polymer: 1
Scientific Name: Bacillus thermoproteolyticus

Leucyl Aminopeptidase

Aminopeptidases are exopeptidases involved in the processing and regular turnover of intracellular proteins, although their precise role in cellular metabolism is unclear PUBMED:1555602, PUBMED:2395881. Leucine aminopeptidases cleave leucine residues from the N-terminal of polypeptide chains, but substantial rates are evident for all amino acids PUBMED:2395881.
The enzymes exist as homo-hexamers, comprising 2 trimers stacked on top of one another PUBMED:2395881. Each monomer binds 2 zinc ions and folds into 2 alpha/beta-type quasi-spherical globular domains, producing a comma-like shape PUBMED:2395881. The N-terminal 150 residues form a 5-stranded beta-sheet with 4 parallel and 1 anti-parallel strand sandwiched between 4 alpha-helices PUBMED:2395881. An alpha-helix extends into the C-terminal domain, which comprises a central 8-stranded saddle-shaped beta-sheet sandwiched between groups of helices, forming the monomer hydrophobic core PUBMED:2395881. A 3-stranded beta-sheet resides on the surface of the monomer, where it interacts with other members of the hexamer PUBMED:2395881. The two zinc ions and the active site are entirely located in the C-terminal catalytic domain PUBMED:2395881.

Authors: Natarajan, S., Huynh, K.-H., Kang, L.W.

Experiment Details
Method:   X-RAY DIFFRACTION
Exp. Data:
Structure Factors
EDS

Unit Cell:
	Length [Å]	Angles [°]
	a = 152.13	α = 90.00
	b = 152.13	β = 90.00
	c = 152.13	γ = 90.00

Gene Ontology

Type	Synonym
related:	nucleocytoplasm
exact:	internal to cell
related:	protoplast
exact:	protoplasm

Primary Citation

Crystal structure of Leucyl Aminopeptidase (pepA) from Xoo0834, Xanthomonas oryzae pv. oryzae KACC10331
Authors: Natarajan, S., Huynh, K.-H., Kang, L.W.
Journal: To be Published
Not in PubMed

Molecular Description

Classification:	Hydrolase
Structure Weight:	102628.49

Molecule: Probable cytosol aminopeptidase
Polymer: 1
Type: polypeptide(L)
Length: 490
Chains: A, B
EC#: 3.4.11.1

Source
Polymer: 1
Scientific Name: Xanthomonas oryzae pv. oryzae
Expression System: Escherichia coli

For More Information, visit:
http://www.rcsb.org/pdb/home/home.do

Tuesday, December 28, 2010

ChemSketch

Who made ChemSketch?

Advanced Chemistry Development, Inc.

Advanced Chemistry Development, Inc., (ACD/Labs) is a global chemistry software company developing desktop and enterprise solutions to effectively utilize the wealth of scientific knowledge generated among the many branches of chemical, biochemical, and pharmaceutical R&D. Using our deep understanding of chemical disciplines, advanced mathematical algorithms, and computer science, our solutions help guide on-going research, aid decision-making, and speed the development of new chemical entities for the marketplace.
Founded in 1994 and headquartered in Toronto, Canada, ACD/Labs employs a team of over 160 dedicated individuals, including many PhD level scientists from varying chemical disciplines. In 2009, ACD/Labs joined forces with Pharma Algorithms (a leader in ADME/Tox prediction). The merger represents greater possibilities for the future of in silico modeling that will benefit molecular discovery in pharmaceutics and biotechnology.

What is ChemSketch?

ACD/ChemSketch is a chemical drawing software package from Advanced Chemistry Development, Inc., designed to be used alone or integrated with other applications. ChemSketch is used to draw chemical structures, reactions and schematic diagrams. It can also be used to design chemistry-related reports and presentations. ACD/ChemSketch has the following major capabilities:

* Structure Mode for drawing chemical structures and calculating their properties.
* Draw Mode for text and graphics processing.
* Molecular Properties calculations for automatic estimation of molecular weight, percentage composition, molar refractivity, molar volume, parachor, index of refraction, surface tension, density, dielectric constant, and polarizability.

ACD/ChemSketch can stand alone as a drawing package or act as the “front end” to other ACD software such as the NMR Predictor engines.

To download the freeware version of ChemSketch, visit:
http://www.acdlabs.com/download/

Tuesday, December 21, 2010

Introduction to HTML

HTML, which stands for HyperText Markup Language, is the predominant markup language for web pages. A markup language is a set of markup tags, and HTML uses markup tags to describe web pages.

HTML is written in the form of HTML elements consisting of "tags" surrounded by angle brackets (like <html>) within the web page content. HTML tags normally come in pairs like <b> and </b>. The first tag in a pair is the start tag, the second tag is the end tag (they are also called opening tags and closing tags).
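
The pairing of start and end tags can be observed with Python's standard `html.parser` module. The sketch below records each tag and piece of text as the parser meets it; the class name `TagLogger` and the sample markup are assumptions for illustration.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record start tags, end tags, and text in the order they are parsed."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        # Skip whitespace-only text between tags.
        if data.strip():
            self.events.append(("text", data.strip()))


parser = TagLogger()
parser.feed("<html><body><b>Hello</b> world</body></html>")
parser.close()

for kind, value in parser.events:
    print(kind, value)
```

Each start tag in the sample has a matching end tag, and only the text between them is what a browser would display.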

The purpose of a web browser is to read HTML documents and display them as web pages. The browser does not display the HTML tags, but uses the tags to interpret the content of the page.

HTML elements form the building blocks of all websites. HTML allows images and objects to be embedded and can be used to create interactive forms. It provides a means to create structured documents by denoting structural semantics for text such as headings, paragraphs, lists, links, quotes and other items. It can embed scripts in languages such as JavaScript which affect the behavior of HTML webpages.

HTML can also be used to include Cascading Style Sheets (CSS) to define the appearance and layout of text and other material. The W3C, maintainer of both HTML and CSS standards, encourages the use of CSS over explicit presentational markup.

History

Origins

In 1980, physicist Tim Berners-Lee, who was a contractor at CERN, proposed and prototyped ENQUIRE, a system for CERN researchers to use and share documents. In 1989, Berners-Lee wrote a memo proposing an Internet-based hypertext system. Berners-Lee specified HTML and wrote the browser and server software in the last part of 1990. In that year, Berners-Lee and CERN data systems engineer Robert Cailliau collaborated on a joint request for funding, but the project was not formally adopted by CERN. In his personal notes from 1990 he lists "some of the many areas in which hypertext is used" and puts an encyclopedia first.

HTML version timeline

November 24, 1995

HTML 2.0 was published as IETF RFC 1866. Supplemental RFCs added capabilities:

* November 25, 1995: RFC 1867 (form-based file upload)
* May 1996: RFC 1942 (tables)
* August 1996: RFC 1980 (client-side image maps)
* January 1997: RFC 2070 (internationalization)

In June 2000, all of these were declared obsolete/historic by RFC 2854.

January 1997

HTML 3.2 was published as a W3C Recommendation. It was the first version developed and standardized exclusively by the W3C, as the IETF had closed its HTML Working Group in September 1996.

HTML 3.2 dropped math formulas entirely, reconciled overlap among various proprietary extensions and adopted most of Netscape's visual markup tags. Netscape's blink element and Microsoft's marquee element were omitted due to a mutual agreement between the two companies. A markup for mathematical formulas similar to that in HTML wasn't standardized until 14 months later in MathML.

December 1997

HTML 4.0 was published as a W3C Recommendation. It offers three variations:

* Strict, in which deprecated elements are forbidden,
* Transitional, in which deprecated elements are allowed,
* Frameset, in which mostly only frame related elements are allowed.

Initially code-named "Cougar", HTML 4.0 adopted many browser-specific element types and attributes, but at the same time sought to phase out Netscape's visual markup features by marking them as deprecated in favor of style sheets. HTML 4 is an SGML application conforming to ISO 8879 - SGML.

April 1998

HTML 4.0 was reissued with minor edits without incrementing the version number.

December 1999

HTML 4.01 was published as a W3C Recommendation. It offers the same three variations as HTML 4.0 and its last errata were published May 12, 2001.

May 2000

ISO/IEC 15445:2000 ("ISO HTML", based on HTML 4.01 Strict) was published as an ISO/IEC international standard. In the ISO this standard falls in the domain of the ISO/IEC JTC1/SC34 (ISO/IEC Joint Technical Committee 1, Subcommittee 34 - Document description and processing languages).

As of mid-2008, HTML 4.01 and ISO/IEC 15445:2000 are the most recent versions of HTML. Development of the parallel, XML-based language XHTML occupied the W3C's HTML Working Group through the early and mid-2000s.

HTML Draft Version Timeline

October 1991

HTML Tags, an informal CERN document listing twelve HTML tags, was first mentioned in public.

June 1992

First informal draft of the HTML DTD, with seven subsequent revisions (July 15, August 6, August 18, November 17, November 19, November 20, November 22).

November 1992

HTML DTD 1.1 (the first with a version number, based on RCS revisions, which start with 1.1 rather than 1.0), an informal draft.

June 1993

Hypertext Markup Language was published by the IETF IIIR Working Group as an Internet-Draft (a rough proposal for a standard). It was replaced by a second version one month later, followed by six further drafts published by IETF itself that finally led to HTML 2.0 in RFC 1866.

November 1993

HTML+ was published by the IETF as an Internet-Draft and was a competing proposal to the Hypertext Markup Language draft. It expired in May 1994.

April 1995 (authored March 1995)

HTML 3.0 was proposed as a standard to the IETF, but the proposal expired five months later without further action. It included many of the capabilities that were in Raggett's HTML+ proposal, such as support for tables, text flow around figures and the display of complex mathematical formulas.

W3C began development of its own Arena browser as a test bed for HTML 3 and Cascading Style Sheets, but HTML 3.0 did not succeed for several reasons. The draft was considered very large at 150 pages, and the pace of browser development, as well as the number of interested parties, had outstripped the resources of the IETF. Browser vendors, including Microsoft and Netscape at the time, chose to implement different subsets of HTML 3's draft features as well as to introduce their own extensions to it. These included extensions to control stylistic aspects of documents, contrary to the "belief [of the academic engineering community] that such things as text color, background texture, font size and font face were definitely outside the scope of a language when their only intent was to specify how a document would be organized." Dave Raggett, who has been a W3C Fellow for many years, has commented, for example: "To a certain extent, Microsoft built its business on the Web by extending HTML features."

January 2008

HTML 5 was published as a Working Draft by the W3C.

Although its syntax closely resembles that of SGML, HTML 5 has abandoned any attempt to be an SGML application and has explicitly defined its own "html" serialization, in addition to an alternative XML-based XHTML 5 serialization.

Resources from:
http://en.wikipedia.org/wiki/HTML