Theoretical basis of isoelectric point calculation, i.e. how to calculate the isoelectric point of a protein

Let's start with the definition of the isoelectric point:

The isoelectric point (pI) is the pH at which the net charge of a protein is zero. For proteins, the isoelectric point depends mostly on seven charged amino acids: glutamate (γ-carboxyl group), aspartate (β-carboxyl group), cysteine (thiol group), tyrosine (phenol group), histidine (imidazole side chain), lysine (ε-ammonium group) and arginine (guanidinium group). Additionally, one should take into account the charge of the protein's terminal groups (NH₂ and COOH). Each of these groups has its own acid dissociation constant, expressed as pK.
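Moreover, the net charge of a protein is closely tied to the pH of the solution (buffer). Keeping this in mind, we can use the Henderson-Hasselbalch equation to calculate the charge of each ionizable group at a given pH. For negatively charged groups:

$$ q_n = \frac{-1}{1 + 10^{\,pK_n - pH}} $$

where pK_n is the acid dissociation constant (pK) of the negatively charged amino acid, and for positively charged groups:

$$ q_p = \frac{1}{1 + 10^{\,pH - pK_p}} $$

where pK_p is the acid dissociation constant (pK) of the positively charged amino acid. The net charge of the protein is the sum of these contributions over all charged groups.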
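The charge calculation based on the Henderson-Hasselbalch equation can be sketched in C++ roughly as follows. This is a minimal sketch, not a definitive implementation: the function names (negativeGroupCharge, positiveGroupCharge, netCharge) are made up for illustration, and the pK values are simply the EMBOSS row of the pK table given in this article.

```cpp
#include <cassert>
#include <cmath>
#include <string>

// Charge of one acidic group (COOH terminus, C, D, E, Y): -1 / (1 + 10^(pK - pH)).
double negativeGroupCharge(double pK, double pH) {
    return -1.0 / (1.0 + std::pow(10.0, pK - pH));
}

// Charge of one basic group (NH2 terminus, H, K, R): +1 / (1 + 10^(pH - pK)).
double positiveGroupCharge(double pK, double pH) {
    return 1.0 / (1.0 + std::pow(10.0, pH - pK));
}

// Net charge of a sequence (one-letter code) at the given pH.
// Assumption: pK values are taken from the EMBOSS scale; any other scale could be used.
double netCharge(const std::string& sequence, double pH) {
    assert(!sequence.empty());
    double charge = positiveGroupCharge(8.6, pH)    // N-terminal NH2
                  + negativeGroupCharge(3.6, pH);   // C-terminal COOH
    for (char aa : sequence) {
        switch (aa) {
            case 'C': charge += negativeGroupCharge(8.5, pH);  break;
            case 'D': charge += negativeGroupCharge(3.9, pH);  break;
            case 'E': charge += negativeGroupCharge(4.1, pH);  break;
            case 'Y': charge += negativeGroupCharge(10.1, pH); break;
            case 'H': charge += positiveGroupCharge(6.5, pH);  break;
            case 'K': charge += positiveGroupCharge(10.8, pH); break;
            case 'R': charge += positiveGroupCharge(12.5, pH); break;
            default: break;  // other residues carry no side-chain charge
        }
    }
    return charge;
}
```

A group whose pK equals the pH is exactly half ionized, so negativeGroupCharge(4.0, 4.0) gives -0.5 and positiveGroupCharge(10.0, 10.0) gives +0.5.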

As you can see, the pH of the buffer is the only variable in these equations. If we change this value step by step, we will eventually find the pH at which the net charge is zero, i.e. the isoelectric point of the analyzed protein. Knowledge of the isoelectric point is of great significance in biochemistry (mainly in electrophoresis and isoelectric focusing), because it allows one to choose suitable conditions before the experiment starts.
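The idea of changing the pH until the net charge reaches zero can be sketched as a bisection search, which works because the net charge only decreases as the pH rises. This is just one possible approach, not necessarily the one used by any particular tool. The names findIsoelectricPoint and toyCharge are hypothetical, and the toy molecule (one basic group with pK 9.0 and one acidic group with pK 3.0) is invented purely for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Bisection on pH in the range 0..14. netChargeAt must return the net charge
// of the molecule at the given pH and must decrease as the pH increases.
double findIsoelectricPoint(const std::function<double(double)>& netChargeAt,
                            double precision = 0.01) {
    assert(precision > 0.0);
    double low = 0.0;
    double high = 14.0;
    while (high - low > precision) {
        const double mid = (low + high) / 2.0;
        if (netChargeAt(mid) > 0.0) {
            low = mid;    // still positively charged: the pI lies at a higher pH
        } else {
            high = mid;   // zero or negative: the pI lies at this pH or lower
        }
    }
    return (low + high) / 2.0;
}

// Hypothetical toy molecule: one basic group (pK 9.0) and one acidic group (pK 3.0).
double toyCharge(double pH) {
    return 1.0 / (1.0 + std::pow(10.0, pH - 9.0))
         - 1.0 / (1.0 + std::pow(10.0, 3.0 - pH));
}
```

For the toy molecule the two contributions cancel exactly halfway between the two pK values, so findIsoelectricPoint(toyCharge) should converge to a value close to 6.0.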

Generally, proteins are positively charged below their isoelectric point, whereas above it their charge is negative. For example, during electrophoresis the direction of protein migration depends only on the charge. If the buffer pH (and, as a result, the gel pH) is higher than the protein's isoelectric point, the molecules will migrate to the anode (positive electrode), and if the buffer pH is lower than the isoelectric point they will go to the cathode (negative electrode). When the gel pH and the protein's isoelectric point are equal, the proteins do not move at all.
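The relation between buffer pH, isoelectric point and migration direction can be written down as a small helper. This is only an illustrative sketch: the function name migrationDirection and the tolerance parameter are assumptions introduced here. It relies on the fact that a negatively charged protein (buffer pH above the pI) moves toward the positive electrode, the anode.

```cpp
#include <cassert>
#include <cmath>
#include <string>

// Expected migration direction of a protein during electrophoresis.
// tolerance is a hypothetical parameter: how close the pH must be to the pI
// for the protein to be treated as uncharged.
std::string migrationDirection(double bufferPH, double pI, double tolerance = 0.01) {
    assert(tolerance >= 0.0);
    if (std::fabs(bufferPH - pI) <= tolerance) {
        return "no migration";   // net charge is (practically) zero
    }
    // pH above pI: negative net charge, attracted by the positive electrode.
    // pH below pI: positive net charge, attracted by the negative electrode.
    return bufferPH > pI ? "anode (+)" : "cathode (-)";
}
```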
Using the above formulae, we can calculate the theoretical isoelectric point. The result will almost certainly differ from the real isoelectric point. This is mainly because many proteins are chemically modified (amino acids can be phosphorylated, methylated, acetylated etc.), which changes their charge. Another problem is the occurrence of cysteines (negatively charged when deprotonated), which can oxidise and form disulfide bonds in the protein. They then become cystines, which do not carry any charge.
Nevertheless, one can calculate an approximate isoelectric point that is usually within about ± 0.5 pH units of the exact value. The most critical step in isoelectric point determination is the choice of appropriate pK values. Unfortunately, there is no agreement on this matter: each source gives different pKs. Some of them are presented below:

* Arg was not included in the Grimsley study, so the average pK from all other scales was used

A more advanced algorithm, implemented in ProMoST, takes into account the position of the charged amino acid in the sequence:

aa	N term	middle	C term
K	10.00	9.80	10.30
R	11.50	12.50	11.50
H	4.89	6.08	6.89
D	3.57	4.07	4.57
E	4.15	4.45	4.75
C	8.00	8.28	9.00
U*	5.20	5.43	5.60
Y	9.34	9.84	10.34
* pK was taken from Byun et al. 2011
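A position-dependent lookup of this kind could be sketched as follows. The values are copied from the ProMoST table above. The names PositionalPK, promostTable and positionalPK are hypothetical, and the rule used here (first residue takes the N-terminal value, last residue the C-terminal value, everything else the middle value) is an assumption made for illustration rather than a description of ProMoST's actual code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <string>

// pK of one residue type depending on where it sits in the sequence.
struct PositionalPK {
    double nTerm;
    double middle;
    double cTerm;
};

// Values copied from the ProMoST table (U = selenocysteine).
const std::map<char, PositionalPK>& promostTable() {
    static const std::map<char, PositionalPK> table = {
        {'K', {10.00,  9.80, 10.30}},
        {'R', {11.50, 12.50, 11.50}},
        {'H', { 4.89,  6.08,  6.89}},
        {'D', { 3.57,  4.07,  4.57}},
        {'E', { 4.15,  4.45,  4.75}},
        {'C', { 8.00,  8.28,  9.00}},
        {'U', { 5.20,  5.43,  5.60}},
        {'Y', { 9.34,  9.84, 10.34}},
    };
    return table;
}

// Returns the side-chain pK of the residue at the given index,
// or -1.0 if that residue has no charged side chain.
// Assumption: a one-residue sequence is treated as N-terminal.
double positionalPK(const std::string& sequence, std::size_t index) {
    assert(index < sequence.size());
    const auto it = promostTable().find(sequence[index]);
    if (it == promostTable().end()) return -1.0;
    if (index == 0) return it->second.nTerm;
    if (index + 1 == sequence.size()) return it->second.cTerm;
    return it->second.middle;
}
```

For the sequence "KAH", for example, the lysine at index 0 takes its N-terminal value, the alanine has no entry, and the histidine at the last index takes its C-terminal value.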

Additionally, different pK values are used for the N and C termini depending on the uncharged amino acid located there, if applicable:

*   X - average from all amino acids
**  Z = (E+Q)/2
*** B = (N+D)/2

Now, having these few pieces of information, we can try to write a simple computer program that calculates the isoelectric point. The program will be written in the C++ programming language, and we will use the free Dev-C++ environment to compile it. To follow the next section you should have at least a basic knowledge of C++.

For more theoretical information go to:
http://en.wikipedia.org/wiki/Isoelectric_point
Tabb DL. (2001) An algorithm for isoelectric point estimation.
Sillero A, Maldonado A. (2006) Isoelectric point determination of proteins and other macromolecules: oscillating method. Comput Biol Med. 36(2), 157-66. (not open access)
Grimsley GR, Scholtz JM, Pace CN. (2009) A summary of the measured pK values of the ionizable groups in folded proteins. Protein Sci. 18(1), 247-51.


Amino acid	NH₂	COOH	C	D	E	H	K	R	Y
EMBOSS	8.6	3.6	8.5	3.9	4.1	6.5	10.8	12.5	10.1
DTASelect	8.0	3.1	8.5	4.4	4.4	6.5	10.0	12.0	10.0
Solomon	9.6	2.4	8.3	3.9	4.3	6.0	10.5	12.5	10.1
Sillero	8.2	3.2	9.0	4.0	4.5	6.4	10.4	12.0	10.0
Rodwell	8.0	3.1	8.33	3.68	4.25	6.0	11.5	11.5	10.07
Patrickios	11.2	4.2	-	4.2	4.2	-	11.2	11.2	-
Wikipedia	8.2	3.65	8.18	3.9	4.07	6.04	10.54	12.48	10.46
Lehninger	9.69	2.34	8.33	3.86	4.25	6.0	10.5	12.4	10.0
Grimsley	7.7	3.3	6.8	3.5	4.2	6.6	10.5	12.04*	10.3
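To see how much the choice of pK scale matters, the table can be turned into data and the same sequence evaluated with different rows. The sketch below is illustrative only: it includes just the EMBOSS and Lehninger rows, the names PKScale, netChargeWithScale and isoelectricPoint are hypothetical, and the pI is located with a simple bisection on pH, which is one possible search strategy.

```cpp
#include <cassert>
#include <cmath>
#include <string>

// One row of the pK table: terminal groups followed by the charged side chains.
struct PKScale {
    const char* name;
    double nTerm, cTerm, C, D, E, H, K, R, Y;
};

const PKScale EMBOSS    = {"EMBOSS",    8.6,  3.6,  8.5,  3.9,  4.1,  6.5, 10.8, 12.5, 10.1};
const PKScale LEHNINGER = {"Lehninger", 9.69, 2.34, 8.33, 3.86, 4.25, 6.0, 10.5, 12.4, 10.0};

double acidic(double pK, double pH) { return -1.0 / (1.0 + std::pow(10.0, pK - pH)); }
double basic(double pK, double pH)  { return  1.0 / (1.0 + std::pow(10.0, pH - pK)); }

// Net charge of a sequence (one-letter code) at the given pH for the chosen scale.
double netChargeWithScale(const std::string& seq, double pH, const PKScale& s) {
    double q = basic(s.nTerm, pH) + acidic(s.cTerm, pH);
    for (char aa : seq) {
        switch (aa) {
            case 'C': q += acidic(s.C, pH); break;
            case 'D': q += acidic(s.D, pH); break;
            case 'E': q += acidic(s.E, pH); break;
            case 'Y': q += acidic(s.Y, pH); break;
            case 'H': q += basic(s.H, pH);  break;
            case 'K': q += basic(s.K, pH);  break;
            case 'R': q += basic(s.R, pH);  break;
            default: break;  // uncharged side chain
        }
    }
    return q;
}

// Bisection on pH: the net charge falls monotonically as the pH rises.
double isoelectricPoint(const std::string& seq, const PKScale& s,
                        double precision = 0.001) {
    assert(!seq.empty() && precision > 0.0);
    double low = 0.0;
    double high = 14.0;
    while (high - low > precision) {
        const double mid = (low + high) / 2.0;
        if (netChargeWithScale(seq, mid, s) > 0.0) low = mid;
        else high = mid;
    }
    return (low + high) / 2.0;
}
```

For a sequence without charged side chains, such as a single alanine, only the terminal groups contribute, so the result is the midpoint of the two terminal pK values: about 6.1 with the EMBOSS row and about 6.02 with the Lehninger row. Calling isoelectricPoint with the same sequence and both scales shows how far the estimates can drift apart.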