UNIT 3.5 Selecting the Right Protein-Scoring Matrix

  1. David Wheeler

Published Online: 1 NOV 2002

DOI: 10.1002/0471250953.bi0305s00

Current Protocols in Bioinformatics

How to Cite

Wheeler, D. 2002. Selecting the Right Protein-Scoring Matrix. Current Protocols in Bioinformatics. 00:3.5:3.5.1–3.5.6.

Author Information

  1. Human Genome Center, Baylor College of Medicine, Houston, Texas

Publication History

  1. Published Online: 1 NOV 2002
  2. Published Print: JAN 2003

This is not the most recent version of the article. The current version is dated 15 OCT 2013.


Every program that searches protein sequences against a database offers a choice of protein weight matrix, also called a scoring matrix. Weight matrices add sensitivity to the search, while statistical significance adds selectivity. Virtually every user accepts the default, typically PAM250 or BLOSUM62. Although the choice of matrix can strongly influence the outcome of the analysis, most users do not know why a particular matrix should be used. In general, a scoring matrix implicitly represents a particular theory of protein sequence evolution, so understanding the assumptions underlying the PAM and BLOSUM scoring matrices can aid in making the proper choice. The purpose of this unit is to guide that choice. It covers the selection of PAM and BLOSUM matrices and provides a brief overview of the wide variety of specialized scoring matrices.
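
To make concrete how the choice of matrix can change the outcome of a comparison, the following minimal Python sketch scores one ungapped alignment under two small lookup tables. The tables STRICT and PERMISSIVE, their values, the default mismatch score, and the function names are all hypothetical and chosen only for illustration; they are not taken from any PAM or BLOSUM matrix, and real search programs also handle gaps and statistics that are omitted here.

```python
# Toy substitution scores for a few residue pairs.
# ASSUMPTION: these values are invented for illustration and are NOT
# real PAM or BLOSUM entries.
STRICT = {
    ("L", "L"): 5, ("I", "I"): 5, ("K", "K"): 5, ("R", "R"): 5,
    ("L", "I"): -1, ("K", "R"): -1,   # conservative changes penalized
}
PERMISSIVE = {
    ("L", "L"): 4, ("I", "I"): 4, ("K", "K"): 4, ("R", "R"): 4,
    ("L", "I"): 2, ("K", "R"): 2,     # conservative changes rewarded
}


def pair_score(matrix, a, b, default=-3):
    """Look up the score for residues a and b, treating the table as symmetric.

    Pairs missing from the toy table get a hypothetical default penalty.
    """
    if (a, b) in matrix:
        return matrix[(a, b)]
    return matrix.get((b, a), default)


def alignment_score(matrix, seq1, seq2):
    """Sum the pair scores over an ungapped alignment of equal-length sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("ungapped alignment requires equal-length sequences")
    return sum(pair_score(matrix, a, b) for a, b in zip(seq1, seq2))


if __name__ == "__main__":
    query, subject = "LKLK", "IRLK"
    print("strict:    ", alignment_score(STRICT, query, subject))
    print("permissive:", alignment_score(PERMISSIVE, query, subject))
```

The same pair of sequences receives a lower score under the table that penalizes conservative substitutions and a higher score under the table that rewards them, which is the sense in which a matrix encodes assumptions about how proteins change.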