CIO Blast from the Past: 60 years of Hamming codes
- 25 November, 2010 14:09
Richard W. Hamming pioneered Hamming codes, a discovery that would lay an important foundation for the entire modern computing and communications industries
In 1950 Bell Labs researcher Richard W. Hamming made a discovery that would lay an important foundation for the entire modern computing and communications industries. He had invented a code for correcting errors in communication and the Hamming code was born. CIO Blast from the Past takes a journey through 60 years of information theory and discovers how the Hamming code legacy lives on today.
Richard W Hamming was born in Chicago in 1915. He studied mathematics at three universities and received a bachelor of science, master of arts and a PhD. His early experience was obtained at Los Alamos National Laboratory from 1945 to 1946, where he managed the computers used to build the first atomic bomb.
From there, he moved to Bell Labs where he spent 30 years in various aspects of computing, numerical analysis, and management of computing.
In 1976 he joined the Naval Postgraduate School in Monterey, California where he worked as a teacher and authored no fewer than nine books on numerical methods, coding theory and computing.
In April 1950 the American Telephone and Telegraph Company published Volume 29 of the Bell System Technical Journal. The paper, titled Error Detecting and Error Correcting Codes, by RW Hamming, outlined a method for performing computing operations on a large scale without errors.
The paper is still available today and is hosted by Alcatel-Lucent (PDF), the modern owner of Bell Labs.
The “special codes” Hamming proposed could handle error detection and correction and he speculated would only need to be applied to systems requiring unattended operation for long periods of time or “extremely large and tightly integrated” systems where a single failure “incapacitates the entire installation”.
Named after their inventor, Hamming codes went on to be extended and applied to all aspects of computing and communications systems.
Hamming, a Turing Award recipient in 1968, passed away on January 7, 1998. The IEEE Richard W Hamming Medal, awarded annually for exceptional contributions to IT, lives on in his honour.
Meet the Shannon bound
To put Hamming’s contribution to error correction coding in perspective, it is helpful to understand some other events in the history of communications, says Todd Moon, professor of electrical and computer engineering at Utah State University.
“In 1948, Claude Shannon published A Mathematical Theory of Communication, (Bell System Technical Journal, v. 27, pp. 623—656), in which he created the field of information theory. In that paper, he showed that digital communication could be made as reliable as desired (essentially with zero probability of error), provided that error correction coding was used and that the rate of communications is less than a quantity associated with the channel called the ‘channel capacity’”, Moon says.
Transmitting data at a rate less than capacity is tied to the concept of error correction coding.
Moon says Shannon’s bold “existence proof” entails that such codes exist, but the paper does not say how to find them. Codes that are as good as theoretically possible are said to “meet the Shannon bound”.
“Shannon’s paper sparked 60 years of research as engineers began to understand the implications of the theory to actual communications and push to find codes that are close to the Shannon bound,” Moon says.
“It is hardly an overstatement to say that Shannon was the impetus behind the digital revolution, as successively better and better ways of coding led to better and better communications.”
Shannon’s theories were first applied to digital communications over phone lines, and there was a steady succession of faster modems: from 300 to 14,400 bits per second.
David MacKay, professor of natural philosophy in the Department of Physics at the University of Cambridge and chief scientific adviser to the UK Department of Energy and Climate Change, has also written about Hamming codes and in 2003 published Information Theory, Inference, and Learning Algorithms.
MacKay says Hamming's first steps in coding theory coincided with Shannon's 1948 discovery of the complementary field of information theory.
“Shannon proved the existence of error correcting codes with capabilities way beyond those of Hamming codes,” he says. “He proved the existence of larger codes capable of correcting larger numbers of errors. Shannon's proof was non-constructive, and laid down a delightful challenge to the coding theory community; Hamming's work was the foundation stone.”
The Hamming code connection
Where does Hamming fit into this early work on communications theory?
In 1950 Hamming was working on computers, which at that time were very unreliable since they were based on vacuum tubes, Moon says.
“He was looking for a way to make these computations more reliable, and came up with the concept of these codes which add some redundancy to enable the correction of some errors.
“At the time, his focus was on computer hardware, but it soon emerged that Hamming’s codes were precisely the kind of codes that Shannon was referring to in his paper. They don’t meet the Shannon bound – they don’t even come close – but they started the exploration of the whole field.”
There is not just one, but a whole family of Hamming codes. The (7,4) Hamming code means four bits are put into the encoder and seven coded bits are returned. And since there are four bits in, there are 16 (2^4) possible codewords.
“There is also a (15,11) Hamming code (with 2^11 = 2048 codewords of length 15), a (31,26) Hamming code (with 2^26 codewords of length 31),” Moon says.
“Each of these codes is capable of correcting one error out of the code word. For example, in the (7,4) code, out of the seven coded bits, if a single bit error occurs anywhere, the decoder can fix it. Or, in the (31,26) code, if a single error occurs anywhere, the decoder can fix it.”
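Moon’s description of the (7,4) code can be sketched in a few lines of Python. This is an illustrative sketch only, assuming the conventional layout with parity bits at positions 1, 2 and 4 (1-indexed): four data bits go in, seven coded bits come out, and the decoder locates and flips any single bit error using a three-bit syndrome.

```python
def encode(d):
    """d: list of 4 data bits -> 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def decode(c):
    """c: 7-bit word, possibly with one flipped bit -> corrected 4 data bits."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]    # recheck parity group 1
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]    # recheck parity group 2
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]    # recheck parity group 3
    pos = s3 * 4 + s2 * 2 + s1        # syndrome = 1-indexed error position; 0 = no error
    if pos:
        c[pos - 1] ^= 1               # flip the erroneous bit back
    return [c[2], c[4], c[5], c[6]]   # extract the data bits

word = encode([1, 0, 1, 1])
word[5] ^= 1                          # corrupt one bit in transit
assert decode(word) == [1, 0, 1, 1]   # the decoder recovers the original data
```

The syndrome trick is why the parity bits sit at positions 1, 2 and 4: read as a binary number, the three failed parity checks spell out exactly which position was flipped.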
Hamming codes can also be used to detect errors. Used purely as a detector, any Hamming code can identify that two bits have been flipped (two errors) and, if a system is aware that errors have occurred, the information can be retransmitted.
“This is the way the TCP/IP protocol works on the internet, although it uses a different kind of code for stronger error detection capability,” Moon says.
“In exploring the properties of these codes, Hamming used some geometric analogies to talk about ‘distance’ between codewords. Comparing codewords, it is clear that every code word differs from any other code word in at least three bit positions. This measure of distance is referred to as the Hamming distance, due to Hamming’s original use of the term.”
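The minimum-distance property Moon describes is easy to verify by brute force. A small sketch, reusing the same assumed (7,4) parity layout as above, enumerates all 16 codewords and confirms that every pair differs in at least three bit positions:

```python
from itertools import product

def encode(d1, d2, d3, d4):
    # (7,4) Hamming codeword, assuming parity bits at positions 1, 2 and 4
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return (p1, p2, d1, p3, d2, d3, d4)

def hamming_distance(a, b):
    # count the bit positions in which the two words differ
    return sum(x != y for x, y in zip(a, b))

codewords = [encode(*bits) for bits in product([0, 1], repeat=4)]
min_dist = min(hamming_distance(a, b)
               for a in codewords for b in codewords if a != b)
print(min_dist)  # minimum distance 3: any two codewords differ in >= 3 bits
```

A minimum distance of 3 is exactly what single-error correction requires: flip one bit of any codeword and the result is still closer to that codeword than to any other.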
As the theory of error correcting codes developed, various other attributes and families of codes were discovered, including classes of codes called “perfect” codes and “cyclic” codes. Hamming codes belong to both classes.
MacKay says the Hamming code was the first discovery in an immense field called coding theory and was also a rather unusual foundation stone for the field because it is a "perfect" code: a code whose codewords are the centres of hyperspheres that perfectly fill the space of all strings, without overlapping.
“Perfect codes are remarkably rare objects. See page 228 of my book which is available free online,” he says.
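MacKay’s sphere-packing picture can be checked directly for the (7,4) code: each of the 16 codewords “owns” itself plus its 7 one-bit-flip neighbours, and 16 × 8 = 128 = 2^7, so the radius-1 spheres tile the 7-bit space with no gaps or overlaps. A brute-force sketch, again assuming the standard parity layout used above:

```python
from itertools import product

def encode(d1, d2, d3, d4):
    # (7,4) Hamming codeword, assuming parity bits at positions 1, 2 and 4
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return (p1, p2, d1, p3, d2, d3, d4)

codewords = {encode(*bits) for bits in product([0, 1], repeat=4)}

# Every one of the 2^7 = 128 binary words lies within Hamming distance 1
# of exactly one codeword -- the defining property of a perfect code.
for word in product([0, 1], repeat=7):
    within = [c for c in codewords
              if sum(a != b for a, b in zip(word, c)) <= 1]
    assert len(within) == 1
```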
- Bell System Technical Journal, Vol 29, 1950 (PDF)
- The IEEE Richard W Hamming Medal
- CIO Blast from the Past: 40 years of Multics, 1969-2009
- SLIDESHOW: CIO Blast from the Past - 40 years of Multics
- SLIDESHOW: CIO Blast from the Past - 110 years of IBM technology
- Professor Todd Moon, Utah State University
- SLIDESHOW: CIO Blast from the Past - 60 years of cryptography
- Professor David J.C. MacKay
- Information Theory, Inference, and Learning Algorithms
- Wikipedia :: Hamming code
- You and Your Research (PDF)
- Wikipedia :: David J.C. MacKay