Bioinformatics and Computational Biology: 30 Years

Image: 1980s DNA sequencing gel, by Mark Carrington

Picture Cambridge in 1981: some colleges had yet to accept women; there was no internet and mobile phones were unknown. To communicate with students one asked a kindly porter to leave a note in their pigeonholes. Submitting a paper required the deciphering of a manuscript by a patient secretary followed by the posting of a large envelope to the offices of the journal. To keep abreast of their chosen field, scientists retired to the library, or more probably pedalled round town to find the department that subscribed to the journal they sought. Biology itself was mostly confined to the lab: the word bioinformatics was not yet in use.

But all this was about to change. The development of DNA sequencing by Fred Sanger (Nobel Prize in Chemistry 1958 and 1980) was soon followed by a need for ways to piece together the fragments of DNA code and analyse the completed puzzles. Rodger Staden and others had started to write software to deal with these new data. Other biologists had become aware of the new type of data and had started to see its implications for evolutionary studies, which later underpinned an extraordinary revolution in fields as diverse as archaeology, cancer research and zoology. The EMBL database of DNA sequences had been founded and had started its exponential growth.

Image courtesy of Dr Mark Carrington, Department of Biochemistry and former Fellow of St John's College: autoradiograph, made in 1988, of a Sanger dideoxy sequencing gel analysing the results of six sequencing reactions. The total number of base pairs determined was around 2,400, and it took approximately eight hours' labour from start to finish. Current third-generation sequencing machines can produce more than 1 billion base pairs of sequence for the same effort. See the sidebar and following pages for more information.

The story continues...