Introduction

Bioinformatics is the field of science in which biology,
computer science, and information technology merge
to form a single discipline. It applies computer and
information science to gather, store, analyze and
integrate biological and genetic information. Over
the past few decades, major advances in the fields
of genetics and other related areas, together with
the development of several new technologies, have
led to an explosive growth of biological information.
The huge amount of data generated by projects such
as the sequencing of the human genome created an urgent
need for ways to store and organize these data, as
well as for programs and specialized tools to view
and analyze them.

The rapidly emerging field of bioinformatics promises
to lead to advances in understanding basic biological
processes, and in turn, advances in the diagnosis,
treatment, and prevention of many genetic diseases.
Bioinformatics has transformed biology from a purely
lab-based science into one that is also an information
science. Increasingly, biological research
projects begin with a scientist in front of a computer
rather than in the lab with a pipette.

A biological database is a large, organized body
of data, usually paired with software designed to
update, query, and retrieve specific sets of the
information it stores. For example, an entry associated
with a nucleotide sequence usually includes the name
of the scientist who deposited the sequence, a description
of the type of molecule (DNA, RNA, protein, and so
on), the name of the organism from which the sequence
was determined, and the date on which the sequence
was deposited, among other details.
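
The entry fields listed above can be pictured as a small record type with a simple query over it. The following is a minimal sketch in Python. The class name `SequenceEntry`, its field names, the helper `query_by_organism`, and the sample data are all hypothetical; they illustrate the idea only and do not reflect the actual record format of GenBank or any other database.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SequenceEntry:
    """One database entry; field names are illustrative assumptions."""
    depositor: str        # scientist who deposited the sequence
    molecule_type: str    # e.g. "DNA", "RNA", "protein"
    organism: str         # organism the sequence was determined from
    deposited_on: date    # date the sequence was deposited
    sequence: str         # the sequence itself


def query_by_organism(entries, organism):
    """Return the entries whose organism name matches, ignoring case."""
    wanted = organism.casefold()
    return [e for e in entries if e.organism.casefold() == wanted]


# Made-up example data, for illustration only.
entries = [
    SequenceEntry("A. Researcher", "DNA", "Homo sapiens", date(2001, 2, 15), "ATGGCC"),
    SequenceEntry("B. Researcher", "RNA", "Mus musculus", date(2003, 7, 1), "AUGGCC"),
]

human_entries = query_by_organism(entries, "homo sapiens")
print([e.depositor for e in human_entries])
```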

To benefit from a database, scientists need full
access to it and must be able to find the information
they need easily. In GenBank, the public sequence
database of the National Center for Biotechnology
Information (NCBI), information can be found in a
variety of ways, from uploading a nucleotide sequence
as a query to running a text search for the name of
the gene of interest.

At NCBI, all biological sequences (DNA, RNA, cDNA
or protein) can be found through a search and retrieval
system called Entrez, which provides access to all
of the integrated NCBI databases. For example,
the Entrez protein database is cross-linked to the
Entrez taxonomy database, allowing scientists to find
taxonomic information for the species from which a
protein sequence was derived. (Taxonomy is the branch
of the natural sciences that deals with the classification
of living organisms.) These databases can also provide
important insights into evolution and the relationships
between different forms of life on Earth, a field
called phylogenetics. Here, scientists can locate a
gene within a nucleotide sequence and search for
similarities between the sequences of the same gene
in different organisms. This is how scientists determine
which genes share an evolutionary past: in general,
the more closely related two organisms are, the more
similar their genes will be.
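
The idea that closer relatives have more similar genes can be illustrated with a simple similarity score. The sketch below computes percent identity, the share of positions at which two sequences carry the same letter. It assumes the sequences are already aligned and of equal length, which real comparisons cannot take for granted; the function name `percent_identity` and the three sequences are made up for illustration and do not come from any database.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two equal-length sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Made-up sequences for a hypothetical gene in three hypothetical species.
reference = "ATGGCCATTGTAATGGGCCGC"
close_relative = "ATGGCCATTGTTATGGGCCGC"    # differs at 1 of 21 positions
distant_relative = "ATGACCATAGTTATGAGCCGA"  # differs at 5 of 21 positions

for name, seq in [("close", close_relative), ("distant", distant_relative)]:
    print(f"{name}: {percent_identity(reference, seq):.1f}% identical")
```

In this toy data the sequence labelled as the close relative scores higher than the distant one, which is the pattern the paragraph above describes.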