An Overview of Biodiversity Informatics

Background for the first meeting of the All Species Project

By	:	Stanley D. Blum
		California Academy of Sciences

First distributed	:	September 18, 2000
This version	:	October 7, 2000

Preface

This overview of biodiversity informatics is strongly focussed on biological systematics and the work conducted in systematics collections (i.e., natural history museums). It does not address observational (monitoring) data, rare and endangered species, ecosystem characterization, threat assessment, and a host of other biodiversity issues.

Introduction

Regardless of what the All Species project sets as its ultimate goals, it is certain that the work will create large quantities of new information. Most people will actually experience the project's results not by coming into contact with newly discovered and described organisms, but by experiencing the new information generated about the organisms – everything from text, to pictures, diagrams, maps, sounds and video.

Ensuring that information flows efficiently, from creation, through analysis, into appropriate outputs, is the essence of biodiversity informatics – the application of information technology to the domain of biodiversity.

This overview of biodiversity informatics describes the main subject areas in biological systematics, their interrelationships, and the most important informatics projects in each area. The subject areas are:

Taxonomic Names and Classification
Taxonomic Character Data: Taxonomic Descriptions, Keys, and Phylogenetic Data
Specimen Data and Species Distributions

In each of these areas, I will discuss:

The nature and uses of the information
How the data are captured and managed
Status of data capture and management within the community (e.g., percent digitized)
Significant projects for compiling data, developing software, etc.

Taxonomic Names and Classification

Nature and Uses of Names and Classification

Biological taxonomy – the scientific names of organisms – provides a global (at least internationally recognized) system of designating natural groups of organisms; i.e., species and higher taxa. Classifications assemble smaller groups into larger groups, and provide a way of making statements about or retrieving information about many species at a time. One of the first steps in communicating the discovery of a new kind of organism is to give it a name, and to infer what it is by classifying it – i.e., saying that it is a particular kind of something more general and perhaps already familiar.

Examples of taxonomic names

Gorilla gorilla: the name of a species.

Scombridae: the name of a family of fishes (tunas, mackerels, etc.). The family contains 15 genera and 49 valid/accepted species; 212 species names are "available" for these 49 species, so 212 - 49 = 163 of those names are synonyms.

– (From the ITIS database)

An example of a species in a classification
Kingdom Animalia (animals)
  Phylum Chordata (chordates)
    Subphylum Vertebrata (vertebrates)
      Class Mammalia (mammals)
        Order Primates (primates)
          Family Hominidae (man-like primates)
            Genus Gorilla
              Species Gorilla gorilla
                
– (From the ITIS database)
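
As an illustration only, the classification above can be sketched as a chain of records in which each taxon points to its parent. The class name Taxon and the helper functions are assumptions made for this sketch; they are not taken from ITIS or any other database. The example shows how a classification lets one ask whether a species belongs to a larger group.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Taxon:
    """One node in a classification (hypothetical structure for illustration)."""
    rank: str
    name: str
    common_name: Optional[str] = None
    parent: Optional["Taxon"] = None

    def lineage(self) -> List["Taxon"]:
        """Return the taxa from the highest rank down to this one."""
        chain = []
        node = self
        while node is not None:
            chain.append(node)
            node = node.parent
        return list(reversed(chain))


def build_chain(rows) -> Taxon:
    """Link (rank, name, common name) rows, highest rank first; return the last one."""
    parent = None
    for rank, name, common in rows:
        parent = Taxon(rank, name, common, parent)
    return parent


def belongs_to(taxon: Taxon, group_name: str) -> bool:
    """True if group_name appears anywhere in the taxon's lineage."""
    return any(t.name == group_name for t in taxon.lineage())


# Rows copied from the classification example in the text.
ROWS = [
    ("Kingdom", "Animalia", "animals"),
    ("Phylum", "Chordata", "chordates"),
    ("Subphylum", "Vertebrata", "vertebrates"),
    ("Class", "Mammalia", "mammals"),
    ("Order", "Primates", "primates"),
    ("Family", "Hominidae", "man-like primates"),
    ("Genus", "Gorilla", None),
    ("Species", "Gorilla gorilla", None),
]

gorilla = build_chain(ROWS)

if __name__ == "__main__":
    # Print the lineage with two spaces of indentation per level.
    for depth, taxon in enumerate(gorilla.lineage()):
        print("  " * depth + f"{taxon.rank} {taxon.name}")
    print(belongs_to(gorilla, "Primates"))
```

Storing only a parent link keeps each record small, and statements about a higher taxon (for example, Primates) can be applied to every species whose lineage contains it.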

Defining a group of organisms (taxon) and naming it are conceptually distinct operations. Determining that a taxon is biologically meaningful or real is a question of scientific judgement and is subject to refutation. Determining what name should be applied to it is a matter of following the rules set forth in the relevant international code of nomenclature. (There are three: one for microbiology, another for botany, and a third for zoology.)

Taxonomic names are created and put into use via publication. There are no requirements of scientific veracity, reasonableness, or qualifications of the author for a name to be effectively published and admitted into the universe of discourse. Once a name is published it has to be dealt with in subsequent works.

Scope of nomenclatural and classification data

There are an estimated 1.5 to 2 million known species. There are somewhere between one and two synonyms for every valid/accepted species (in addition to the valid name). Compiling a list of scientific names for a major group takes years of effort. For the resulting list to represent real progress – something that doesn't have to be done again – each original publication must be reviewed (at least briefly) by a taxonomist and the decisions documented with supplemental information. Data gathered along with the name typically include the bibliographic reference, author(s), and date of publication. Additional information may include references to type specimens (institution and catalog number), type locality, and references to subsequent taxonomically significant publications.
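
To make the scope concrete, the following minimal sketch shows one possible record layout for the data gathered with each name, plus the rough total number of names implied by the estimates above. The field names, the class NameRecord, and the function estimate_total_names are assumptions for illustration, and the sample record uses placeholder values, not real nomenclatural data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NameRecord:
    """Data typically gathered with a published name (hypothetical layout)."""
    scientific_name: str
    authors: List[str]
    year: int                              # date of publication
    reference: str                         # bibliographic reference
    type_specimen: Optional[str] = None    # institution and catalog number
    type_locality: Optional[str] = None
    later_references: List[str] = field(default_factory=list)


def estimate_total_names(species: int, synonyms_per_species: int) -> int:
    """Each species has one valid name plus its synonyms."""
    return species * (1 + synonyms_per_species)


# Bounds taken from the estimates in the text:
# 1.5 to 2 million species, one to two synonyms per species.
LOW = estimate_total_names(1_500_000, 1)
HIGH = estimate_total_names(2_000_000, 2)

# Placeholder record; every value here is invented for illustration.
example = NameRecord(
    scientific_name="Aus bus",
    authors=["Placeholder Author"],
    year=1900,
    reference="Placeholder Journal 1: 1-10",
)

if __name__ == "__main__":
    print(f"Implied number of names: {LOW:,} to {HIGH:,}")
    print(example)
```

Under these assumptions the figures imply roughly 3 to 6 million names to be reviewed, each needing at least the reference, author, and date fields filled in.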

Significant projects

Projects that aim to compile taxonomic databases fall into two major categories: nomenclators and checklists. A nomenclator is a compilation of all relevant names, but does not present opinions about which taxa are accepted or valid. A checklist represents determinations about which taxa are accepted or v
