Aquatic ape theory and speech origins: a hypothesis
Marc J. M. Verhaegen
Speculations in Science and
Technology 11, 165-171 (1988)
Received: March 1987
Abstract - The question of speech origins is
discussed in the light of the theory that humans had semi-aquatic hominid
ancestors. Diving requires a special anatomy of the airway entrances and a very
refined control of breathing. The brain structures that "voluntarily"
controlled the airway entrances’ closure and breathing could also be used for
elaborating the older (early hominid, perhaps gibbon-like) sound production. Later,
the evolution of association areas in the brain greatly enhanced human ability
for attaching a particular meaning to a conventional sound combination.
The aquatic ape theory
(AAT) of Sir Alister Hardy (1) states that a few million years ago human
ancestors spent a considerable part of their day swimming and diving in a
river, lake or sea, and, at least partially, consumed aquatic food. The AAT is
supported by the presence of our thick subcutaneous fat layers, by our lack of
body hair and by several other features that are absent in non-human primates,
but widespread among aquatic mammals (1-13).
The ability to speak is a
uniquely human characteristic. Innumerable attempts to explain it have been
made but the question of how language emerged is not yet solved. Recently, it
has been suggested that the origin of speech was facilitated by our aquatic
past (5,14). All aquatic mammals
"voluntarily" control their breathing. When surfaced they open the
airway passage whenever they want to inhale air, and they can hyperventilate
and then close the airway passage when they intend to dive. The subtle
"voluntary" control of breathing and airway closure in mammals in
general is a pre-adaptation for speech (15,16).
All of this is very
speculative but interesting enough to elaborate the hypothesis of the aquatic
origin of speech and to propose a possible scenario for speech emergence. Schematically,
I discern four, more or less distinct, phases in speech evolution, based on the
supposed evolutionary sequence of the speech centres.
Phase I - Gibbon-like
song (the tropical forest phase of hominoid evolution)
Presumably, the earliest
hominoids were tree-living creatures that were smaller than modern pongids and
hominids, and therefore probably had smaller home ranges (17). There are
reasons to believe that they produced gibbon-like territorial songs. Each pair of
gibbons produces its own duet, a rather stereotyped melody (18) that is
recognised by its neighbours as belonging to that pair. This is also seen in
other monogamous arboreal vertebrates, especially tropical birds (19,20) and some small and medium-sized primates (21). In the
forest canopy, pure tones seem to carry further than sounds without pure tones
(21). Music, a rhythmic arrangement of pure tones, often grouped in alternating
"voices", can powerfully affect our emotions. It might be a rudiment
of a territorial and pair- or group-binding mechanism, as in the case of
national anthems, hymns, love songs, etc., and could have its origin in
early hominoid territorial song.
If this is so, it seems to
imply that the earliest hominoids were monogamous. That is not impossible,
since all three (remarkably different) social group types of the great apes
(orangs are solitary, gorillas live in harem groups, common chimps live in
territories defended by related males, see ref. 22, pp.25 and 36) are easily
derivable from ancestral monogamy.
In that case, the great
apes must have lost the more musical gibbon-like utterances. The adult males
develop large laryngeal air sacs. They produce loud, low-frequency calls, which
are probably better suited to their life-style where they are no longer
monogamous and have larger home ranges, than the more varied and more musical
gibbon-like songs (see ref. 22, p.207).
Gibbon songs are mainly
generated by the vocal cords, although the vocal tract also probably plays an
important role (as in birds, see ref. 23). It is not known which brain
structures the gibbons use for song generation and analysis. The early
hominoids could use subcortical, limbic, extra-pyramidal and/or peri-Sylvian
structures. Great apes show slight Sylvian enlargements of the left auditory
cortex. Probably, the early hominoids did not use the "voluntary"
Area 4 (see Phase II) much in song generation. As an illustration, many
patients with paralytic strokes and right hemiplegia (left Area 4 lesion)
cannot speak but can swear or even sing.
Phase II - Voluntary
breathing and airway closure (the semi-aquatic phase)
All animals can
"voluntarily" open the mouth to bite. "Voluntary" means, in
this context, controlled by the primary motor projection cortex, i.e. in
primates, the precentral Area 4 of the cerebral cortex (see Figure 1). Aquatic
mammals can close the airway entrances much more completely than land mammals,
thus avoiding being drowned by water entering the lungs, and they have a very
refined voluntary control of mouth, nose and throat passages.
Modern man has a very
special anatomy of the airway entrances that is not incompatible with a
previous semi-aquatic lifestyle. He has a smaller mouth which can be closed
more efficiently (24) and, presumably, the wet mucosa of our fleshy lips allows
a better fit than the dry skin of the lips of non-human primates. In other
primates, the tongue is generally flatter and somewhat less mobile than in
humans (ref. 16, p.625). Our nasal cavity is elongated by an external nose
(ref. 25, Figure 159) and narrowed by strongly developed inferior conchae,
which can cause even complete obstruction in some humans (11,26,29). The
nasal cavity can be disconnected from the throat by muscles that raise the
velum (probably also in apes) (5). In human adults, the larynx is placed more
caudally than in non-human mammals (15,25,27, for a
possible explanation, see ref. 14). We have a larynx that is much more mobile
than that of apes (28). Moreover, humans, like aquatic mammals, can breathe
voluntarily; they can hold their breath, though this is never necessary on
land, and can hyperventilate whenever they want to (e.g. before diving,
see ref. 29). Humans, in contrast with most terrestrial mammals (ref. 25,
Figure 25), can voluntarily breathe through the mouth, possibly facilitated by
the laryngeal descent (16,25), which could be an adaptation to enable rapid
inhalation or exhalation of a large amount of air before or just after a dive, the nose
passage alone being too small.
Our primary motor projection
cortex is much larger than that of apes, mostly due to the expansion of the
areas for the musculature of mouth, throat and
breathing, i.e. the latero-inferior section of Area 4 (see Figure 1). Just
in front of that enlarged Area 4 lies Broca’s Area. It
is a typically human structure indispensable for speech generation, and can be
distinguished histologically from all other human cortical areas (ref. 30,
pp.5-12). In present-day man, Broca’s Area coordinates the activities of the
latero-inferior section of Area 4, in order to produce the right sound at the
right time. Broca’s Area (or the first Broca-like structure) originates in my
theory’s Phase II to coordinate the muscles commanded by the enlarged Area 4 to
make the right airway muscle contract at the right moment: just before, during
or just after a dive.
Figure 1. Lateral view of the left human cerebral cortex. After Chusid (30), Geschwind (33) and Thompson (39).
Phase III - Voluntary
sound production
The varied but merely
emotional sound production (Phase I) combined with the voluntary control of the
airway musculature (Phase II) predisposed to a voluntary sound production that
could be extremely varied. When our ancestors returned to a wholly terrestrial
habitat, airway control for diving became superfluous and the refined airway
musculature could be used exclusively for improving vocalisation. The sounds
generated by the vocal cords could be strongly modified and diversified by
contracting certain muscles in the lips, tongue, velum, pharynx and larynx,
governed by the neocortex of Area 4 and Broca’s area. In order to use the
voluntary airway control for the vocal apparatus, our ancestor must have been
able to register and interpret his own sound production (feedback, cf. motor
theory of speech production) (31,32). This was
certainly improved by the evolution of the arcuate fasciculus (see
Figure 1), a typically human neural pathway between Broca’s Area and Wernicke’s Area (33). Wernicke’s Area,
a primary language area used for decoding spoken language, lies immediately
dorsal to the primary auditory receptive area, and to the postcentral principal
sensory areas for mouth and throat (see Phase IV). In Wernicke’s
Area, connections could be made with other, nearby neocortical areas
(especially the auditory, visual and sensory areas, see Figure 1), and the
sound or certain combinations of sounds (words) could be associated with
something that our ancestor was aware of (hearing, seeing, feeling, doing) at the
same moment. The first "words" could be an extension or an
abbreviation of one’s own melody or an imitation of somebody else’s territorial
song or group song, of weeping, crying, laughter, panting, etc., or of
natural phenomena, like branch cracking, animal calls, etc. Later (in
Phase IV?), a fixed "word" order may have become established by
custom (e.g. the actor before the action: subject/verb), and fusion of words
that often followed each other created conjugation, inflection and new words.
Phase IV - Association
areas and thought
Compared with a
chimpanzee’s brain, our association areas are enormously large. These areas are
found in the temporal, preoccipital, parietal and inferior frontal lobes (see
Figure 1). The cortex of these areas can be distinguished histologically from
the other cortical areas and even from Broca’s Area (ref. 30, pp.5-12). This
suggests that Broca’s Area and the association areas evolved separately
(respectively in Phases II and IV?). In my interpretation, most association
areas evolved after the breathing and air-holding function of the enlarged Area
4 and Broca’s Area had been integrated with sound generation (Phase III). The
new association areas amplified the possible applications of the
sound-producing apparatus. They acted as the hardware of the computer, whereas
the sound analysing and producing areas acted as the input/output apparatus. The
particular "language" was the software.
There are indications, I
think, that our ancestors returned to a more terrestrial habitat not earlier
than two million years ago (in a cooler and drier period of the Pleistocene?
see ref. 11). In the hominid fossil record, the great expansion of the
association areas seems to begin about two million years ago, with the genus Homo
(34,35). The limited brain enlargement of Homo
habilis could correspond broadly with the enlargement of Area 4, Broca’s
area (34), the arcuate fasciculus and Wernicke’s Area (already in Phase III?);
that of Homo erectus with a further association cortex enlargement. This
would mean that some sort of speech is much older than one million years. The
oldest "languages" could have been tonally different, i.e.
more musical than today’s languages. Even today, intonation is indispensable in
normal speech, and perhaps half of the world’s languages are tonal (cf.
Phase I). The relatively small size of the brain of the australopithecines
(possibly without a real Area of Broca (34)) could be explained by their
dwelling or having dwelt in inland semi-aquatic habitats (e.g. gallery
forests), and not in littoral habitats (11). If early Homo lived at the
sea coasts, he had to dive deeper and longer than his freshwater cousins, so
the voluntary control of his airway muscles became more important. Brain
enlargement is a striking feature of many cetaceans. Conceivably, the support
of the body (and brain) weight by the surrounding water allowed sea
mammals to obtain large brains (for echo-location or for whatever
"purpose"), because of the weakened necessity of brain
miniaturisation in an aquatic environment (e.g. the neurone density in the
brain of a baleen whale is more than 100 times less than that in the brain of a
wren, see ref. 20, p. 119). On this view, it seems possible that our
semi-aquatic life lasted as long as the brain enlargement in Homo, i.e.
until less than one million years ago.
Conclusions
Most authors discussing
language origins try to explain our speech capacities by an enormous
amplification of vocalising abilities that already existed in rudimentary forms
in pre-human primates (36,37,38) but fail to explain how exactly this could
have occurred. In my view, most of these problems are readily solved by the
application of the aquatic theory to the evolution of the vocal and breathing
apparatus. For instance, a simultaneous emergence of Broca’s Area, the arcuate
fasciculus and Wernicke’s Area seems highly improbable
evolutionarily, and yet in the traditional view, each of these structures has
no function without the other two. However, in an aquatic phase, a Broca-like
structure alone could be used as a coordination centre for controlling
breathing and airway closure, and only later did the arcuate fasciculus arise,
which in turn induced Wernicke’s Area. The traditional view
has the same difficulties in explaining laryngeal descent, etc.
Concerning the relation of
language and thought, I assume that a simpler (non-verbal) sort of thinking
already existed in our pre-aquatic ancestors, but the great unfolding of human
cognitive abilities became possible only after the acquisition of proper input/output
organs for the brain. Hence, our great communicational capacities may not have
evolved thanks to our large brain; rather the opposite seems true: large
association areas only became usable with our voluntary sound
production.
Acknowledgements
I wish to thank Mrs E.
Morgan, Dr J. Wind, Dr P. van Cauwenberge, Professor M. LeMay
and Professor D. Falk for discussions, corrections and help.
References
1
Hardy, A. C., "Was man more aquatic in the past?",
New Scientist, 7, 642-645 (1960).
2
Morris, D., The Naked Ape.
3
Morris, D., Manwatching.
4
Morgan, E., The Descent of Woman. Souvenir Press,
5
Morgan, E., The Aquatic Ape. Souvenir Press,
6 Morgan, E., "The aquatic
hypothesis", New Scientist, 1405, 17 (1984).
7
Morgan, E., "Sweaty old man and the sea", New Scientist, 1448, 27-28 (1985).
8 Morgan, E., "Lucy’s child", New
Scientist, 1540, 13-15 (1986).
9
Cunnane, S. C., "The aquatic ape theory reconsidered", Medical
Hypotheses, 6, 49-58 (1980).
10
Gribbin, J. and Cherfas, J., The Monkey Puzzle. Paladin,
11
Verhaegen, M. J. B., "The aquatic ape theory: evidence and a possible
scenario", Medical Hypotheses, 16, 16-32 (1985).
12 Verhaegen, M. J. B., "Origin of hominid
bipedalism", Nature, 325, 305-306 (1987).
13 Ellis, D. V., "Proboscis monkey and
aquatic theory", Sarawak Museum J., 57, 251-262 (1986).
14
Morgan, E. and Verhaegen, M., "In the beginning was the water", New
Scientist, 1498, 62-63 (1986).
15
Wind, J., On the Phylogeny and Ontogeny of the
Human Larynx. Wolters-Noordhoff.
16 Wind, J., "Phylogeny of the human vocal
tract", Annals N. Y. Acad. Sci., 280, 612-630 (1976).
17
Clutton-Brock, T.H. and
18
Brockelman, W.Y. and Schilling, D., "Inheritance of stereotyped gibbon
calls". Nature, 312, 634-636 (1984).
19 Thorpe, W. H., "Duet-singing
birds", Scient. Am., 229, 70-79 (1973).
20
Chauvin, R., La Biologie de l’Esprit.
21 Haimoff, E. H., "Convergence of
duetting of monogamous
22
Chalmers, N., Social Behaviour in Primates.
23
Nowicki, S., "Vocal tract resonances in oscine bird sound production:
evidence from bird songs in a helium atmosphere", Nature, 325, 53-55
(1987).
24 Hockett, C. F., "The foundations of language in man,
the small-mouthed animal", Scient. Am., 217, 141-144 (1967).
25
Negus, V., The Comparative Anatomy of the
Larynx. W. Heinemann Medical Books,
26 Cauwenberge, P. van, "Clinical use of
rhinomanometry in children", Internat. J. Ped. ORL, 8, 163-175
(1984).
27
Laitman, J. T., "Evolution of the hominid upper respiratory tract: the
fossil evidence", in Tobias, P. V. (Editor), Hominid Evolution, pp.281-286.
A.
28
Fink, B. R. and Frederickson, E. L., "Laryngeal preadaptation to
articulated language", in Chivers, D. J. and Joysey, K. A. (Editors), Recent
Advances in Primatology, Volume 3, pp. 93-95. Academic
Press,
29 Verhaegen, M. J. B., "The aquatic ape
theory and some common diseases", Medical Hypotheses, 24, 293-300
(1987).
30 Chusid, J. G., Correlative Neuroanatomy
and Functional Neurology. Lange Medical Publications,
31
Williams, H. and Nottebohm, F., "Auditory responses in avian vocal motor neurons:
a motor theory for song perception in birds", Science, 229, 279-282
(1985).
32 Kelly, D. B., "A motor theory
of song perception", Trends Neurosci., 9, 149-150 (1986).
33 Geschwind, N., "Language and the
brain", Scient. Am., 226, 76-83
(1972).
34 Falk, D., "Cerebral cortices of East
African early hominids", Science, 221, 1072-1074 (1983).
35 Yellen, J.E., "The longest human
record", Nature, 322, 774 (1986).
36
Wind, J., "Fossil evidence for primate vocalisations?",
in Chivers, D. J. and Joysey, K. A. (Editors), Recent Advances in Primatology,
Volume 3, pp. 87-91. Academic Press,
37 Lieberman, P., "On the evolution of
human syntactic ability", J. Hum. Evol., 14, 657-668 (1985).
38
Steklis, H. D., "Primate communication,
comparative neurology, and the origin of language re-examined", J. Hum.
Evol., 14, 157-173 (1985).
39 Thompson, R. F., Introduction to
Physiological Psychology. Harper International Edition,