Linguistics can be defined as the scientific study of language. Linguistic elements usually form sub-systems that operate as complex components of the whole language system. Each of these sub-systems has a definite structure and can be approached separately from the others. Describing a language system in full detail is an extensive, arguably impossible, task. This is why linguistic studies are generally limited to one of these levels. This restriction has given rise to specialized branches (disciplines) of linguistics, such as grammar or phonetics.
Two linguistic disciplines, phonetics and phonology, deal with the same subject matter. Phonetics deals with concrete physiological, physical and psychological factors, describing how speech sounds are produced, transmitted and received by man. It is by nature descriptive and classificatory, since it deals with the observation, measurement and description of the sounds of human speech. Phonology deals with such immaterial factors as the rules by which sounds are used to give rise to meaningful units in a particular language. It is particular and functional, which is why it is sometimes called functional phonetics. The term phonetics should be used for the science concerned with speech sounds in general, whereas phonology should be kept for the branch of that science dealing specifically with the distinctive, functional aspect of these sounds. But we will ...
Speech sounds are vibrations in the air caused by our speech apparatus. These perturbations are propagated through the atmosphere around the speaker and picked up by the listener’s ear. They are then converted into neural impulses, which travel to the brain and are linguistically interpreted there.
The phonetician studies speech sounds from different points of view:
There are other types of phonetics (historical, experimental, etc.) but this course is not concerned with them.
Through the years, phonetics has had a number of applications. The writing of alphabets for those languages that lack one and the treatment of speech defects are only two of these applications. In the teaching of foreign languages in particular, phonetics has been used to describe sounds, to compare the sound systems of the foreign language and the students’ mother tongue, to describe, explain and correct pronunciation errors, and to formulate rules for the conscious assimilation of the foreign sound system.
As we have already seen, speech sounds may be studied from three different points of view. Accordingly, phonetics may be divided into three branches: articulatory, acoustic and auditory phonetics.
Although acoustic phonetics is gaining ground and advancements in psychology and auditory phonetics are being made, classical articulatory phonetics is still the basic source of knowledge in most phonetics courses. Ours will not be an exception to this tradition. We will provide students with essential terms in acoustic phonetics and offer details inasmuch as they yield a detailed description of the physical correlates of phonemic oppositions. By the same token, we will only make specifications related to hearing in order to find answers to problems in the learning of foreign languages.
There are three main factors in the production of speech sounds:
The process is very much the same as that which occurs when we play a guitar or a drum. The atmosphere surrounding the instrument would be the raw material in both cases. The source of power would also be the same: the hands of the artist. In a guitar, the modifying agent would be the strings arrested by one hand and plucked by the other; they would agitate the air and put it into vibration. Besides, there is the body of the guitar, which resonates the sound produced by the vibrating strings. In a drum, the modifying agent would be the skin or drumhead, which moves back and forth rapidly, setting the air into vibration. Besides, there is the cylinder of the drum that acts as a resonant cavity. The resonant chambers in our body are the pharynx, the mouth and the nasal cavity. These three organs can be compared to an F made with tubes. The vertical tube would correspond to the pharynx; the upper horizontal tube, to the nasal cavity; and the lower horizontal one, to the mouth.
Pulmonic air moves through the laryngeal, pharyngeal, oral and, potentially, the nasal structures, and its flow is modified by the various articulatory movements at these areas which create various resonance configurations. These in turn account for the particular kinds of air turbulence that we perceptually associate with speech sounds.
We can also speak of three phases of speech production:
When man speaks, he makes use of organs such as the lungs, the lips and the teeth, whose primary function is not connected with oral communication. From the evolutionary point of view, the breathing and eating functions have been adapted to speech production. This is why man makes use of the respiratory apparatus and part of the digestive apparatus, occasionally and in a subordinate manner, to produce speech.
(The speech cycle or speech chain): Speech is a linguistically significant modification of air coming in or going out of the body through the mouth and/or nasal passages. The modification is essentially a very simple process of varying the degree of opening between some part(s) of the lower jaw and some part(s) of the upper jaw at some point(s) along the line from the glottis to the lips.
The oral cavity can be modified by moving various parts of the lower jaw (particularly parts of the tongue) towards various parts of the upper jaw. What is happening here is that the movement of the tongue towards the upper jaw divides the oral cavity into two —a cavity behind the tip and a cavity in front of the tip. The air modified by the various states of the glottis ("modified air", "air having frequency/frequencies", etc.) is affected further by this division of the oral cavity. The air is moving and "bounces around" in these cavities. Vibrating air resonates and the resonance depends on factors such as the size and rigidity of the cavity (resonance chamber). If the air can pass into the nasal cavity (eventually escaping into the atmosphere through the nostrils) we will have resonance in the nasal cavity.
Let us analyze in more detail the organs of speech and their function in the process of phonation.
1. The lungs can be compared to two floppy balls made of soft sponge-like material. They can be squeezed by the action of the diaphragm and the muscles between the ribs, which causes the air to flow out. It is exhaled air, made to move by our respiratory system, that we use to produce Spanish and English sounds.
2. The trachea or windpipe: Main trunk or passage through which the air passes to and from the lungs.
3. The larynx: Modified upper part of the trachea containing the vocal cords. The various muscles of the larynx can cause changes in tenseness. Variation in length, thickness and tension in this area accounts for the variation in human laryngeal sound production.
4. The vocal cords. Two folds of tissue running along the sides of the larynx from front to back. The opening between them is known as the glottis. Their main action is to function as vibrators in the production of voice. They can be made thick or thin by the action of various muscles: thick and flabby vocal cords give us low-pitched voice; thin and tense, high-pitched voice. Sex, age and hormonal changes in the body affect the vocal cords and, therefore, voice quality.
The vocal cords can take several positions. Different types of sounds are thus produced. For example, when the vocal cords are wide open, leaving space for the air to pass through without any obstruction, consonants such as /p/, /tS/ and /f/ are produced. This is also the position for breathing. When they are brought together in such a way that the air forces its way through them, voice is produced, as in the case of vowels and consonants like /b/, /d/, /v/, /m/ or /l/.
Let us assume that the glottis is closed, i.e., the vocal folds have been pulled together. Air coming from the lungs now meets resistance at the closed glottis, and pressure builds up until it forces the vocal folds apart. If this latter state continued, all we would have would be a continuous escape of air. However, a physical principle called the "Bernoulli effect" comes into play here: when a moving column of gas or liquid passes between two elastic bodies, or between one elastic and one rigid body, there is a drop in pressure between the bodies. This drop in pressure, together with the elasticity of the folds, draws the vocal folds back to their previous position (a closed glottis state). The build-up then takes place again, followed by the opening of the glottis, the Bernoulli effect and another closure. Each time this happens, a puff of air escapes, and it is these air disturbances that we perceive as "voice" as they are propagated through the atmosphere around us. The faster the stream of puffs, the higher the pitch; the slower the stream, the lower the pitch. Large puffs are perceived as louder than small puffs.
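The relation between the rate of puffs and pitch can be illustrated with a minimal sketch. The function name and the two puff trains below are hypothetical and chosen only for illustration; they are not measurements of real voices.

```python
def pitch_from_puffs(puff_times):
    """Estimate pitch in Hz as puffs per second, given puff times in seconds."""
    if len(puff_times) < 2:
        raise ValueError("need at least two puffs to measure an interval")
    # Time elapsed between each puff and the next one.
    intervals = [later - earlier for earlier, later in zip(puff_times, puff_times[1:])]
    mean_period = sum(intervals) / len(intervals)
    # Pitch is the inverse of the average time between puffs.
    return 1.0 / mean_period


# Hypothetical puff trains: one puff every 10 ms versus one puff every 5 ms.
slow_voice = [i * 0.010 for i in range(11)]
fast_voice = [i * 0.005 for i in range(21)]

print(round(pitch_from_puffs(slow_voice)))  # slower stream of puffs, lower pitch
print(round(pitch_from_puffs(fast_voice)))  # faster stream of puffs, higher pitch
```

Halving the time between puffs doubles the estimated pitch, which is the relationship described above.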
5. The pharyngeal cavity. Passage extending from the top of the larynx to the region in the rear of the soft palate or velum. It is a muscular cylinder which can change its form and volume depending on the vibrations it should reinforce.
6. The soft palate. Flap of soft muscular tissue from which a small, fleshy lobe called the uvula hangs down. It can take up two positions: raised or lowered.
a) When it is raised, it touches the pharyngeal wall and blocks off the entrance into the nasal cavity above, so the air cannot escape through the nasal cavity. Therefore, the only way out left for the lung air is through the oral cavity. This is the position of oral sounds such as /p/, /s/ or /i/.
b) When it is lowered, the air can escape either entirely through the nose (if the mouth cavity is closed), producing nasal sounds such as /m/, /n/ and /N/; or through the nasal and mouth cavities, giving rise to nasalized sounds (as in French un bon vin blanc).
7. The hard palate. Bony arch following the soft palate or velum. It is also known as the roof of the mouth.
8. The alveolar ridge, teeth ridge or gum ridge. It can be felt behind the upper front teeth. It is the inner surface of the gums.
9. The upper and lower teeth.
10. The upper and lower lips. The lips are very important organs since the different positions they can adopt change the modulation of sounds by shaping the vocal cavity.
11. The jaws. The lower jaw is more important than the upper one because it is movable.
12. The tongue. It is the most important organ of speech since it is capable of making many movements and of modifying the air stream in numerous ways, playing a major role in the production of both consonantal and vocalic sounds. Although it does not show obvious sections, phoneticians make several divisions of it. For our purposes it will suffice to divide it arbitrarily into:
Tip: The tongue’s extremity.
Front: Part lying opposite the hard palate.
Back: Part lying opposite the soft palate.
Center: Part between the front and the back.
After they are produced and before they reach our ears, speech sounds travel through the air.
Specialists have become aware of the constraints imposed on speech studies by the vocal mechanism and have resorted to acoustic parameters with a definite and rigorous relation to articulation.
Computer programs provide routines that allow data to be displayed on an oscilloscope or punched out on paper tape for later use. A basic problem in speech analysis is to extract the phonetically significant aspects from the mass of data available in an acoustic spectrum. An additional advantage of these programs is their built-in error-checking routines, which evaluate certain properties of a given speaker, such as the approximate range of variation of his or her formant frequencies, so that no error is made in the extraction of a particular parameter and no important data are discarded.
Let us have a look at a few relevant elements in the acoustic analysis of speech sounds and at their relationship with articulatory and auditory correlates.
Amplitude. Distance from the highest or lowest point of the wave to the mean (rest) position. It is the main physical counterpart of the impressionistic notion of LOUDNESS and is measured in decibels (dB) or, as sound pressure, in dynes per square centimetre. The greater the amplitude, the louder the sound.
Frequency. Number of complete oscillations (cycles) in a given time. It is the physical counterpart of PITCH (or tone) and is measured in cycles per second (cps), also called hertz (Hz). The higher the frequency, the higher the pitch.
Duration. It depends on the speed of the utterance and is the physical counterpart of LENGTH.
Voice quality. It depends largely on the anatomical configuration of an individual's organs of speech, and it permits us to distinguish one person from another. The size of the chest, larynx, oral and nasal cavities; the resonance characteristics of the teeth, tongue, cheeks, etc.; even the hardness of the head bones: they all contribute to the typical voice quality of an individual.
Sound quality. It results from the joint effect of the vibrating body and the resonator when a speech sound is produced. It is achieved by the shapes the individual gives to his or her resonators above the larynx (pharynx, mouth, nasal cavity) when pronouncing a specific sound.
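The notions of amplitude and frequency can be made concrete with a synthetic pure tone. The following is a minimal sketch, not a model of real speech: the sample rate of 8000 samples per second, the reference amplitude of 1.0 used for the decibel scale, and all function names are assumptions made for the example.

```python
import math


def sine_wave(amplitude, frequency_hz, duration_s, sample_rate=8000):
    """Generate a pure tone as a list of samples (sample_rate is an assumption)."""
    n = int(duration_s * sample_rate)
    return [amplitude * math.sin(2 * math.pi * frequency_hz * i / sample_rate)
            for i in range(n)]


def peak_amplitude(samples):
    """Largest distance of the wave from its mean (zero) position."""
    return max(abs(s) for s in samples)


def estimate_frequency(samples, sample_rate=8000):
    """Count upward zero crossings (one per cycle) and divide by the duration."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings / (len(samples) / sample_rate)


def level_db(amplitude, reference=1.0):
    """Level in decibels relative to an arbitrary reference amplitude."""
    return 20 * math.log10(amplitude / reference)


tone = sine_wave(amplitude=0.5, frequency_hz=200, duration_s=1.0)
print(peak_amplitude(tone))      # close to the amplitude used to build the tone
print(estimate_frequency(tone))  # close to 200 cycles per second
print(level_db(1.0) - level_db(0.5))  # halving the amplitude lowers the level by about 6 dB
```

Counting zero crossings works only for a simple tone like this one; speech sounds are complex waves and need the kind of analysis described below.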
The acoustic structure of sounds can be visualized by means of equipment such as oscillographs, spectrographs and computers. Regardless of their components and operational peculiarities, what these machines basically do is pass sounds through filters covering different frequencies and produce a visual record of them, with time represented in the horizontal direction, frequency in the vertical direction, and amplitude in the density or darkness of the picture.
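The layout of such a visual record can be sketched in a few lines. This is only an illustration under stated assumptions: a discrete Fourier transform stands in for the bank of filters, the signal is an artificial sequence of two pure tones rather than speech, and the sample rate, frame size and function names are all hypothetical.

```python
import cmath
import math


def tone(freq_hz, n_samples, sample_rate):
    """Artificial pure tone used as test material."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n_samples)]


def dft_magnitudes(frame):
    """Magnitudes of the lower half of the discrete Fourier transform of one frame."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
            for k in range(n // 2)]


def spectrogram(samples, frame_size):
    """Cut the signal into consecutive frames (time axis) and analyse each one."""
    starts = range(0, len(samples) - frame_size + 1, frame_size)
    return [dft_magnitudes(samples[s:s + frame_size]) for s in starts]


def peak_bin(column):
    """Index of the strongest frequency band in one time frame."""
    return max(range(len(column)), key=lambda k: column[k])


def render(spec, shades=" .:*#"):
    """Text picture: time runs left to right, frequency bottom to top, darkness = amplitude."""
    peak = max(max(column) for column in spec) or 1.0
    rows = []
    for k in reversed(range(len(spec[0]))):
        rows.append("".join(
            shades[min(int(col[k] / peak * len(shades)), len(shades) - 1)] for col in spec))
    return "\n".join(rows)


SAMPLE_RATE = 800  # assumed; each of the 16 bands is then 25 Hz wide
signal = tone(100, 160, SAMPLE_RATE) + tone(300, 160, SAMPLE_RATE)
spec = spectrogram(signal, frame_size=32)
print(render(spec))
```

The printed picture shows a dark band low down for the first half of the signal and a dark band higher up for the second half, which is how a change from a low to a high tone would appear on the record.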
With the help of machines, ornithologists have been able to accurately record and analyze the complex time and frequency relations in the songs of birds. Objective, unequivocal analysis of human speech through these machines has also been made by linguists (voice training is one example).
The most important elements of a spectrogram are the formants, which are the frequencies or groups of frequencies that characterize the quality of the sound. They are visible as dark bands of a certain frequency width covering the duration of the sound and sometimes showing continuity to formants of adjacent sounds.
Spectrographic analysis reveals that the formants bend from those positions typical of a vowel to those characteristic of another. It also reveals the way in which there tends to be a merging of units which, linguistically, we treat separately. In the case of /p, t, k/, which involve a complete obstruction of the air-stream and whose release is sometimes characterized acoustically by a relatively brief burst of noise, the transition between this noise and the steady state of the vowel appears to be of prime importance for our recognition of the consonant.
The process of audition begins when the air movements caused by articulation reach the listener’s outer ear. The following is a brief description of what follows.
The outer ear acts like a funnel to pick up these movements of air and direct them down the ear-hole or meatus. This channel is about one inch in length and its surface is covered with small hairs which, together with wax secretions, serve to protect the middle and inner ear against any accumulation of dirt. At the end of the meatus, there is a roundish, thin membrane set somewhat obliquely in a bony frame. This membrane, known as the ear drum or tympanum, is very tightly stretched; when the air impinges on it, it vibrates to the different tones.
The middle ear is an air-containing chamber about 1/6" from side to side. Air is supplied and pressure equilibrium maintained in this small chamber via the Eustachian tube, which is in the anterior part of the middle ear and stretches down to the back of the nasal cavity. Within the middle ear, there is a structure of three articulated small bones or ossicles: the hammer, the anvil and the stirrup. The hammer is attached by its handle to the tympanum near the top and by its head loosely to the anvil, which is, in turn, joined to the tiny head of the stirrup. The base of the stirrup fits into the oval window of the inner ear. This ossicle arrangement transfers vibrations from the tympanum to the inner ear. The articulation of the ossicles serves to reduce the amplitude of the vibrations that the tympanum makes in direct response to the motions of the air.
Vibrations are transmitted in the inner ear through different fluids and membranes until they reach the organ of Corti, where they are converted into nerve impulses that the auditory nerve carries to the brain.
Speech represents one of the most complex activities of the nervous system. Most of our readers have little familiarity with the physiology of the human brain, so we will purposefully leave out explanations concerning "association areas", "localization of functions", etc. and focus only on what happens when a person hears the sounds of a foreign language and on the importance of hearing in pronunciation.
Just as a succession of nerve impulses flows out from the speaker’s brain and the appropriate muscles contract to produce sounds, so, when those sounds reach the listener’s ear drums, they are converted again into nerve impulses that are conducted along his auditory nerves and into his brain.
Our hearing mechanism plays an important role in the monitoring of our own speech. The better we hear the differences between sounds, the better we are able to make them. The more capable we are of making such differences, the better we can hear them, so that training in making different sounds improves our ability to distinguish them.
Perception is the process whereby an individual becomes aware of the surrounding world. It is individual and unique. A person perceives an event in terms of his or her past experiences, present circumstances and motivations. The psychological interpretation of speech sounds performed by a listener consists mainly in selecting from a mass of acoustic material those features which are relevant for the language being spoken.
A listener interprets the sounds of another speaker in terms of his own experience as a speaker. When we hear a foreign language, our brain selects from the mass of acoustic material the information that would be necessary for the perception of our own language, since the perception of speech sounds is conditioned by our accumulated language experience.
Many of the mistakes made by learners of a foreign language arise because the learner imitates a foreign sound incorrectly, having failed to perceive it correctly.
In our "Accuracy" section, we will help readers "hear" the most typical elements of a sound or utterance, which will greatly help them guide both their listening and pronunciation. We will use techniques devised by Verbo-Tonal System advocates, some of which exaggerate or distort a given feature in a sound to bring it out for the student to perceive it. This is done in the hope that... but should not at all be taken as a pattern to follow in normal conversation.
The phonemes of a language are not the sounds its speakers use to communicate among themselves, but groups of features in these sounds, such as the point of articulation, the manner of articulation, the presence or absence of voice, etc. These "bunches" of sound features help differentiate the meaning of morphemes, words, phrases, etc., though they lack semantic content themselves. The English phoneme /p/ means nothing in itself; in the sequence /pIt/ it serves only to differentiate it from /bIt/, /fIt/, /sIt/, etc. The Spanish phoneme /k/ has no meaning in itself; it only differentiates /kasa/ from /tasa/, /pasa/, /masa/ or /asa/.
A phoneme is made up of only those features which help differentiate it from other phonemes (relevant or inherent features): /p/ is different from /b/ because it is voiceless, it is different from /t/ and /k/ because it is bilabial, and different from /m/ because it is voiceless and plosive.
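The idea of a phoneme as a bundle of distinctive features can be sketched as a small table of feature values that is compared pair by pair. The feature names, the values (given here for English) and the function name are illustrative assumptions, limited to the five phonemes mentioned above.

```python
# Illustrative feature bundles for five English consonant phonemes.
FEATURES = {
    "p": {"voicing": "voiceless", "place": "bilabial", "manner": "plosive"},
    "b": {"voicing": "voiced", "place": "bilabial", "manner": "plosive"},
    "t": {"voicing": "voiceless", "place": "alveolar", "manner": "plosive"},
    "k": {"voicing": "voiceless", "place": "velar", "manner": "plosive"},
    "m": {"voicing": "voiced", "place": "bilabial", "manner": "nasal"},
}


def distinguishing_features(first, second):
    """Return the features whose values differ between two phonemes."""
    return {
        name: (FEATURES[first][name], FEATURES[second][name])
        for name in FEATURES[first]
        if FEATURES[first][name] != FEATURES[second][name]
    }


for other in ("b", "t", "k", "m"):
    print("/p/ vs /%s/:" % other, distinguishing_features("p", other))
```

The comparison reproduces the oppositions just described: /p/ differs from /b/ in voicing only, from /t/ and /k/ in place of articulation only, and from /m/ in both voicing and manner.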
On the other hand, those features which do not help differentiate phonemes, but only the different realizations of the same phoneme (its allophonic variants or allophones), are known as non-distinctive or non-relevant features. English /p/ is aspirated in "pole" but it is not aspirated in "speak". Spanish /b/ is pronounced with a total closure of the lips in bote and with a partial closure in cabo. In these cases we speak of allophonic variants or allophones, since the differences are due to the phonetic context in which the same phoneme is pronounced and bear no relation to meaning.
Under what conditions can two speech sounds be considered realizations of two different phonemes? Under what conditions can they be considered allophonic variants of the same phoneme?
If two sounds cannot be interchanged in a word without changing its meaning or rendering it unrecognizable, the two sounds are realizations of two different phonemes.
Two sounds of a given language are merely variants of a single phoneme if they are interchangeable without a change in the lexical meaning of the word.
The technique of phonological oppositions consists in finding two separate words in the language which sound almost but not completely identical, that is, two words that differ in only one sound. These two words make up a minimal pair, and the sounds that enable us to distinguish between them are always phonemes: there are no minimal pairs contrasting allophones.
When linguists carry out a first phonemic analysis of a language in order to know what its phonemes are, they pronounce a word with some modification in one of its phonemes. If the modification falls within the range of normal deviation, the sounds pronounced are allophones of the same phoneme. If the modification is an extreme deviation from the norm, if some other word is heard, one may conclude that the modification is tantamount to the substitution of one phoneme for another. This procedure of finding minimal pairs is continued until all linguistically-relevant sound contrasts are isolated and the complete inventory of phonemes established.
Examples of minimal pairs in English are say - may, caught - cat; in Spanish, ajo - ojo, beso - queso.
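The search for minimal pairs can be sketched as a comparison of transcriptions sound by sound. The tiny lexicon below, its transcriptions (written with the same informal symbols used above, plus "eI" as an assumed symbol for the diphthong in say) and the function names are illustrative only.

```python
from itertools import combinations

# Hypothetical mini-lexicon: each word is a list of phoneme symbols.
LEXICON = {
    "pit": ["p", "I", "t"],
    "bit": ["b", "I", "t"],
    "fit": ["f", "I", "t"],
    "sit": ["s", "I", "t"],
    "spit": ["s", "p", "I", "t"],
    "say": ["s", "eI"],
    "may": ["m", "eI"],
}


def is_minimal_pair(first, second):
    """True if two transcriptions have equal length and differ in exactly one sound."""
    if len(first) != len(second):
        return False
    return sum(1 for a, b in zip(first, second) if a != b) == 1


def minimal_pairs(lexicon):
    """List every pair of words in the lexicon that forms a minimal pair."""
    return [
        (word1, word2)
        for (word1, sounds1), (word2, sounds2) in combinations(lexicon.items(), 2)
        if is_minimal_pair(sounds1, sounds2)
    ]


for pair in minimal_pairs(LEXICON):
    print(pair)
```

Note that pit and spit are not reported: they differ by an added sound rather than by the substitution of one sound for another, so they do not form a minimal pair in the sense used here.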
The set of symbols linguists need to make a record of their observations by providing one sign for each phoneme of the language is called a phonetic alphabet, and a record of speech in the shape of these symbols is called a transcription.
On account of the imperfections of traditional writing and the lack of a sufficient number of characters in our so-called "Latin" alphabet, scholars have devised many phonetic alphabets.
The phonetic alphabet of the International Phonetic Association (commonly referred to by the acronym IPA) is derived primarily from the Roman alphabet. The symbols for phonemes are written in slant brackets —/.../— and those for allophones are written in square brackets —[...].
Two types of transcription derive from this: a phonemic transcription (...) and a phonetic transcription (...).
In order to represent allophonic variants, phoneticians use modifying marks near or through a phonetic symbol indicating a specific value which is different from that of the unmarked or otherwise-marked symbol. These are called diacritic marks or simply diacritics.
Plosives and affricates are released nasally when followed by a nasal sound, e.g. open, sudden, submerge, darkness. This nasal release is represented as [Cn]. The only diacritic that we will use in this course is ...