University of Oregon

Math Text-to-Speech

Math Text-to-Speech: Introduction


Making math accessible through synthetic speech

The standard way of communicating mathematical information is by using a distinct notational language rather than the "plain English" found in most literature. The notational language of mathematics combines a broad array of numbers, Roman and Greek alphabetic characters, and a host of other non-alphanumeric symbols with precise meanings which may change depending upon the math discipline and the context in which the symbols are used.

Example:
A. Formula for the volume of a sphere: $V = \frac{4}{3} \pi r^{3}$

In addition, the typographical convention of math notation uses a two-dimensional layout where much of the meaning of an expression is implicit based upon the spatial position of one symbol in relation to another.

A. Implied multiplication: the expression 3(x + y) means three multiplied by the sum of x and y, which equals three times the variable x plus three times the variable y, even though no multiplication sign is written.

While understanding the notational language of mathematics is difficult enough for the average student, this task becomes even more problematic for a student with a print disability, such as a visual impairment or reading disorder, who may use text-to-speech assistive technology. One fundamental issue with speaking math, whether it is spoken by a human being or a computer, is that an audio stream is inherently linear, while math notation is not. Although a sighted person can scan a complex equation and quickly ascertain important implicit information from the spatial position of the various elements of the expression, this is very difficult to do with an audio stream because of its linear nature.
A. Formula for Standard Deviation: $\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^{2}}$

B. Example of "symbols that have different meanings in different contexts":
  | x | could mean "the absolute value of x", while  | X | could mean "the cardinality of the set X"
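
The linearity problem can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the nested-tuple representation of an expression, the function names `naive` and `grouped`, and the spoken cues "open paren" and "close paren" are hypothetical and not taken from any particular speech engine. It shows that two different expressions can collapse into the same word sequence when read strictly left to right, and that explicit grouping cues are one way to keep them apart.

```python
# Hypothetical representation: an expression is either a leaf (number or
# variable name) or a tuple of the form (operator_word, left, right).
A = ("times", 3, ("plus", "x", "y"))   # 3(x + y)
B = ("plus", ("times", 3, "x"), "y")   # 3x + y


def naive(expr):
    """Read the expression left to right with no grouping cues."""
    if not isinstance(expr, tuple):
        return str(expr)
    op, left, right = expr
    return f"{naive(left)} {op} {naive(right)}"


def grouped(expr, top=True):
    """Read the expression, wrapping every nested group in spoken cues."""
    if not isinstance(expr, tuple):
        return str(expr)
    op, left, right = expr
    phrase = f"{grouped(left, False)} {op} {grouped(right, False)}"
    return phrase if top else f"open paren {phrase} close paren"


if __name__ == "__main__":
    print(naive(A))    # same words as naive(B)
    print(naive(B))
    print(grouped(A))  # grouping cues make the two readings differ
    print(grouped(B))
```

The extra cue words remove the ambiguity, but they also lengthen the audio stream, which adds to the amount a listener has to hold in memory.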

Listening to an equation also places heavy demands on short-term memory, since expressions heard earlier in an equation must be retained and compared with audio information heard later. However, text-to-speech for mathematics holds great promise for accessibility due to the ever-increasing availability of accessible eText materials and synthetic speech applications.

The ability to provide effective text-to-speech for mathematics content has increased steadily over the past decade, due largely to the development and support of Mathematical Markup Language (MathML), which provides an open, standard method of encoding math notation within digital content in such a way that synthetic speech engines can automatically generate math speech. However, many people in academic and professional settings are unfamiliar with the concept of math text-to-speech. The aim of this section of the MeTRC website, therefore, is to provide an introduction to computer-generated math speech and to discuss the issues related to making math accessible through synthetic speech.
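
As a rough illustration of how MathML markup can drive speech generation, the following Python sketch walks a small presentation MathML fragment for the sphere volume formula and produces a linear word sequence. It is a simplified sketch under stated assumptions, not how any real speech engine works: the `SPOKEN` vocabulary, the function names, and the spoken phrasing ("the fraction ... end fraction", "to the power of") are all hypothetical, and only the handful of MathML elements used in this one formula are handled.

```python
import xml.etree.ElementTree as ET

# Presentation MathML for V = (4/3) * pi * r^3.
MATHML = """
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mi>V</mi>
  <mo>=</mo>
  <mfrac><mn>4</mn><mn>3</mn></mfrac>
  <mi>&#x3C0;</mi>
  <msup><mi>r</mi><mn>3</mn></msup>
</math>
"""

# Hypothetical vocabulary: how individual symbols are spoken.
SPOKEN = {"=": "equals", "\u03c0": "pi", "+": "plus", "-": "minus"}


def local_name(element):
    """Strip the XML namespace from a tag, e.g. '{...}mfrac' -> 'mfrac'."""
    return element.tag.rsplit("}", 1)[-1]


def speak(element):
    """Recursively turn a MathML element into a spoken phrase."""
    name = local_name(element)
    children = list(element)
    if name in ("mi", "mn", "mo"):
        text = (element.text or "").strip()
        return SPOKEN.get(text, text)
    if name == "mfrac":
        numerator, denominator = children
        return (f"the fraction {speak(numerator)} over "
                f"{speak(denominator)} end fraction")
    if name == "msup":
        base, exponent = children
        return f"{speak(base)} to the power of {speak(exponent)}"
    # <math>, <mrow>, and anything unrecognised: read children in order.
    return " ".join(speak(child) for child in children)


def mathml_to_speech(source):
    """Parse a MathML string and return its spoken form."""
    return speak(ET.fromstring(source))


if __name__ == "__main__":
    print(mathml_to_speech(MATHML))
```

Because the markup records that 4 and 3 belong to a fraction and that 3 is an exponent of r, the sketch can announce that structure explicitly instead of reading the bare symbols "V = 4 3 π r 3" in sequence.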

