Pronunciation Visualization for Cebuano

Written by

Fullstack Engineer

Language Learning Advisor

April 30, 2026

We are launching the first version of Dictionarying's pronunciation visualization engine, and we chose Cebuano as our first language. This post explains why Cebuano, what the science behind the visualization actually is, and how we built it — including the specific acoustic measurements that drive every animation and waveform on the platform.

Why Cebuano

Cebuano (also called Bisaya or Binisaya) is the second most widely spoken language in the Philippines, with an estimated 20–27 million native speakers primarily in the Visayas and Mindanao regions. Despite this scale, it is significantly underrepresented in digital language-learning tools compared to Tagalog or Filipino.

More importantly for our purposes, Cebuano has a well-documented phonological stress system — one that is rich enough to be genuinely useful for learners, and specific enough that imprecise pronunciation causes real misunderstandings. Stress in Cebuano is lexically contrastive: the same sequence of sounds can mean different things depending on which syllable carries stress. This makes pronunciation accuracy meaningful, not cosmetic.

That combination of a large speaker base, poor coverage by existing tools, and an interesting phonology made Cebuano the right first language for a dictionary built around making pronunciation visible.

The Acoustic Basis of the Visualization

Every animation, waveform, and stress marker on the platform is grounded in acoustic measurements of native speaker speech. We do not interpolate or approximate. Here is what the data shows.

Duration is the Primary Cue

The most comprehensive experimental study of Cebuano stress to date — Xu (2020), Cebuano Stress: Phonetic Cues and Phonological Pattern — analysed stressed and unstressed syllables in disyllabic words using Praat-based acoustic measurement across multiple native speakers. The findings are unambiguous:

Stressed syllables are substantially longer. The mean duration difference between stressed and unstressed syllables was +54.2 ms — a large, consistent, and perceptually salient effect with minimal overlap between distributions. This is the primary cue to stress in Cebuano, and it is the most prominent feature encoded in our visualizations. When you see a syllable marker expand on screen, that proportional expansion directly reflects this durational relationship.
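As a rough illustration of how a durational relationship can be turned into marker sizes, the sketch below gives each syllable a width equal to its share of the word's total duration. The function name `marker_widths`, the 200-pixel total width, and the 120 ms baseline are hypothetical values chosen for the example; this is not the platform's rendering code.

```python
def marker_widths(durations_ms, total_width_px=200.0):
    """Scale each syllable marker by its share of the word's total duration."""
    total = sum(durations_ms)
    if total <= 0:
        raise ValueError("durations must sum to a positive value")
    return [total_width_px * d / total for d in durations_ms]


# Hypothetical disyllabic word: the unstressed baseline (120 ms) is invented;
# the stressed syllable is 54.2 ms longer, matching the reported mean difference.
unstressed_ms = 120.0
stressed_ms = unstressed_ms + 54.2
widths = marker_widths([stressed_ms, unstressed_ms])
```

Under these assumptions the stressed syllable's marker comes out wider than the unstressed one, and the two widths always sum to the total width, so the ratio between them is the ratio between the measured durations.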

Pitch and Intensity are Secondary Cues

The same study found two additional acoustic correlates of stress, both statistically significant but with considerably smaller effect sizes:

Fundamental Frequency (F0): Stressed syllables showed a mean pitch increase of +9.23 Hz compared to adjacent unstressed syllables. This effect is real but small — the box-plot distributions overlap substantially, meaning F0 alone is not a reliable stress cue in Cebuano.
Intensity: Stressed syllables were louder by a mean of +2.14 dB. Again, statistically significant, but a weak perceptual cue with high overlap.
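For readers who want to see how mean differences of this kind are computed, here is a minimal sketch that averages stressed-minus-unstressed differences over paired syllable measurements. The dictionary keys and the sample numbers are made up for illustration; they are not data from the study.

```python
from statistics import mean

CUES = ("duration_ms", "f0_hz", "intensity_db")


def mean_differences(pairs):
    """Average the stressed-minus-unstressed difference for each cue.

    `pairs` is a list of (stressed, unstressed) dicts, one pair per word.
    """
    return {cue: mean(s[cue] - u[cue] for s, u in pairs) for cue in CUES}


# Invented measurements for two words (not real data).
sample_pairs = [
    ({"duration_ms": 180, "f0_hz": 210, "intensity_db": 70},
     {"duration_ms": 120, "f0_hz": 200, "intensity_db": 68}),
    ({"duration_ms": 170, "f0_hz": 205, "intensity_db": 71},
     {"duration_ms": 130, "f0_hz": 195, "intensity_db": 69}),
]
diffs = mean_differences(sample_pairs)
```

A mean difference alone says nothing about how much the two distributions overlap, which is why the text above distinguishes a large, consistent duration effect from the smaller pitch and intensity effects.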

The practical implication: in a phrase like Unsa diay imong trabaho? the stressed syllables (ÚN-, DI-, MÓNG, BÁ-) carry pitch and amplitude that are slightly elevated relative to the unstressed syllables (sa, ay, i-, tra-, ho), but the dominant perceptual marker is duration. Our waveform visualization reflects all three dimensions (duration, F0 contour, and amplitude envelope) so learners can observe the full acoustic picture, not just the most prominent feature.

This finding also aligns with the broader typological picture for Philippine languages. Shryock's (1993) foundational metrical analysis of Cebuano described stress assignment in terms of an iambic foot structure, with the penultimate syllable as the default stress position — a pattern confirmed and refined by the acoustic evidence in Xu (2020).

Intonation vs. Lexical Stress

An important distinction for learners: Cebuano has both lexical stress (which syllable within a word carries prominence) and phrasal intonation (how F0 moves across a whole utterance). In questions, global F0 typically rises toward the end of the phrase, but this rise sits on top of the lexical stress pattern; it does not replace it. Stressed syllables within a question still show their characteristic F0 peak and duration relative to their local context.

Our visualizations handle these two layers separately: the stress markers encode lexical prominence, while the prosody overlay encodes phrasal intonation. This separation is deliberate and reflects current phonetic understanding of how the two systems interact.
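One simple way to picture the two layers is to fit a straight line to the utterance's F0 contour and treat the line as the phrasal trend and the leftover as local prominence. The sketch below does this with an ordinary least-squares fit. It is a deliberate simplification for illustration, with a hypothetical function name, and should not be read as the method the platform uses.

```python
def split_f0_layers(times_s, f0_hz):
    """Split an F0 contour into a linear phrase-level trend and a residual.

    Returns (trend, residual), where trend[i] + residual[i] == f0_hz[i].
    """
    n = len(times_s)
    if n < 2 or n != len(f0_hz):
        raise ValueError("need at least two matching time/F0 samples")
    mean_t = sum(times_s) / n
    mean_f = sum(f0_hz) / n
    var_t = sum((t - mean_t) ** 2 for t in times_s)
    if var_t == 0:
        raise ValueError("time points must not all be identical")
    # Ordinary least-squares slope and intercept.
    slope = sum((t - mean_t) * (f - mean_f)
                for t, f in zip(times_s, f0_hz)) / var_t
    intercept = mean_f - slope * mean_t
    trend = [intercept + slope * t for t in times_s]
    residual = [f - tr for f, tr in zip(f0_hz, trend)]
    return trend, residual


# Invented contour: a steady question-like rise with no local peaks.
trend, residual = split_f0_layers([0.0, 1.0, 2.0, 3.0],
                                  [100.0, 110.0, 120.0, 130.0])
```

For this perfectly linear contour the residual is zero everywhere. A local F0 peak on a stressed syllable would show up as a positive residual at that point, while the overall rise stays in the trend.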

How We Built It

Acoustic Analysis Workflow

Pronunciation models are built from recordings of native speakers using Praat, the industry-standard acoustic analysis software developed at the Institute of Phonetic Sciences, University of Amsterdam. Praat allows us to extract precise measurements of duration, F0 trajectory, and intensity envelope at the phoneme level for every recorded token.

From these measurements we derive:

Duration ratios between stressed and unstressed syllables in each word and phrase — these drive the proportional sizing of syllable markers in the animation.
F0 trajectories normalised to the speaker's pitch range — these drive the pitch contour overlay.
Intensity envelopes — these drive the amplitude visualization in the waveform.
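The three derived quantities above can be sketched in a few lines. The helper names, the choice of a log-frequency scale for normalisation, and the use of a framewise RMS as the envelope are assumptions made for this example; Praat itself reports intensity in dB, and the platform's actual derivation may differ.

```python
import math


def duration_ratio(stressed_ms, unstressed_ms):
    """Ratio of stressed to unstressed syllable duration."""
    if unstressed_ms <= 0:
        raise ValueError("unstressed duration must be positive")
    return stressed_ms / unstressed_ms


def normalise_f0(f0_hz, speaker_min_hz, speaker_max_hz):
    """Map F0 values to 0..1 within the speaker's range on a log scale.

    Unvoiced frames, given as None, stay None.
    """
    if not 0 < speaker_min_hz < speaker_max_hz:
        raise ValueError("need 0 < speaker_min_hz < speaker_max_hz")
    lo = math.log2(speaker_min_hz)
    span = math.log2(speaker_max_hz) - lo
    out = []
    for f in f0_hz:
        if f is None or f <= 0:
            out.append(None)
        else:
            # Clamp so values outside the stated range stay within 0..1.
            out.append(min(1.0, max(0.0, (math.log2(f) - lo) / span)))
    return out


def rms_envelope(samples, frame_len):
    """RMS amplitude of each non-overlapping frame of `frame_len` samples."""
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    env = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        env.append(math.sqrt(sum(x * x for x in frame) / frame_len))
    return env
```

Normalising on a log scale keeps a given musical interval the same size on screen for low-pitched and high-pitched speakers, which is one reason it is a common choice; a linear min-max scale would be an equally valid assumption here.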

This workflow is informed by the methodology established in the Seeing Speech project at the University of Glasgow, which pioneered the use of articulatory visualization for language learning and whose ultrasound-based approach to making speech visible inspired the founding of this platform.

Phonological Grounding

Beyond the acoustic measurements, the stress assignment patterns are cross-referenced against the metrical phonology literature. The primary sources we used:

Shryock (1993) — the foundational metrical analysis of Cebuano stress, establishing the iambic foot and penultimate default: A metrical analysis of stress in Cebuano, Lingua 91, pp. 103–148
Xu (2020) — the experimental acoustic study: Cebuano Stress: Phonetic Cues and Phonological Pattern
Himmelmann & Kaufman (2018) — the typological overview of prosody across Austronesian languages, situating Cebuano within its broader language family: Prosodic systems: Austronesia
Liwanag (2012) — a cross-linguistic comparison that directly contrasts stress cues in three Philippine languages, confirming the primacy of duration in Cebuano relative to its close relatives: Acoustic Correlates of Stress in Ilocano, Cebuano, and Tagalog

What Is Visualised

For each Cebuano word and phrase currently on the platform, users can observe:

Articulatory animation — showing lip, tongue, and jaw position during production of each sound, based on articulatory data cross-referenced with the Seeing Speech database
Syllable stress markers — proportionally scaled to reflect the durational difference between stressed and unstressed syllables
Waveform display — showing the full acoustic signal with F0 contour and amplitude envelope visible at the phoneme level
Prosodic overlay for phrases — showing the intonation contour across the full utterance, distinguishing phrasal pitch movement from lexical stress peaks

What Comes Next

Cebuano is the foundation. The acoustic analysis pipeline, the annotation workflow, and the visualization rendering engine are now established — built to generalize to additional languages without rebuilding from scratch.

The choice of subsequent languages will be guided by the same criteria that led us to Cebuano: speaker population size, underrepresentation in existing learning tools, and phonological richness that makes visual feedback genuinely useful. We will announce the next language once the models reach the quality threshold we set for Cebuano.

You can explore the Cebuano pronunciation tool at dictionarying.com/cebuano

If you have questions about the methodology, want to report an error in a word’s stress annotation, or want to suggest a language for future development, you can contact us via dictionarying.com/contact.

References

Xu, S. C. A. (2020). Cebuano Stress: Phonetic Cues and Phonological Pattern. https://www.scangelaxu.com/pdf/Xu_CebuanoStress2020.pdf
Shryock, A. (1993). A metrical analysis of stress in Cebuano. Lingua, 91, 103–148. https://www.sciencedirect.com/science/article/abs/pii/002438419390010T
Himmelmann, N. P. & Kaufman, D. (2018). Prosodic systems: Austronesia. https://bahasawan.com/wp-content/uploads/2019/11/Himmelmann-and-Kaufman-to-appear-Austronesian-Prosodic-Systems.pdf
Liwanag, M. H. C. (2012). Acoustic Correlates of Stress in Ilocano, Cebuano, and Tagalog. The 2nd Philippine Conference-Workshop on Mother Tongue-based Multilingual Education. https://www.researchgate.net/publication/361242190
Seeing Speech project, University of Glasgow. https://www.seeingspeech.ac.uk/
Institute of Phonetic Sciences, University of Amsterdam. https://www.uva.nl/en/research/research-institutes/uil-ots/phonetic-sciences.html
Praat: doing phonetics by computer (Boersma & Weenink). https://www.fon.hum.uva.nl/praat/