The English language contains so many sounds and spellings that seem, at least on the surface, to make little sense. In this piece, Neil Almond explores the reasons for this complexity and how we might understand it better.

English is such a hard language to read and spell because it has a deep orthography, or complex code. Orthography refers simply to the spelling system. In English there are multiple ways to spell the 44 sounds of the language: around 176 common spellings, to be precise [1]. In very simple terms, this is why spelling in English is so difficult.

Contrast this with a language with a shallow orthography, or simple code, such as Finnish, where each sound in the language is spelled by only one symbol. This makes learning to decode and spell words in Finnish far simpler. How the English code became so complex is a long journey that I will attempt to distil.

In 1786, the Anglo-Welshman Sir William Jones, working as a British judge in India, wrote a sentence that appears in many textbooks on historical linguistics. A talented linguist, he was tasked with regulating English merchants in India and with upholding the existing rights of the local population, who followed ancient rules based on Hindu law. These laws were written in Sanskrit, a language that no other British judge could read. Over time, Jones pored over Sanskrit texts and was soon able to speak the language. It was while studying it that he came to a conclusion, which he presented in his third annual discourse to the Asiatic Society of Bengal.

“The Sanskrit language, whatever be its antiquity, is of a wonderful structure: more perfect than the Greek, more copious than the Latin, and more exquisitely refined than either; yet bearing to both of them a stronger affinity, both in the roots of verbs and in the forms of grammar, than could possibly have been produced by accident; so strong indeed, that no philologer could examine them all three, without believing them to have sprung from some common source, which, perhaps, no longer exists.” [2]

Three languages, Latin, Greek and Sanskrit, separated by time and space but all sharing a common source language: the idea echoes Darwin’s theory of evolution. Only, we are not talking about the ancestry of biological life forms but about a socially constructed invention. It was through this understanding of the nature of language, that languages evolve from a common ancestor, that linguists were able to trace Latin, Greek and Sanskrit back to that ancestor: Proto-Indo-European (PIE). Over time, linguists were able to add other languages which had their ancestry in PIE.

[Figure: Indo-European Tree]

As can be seen in the Indo-European tree, English is not a direct descendant of PIE. (Some words can be traced back to PIE. For example, ‘father’ comes from a word that would have sounded like <p>, <ah>, <t>, <er> [3].) Rather, its roots lie in Proto-Germanic, a direct descendant of PIE; PIE itself was spoken some 6,000 years ago.

The Germanic influence arrived in the early 5th century with the departure of the Romans from Britain and the arrival of three Germanic tribes: the Angles, Jutes and Saxons. These tribes would have brought their West Germanic language with them when they eventually settled on these isles. This language would come to be known as Old English.

It is worth pointing out here that, although this is the main root of the English language, if we were able to travel back in time to 500 CE, we could not converse easily with an Anglo-Saxon. Indeed, our languages would sound completely foreign to each other. That is because, after approximately one thousand years, a language evolves to such an extent that speakers on either side of that millennium would find it difficult to converse with each other (unless they had undertaken specialist study).

The year 793 CE marked the first planned Viking invasion of Britain. With their eventual settlement here, the Vikings brought along their Norse language from Scandinavia, whose ancestry lies in the North Germanic languages. The mixing of language, culture and trade (another way that new vocabulary would have come into use) between the Anglo-Saxons and the Vikings in Britain continued for just under 300 years, becoming rooted in daily life due to the increasing literacy of the elite of society. But another invasion of Britain was soon to add another layer of complexity to the mix.

On 28 September 1066, William, Duke of Normandy, landed in Pevensey with an army, and on 14 October this army defeated King Harold Godwinson at the Battle of Hastings. The successful invasion and eventual settlement brought with it Old French, a language which has its roots in Latin due to Julius Caesar’s conquest of Gaul (modern-day France) around a millennium before.

This Latin would also have brought with it elements of Ancient Greek, due to the Roman occupation of Greece. This is where we get the spelling <ph> to represent the sound /f/. French continued to be the language of the elite and of the courts, but it failed to become the dominant language of the peasant class, who continued to use Old English.

From then on, the versions of English and French that were spoken at the time battled for dominance, and it was not until the mid-15th century, for various political reasons between the nobility of France and England, that English emerged as the dominant language.

However, it was not just through invasion and settlement that the English language evolved. Cultural shifts in religion also affected the language that was spoken. Latin was the language of the Catholic Church, which has been the dominant religion of Britain for the best part of two thousand years. From Emperor Constantine through to Alfred the Great, William the Conqueror and beyond, Latin would have been spoken by some in Britain for approximately 1,500 years.

It is worth noting that words which contain Latin roots are more often than not still Latin and not English. For example, ‘struct’ is the Latin root of ‘construction’, and the root has no meaning on its own in English. This is useful to know, as the Latin layer of the English code can be treated differently from the English layer when it comes to spelling.

The above represents a very brief and simplistic version of how the English spelling system came to be so complex. It is a cocktail of other languages that, over time, have mixed together through political, cultural and social causes to produce what we have right now.

It would, however, be remiss of me not to mention briefly the attempt by Samuel Johnson to standardise the spelling of words in 1755, after this great cocktail had been produced. Yet therein lies part of the issue. Johnson attempted to standardise the spelling of whole words and not the spelling of the sounds that make up those words. Had Johnson standardised spelling at the phonemic (sound) level, then it is possible that the spelling <ee> could have stood only for the sound /ee/ as in ‘see’. The failure to do this is yet another reason why the English code continues to be so complex.

[1] McGuinness, D. 2006. Early Reading Instruction: What Science Really Tells Us About How to Teach Reading. MIT Press.

[2] Anthony, D. 2010. The Horse, the Wheel, and Language: How Bronze-Age Riders from the Eurasian Steppes Shaped the Modern World. Princeton University Press.

[3] For an in-depth look at the history of the word ‘father’ and to trace it back through various languages, I highly recommend the History of English Podcast with Kevin Stroud.


Neil was a classroom teacher for 5 years before leading Teaching and Learning in a small academy trust. He is now a Deputy Headteacher in Thornton Heath. He regularly blogs and speaks at educational events around the country.
