Sindhi becomes first Pakistani language to be selected for digitization

Sindhi has reached a historic milestone by becoming the first language from Pakistan to be selected for digitization. Universal Dependencies (UD), described as a joint project of Stanford University and Google, is an ongoing effort to convert languages into machine-readable formats. To date it has selected 100 languages, including Sindhi, out of the roughly 6,000 human languages spoken worldwide.

Urdu, the national language of Pakistan, was also selected by the project, but it was proposed for inclusion by contributors from India.

According to the United Nations, at least 43% of the estimated 6,000 languages spoken in the world are endangered. The United Nations Educational, Scientific and Cultural Organization (UNESCO) warns that half of these languages will be extinct by the end of this century.

Sindhi is one of the world’s oldest languages and is written in a right-to-left Perso-Arabic script. Linked to the Indus Civilization, the language’s first recorded script samples were found during the excavation of the UNESCO heritage site of Mohenjo Daro in Sindh.

With its addition to Universal Dependencies, Sindhi will now be accessible online for translation through more than 150 treebanks.

Source: Geo TV