Jared Coleman, an assistant professor of computer science at Loyola Marymount University, and a small team of researchers have developed an AI-powered translation tool to help revitalize Owens Valley Paiute, an endangered Indigenous language.
“I am a proud member of the Big Pine Paiute Tribe of the Owens Valley in California,” said Coleman. “I’ve always been interested in languages and learned years ago that my tribe’s language was critically endangered. I decided to pursue computer science as my major in college to support my community’s efforts to learn and revitalize our native language.”
Coleman was drawn to LMU as a faculty member because of the opportunities to involve students in research. “I’m impressed by the level of faculty-student interaction at LMU and by the research opportunities and other academic activities available to students outside the classroom,” he said. “I most enjoy working with students and other collaborators on research projects.”
Coleman’s Ph.D. research primarily focused on distributed computing, a field he plans to continue alongside his work in endangered language revitalization. Distributed computing involves systems where multiple computers collaborate to solve problems, often enhancing efficiency in large-scale data processing. “Many of the problems we tackle are impossible to solve optimally, but researchers have proposed numerous sub-optimal yet practical algorithms. My research involves developing tools to analyze and compare these algorithms to determine which are best suited for different types of problems,” said Coleman.
During his Ph.D. studies at USC, Coleman and his advisor discovered a shared passion for languages, leading them to explore the intersection of computer science and endangered languages. Their 2023 publication proposes a machine translation approach called Large Language Model-Assisted Rule-Based Machine Translation (LLM-RBMT) to assist people in learning no-resource languages. The approach combines traditional rule-based translators, which rely on grammatical and vocabulary rules to translate between languages, with the advanced language-processing capabilities of large language models (LLMs). In this method, the LLM guides the rule-based translators rather than producing the translation itself.
“For endangered, no-resource languages, creating translators is challenging, and accuracy is even more critical,” said Coleman. “For this reason, our goal isn’t to produce perfect translations but to generate accurate ones that closely capture the user’s intended meaning.” For example, if the system is asked to translate the sentence “The dog is running and jumping,” it might respond with something like: “I can’t translate that exactly, but I can say ‘The dog is running. The dog is jumping’ (Ishapugu-uu poyoha-ti. Ishapugu-uu yotsi-ti).”
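The behavior in that example can be illustrated with a minimal Python sketch. It is not Coleman's implementation: the function names are hypothetical, the `decompose` step is a hand-written stand-in for the LLM (which in the described approach would rewrite arbitrary input into simpler sentences), and the tiny lexicon and the stem/suffix split are read off the article's single example sentence rather than taken from a grammar of Owens Valley Paiute.

```python
from __future__ import annotations

from dataclasses import dataclass

# Toy lexicon built only from the article's example. Assumption: the hyphen
# in "Ishapugu-uu poyoha-ti" separates a stem from a suffix.
NOUNS = {"dog": "ishapugu"}
VERBS = {"running": "poyoha", "jumping": "yotsi"}
SUBJECT_SUFFIX = "-uu"  # assumed subject marker
PRESENT_SUFFIX = "-ti"  # assumed present-tense marker


@dataclass(frozen=True)
class SimpleSentence:
    """A subject-verb sentence the rule-based translator can handle."""
    subject: str
    verb: str


def decompose(sentence: str) -> list[SimpleSentence]:
    """Hypothetical stand-in for the LLM step.

    Splits "The <noun> is <verb> and <verb> ..." into one simple sentence
    per verb. A real system would ask an LLM to do this for free-form input.
    """
    words = sentence.lower().rstrip(".").split()
    if len(words) < 4 or words[0] != "the" or words[2] != "is":
        raise ValueError(f"unsupported sentence shape: {sentence!r}")
    subject = words[1]
    verbs = [w for w in words[3:] if w != "and"]
    return [SimpleSentence(subject, v) for v in verbs]


def translate_simple(s: SimpleSentence) -> str:
    """Rule-based step: look up stems and attach the assumed suffixes."""
    if s.subject not in NOUNS or s.verb not in VERBS:
        raise KeyError(f"word not in toy lexicon: {s}")
    text = f"{NOUNS[s.subject]}{SUBJECT_SUFFIX} {VERBS[s.verb]}{PRESENT_SUFFIX}"
    return text[0].upper() + text[1:] + "."


def back_translate(s: SimpleSentence) -> str:
    """Plain-English rendering of what was actually translated."""
    return f"The {s.subject} is {s.verb}."


def translate(sentence: str) -> tuple[str, str]:
    """Return (English that was translated, target-language output)."""
    parts = decompose(sentence)
    english = " ".join(back_translate(p) for p in parts)
    target = " ".join(translate_simple(p) for p in parts)
    return english, target


if __name__ == "__main__":
    english, target = translate("The dog is running and jumping.")
    print(f"I can't translate that exactly, but I can say '{english}' ({target})")
```

Returning the simplified English alongside the translation mirrors the design goal Coleman describes: the user sees exactly which meaning was translated, so a close paraphrase is never presented as an exact rendering of the original sentence.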
Coleman also developed a suite of digital tools for language revitalization named Kubishi, meaning “brain” in Owens Valley Paiute. The suite includes an online dictionary, a sentence builder, and a translation system powered by his research. It complements decades of effort by tribal organizations to preserve their language, including recordings of elders speaking, printed dictionaries, and ongoing language classes.
For the last 10 years, Coleman has dedicated himself to learning Owens Valley Paiute. While he can hold a conversation, he acknowledges there is still much to learn. “I am grateful my ancestors had the foresight to use the resources available to them to preserve the language,” said Coleman. “Hearing recordings of my great-great-grandfather speaking the language was an incredible experience. I didn’t even know those recordings existed until my 20s! This motivates me and gives me hope that the language can be revitalized.”
Coleman brings real-world experience to the classroom, having worked at the Aerospace Corporation on computational tools for launch vehicle analysis. He holds a Ph.D. in computer science from the University of Southern California, as well as both a master’s and a bachelor’s degree in computer science from California State University, Long Beach.