Luxshan Thavarasa - Software Engineer & AI Research Enthusiast

About

Final-year Computer Science and Engineering student at the University of Moratuwa with a passion for Full Stack Development and Artificial Intelligence. My research focuses on Speech Emotion Recognition (SER) for low-resource languages, aiming to bring AI advancements to diverse linguistic communities.

I enjoy working in collaborative environments and thrive on creating innovative solutions that make complex data accessible and meaningful to users. I'm eager to connect with like-minded individuals and work on projects that push the boundaries of technology.

Research Focus: Speech Emotion Recognition · Natural Language Processing · Agentic AI Systems · Full-Stack Development

Languages: Tamil (Native), English (Professional Working)

Current Role: Software Engineer at H2O.ai, developing frontend and backend solutions for H2OGPTe and Agentic Applications.

Technical Skills

Programming Languages: Python, JavaScript, TypeScript, Java, C++, PHP, HTML/CSS

Frameworks & Libraries: React, FastAPI, Express.js, Flask, Node.js, React Native, Bootstrap, jQuery, PyTorch, NumPy, Pandas

Databases: MySQL, PostgreSQL, MongoDB, Chroma DB, RDBMS

Cloud & DevOps: AWS (EC2, S3, RDS), Docker, GitHub Actions, CI/CD, Nginx, Reverse Proxy

AI/ML Technologies: Machine Learning, Deep Learning, PyTorch, Transfer Learning, NLP, Large Language Models, Generative AI, CNN, Attention Mechanisms

Development Tools: Git, GitHub, REST APIs, OAuth 2.0, Unit Testing, Jest, Version Control, Web Scraping

Specializations: Full-Stack Development, Speech Emotion Recognition, Natural Language Processing, Agentic Systems, UI/UX Design

Experience

Professional

Software Engineer, H2O.ai (Colombo, Sri Lanka)

July 2025 – Present (2 months)

H2OGPTe: Developing the frontend with a strong focus on UI/UX for agent interactions, while contributing to backend improvements to ensure seamless functionality.

Intern Engineering - Software, H2O.ai (Colombo, Sri Lanka)

November 2023 – July 2025 (1 year 9 months)

H2OGPTe: Developed the frontend with a focus on UI/UX for agent interactions while contributing to backend enhancements for seamless functionality.
Agentic Application: Worked on both frontend and backend development, using React with Fluent UI and H2O UI tools for the frontend and FastAPI with PostgreSQL for the backend.
ChurnApp: Created an application to predict customer churn for H2O.ai. Trained the model with H2O Driverless AI and deployed it in an MLOps environment. Integrated Salesforce and helpdesk data with a React frontend and FastAPI backend.
Olympic App: Developed a platform for conducting hackathons, including participant and administrative applications, using H2O Wave and PostgreSQL.

Full-stack Developer, aaivu (Colombo, Sri Lanka)

January 2023 – June 2025 (2 years 6 months)

Rtuthaya Website: Contributed to the development of Dr. Uthayasankar Thayasivam's website. Coordinated content creators and mentored new junior developers working on the application. Key contributions included developing the project and conference pages using PHP, MySQL, HTML, CSS, JavaScript, and Bootstrap.

Education

BSc Eng Hons, Computer Science & Engineering

University of Moratuwa, Sri Lanka

March 2021 – May 2025

Specialization: Computer Science and Engineering
Key Skills Developed: Project Documentation, Computer Science, MySQL, Flask, NumPy, Software Development, Problem Solving, Software Architecture, Python, Data Structures, Java, Web Services, Mathematics, Solution Architecture, Software Quality, Technical Documentation, Analytical Skills, Leadership, Test Cases, API Development, SQL, RDBMS, Pandas, Communication, Test Planning, Scripting, Architecture Development, Generative AI, C++, Unit Testing
Final Year Project: Multilingual Universal Speech Emotion Recognition Model

Secondary Education

Mu/Visuvamadu Maha Vidyalayam, Mullaitivu, Sri Lanka

January 2011 – August 2019

G.C.E. (A/L) Examination 2019: 3A grades | Physical Science stream | Island Rank: 209 | Z-Score: 2.4704

G.C.E. (O/L) Examination 2016: 8A, 1B | Including Mathematics, Science, English, and Business & Accounts

Primary Education

Kn/Tharmapuram G.T.M.S, Sri Lanka

January 2006 – December 2010

Publications

Global PIQA: Evaluating Physical Commonsense Reasoning Across 100+ Languages and Cultures

arXiv preprint • October 2025

To date, there exist almost no culturally-specific evaluation benchmarks for large language models (LLMs) that cover a large number of languages and cultures. In this paper, we present Global PIQA, a participatory commonsense reasoning benchmark for over 100 languages, constructed by hand by 335 researchers from 65 countries around the world. The 116 language varieties in Global PIQA cover five continents, 14 language families, and 23 writing systems. In the non-parallel split of Global PIQA, over 50% of examples reference local foods, customs, traditions, or other culturally-specific elements. We find that state-of-the-art LLMs perform well on Global PIQA in aggregate, but they exhibit weaker performance in lower-resource languages (up to a 37% accuracy gap, despite random chance at 50%). Open models generally perform worse than proprietary models. Global PIQA highlights that in many languages and cultures, everyday knowledge remains an area for improvement, alongside more widely-discussed capabilities such as complex reasoning and expert knowledge. Beyond its uses for LLM evaluation, we hope that Global PIQA provides a glimpse into the wide diversity of cultures in which human language is embedded.

Incepto@DravidianLangTech 2025: Detecting Abusive Tamil and Malayalam Text Targeting Women on YouTube

Association for Computational Linguistics • May 3, 2025

This study introduces a novel multilingual model designed to detect abusive content in low-resource, code-mixed languages. Using transfer learning and multi-head attention mechanisms, our model achieved a macro F1 score of 0.7864 on the Tamil dataset and 0.7058 on the Malayalam dataset.

EmoTa: A Tamil Emotional Speech Dataset

International Committee on Computational Linguistics • January 19, 2025

This paper introduces EmoTa, the first emotional speech dataset in Tamil, designed to reflect the linguistic diversity of Sri Lankan Tamil speakers. EmoTa comprises 936 utterances from 22 native Tamil speakers across five primary emotions, with a substantial Fleiss' Kappa agreement score of 0.74 and emotion-classification F1-scores of 0.91 and 0.90.

Key Projects

FastMCP File Server

December 2024

A secure file server implementing the Model Context Protocol (MCP) that gives AI assistants safe access to file operations. Features multiple connection modes (stdio, HTTP, public access via ngrok), configurable access levels, and security controls for various deployment scenarios. Supports standard file operations, advanced text manipulation, file analysis, batch operations, archive support, and format conversion.

Technologies: Python, FastAPI, Model Context Protocol (MCP), ngrok, Security & Authentication, File I/O

DravidaKavacham – Abusive Content Detection for Tamil & Malayalam

December 2024

Developed an open-source tool for detecting abusive Tamil and Malayalam content targeting women. Built as part of DravidianLangTech @ NAACL 2025, using NLP and machine learning for text classification.

Technologies: Python, Machine Learning, Transfer Learning, Attention Mechanisms

QuickChat - Chrome Extension Help Chatbot

May 2024 – July 2024

Developed a Chrome extension that embeds a help chatbot in the bottom-right corner of any website. The backend scrapes site content, embeds it with OpenAI embeddings, stores the vectors in Chroma DB, and generates real-time responses with GPT-3.5.

Technologies: JavaScript, HTML/CSS, Python, FastAPI, Web Scraping, OpenAI Embeddings, Chroma DB, Vector Databases, RAG

Driver's License Test Application

July 2023 – October 2023

Developed a user-friendly web portal and mobile application for driver's license exam preparation, with comprehensive study materials, practice tests, and performance feedback.

Technologies: HTML, CSS, JavaScript, Bootstrap, React JS, Express JS, MongoDB, React Native, Git, CI/CD, Docker, AWS EC2, Redux.js

Supply Chain Management System

September 2022 – January 2023

Developed a web-based supply chain management system (SCMS) for online shopping with a team of five. Designed the relational database for efficient data management and implemented triggers and functions to improve database performance.

Technologies: HTML, CSS, JavaScript, Bootstrap, PHP, MySQL

Email Client

July 2022 – August 2022

Developed a console-based email client in Java using design patterns and OOP concepts. Features include contact management, email sending, automated birthday wishes, and email history.

Technologies: Java, Object-Oriented Programming, Design Patterns

Honors & Awards

All Island Mathematics Competition 2018: Senior 2, 2nd Runner-Up (Northern Province Team) - Ministry of Education, Sri Lanka
Academic Excellence: Island Rank 209 in G.C.E Advanced Level with Z-Score 2.4704
Research Publications: Accepted papers in COLING 2025 and NAACL 2025 workshops
Professional Recognition: Promoted from Intern to Software Engineer at H2O.ai

Licenses & Certifications

Fundamentals of Deep Learning - NVIDIA (March 2025)

Credential ID: 1tO0Ys3ITkGJkXM3sgBKrQ

Skills: Deep Learning, Machine Learning, PyTorch, Neural Networks

Large Language Models (LLMs) - Level 1 - H2O.ai (March 2024)

Skills: Large Language Models, Generative AI, Natural Language Processing

Machine Learning A-Z - Udemy (March 2024)

Credential ID: 60b2314b-e9de-4b14-af89-b301ec16a5ed

Skills: CNN, Deep Learning, Supervised/Unsupervised Learning, PyTorch, NLP

Meta Frontend Development Specialization - Coursera (2023)

Comprehensive Program Including: Advanced React, React Basics, HTML and CSS in depth, Programming with JavaScript, UX Design Foundations, Version Control

Volunteering

Reviewer - DravidianLangTech Workshop (NAACL 2025)

January 2025 (1 month)

Served as a volunteer paper reviewer for the Fifth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages. Evaluated research submissions focused on hate speech detection across text and speech data.

Contact Me

Email: luxshan2327@gmail.com

Mobile: +94-77-3038402

Address: Visuvamadu East, Visuvamadu, Mullaitivu, Northern Province, Sri Lanka

LinkedIn: linkedin.com/in/lux-thavarasa

GitHub: github.com/luxshan2000

"Let's shape the future together! I'm eager to connect with like-minded individuals and work on projects that push the boundaries of technology."