Available for work

Designing
Modern Data & AI
Systems

Welcome! I'm Sanjitha, a Data and AI Specialist with a passion for turning complex data into intelligent solutions. By combining analytical thinking and advanced AI techniques, I transform raw information into actionable insights and innovative systems that drive efficiency, accuracy, and future-ready performance.

20+

Projects done

Years of experience

Education

My academic journey and qualifications from esteemed institutions.

University of Moratuwa

MSc. Data Science and Artificial Intelligence

2024 – 2026

University of Moratuwa

BSc. Engineering (Hons)

2019 – 2023

Essential Tools I use

Discover the powerful tools and technologies I use to build intelligent, data-driven solutions and cutting-edge AI applications.

Python

Programming Language

SQL

Relational Databases

Power BI

Business Intelligence

Spark

Big Data Processing

PyTorch

Machine Learning

MLflow

ML Operations

Docker

ML Deployment

Azure

Cloud Services

n8n

Workflow Automation

LangChain

LLM Applications

Chroma DB

Vector Databases

Neo4j

Graph Databases

Skills

A comprehensive overview of my technical expertise and professional capabilities.

Develop Power BI Dashboards to Give Insights (BI/Analysis)

Design Data Engineering ETL Pipelines (Engineering)

Build Scalable ML Models for Production (Machine Learning)

Analyze Complex Data to Predict Trends (Analytics)

Architect Autonomous AI Agents (Agentic AI)

Engineer Generative AI & RAG Systems (GenAI/LLMs)

Optimize Big Data Processing (Data Engineering)

Implement Data Governance & Security (Infrastructure)

Deploy End-to-End MLOps Pipelines (Operations)

Monitor Model Performance & Drift (Model Reliability)

Perform Exploratory Data Analysis (EDA) (Statistical Analysis)

Conduct Rigorous Academic Research (Research & Development)

My portfolio highlights

MLOps-Driven Data Drift & Feature Engineering Pipeline - Insurance Customer Churn

This Airflow-orchestrated MLOps pipeline automates insurance data ingestion, validation, and feature engineering. It utilizes Amazon S3 for storage and Terraform-provisioned EC2 for XGBoost training. By integrating MLflow tracking and Grafana monitoring, the system detects data drift, triggering automated retraining to ensure model accuracy amidst evolving customer behaviors.

Python • Scikit-learn • Airflow • Docker • MLflow • Grafana • Terraform • Amazon S3 • EC2 • XGBoost
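To illustrate the drift-triggered retraining idea in this project, here is a minimal sketch that is not the project's actual code. It assumes drift is measured with the Population Stability Index (PSI) on a single numeric feature; the function names, the bin count, and the 0.2 threshold (a commonly cited rule of thumb) are all illustrative assumptions.

```python
import math
from typing import Sequence


def population_stability_index(
    reference: Sequence[float],
    current: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """Compute PSI between a reference sample and a current sample.

    Bin edges are equal-width over the reference range; current values
    outside that range are clipped into the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample must contain more than one distinct value")
    width = (hi - lo) / bins

    def proportions(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clip out-of-range values
            counts[idx] += 1
        # Floor each proportion at eps so the logarithm is always defined.
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


def should_retrain(
    reference: Sequence[float],
    current: Sequence[float],
    threshold: float = 0.2,  # assumed cut-off, tune per feature
) -> bool:
    """Return True when the PSI exceeds the assumed drift threshold."""
    return population_stability_index(reference, current) > threshold
```

In an orchestrated pipeline, a check like `should_retrain` could run after each ingestion batch and gate the training task, with the PSI value logged as a metric for monitoring.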

Snowflake-Driven HR Intelligence & Streamlit Analytics

This automated ecosystem transforms raw HR data into organizational intelligence using a multi-tier Snowflake architecture. Orchestrated via Snowflake Tasks, the pipeline ingests data from S3 to calculate real-time KPIs like eNPS and attrition velocity. A Streamlit dashboard provides leadership with interactive, data-driven insights across the entire employee lifecycle.

Snowflake • Snowflake SQL • Streamlit • Python • Amazon S3
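For readers unfamiliar with eNPS, the sketch below shows the standard calculation: the percentage of promoters (scores 9 to 10) minus the percentage of detractors (scores 0 to 6) on a 0 to 10 survey scale. It is written in Python for illustration only; the project computes its KPIs inside Snowflake, and the function name `enps` is an assumption.

```python
from typing import Sequence


def enps(scores: Sequence[int]) -> float:
    """Employee Net Promoter Score on a 0-10 survey scale.

    Promoters score 9-10, detractors score 0-6, and passives (7-8)
    count only toward the total number of responses.
    """
    if not scores:
        raise ValueError("at least one survey response is required")
    if any(s < 0 or s > 10 for s in scores):
        raise ValueError("scores must be between 0 and 10")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100.0 * (promoters - detractors) / len(scores)
```

The result ranges from -100 (all detractors) to 100 (all promoters), which makes it easy to track over time on a dashboard.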

Agentic AI App for Financial Audit & NLQ

This LangChain-powered ecosystem utilizes Llama 4 Scout via Groq for real-time suspicious journal entry identification. Featuring a multi-agent architecture, it enables NLQ-driven insights and autonomous SQL rule generation. The Streamlit interface allows auditors to dynamically flag anomalies, transforming manual oversight into a proactive, high-speed financial integrity framework.

LangChain • Llama • Groq • Streamlit • Python • SQL

Certifications

Explore the professional certifications I've earned to stay at the forefront of technology and innovation.

Academy Accreditation - Azure Databricks Platform Architect

Databricks • 2026

Databricks Fundamentals

Databricks • 2026

Oracle Cloud Infrastructure Certified Data Science Professional

Oracle • 2023

Research

A collection of my research contributions and academic publications in the field of AI and Data Science.

Classification of Defects of Cotton Yarns Using Convolutional Neural Networks

S.H.A. Arachchi, P.H.K. Vidushka, S.N. Niles, R.P. Abeysooriya

The detection and classification of defects in cotton yarn are crucial in maintaining the quality of textile production. The problem has received little attention in the literature because of the complex, non-homogeneous features of cotton yarn. This study explores the application of transfer learning techniques in convolutional neural networks (CNNs) to classify yarn defects, including neps, thick and thin places, hairiness, and snarls, as well as identifying non-defective yarns. A dataset of 1,250 images was divided into five classes to evaluate three CNN models: ResNet-50, VGG-16, and Inception-v3. Inception-v3 achieved the highest validation accuracy at 98.8%, followed closely by VGG-16 with 98%, while ResNet-50 reached 77.2%. Inception-v3 and VGG-16 were successful in detecting complex yarn defects. The study further emphasizes the capability of CNNs to automate yarn defect identification and to reduce processing time when the models run on GPUs.

Predictive Modeling of Knit Fabric Shrinkage via ANN (Supervisor)

Muralidas Dhakshala, Sanjitha Hashan Amarathunga Arachchi, Jayasankar Janeni, S.A. Ariadurai

Predicting dimensional shrinkage in knitted fabrics remains a complex challenge due to the non-linear interplay of material and process variables. This study introduces an Artificial Neural Network (ANN) model designed to estimate shrinkage in 100% cotton and polyester-elastane single-jersey knits. Built using TensorFlow-Keras with a feed-forward backpropagation architecture, the model integrates twenty-three critical inputs, including yarn count, stitch density, tightness factor, and machine settings. By capturing data from the knitting, pre-setting, and finishing stages, the ANN provides a comprehensive framework that surpasses conventional predictive models in both accuracy and scalability. The research demonstrates that the trained ANN can rapidly forecast shrinkage using known parameters, enabling proactive quality control during production planning. This digital approach significantly reduces the need for resource-intensive physical sampling and post-compacting tests. Validated against independent samples, the model showed high correlation coefficients and minimal error rates. This integration of machine learning into textile manufacturing offers a robust solution for enhancing productivity and dimensional stability across various fabric structures.

A Multi-Agentic Framework for Identifying Suspicious Journal Entries

Dr. Thanuja Ambegoda, Sanjitha Hashan Amarathunga Arachchi

Financial fraud accounts for an estimated 5% loss in annual corporate revenue, severely undermining stakeholder trust and capital market stability. Undetected anomalous journal entries are a primary driver of these losses, yet traditional auditing methods and "black-box" machine learning models often fall short. Manual oversight is increasingly unscalable, while opaque algorithms lack the explainability required for regulatory compliance and forensic validation. To address these critical gaps, this research introduces an automated, explainable multi-agent AI system specifically engineered for journal-entry fraud detection. By leveraging a collaborative framework of specialized agents, the system achieves an impressive 90.48% overall accuracy. Beyond mere identification, the framework prioritizes transparency, providing auditors with clear reasoning for flagged transactions. This approach not only mitigates direct financial loss but also strengthens investor confidence and ensures sustainable corporate performance through proactive, high-precision financial oversight and robust regulatory alignment.

Contact me for
collaboration

Reach out today to discuss your project needs and start collaborating on something amazing!
