Xuan Truong Nguyen

Data Science | Data Engineering Enthusiast

About Me

I'm a Data Science enthusiast with a passion for turning complex problems into elegant solutions. My approach? Finding the unconventional path that others might overlook.

Technical Skills

Programming Languages

Python
Java
R
C
MATLAB
Bash

Machine Learning & Data Science

Pandas
NumPy
Scikit-learn
PyTorch
TensorFlow

Data Engineering & Cloud

PostgreSQL
Snowflake
GraphQL
Apache Spark
AWS
Git

Experience

Data Engineering Intern

Lions Eye Institute — Jun 2025 – Present

  • Developed an end-to-end data transformation pipeline to standardize over 250,000 eye health records collected over 30 years (~60K annual visitors) using an international healthcare data model
  • Cleaned, integrated, and mapped clinical data for accurate analysis and research compatibility across multiple systems
  • Followed established conventions and vocabularies to support global research standards and interoperability
  • Collaborated with clinical and technical teams to validate mappings and ensure data quality

Geospatial Data Science Intern

The Wilderness Society — Feb 2025 – Jun 2025

  • Automated annual land cover classification (1987–2025) using unsupervised K-Means clustering on Landsat imagery with Remotior Sensus
  • Conducted remote sensing analysis with Google Earth Engine and Sentinel-2/Landsat to detect long-term deforestation trends
  • Generated time-series land cover maps and vectorized outputs to quantify and visualize mining footprint expansion across decades
  • Analyzed spectral changes and built geospatial datasets to assess environmental impact
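
To illustrate the unsupervised K-Means step mentioned above, here is a minimal sketch in plain Python. It is not the Remotior Sensus or Google Earth Engine workflow used in the internship; the `kmeans` function, the two-band pixel values, and the class descriptions are hypothetical and chosen only to show the assign-then-update loop.

```python
import random


def kmeans(pixels, k, iterations=20, seed=0):
    """Cluster pixel spectra (lists of band values) into k classes."""
    rng = random.Random(seed)
    # Start from k pixels picked at random as the initial centroids.
    centroids = [list(p) for p in rng.sample(pixels, k)]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest centroid.
        for i, p in enumerate(pixels):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(band) / len(members) for band in zip(*members)]
    return labels, centroids


# Hypothetical two-band reflectance values (for example red and near-infrared).
pixels = [
    [0.05, 0.60], [0.06, 0.55], [0.04, 0.58],  # vegetation-like spectra
    [0.30, 0.32], [0.28, 0.35], [0.31, 0.30],  # bare-ground-like spectra
]
labels, centroids = kmeans(pixels, k=2)
```

The cluster numbers themselves carry no meaning; in a land cover setting each cluster would still need to be mapped to a class such as forest or mining by inspection.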

Projects

Cross-Modal Transformer with Adversarial Domain Adaptation for Robust Emotion Recognition

Personal Project — Mar 2025 – Aug 2025

  • Developed a deep learning model for multi-class emotion recognition (5 emotions) from concurrent EEG (Differential Entropy features) and eye-tracking data, using the SEED-V dataset
  • Designed and implemented a custom Transformer-based architecture featuring dedicated EEG and eye-tracking encoders, a Cross-Modal Attention mechanism, and Feature Importance Modules
  • Integrated a robust adversarial domain adaptation strategy using a Gradient Reversal Layer (GRL) and subject-specific normalization layers to mitigate inter-subject variability
  • Validated model generalization using rigorous Leave-One-Subject-Out (LOSO) cross-validation across 16 subjects, achieving a mean accuracy of 75.42% ± 11.15% and a best-fold accuracy of 91.11%
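
The Leave-One-Subject-Out protocol above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the project's code: `loso_splits`, `summarize`, the toy subject IDs, and the placeholder accuracies are all hypothetical, and the use of the sample standard deviation is an assumption about how the spread was reported.

```python
from statistics import mean, stdev


def loso_splits(subject_ids):
    """Yield (held_out, train_idx, test_idx), holding out one subject per fold."""
    for held_out in sorted(set(subject_ids)):
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train_idx, test_idx


def summarize(fold_accuracies):
    """Return the mean, sample standard deviation, and best fold accuracy."""
    return mean(fold_accuracies), stdev(fold_accuracies), max(fold_accuracies)


# Hypothetical toy data: 3 subjects with 2 samples each.
subject_ids = ["s1", "s1", "s2", "s2", "s3", "s3"]
folds = list(loso_splits(subject_ids))

# Placeholder accuracies; in practice each value would come from training on
# train_idx and scoring on test_idx for the corresponding fold.
fold_accuracies = [0.70, 0.80, 0.90]
avg, spread, best = summarize(fold_accuracies)
```

Because every sample from the held-out subject stays out of training, each fold measures how well the model transfers to a person it has never seen, which is why the per-fold spread can be large.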

Predictive Weather Analysis Using Web Scraping, ML Algorithms, and Time Series Modeling

Personal Project — Jul 2024

  • Built a web scraper with Selenium that collected 12 years of historical weather data for analysis
  • Achieved 99% accuracy in weather forecasting by implementing a prediction system using KNN, Random Forest, and XGBoost algorithms, leveraging features like humidity, dew point, and pressure
  • Conducted time series analysis using a SARIMAX model to predict future temperature trends, enhancing the accuracy of short-term weather forecasts

Contact

Get In Touch

I'm always open to discussing new opportunities, collaborations, or just having a chat about data science and technology.