Alya Firdausyi
Data Engineer

I design and build scalable data platforms, cross-cloud pipelines, and real-time streaming systems that turn raw data into reliable, production-grade infrastructure. Formerly in cloud — now shaping how data flows through it.

01

About

Experienced Data Engineer specializing in designing and implementing scalable, cross-cloud data platforms and warehouse solutions to support advanced business and finance reporting. Proven expertise in orchestrating workflows using Airflow, managing large-scale ETL/ELT processes across GCP and AWS, and building robust real-time streaming solutions with Kafka. Strong command of SQL, Python, and Java, with experience in implementing CI/CD pipelines to ensure reliable, high-quality data delivery and deployment automation.

02

Experience

Data Platform Engineer

PT Link Net Tbk

Aug 2024 — Present Tangerang, ID

Leading Indonesian telecommunications provider delivering high-speed broadband, premium pay-TV, and innovative connectivity solutions through an extensive fiber network under the First Media and Link Net Fiber brands.

  • Engineered end-to-end ETL pipelines ingesting data from diverse on-premises sources (MySQL, Oracle, SQL Server, SAP DB) into a multi-layer BigQuery data warehouse and S3-based lakehouse; implemented Medallion architecture and Slowly Changing Dimensions (SCD) with open table formats
  • Orchestrated 20+ Airflow DAGs with SLA monitoring, automating complex ETL using Python/Java with GCP tools (Dataproc, Dataflow)
  • Led the cloud migration of 200+ tables from GCP to AWS using AWS Glue, Lake Formation, and MWAA, improving metadata governance and performance
  • Implemented a real-time Change Data Capture (CDC) pipeline using Apache Kafka and Debezium to stream data changes from SQL Server to downstream application APIs, enabling event-driven integration and reducing data latency
  • Implemented a CI/CD pipeline using Azure DevOps to automate deployment and management of all data assets, including AWS Glue jobs, Athena queries, Redshift, and Airflow DAGs

Fellowship

Data Engineer — Data Fellowship 12

IYKRA

Mar 2024 — Jul 2024 Jakarta, ID

Consulting, training, and implementation company focusing on AI, Big Data, and Analytics. The Data Fellowship accelerates the journey of aspiring Indonesian data engineers through an intensive 4-month curriculum.

  • Selected as one of 20 students in Data Fellowship Batch 12 and awarded a fully funded scholarship for the intensive 4-month data engineering curriculum
  • Led a team of five in building a customer segmentation model (capstone: "Building Customer Segmentation for Effective Personalized Marketing") that earned the Best Capstone Project distinction
  • Developed end-to-end data pipelines that ingested data from data lakes on GCP, transformed it using Airflow, dbt, Apache Kafka, Apache NiFi, and BigQuery, and delivered results through Looker and Tableau
  • Used dbt to transform data and materialize the resulting models in Google BigQuery, which served as the data warehouse

Cloud Engineer

Xtremax Teknologi Indonesia

Mar 2019 — Oct 2022 Bandung, ID

Certified digital transformation company providing innovative cloud solutions. Headquartered in Singapore, it assists clients with large-scale server migrations, cloud migrations, application modernizations, and containerized services.

  • Maintained and optimized 20+ internal servers on AWS EC2, utilizing Load Balancers and Auto Scaling Groups for the Content Website Platform project
  • Remediated 500+ vulnerability findings across cloud servers and infrastructure using the Nexpose vulnerability scanner, resulting in a 70% reduction in security incidents
  • Streamlined installation, configuration, patching, and troubleshooting of services for 5 CMS platforms (WordPress, SWIIIT, Sitecore, Sitefinity, SharePoint) across Linux and Windows environments
  • Documented 100+ complex issues and their resolutions as RFCs, incident reports, build docs, or wiki pages, while keeping project documentation up to date

03

Projects

Customer Segmentation & CLV Prediction Pipeline

End-to-end automated ML pipeline for customer lifetime value prediction and segmentation using RFM analysis. Built with Medallion architecture on GCP, integrating data quality checks and interactive dashboards.

Automated customer segmentation, reducing manual analysis time by 80% and enabling data-driven marketing strategies.

Python BigQuery Airflow dbt Docker Looker Studio
View Project →
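
As a rough illustration of the RFM (recency, frequency, monetary) analysis named in the project above, the following is a minimal sketch and not the project's actual code. The sample transactions, the `rfm_table` and `segment` helpers, and the segment thresholds are all hypothetical; it uses only the Python standard library rather than the BigQuery/dbt stack listed in the tags.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (customer_id, order_date, amount).
TRANSACTIONS = [
    ("c1", date(2024, 6, 28), 120.0),
    ("c1", date(2024, 6, 1), 80.0),
    ("c2", date(2024, 3, 15), 40.0),
    ("c3", date(2024, 6, 20), 300.0),
    ("c3", date(2024, 5, 5), 150.0),
    ("c3", date(2024, 4, 2), 90.0),
]


def rfm_table(transactions, as_of):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    last_seen = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, day, amount in transactions:
        # Recency is measured from the customer's most recent order.
        last_seen[customer] = max(last_seen.get(customer, day), day)
        frequency[customer] += 1
        monetary[customer] += amount
    return {
        c: ((as_of - last_seen[c]).days, frequency[c], monetary[c])
        for c in last_seen
    }


def segment(recency_days, frequency):
    """Toy rule-based segmentation; the thresholds are illustrative assumptions."""
    if recency_days <= 30 and frequency >= 3:
        return "loyal"
    if recency_days <= 30:
        return "active"
    return "at_risk"


rfm = rfm_table(TRANSACTIONS, as_of=date(2024, 6, 30))
segments = {c: segment(r, f) for c, (r, f, _m) in rfm.items()}
```

In a warehouse setting the same aggregation would more likely be expressed as a grouped SQL model, with the scoring rules replaced by quantile-based scores or a clustering step.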

Attendance ETL Pipeline

Scalable ETL pipeline for university attendance data processing with Docker containerization. Three-layer architecture (Staging → Warehouse → Mart) with automated weekly reporting.

Layered Staging → Warehouse → Mart design with containerized deployment; weekly attendance reports are generated automatically.

Python PostgreSQL Docker pandas
View Project →

BigQuery Data Ingestion Pipeline

Data ingestion pipeline from local PostgreSQL to BigQuery, using Python for transformation and Cloud SQL. Processes a banking fraud detection dataset through a clean data flow.

End-to-end data flow from local CSV through transformation to BigQuery data warehouse.

Python BigQuery PostgreSQL GCP Cloud SQL pandas
View Project →

CI/CD Pipeline for AWS Glue Jobs

Production-ready CI/CD pipeline automating AWS Glue job deployment using Azure DevOps with multi-environment support (DEV/PRD), smart change detection, and S3 Tables integration with Apache Iceberg.

Reduced manual deployment time from hours to minutes; zero-downtime deployments across environments.

AWS Glue AWS S3 Azure DevOps Python boto3 Apache Iceberg
View Project →
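
One plausible reading of the "smart change detection" mentioned above is to redeploy only the job scripts whose content has changed since the last successful run. The sketch below illustrates that idea with content hashes and a JSON manifest. It is an assumption about the approach, not the project's implementation; the function names and the manifest file are hypothetical, and the actual Glue deployment step is deliberately left out.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of the file content, used as a change fingerprint."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def detect_changes(script_dir: Path, manifest_path: Path):
    """Return (changed_script_names, current_digests).

    A script counts as changed when it is new or when its digest differs
    from the one recorded in the manifest of the last successful deploy.
    """
    previous = {}
    if manifest_path.exists():
        previous = json.loads(manifest_path.read_text())
    current = {p.name: file_digest(p) for p in sorted(script_dir.glob("*.py"))}
    changed = [name for name, digest in current.items()
               if previous.get(name) != digest]
    return changed, current


def save_manifest(manifest_path: Path, digests: dict) -> None:
    """Record digests; call this only after the deployment has succeeded."""
    manifest_path.write_text(json.dumps(digests, indent=2))
```

Writing the manifest as a separate step after deployment means a failed run leaves the old fingerprints in place, so the same scripts are picked up again on the next attempt.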
04

Skills

Data Engineering & Orchestration

Python
SQL
Apache Airflow
dbt
Apache Kafka
Apache NiFi
PySpark
Java

Cloud Platforms & Data Services

AWS
GCP
Azure
BigQuery
AWS Glue
Dataproc
Dataflow

Infrastructure & DevOps

Docker
Kubernetes
Terraform
Git
Linux
Azure DevOps
CI/CD Pipelines

Databases & Storage

PostgreSQL
MySQL
OracleDB
SQL Server
SAP DB
Snowflake
Cassandra

Visualization & Analytics

Looker
Tableau
Excel
Power Query

Machine Learning

scikit-learn
pandas
NumPy

Data Practices

Data Validation
Data Management
Data Governance

05

Certifications

06

Education

Institut Teknologi Bandung

Master of Science in Computational Science 2020 — 2023
Bachelor of Science in Physics 2014 — 2018

07

Contact

Open to discussing new opportunities, collaboration, or conversations about data engineering and cloud infrastructure.