Data Engineer
2026-04-09T16:42:11+00:00
Earth AI
https://www.greataustraliajobs.com/jsjobsdata/data/employer/comp_5051/logo/earth%20ai.png
https://earth-ai.com/
FULL_TIME
Alexandria, Sydney NSW
Sydney NSW
2000
Australia
Professional, Scientific, and Technical Services
Computer & IT, Science & Engineering
2026-04-14T17:00:00+00:00
8
EARTH AI is a fast-growing tech startup transforming the way critical metals are discovered. Founded in 2016 in Sydney, we are on a mission to secure the future of technology and infrastructure by responsibly sourcing the metals essential for global advancement. Using cutting-edge technology, we identify and develop new metal deposits with unmatched efficiency and sustainability. Our vision is to lead the industry by discovering and owning the majority of future metal deposits while becoming the most advanced, efficient, and responsible metal commodity company in the world.
About the Role
Earth AI is seeking an early to mid-career Data Engineer to join our AI & Geoscience team. This role is focused on building, operating, and scaling production-grade data systems that power Earth AI’s machine learning and analytics platforms. You will engineer data pipelines that transform raw, heterogeneous geoscience data, including geochemistry, geophysics, multispectral imagery, and drilling datasets, into AI-ready, reproducible datasets. Working closely with our Senior AI Researcher and geologists, you will help translate complex field and historical data into robust schemas and reliable data products used in modelling and decision-making.
Key responsibilities:
- Engineer and maintain end-to-end data pipelines for ingestion, standardisation, and quality assurance of geoscience datasets;
- Build transformation and feature-engineering workflows that support analytics and machine learning use cases;
- Design and deploy scalable, production-quality ETL systems using Python and SQL;
- Develop reproducible datasets, schemas, and metadata standards suitable for AI training and inference;
- Operate data pipelines across local high-performance compute environments and cloud infrastructure;
- Apply best practices for geospatial data engineering, including CRS management, spatial joins, and large-scale performance optimisation;
- Work closely with AI researchers and geologists to define data requirements, validation criteria, and delivery timelines;
- Own data engineering tasks from design through deployment and ongoing maintenance.
About You
- Tertiary qualification in Computer Science, Data Science, Software Engineering, or a related field;
- 2–5 years’ experience as a Data Engineer (or similar role) delivering production data pipelines;
- Strong proficiency in Python and SQL, with experience building reliable, testable ETL workflows;
- Demonstrated experience managing the full data lifecycle, from raw ingestion to ML-ready outputs;
- Comfortable working with messy, real-world geoscience data, including missing metadata and evolving schemas;
- Experience with cloud object storage and common geospatial and tabular data formats;
- Working knowledge of geospatial data concepts and tooling (e.g. GeoPandas, GDAL, Rasterio, PostGIS, QGIS, or equivalents);
- Familiarity with version control, code review processes, and basic CI/CD;
- Strong systems thinking, attention to detail, and pragmatic engineering judgement;
- Clear communicator who can collaborate effectively with technical and domain experts;
- Experience with large-scale, distributed, time-series, or survey-grid datasets is highly regarded.
Why Join Us:
- Growth Opportunities: We support professional development and career advancement;
- Collaborative Environment: Work with a team of passionate and supportive colleagues;
- Competitive compensation package, plus quarterly performance bonus;
- Team lunches;
- Company retreats.
- Python
- SQL
- ETL workflows
- Cloud object storage
- Geospatial data concepts and tooling (e.g. GeoPandas, GDAL, Rasterio, PostGIS, QGIS, or equivalents)
- Version control
- Code review processes
- CI/CD
- Machine Learning
- Analysis
- Pipelines
JOB-69d7d6e3defa8
Vacancy title:
Data Engineer
[Type: FULL_TIME, Industry: Professional, Scientific, and Technical Services, Category: Computer & IT, Science & Engineering]
Jobs at:
Earth AI
Deadline of this Job:
Tuesday, April 14 2026
Duty Station:
Alexandria, Sydney NSW | Sydney NSW
Summary
Date Posted: Thursday, April 9 2026, Base Salary: Not Disclosed
Work Hours: 8
Experience in Months: 24
Level of Education: Bachelor's degree
Job application procedure
Visit our website for more information: https://earth-ai.com/