2024 Conference Programme

Managing the Lifecycle of Datasets at Scale: Pipelines + Storage + Access

10 Oct 2024
AI, Machine Learning & Advanced Analytics Theatre

Every day, data grows not only in size but also in value. People routinely make data-driven decisions, from whether to wear rain boots to what to invest in (e.g., a pharmaceutical company with the next blockbuster weight-loss drug). Bloomberg's Data Platform Engineering team manages diverse financial datasets. To scale, the team built configurable workflows that standardize a broad variety of structured and unstructured data and make it machine readable. These datasets are then delivered to internal clients and external customers through various applications and APIs. In this session, we will review one of Bloomberg's data pipeline architectures used to onboard hundreds of datasets. We will explore our extensive use of Apache Airflow to orchestrate ingestion, horizontally scaled PostgreSQL clusters, and Trino to access and combine disparate datasets. This session will inspire new ideas and strategies for standardizing and scaling your data.

Speakers
Christopher Hong, Software Engineering Team Lead - Bloomberg
Yenny Su, Software Engineer - Bloomberg

Knowledge Partner

Frost & Sullivan

 


Singapore MICE Sustainability Certification - BRONZE