Modern Data Team Hats

Different hats in a team. Picture from Storyset.

This blog post was written jointly by Martin Rusnak and Bujar Bakiu.

  • Lack of communication between teams. Requirements that one team prioritized were not aligned with the priorities of the other teams. For instance, if the Data Science team needed to explore the new marketing campaign data, it had to wait for the Data Engineering team to make that data available.
  • Considering solutions in isolation. Data Scientists might optimize for accuracy during testing and evaluation without considering how the solution performs during inference. Serving the model in production then becomes a huge challenge for the operations team.
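
One way to avoid evaluating a solution in isolation is to report accuracy and inference latency side by side, so both the Data Science and the operations requirements are visible in the same place. The snippet below is only a minimal sketch of that idea: the `evaluate` and `within_budget` helpers, the toy `threshold_model`, and the thresholds are all hypothetical names and values chosen for illustration, not something taken from a specific tool.

```python
import time
from typing import Callable, Sequence


def evaluate(
    predict: Callable[[float], int],
    inputs: Sequence[float],
    labels: Sequence[int],
) -> dict:
    """Return accuracy and mean per-prediction latency for one model."""
    start = time.perf_counter()
    predictions = [predict(x) for x in inputs]
    elapsed = time.perf_counter() - start

    correct = sum(p == y for p, y in zip(predictions, labels))
    return {
        "accuracy": correct / len(labels),
        "latency_ms": 1000 * elapsed / len(inputs),
    }


def within_budget(report: dict, min_accuracy: float, max_latency_ms: float) -> bool:
    """A model passes only if it meets BOTH the accuracy and the latency requirement."""
    return (
        report["accuracy"] >= min_accuracy
        and report["latency_ms"] <= max_latency_ms
    )


# Hypothetical toy model: a threshold classifier standing in for a real one.
def threshold_model(x: float) -> int:
    return int(x > 0.5)


inputs = [0.1, 0.4, 0.6, 0.9]
labels = [0, 0, 1, 1]

report = evaluate(threshold_model, inputs, labels)
print(report)
print(within_budget(report, min_accuracy=0.9, max_latency_ms=50.0))
```

In a real project the latency budget would come from the team that operates the service, which is exactly the kind of requirement that gets lost when teams work separately.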

Data Engineer

  • Orchestration, e.g. Airflow, Dagster, Prefect
  • Data processing, e.g. Pandas, Spark, Dask
  • Data warehousing, e.g. BigQuery, Redshift, Hive
  • Data versioning, e.g. DVC, Pachyderm

Analytics Engineer

  • Data warehousing, e.g. BigQuery, Redshift, Snowflake
  • Transformation, e.g. dbt, Dataform

Data Analyst

  • Visualization, e.g. Metabase, Looker, Power BI, Tableau
  • Transformation, e.g. dbt, Dataform, SQL

Data Scientist

  • ML libraries, e.g. scikit-learn, XGBoost
  • Deep Learning libraries, e.g. TensorFlow, PyTorch
  • Experiment tracking, e.g. MLflow, Kubeflow, Aim
  • Feature store, e.g. Feast, Hopsworks
  • Explainability, e.g. LIME, SHAP

Machine Learning Engineer

  • Orchestration, e.g. MLflow, Kubeflow, Flyte, Kubernetes
  • Model serving, e.g. Seldon Core, BentoML, TensorFlow Serving, TorchServe
  • Training, e.g. Horovod, Ray
  • Feature store, e.g. Feast, Hopsworks


  • Model monitoring, e.g. WhyLabs, Evidently
  • Automation, e.g. GitLab CI, GitHub Actions
  • Infrastructure, e.g. Terraform, Kubernetes, Helm charts

Product Manager




