The Analytics Power Hour Episode #251: The Continued Rise of the Analytics Engineer with Josh Cohearst
Release Date: August 6, 2024
Introduction to the Episode
In episode #251 of The Analytics Power Hour, hosts Michael Helbling, Mo Kiss, and Val Krul examine the growing role of the Analytics Engineer within the data and analytics ecosystem. They are joined by Josh Cohearst, an Analytics Engineer at Xebia and co-author of the book Fundamentals of Analytics Engineering. The conversation covers the definition, evolution, and significance of the role, how it differs from traditional data engineering positions, and how it affects organizational structures and data practices.
Defining Analytics Engineering
Josh Cohearst opens the discussion by providing a comprehensive definition of Analytics Engineering:
“The analytics engineer is kind of a bridge between business and data engineering.”
[03:53]
He draws parallels to the evolution of web development, likening the Analytics Engineer to the full stack developer who seamlessly integrates both front-end and back-end tasks. In the analytics domain, this role has emerged to address the increasing demand for data-driven decision-making by blending technical expertise with business acumen.
Analytics Engineer vs. Data Engineer
Val Krul asks how Analytics Engineers differ from traditional Data Engineers. Josh answers:
“A Data Engineer is usually a person that will do like a little bit more low-level stuff... setting up ingestion... validating your data that's coming in. Whereas Analytics Engineers work more in SQL to map raw data to business processes.”
[06:31]
Josh elaborates, emphasizing that while Data Engineers focus on data ingestion, infrastructure, and ensuring the data pipeline is robust, Analytics Engineers are tasked with transforming and modeling this data to make it semantically meaningful for analysts and business users. This involves intricate data modeling techniques and a deep understanding of business logic.
Organizational Structures and Roles
The conversation shifts to how organizations structure their data teams. Josh Cohearst observes a mix of centralized and decentralized models:
“I see a lot, I see a mix really... centralized or with the finance organization. So it's really a mix.”
[25:33]
He notes that Analytics Engineers can be found across various departments (finance, marketing, web analytics), each with its own perspectives and requirements. This decentralization often leads to silos in which analytics teams do not communicate effectively, underscoring the need for a more unified analytics strategy.
Trends in Data Engineering and Analytics
Michael Helbling touches upon the dynamic landscape of data engineering:
“A lot of data engineers are not as familiar with event-driven data... you have to think about how to make a meaningful aggregation of that over a certain period.”
[27:43]
Josh discusses the convergence of traditional data engineering with modern analytics needs, highlighting how Analytics Engineers are pivotal in bridging gaps between streaming data and static data sources, ensuring comprehensive and cohesive data models.
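To make the point about event-driven data concrete, here is a minimal sketch of rolling raw events up into a per-day aggregate. It is not taken from the episode: the event tuples, the field layout, and the function name daily_event_counts are all hypothetical, and a real pipeline would more likely do this in SQL inside the warehouse.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw events: (user_id, event_name, ISO 8601 timestamp).
EVENTS = [
    ("u1", "page_view", "2024-08-01T09:15:00"),
    ("u1", "purchase", "2024-08-01T09:20:00"),
    ("u2", "page_view", "2024-08-01T23:59:00"),
    ("u1", "page_view", "2024-08-02T08:00:00"),
]


def daily_event_counts(events):
    """Roll individual events up into one count per (day, event_name)."""
    counts = defaultdict(int)
    for _user_id, event_name, timestamp in events:
        # Truncate the timestamp to its calendar day; this is the aggregation period.
        day = datetime.fromisoformat(timestamp).date().isoformat()
        counts[(day, event_name)] += 1
    return dict(counts)


DAILY = daily_event_counts(EVENTS)
```

The choice of period (day, hour, session) is a business decision rather than a technical one, which is part of why this work tends to land with the Analytics Engineer.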
Transition from ETL to ELT
A significant portion of the episode covers the shift from Extract, Transform, Load (ETL) to Extract, Load, Transform (ELT) processes. Josh Cohearst explains the technological advancements enabling this transition:
“We went from row-based storage systems to column-based storage systems... and now with powerful single-server databases like DuckDB, things are getting simpler.”
[29:24]
This shift allows organizations to load all data first and then perform transformations as needed, offering greater flexibility and scalability. The hosts discuss how this impacts data modeling and the workflows of Analytics Engineers, facilitating more agile and responsive data practices.
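The load-first, transform-later pattern can be sketched in a few lines. The example below is an illustration only, not something shown in the episode: it uses Python's built-in sqlite3 module as a stand-in for a cloud warehouse, and the table raw_orders, the view daily_revenue, and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical raw export. Amounts arrive as text, exactly as the source sent them.
RAW_ORDERS = [
    ("1001", "2024-08-01", "19.99", "completed"),
    ("1002", "2024-08-01", "5.00", "cancelled"),
    ("1003", "2024-08-02", "42.50", "completed"),
]


def run_elt(rows):
    """Load raw rows untouched, then transform them with SQL inside the database."""
    conn = sqlite3.connect(":memory:")

    # Extract + Load: no cleaning or typing yet, everything lands as text.
    conn.execute(
        "CREATE TABLE raw_orders "
        "(order_id TEXT, order_date TEXT, amount TEXT, status TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", rows)

    # Transform: business logic lives in a view on top of the raw table,
    # so it can be changed later without reloading any data.
    conn.execute(
        """
        CREATE VIEW daily_revenue AS
        SELECT order_date,
               ROUND(SUM(CAST(amount AS REAL)), 2) AS revenue
        FROM raw_orders
        WHERE status = 'completed'
        GROUP BY order_date
        """
    )
    result = conn.execute(
        "SELECT order_date, revenue FROM daily_revenue ORDER BY order_date"
    ).fetchall()
    conn.close()
    return result


DAILY_REVENUE = run_elt(RAW_ORDERS)
```

Because the raw table is kept as loaded, a change to the definition of revenue only requires redefining the view, which is the flexibility the ELT approach is meant to provide.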
Privacy and Governance in Analytics Engineering
The topic of data privacy and governance surfaces as a critical responsibility of Analytics Engineers:
“Analytics Engineers can identify where privacy issues might arise... applying masking strategies to sensitive fields.”
[36:08]
Josh emphasizes that while Analytics Engineers may not manage privacy directly, they play a crucial role in implementing technical safeguards and collaborating with data owners to ensure compliance with data protection standards. This involves setting up processes to mask or anonymize sensitive information before it reaches analysts and business users.
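As a rough illustration of what masking a sensitive field can look like, the sketch below shows two common approaches: partial masking of an email address and salted hashing of an identifier. None of this comes from the episode. The function names, the record layout, and the demo salt are hypothetical, and salted hashing is pseudonymization rather than full anonymization, so it may not satisfy every data protection requirement on its own.

```python
import hashlib


def mask_email(email):
    """Keep the first character and the domain; hide the rest of the local part."""
    local, _, domain = email.partition("@")
    if not local or not domain:
        # Not a recognizable address, so reveal nothing.
        return "***"
    return f"{local[0]}***@{domain}"


def pseudonymize(value, salt):
    """Replace a value with a salted SHA-256 digest.

    The same input always yields the same digest, so joins across tables
    still work without exposing the original value.
    """
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def mask_record(record, salt="demo-salt"):
    """Return a masked copy of a record; the input dict is left unchanged."""
    masked = dict(record)
    masked["email"] = mask_email(record["email"])
    masked["customer_id"] = pseudonymize(record["customer_id"], salt)
    return masked


# Hypothetical customer record.
CUSTOMER = {"customer_id": "C-42", "email": "jane.doe@example.com", "country": "NL"}
MASKED = mask_record(CUSTOMER)
```

In practice the salt would be kept secret and managed outside the code, and the choice between masking, hashing, or dropping a field would be made together with the data owner.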
Professional Development in Analytics Engineering
Mo Kiss raises concerns about career progression within the relatively new field of Analytics Engineering:
“It's been really difficult to help people with their professional development in a space that is quite new.”
[41:06]
Josh offers insights into fostering professional growth, highlighting the importance of continuous learning and adapting to evolving tools and practices. He suggests leveraging resources like their co-authored book and participating in community meetups to build a robust skill set. Additionally, he underscores the value of a well-rounded portfolio that blends technical prowess with strong communication and consulting abilities.
Tools and Best Practices
The episode touches on essential tools and best practices that define the Analytics Engineering role:
- Version Control and Testing: Implementing software development best practices to ensure code reliability and maintainability.
- Data Modeling: Mastery of SQL and understanding of schemas like star and snowflake for effective data representation.
- Cloud Computing: Familiarity with cloud data warehouses like BigQuery and Snowflake, understanding cost management and performance optimization.
- Observability and Data Quality: Utilizing tools and methodologies to monitor data pipelines, ensuring data accuracy and reliability.
Josh stresses that while tools are vital, the underlying concepts and problem-solving approaches are equally important for successful Analytics Engineering.
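The testing and data quality practices listed above often start with very simple checks. The following is a minimal sketch under stated assumptions, not a description of any specific tool discussed in the episode: the helper names check_not_null and check_unique and the sample rows are hypothetical.

```python
def check_not_null(rows, column):
    """Return the indexes of rows where the column is missing or None."""
    return [i for i, row in enumerate(rows) if row.get(column) is None]


def check_unique(rows, column):
    """Return the indexes of rows whose column value was already seen earlier."""
    seen = set()
    duplicates = []
    for i, row in enumerate(rows):
        value = row.get(column)
        if value in seen:
            duplicates.append(i)
        seen.add(value)
    return duplicates


# Hypothetical modeled table with two deliberate problems:
# a missing customer and a repeated order_id.
ORDERS = [
    {"order_id": 1, "customer": "acme"},
    {"order_id": 2, "customer": None},
    {"order_id": 1, "customer": "globex"},
]

FAILURES = {
    "customer_not_null": check_not_null(ORDERS, "customer"),
    "order_id_unique": check_unique(ORDERS, "order_id"),
}
```

Keeping checks like these in version control next to the transformation code is one way to apply the software development practices mentioned in the list.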
Conclusion and Final Thoughts
As the episode wraps up, the hosts and Josh reflect on the rapid growth and importance of Analytics Engineering in today's data-driven landscape. They encourage listeners to embrace the evolving roles within analytics, emphasizing the need for continuous adaptation and collaboration across departments.
Josh concludes with a call to action for aspiring Analytics Engineers to define their roles clearly, build a strong foundation in both technical and business skills, and actively engage with the community to stay abreast of industry trends.
Notable Quotes:
- “The analytics engineer is kind of a bridge between business and data engineering.” Josh Cohearst [03:53]
- “Analytics Engineers can identify where privacy issues might arise... applying masking strategies to sensitive fields.” Josh Cohearst [36:08]
- “I think Analytics Engineering is trying to take that a little bit away from all the different analysts and put it into one central place.” Josh Cohearst [08:50]
Additional Resources and Community Engagement
At the end of the episode, the hosts share recommended readings and resources for further exploration:
- Josh Cohearst's Last Call: An article by Gergely Orosz on The Pragmatic Engineer about tech compensation trends.
- Val Krul's Last Call: A Medium article on emerging UX patterns in generative AI experiences.
- Mo Kiss's Last Call: An article from FS Blog discussing first principles in engineering and the importance of innovative leadership.
Upcoming Events:
- MeasureCamp Chicago: Scheduled for Saturday, September 7th, at the Leo Burnett Building in downtown Chicago.
- MeasureCamp Sydney: Scheduled for Saturday, October 26th, in Sydney, Australia.
Listeners are encouraged to join these events to engage with the analytics community, share insights, and continue their professional development.
Stay Connected: For more insights and discussions, connect with The Analytics Power Hour community via LinkedIn, email, or the Measure Slack chat group. Share your thoughts, questions, and experiences to keep the conversation going.
Keep analyzing and stay ahead in the ever-evolving world of analytics!
