Data Engineer


CAQH recognizes that its most important asset is its growing team of smart, creative, collaborative, forward-thinking and passionate professionals – and that a comprehensive employee benefits package is an important factor for them in choosing where to work. CAQH offers competitive compensation along with an extensive benefits package for all full-time employees, including medical, dental and vision coverage, tuition assistance and a 401(k). We offer full-time remote work to all staff from any location and maintain a physical office (with many amenities) in downtown Washington, D.C.

Who we are:

Named one of the "Best Places to Work" by Modern Healthcare for five consecutive years, CAQH has helped nearly 1,000 health plans, 1.6 million providers, government entities and vendors connect, exchange information and operate more efficiently. CAQH's technology-enabled solutions and its Committee on Operating Rules for Information Exchange (CORE) bring the healthcare industry together to make sharing business information more automated, predictable and consistent. CAQH Explorations researches opportunities to reduce the burden of manual processes in healthcare administration.


CAQH Solutions
Reports to: Chief Data Officer

Data is CAQH’s most critical asset, and you will change how we understand and employ it. CAQH is seeking an action-oriented Data Engineer for our Data Science team. The Data Engineer will lead and implement data engineering projects, support and maintain data pipelines, and provide expertise and best practices in data engineering for staff across the company. Typical projects focus on building new data pipelines in our cloud environment, improving performance, and adding features to existing pipelines. As needed, the Data Engineer will design and develop new pipelines as part of the Data Science team, and will help determine and implement improvements to pipelines, systems, and infrastructure, with a focus on data quality and accuracy. The right candidate will have a passion for building cloud-enabled pipelines for large data sets and for working with stakeholders to improve business outcomes.

This role is full-time, exempt and reports to the Chief Data Officer. 

Specific Responsibilities
  • Demonstrates an understanding of business operations in one or more areas and applies data engineering practices to anticipate and deliver information supporting those areas.
  • Provides support and recommendations to intermediate-level engineers to enhance their technical skills.
  • Performs rapid data profiling of internal and external data sources to determine quality and value to business questions.
  • Counsels business analysts and data scientists on available data sources and potential biases within datasets.
  • Participates in enterprise-wide agile ceremonies, determining and recommending technical approaches and solutions for complex applications.
  • Provides ad-hoc support for incidents related to data warehouses, datasets, and data pipelines.
  • Provides operational support for systems and applications troubleshooting and maintenance support.
  • Upgrades supported applications to current state versions in accordance with vendor specifications.
  • Constantly evaluates internal departmental workflows and standard procedures to ensure efficiencies and requirements are being met.
  • Designs and documents conceptual, logical, and physical data models (including entity relationship (ER) diagrams).
  • Evaluates requests for systems development and enhancement.
  • Understands and anticipates information needed by the business in order to build intelligent, supportable data pipelines.
  • Develops, tests, and deploys ELT/ETL processes across multiple sources, targets, and tools.
  • Develops, tests, and deploys self-service reporting assets to provide data exploration capabilities to end users.
  • Collaborates across multiple teams to deliver timely and reliable information that enables users to uncover insight from data.
  • Utilizes best practices and organizational standards around the Software Development Lifecycle to ensure the production of high quality, reliable data assets.
  • Follows enterprise architectural standards and designs for building data models using a variety of source and target systems.
  • Adheres to employee standards and laws governing the use of sensitive data as it relates to Protected Health Information (PHI) and Personally Identifiable Information (PII).
  • Captures, documents, and grooms features and stories for agile/scrum teams to maximize visibility and readiness for upcoming planning and execution.
  • Participates in Data Governance related to the management of data assets.
  • Anticipates and documents upcoming data needs of users by identifying commonly used data in source systems through metadata analysis.
  • Communicates with data consumers, sharing best practices and fostering a growth mindset to improve the performance of processes that connect to data.
  • Develops, monitors, and maintains processes to track overall health of platforms and systems and identifies opportunities to tune processes connecting to data.
  • Develops and maintains Continuous Integration and Continuous Delivery (CI/CD) processes focused on acceleration of value delivery and minimization of manual touchpoints.
  • Performs data cleansing and profiling to ensure accuracy and quality of information.
  • Provides on-call support for data platforms, reporting assets, and ETL processes.
Knowledge, Skills and Abilities
  • Experience with relational database management systems (Oracle/Microsoft SQL Server)
  • Familiarity with at least one commonly used programming language
  • Knowledge of software development life cycle and best practices
  • Data wrangling experience using a variety of tools and languages
  • Experience with SQL
  • Strong technical writing and presentation skills
  • Data visualization experience in common business intelligence tools (Tableau, Power BI, R/Shiny)
  • Familiarity with project management workflow tracking software such as JIRA/Trello
  • Experience with lean-agile principles and framework for project completion
  • Capability to design and document conceptual, logical, and physical data models for relational and dimensionally modeled databases
  • Interpret process performance outputs and improve workflow performance for affected jobs
  • Shell scripting (Bash, csh, etc.)
  • Familiarity with modern source/version control tools (Git, CodeCommit, Subversion)
  • Familiarity with various raw data source types and how to interpret them (unstructured: JSON/BSON, flat files, XML, etc.)
  • Experience with scripting languages such as R and Python
  • Experience interacting with and extracting data from modern web APIs
  • Knowledge of data streaming or real-time data processing (Kafka, Confluent)
  • Experience with cloud platforms such as Azure and Snowflake

5 or more years of experience manipulating data sets and building statistical models


Bachelor's degree in Computer Science, Engineering or a relevant field; Master's or PhD in Data Science or another quantitative field is preferred.
