VedaPointe is seeking a senior data pipeline engineer to join our Data Engineering and Analytics team. In this role, you’ll assist our architects in maturing our data operations while working with your team to design, build, and optimize our data infrastructure and pipelines. You’ll also collaborate closely with cross-functional teams to deliver robust data solutions that support our products and data initiatives in the healthcare industry.
Day to day, you’ll work on a team of data engineers led by an architect and a team lead. You’ll be responsible for a range of technologies supporting the ingestion, storage, transformation, and consumption of healthcare data, which will include personally identifiable information, protected health information, and other forms of confidential data for VedaPointe and its partners. There will be a strong focus on building high-performing, reliable, and secure data pipelines running on technologies deployed to Amazon Web Services. As an organization, we use Azure DevOps as our code repository and documentation hub, to run our DevOps pipelines, and to plan and track work.
Responsibilities
- Design, develop, and maintain manageable and scalable data pipelines and ETL processes to support data acquisition, integration, and delivery for upstream applications, analytics, and other data-driven initiatives
- Design and implement advanced data storage solutions (databases, data warehouses, and data lakes) and efficient data transport layers (REST APIs, message queues, message buses)
- Collaborate with executive leadership, product owners, technical architects, development teams, governance stakeholders, and other key personnel to gather, define, and document technical requirements for data pipelines that align with business objectives
- Create and maintain analytics dashboards providing actionable insights into customer acquisition, operational efficiency, and other critical business performance metrics
- Continuously optimize data architectures, data transformations, and data pipelines to enhance performance, reliability, and scalability
- Apply software engineering best practices to ensure data quality, integrity, validity, availability, and compliance
- Proactively identify, troubleshoot, and resolve production data-related issues
- Create high-impact data tools that help analytics and data science teams move faster
- Participate in design, code development, code reviews, testing, data quality monitoring, deployment activities, and operations support
- Contribute to overall team growth by staying current with and evaluating emerging data technologies and industry trends, and sharing what you learn with colleagues
Qualifications
- Bachelor’s or Master’s degree in Computer Science, Information Technology, Statistics, or a related field
- 5+ years of experience in data engineering with a modern tech stack
- Strong proficiency in Python, Java, or Scala
- Advanced SQL skills and extensive experience with relational databases (e.g., PostgreSQL, MS SQL Server)
- Hands-on experience with AWS and cloud data warehouses (e.g., Redshift, ClickHouse)
- Expertise in building batch and streaming data integration solutions (e.g., Kafka, Spark Streaming), data replication (e.g., AWS DMS, Airbyte, Debezium), and data transformation and enrichment (Python, SQL, Spark)
- Experience with data orchestration and workflow tools such as Airflow, Prefect, or AWS Step Functions
- Solid understanding of data modeling, data architecture, and data governance principles
- Exceptional problem-solving abilities, attention to detail, and strong communication and collaboration skills
- Prior experience designing, implementing, and supporting data pipeline architectures that include remote data ingestion, data orchestration, data governance, and data transformations
- Preference for candidates in the St. Louis, Missouri area (hybrid), but open to eligible candidates residing in the continental United States (remote)
Nice to Have
- Experience with containerization and orchestration tools (Docker, Kubernetes)
- Proficiency in Infrastructure as Code (IaC) technologies (e.g., Terraform, CloudFormation)
- Familiarity with CI/CD tooling and workflows
- Understanding of machine learning and data science workflows
- Extensive experience with big data technologies, focusing on data quality and observability
- Knowledge of NoSQL databases (e.g., MongoDB, Elasticsearch, Cassandra)
IMPORTANT:
- Only resumes in PDF format will be accepted.
- We do not provide immigration sponsorship.
Do you believe you are a fit? Please send your resume/CV to [email protected].