We are seeking a highly skilled Senior Databricks Platform Engineer to design, implement, and maintain our enterprise-level Databricks platform supporting a federal customer organization's unified data initiative. This role's primary focus is platform management—centered on building scalable infrastructure, governance frameworks, and operational excellence—with secondary responsibilities in data engineering to support platform optimization and best practices.
You will serve as the steward of our Databricks platform, acting as the frontline technical owner responsible for creating robust, scalable processes and frameworks that enable multiple data teams to work efficiently and securely. This position offers a unique opportunity to work with cutting-edge cloud data technologies and shape the foundational infrastructure of our federal customer's data ecosystem.
Responsibilities:
- Maintain an enterprise-scale Databricks platform, managing multiple workspaces.
- Build and maintain platform observability through development of KPIs, executive dashboards, and monitoring solutions that provide visibility into system performance, resource consumption, cost trends, and compliance adherence.
- Develop and implement scalable user provisioning and de-provisioning processes.
- Design and manage RBAC (Role-Based Access Control) frameworks and security groups.
- Integrate with enterprise identity management systems (SAML, SCIM, Active Directory).
- Manage Unity Catalog implementation and governance structures.
- Oversee metastore configuration, catalog hierarchies, and data organization.
- Implement data lineage, auditing, and compliance frameworks.
- Design and enforce compute policies, cluster configurations, and pool management strategies.
- Implement cost optimization frameworks and resource allocation policies.
- Monitor and optimize cluster utilization and performance.
- Manage serverless SQL warehouses and compute resource governance.
- Establish and manage connections to diverse source systems (databases, APIs, file systems).
- Configure and maintain storage integrations (S3, ADLS, GCS, external locations) and mount points, and set up data access patterns.
- Implement and monitor secure credential management and secrets handling.
- Build and configure Databricks workspaces, clusters, notebooks, jobs, and Delta Lake storage, integrating with AWS services such as S3, IAM, and KMS.
- Implement and maintain security controls including IAM policies, cluster security configurations, KMS-based encryption, and data masking to protect sensitive data.
- Monitor and tune Databricks clusters and job workloads for reliability, cost optimization, and performance, leveraging autoscaling and workload management.
- Apply data governance best practices including cataloging, metadata management, and data lineage using Databricks Unity Catalog and AWS-native capabilities.
- Collaborate with cloud infrastructure teams to configure supporting AWS components such as S3 storage, networking, logging, monitoring, and access controls.
- Maintain detailed technical documentation including solution designs, data flow diagrams, configuration standards, and operational procedures.
- Stay up to date with advancements in Databricks, Delta Lake, Spark, and AWS services, integrating new features that improve automation and efficiency.
- Partner with data engineers, analysts, and scientists to implement data models and reusable transformation patterns within Databricks and AWS.
- Troubleshoot and resolve platform-level issues including but not limited to workspace connectivity, cluster startup failures, authentication problems, and infrastructure performance degradation.
- Ensure compliance with data governance, privacy, and security requirements by applying secure architecture patterns and validating controls throughout the data lifecycle.
- Support the evaluation and hands-on testing of new AWS and Databricks features or services that enhance the Data Lake environment.