We are seeking a highly experienced and skilled Senior Data Lake Engineer to join our team. As the Senior Data Lake Engineer, you will play a critical role in establishing and configuring an enterprise-level Databricks solution to support our federal customer organization's data lake initiatives. This position offers a unique opportunity to work with cutting-edge technologies and shape the future of our federal customer's data infrastructure.
If you are a highly skilled and experienced Senior Data Lake Engineer with expertise in Databricks and a passion for building scalable and secure data lake solutions, we would like to hear from you.
This position has an on-site requirement of 5 days a week in Arlington, VA.
Responsibilities:
- Lead the design, implementation, and configuration of an enterprise Data Lake solution utilizing Databricks, ensuring scalability, reliability, and optimal performance.
- Collaborate with cross-functional teams to gather requirements, understand data integration needs, and define data lake architecture and governance policies.
- Establish and configure Databricks workspaces, clusters, and storage components, optimizing the solution for efficient data processing, query performance, and data governance.
- Design and implement data ingestion pipelines to efficiently extract, transform, and load data from various sources into the data lake using Databricks tools and services.
- Develop and maintain data lake security frameworks, including access controls, encryption solutions, and data masking techniques to protect sensitive data.
- Collaborate with data engineers and data scientists to optimize data pipelines, develop data transformations, and ensure data quality and integrity.
- Monitor and tune Databricks clusters and workloads to ensure performance, reliability, and cost optimization, utilizing automated scaling and resource management techniques.
- Implement best practices for data governance, data cataloging, metadata management, and data lineage within Databricks, adhering to regulatory and compliance requirements.
- Collaborate with infrastructure teams to ensure data lake infrastructure meets scalability and availability requirements, leveraging Databricks cluster management and AWS/Azure services.
- Develop and maintain documentation and guidelines related to the Databricks solution, including architecture diagrams, standards, and processes.
- Stay up to date with the latest advancements in Databricks, big data technologies, and cloud platforms, continuously evaluating and implementing new features and capabilities.
- Provide technical guidance and mentorship to junior data engineers, promoting best practices and fostering a culture of continuous learning and growth.
- Collaborate with stakeholders to understand their data analytics and reporting needs, and develop scalable data models and data transformation processes to support these requirements.
- Support data lake-related incident resolution by troubleshooting data quality issues, performance bottlenecks, and other data-related challenges.
- Collaborate with data governance and compliance teams to ensure data privacy, security, and compliance guidelines are adhered to within the data lake solution.
- Participate in the evaluation and selection of new tools, technologies, and services to enhance the data lake infrastructure.