We are seeking a highly experienced and skilled Senior Data Lake Engineer to join our team. As the Senior Data Lake Engineer, you will play a critical role in establishing and configuring an enterprise-level Databricks solution to support our federal customer organization's data lake initiatives. This position offers a unique opportunity to work with cutting-edge technologies and shape the future of our federal customer's data infrastructure.
If you are a highly skilled and experienced Senior Data Lake Engineer with expertise in Databricks and passion for building scalable and secure data lake solutions, we would like to hear from you.
Responsibilities:
- Lead the design, implementation, and configuration of an enterprise Data Lake solution utilizing Databricks, ensuring scalability, reliability, and optimal performance.
- Collaborate with cross-functional teams to gather requirements, understand data integration needs, and define data lake architecture and governance policies.
- Establish and configure Databricks workspaces, clusters, and storage components, optimizing the solution for efficient data processing, query performance, and data governance.
- Design and implement data ingestion pipelines to efficiently extract, transform, and load data from various sources into the data lake using Databricks tools and services.
- Develop and maintain data lake security frameworks, including access controls, encryption solutions, and data masking techniques to protect sensitive data.
- Collaborate with data engineers and data scientists to optimize data pipelines, develop data transformations, and ensure data quality and integrity.
- Monitor and tune Databricks clusters and workloads to ensure performance, reliability, and cost optimization, utilizing automated scaling and resource management techniques.
- Implement best practices for data governance, data cataloging, metadata management, and data lineage within Databricks, adhering to regulatory and compliance requirements.
- Collaborate with infrastructure teams to ensure data lake infrastructure meets scalability and availability requirements, leveraging Databricks cluster management and AWS/Azure services.
- Develop and maintain documentation and guidelines related to the Databricks solution, including architecture diagrams, standards, and processes.
- Stay up to date with the latest advancements in Databricks, big data technologies, and cloud platforms, continuously evaluating and implementing new features and capabilities.
- Provide technical guidance and mentorship to junior data engineers, promoting best practices and fostering a culture of continuous learning and growth.
- Collaborate with stakeholders to understand their data analytics and reporting needs and develop scalable data models and data transformation processes to support these requirements.
- Support data lake-related incident resolutions, troubleshooting data quality issues, performance bottlenecks, and other data-related challenges.
- Collaborate with data governance and compliance teams to ensure data privacy, security, and compliance guidelines are adhered to within the data lake solution.
- Participate in the evaluation and selection of new tools, technologies, and services to enhance the data lake infrastructure.