Always Connecting, Always Evolving.
TECHEAD is seeking qualified applicants for the following Direct Hire position: HPC Engineer, TS/SCI, Charlottesville, VA (JOB-22344). If you are looking for a new opportunity and this position looks like a fit, please apply to see the TECHEAD difference that has made us successful for 30+ years!
You can find out more about our team and values at TECHEAD.com or on Glassdoor.
Job Description:
As an HPC Engineer, you will design, optimize, and maintain high-performance computing environments that power large-scale data processing, simulation, and research operations. Your expertise will directly enable data-intensive research efforts crucial to national defense.
Key Responsibilities
- System Architecture & Administration: Manage and optimize complex HPC cluster systems, including RHEL servers, CPU/GPU compute nodes, multi-petabyte Lustre file systems, and high-performance storage arrays.
- Network & Interconnect Management: Maintain high-speed, low-latency interconnects using Ethernet, Fiber, Omni-Path, and InfiniBand.
- Infrastructure & Automation: Develop functional IT requirements, plan and schedule new hardware and software installations, and leverage automation frameworks (Ansible) alongside scripting (Python, Bash, Perl) for systems administration.
- Platform & Resource Optimization: Configure and optimize job schedulers (Slurm, PBS Pro, Torque) and system resources (GitLab, Bright Cluster Manager, Lua/Tcl modules).
- Troubleshooting & Diagnosis: Resolve complex hardware, software, and network dependency issues, managing everything from network distributed services to backups and data archiving.
- User Support & Documentation: Provide technical leadership to staff, support user-developed software, and maintain detailed system configuration guides alongside user-facing documentation.
- Stakeholder Engagement: Create and deliver technical presentations to both technical and non-technical stakeholders.
Qualifications & Requirements
- Clearance: Active TS/SCI required for this position.
- Degree: Bachelor’s degree in Computer Engineering, Computer Science, or a related technical field.
- Experience: 10+ years of directly related experience in HPC cluster systems design, installation, and maintenance.
- Operating Systems: Advanced Linux/Unix administration (installation, networking, security compliance, patching, and data archiving).
- HPC Theory: Thorough understanding of distributed computing theory, parallel processing, and associated infrastructure.
- Storage & File Systems: Hands-on experience with parallel file systems (e.g., Lustre) and storage arrays.
- Virtualization & Containers: Experience with virtualization environments (VMware) and modern container ecosystems (Docker, Kubernetes).
TECHEAD’s mission is to make our on-site associates successful by placing them in the right environment so they can grow and prosper. How we treat and respond to our clients and employees is a reflection of who we are and makes us stand out from the rest. Keeping our business focused on building and maintaining relationships with our employees and clients is the key to our success. We won’t strive for anything less.
TECHEAD provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, disability, genetic information, marital status, amnesty, or status as a covered veteran in accordance with applicable federal, state and local laws governing nondiscrimination in employment in every location in which the company has facilities. This policy applies to all terms and conditions of employment, including, but not limited to, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation, and training.
For more information on TECHEAD please visit www.techead.com.
No second parties will be accepted.
