SITE RELIABILITY ENGINEER - 13429
Anywhere, USA
LMI seeks a Site Reliability Engineer (SRE) to support the U.S. Army Center for Initial Military Training’s (CIMT) Holistic Health & Fitness Management System (H2FMS). H2FMS is hosted in Army GovCloud and integrates data from the vendor-provided H2F application into a secure cloud environment supporting analytics, AI/ML, and readiness dashboards. The SRE ensures the reliability, scalability, performance, and operational integrity of the H2FMS environment by implementing monitoring tools, managing system access and logs, supporting incident response, and improving system resilience through automation.
This role directly supports the Cloud Architect, DevSecOps team, ISSM/ISSOs, Cybersecurity Engineers, Data Engineers, Full Stack Developers, and the Technical PM.
LMI is a new breed of digital solutions provider dedicated to accelerating government impact with innovation and speed. Investing in technology and prototypes ahead of need, LMI brings commercial-grade platforms and mission-ready AI to federal agencies at commercial speed.
Leveraging our mission-ready technology and solutions, proven expertise in federal deployment, and strategic relationships, we enhance outcomes for the government, efficiently and effectively. With a focus on agility and collaboration, LMI serves the defense, space, healthcare, and energy sectors—helping agencies navigate complexity and outpace change. Headquartered in Tysons, Virginia, LMI is committed to delivering impactful results that strengthen missions and drive lasting value.
Responsibilities
- Monitor the health, performance, and availability of H2FMS applications, services, APIs, and data services in Army GovCloud.
- Troubleshoot system issues across application, data, and infrastructure layers.
- Implement reliability patterns such as redundancy, graceful degradation, and failover strategies.
- Support performance optimization activities based on monitoring metrics and trends.
- Manage user access controls, role-based permissions, and environment access configurations.
- Maintain, monitor, and archive system logs, audit logs, and access logs to support RMF and cATO requirements.
- Support ISSO and Cybersecurity teams in log retrieval, incident investigations, and audit preparation.
- Develop and maintain automation scripts to improve environment stability, operational workflows, and deployment reliability.
- Collaborate with DevSecOps engineers to integrate automated runtime checks, monitoring, and health checks within CI/CD pipelines.
- Assist in implementing automated scaling, alerting, and self-healing mechanisms.
- Participate in incident response activities, including detection, diagnosis, escalation, mitigation, and documentation.
- Coordinate with cybersecurity teams during security events or anomalies.
- Conduct root-cause analysis and contribute to long-term corrective actions.
- Maintain environment configuration inventories related to access, logging, monitoring, and deployment parameters.
- Support configuration management, patch activities, and version control for infrastructure and application components.
- Collaborate with the Cloud Architect on environment design updates and capacity planning.
- Document system configurations, access processes, log retention procedures, and environment health dashboards.
- Support the ISSM and ISSO teams in continuous monitoring package updates and RMF documentation.
- Maintain audit-ready artifacts related to reliability operations and environment management.
Qualifications
Required
- Bachelor’s degree in Information Technology, Computer Science, Engineering, Cybersecurity, or a related field.
- 3–6 years of experience in cloud operations, SRE, DevOps, or system administration roles.
- Hands-on experience with cloud monitoring, logging, and performance management tools (AWS CloudWatch, Azure Monitor, ELK/Splunk, Prometheus/Grafana, etc.).
- Experience with automation tools (Python, Bash, Terraform, Ansible, etc.).
- Familiarity with RMF, Zero Trust, and DoW cloud security requirements.
- Understanding of CI/CD pipelines and deployment processes.
- Ability to obtain and maintain a DoD Secret clearance.
- Location: Remote.
- Travel: 1–2 trips per quarter to Fort Eustis, VA or LMI HQ in Tysons, VA.
Desired
- Experience supporting DoW programs or operating in secure cloud environments (AWS GovCloud, Azure IL4/IL5, cARMY).
- Experience with container orchestration (Kubernetes/EKS/AKS).
- Familiarity with incident response processes and SRE best practices (SLOs, SLIs, error budgets).
- Certifications such as AWS SysOps, AWS Cloud Practitioner, Azure Administrator, or equivalent.
Target salary range: $140,000 - $170,000.
Disclaimer: The salary range displayed represents the typical salary range for this position and is not a guarantee of compensation. Individual salaries are determined by various factors including, but not limited to, location, internal equity, business considerations, client contract requirements, and candidate qualifications such as education, experience, skills, and security clearances.