SAIC has an immediate need for a Senior Systems Administrator to support the NASA Langley Information Technology Enhanced Services II (LITES II) contract at NASA's Langley Research Center in Hampton, VA. The LITES II contract provides IT support services in the areas of science and engineering applications, some center infrastructure applications, data center support, business management applications support, and IT project management support. As a member of the LITES II team, this Senior Systems Administrator will manage and support the Linux server infrastructure, desktops, and laboratory and scientific equipment that support the Langley research mission.
- Ensure the day-to-day operation and ongoing support of LITES II Servers and Systems including event, fault, and performance management systems.
- Develop solutions to monitor, notify, and correct anomalies/alarms.
- Design the environment, hardware specifications, and software layers to provide for a reliable, scalable, secure and highly available Linux Systems Environment.
- Find gaps in the Linux environment and use analytical skills and knowledge of database methodologies to solve problems and provide solutions.
- Ensure that server hardware and operating systems are installed, properly configured, and operating as expected on production, development, or lab systems running various versions of Linux and Windows.
- Configure operating systems and utilities to support customer requirements.
- Lead efforts and work closely with database and application administrators, data/telecommunication analysts, researchers, outside vendors, and application programmers to engineer solutions or resolve problems. Act as a lead consulting and training resource for LITES II servers and systems.
- Provide guidance to engineers/analysts on the configuration of server hardware and architectures to meet application and service requirements.
- Lead IR&D efforts to identify new solutions, and communicate solutions and progress to Senior and Program Management.
- Work closely with vendors or other groups within the organization to correct problems or perform root cause analysis.
- Analyze and make formal recommendations on optimization and capacity planning for systems.
- Lead the design, development, and maintenance of process automation scripts and utilities written in Bash, Python, and Ansible.
- Develop and maintain documentation related to the design, installation, administration, and maintenance of multiple Linux systems in a distributed computing environment.
- Ensure stability, viability and maintenance of the 24/7 mission-critical production environment.
- Prepare weekly reports on significant events. Write test plans and conduct product evaluations in lab environments.
- Perform system and application account management activities.
- Participate in a 24/7 on-call rotation to support critical incident management events.
- Design and maintain a repeatable, documented patch deployment methodology for administrators to implement.
- Lead projects to optimize system functionality, diagnose complex performance issues, and facilitate recovery of failed systems in accordance with customer Service Level Agreements.
Required Education and Experience:
- Bachelor’s degree and 5+ years of related experience. Years of experience may be accepted in place of degree.
- Experience in server architectures, storage architectures, virtualization, monitoring tools, backup/restore, performance tuning/capacity planning, documentation, server management.
- Must have experience with UNIX server administration, including the Red Hat Linux platform, backup/restore, performance tuning, DNS, TCP/IP, and clustering.
- Demonstrated knowledge of and experience with VMware hypervisors and architecture.
- Solid understanding of data and telecommunication technologies and networking fundamentals.
- Working knowledge of database management, security, and data access technologies.
- Demonstrated excellent interpersonal and oral/written communication skills for effective interaction with customers and co-workers.
- Proven ability to provide project leadership as well as perform routine tasks with minimal supervision.
- High Performance Computing skills desired.
- Must be able to obtain a Public Trust security clearance.
Desired Qualifications:
- Bachelor’s degree in Computer Science or Cyber Security
Desired Skills / Experiences:
- Knowledge of best practices from large scale enterprise deployments.
- Ability to develop and maintain Python programs or shell scripts in a UNIX environment for automation
- Prior experience in a Highly Available Linux Systems Environment
- Prior experience in a Linux High Performance Computing Environment
- Knowledge of Ansible and/or Yum
- Knowledge of Kickstart procedures
Desired Certifications / Training:
- Red Hat, VMware, Microsoft Windows, Ansible, or Python programming