How to Apply
A cover letter and resume are required. The cover letter must be PAGE 1 of your resume and should:
- Specifically outline the reasons for your interest in the position and
- Outline your particular skills and experience that directly relate to this position.
Starting salary will vary depending on the qualifications and experience of the selected candidate.
NOTE: This is a term-limited position ending after three (3) years, with the intent to re-evaluate for possible extension.
The Advanced Research Computing - Technology Services (ARC-TS) organization at the University of Michigan has an exciting opportunity for a High Performance Computing (HPC) Systems Administrator. ARC-TS contributes to some of today's most interesting challenges, such as machine learning, drug discovery and the simulation of galaxies. In this role, our selected candidate will have an opportunity to engage with these problems and help accelerate their solutions.
Our selected candidate will be responsible for building, operating and supporting research computing platforms for university researchers, both on premises and in the cloud. These platforms can consist of High Performance Computing (HPC) Linux clusters, High Throughput Computing and monitoring/logging systems (Elasticsearch, Graphite). Our selected candidate will work closely with ARC-TS team members, with input from unit support staff, to provide the next generation of research computing infrastructure, and will have the opportunity to be part of a dynamic team that builds and supports new and innovative systems to meet the changing needs of faculty.
For more information about ARC-TS, please visit our website: http://arc-ts.umich.edu/ and for more information about ITS, please visit our website: http://its.umich.edu/.
While not limited to those listed below, the HPC Systems Administrator should expect these key responsibilities:
Operating System Support
- Monitoring HPC systems for availability and providing remediation
- Updating system software
- Configuring kernel modules
- Node provisioning and de-provisioning
- Creating and maintaining documentation
User Support for Systems-related Issues
- Installing and debugging software
- Assisting users in troubleshooting batch compute jobs
- Investigating ways to improve user experience
System Capacity Planning (data centers, networking, etc.)
- Maintain node/VM inventory
- Manage data center/virtual data center use
Collaboration and Research
- Work with other ARC-TS and ARC-TS affiliated persons to support computational research around the University.
- Stay abreast of application technology trends in scientific hardware and environments (computers, accelerators, system management methods, etc.).
While not limited to the following, the HPC Administrator will be expected to demonstrate the following organizational competencies:
Advancing the Mission
- Demonstrates knowledge of the primary mission of the University and Michigan Medicine.
- Demonstrates awareness of the diversity of constituency groups, their roles, purposes and issues.
Creative Problem Solving / Strategic Thinking
- Demonstrates ability to give appropriate attention to problems of varying difficulty, often multitasking to solve moderately complex problems.
- Defines problems, analyzes causes, identifies possible solutions, selects the best solution, and develops action plans. Generates new ideas and goes beyond the status quo.
- Demonstrates ability to use creative thinking to improve processes and solve complex problems.
Development of Self and Others
- Demonstrates initiative in participating in growth opportunities for continuous development and improvement.
- Demonstrates ability to apply new skills/knowledge to the job and serve as a training resource to less experienced staff.
- Demonstrates ability to establish and maintain effective relationships with internal and external customers in a manner that consistently meets the organization’s expectations for exemplary customer service.
- Demonstrates the ability to see issues from the customer’s perspective, assesses urgency of requests and responds accordingly.
- Demonstrates focus on fulfilling expectations by seeking insight into customer needs and developing solutions that provide value for the customer.
- Bachelor’s degree in computer science, engineering or an equivalent combination of education and experience.
- One (1) or more years of experience in a production Linux environment.
- Understanding of bash/shell and one of Perl or Python.
- Understanding of security practices in a shared environment.
- Understanding of Unix/Linux TCP/IP networking.
- Strong interpersonal communication skills.
- Ability to communicate clearly, correctly, effectively and knowledgeably via email, letters and in person to teams and customers.
- Ability to creatively improve workflows and processes.
- Strong troubleshooting skills.
- Ability to manage priorities in the face of multiple requests and projects.
- Ability to self-direct as well as participate in a larger distributed support structure.
- Familiarity with batch computing environments (Slurm, Torque, HTCondor).
- Experience with revision control systems (Git, SVN, etc.).
- Some experience with SQL.
- Experience with configuration management and system provisioning methods and tools.
- Experience with Linux kernel modules, preferably for Lustre, Intel Xeon Phi, NVIDIA GPUs and Mellanox InfiniBand cards.
- Experience with xCAT, Kickstart or Ansible.
- Experience providing IT support in an academic environment.
- Experience with logging and metric tools such as Elasticsearch, Logstash or Graphite.
- Experience with compute health monitoring systems such as Sensu or Nagios/Icinga.
- Experience with Intel Xeon Phi or GPU accelerators.
- Familiar with any of C/C++, MATLAB, Fortran, R, CUDA or OpenACC.
The selected candidate may work with and/or support systems that maintain or process sensitive institutional data as defined by university policy and must comply with federal, state, and local law and/or university policies or agreements that require the university to implement specific privacy and security safeguards, including but not limited to: ITAR, EAR, HIPAA, FERPA and FISMA.
- Punctual, regular, and consistent attendance is required.
- Ability to conduct a normal amount of work at a computer.
- Staff members are required to provide and maintain their own high-speed residential Internet connectivity services.
Diversity, Equity and Inclusion
The University of Michigan Information and Technology Services seeks to recruit and retain a diverse workforce as a reflection of our commitment to serve the diverse people of Michigan, to maintain the excellence of the University and to offer our students richly varied disciplines, perspectives and ways of knowing and learning.
The University of Michigan is committed to offering a high-quality benefits package to support faculty, staff and their families. Learn more at https://hr.umich.edu/benefits-wellness
- The University of Michigan continues to be recognized nationally in the U.S. News & World Report’s annual rankings of the nation's best graduate schools for 2019. Among the programs ranked each year in America’s Best Graduate Schools — business, education, engineering, law, medicine and nursing — U-M maintained top-20 rankings in all six categories.
- The University of Michigan is ranked No. 4 on Kiplinger’s 2018 list of 100 Best Values in Public Colleges, moving up one spot from last year.
- The University of Michigan is ranked the No. 2 public university in the United States and 27th overall in a survey announced on 09/27/2017 by The Wall Street Journal and Times Higher Education.
- The University of Michigan maintained its ranking as the No. 4 public university in U.S. News & World Report's 2018 annual list of the nation's best undergraduate colleges and universities.
Job openings are posted for a minimum of seven calendar days. This job may be removed from posting boards and filled anytime after the minimum posting period has ended.
U-M EEO/AA Statement
The University of Michigan is an equal opportunity/affirmative action employer.