How to Apply

A cover letter and resume are required; the cover letter must be PAGE 1 of your resume. The letter should:

(1) specifically outline the reasons for your interest in the position;

(2) outline your particular skills and experience that directly relate to this position; and

(3) include your current or ending salary.

Starting salary may vary depending on qualifications and experience of the selected candidate.

**NOTE: This is a full-time, term-limited position ending after three years, with the intent to re-evaluate for potential extension.**

Job Summary

The Advanced Research Computing - Technology Services (ARC-TS) organization has an exciting opportunity to hire a Research Cloud Administrator.

This position will be part of a team working on a novel platform for research computing at the university for data science and high performance computing.  The primary responsibility of this position will be to develop a novel resource sharing environment that enables University of Michigan researchers to execute Data Science and HPC workflows using containers.  This position will explore the maturity of various resource sharing frameworks (Mesos, Kubernetes, Rancher, etc.) for inclusion in a research system, in conjunction with object storage services in the same framework.  This person will also be responsible for deploying existing data science applications such as YARN, Spark, Impala, Presto, and others, and then for developing High Performance Computing (HPC) resource management under the resource scheduler framework.  This position will work with guidance from senior and technical lead staff as part of a larger team.

Advanced Research Computing - Technology Services (ARC-TS) is the University of Michigan research IT provider specializing in High Performance Computing (HPC), big data (Hadoop, Spark, etc.), high speed networking, storage, and other technologies to accelerate the research mission of the institution. For more information about ARC-TS, visit our website: http://arc-ts.umich.edu.

Responsibilities*

  • 20% Object and Block Storage Service Development:  Evaluate and set up a cluster file system with object storage support.  Integrate automated provisioning of block storage into resource scheduler framework services.
  • 30% Resource Scheduler Service Development:  Evaluate the maturity of different resource scheduler environments.  Implement resource scheduler automation and deployment on a large scale cluster.  Write internal documentation on how to use and deploy new offerings on the scheduler.
  • 25% HDFS/Spark Service Development:  Develop Spark-based services.  Set up and maintain the HDFS file system.
  • 20% User Support for system related issues:  Support pilot researchers on developed systems, in concert with Data Science support staff, on the novel platform.  Develop documentation and training.  Work with other ARC-TS and ARC-TS affiliated staff to support computational research around the University.
  • 5% Development of Self:  Stay abreast of application technology trends in scientific hardware and environments (computers, accelerators, system management methods, etc.).  This can include on-the-job training, attending technical courses or conferences, reading, researching, and testing.

 

Required Organizational Competencies
Successful candidates will be expected to demonstrate organizational competencies in this role, including but not limited to the following:

  • Advancing the Mission:  Demonstrates knowledge of the primary mission of the University and Health Systems.  Demonstrates awareness of the diversity of constituency groups and their roles, purposes, and issues.
  • Creative Problem Solving / Strategic Thinking:  Demonstrated ability to give problems of different levels the attention they need, often multitasking to solve moderate-level problems.  Defines problems, analyzes causes, identifies possible solutions, selects the best solution, and develops action plans.  Generates new ideas and goes beyond the status quo.  Demonstrated ability to use creative thinking to improve processes and solve complex problems.
  • Development of Self and Others:  Demonstrated initiative in participating in growth opportunities for continuous development and improvement.  Demonstrated ability to apply new skills/knowledge to the job and serve as a training resource to less experienced staff.
  • Quality Service:  Demonstrated ability to establish and maintain effective relationships with internal and external customers in a manner that consistently meets the organization’s expectations for exemplary customer service.  Demonstrates the ability to see issues from the customer’s perspective, assesses urgency of requests and responds accordingly.  Demonstrated focus on fulfilling expectations by seeking insight into customer needs and developing solutions that provide value for the customer.

Required Qualifications*

  • Bachelor’s degree in computer science, engineering or an equivalent combination of education and experience.
  • Minimum of two (2) years experience supporting at least one (1) of the following types of deployments:
    • One of the following data science applications:  YARN, Spark, Impala, Presto, or another data science application.
    • Docker, and one of any number of container orchestration services (Rancher, Mesos, Kubernetes, OpenShift, Swarm).
    • Ceph or another Object Storage Service.
    • One of the following cloud environments: Amazon Web Services, Microsoft Azure or Google Compute Engine or equivalent cloud platform.
    • HPC scheduling systems: TORQUE, SLURM, Platform LSF.
  • Demonstrated ability with Linux, bash/shell, and Perl or Python.
  • Demonstrated ability to communicate technical concepts effectively, both verbally and in writing, to teams and customers.
  • Ability to manage priorities in the face of multiple requests and projects.
  • Demonstrated ability to work in a self-directed manner, skillfully manage complex projects, stay up-to-date with the latest industry developments and best practices, and apply that knowledge in the workplace.
  • Demonstrated ability to troubleshoot difficult issues, with problem-solving skills focused on process improvement and/or automation.

Desired Qualifications*

  • Knowledge and experience supporting at least two (2) of the following types of deployments:
    • One of the following data science applications:  YARN, Spark, Impala, Presto, or another data science application.
    • Docker, and one of any number of container orchestration services (Rancher, Mesos, Kubernetes, OpenShift, Swarm).
    • Ceph or another Object Storage Service.
    • One of the following cloud environments: Amazon Web Services, Microsoft Azure or Google Compute Engine or equivalent cloud platform.
  • HPC scheduling systems: TORQUE, SLURM, Platform LSF.
  • Experience with xCAT, Kickstart, Salt, Ansible, or other provisioning and configuration management tools.
  • Knowledge of Go (version 1.5 or later).
  • Demonstrated ability with Linux, bash/shell, and Perl or Python.

Additional Information

Some development may be applicable to open source projects. In addition, there may be opportunities to speak at relevant conferences regarding work done on these endeavors.

Diversity, Equity and Inclusion

The University of Michigan Information and Technology Services seeks to recruit and retain a diverse workforce as a reflection of our commitment to serve the diverse people of Michigan, to maintain the excellence of the University and to offer our students richly varied disciplines, perspectives and ways of knowing and learning.

Go Blue!

In addition to being featured as one of the "Great Colleges to Work For" in the 2016 Chronicle of Higher Education, U-M is a recipient of the 2017 Seal of Distinction from the WorldatWork Alliance, which recognizes strategies and practices that help employees achieve success in work-life effectiveness.

QS World University Rankings list U-M as one of the world's leading public universities.  The University of Michigan was ranked 23rd overall among more than 900 colleges and universities in the world, up seven spots from 2015.  Launched in 2004, the QS World University Rankings are based on six factors: academic reputation, employer reputation, student-faculty ratio, citations per faculty, proportion of international faculty and proportion of international students.

Application Deadline

Job openings are posted for a minimum of seven calendar days. This job may be removed from posting boards and filled anytime after the minimum posting period has ended.

U-M EEO/AA Statement

The University of Michigan is an equal opportunity/affirmative action employer.