How to Apply

A cover letter and resume are required; the cover letter must be PAGE 1 of your resume. The letter should:

(1) specifically outline the reasons for your interest in the position;

(2) outline your particular skills and experience that directly relate to this position; and

(3) include your current or ending salary.

Starting salary may vary depending on qualifications and experience of the selected candidate.  

Job Summary

The Advanced Research Computing - Technology Services (ARC-TS) organization has an exciting opportunity to hire a Research Cloud Administrator.

This position will be part of a team working on a novel platform for research computing at the university for data science and high performance computing. The primary responsibility of this position will be to develop a novel resource sharing environment that enables University of Michigan researchers to run Data Science and HPC workflows using containers. This position will explore the maturity of various resource sharing frameworks (Mesos, Kubernetes, Rancher, etc.) for inclusion in a research system, in conjunction with object storage services in the same framework. This person will also be responsible for deploying existing data science applications such as YARN, Spark, Impala, Presto, and others, and then for developing High Performance Computing (HPC) resource management under the resource scheduler framework. This position will work with guidance from senior and technical lead staff as part of a larger team.

Advanced Research Computing - Technology Services (ARC-TS) is the University of Michigan research IT provider specializing in High Performance Computing (HPC), BigData (Hadoop/Spark/etc), high speed networking, storage, and other technologies to accelerate the research mission of the institution. For more information about ARC-TS visit our website: arc-ts.umich.edu.

**NOTE: This is a full-time, term-limited position ending after three years, with the intent to reevaluate for potential extension.**

Responsibilities*

Object and Block Storage Service Development

Evaluate and set up a cluster file system with object storage support.

Integrate automated provisioning of block storage into resource scheduler framework services.

Resource Scheduler Service Development

Evaluate the maturity of different resource scheduler environments.

Implement resource scheduler automation and deployment on a large-scale cluster.

Write internal documentation on how to use and deploy new offerings on the scheduler.

HDFS/Spark Service Development

Develop Spark-based services.

Set up and maintain the HDFS file system.

User Support for system related issues

Support pilot researchers on the developed systems, in concert with Data Science support staff, on the novel platform.

Develop documentation and training.

Work with other ARC-TS and ARC-TS-affiliated staff to support computational research around the University.

Development of Self

Stay abreast of application technology trends in scientific hardware and environments (computers, accelerators, system management methods, etc.). This can include on-the-job training, attending technical courses or conferences, reading, researching, and testing.

Required Organizational Competencies

Successful candidates will be expected to demonstrate organizational competencies in this role, including but not limited to the following:

Advancing the Mission

*Demonstrates knowledge of the primary mission of the University and Health Systems.

*Demonstrates awareness of the diversity of constituency groups and their roles, purposes, and issues.

Creative Problem Solving / Strategic Thinking

*Demonstrated ability to give the necessary attention to problems of different levels, often multitasking to solve moderate-level problems.

*Defines problems, analyzes causes, identifies possible solutions, selects the best solution, and develops action plans.  Generates new ideas and goes beyond the status quo. 

*Demonstrated ability to use creative thinking to improve processes and solve complex problems.

Development of Self and Others

*Demonstrated initiative in participating in growth opportunities for continuous development and improvement. 

*Demonstrated ability to apply new skills/knowledge to the job and serve as a training resource to less experienced staff.

Quality Service

*Demonstrated ability to establish and maintain effective relationships with internal and external customers in a manner that consistently meets the organization’s expectations for exemplary customer service.

*Demonstrates the ability to see issues from the customer’s perspective, assesses the urgency of requests, and responds accordingly.

*Demonstrated focus on fulfilling expectations by seeking insight into customer needs and developing solutions that provide value for the customer.

Required Qualifications*

*Bachelor’s degree in computer science or engineering, or an equivalent combination of education and experience.

*Minimum of two (2) years of experience in a related role supporting deployments of some or all of the following data science applications (or other related applications): YARN, Spark, Impala, Presto.

*Demonstrated ability with Linux, bash/shell, and Perl or Python.

*Experience with Docker and container services.

*Familiarity with various container orchestration services (Rancher, Mesos, Kubernetes, OpenShift, Swarm).

*Demonstrated ability to communicate technical concepts effectively, both verbally and in writing, to teams and customers.

*Ability to manage priorities in the face of multiple requests and projects.

*Demonstrated ability to work in a self-directed manner, skillfully manage complex projects, stay up to date with the latest industry developments and best practices, and apply that knowledge in the workplace.

*Demonstrated ability to troubleshoot difficult issues, with problem-solving skills focused on process improvement and/or automation.

Desired Qualifications*

*Knowledge and experience working with Amazon Web Services, Microsoft Azure, Google Compute Engine, or an equivalent cloud platform.

*Experience developing software on ARM platforms.

*Knowledge of golang (>=1.5).

*Experience with xCAT, Kickstart, Salt, Ansible, or other provisioning and configuration management tools.

*Experience with HPC scheduling systems: TORQUE, SLURM, Platform LSF.

Additional Information

*The University of Michigan was featured as one of the "Great Colleges to Work For" in the 2016 Chronicle of Higher Education.

*The University of Michigan is a recipient of the 2016 Seal of Distinction by the WorldatWork Alliance, recognizing strategies and practices that help employees achieve success in work-life effectiveness.

*The University of Michigan was ranked as the top U.S. public university in the latest QS World University Rankings for 2016, moving ahead of the University of California, Berkeley and the University of California, Los Angeles. U-M ranked 23rd overall among more than 900 colleges and universities in the world, up seven spots from last year (2015). The QS World University Rankings, launched in 2004, are based on six factors: academic reputation, employer reputation, student-faculty ratio, citations per faculty, proportion of international faculty and proportion of international students.

U-M EEO/AA Statement

The University of Michigan is an equal opportunity/affirmative action employer.