
How to Apply

A cover letter is required for consideration for this position and should be attached as the first page of your resume. The cover letter should address your specific interest in the position and outline your skills and related experience.

Job Summary

The Inter-university Consortium for Political and Social Research (ICPSR) is part of the Institute for Social Research (ISR) at the University of Michigan. The consortium maintains the world’s largest archive of social scientific data, with 10,000+ studies relating to education, aging, criminal justice, substance abuse, terrorism, and more. A global leader in data science, ICPSR also supports continuing education in research design, statistics, and data analysis. ICPSR employs 15 research faculty and 95 professional staff and has an annual operating revenue of more than $15 million.

ICPSR is looking for a Data Engineer to join our Computing and Network Services (CNS) team. Our team provides services and data products to a growing user community of social science researchers, students, instructors, and policymakers. If you love inventing new ways to do things with data, and can teach others to do the same, we want to talk to you! 

You will report to the CNS Team Lead at ICPSR.

Responsibilities*

  • Lead efforts to automate work for our Curation team (35%)
    • Take an inventory of usage and needs for current scripts and scripts that do not yet exist; create a strategy to reduce the number of scripts required
    • Move current scripts for curation into source control
    • Introduce automated testing for high impact scripts
    • Introduce trigger-based automation to reduce manual effort
    • Collaborate across units and departments to author high-usage tools that reduce curation time and increase curation throughput
  • Create data pipelines consuming APIs from government agencies (25%)
    • Create data pipelines as code that are kept in source control, are modular and maintainable, and use automated testing
    • Implement data pipelines using a DevSecOps mindset, with automated deployment of environments and configurations
  • Teach Data Engineering methods to our Curation team (15%)
    • Teach curators how to use source control, automated testing, and trigger-based automation for the scripts they currently write
    • Teach modern data engineering methods (e.g., Python, R, data logistics tools) to curators who are interested in data engineering
    • Teach how to create automated acceptance tests for data manipulation code
  • Troubleshoot data pipeline, data manipulation script, and data encoding issues (25%)
    • Read both software and infrastructure logs to find clues to errors
    • Find the root cause of complex errors
    • Resolve errors by refactoring current systems to create higher resiliency

Required Qualifications*

  • Bachelor's degree in computer science, a related field, or equivalent experience
  • Seven or more years of experience in software development, data engineering, or a combination of the two, including:
    • Five or more years of experience in one or more programming languages (Python/Java/C++)
    • Two or more years of experience consuming data from APIs using a publish-and-subscribe architecture
    • Two or more years of experience using either R or Python for data management, or equivalent experience
    • Two or more years of experience using regular expressions to validate data
  • Demonstrated understanding of the reliability, scalability, and performance aspects of data structures and data pipelines
  • Collaboration skills for working within an Agile software development team

Desired Qualifications*

  • Familiarity with Social Sciences and academic research
  • Experience converting legacy scripting to more robust processing
  • Experience on a software team that deploys to production multiple times daily
  • Experience with statistics, data visualization, analytics, or machine learning
  • Experience working with SQL integration services or ETL tools
  • Experience with replication and partitioning data for transactional, batch, or streaming systems
  • Experience with NoSQL, Graph Databases, Spark, or Kafka
  • Experience with indexers such as Elasticsearch or Solr

Underfill Statement

This position will be filled at the lead level, with a salary range of $95,000 - $110,000. The requirements listed reflect the staff-level expectation. Candidates with less experience are encouraged to apply and may be considered if the position cannot be filled at the lead level. Less-experienced candidates will be reviewed at the senior level, which requires at least five years of experience and has an estimated salary range of $85,000 - $100,000.

Additional Information

In addition to a career filled with purpose and opportunity, the University of Michigan offers a comprehensive benefits package to help you stay well, protect yourself and your family, and plan for a secure future. Benefits include:

  • Generous time off
  • A retirement plan that provides two-for-one matching contributions with immediate vesting
  • Many choices for comprehensive health insurance
  • Life insurance
  • Long-term disability coverage
  • Flexible spending accounts for healthcare and dependent care expenses
  • Various learning resources and tuition reimbursement

The Institute for Social Research at the University of Michigan wants to recruit and retain a diverse workforce as a reflection of our commitment to serve the diverse people of Michigan, to maintain the excellence of the university, and to ground our research in varied disciplines, perspectives, and ways of knowing and learning.

U-M EEO/AA Statement

The University of Michigan is an equal opportunity/affirmative action employer.