Data Science
Degrees Offered
 Master of Science in Data Science (Non-Thesis)
 Graduate Certificate in Data Science - Statistical Learning
 Graduate Certificate in Data Science - Earth Resources
 Graduate Certificate in Petroleum Data Analytics
 Graduate Certificate in Business Analytics
 Post-Baccalaureate Certificate in Data Science - Foundations
 Post-Baccalaureate Certificate in Data Science - Computer Science
Program Description
The Master of Data Science (Non-Thesis) program is designed to give candidates a foundation in statistics and computer science and also provide knowledge in a particular application domain of science or engineering. The balance between these three elements is a strength of the program and can prepare candidates for Data Science careers in industry or government, or for further study at the PhD level. Throughout, the program emphasizes working in teams, creative problem solving, and professional development.
The Data Science Certificates are designed for college graduates and professionals interested in the emerging field of Data Science as applied within their individual fields of study or industries.
Professors
Douglas Nychka, Applied Mathematics & Statistics
Paul Sava, Geophysics
Michael Wakin, Electrical Engineering
Professor of Practice
Jim Crompton, Petroleum Engineering
Associate Professors
Soutir Bandyopadhyay, Applied Mathematics & Statistics
Dorit Hammerling, Applied Mathematics & Statistics
Hua Wang, Computer Science
Teaching Associate Professor
Wendy Fisher, Computer Science
Research Professor
Alfred William (Bill) Eustes, III, Petroleum Engineering
Research Associate Professor
Zane Jobe, Geology and Geological Engineering
Master of Data Science (Non-Thesis)
The field of Data Science draws on elements of computer science, statistics, and interdisciplinary applications to address the unique needs of gaining knowledge and insight through data analysis. This Master's Non-Thesis program is designed to give candidates a foundation in statistics and computer science and also provide knowledge in a particular application domain of science or engineering. The balance between these three elements is a strength of the program and can prepare candidates for Data Science careers in industry or government, or for further study at the PhD level. Moreover, the coursework will be flexible and tailored to each candidate. For example, the program will allow a candidate to increase their skills in data analytics while developing a focused area of application, or alternatively allow a candidate with depth in an area of application to gain skills in statistics and computer science.
Program Requirements
This program will follow a 3 x 3 + 1 design: three modules and a mini-module.
Modules (each consisting of three 3-credit courses)
Data Modeling and Statistical Learning  
DSCI530  STATISTICAL METHODS I  3.0 
DSCI560  INTRODUCTION TO KEY STATISTICAL LEARNING METHODS I  3.0 
DSCI561  INTRODUCTION TO KEY STATISTICAL LEARNING METHODS II  3.0 
Machine Learning, Data Processing and Algorithms, and Parallel Computation  
DSCI403  INTRODUCTION TO DATA SCIENCE  3.0 
DSCI470  INTRODUCTION TO MACHINE LEARNING  3.0 
DSCI575  MACHINE LEARNING  3.0 
Individualized and Domain Specific Coursework
Electives for the third module can be designed by the student but the plan needs to be approved by the program curriculum committee. Although this individualized module can draw on graduate courses from across the university, two specific examples from engineering and geophysics are given below:
Electrical Engineering  
EENG411  DIGITAL SIGNAL PROCESSING  3.0 
EENG509  SPARSE SIGNAL PROCESSING  3.0 
EENG511  CONVEX OPTIMIZATION AND ITS ENGINEERING APPLICATIONS  3.0 
EENG515  MATHEMATICAL METHODS FOR SIGNALS AND SYSTEMS  3.0 
or EENG519  ESTIMATION THEORY AND KALMAN FILTERING  
Geophysics  
GPGN533  GEOPHYSICAL DATA INTEGRATION & GEOSTATISTICS  3.0 
GPGN570  APPLICATIONS OF SATELLITE REMOTE SENSING  3.0 
or GPGN605  INVERSION THEORY 
Mini-module (consisting of three 1-credit courses)
Professional Development  
SYGN502  INTRODUCTION TO RESEARCH ETHICS  1.0 
SYGN5XX  LEADERSHIP AND TEAMWORK  1.0 
LICM501  PROFESSIONAL ORAL COMMUNICATION  1.0 
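The 3 x 3 + 1 structure listed above implies a 30-credit total, which agrees with the total given in the sample course schedule. A minimal sketch of that arithmetic follows; the variable names are illustrative only and are not part of the catalog.

```python
# Credit arithmetic for the "3 x 3 + 1" design described above.
# All names here are illustrative, not official program terminology.
MODULES = 3                # full modules
COURSES_PER_MODULE = 3     # courses in each full module
CREDITS_PER_COURSE = 3     # credits per module course
MINI_MODULE_COURSES = 3    # professional development mini-module
MINI_MODULE_CREDITS = 1    # credits per mini-module course

module_credits = MODULES * COURSES_PER_MODULE * CREDITS_PER_COURSE
mini_module_credits = MINI_MODULE_COURSES * MINI_MODULE_CREDITS
total_credits = module_credits + mini_module_credits

print(module_credits, mini_module_credits, total_credits)  # 27 3 30
```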
Sample Course Schedule
First Year  

Fall  lec  lab  sem.hrs  
DSCI403  INTRODUCTION TO DATA SCIENCE  3.0  
DSCI470  INTRODUCTION TO MACHINE LEARNING  3.0  
DSCI530  STATISTICAL METHODS I  3.0  
ELECT  Elective Approved by Program*  3.0  
12.0  
Spring  lec  lab  sem.hrs  
DSCI575  MACHINE LEARNING  3.0  
DSCI560  INTRODUCTION TO KEY STATISTICAL LEARNING METHODS I  3.0  
ELECT  Elective Approved by Program*  3.0  
LICM501  PROFESSIONAL ORAL COMMUNICATION  1.0  
10.0  
Second Year  
Fall  lec  lab  sem.hrs  
DSCI561  INTRODUCTION TO KEY STATISTICAL LEARNING METHODS II  3.0  
ELECT  Elective Approved by Program*  3.0  
SYGN502  INTRODUCTION TO RESEARCH ETHICS  1.0  
SYGN5XX  LEADERSHIP AND TEAMWORK  1.0  
8.0  
Total Semester Hrs: 30.0 
*Electives for the third module can be designed by the student but the plan needs to be approved by the Data Science program curriculum committee. This individualized module can draw on graduate courses from across the university.
Mines Combined Undergraduate / Graduate Degree Program
Mines undergraduate students have the opportunity to begin work on a graduate degree in the Data Science program while completing their Bachelor's degree. Students must apply to the Combined degree program according to the Office of Graduate Studies timeline and be admitted to the program. Students enrolled in the Combined program may double count up to six credits that were used to fulfill the requirements of their undergraduate degree at Mines toward their Data Science graduate program. Courses to be used for double counting can be at the 400 or 500+ level. Courses that will be double counted must be approved by the Data Science program director and the student's graduate advisor. These courses must have been passed with a "B" or better and meet all other University, Department, Division and Program requirements for graduate credit.
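The double-counting conditions stated above (at most six credits, 400 level or higher, a "B" or better, and approval) can be sketched as a simple check. This is an illustrative sketch only: the `Course` record, the grade set, and the function names are assumptions made for the example, and it does not model the other University, Department, Division and Program requirements.

```python
import re
from dataclasses import dataclass

# Assumption: a conventional letter-grade scale; only these count as "B or better".
B_OR_BETTER = {"A", "A-", "B+", "B"}
MAX_DOUBLE_COUNT_CREDITS = 6


@dataclass
class Course:
    code: str       # e.g. "DSCI470"
    credits: float
    grade: str
    approved: bool  # approved by the program director and graduate advisor


def course_level(code: str) -> int:
    """Return the numeric part of a course code, e.g. 'DSCI470' -> 470."""
    match = re.search(r"(\d{3})", code)
    if match is None:
        raise ValueError(f"no course number in {code!r}")
    return int(match.group(1))


def may_double_count(courses: list[Course]) -> bool:
    """Check the stated conditions for a proposed set of double-counted courses."""
    within_cap = sum(c.credits for c in courses) <= MAX_DOUBLE_COUNT_CREDITS
    return within_cap and all(
        course_level(c.code) >= 400 and c.grade in B_OR_BETTER and c.approved
        for c in courses
    )


plan = [Course("DSCI403", 3.0, "A-", True), Course("DSCI470", 3.0, "B", True)]
print(may_double_count(plan))  # True
```

Adding a third 3-credit course to `plan` would exceed the six-credit cap, so the check would return `False`.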
Certificate Programs in Data Science
Program Requirements
There are five Certificates in Data Science. Applicants for each are required to have an undergraduate degree to be admitted into the Certificate programs. Course prerequisites, if any, are noted for each Certificate program.
Students working toward one of the Data Science Certificates are required to successfully complete 12 credits, as detailed below for each Certificate. The courses taken for the Certificates can be used towards a Master’s or PhD degree at Mines; however, courses used for one Data Science Certificate cannot also be counted toward another Data Science Certificate.
Post-Baccalaureate Certificate in Data Science - Foundations (12 credits)
The Data Science - Foundations Post-Baccalaureate Certificate is an online or residential program focusing on the foundational concepts in statistics and computer science that support the explosion of new methods for interpreting data in its many forms. The Certificate balances an introduction to data science with teaching basic skills in applying methods in statistics and machine learning to analyze data. Students will gain a perspective on the kinds of problems that can be solved by data-intensive methods and will also acquire new analysis skills outside of the certificate. Moreover, the coursework will cover a broad range of applications, making it relevant for varied scientific and engineering domains.
Applicants must have completed the following courses, or their equivalents, with a B or better: CSCI261 and CSCI262 Data Structures, MATH332 Linear Algebra and MATH334 Introduction to Probability.
DSCI403  INTRODUCTION TO DATA SCIENCE  3.0 
DSCI470  INTRODUCTION TO MACHINE LEARNING  3.0 
DSCI530  STATISTICAL METHODS I  3.0 
DSCI560  INTRODUCTION TO KEY STATISTICAL LEARNING METHODS I  3.0 
Post-Baccalaureate Certificate in Data Science - Computer Science (12 credits)
The Data Science - Computer Science Post-Baccalaureate Certificate is an online or residential program focusing on data science concepts within computer science (e.g., computational techniques and machine learning) plus prerequisite knowledge (e.g., probability and regression). The aim of this certificate is to help students develop an essential skill set in data analytics, including (1) deriving predictive insights by applying advanced statistics, modeling, and programming skills, (2) acquiring in-depth knowledge of machine learning and computational techniques, and (3) unearthing important questions and intelligence for a range of industries, from product design to finance.
Applicants must have completed the following courses, or their equivalents, with a B or better: CSCI261 and CSCI262 Data Structures, MATH213 Calculus III and MATH332 Linear Algebra. DSCI530 Statistical Methods I will serve as the MATH201 Probability and Statistics prerequisite for the two machine learning courses of the certificate (DSCI470 Introduction to Machine Learning and DSCI575 Machine Learning).
DSCI403  INTRODUCTION TO DATA SCIENCE  3.0 
DSCI530  STATISTICAL METHODS I  3.0 
DSCI470  INTRODUCTION TO MACHINE LEARNING  3.0 
DSCI575  MACHINE LEARNING  3.0 
Graduate Certificate in Data Science - Statistical Learning (12 credits)
The Data Science - Statistical Learning Graduate Certificate is an online or residential program focusing on statistical methods for interpreting complex data sets and quantifying the uncertainty in a data analysis. The Certificate also includes gaining new skills in computer science but is grounded in statistical models for data, also termed statistical learning, rather than algorithmic approaches. Students will develop an essential skill set in the statistical methods most commonly used in data science, along with an understanding of the methods' strengths and weaknesses. Moreover, the coursework will cover a broad range of applications, making it relevant for varied scientific and engineering domains.
Applicants must have completed the following courses, or their equivalents, with a B or better: CSCI261 and CSCI262 Data Structures, MATH332 Linear Algebra and MATH334 Introduction to Probability.
DSCI403  INTRODUCTION TO DATA SCIENCE  3.0 
DSCI530  STATISTICAL METHODS I  3.0 
DSCI560  INTRODUCTION TO KEY STATISTICAL LEARNING METHODS I  3.0 
DSCI561  INTRODUCTION TO KEY STATISTICAL LEARNING METHODS II  3.0 
Graduate Certificate in Data Science - Earth Resources (12 credits)
The Graduate Certificate in Data Science - Earth Resources is an online program building on the foundational concepts in data science as it pertains to managing surface and subsurface Earth resources and on specific applications (use cases) from the petroleum and minerals industries as well as water resource monitoring and remote sensing of Earth change. The Certificate includes one core introductory Data Science course, two courses specific to Earth resources, and one elective.
DSCI403  INTRODUCTION TO DATA SCIENCE  3.0 
GEOL557  EARTH RESOURCE DATA SCIENCE 1: FUNDAMENTALS  3.0 
GEOL558  EARTH RESOURCE DATA SCIENCE 2: APPLICATIONS AND MACHINE-LEARNING  3.0 
ELECTIVE  (1) ELECTIVE FROM LIST BELOW  3.0 
Graduate Certificate in Data Science - Earth Resources Electives (select ONE (1) from the list below):
Geospatial Focus:  
GEGN575  APPLICATIONS OF GEOGRAPHIC INFORMATION SYSTEMS  3.0 
GEGN579  PYTHON SCRIPTING FOR GEOGRAPHIC INFORMATION SYSTEMS  3.0 
Petroleum Focus:  
GPGN519  ADVANCED FORMATION EVALUATION  3.0 
GPGN547  PHYSICS, MECHANICS, AND PETROPHYSICS OF ROCKS  3.0 
GPGN558  SEISMIC DATA INTERPRETATION AND QUANTITATIVE ANALYSIS  3.0 
GPGN651  ADVANCED SEISMOLOGY  3.0 
PEGN522  ADVANCED WELL STIMULATION  3.0 
PEGN551  PETROLEUM DATA ANALYTICS - FUNDAMENTALS  3.0 
Mining Focus:  
MNGN548  INFORMATION TECHNOLOGIES FOR MINING SYSTEMS  3.0 
Hydrology Focus:  
CEEN581  WATERSHED SYSTEMS MODELING  3.0 
Additional Options:  
DSCI/MATH530  STATISTICAL METHODS I  3.0 
EBGN525  BUSINESS ANALYTICS  3.0 
Graduate Certificate in Petroleum Data Analytics (12 credits)
The Graduate Certificate in Petroleum Data Analytics is an online program building on the foundational concepts in statistics and focusing on the data foundation of the oil and gas industry, the challenges of Big Data to oilfield operations, and specific applications (use cases) for petroleum analytics. The Certificate includes two core introductory Data Science courses and two courses specific to petroleum engineering.
DSCI530  STATISTICAL METHODS I  3.0 
DSCI403  INTRODUCTION TO DATA SCIENCE  3.0 
PEGN551  PETROLEUM DATA ANALYTICS - FUNDAMENTALS  3.0 
PEGN552  PETROLEUM DATA ANALYTICS - APPLICATIONS  3.0 
Graduate Certificate in Business Analytics
Program Requirements
The certificate is an online or residential program. The requirements are to complete the following three courses:
EBGN525  BUSINESS ANALYTICS  3.0 
EBGN560  DECISION ANALYTICS  3.0 
EBGN571  MARKETING ANALYTICS  3.0 
Course substitutions can be approved on a case-by-case basis by the Certificate directors. Completing the Certificate will also position students to apply to either the Master of Science - Engineering and Technology Management degree or the Master of Science in Data Science degree, as all the Certificate courses can be applied to either degree.
Courses
DSCI403. INTRODUCTION TO DATA SCIENCE. 3.0 Semester Hrs.
(I, II) This course will teach students the core skills needed for gathering, cleaning, organizing, analyzing, interpreting, and visualizing data. Students will learn basic SQL for working with databases, basic Python programming for data manipulation, and the use and application of statistical and machine learning toolkits for data analysis. The course will be primarily focused on applications, with an emphasis on working with real (non-synthetic) datasets. Prerequisite: CSCI101 or CSCI102 or CSCI261 or CSCI200.
DSCI470. INTRODUCTION TO MACHINE LEARNING. 3.0 Semester Hrs.
(I) The goal of machine learning is to build computer systems that improve automatically with experience. Machine learning has been successfully applied in a variety of areas, including gene discovery, financial forecasting, and credit card fraud detection. This introductory course will study both the theoretical properties of machine learning algorithms and their practical applications. Students will have an opportunity to experiment with machine learning techniques and apply them to a selected problem in the context of term projects. Prerequisite: CSCI101 or CSCI102 or CSCI261 or CSCI200; MATH201, MATH332.
DSCI530. STATISTICAL METHODS I. 3.0 Semester Hrs.
Introduction to probability, random variables, and discrete and continuous probability models. Elementary simulation. Data summarization and analysis. Confidence intervals and hypothesis testing for means and variances. Chi-square tests. Distribution-free techniques and regression analysis. Prerequisite: MATH213 or equivalent.
DSCI560. INTRODUCTION TO KEY STATISTICAL LEARNING METHODS I. 3.0 Semester Hrs.
Part one of a two-course series introducing statistical learning methods with a focus on conceptual understanding and practical applications. Methods covered will include Introduction to Statistical Learning, Linear Regression, Classification, Resampling Methods, Basis Expansions, Regularization, Model Assessment and Selection. Prerequisite: DSCI530 or MATH530.
DSCI561. INTRODUCTION TO KEY STATISTICAL LEARNING METHODS II. 3.0 Semester Hrs.
Part two of a two-course series introducing statistical learning methods with a focus on conceptual understanding and practical applications. Methods covered will include Nonlinear Models, Tree-based Methods, Support Vector Machines, Neural Networks, Unsupervised Learning. Prerequisite: DSCI560 or MATH560.
DSCI575. MACHINE LEARNING. 3.0 Semester Hrs.
The goal of machine learning research is to build computer systems that learn from experience and that adapt to their environments. Machine learning systems do not have to be programmed by humans to solve a problem; instead, they essentially program themselves based on examples of how they should behave, or based on trial-and-error experience in trying to solve the problem. This course will focus on the methods that have proven valuable and successful in practical applications. The course will also contrast the various methods, with the aim of explaining the situations in which each is most appropriate. Prerequisite: CSCI262, MATH201, MATH332.