Major Details

Statistics is the science and art of prediction and explanation. The mathematical foundation of statistics lies in the theory of probability, which is applied to problems of making inferences and decisions under uncertainty. Practical statistical analysis also uses a variety of computational techniques, methods of visualizing and exploring data, methods of seeking and establishing structure and trends in data, and a mode of questioning and reasoning that quantifies uncertainty. Data science expands on statistics to encompass the entire life cycle of data, from its specification, gathering, and cleaning, through its management and analysis, to its use in making decisions and setting policy. This field is a natural outgrowth of statistics that incorporates advances in machine learning, data mining, and high-performance computing, along with domain expertise in the social sciences, natural sciences, engineering, management, medicine, and digital humanities.

Students majoring in Statistics and Data Science take courses in both mathematical and practical foundations. They are also encouraged to take courses in the discipline areas listed below.

The B.A. in Statistics and Data Science is designed to acquaint students with fundamental techniques in the field. The B.S. prepares students to participate in research efforts or to pursue graduate study in data science.

Courses for Nonmajors and Majors

S&DS 100, S&DS 101–109, and S&DS 123 (YData) assume knowledge of high-school mathematics only. Students who complete one of these courses should consider taking S&DS 230. This sequence provides a solid foundation for the major. Other courses for nonmajors include S&DS 110 and 160.

Prerequisites

Multivariable calculus is required and should be taken before or during the sophomore year. This requirement may be satisfied by one of MATH 120, ENAS 151, MATH 230, MATH 302, or the equivalent.

Requirements of the Major

Students who wish to major in Statistics and Data Science are encouraged to take S&DS 220 or a 100-level course followed by S&DS 230. Students should complete the calculus prerequisite and linear algebra requirement (MATH 222 or 225 or 226) as early as possible, as they provide mathematical background that is required in many courses.

B.A. degree program The B.A. degree program requires eleven courses, ten of which are from the seven discipline areas described below: MATH 222, 225, or 226 from Mathematical Foundations and Theory; two courses from Core Probability and Statistics; two courses that provide Computational Skills; two courses on Methods of Data Science; and three courses from any of the discipline areas subject to DUS approval. The remaining course is fulfilled through the senior requirement.

B.S. degree program The B.S. degree program requires fourteen courses, including all the requirements for the B.A. degree. Specifically, B.S. degree candidates must take S&DS 242 and, starting with the Class of 2024, S&DS 365 to fulfill the B.A. requirements. The three remaining courses include one course chosen from the Mathematical Foundations and Theory discipline and two courses chosen from the Core Probability and Statistics (not including S&DS 242), Computational Skills, Methods of Data Science (not including S&DS 365), Mathematical Foundations and Theory, or Efficient Computation and Big Data discipline areas, subject to DUS approval.

Discipline Areas The seven discipline areas are listed below.

Core Probability and Statistics These are essential courses in probability and statistics. Every major should take at least two of these courses, and should probably take more. Students completing the B.S. degree must take S&DS 242.

Examples of such courses include S&DS 238, 241, 242, 312, 351

Computational Skills Every major should be able to compute with data. While the main purpose of some of these courses is not computing, students who have taken at least two of these courses will be capable of digesting and processing data. While there are other courses that require more programming, at least two courses from the following list are essential.

Examples of such courses include S&DS 220 or 230, 262, 265, 425; CPSC 100 or 112, or 201 or ENAS 130

Methods of Data Science These courses teach fundamental methods for dealing with data. They range from practical to theoretical. Every major must take at least two of these courses. Students completing the B.S. degree must take S&DS 365, starting with the Class of 2024.

Examples of such courses include S&DS 312, 317, 361, 363, 365, 430, 431, 468; EENG 400; CPSC 446, 452, 477

Mathematical Foundations and Theory All students in the major must know linear algebra as taught in MATH 222 or 225 or 226. Students who have learned linear algebra through other courses (such as MATH 230, 231) may substitute another course from this category. Students pursuing the B.S. degree must take at least two courses from this list, and those students contemplating graduate school should take additional courses from this list as electives.

Examples of such courses include S&DS 364, 400, 410, 411; CPSC 365, 366, 469; MATH 222, 225, 226, 244, 250, 255, 256, 260, 300, 301, or 302

Efficient Computation and Big Data These courses are for students focusing on programming or implementation of large-scale analyses and are not required for the major. Students who wish to work in the software industry should take at least one of these.

Examples of such courses include CPSC 223, 323, 424, 437

Data Science in Context Students are encouraged to take courses that involve the study of data in application areas. Students learn how data are obtained, how reliable they are, how they are used, and the types of inferences that can be made from them. These course selections should be approved by the director of undergraduate studies (DUS).

Examples of such courses include ANTH 376; EVST 362; GLBL 191, 195; LING 229, 234, 380; PLSC 454; PSYC 258

Methods in Application Areas These are methods courses in areas of applications. They help expose students to the cultures of fields that explore data. These course selections should be approved by the DUS.

Examples of such courses include CPSC 453, 470, 475; ECON 136, 420; EENG 445; S&DS 352; LING 227

Substitution Some substitution, particularly of advanced courses, may be permitted with DUS approval.

Credit/D/Fail Courses taken Credit/D/Fail may not be counted toward the requirements of the major (this includes prerequisite courses).

Roadmap See visual roadmap of the requirements.

Senior Requirement

Students in both the B.A. degree program and the B.S. degree program complete the senior requirement by taking a capstone course (S&DS 425) or an individual research project course. Research project courses include S&DS 491 and S&DS 492; each project must be advised by a member of the Department of Statistics and Data Science or by a faculty member in a related discipline area. Students must complete a research project to be eligible for Distinction in the Major.

Advising

Students intending to major in Statistics and Data Science should consult the department guide and FAQ. Statistics and Data Science can be taken either as a primary major or as one of two majors, in consultation with the DUS. Appropriate majors to combine with Statistics and Data Science include programs in the social sciences, natural sciences, engineering, computer science, or mathematics. A statistics concentration is also available within the Applied Mathematics major.

Combined B.S./M.A. degree program Exceptionally able and well-prepared students may complete a course of study leading to the simultaneous award of the B.S. in S&DS and M.A. in Statistics after eight terms of enrollment. See Academic Regulations, section L, Special Academic Arrangements, “Simultaneous Award of the Bachelor’s and Master’s Degrees.” Interested students should consult the DUS at the beginning of their fifth term of enrollment for specific requirements in Statistics and Data Science.