Plenary lecture, Monday (8:30 am – 9:30 am)
Given by: Gil Shamir, Google Research
Machine Learning – From Theory to Production, or Is it From Production to Theory?
We describe the experience of developing production-scale machine learning systems through the eyes of an information theorist. On one hand, theory plays a substantial role in improving the efficiency and performance of such systems. On the other hand, many theoretically principled approaches fail or remain immature in practice, and may require very long development cycles before they make a real impact. Furthermore, unlike in classical research, in today’s fast-paced environment production considerations and development may take precedence and drive theory, not the other way around.
We cover three case studies of the relation between theory and production. In the first, we describe how ideas leveraging the theory of the Minimum Description Length principle were used to improve the efficiency and performance of online machine learning systems. Despite substantial early improvements, the development cycle was still very long because of the various difficulties and obstacles of productionalization. In the second case study, we describe theoretical limits in online learning, and how information-theoretic techniques can be combined with other methods to derive such limits. While information theory closes theoretical gaps in understanding such techniques, full-scale production systems preceded many of these insights.
For the third case study, we consider irreproducibility of deep network models. With the vast volume of deep network publications, irreplicability, where research groups are unable to independently replicate published results, has become a critical concern. We use the term irreproducibility, however, for the problem in which running an experiment multiple times on the same system with the same model, infrastructure and data still leads to inconsistent results. Unlike irreplicability, which is usually driven by insufficient information about the experiment, irreproducibility arises from the many random factors in real systems. Deep networks exacerbate these random factors, so that the same model produces different results every time the same experiment is run. Irreproducibility can negatively affect production development cycles, as it may render experimental results unreliable. We describe various empirical approaches that partially address the problem and demonstrate how the use of smooth activations in deep networks reduces its magnitude. As in many developments in deep learning, empirical work driven by business needs preceded the development of rigorous theory. We shed some light on the motivations for the proposed solutions.
Gil Shamir received the B.Sc. (cum laude) and M.Sc. degrees from the Technion, Israel Institute of Technology, Haifa, Israel, in 1990 and 1997, respectively, and the Ph.D. degree from the University of Notre Dame, Notre Dame, IN, U.S.A., in 2000, all in electrical engineering.
From 1990 to 1995 he participated in research and development of signal processing and communication systems. From 1995 to 1997 he was with the Electrical Engineering Department at the Technion – Israel Institute of Technology, as a graduate student and teaching assistant. From 1997 to 2000 he was a Ph.D. student and a research assistant in the Electrical Engineering Department at the University of Notre Dame, and then a post-doctoral fellow until 2001.
During his tenure at Notre Dame he was a fellow of the university's Center for Applied Mathematics. Between 2001 and 2008 he was with the Electrical and Computer Engineering Department at the University of Utah, and between 2008 and 2009 with Seagate Research. Since 2009 he has been with Google, where he has worked in the ads and commerce organizations, developed machine learning systems, and led research projects. He is currently with the Google Research Brain Team. His main research interests include information theory, machine learning, coding and communication theory. Dr. Shamir received an NSF CAREER award in 2003.
Plenary lecture, Tuesday (8:30 am – 9:30 am)
Given by: Markus Grassl, University of Gdansk
Various Facets of Algebraic Quantum Codes
The talk discusses connections between quantum error-correcting codes (QECCs) and algebraic coding theory, looking at the problem from both sides. A general quantum error-correcting code is a subspace of a complex Hilbert space that allows quantum information to be protected against certain errors. Using the so-called stabilizer formalism, we illustrate how a subclass of QECCs can be obtained using algebraic coding theory. We will highlight some direct and indirect construction methods, which also lead to some new questions in classical coding theory. The talk includes a short introduction to the relevant concepts of quantum mechanics.
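As a small illustration of how a stabilizer code can arise from a classical code, the sketch below uses the CSS construction (one standard route, chosen here for brevity and not necessarily the one emphasized in the talk) to build the [[7,1,3]] Steane code from the parity-check matrix of the classical [7,4,3] Hamming code. The function names are hypothetical, and the code only checks that the generators commute and counts logical qubits; it does not compute the distance.

```python
import numpy as np

# Parity-check matrix of the classical [7,4,3] Hamming code
# (column j is the binary expansion of j + 1).
H = np.array([
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
], dtype=np.uint8)


def css_stabilizers(hx, hz):
    """Stack X-type and Z-type checks into a binary (X | Z) stabilizer matrix."""
    top = np.hstack([hx, np.zeros_like(hx)])     # X-type generators
    bottom = np.hstack([np.zeros_like(hz), hz])  # Z-type generators
    return np.vstack([top, bottom])


def generators_commute(stab):
    """True if every pair of generators has zero symplectic product mod 2."""
    n = stab.shape[1] // 2
    x = stab[:, :n].astype(int)
    z = stab[:, n:].astype(int)
    return not np.any((x @ z.T + z @ x.T) % 2)


def gf2_rank(matrix):
    """Rank over GF(2) by Gaussian elimination."""
    m = matrix.astype(np.uint8) % 2
    rows, cols = m.shape
    rank = 0
    for col in range(cols):
        pivots = [r for r in range(rank, rows) if m[r, col]]
        if not pivots:
            continue
        m[[rank, pivots[0]]] = m[[pivots[0], rank]]  # move pivot row up
        for r in range(rows):
            if r != rank and m[r, col]:
                m[r] ^= m[rank]                      # clear the column
        rank += 1
        if rank == rows:
            break
    return rank


# The Hamming code contains its dual (H @ H.T == 0 mod 2), so the same
# matrix can serve for both the X-type and the Z-type checks.
stabilizers = css_stabilizers(H, H)
n_physical = H.shape[1]
n_logical = n_physical - gf2_rank(stabilizers)

print("generators commute:", generators_commute(stabilizers))
print("physical qubits:", n_physical, "logical qubits:", n_logical)
```

The commutation check is where the classical condition enters: for a CSS code it reduces to the product of the X-check matrix with the transpose of the Z-check matrix vanishing mod 2, a statement purely about classical linear codes.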
Markus Grassl received the Diploma degree in computer science and the Ph.D. degree from the Fakultät für Informatik, Universität Karlsruhe (TH), Germany, in 1994 and 2001, respectively.
He held positions at Universität Karlsruhe, the Institute for Quantum Optics and Quantum Information in Innsbruck, the Centre for Quantum Technologies in Singapore, and the University of Erlangen as well as the Max Planck Institute for the Science of Light in Erlangen. Since 2019, he has been a Senior Scientist with the International Centre for Theory of Quantum Technologies, University of Gdansk.
His research interests include algebraic methods in quantum information science, focusing on constructions for various types of quantum error-correcting codes. For over 20 years, he has been contributing to the algorithms for coding theory in the computer algebra system Magma. He maintains the online tables www.codetables.de of good block quantum error-correcting codes and good linear block codes.
Dr. Grassl is a Senior Member of the IEEE Information Theory Society and served as an Associate Editor for Quantum Information Theory of the IEEE Transactions on Information Theory from 2015 to 2017. Currently, he serves on the editorial boards of the « International Journal of Quantum Information » and the journal « Cryptography and Communications ».
Plenary lecture, Thursday (8:30 am – 9:30 am)
Given by: Negar Kiyavash, EPFL
Causal Identification: Are We There Yet?
We discuss causal identifiability, the canonical problem of causal inference, where the goal is to calculate the effect of intervening on a subset of variables on an outcome variable of interest. We first revisit the definition of the problem and note that a positivity assumption on the observational distribution must be added to the original definition: without such an assumption, the rules of do-calculus, and consequently the algorithms proposed in the field, are not sound. After discussing the state of the art and recent progress in the field, we present some of the open problems and remaining challenges.
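To see informally why positivity matters, consider the textbook back-door adjustment formula. This is a standard illustration, not the argument given in the talk, and it assumes a set of covariates $Z$ that satisfies the back-door criterion relative to a treatment $X$ and outcome $Y$:

```latex
P\bigl(y \mid \mathrm{do}(x)\bigr)
  \;=\; \sum_{z} P(y \mid x, z)\, P(z),
\qquad
P(y \mid x, z) \;=\; \frac{P(x, y, z)}{P(x, z)} .
```

The conditional $P(y \mid x, z)$ is defined only when $P(x, z) > 0$. If some value $z$ has $P(z) > 0$ but $P(x, z) = 0$, the corresponding term in the sum is undefined, so the right-hand side cannot be evaluated from the observational distribution alone.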
Negar Kiyavash received her Ph.D. degree in electrical and computer engineering (ECE) from the University of Illinois at Urbana-Champaign in 2006. She is the chair of business analytics at the College of Management of Technology at Ecole Polytechnique Fédérale de Lausanne (EPFL). Prior to joining EPFL, she was a joint associate professor in the H. Milton Stewart School of Industrial and Systems Engineering and in the School of ECE at the Georgia Institute of Technology (Georgia Tech).
Before joining Georgia Tech, she was a Willett Faculty Scholar at the University of Illinois and a joint associate professor of Industrial and Enterprise Engineering and ECE. She is a recipient of the National Science Foundation CAREER Award and the Air Force Office of Scientific Research Young Investigator Research Program Award, as well as the Illinois College of Engineering Dean’s Award for Excellence in Research. Her research interests are in the design and analysis of algorithms for network inference.