Keynotes and Panels
Expo Track Keynote: Computer vision at scale: Driving customer innovation and industry adoption
Swami Sivasubramanian
Dr. Swami Sivasubramanian is the Vice President of AI & Data at AWS. In this role, Swami oversees all AWS AI and Data Services. His team’s mission is to help organizations leverage the power of AI and data to solve their most urgent business needs.
Swami and his team innovate across multiple areas of the AI and data stack. Swami’s team works across three layers of the AI stack: (1) Amazon SageMaker and optimized deep learning frameworks/engines in the bottom layer of the stack, which is for developers and companies wanting to build foundation models (FMs); (2) Amazon Bedrock, which forms the middle layer of the stack and is for customers seeking to leverage an existing foundation model, customize it with their own data, and get access to features like RAG, Guardrails, etc., to build a GenAI application, all as a managed service. Amazon Bedrock, the first managed service of its kind, provides customers with the easiest way to build and scale GenAI applications with the broadest selection of first-party and third-party FMs, as well as leading ease-of-use capabilities that allow GenAI builders to get higher quality model outputs more quickly; (3) GenAI applications in the top layer of the stack, with Amazon Q being the primary application to call out. Amazon Q is an expert on AWS that writes, debugs, tests, and implements code, while also doing transformations (like moving from an old version of Java to a new one), and querying customers’ various data repositories (e.g., intranets, wikis, Salesforce, Amazon S3, ServiceNow, Slack, Atlassian, etc.) to answer questions, summarize data, carry on coherent conversation, and take action. Q is the most capable work assistant available today and continues to evolve quickly. Most AI applications heavily rely on data, so Swami also leads teams focused on helping customers with data preparation (EMR, Glue), data catalog and governance (Amazon DataZone) and BI/analytics (with Amazon QuickSight).
Since joining Amazon in 2005, Swami has also led the AWS Analytics and Databases portfolio and helped to build AWS services including Amazon S3, Amazon CloudFront, Amazon RDS, and Amazon DynamoDB. In September 2023, Swami joined the Amazon senior leadership team, or S-team.
Swami has been awarded more than 250 patents, authored 40 refereed scientific papers and journals, and participates in several academic circles and conferences. Swami is also a member of the National Artificial Intelligence Advisory Committee, which is tasked with advising the President of the United States and the National AI Initiative Office on topics related to the National AI Initiative.
It is still laughably easy to foil SotA AI with adversarial attacks. Why? Because such systems lack embodiment. But dropping deep learners into robots and calling the result Embodied AI misses the mark: Embodiment is about more than just having a body; it is about change. Consider how much the world changes from the perspective of a human as she grows from one cell into 36x10^12 of them. Grappling with such massive internal change makes grappling with external change, like learning to read or drive, easy by comparison. Thus, to realize safe AI, we must similarly create autonomous technologies whose internal physical changes pretrain them to handle external change, like new tasks or adversarial attacks. I will demonstrate some soft and biological robots capable of this “morphological pre-training” and point out several paths via which the CVPR community can join us in creating such a future.
Joshua Bongard
Josh Bongard is the Veinott Professor of Computer Science at the University of Vermont and director of the Morphology, Evolution & Cognition Laboratory. His work involves automated design and manufacture of soft-, evolved-, crowdsourced-, and biological robots (so-called "xenobots"). A PECASE, TR35, and Cozzarelli Prize recipient, he has received funding from NSF, NASA, DARPA, ARO and the Sloan Foundation. He is the co-author of the book How The Body Shapes the Way We Think, the instructor of a reddit-based evolutionary robotics MOOC, and director of the robotics outreach program Twitch Plays Robotics.
(Overflow A&B)
Moderator: Nicole Decari, Director of AI & Society at the Allen Institute for AI (AI2)
Fei-Fei Li
Dr. Fei-Fei Li is the inaugural Sequoia Professor in the Computer Science Department at Stanford University, and Co-Director of Stanford’s Human-Centered AI Institute. She served as the Director of Stanford’s AI Lab from 2013 to 2018. During her sabbatical from Stanford from January 2017 to September 2018, Dr. Li was Vice President at Google and served as Chief Scientist of AI/ML at Google Cloud. Since then she has served as a board member or advisor at various public and private companies.
Peter Lee
Dr. Peter Lee is President, Microsoft Research. He leads Microsoft Research and incubates new research-powered products and lines of business in areas such as artificial intelligence, computing foundations, health, and life sciences. Before joining Microsoft in 2010, he was at DARPA, where he established a new technology office that created operational capabilities in machine learning, data science, and computational social science. Prior to that, he was a professor and the head of the computer science department at Carnegie Mellon University. Dr. Lee is a member of the National Academy of Medicine and serves on the boards of the Allen Institute for Artificial Intelligence, the Brotman Baty Institute for Precision Medicine, and the Kaiser Permanente Bernard J. Tyson School of Medicine. He served on President Obama’s Commission on Enhancing National Cybersecurity. He has testified before both the US House Science and Technology Committee and the US Senate Commerce Committee. With Carey Goldberg and Dr. Isaac Kohane, he is the coauthor of the best-selling book, “The AI Revolution in Medicine: GPT-4 and Beyond.” In 2024, Peter Lee was named by Time magazine as one of the 100 most influential people in health and life sciences.
Oren Etzioni
Dr. Oren Etzioni is the founder of TrueMedia.org, a nonprofit fighting political deepfakes. He was the Founding Chief Executive Officer at the Allen Institute for AI (AI2), having served as CEO from its inception in 2013 until late 2022. He is Professor Emeritus at the University of Washington where he helped to pioneer meta-search, online comparison shopping, machine reading, and open information extraction. He has authored several award-winning technical papers, achieving an H-index of 100 (100 technical papers each cited over 100 times). Finally, he is a technical director of the AI2 Incubator and a Venture Partner at Madrona. He has founded several companies including Farecast (acquired by Microsoft).
Matt McIlwain
Matt is passionate about founders building companies that leverage applied machine learning and cloud computing to solve problems better than ever. They can be intelligent applications for enterprise or “intersections of innovation” — where life science and data science intersect.
Matt graduated from Dartmouth College and holds an MBA from Harvard Business School and a master’s in public policy from Harvard’s Kennedy School of Government. He has been on the Forbes Midas list and Top 100 Venture Capitalists by CB Insights and The New York Times several times. In 2017, Matt was named Emerging Company Director of the year by the Puget Sound Business Journal in partnership with the prestigious National Association of Corporate Directors’ Northwest chapter. In 2011, he received the Washington Policy Center’s Champion of Freedom Award.
Hadi Partovi
Hadi Partovi is a tech entrepreneur and investor, and CEO of the education nonprofit Code.org.
Born in Tehran, Iran, Hadi grew up during the Iran-Iraq war. His school did not offer computer science classes, so he taught himself to code at home on a Commodore 64. After immigrating to the United States, he spent his summers working as a software engineer to help pay his way through high school and college. Upon graduating from Harvard University with a master’s degree in computer science, Hadi pursued a career in technology, starting at Microsoft, where he rose into the executive ranks. He founded two startups: Tellme Networks (acquired by Microsoft), and iLike (acquired by Newscorp). Hadi now invests in and advises other technology startups.
In 2013 Hadi and his twin brother Ali launched the education nonprofit Code.org, which Hadi continues to lead full-time as CEO. Code.org has established computer science classes reaching 30% of US students, created the most broadly used curriculum platform for K-12 computer science, and launched the global Hour of Code movement that has reached hundreds of millions of students spanning every country in the world.
Hadi has served as an early advisor or investor at many tech startups including Facebook, Dropbox, Airbnb, and Uber. He currently serves on the Board of Directors of Axon and MNTN.
Expo Track Keynote: Today’s Pictures, Tomorrow’s Training Data: The Synergy Between Human Creativity and AI
Join Andrea Gagliano, Senior Director of AI/ML at Getty Images, for an engaging session that explores the dynamic, symbiotic relationship between artificial intelligence and human creativity. The conversation will focus on how training data plays a crucial role in the effectiveness and quality of outputs, specifically how AI models can generate more authentic and culturally relevant visual content that better reflects modern society. Andrea will also discuss the imperative of supporting creators and respecting intellectual property through content licensing, as well as recurring compensation, to ensure that models are fueled responsibly and that technologies can be harnessed to push the boundaries of AI creativity rather than hinder it.
Andrea Gagliano
Andrea Gagliano, Senior Director of AI/ML at Getty Images, leads the teams responsible for improving content discovery experiences. This includes visual search and generative AI. Andrea believes in the power of AI/ML to enhance and inspire creatives, which is work she began in her graduate studies at UC Berkeley.
Proteins mediate the critical processes of life and beautifully solve the challenges faced during the evolution of modern organisms. Our goal is to design a new generation of proteins that address current-day problems not faced during evolution. In contrast to traditional protein engineering efforts, which have focused on modifying naturally occurring proteins, we design new proteins from scratch to optimally solve the problem at hand. Increasingly, we develop and use deep learning methods to design amino acid sequences that are predicted to fold to desired structures and functions. We also produce synthetic genes encoding these sequences and characterize them experimentally. In this talk, I will describe several recent advances in protein design.
David Baker
David Baker is the director of the Institute for Protein Design, a Howard Hughes Medical Institute Investigator, a professor of biochemistry, and an adjunct professor of genome sciences, bioengineering, chemical engineering, computer science, and physics at the University of Washington. His research group is focused on the design of macromolecular structures and functions. Dr. Baker has published over 600 research papers, been granted over 100 patents, and co-founded 17 companies. Over 70 of his mentees have gone on to independent faculty positions. David received his PhD in biochemistry with Randy Schekman at UC Berkeley and did postdoctoral work in biophysics with David Agard at UCSF. Dr. Baker is a recipient of the Breakthrough Prize in Life Sciences and is a member of the National Academy of Sciences and the American Academy of Arts and Sciences.
Expo Track Keynote: Phase Transition in AI: Opportunities and Gaps Towards Making AI Real
Recent advances in AI not only created promises for what AI can do, but also introduced questions about how to bring this promise to reality in real-world applications in a responsible way. In this talk, I will describe my journey at Microsoft Research from being amazed by the sparks of GPT-4 to understanding the limitations of the current family of models and driving research on what comes next. I will discuss research directions we are pursuing to make future AI systems more efficient, sustainable, controllable and valuable through innovations in model training, agent technologies and engineering practices. I will conclude with reflections on our unified responsibility in balancing the promise of AI with rising risks and concerns.
Ece Kamar
Ece Kamar is the Managing Director of the AI Frontiers Lab, where she leads research and development towards pushing the frontiers of AI capabilities. She has a decade of experience studying the impact of AI on society and developing AI systems that are reliable, unbiased and trustworthy. Her work integrates techniques from artificial intelligence, human-computer interaction, responsible AI, and AI safety. She has been instrumental in building the Responsible AI efforts inside Microsoft. She serves as Technical Advisor for Microsoft’s Internal Committee on AI, Engineering and Ethics. Ece is an Affiliate Faculty in the Department of Computer Science and Engineering at the University of Washington and is currently serving on the National Academies' Computer Science and Telecommunications Board (CSTB).
Sofia Crespo discusses her artistic practice and creative journey, focusing on the use of generative systems, and particularly neural networks, as a means to explore speculative lifeforms.
Sofia Crespo
Sofia Crespo is an artist with a deep interest in biology-inspired technologies. One of her main focuses is the way organic life uses artificial mechanisms to simulate itself and evolve, which implies that technologies are a biased product of the organic life that created them and not a completely separate object. Crespo looks at the similarities between techniques of AI image formation and the way that humans express themselves creatively and cognitively recognize their world.
Her work brings into question the potential of AI in artistic practice and its ability to reshape our understandings of creativity. On the side, she is also hugely concerned with the dynamic change in the role of the artists working with machine learning techniques. She's also one half of the artist duo Entangled Others alongside Feileacan McCormick.
(Overflow A&B)
Moderator: Kiana Ehsani, Senior Research Scientist @PRIOR @Allen Institute for AI
Panelists:
Dima Damen, Professor of Computer Vision, University of Bristol and Senior Research Scientist at Google DeepMind.
Cordelia Schmid, Head of the THOTH project team at INRIA
Ranjay Krishna, Assistant Professor, University of Washington
Dima Damen
Dima Damen is a Professor of Computer Vision at the University of Bristol and Senior Research Scientist at Google DeepMind. Dima is currently an EPSRC Fellow (2020-2025), focusing her research on the automatic understanding of object interactions, actions and activities using wearable visual (and depth) sensors. She is best known for her leading works in Egocentric Vision, and has also contributed to novel research questions including mono-to-3D, video object segmentation, assessing action completion, domain adaptation, skill/expertise determination from video sequences, discovering task-relevant objects, dual-domain and dual-time learning as well as multi-modal fusion using vision, audio and language. She is the project lead for EPIC-KITCHENS, the seminal dataset in egocentric vision, with accompanying open challenges and follow-up works: EPIC-Sounds, VISOR and EPIC Fields. She is part of the large-scale consortium efforts Ego4D and Ego-Exo4D. Dima is Associate Editor-in-Chief of IEEE TPAMI and associate editor of IJCV, and was a program chair for ICCV 2021. She is frequently an Area Chair in major conferences and was selected as Outstanding Reviewer in CVPR 2021, CVPR 2020, ICCV 2017, CVPR 2013 and CVPR 2012. At Google DeepMind, Dima is part of the Vision team, led by Andrew Zisserman, focusing on video understanding research. Her latest contribution is to the Perception Test project on measuring perception in AI models.
Cordelia Schmid
Cordelia Schmid holds an M.S. degree in Computer Science from the University of Karlsruhe and a Doctorate, also in Computer Science, from the Institut National Polytechnique de Grenoble (INPG). Her doctoral thesis on "Local Greyvalue Invariants for Image Matching and Retrieval" received the best thesis award from INPG in 1996. She received the Habilitation degree in 2001 for her thesis entitled "From Image Matching to Learning Visual Models". Dr. Schmid was a post-doctoral research assistant in the Robotics Research Group of Oxford University in 1996-1997. Since 1997 she has held a permanent research position at INRIA Rhone-Alpes, where she is a research director and directs an INRIA team. Dr. Schmid is the author of over a hundred technical publications. She has been an Associate Editor for IEEE PAMI (2001-2005) and for IJCV (2004-2012), editor-in-chief for IJCV (2013-present), a program chair of IEEE CVPR 2005 and ECCV 2012 as well as a general chair of IEEE CVPR 2015. In 2006, 2014 and 2016, she was awarded the Longuet-Higgins prize for fundamental contributions in computer vision that have withstood the test of time. She is a fellow of IEEE. She was awarded an ERC advanced grant in 2013, the Humboldt research award in 2015 and the Inria & French Academy of Science Grand Prix in 2016. She was elected to the German National Academy of Sciences, Leopoldina, in 2017. In 2018 she received the Koenderink prize for fundamental contributions in computer vision that have withstood the test of time. Since 2018 she has held a joint appointment with Google Research.
Ranjay Krishna
Ranjay Krishna is an Assistant Professor at the Paul G. Allen School of Computer Science & Engineering. His research lies at the intersection of computer vision and human-computer interaction. This research has received best paper, outstanding paper, and oral presentation recognition at CVPR, ACL, CSCW, NeurIPS, UIST, and ECCV, and has been reported by Science, Forbes, the Wall Street Journal, and PBS NOVA. His research has been supported by Google, Amazon, Cisco, Toyota Research Institute, NSF, ONR, and Yahoo. He holds a bachelor's degree in Electrical & Computer Engineering and in Computer Science from Cornell University, a master's degree in Computer Science from Stanford University and a Ph.D. in Computer Science from Stanford University.