How Artificial Intelligence ‘Sees’ the World – Answers from Computer Vision

As one of the most popular subfields of artificial intelligence and machine learning, computer vision has attracted wide public attention. Can computer vision match the resolving power of the human eye, and in which fields does it hold promise? The Computer Vision course answers these questions and more.

The Computer Vision course, taught by Professor Gao Shenghua from the School of Information, is a major elective aimed at senior undergraduate students. The course covers the basic concepts and methods of computer vision, including imaging, basic image processing, 3D reconstruction, and object detection, recognition, and segmentation, and introduces cutting-edge research results in the discipline. Through case studies and practical exercises, it develops students’ hands-on skills and research mindset, helping them learn to apply computer vision technology to real-world problems.

Image | Professor Gao Shenghua Teaching

Cutting-Edge Course Content, Emphasizing Innovation

“Compared to traditional courses, a major highlight of this course is that it does not stick to a rigid knowledge structure and system. The course focuses on the 3Rs of computer vision (recognition, reorganization, reconstruction) and explains numerous modern applications of computer vision, giving me a comprehensive and in-depth understanding of this discipline,” said Wang Zhiyi, a 2020 undergraduate student from the School of Information.

Each semester, the instructor assigns a substantial practical project. Students choose topics based on course content, literature research, and their own interests, form teams to complete the project, and present academic reports in class. This format assesses students’ skills in literature review, project planning, engineering optimization, academic writing, and presentation.

Chen Jiaben, a 2018 undergraduate student, said, “What impressed me most was the course practice project. The topic I chose was video frame interpolation, and it was my first experience of the entire research process. During the presentation, the teacher’s encouragement and suggestions benefited me greatly. That experience ignited my passion for research in computer vision, and I now often discuss class topics with the teacher, gaining new inspiration and motivation from every exchange.”

Image | Classroom Discussion

Diverse and Interesting Teaching Methods, Emphasizing Practice

“What I gained most was not only learning many algorithms for traditional computer vision tasks and implementing some applications of artificial intelligence in this field, but also a strong interest in exploring the field in depth,” said Wu Xiaohan, a 2019 undergraduate student. “In the past, when I encountered terms like ‘face detection’, ‘3D reconstruction’, and ‘neural networks’, I always thought such technologies were very advanced and difficult to achieve. However, through the teacher’s step-by-step instruction, I realized that undergraduates like me can also accomplish many interesting tasks that once seemed out of reach.”

In addition to regular lectures and instructor Q&A, the course also uses teaching assistants and the Piazza platform to provide students with various handouts and references, encouraging self-directed learning and hands-on practice.

“The course assignments are highly open-ended. Through continuous trial and error, I came to understand not only ‘why this works’ but also ‘why that doesn’t work’, which is invaluable to me,” said Wang Zhiyi.

Image | Teaching Assistant Q&A

Challenges and Fun Go Hand in Hand

Rewarding experiences do not come easily; they take solid effort.

Sha Yuting, a 2018 graduate student, recalled, “Computer vision has always been a research direction I am very interested in. Taking this course gave me a comprehensive understanding of computer vision, from classic classification, detection, and segmentation tasks to deeper generative and reconstruction tasks, all filled with challenges and fun. My first attempt at the practical project was very difficult, but as I kept working at it bit by bit, the problems were resolved one by one. Completing it and achieving the expected results brought me immense satisfaction.”

Although the course sounds very interesting, its content and application areas are broad and require mastery of many topics. Wu Xiaohan said, “The difficulty of this course is that, in addition to a certain level of Python programming skill (and learning the related libraries), students also need some understanding of linear algebra, probability theory, signal processing, and other subjects. It is therefore crucial to review and master the relevant material in advance; otherwise, class may turn out to be ‘a real headache’.”

Professor Profile

Gao Shenghua is a tenured associate professor, researcher, and doctoral supervisor at ShanghaiTech University. He has been selected for the National Overseas High-Level Talent Program Youth Project, the Shanghai Pujiang Talent Program, the Shuguang Scholar program, and the Zhangjiang National Science Center Outstanding Innovation and Entrepreneurship Talent award (Innovation Category). His research covers image and video understanding and 3D reconstruction. He received his bachelor’s degree from the University of Science and Technology of China in 2008 and his PhD from Nanyang Technological University in Singapore in 2012. He then worked as a research scientist at the Advanced Digital Sciences Center in Singapore, a research center of the University of Illinois at Urbana-Champaign. In 2014, he joined the School of Information at ShanghaiTech University. To date, he has published over 120 papers in top conferences and journals in the field of computer vision, with over 9,400 citations. He has served as a program chair for top international conferences such as ICCV, CVPR, and AAAI, and as an associate editor for journals in the field, including IEEE TCSVT and Neurocomputing. He has led or participated in multiple projects funded by the National Natural Science Foundation of China, major projects of the Ministry of Science and Technology, and the Shanghai Natural Science Foundation.

Want to learn more about exciting courses?

Want to explore the classrooms of the School of Information at ShanghaiTech University?

Welcome to apply to ShanghaiTech University’s School of Information!

Source | Interviewed Students

Layout | Li Hongxia

Editor | Gao Zhengchun

ShanghaiTech University School of Information

ID: ShanghaiTech-SIST
