Keynote Speeches – DAS 2020

Dr. Tong Sun

Document Intelligence Lab, Adobe

Title: The Future of Document: A New Frontier in the New Decade

Biography: Dr. Tong Sun leads the Document Intelligence Lab at Adobe, which aims to reinvent the document of the future in the era of AI and machine learning. Tong is a seasoned technology innovator and thought leader with 15+ years of leadership in incubating new concepts through state-of-the-art scalable machine learning methods and tools, developing impactful rapid prototypes, and delivering competitive technologies to market in cross-disciplinary and cross-functional team environments. Her research interests include natural language processing and understanding, distributed machine learning, big data computing, and human-computer interaction. She holds 22 issued US patents and has 40+ peer-reviewed publications in prestigious conferences and journals. Prior to joining Adobe, Tong was the Director of the Scalable Data Analytics Research Lab at Xerox PARC and the Group Leader of Decision Support and Machine Intelligence at United Technologies Research Center.

Abstract: Documents have been an integral part of our day-to-day business for centuries. Today’s enterprises leverage documents to drive all essential and business-critical processes and functions. Over the past three decades, several paradigm shifts have reshaped the ways in which humans interact with documents: (1) the transition from paper to digital documents in the 1990s; (2) the always-on, always-connected social media and mobile devices of the 2000s, which enable people to access, share, and collaborate on documents instantly; and (3) the rise and widening success of AI and machine learning technologies in the 2010s, which unlock intelligence from documents for deeper automation. At the dawn of the 2020s, the new decade presents us with unprecedented challenges and opportunities: What are the emerging disruptive trends? Where is the future of documents heading? What key research questions do we need to tackle in order to invent the document of the future? We truly believe the future of documents lies in the conjunction of three aspects: deep structural and semantic understanding, novel content formats for next-generation infrastructure, and magical user experiences with documents enabled by emerging technologies. In this talk, I will share our state-of-the-art research, which spans multi-modal document understanding, content forensics and trustworthy documents, and futuristic human-document interactions (e.g., editing, annotating, reading, collaboration) in mixed-reality and cross-modal interfaces.

Prof. Lianwen Jin

South China University of Technology

Title: Optical Character Recognition in the Deep Learning Era

Biography: Lianwen Jin received the B.S. degree from the University of Science and Technology of China, Anhui, China, and the Ph.D. degree from the South China University of Technology, Guangzhou, China, in 1991 and 1996, respectively. He is currently a Professor with the School of Electronic and Information Engineering, South China University of Technology. He is the author of more than 200 scientific papers. Dr. Jin was a recipient of the award of New Century Excellent Talent Program of MOE in 2006 and the Guangdong Pearl River Distinguished Professor Award in 2011. His research interests include optical character recognition, handwriting analysis and recognition, machine learning, deep learning, and computer vision.

Abstract: As one of the most fundamental and influential inventions of humanity, text has played an important role in human life. The rich and precise semantic information carried by text is important in a wide range of vision-based application scenarios, such as image search, image understanding, industrial automation, information security, instant translation, intelligent finance, and robot navigation. In recent years, with the fast development of deep learning theory and technology, optical character recognition (OCR) has achieved great progress in many areas, such as unconstrained handwritten text recognition, camera-based printed document analysis and recognition, and scene text detection and recognition in the wild. OCR technology has also attracted extensive attention from both academia and industry, and it plays a crucial role in many real-world AI systems. In this talk, I will briefly introduce the state of the art of various deep learning methods in the field of OCR, and specifically introduce the main research progress in handwritten text recognition, signature verification and writer identification, and scene text detection and recognition. I will also discuss some unsolved problems, new research topics and challenges, and future research trends.

Prof. C.V. Jawahar

IIIT Hyderabad, India

Title: Document Understanding Beyond Text Recognition

Biography: Prof. C.V. Jawahar holds the Amazon Chair at IIIT Hyderabad, India, where he leads a group focusing on computer vision, machine learning, and multimedia systems. In recent years, he has been looking into a set of problems at the intersection of vision, language, and text, as well as large-scale multimedia retrieval systems. He has many publications in top-tier computer vision, robotics, and document image processing conferences, with contributions in the areas of scene text understanding, Indian language OCRs, and handwritten text recognition, among others. Presently, he is an area editor of CVIU and an associate editor of IEEE PAMI. He is a Fellow of IAPR and INAE.

Abstract: The fundamental problem in document understanding has been the recognition of the textual content present in document images. Recent years have seen great advances in text recognition performance with innovative deep learning architectures. This may be the time to catch up on the other information-rich components that are present in document images. These include objects like tables, figures, and equations, and the roles of text as captions and headings. Semantic understanding of documents will be highly limited if we do not understand these objects and cues. Segmentation should no longer be the necessary evil before text recognition. Innovations in document image design are also reducing the gap between classical natural image understanding and modern document image understanding. This talk aims to peek into this emerging space and discuss the research directions and recent advances.