{"id":337,"date":"2020-09-03T17:20:44","date_gmt":"2020-09-03T20:20:44","guid":{"rendered":"https:\/\/wp.ufpel.edu.br\/dpvsa\/?page_id=337"},"modified":"2021-01-05T11:36:49","modified_gmt":"2021-01-05T14:36:49","slug":"lei-zhang","status":"publish","type":"page","link":"https:\/\/wp.ufpel.edu.br\/dpvsa\/lei-zhang\/","title":{"rendered":"Lei Zhang"},"content":{"rendered":"<p style=\"text-align: center\"><strong><span style=\"font-size: 18pt\">Lei Zhang<\/span><br \/>\nMicrosoft<br \/>\n<\/strong><\/p>\n<p><strong style=\"text-align: center\"><span style=\"font-size: 18pt\"><strong style=\"font-size: 16px\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-411 aligncenter\" src=\"https:\/\/wp.ufpel.edu.br\/dpvsa\/files\/2020\/09\/DPVSA-speaker-photos-LeiZhang.png\" alt=\"Lei Zhang\" width=\"300\" height=\"300\" srcset=\"https:\/\/wp.ufpel.edu.br\/dpvsa\/files\/2020\/09\/DPVSA-speaker-photos-LeiZhang.png 300w, https:\/\/wp.ufpel.edu.br\/dpvsa\/files\/2020\/09\/DPVSA-speaker-photos-LeiZhang-200x200.png 200w, https:\/\/wp.ufpel.edu.br\/dpvsa\/files\/2020\/09\/DPVSA-speaker-photos-LeiZhang-32x32.png 32w, https:\/\/wp.ufpel.edu.br\/dpvsa\/files\/2020\/09\/DPVSA-speaker-photos-LeiZhang-80x80.png 80w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/strong><\/span><\/strong><\/p>\n<h3 style=\"text-align: center\"><span style=\"font-size: 18pt\"><em>Unified representation learning for vision-language understanding<\/em><\/span><\/h3>\n<p style=\"text-align: center\">Tuesday, October 20th, 4:30 PM &#8211; 6:00 PM (GMT-3)<\/p>\n<div style=\"text-align: center\">\n<iframe loading=\"lazy\" width=\"900\" height=\"500\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture\" fs=\"1\" src=\"https:\/\/www.youtube.com\/embed\/gOCZnmiKdEE\" scrolling=\"yes\" class=\"iframe-class\"><\/iframe>\n<\/div>\n<p><span style=\"font-size: 
14pt\"><strong>Abstract<\/strong><\/span><\/p>\n<p style=\"padding-left: 40px;text-align: justify\">Large-scale pre-training methods that learn cross-modal representations from image-text pairs are becoming popular for vision-language tasks. In this talk, I will first introduce our research on VLP (unified Vision-Language Pre-training), which pre-trains a model on two objectives, bidirectional and sequence-to-sequence (seq2seq) prediction, to learn a unified representation for both understanding and generation tasks. To encourage vision- and language-aligned representations, we further developed OSCAR (Object-Semantics Aligned Pre-training), which uses object tags detected in images as anchor points to ease the learning of object-semantics alignment, setting new state-of-the-art results on six well-established vision-language understanding and generation tasks. I will present extensive image captioning examples and analysis to provide insight into the effectiveness of the learned VL-aligned representation.<\/p>\n<p><span style=\"font-size: 14pt\"><strong>Biography<\/strong><\/span><\/p>\n<p style=\"padding-left: 40px;text-align: justify\">Lei Zhang is a Principal Researcher and Research Manager of the computer vision research group in Microsoft Cloud &amp; AI, leading a team working on visual recognition and computer vision. The team has made a significant impact on Microsoft Cognitive Services, including image tagging, object detection, entity recognition, and image captioning. Prior to this, he worked at Microsoft Research Asia for 12 years as a Senior Researcher and later at Bing Multimedia Search for 2 years as a Principal Engineering Manager. He is an IEEE Fellow, has published 150+ papers, and holds 50+ U.S. 
patents for his innovations in related fields.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Lei Zhang Microsoft Unified representation learning for vision-language understanding Tuesday, October 20th, 4:30 PM &#8211; 6:00 PM (GMT-3) Abstract Large-scale pre-training methods that learn cross-modal representations from image-text pairs are becoming popular for vision-language&#46;&#46;&#46;<\/p>\n","protected":false},"author":748,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-337","page","type-page","status-publish","hentry"],"publishpress_future_action":{"enabled":false,"date":"2026-04-16 21:13:22","action":"change-status","newStatus":"draft","terms":[],"taxonomy":"","extraData":[]},"publishpress_future_workflow_manual_trigger":{"enabledWorkflows":[]},"_links":{"self":[{"href":"https:\/\/wp.ufpel.edu.br\/dpvsa\/wp-json\/wp\/v2\/pages\/337","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wp.ufpel.edu.br\/dpvsa\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/wp.ufpel.edu.br\/dpvsa\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/wp.ufpel.edu.br\/dpvsa\/wp-json\/wp\/v2\/users\/748"}],"replies":[{"embeddable":true,"href":"https:\/\/wp.ufpel.edu.br\/dpvsa\/wp-json\/wp\/v2\/comments?post=337"}],"version-history":[{"count":10,"href":"https:\/\/wp.ufpel.edu.br\/dpvsa\/wp-json\/wp\/v2\/pages\/337\/revisions"}],"predecessor-version":[{"id":898,"href":"https:\/\/wp.ufpel.edu.br\/dpvsa\/wp-json\/wp\/v2\/pages\/337\/revisions\/898"}],"wp:attachment":[{"href":"https:\/\/wp.ufpel.edu.br\/dpvsa\/wp-json\/wp\/v2\/media?parent=337"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}