2022/08/07 11th All-Japan Computer Vision Study Group "CVPR2022 Reading Session" (Part 1): Materials Roundup

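This is the 11th meeting of the All-Japan Computer Vision Study Group, held jointly by the Kanto, Nagoya, and Kansai computer vision study groups.

As has become customary, we held an online paper reading session for CVPR2022, a top conference in computer vision. Because the CVPR reading sessions attract many prospective presenters every time, this one is again split into two parts.

This post covers Part 1 and collects the related links.

Registration site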

kantocv.connpass.com

Togetter

togetter.com

YouTube

youtu.be

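Presentation materials

All presentation materials have been uploaded to the Slack workspace used for Q&A during the study group, but the links below cover only the materials that the presenters themselves made public on Twitter.

Presenter	Paper title	Slides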
takmin	Learning to Solve Hard Minimal Problems	https://speakerdeck.com/takmin/learning-to-solve-hard-minimal-problems
shade-tree	It Is Okay To Not Be Okay: Overcoming Emotional Bias in Affective Image Captioning by Contrastive Data Collection	https://speakerdeck.com/forest1988/it-is-okay-to-not-be-okay-overcoming-emotional-bias-in-affective-image-captioning-by-contrastive-data-collection
carnavi	TableFormer: Table Structure Understanding with Transformers	https://www.slideshare.net/RyoKawanami/tableformercarnavipdf
inoichan	Time3D: End-to-End Joint Monocular 3D Object Detection and Tracking for Autonomous Driving	https://speakerdeck.com/inoichan/cvpr2022du-mihui-time3d-end-to-end-joint-monocular-3d-object-detection-and-tracking-for-autonomous-driving
kzykmyzw	EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation	https://www.slideshare.net/KazuyukiMiyazawa/20220807japancvkzykmyzwepropnppdf
kano_sawa	DoubleField: Bridging the Neural Surface and Radiance Fields for High-Fidelity Human Reconstruction and Rendering	https://speakerdeck.com/kanosawa/doublefield-cvmian-qiang-hui-cvpr2022
alfredplpl	Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding - Paper demo: How to raise Imagen at home to create "Eternal Force Blizzard" -	https://www.docswell.com/s/alfredplpl/59YQLK-2022-08-07-145902
n-watanabe	Learning Neural Light Fields with Ray-Space Embedding	https://temporal-nw-slide-japancv-cvpr2022-reading-20220807.netlify.app/
toni	Balanced Multimodal Learning via On-the-fly Gradient Modulation	https://www.slideshare.net/AntonioTejerodePablo/cvpr2022-paper-reading-balanced-multimodal-learning-all-japan-computer-vision-study-group-20220807
losnuevetoros	Searching for Parts: I read PubTables-1M: Towards comprehensive table extraction from unstructured documents, XYLayoutLM: Towards Layout-Aware Multimodal Networks For Visually-Rich Document Understanding, V-Doc: Visual questions answers with Documents, and Cross-modal Clinical Graph Transformer for Ophthalmic Report Generation; TableFormer: Table Structure Understanding with Transformers, Neural Collaborative Graph Machines for Table Structure Recognition, Revisiting Document Image Dewarping by Grid Regularization, and Fourier Document Restoration for Robust Document Dewarping and Recognition caught my interest, but I did not read them.	https://speakerdeck.com/yushiku/patutan-si

My own presentation slides are here.

speakerdeck.com

Part 2 is scheduled for 8/21.

11th All-Japan Computer Vision Study Group (Part 2) - connpass

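takminの書きっぱなし備忘録 (takmin's jotted-down memos) @ Hatena Blog

Mainly about technology such as computer vision, with occasional posts about myself and things that come to mind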
