CI1176 - Topics in Computer Vision

BCC, IBM & PPGInf - Class 2023-1

Professor: David Menotti @inf.ufpr.br @gmail.com

Workload: 60 hours of lectures and collaborative classes.

Schedule: Tuesdays and Thursdays, 15:40 to 17:20

Room: PA 02 (Tue) & PC 18 (Thu) (in person) / Google Meet (online)

Assessment

80% - four equally weighted presentations

20% - attendance

100% of the 20%: at most 3 absences
50% of the 20%: 4 to 5 absences
0% of the 20%: 6 to 7 absences
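
The weighting above can be sketched as a small calculation. This is only an illustration: the function names are hypothetical, each presentation is assumed to be scored from 0 to 20 points (four equal shares of the 80%), and the rule lists no tier beyond 7 absences, so the sketch assumes 0 attendance points in that case.

```python
def attendance_points(absences: int) -> float:
    """Attendance component, worth up to 20 points of the final grade."""
    if absences < 0:
        raise ValueError("absences must be non-negative")
    if absences <= 3:
        return 20.0  # 100% of the 20%
    if absences <= 5:
        return 10.0  # 50% of the 20%
    # 6 or 7 absences earn 0% of the 20%. More than 7 absences is not
    # covered by the stated rule; returning 0 is an assumption of this sketch.
    return 0.0


def final_grade(presentations: list[float], absences: int) -> float:
    """Add four presentation scores (assumed 0-20 each) to the attendance points."""
    if len(presentations) != 4:
        raise ValueError("expected exactly four presentation scores")
    if any(not 0 <= score <= 20 for score in presentations):
        raise ValueError("each presentation score must be between 0 and 20")
    return sum(presentations) + attendance_points(absences)


if __name__ == "__main__":
    # Four full-score presentations and two absences.
    print(final_grade([20, 20, 20, 20], absences=2))
    # One missed presentation and four absences.
    print(final_grade([20, 20, 20, 0], absences=4))
```

Keeping the attendance tiers in their own function makes the thresholds easy to adjust if the rule is interpreted differently.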

Syllabus

Reading, study, and presentation of scientific papers published in high-impact, relevant venues, covering a range of topics in computer vision.

Important Dates

21/03/23 First meeting
06/04/23 Maundy Thursday (no class)
01/06/23 Feira de Cursos e Profissões (no class)
08/06/23 Corpus Christi
06/07/23 Final

Tentative Schedule

##	DATE	ACTIVITY / TOPIC	PRESENTER
1	21/3	Course introduction	David Menotti
2	23/3	Inaugural lecture	David Menotti
3	28/3	MLP, LeNet & AlexNet	Eduardo G., David & Leticia
4	30/3	ResNet, LSTM & ViT	Leonardo, Gabriel N. & Pedro
5	04/4	Object Classification	André, Eduardo G. & Gabriel P.
	06/4		No class
	07/4	Good Friday	Friday
6	11/4	Object Detection	Leticia, Rayson & Gabriel P.
7	13/4	Image Segmentation	Leticia, Felipe & Vinícius
8	18/4	Generative Models	Felipe, Leonardo & Eduardo G.
9	20/4	Vehicle Identification	Rayson, Eduardo S. & Sabry
	21/4	Tiradentes	Friday
10	25/4	Aut. License Plate Recognition SR	Eduardo S., Gabriel P. & Valfride
11	27/4	Face Detection	Eduardo S. & Bernardo
	01/5	Labor Day	Monday
12	02/5	Face Recognition	André, Gabriel P. & Bernardo
13	04/5	Face Recognition 3D	Sabry, Pedro & Bernardo
14	09/5	Face Masked Recognition	Felipe, Sabry & Michel
15	11/5	Face Recognition SR	Eduardo G., Pedro & João Picolo
16	16/5	General Spoofing	Rodrigo & Bruno Kamarowski
17	18/5	Face Anti-Spoofing (Passive)	Sabry, Felipe & Raul Almeida
18	23/5	Face Anti-Spoofing (Active)	Vinícius, Leonardo & Bruno Kamarowski
19	25/5	Facial Expression	Vinícius & Eduardo S.
20	30/5	Natural Language Processing	Lucas Wojcik
	01/6	Feira de Cursos e Profissões	Thursday
21	06/6	Document Recognition	Lucas Wojcik
	08/6	Corpus Christi	Thursday
22	13/6	Action Recognition	Valter Estevam
23	15/6	Reinforcement Learning	João Picolo, Leonardo & Gabriel N.
	17/6	Make-up class for 08/6	Saturday
24	20/6	Zero Shot Learning	Valter Estevam
25	22/6	Pose Estimation	Michel, André & Raul Almeida
26	27/6	Gaze Estimation	Leticia & Gabriel N.
27	29/6	Object ReID (Person & Vehicle)	Pedro, Michel & Rayson Laroca
28	01/7	Biometrics	Rodrigo & André
	04/7		Tuesday
	06/7	Final	Thursday

Topics

##	TOPIC	PRESENTER	SUBJECT
3	Background 1 (MLP, LeNet & AlexNet)	Eduardo Gobbo	ForwardForward :cite:p:`forwardforward:2022`
		David Menotti	LeNet :cite:p:`lenet:1998`
		Leticia Fontanelli	AlexNet :cite:p:`alexnet:2012`
4	Background 2 (ResNet, LSTM & ViT)	Leonardo Dionizio	ResNet :cite:p:`resnet:2016`
		Gabriel Nascarella	LSTM :cite:p:`lstm:1997`
		Pedro Pasqualini	Vision-Transformers :cite:p:`vit:2021`
5	Object Classification	André Thomal	VGGNet :cite:p:`vggnet:2015`
		Eduardo Gobbo	MobileNet :cite:p:`mobilenet:2017`
		Gabriel Pontarolo	EfficientNet :cite:p:`efficientnet:2020`
6	Object Detection	Leticia Fontanelli	Faster R-CNN :cite:p:`od-faster-rcnn:2017`
		Rayson Laroca	YoloV2 :cite:p:`od-yolo:2017` and newer versions
		Gabriel Pontarolo	EfficientDet :cite:p:`od-efficientdet:2020`
7	Image Segmentation	Leticia Fontanelli	Mask R-CNN :cite:p:`seg-maskrcnn:2018`
		Felipe Mazur	U-Net :cite:p:`seg-unet:2015`
		Vinícius Gonçalves	IFT :cite:p:`seg-ift-diff:2004`
8	Generative Models	Felipe Mazur	pix2pix :cite:p:`gm:pix2pix:2017`
		Leonardo Dionizio	CycleGAN :cite:p:`gm-cyclegan:2017`
		Eduardo Gobbo	Diffusion :cite:p:`gm-diffusion:2022`
9	Vehicle Identification	Sabry Rafrafi	Efficient :cite:p:`alpr-efficient:2021`
		Eduardo Santos	Multi :cite:p:`alpr-multi:2020`
		Rayson Laroca	Bias :cite:p:`alpr-bias:2022`
10	ALPR-SR	Eduardo Santos	Multitask :cite:p:`alprsr-multitask:2019`
		Gabriel Pontarolo	MPRNet :cite:p:`alprsr-mprnet:2021`
		Valfride Nascimento	Attention & Pixel Shuffle :cite:p:`alprsr-combining:2022`
11	Face Detection	Eduardo Santos	Viola Jones :cite:p:`fd-viola:2001`
		Bernardo Biesseck	RetinaFace :cite:p:`fd-retina:2020`
		Bernardo Biesseck	MTCNN :cite:p:`fn-mtcnn:2016`
12	Face Recognition	André Thomal	CenterLoss :cite:p:`fr-centerloss:2016`
		Gabriel Pontarolo	ArcFace :cite:p:`fr-arcface:2019`
		Bernardo Biesseck	Partial FC :cite:p:`fr-partialfc:2022`
13	Face Recognition 3D	Sabry Rafrafi	3D-PointNet++ :cite:p:`fr3d-pointface:2021`
		Pedro Pasqualini	3D-PointCloudNet :cite:p:`fr3d-pointcloudnet:2021`
		Bernardo Biesseck	3D-BERL :cite:p:`fr3d-3dberl:2022`
14	Face Masked Recognition	Felipe Mazur	MaskInvArcface :cite:p:`mfr-synthetic:2022`
		Sabry Rafrafi	MaskInv-Hg :cite:p:`mfr-kd:2021`
		Michel Brasil	SotA :cite:p:`mfr-sota:2022`
15	Face Recognition SR	Eduardo Gobbo	Iterative :cite:p:`frsr-iterative:2020`
		Pedro Pasqualini	SCTANet :cite:p:`frsr-sctanet:2023`
		João Picolo	DIDNet :cite:p:`frsr-didnet:2021`
16	General Spoofing	Rodrigo Saviam	Fingerprint :cite:p:`gspf-fingerprint:2021`
		Rodrigo Saviam	Iris :cite:p:`gspf-iris:2018`
		Bruno Kamarowski	Voice :cite:p:`gspf-voice:2023`
17	Face Anti-Spoofing (Passive)	Sabry Rafrafi	FAS-Casia :cite:p:`fas-casia:2012`
		Felipe Mazur	FAS-ViT :cite:p:`fas-vit:2021`
		Raul Almeida	FAS-dc-cdn :cite:p:`fas-dc-cdn:2021`
18	Face Anti-Spoofing (Active)	Vinícius Gonçalves	Distortion :cite:p:`fasa-distortion:2019`
		Leonardo Dionizio	Reflections :cite:p:`ffasa-reflections:2015`
		Bruno Kamarowski	Sensor :cite:p:`ffasa-sensor2014`
19	Facial Expression	Vinícius Gonçalves	Hybrid :cite:p:`fe-hybrid:2021`
		Eduardo Santos	Synthesis :cite:p:`fe-synthesis:2020`
		Vinícius Gonçalves	Fusion :cite:p:`fe-fusion:2022`
20	Natural Language Processing	Lucas Wojcik	Tesseract :cite:p:`docs-tesseract:2007`
		Lucas Wojcik	Word2Vec :cite:p:`docs-word2vec2` Wordpiece :cite:p:`docs-wordpiece:2016` GloVe :cite:p:`docs-glove:2014`
		Lucas Wojcik	BERT :cite:p:`docs-bert:2018` GPT family (GPT-1 to 4 and ChatGPT) :cite:p:`docs-gpt:2020`
21	Document Recognition	Lucas Wojcik	Intellix :cite:p:`er-intellix:2013`
		Lucas Wojcik	Graph Convolution :cite:p:`er-graphdoc:2022`
		Lucas Wojcik	ERNIE-Layout :cite:p:`er-ernielayout:2022`
22	Action Recognition	Valter Estevam	Trajectories :cite:p:`ar-trajectories:2013`
		Valter Estevam	Kinetics :cite:p:`ar-kinetics:2017`
		Valter Estevam	InternVideo :cite:p:`ar-internvideo:2022`
23	Reinforcement Learning	João Picolo	DDPG :cite:p:`rl-ddpg:2020`
		Leonardo Dionizio	Pedestrian Trajectory :cite:p:`rl-pedestrian:2020`
		Gabriel Nascarella	Autonomous Driving :cite:p:`rl-driving:2020`
24	Zero Shot Learning	Valter Estevam	Tell me :cite:p:`zsl-tellme:2021`
		Valter Estevam	Global Semantic :cite:p:`zsl-global:2022`
		Valter Estevam	Cezsar :cite:p:`zsl-cezsar:2023`
25	Pose Estimation	Michel Brasil	Hourglass :cite:p:`pose-hourglass:2016`
		André Thomal	OpenPose :cite:p:`pose-openpose:2017`
		Raul Almeida	Transformers :cite:p:`pose-transformer:2022`
26	Gaze Estimation	Leticia Fontanelli	MPIIGaze :cite:p:`gaze-mpii:2019`
		Gabriel Nascarella	Gaze-360 :cite:p:`gaze-360:2019`
		Gabriel Nascarella	L2CS-Net :cite:p:`gaze-l2csnet:2022`
27	Object ReID (Person & Vehicle)	Pedro Pasqualini	Person in the Wild :cite:p:`reid-person:2017`
		Michel Brasil	Person Ensemble :cite:p:`reid-personensemble:2020`
		Rayson Laroca	Vehicle Large :cite:p:`reid-vehiclelarge:2016`
		Rayson Laroca	Vehicle Read :cite:p:`reid-vehicleread:2021`
28	Biometrics: Iris, Periocular, Fingerprint	Rodrigo Saviam	Iris :cite:p:`bio-irina:2017`
		Rodrigo Saviam	Fingerprint :cite:p:`bio-fingernet:2019`
		André Thomal	Periocular :cite:p:`bio-ufpr-periocular:2022`

	Explainable AI
	Satellite Imagery
	ImaGen & VideoGen from text
	Adversarial Attacks

References

AHKAH22: Ahmed A. Abdelrahman, Thorsten Hempel, Aly Khalifa, and Ayoub Al-Hamadi. L2cs-net: fine-grained gaze estimation in unconstrained environments. 2022. arXiv:2203.03339.
ADG+22: Xiang An, Jiankang Deng, Jia Guo, Ziyong Feng, XuHan Zhu, Jing Yang, and Tongliang Liu. Killing two birds with one stone: efficient and robust training of face recognition cnns by partial fc. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 4042–4051. June 2022.
BLG+23: Qiqi Bao, Yunmeng Liu, Bowen Gang, Wenming Yang, and Qingmin Liao. Sctanet: a spatial attention-guided cnn-transformer aggregation network for deep face image super-resolution. IEEE Transactions on Multimedia, 1–12, 2023. doi:10.1109/TMM.2023.3238522.
BMR+20: Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, and Dario Amodei. Language models are few-shot learners. CoRR, 2020. URL: https://arxiv.org/abs/2005.14165, arXiv:2005.14165.
CSWS17: Zhe Cao, Tomas Simon, Shih-En Wei, and Yaser Sheikh. Realtime multi-person 2d pose estimation using part affinity fields. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). July 2017.
CZ17: Joao Carreira and Andrew Zisserman. Quo vadis, action recognition? a new model and the kinetics dataset. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). July 2017.
CPM14: Shaxun Chen, Amit Pande, and Prasant Mohapatra. Sensor-assisted facial recognition: an enhanced biometric authentication system for smartphones. In Proceedings of the 12th annual international conference on Mobile systems, applications, and services, 109–122. 2014.
CLWZ21: Fangfang Cheng, Tao Lu, Yu Wang, and Yanduo Zhang. Face super-resolution through dual-identity constraint. In 2021 IEEE International Conference on Multimedia and Expo (ICME), 1–6. 2021. doi:10.1109/ICME51207.2021.9428360.
CJ21: Tarang Chugh and Anil K. Jain. Fingerprint spoof detector generalization. IEEE Transactions on Information Forensics and Security, 16:42–55, 2021. doi:10.1109/TIFS.2020.2990789.
DGV+20: Jiankang Deng, Jia Guo, Evangelos Ververas, Irene Kotsia, and Stefanos Zafeiriou. Retinaface: single-shot multi-level face localisation in the wild. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). June 2020.
DGXZ19: Jiankang Deng, Jia Guo, Niannan Xue, and Stefanos Zafeiriou. Arcface: additive angular margin loss for deep face recognition. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 4690–4699. 2019.
DCLT18: Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: pre-training of deep bidirectional transformers for language understanding. CoRR, 2018. URL: http://arxiv.org/abs/1810.04805, arXiv:1810.04805.
DBK+21: Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: transformers for image recognition at scale. In International Conference on Learning Representations. 2021. URL: https://openreview.net/forum?id=YicbFdNTTy.
ELMP21: Valter Estevam, Rayson Laroca, David Menotti, and Helio Pedrini. Tell me what you see: a zero-shot action recognition method based on natural language descriptions. 2021. arXiv:2112.09976.
ELPM22: Valter Estevam, Rayson Laroca, Helio Pedrini, and David Menotti. Global semantic descriptors for zero-shot action recognition. IEEE Signal Processing Letters, 29:1843–1847, 2022. URL: https://doi.org/10.1109%2Flsp.2022.3200605, doi:10.1109/lsp.2022.3200605.
ELPM23: Valter Estevam, Rayson Laroca, Helio Pedrini, and David Menotti. Cezsar: a contrastive embedding method for zero-shot action recognition. SSRN, 2023. submitted to Pattern Recognition Letters. URL: http://dx.doi.org/10.2139/ssrn.4333781, doi:10.2139/ssrn.4333781.
FB04: A.X. Falcao and F.P.G. Bergo. Interactive volume segmentation with differential image foresting transforms. IEEE Transactions on Medical Imaging, 23(9):1100–1108, 2004. doi:10.1109/TMI.2004.829335.
GM21: Anjith George and Sébastien Marcel. On the effectiveness of vision transformers for zero-shot face anti-spoofing. In 2021 IEEE International Joint Conference on Biometrics (IJCB), 1–8. 2021. doi:10.1109/IJCB52358.2021.9484333.
GPAM+14: Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In International Conference on Neural Information Processing Systems (NeurIPS), 2672–2680. 2014.
GLF+09: Alex Graves, Marcus Liwicki, Santiago Fernández, Roman Bertolami, Horst Bunke, and Jürgen Schmidhuber. A novel connectionist system for unconstrained handwriting recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(5):855–868, 2009. doi:10.1109/TPAMI.2008.137.
HGDG18: Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. Mask r-cnn. 2018. arXiv:1703.06870.
HZRS16: Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770–778. 2016. doi:10.1109/CVPR.2016.90.
HZSC22: Mingjie He, Jie Zhang, Shiguang Shan, and Xilin Chen. Enhancing face recognition with self-supervised 3d reconstruction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 4062–4071. June 2022.
Hin22: Geoffrey Hinton. The forward-forward algorithm: some preliminary investigations. 2022. arXiv:2212.13345.
HS97: Sepp Hochreiter and Jürgen Schmidhuber. Long Short-Term Memory. Neural Computation, 9(8):1735–1780, 11 1997. doi:10.1162/neco.1997.9.8.1735.
HZC+17: Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. Mobilenets: efficient convolutional neural networks for mobile vision applications. 2017. arXiv:1704.04861.
HWT+22: Gee-Sern Jison Hsu, Hung-Yi Wu, Chun-Hung Tsai, Svetlana Yanushkevich, and Marina L Gavrilova. Masked face recognition from synthesis to reality. IEEE Access, 2022.
HBKD21: Marco Huber, Fadi Boutros, Florian Kirchbuchner, and Naser Damer. Mask-invariant face recognition through template-level knowledge distillation. In 2021 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021), 1–8. IEEE, 2021.
JLC+21: Changyuan Jiang, Shisong Lin, Wei Chen, Feng Liu, and Linlin Shen. Pointface: point set based feature learning for 3d face recognition. In 2021 IEEE International Joint Conference on Biometrics (IJCB), 1–8. 2021. doi:10.1109/IJCB52358.2021.9484368.
JWL+23: Peipei Jiang, Qian Wang, Xiu Lin, Man Zhou, Wenbing Ding, Cong Wang, Chao Shen, and Qi Li. Securing liveness detection for voice authentication via pop noises. IEEE Transactions on Dependable and Secure Computing, 20(2):1702–1718, 2023. doi:10.1109/TDSC.2022.3163024.
KRS+19: Petr Kellnhofer, Adria Recasens, Simon Stent, Wojciech Matusik, and Antonio Torralba. Gaze360: physically unconstrained gaze estimation in the wild. In IEEE International Conference on Computer Vision (ICCV). October 2019.
KSH12: Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. Imagenet classification with deep convolutional neural networks. In F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 25, pages 1097–1105. Curran Associates, Inc., 2012. URL: http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf.
LBD+89: Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel. Backpropagation Applied to Handwritten Zip Code Recognition. Neural Computation, 1(4):541–551, 12 1989. URL: https://doi.org/10.1162/neco.1989.1.4.541, doi:10.1162/neco.1989.1.4.541.
LBBH98: Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998. doi:10.1109/5.726791.
LSN+20: Kunming Li, Mao Shan, Karan Narula, Stewart Worrall, and Eduardo Nebot. Socially aware crowd navigation with multimodal pedestrian trajectory prediction for autonomous vehicles. In 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), 1–8. 2020. doi:10.1109/ITSC45102.2020.9294304.
LWL+19: Yan Li, Zilong Wang, Yingjiu Li, Robert H. Deng, Binbin Chen, Weizhi Meng, and Hui Li. A closer look tells more: a facial distortion based liveness detection for face authentication. Proceedings of the 2019 ACM Asia Conference on Computer and Communications Security, 2019.
LHM+21: Chang Liu, Kaoru Hirota, Junjie Ma, Zhiyang Jia, and Yaping Dai. Facial Expression Recognition Using Hybrid Features of Pixel and Geometry. IEEE Access, 9:18876–18889, 2021. doi:10.1109/ACCESS.2021.3054332.
LCL22: Kuan-Hsien Liu, Ching-Hsiang Chiu, and Tsung-Jung Liu. Fusion of Triple Attention to Residual in Residual Dense Block to Attention Based CNN for Facial Expression Recognition. In 2022 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 1045–1050. October 2022. doi:10.1109/SMC53654.2022.9945323.
LLMF16: Xinchen Liu, Wu Liu, Huadong Ma, and Huiyuan Fu. Large-scale vehicle re-identification in urban surveillance videos. In 2016 IEEE International Conference on Multimedia and Expo (ICME), 1–6. 2016. doi:10.1109/ICME.2016.7553002.
MJR+20: Cheng Ma, Zhenyu Jiang, Yongming Rao, Jiwen Lu, and Jie Zhou. Deep face super-resolution with iterative collaboration between attentive recovery and landmark estimation. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 5568–5577. 2020. doi:10.1109/CVPR42600.2020.00561.
MAS21: Armin Mehri, Parichehr B Ardakani, and Angel D Sappa. Mprnet: multi-path residual network for lightweight image super resolution. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2704–2713. 2021.
MCCD13: Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. CoRR, 2013. URL: http://dblp.uni-trier.de/db/journals/corr/corr1301.html#abs-1301-3781.
MSC+13: Tomás Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. Distributed representations of words and phrases and their compositionality. CoRR, 2013. URL: http://arxiv.org/abs/1310.4546, arXiv:1310.4546.
MAA19: Shervin Minaee, Elham Azimi, and Amirali Abdolrashidi. Fingernet: pushing the limits of fingerprint recognition using convolutional neural network. 2019. arXiv:1907.12956.
NYD16: Alejandro Newell, Kaiyu Yang, and Jia Deng. Stacked hourglass networks for human pose estimation. In Bastian Leibe, Jiri Matas, Nicu Sebe, and Max Welling, editors, Computer Vision – ECCV 2016, 483–499. Cham, 2016. Springer International Publishing.
PPW+22: Qiming Peng, Yinxu Pan, Wenjin Wang, Bin Luo, Zhenyu Zhang, Zhengjie Huang, Teng Hu, Weichong Yin, Yongfeng Chen, Yin Zhang, Shikun Feng, Yu Sun, Hao Tian, Hua Wu, and Haifeng Wang. Ernie-layout: layout knowledge enhanced pre-training for visually-rich document understanding. 2022. arXiv:2210.06155.
PSM14: Jeffrey Pennington, Richard Socher, and Christopher Manning. GloVe: global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1532–1543. Doha, Qatar, October 2014. Association for Computational Linguistics. URL: https://aclanthology.org/D14-1162, doi:10.3115/v1/D14-1162.
PN17: Hugo Proença and João C. Neves. Irina: iris recognition (even) in inaccurately segmented data. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 6747–6756. 2017. doi:10.1109/CVPR.2017.714.
RDGF16: J. Redmon, S. Divvala, R. Girshick, and A. Farhadi. You only look once: unified, real-time object detection. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 779–788. June 2016. doi:10.1109/CVPR.2016.91.
RF17: J. Redmon and A. Farhadi. YOLO9000: better, faster, stronger. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 6517–6525. July 2017. doi:10.1109/CVPR.2017.690.
RBL+22: Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 10684–10695. June 2022.
RFB15: Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: convolutional networks for biomedical image segmentation. 2015. arXiv:1505.04597.
SME+13: Daniel Schuster, Klemens Muthmann, Daniel Esser, Alexander Schill, Michael Berger, Christoph Weidling, Kamil Aliyev, and Andreas Hofmeier. Intellix - end-user trained information extraction for document archiving. In Proceedings of the International Conference on Document Analysis and Recognition, ICDAR. 08 2013. doi:10.1109/ICDAR.2013.28.
SWL+22: Dahu Shi, Xing Wei, Liangqi Li, Ye Ren, and Wenming Tan. End-to-end multi-person pose estimation with transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 11069–11078. June 2022.
SZ15: Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. In 3rd International Conference on Learning Representations (ICLR 2015), 1–14. 2015.
SWL15: Daniel F. Smith, Arnold Wiliem, and Brian C. Lovell. Face recognition on consumer devices: reflections on replay attacks. IEEE Transactions on Information Forensics and Security, 10(4):736–745, 2015. doi:10.1109/TIFS.2015.2398819.
Smi07: R. Smith. An overview of the tesseract ocr engine. In Ninth International Conference on Document Analysis and Recognition (ICDAR 2007), volume 2, 629–633. 2007. doi:10.1109/ICDAR.2007.4376991.
SZS+20: Ming Sun, Weiqiang Zhao, Guanghao Song, Zhigen Nie, Xiaojian Han, and Yang Liu. Ddpg-based decision-making strategy of adaptive cruising for heavy vehicles considering stability. IEEE Access, 8:59225–59246, 2020. doi:10.1109/ACCESS.2020.2982702.
TL20: Mingxing Tan and Quoc V. Le. Efficientnet: rethinking model scaling for convolutional neural networks. 2020. arXiv:1905.11946.
VGFuhr+22: Pedro Vidal, Roger Letizke Granada, Gustavo Führ, Vanessa Testoni, and David Menotti. A benchmark on masked face recognition. In Conference on Graphics, Patterns and Images, 1–6. Natal (RN), Brazil, October 2022.
VJ01: P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001, volume 1, I–I. 2001. doi:10.1109/CVPR.2001.990517.
WS13: Heng Wang and Cordelia Schmid. Action recognition with improved trajectories. In Proceedings of the IEEE International Conference on Computer Vision (ICCV). December 2013.
WLL+22: Yi Wang, Kunchang Li, Yizhuo Li, Yinan He, Bingkun Huang, Zhiyu Zhao, Hongjie Zhang, Jilan Xu, Yi Liu, Zun Wang, Sen Xing, Guo Chen, Junting Pan, Jiashuo Yu, Yali Wang, Limin Wang, and Yu Qiao. Internvideo: general video foundation models via generative and discriminative learning. 2022. arXiv:2212.03191.
WZLQ16: Yandong Wen, Kaipeng Zhang, Zhifeng Li, and Yu Qiao. A discriminative feature learning approach for deep face recognition. In Bastian Leibe, Jiri Matas, Nicu Sebe, and Max Welling, editors, Computer Vision – ECCV 2016, 499–515. Cham, 2016. Springer International Publishing.
WSC+16: Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, and others. Google's neural machine translation system: bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144, 2016.
YHC+20: Yan Yan, Ying Huang, Si Chen, Chunhua Shen, and Hanzi Wang. Joint Deep Learning of Facial Expression Synthesis and Recognition. IEEE Transactions on Multimedia, 22(11):2792–2807, November 2020. doi:10.1109/TMM.2019.2962317.
YLLS20: Mang Ye, Xiangyuan Lan, Qingming Leng, and Jianbing Shen. Cross-modality person re-identification via modality-aware collaborative ensemble learning. IEEE Transactions on Image Processing, 29:9387–9399, 2020. doi:10.1109/TIP.2020.2998275.
YQZ+21: Zitong Yu, Yunxiao Qin, Hengshuang Zhao, Xiaobai Li, and Guoying Zhao. Dual-cross central difference network for face anti-spoofing. In Zhi-Hua Zhou, editor, Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, 1281–1287. International Joint Conferences on Artificial Intelligence Organization, 8 2021. Main Track. URL: https://doi.org/10.24963/ijcai.2021/177, doi:10.24963/ijcai.2021/177.
ZZLQ16: Kaipeng Zhang, Zhanpeng Zhang, Zhifeng Li, and Yu Qiao. Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Processing Letters, 23(10):1499–1503, 2016. doi:10.1109/LSP.2016.2603342.
ZSFB19: Xucong Zhang, Yusuke Sugano, Mario Fritz, and Andreas Bulling. Mpiigaze: real-world dataset and deep appearance-based gaze estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(1):162–175, 2019. doi:10.1109/TPAMI.2017.2778103.
ZMD+22: Zhenrong Zhang, Jiefeng Ma, Jun Du, Licheng Wang, and Jianshu Zhang. Multimodal pre-training based on graph attention network for document understanding. 2022. arXiv:2203.13530.
ZYL+12: Zhiwei Zhang, Junjie Yan, Sifei Liu, Zhen Lei, Dong Yi, and Stan Z. Li. A face antispoofing database with diverse attacks. In 2012 5th IAPR International Conference on Biometrics (ICB), 26–31. 2012. doi:10.1109/ICB.2012.6199754.
ZDY22: Ziyu Zhang, Feipeng Da, and Yi Yu. Learning directly from synthetic point clouds for “in-the-wild” 3d face recognition. Pattern Recognition, 123:108394, 2022. doi:10.1016/j.patcog.2021.108394.
ZZS+17: Liang Zheng, Hengheng Zhang, Shaoyan Sun, Manmohan Chandraker, Yi Yang, and Qi Tian. Person re-identification in the wild. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 3346–3355. 2017. doi:10.1109/CVPR.2017.357.
ZWP+20: Meixin Zhu, Yinhai Wang, Ziyuan Pu, Jingyun Hu, Xuesong Wang, and Ruimin Ke. Safe, efficient, and comfortable velocity control based on reinforcement learning for autonomous driving. Transportation Research Part C: Emerging Technologies, 117:102662, 2020. URL: https://www.sciencedirect.com/science/article/pii/S0968090X20305775, doi:10.1016/j.trc.2020.102662.
ZZL+18: Hang Zou, Hui Zhang, Xingguang Li, Jing Liu, and Zhaofeng He. Generation textured contact lenses iris images based on 4dcycle-gan. In 2018 24th International Conference on Pattern Recognition (ICPR), 3561–3566. 2018. doi:10.1109/ICPR.2018.8546154.
GonccalvesDinizLaroca+19: G. R. Gonçalves, M. A. Diniz, R. Laroca, D. Menotti, and W. R. Schwartz. Multi-task learning for low-resolution license plate recognition. In Iberoamerican Congress on Pattern Recognition (CIARP), 251–261. Oct 2019. doi:10.1007/978-3-030-33904-3_23.
IsolaZZE17: Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. Image-to-image translation with conditional adversarial networks. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 5967–5976. 2017. doi:10.1109/CVPR.2017.632.
LarocaSantosEstevam+22: R. Laroca, M. Santos, V. Estevam, E. Luz, and D. Menotti. A first look at dataset bias in license plate recognition. In Conference on Graphics, Patterns and Images (SIBGRAPI), 234–239. Oct 2022. doi:10.1109/SIBGRAPI55357.2022.9991768.
LarocaZanlorensiGonccalves+21: R. Laroca, L. A. Zanlorensi, G. R. Gonçalves, E. Todt, W. R. Schwartz, and D. Menotti. An efficient and layout-independent automatic license plate recognition system based on the YOLO detector. IET Intelligent Transport Systems, 15(4):483–503, 2021. doi:10.1049/itr2.12030.
NascimentoLarocaLambert+22: V. Nascimento, R. Laroca, J. A. Lambert, W. R. Schwartz, and D. Menotti. Combining attention module and pixel shuffle for license plate super-resolution. In Conference on Graphics, Patterns and Images (SIBGRAPI), 228–233. Oct 2022. doi:10.1109/SIBGRAPI55357.2022.9991753.
OliveiraLarocaMenotti+21: I. O. Oliveira, R. Laroca, D. Menotti, K. V. O. Fonseca, and R. Minetto. Vehicle-Rear: a new dataset to explore feature fusion for vehicle identification using convolutional neural networks. IEEE Access, 9:101065–101077, 2021. doi:10.1109/ACCESS.2021.3097964.
RenHeGirshickSun17: S. Ren, K. He, R. Girshick, and J. Sun. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(6):1137–1149, 2017. doi:10.1109/TPAMI.2016.2577031.
TanPL20: Mingxing Tan, Ruoming Pang, and Quoc V. Le. EfficientDet: scalable and efficient object detection. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 10778–10787. 2020. doi:10.1109/CVPR42600.2020.01079.
WangPZF20: Huibing Wang, Jinjia Peng, Yanzhu Zhao, and Xianping Fu. Multi-path deep CNNs for fine-grained car recognition. IEEE Transactions on Vehicular Technology, 69(10):10484–10493, 2020. doi:10.1109/TVT.2020.3009162.
ZanlorensiLarocaLucio+22: L. A. Zanlorensi, R. Laroca, D. R. Lucio, L. R. Santos, A. S. Britto Jr., and D. Menotti. A new periocular dataset collected by mobile devices in unconstrained scenarios. Scientific Reports, 12:17989, 2022. doi:10.1038/s41598-022-22811-y.
ZhuPIE17: Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In IEEE International Conference on Computer Vision (ICCV), 2242–2251. 2017. doi:10.1109/ICCV.2017.244.

References by topic

Background 1: ForwardForward [Hin22], Backprop [LBD+89], LeNet [LBBH98], AlexNet [KSH12].
Background 2: ResNet [HZRS16], LSTM [HS97], LSTM [GLF+09], Vision Transformers (ViT) [DBK+21].
Object Classification: VGGNet [SZ15], MobileNet [HZC+17], EfficientNet [TL20].
Object Detection: Yolo [RDGF16] & YoloV2 [RF17], Faster R-CNN [RenHeGirshickSun17], EfficientDet [TanPL20]
Image Segmentation: Mask R-CNN [HGDG18], U-Net [RFB15], IFT [FB04].
Generative Models: GAN [GPAM+14] & pix2pix [IsolaZZE17], CycleGAN [ZhuPIE17], Stable Diffusion [RBL+22]
Vehicle Identification: Efficient [LarocaZanlorensiGonccalves+21], Multi [WangPZF20], Bias [LarocaSantosEstevam+22]
ALPR SuperResolution: Realtime [GonccalvesDinizLaroca+19], MPRNet [MAS21], Attention Module & Pixel Shuffle [NascimentoLarocaLambert+22].
Face Detection: Viola-Jones [VJ01], RetinaFace [DGV+20], MT-CNN [ZZLQ16].
Face Recognition: Center loss [WZLQ16], Arcface [DGXZ19], Partial FC [ADG+22].
Face Rec. 3D: 3D-PointFace [JLC+21], 3D-PointCloudNet [ZDY22], 3D-BERL [HZSC22]
Face Masked Rec.: MaskInvArcface [HWT+22], MaskInv-Hg [HBKD21], Sota [VGFuhr+22].
Face Recognition SR: Iterative [MJR+20], SCTANet [BLG+23], DIDNet [CLWZ21].
General Anti-Spoofing: Fingerprint [CJ21], Iris [ZZL+18], Voice [JWL+23].
Face Anti-Spoofing (passive): FAS-CASIA [ZYL+12], FAS-ViT [GM21], FAS-DC-CDN [YQZ+21].
Face Anti-Spoofing (active): Distortion [LWL+19], Reflections [SWL15], Sensor [CPM14].
Facial Expression: Hybrid [LHM+21], Synthesis [YHC+20], Fusion [LCL22].
Natural Language Processing: Tesseract [Smi07], Word2Vec [MCCD13] & [MSC+13], Wordpiece [WSC+16], GloVe [PSM14], BERT [DCLT18], GPT [BMR+20].
Document Recognition: Intellix [SME+13], GraphDoc [ZMD+22], ERNIE-Layout [PPW+22].
Action Recognition: Trajectories [WS13], Kinetics [CZ17], InternVideo [WLL+22].
Reinforcement Learning: DDPG [SZS+20], Pedestrian Trajectory [LSN+20], Autonomous Driving [ZWP+20].
Zero Shot Learning: Tell me [ELMP21], Global Semantic [ELPM22], Cezsar [ELPM23].
Gaze Estimation: MPIIGaze [ZSFB19], Gaze360 [KRS+19], L2CS-Net [AHKAH22].
Pose Estimation: Hourglass [NYD16], OpenPose [CSWS17], Multi-Person with Transformers [SWL+22]
Object ReID (Person & Vehicle): Person in the Wild [ZZS+17], Person Ensemble [YLLS20], Vehicle Large [LLMF16], Vehicle Read [OliveiraLarocaMenotti+21].
Biometrics: Iris [PN17], Fingerprint [MAA19], Periocular [ZanlorensiLarocaLucio+22]

Attendance

name	pres1	pres2	pres3	pres4	nfreq	average	system	status	attendance	absences
BERNARDO JANKO GONÇALVES BIESSECK	20	20	20	20	20	100	100	passed	90%	20/4 25/5 13/6
EDUARDO DOS SANTOS	20	20	20	20	20	100	100	passed	93%	04/4 13/6
ANDRÉ GAUER THOMAL	20	20	20	20	20	100	100	passed	93%	20/4 27/6
EDUARDO GOBBO WILLI VASCONCELLOS GONÇALVES	20	20	20	20	20	100	100	passed	93%	23/5 25/5
FELIPE MAZUR ROMANIUK DE FREITAS	20	20	20	20	0	80	80	passed	73%	30/5 06/6 13/6 15/6 20/6 22/6 27/6 29/6
GABRIEL DE OLIVEIRA PONTAROLO	20	20	20	20	20	100	100	passed	93%	25/5 29/6
GABRIEL NASCARELLA HISHIDA DO NASCIMENTO	20	20	20	20	20	100	100	passed	97%	27/6
LEONARDO LIMA DIONIZIO	20	20	20	20	20	100	100	passed	93%	25/5 22/6
LUCAS MATHEUS LEITE WOJCIK	20	20	20	20	20	100	100	passed	90%	04/5 25/5 20/6
MARCUS AUGUSTO FERREIRA DUDEQUE	0	0	0	0	20	20	20	failed	13%	23/3 28/3 30/3 04/4 11/4 13/4 18/4 20/4 25/4 27/4 02/5 04/5 09/5 11/5 16/5 18/5 23/5 25/5 30/5 06/6 13/6 15/6 20/6 22/6 27/6 29/6
MICHEL BRASIL CORDEIRO	20	20	20	0	0	60	60	exam	70%	28/3 30/3 18/4 04/5 11/5 30/5 13/6 20/6 27/6
PEDRO PASQUALINI DE ANDRADE	20	20	20	20	20	100	100	passed	90%	13/4 27/4 09/5
RODRIGO SAVIAM SOFFNER	20	20	20	20	20	100	100	passed	90%	09/5 06/6 27/6
SABRY INACIO RAFRAFI	20	20	20	20	20	100	100	passed	90%	06/6 15/6 27/6
LETICIA FONTANELLI STRAUBE DE SOUZA	20	20	20	20	20	100	100	passed	97%	13/6
MICHEL DOUGLAS MARTINS DOS SANTOS	0	0	0	0	0	0	0	failed	20%	30/3 04/4 11/4 13/4 18/4 20/4 25/4 27/4 02/5 04/5 09/5 11/5 16/5 18/5 23/5 25/5 30/5 06/6 13/6 15/6 20/6 22/6 27/6 29/6
VINÍCIUS DE LIMA GONÇALVES	20	20	20	20	20	100	100	passed	90%	28/3 02/5 27/6