cv
Curriculum vitae (PDF download available above).
Basics
| Name | Simone Rossetti |
| Label | Applied Researcher • PhD in Computer Science Engineering • Startup Founder |
| Email | simone[dot]rossetti[at]live[dot]com |
| Phone | (+39)[space]349[space]105[space]9384 |
| Url | https://rossettisimone.github.io/ |
| Summary | Specialist in multimodal feature learning, vision-language alignment, and structured visual perception. Over four years of experience leading AI research teams and developing scalable, research-grade and production-ready models. Published at top-tier venues and involved in EU-funded multidisciplinary research. Strong focus on bridging theory and application through weakly- and self-supervised learning, large-scale training, and multimodal system design, with growing interest in Vision-Language-Action models and agent-oriented AI. |
Work
-
2021.09 - Present Co-Founder & Applied Researcher
DeepPlants S.r.l.
AI research startup in agri-tech and intelligent automation.
- Led development of multimodal, agent-oriented AI systems for micro-farming management and decision support
- Built scalable training pipelines (multi-GPU) and optimized data workflows
-
2021.01 - 2021.10 AI Research Fellow
Sapienza Università di Roma (DIAG)
Research grant at DIAG, focused on computer vision and AI.
- Research in AI and computer vision, contributing to projects and publications
-
2019.06 - 2020.04 ICT Application Developer
VIK School S.r.l.
Development of accessible digital learning platforms.
- Built accessibility-compliant adaptive learning tools and interactive platforms
Education
-
2021.11 - 2025.01 Rome, Italy
PhD
Sapienza Università di Roma
Computer Science Engineering
- Advisors: Pirri F.; Amerini I.
- Thesis: Reducing supervision in semantic segmentation through advancements in Bayesian prior modelling (UNITesi 2025)
-
2019.10 - 2021.10 Rome, Italy
MSc
Sapienza Università di Roma
Artificial Intelligence and Robotics
- Master's thesis on fast instance segmentation and tracking for YouTube-VIS 2021
-
2015.10 - 2019.03 Rome, Italy
Certificates
| DeepLearn '22 | Advanced Training | 2022-01-01 |
| ICVSS '22 | Advanced Training | 2022-01-01 |
Publications
-
2025.01.01 Reducing supervision in semantic segmentation through advancements in Bayesian prior modelling
UNITesi
Rossetti, S. (2025, UNITesi).
-
2025.01.01 CABBAGE: Comprehensive Agricultural Benchmark Backed by AI-Guided Evaluation
2025-2026, Ongoing
Rossetti, S., Gatti, P., Palleschi, D.
-
2024.01.01 Unsupervised Hierarchy-Agnostic Segmentation: Parsing Semantic Image Structure
NeurIPS
Rossetti, S., Pirri, F. (2024, NeurIPS).
-
2023.01.01 A new large dataset and a transfer learning methodology for plant phenotyping in Vertical Farms
ICCV
Samà, N., David, E., Rossetti, S. et al. (2023, ICCV).
-
2023.01.01 Removing supervision in semantic segmentation with local-global matching and area balancing
arXiv
Rossetti, S., Samà, N., Pirri, F. (2023, arXiv).
-
2022.01.01 Max pooling with vision transformers reconciles class and shape in weakly supervised semantic segmentation
ECCV
Rossetti, S. et al. (2022, ECCV).
-
2021.01.01 Video Instance segmentation Challenge 2021 with YoloV4+1Tr
YouTubeVOS
Rossetti, S., Zharkynbek, T., Pirri, F. (2021, YouTubeVOS).
Skills
| Expertise Areas | |
| Multimodal feature learning, vision-language alignment and grounding | |
| Weakly- and self-supervised learning, structured visual perception | |
| Semantic and instance segmentation, foundation model benchmarking |
| Vision & Multimodal Models | |
| Vision Transformers, vision-language models and contrastive pretraining | |
| Segmentation foundation models, multimodal encoders and decoders | |
| Masked autoencoding, contrastive and clustering-based learning | |
| Efficient fine-tuning and distillation |
| Language & Agentic Models | |
| Large language models and encoder-decoder architectures | |
| Multimodal prompting and instruction tuning, vision-language reasoning | |
| Tool-augmented and agent-oriented model design, retrieval-augmented pipelines |
| Training, Scaling & Optimization | |
| Large-scale multimodal training, distributed training, multi-GPU optimization | |
| Scalable inference, experiment tracking, evaluation protocols | |
| Reproducibility-oriented research workflows |
| Engineering & Research Tooling | |
| Python, PyTorch and Lightning, Hugging Face ecosystem | |
| Docker, Linux, Git, SQL, multi-GPU environments | |
| Dataset curation and pipeline engineering |
Languages
| Italian | |
| Native (C2) |
| English | |
| Fluent (C1) |