Simone Rossetti

Applied Researcher & Co-Founder @ DeepPlants · AI @ DIAG & ALCOR Lab

prof_pic.jpg

⬆️ Rainbow Mountain, Peru

📍 Rome, Italy

👋 Hi, I’m Simone, an AI Research Engineer working at the intersection of Computer Vision and Natural Language Processing. My work focuses on representation learning, vision-language alignment, and weakly, semi-, and self-supervised learning for semantic and instance-level visual understanding.

📚 My research interests lie in Multimodal Learning, Vision-Language Models (VLMs), and Vision-Language-Action Models (VLAMs). I am particularly interested in grounding language in dense visual predictions (segmentation, tracking, affordances) and leveraging foundation models for zero- and few-shot transfer in structured vision tasks. A recurring theme is the use of uncertainty modeling and probabilistic priors to improve robustness, calibration, and data efficiency under limited or noisy supervision.

🚀 I am a co-founder of DeepPlants, where I led research and engineering teams building production-grade, agentic AI systems for micro-farming management, plant phenotyping, and agri-tech automation. My experience spans the full research-to-production pipeline, from dataset design and large-scale multi-GPU training to model optimization and real-world deployment.

🔙 Previously, I was an AI Research Fellow at ALCOR Lab (Sapienza University of Rome), contributing to peer-reviewed research in computer vision, with a focus on instance segmentation, tracking, and activity recognition.

🎓 I earned a PhD in Computer Science Engineering from DIAG, Sapienza University of Rome. My doctoral research focused on reducing supervision in semantic segmentation through Bayesian prior modeling and structured regularization. I hold an MSc in AI & Robotics and a BSc in Computer Engineering, with a background in automation and perception-action systems.

📄 My work has been presented at NeurIPS, ECCV, and ICCV. Selected publications and highlights are available on the Publications page.

📮 For collaborations, reach out at simone[dot]rossetti[at]live[dot]com.

news

Mar 01, 2025 CABBO is applying to the COSMIC and SmarTerra open calls
Jan 29, 2025 Lessons learned while designing a multimodal benchmark for agricultural decision support
Jan 20, 2025 CABBO – multimodal AI agent for EU micro-farming
Sep 15, 2024 Since September 2024 I have been leading the multimodal learning team at DeepPlants, focusing on combining vision, language and agronomic signals to build robust, data-efficient models for agricultural applications.

selected publications

  1. Max Pooling with Vision Transformers Reconciles Class and Shape in Weakly Supervised Semantic Segmentation
    Simone Rossetti†*, Damiano Zappia*, Marta Sanzari*, and 2 more authors
    In Computer Vision – ECCV, 2022
  2. A new Large Dataset and a Transfer Learning Methodology for Plant Phenotyping in Vertical Farms
    Nico Samà*, Etienne David°, Simone Rossetti†*, and 3 more authors
    In IEEE/CVF International Conference on Computer Vision Workshops, Oct 2023
  3. Hierarchy-Agnostic Unsupervised Segmentation: Parsing Semantic Image Structure
    Simone Rossetti†* and Fiora Pirri†*
    In Advances in Neural Information Processing Systems, 2024