Dogtooth AI Collision Intelligence System
An industry-linked MSc dissertation with Dogtooth Technologies, focused on building the AI system layer for sensor-based collision detection and severity classification in autonomous soft fruit harvesting robots.
- Context: MSc dissertation · industry project
- Partner: Dogtooth Technologies (Cambridge)
- Domain: AI systems · sensor machine learning
- Status: In progress · 2026
Overview
Dogtooth Technologies builds autonomous soft fruit harvesting robots that operate in real farm environments. My dissertation is not about designing the robot hardware. The focus is the intelligence layer: an AI pipeline that can interpret onboard sensor signals and identify when a physical interaction with the environment is harmless, risky, or severe enough to require action.
The project addresses a practical AI problem inside an autonomous system. The robot already operates in a complex physical setting with crop rows, rails, wires, posts and support structures. My work is to build a sensor-based model pipeline that can detect collision-like events, classify their severity and provide a decision signal that supports safer and more productive operation.
Problem context
Strawberry farms are difficult environments for autonomous systems. The layout can vary between rows, physical infrastructure can move or sag, and some objects are difficult to interpret from vision alone. A contact event also does not always mean the same thing. Some interactions are minor and operationally acceptable. Others can indicate a risk of damage, downtime or unsafe behaviour.
The AI challenge is to move beyond a simple contact versus no-contact view. The system needs to recognise patterns in sensor data, distinguish normal variation from meaningful events, and support a response that is proportionate to the severity of the interaction.
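As one illustration of separating normal variation from candidate events, a rolling z-score detector over a single sensor channel flags samples that deviate strongly from a recent baseline. This is a hypothetical sketch, not the project's chosen model; the window length and threshold are assumed values.

```python
from collections import deque
from math import sqrt

def detect_events(signal, window=50, threshold=4.0):
    """Flag sample indices that deviate strongly from a rolling
    baseline. Illustrative only: window and threshold are assumed
    values, not tuned project settings."""
    history = deque(maxlen=window)
    events = []
    for i, x in enumerate(signal):
        if len(history) == window:
            mean = sum(history) / window
            var = sum((v - mean) ** 2 for v in history) / window
            std = sqrt(var)
            # Current sample is compared against a baseline that
            # excludes itself, so a spike cannot mask its own z-score.
            if std > 0 and abs(x - mean) / std > threshold:
                events.append(i)
        history.append(x)
    return events

# Small periodic "normal operation" signal with one sharp spike.
sig = [0.1 * ((i % 4) - 1.5) for i in range(200)]
sig[120] = 5.0
print(detect_events(sig))  # [120]
```

A real pipeline would likely operate on multi-channel windows rather than single samples, but the same idea applies: model the recent normal regime, then score deviations from it.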
Why robot availability matters
In commercial deployment, availability matters because a system that stops too often becomes less useful, while a system that ignores risky events can accumulate damage and downtime. The objective is not to make the robot more aggressive. The objective is to make the AI layer more informed, so the response is based on evidence from the sensor signal rather than a fixed rule.
This makes the project a reliability and decision support problem as much as a detection problem. The model has to help separate events that should be logged and monitored from events that should trigger a stronger operational response.
System concept
The proposed system sits on top of the existing autonomy platform as an AI interpretation layer. It takes onboard sensor data as input, applies preprocessing, extracts time- and frequency-based features, detects candidate events and classifies severity. The output is a structured signal that can be used by a decision layer, logging workflow or monitoring dashboard.
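The structured output signal could be as simple as a typed event record. The sketch below is a minimal illustration; all field names and severity labels are assumptions for this page, not Dogtooth's schema.

```python
from dataclasses import dataclass, field

@dataclass
class CollisionEvent:
    """Illustrative decision-signal record. Field names and the
    severity vocabulary are assumptions for this sketch."""
    timestamp: float          # onboard clock, seconds
    severity: str             # "harmless" | "risky" | "severe"
    confidence: float         # classifier probability for the label
    features: dict = field(default_factory=dict)  # contributing features

    def requires_action(self) -> bool:
        # Only severe events trigger an operational response;
        # everything else is logged and monitored.
        return self.severity == "severe"

evt = CollisionEvent(timestamp=1042.6, severity="risky", confidence=0.71)
print(evt.requires_action())  # False: logged, but no stop
```

Keeping the record explicit like this makes the downstream choices (log, monitor, respond) a property of the event, not of scattered threshold code.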
Pipeline
1. Sensor input: onboard signal capture
2. Preprocessing: filtering · framing
3. Feature extraction: time / frequency features
4. Event detection: anomaly model
5. Severity classifier: harmful vs harmless
6. Decision signal: log · monitor · respond
7. Evaluation: recall · F1 · false alarms
Each stage is designed to be testable in isolation. The detection stage can be evaluated on event recall and false alarm rate. The classifier can be evaluated on harmful versus harmless classification performance. The complete pipeline can then be assessed by how well it supports operational decisions.
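As an example of the feature extraction stage, one sensor window can be summarised by a handful of engineered time- and frequency-domain features. The specific feature set below (RMS, peak, zero-crossing rate, dominant frequency) is a generic sketch, not the dissertation's final design.

```python
from math import sqrt, cos, sin, pi

def window_features(window, sample_rate=100.0):
    """Engineered features for one sensor window. The feature
    choices and the 100 Hz sample rate are illustrative
    assumptions, not fixed project settings."""
    n = len(window)
    mean = sum(window) / n
    centred = [x - mean for x in window]
    rms = sqrt(sum(x * x for x in centred) / n)
    peak = max(abs(x) for x in centred)
    zcr = sum(1 for a, b in zip(centred, centred[1:]) if a * b < 0) / (n - 1)
    # Naive DFT magnitude scan for the dominant frequency bin
    # (fine for short windows; an FFT would be used in practice).
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2):
        re = sum(x * cos(2 * pi * k * i / n) for i, x in enumerate(centred))
        im = sum(-x * sin(2 * pi * k * i / n) for i, x in enumerate(centred))
        mag = sqrt(re * re + im * im)
        if mag > best_mag:
            best_k, best_mag = k, mag
    return {
        "rms": rms,
        "peak": peak,
        "zero_cross_rate": zcr,
        "dominant_hz": best_k * sample_rate / n,
    }

# A 10 Hz sine sampled at 100 Hz over one second.
wave = [sin(2 * pi * 10 * i / 100) for i in range(100)]
feats = window_features(wave)
print(round(feats["dominant_hz"], 1))  # 10.0
```

Because the feature vector is small and named, this stage can be unit-tested on synthetic signals like the one above before any field data is involved.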
Decision logic
The decision layer should treat severity classification as a risk-weighted AI problem. Different errors have different consequences:
- False positive: a harmless interaction is treated as harmful, which can reduce availability.
- False negative: a harmful interaction is missed, which can lead to damage or downtime.
This means the threshold is not only a technical setting. It should be informed by model performance, operational logs and the cost of different outcomes.
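One concrete way to make the threshold cost-aware is to pick the operating point that minimises expected misclassification cost on validation predictions. The sketch below assumes a 1:10 false-positive to false-negative cost ratio purely as a placeholder; in the project this ratio would come from operational data, not be hard-coded.

```python
def best_threshold(scores, labels, cost_fp=1.0, cost_fn=10.0):
    """Scan candidate thresholds and return the one with the lowest
    total misclassification cost on validation data. The 1:10
    FP:FN cost ratio is an assumed placeholder."""
    best_t, best_cost = 0.5, float("inf")
    for t in sorted(set(scores)):
        cost = 0.0
        for s, y in zip(scores, labels):
            pred = s >= t
            if pred and y == 0:
                cost += cost_fp   # harmless event flagged as harmful
            elif not pred and y == 1:
                cost += cost_fn   # harmful event missed
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t, best_cost

# Toy validation set: harmful events (label 1) mostly score higher.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0,   0,   0,   1,   0,   1,   1,   1]
print(best_threshold(scores, labels))  # (0.4, 1.0)
```

Note that with a high false-negative cost, the chosen threshold drops to 0.4 and accepts one false alarm rather than miss the harmful event at score 0.4; with symmetric costs it would sit higher.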
Technical challenges
- Class imbalance, because severe events are expected to be less common than normal operation.
- Label quality, because severity labels need to be consistent and defensible.
- Distribution shift, because farms, crop stages, surfaces and sensor conditions can change the signal.
- Latency, because the model output must be available quickly enough to support action.
- Debuggability, because field engineers need to understand why the system raised an event.
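For the class-imbalance point above, inverse-frequency class weights are one standard starting point: rare severe events contribute proportionally more to the training loss. This is a common baseline, not a claim about the final training recipe.

```python
from collections import Counter

def class_weights(labels):
    """Inverse-frequency weights: weight_c = N / (K * n_c), so rare
    classes get proportionally larger weight in the training loss.
    A standard baseline, not the project's final recipe."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}

# Toy imbalance: 95 harmless windows vs 5 severe windows.
w = class_weights(["harmless"] * 95 + ["severe"] * 5)
print(w["severe"])  # 10.0, vs roughly 0.53 for "harmless"
```

Weighting is only one lever; resampling and event-aware window selection address the same imbalance at the data level.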
Research plan
Build the dataset around sensor windows captured near candidate events. Combine operational logs with controlled tests where labels can be checked more carefully. Treat data quality, event definitions and labelling rules as core parts of the research.
Compare lightweight time series classifiers using engineered features with more expressive sequence modelling approaches. Start with models that are easier to debug and justify before introducing extra complexity.
Evaluate event detection recall, false alarm rate and harmful versus harmless classification performance. Use holdout splits that reflect real deployment, such as held out dates or operating conditions, rather than random rows only.
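The deployment-aware split described above can be implemented by holding out whole groups, such as recording dates, so that no windows from a test day leak into training. The record structure and date keys below are made up for illustration.

```python
def group_holdout(records, test_groups):
    """Split records into train/test by a group key (here 'date'),
    so all windows from a held-out day land in the test set.
    Field names are assumptions for this sketch."""
    train = [r for r in records if r["date"] not in test_groups]
    test = [r for r in records if r["date"] in test_groups]
    return train, test

# Hypothetical windows tagged with their recording date.
records = [
    {"date": "2026-03-01", "window_id": 0},
    {"date": "2026-03-01", "window_id": 1},
    {"date": "2026-03-08", "window_id": 2},
    {"date": "2026-03-15", "window_id": 3},
]
train, test = group_holdout(records, test_groups={"2026-03-15"})
print(len(train), len(test))  # 3 1
```

The same function works for any grouping key (farm, crop stage, sensor batch), which is what makes the evaluation reflect distribution shift rather than row-level memorisation.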
Keep the pipeline compact, inspectable and reproducible. Document failure modes and design the output so engineers can understand which signal patterns contributed to a decision.
Expected impact
The intended outcome is an AI system that helps improve availability by reducing unnecessary interruptions while still identifying events that deserve attention. The value is in the full engineering pipeline: reliable data preparation, model development, severity classification, evaluation and integration into an operational decision workflow.
Future work
- Adapt the model as new farms, sensors and operating conditions are added.
- Combine sensor features with vision signals for richer event understanding.
- Connect severity predictions to planning or monitoring systems.
- Build availability dashboards from event telemetry and model outputs.