Force Feedback Lets a Surgical Robot Track the Surgeon's View Autonomously
The robot handed the camera back to the surgeon. Not because it was told to, but because it knew where the instruments were.
That is the promise of a new autonomous laparoscope control system described in a preprint posted to arXiv on May 6, 2026 by researchers led by Jin Fang at Zhejiang University. The team did not respond to requests for comment. Current surgical robots let a human operator — often a resident or a second nurse — manually adjust the camera while the primary surgeon works. The result is a recurring bottleneck: the surgeon asks for a better view, waits, the camera gets bumped, and the rhythm breaks. The new system, validated on a surgical training phantom and during in vivo procedures on a live pig, tries to end that interruption cycle by giving the robot a unified way to see, feel, and position itself.
The technical core is what the researchers call an equivalent-wrench framework. A wrench, in robotics terms, bundles a force and a torque into a single quantity: the pushing or pulling in a straight line plus any twisting around an axis. The system maps three separate inputs — camera images, force and torque measurements from the robot arm, and the arm's joint positions — into the same wrench language. Once everything is expressed in the same units, a single controller can reason about all three simultaneously rather than switching between separate pipelines.
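To make the idea concrete, here is a minimal sketch, in Python, of the common-language step rather than the paper's actual controller. A wrench is just six numbers, a force stacked on a torque, and every input is translated into that form before anything gets combined. The function names, gains, and values below are illustrative assumptions, not numbers from the paper.

```python
# Minimal sketch (not the paper's code) of the "common language" idea:
# a wrench is a 6-vector stacking a linear force and a torque, and every
# input the controller cares about gets expressed in that form.
import numpy as np

def wrench(force_xyz, torque_xyz):
    """Stack a 3-D force and a 3-D torque into one 6-vector [f; tau]."""
    return np.concatenate([np.asarray(force_xyz, dtype=float),
                           np.asarray(torque_xyz, dtype=float)])

# The arm's force/torque sensor already speaks this language directly.
measured = wrench([0.2, -0.1, 0.5], [0.0, 0.01, 0.0])  # illustrative N and N*m

# Vision and joint-position errors are translated into *virtual* wrenches,
# e.g. "pull the camera toward the instruments" or "pull back to the pivot".
visual_cue = wrench([0.15, -0.08, 0.0], [0.0, 0.0, 0.0])
pivot_cue = wrench([-0.6, 0.3, 0.0], [0.0, 0.0, 0.0])

# Once everything is a wrench, a single controller can simply combine them.
command = measured + visual_cue + pivot_cue
print(command)  # one 6-vector driving the arm, instead of three pipelines
```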
One wrench keeps the instrument in line with the incision site. During minimally invasive surgery, the instrument pivots through a small opening in the skin — a constraint called remote center of motion, or RCM. Stray even slightly and the shaft levers against the incision, stretching the tissue around the port. The system constantly applies a correction force to maintain that pivot geometry and reduce sustained loading at the trocar site, the metal port the instruments pass through.
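One plausible way such a pivot-keeping wrench could be computed, again as an illustration rather than the authors' method: measure how far the trocar point has drifted off the instrument shaft's axis and push back with a virtual spring. The function name, stiffness, and geometry below are assumptions.

```python
# Hypothetical sketch of an RCM-keeping wrench: find the component of the
# trocar-point error perpendicular to the shaft axis and apply a restoring
# spring force along it. Stiffness and coordinates are illustrative only.
import numpy as np

def rcm_wrench(shaft_origin, shaft_dir, trocar_point, stiffness=400.0):
    """Restoring wrench pulling the shaft axis back through the trocar point."""
    d = np.asarray(shaft_dir, dtype=float)
    d = d / np.linalg.norm(d)                      # unit vector along the shaft
    r = np.asarray(trocar_point, dtype=float) - np.asarray(shaft_origin, dtype=float)
    # Component of r perpendicular to the shaft: how far the axis misses the port.
    perp_error = r - np.dot(r, d) * d
    force = stiffness * perp_error                 # spring pulling the shaft onto the port
    torque = np.zeros(3)                           # translation-only correction in this sketch
    return np.concatenate([force, torque])

# Shaft drifted 2 mm off the incision point -> a corrective force toward the port.
print(rcm_wrench(shaft_origin=[0.0, 0.0, 0.1],
                 shaft_dir=[0.0, 0.0, 1.0],
                 trocar_point=[0.002, 0.0, 0.0]))
```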
A second wrench handles the actual camera repositioning. The robot reads the video feed and detects where the surgical instruments are in the frame. When the instruments drift toward the edge of view, the robot autonomously nudges the camera to re-center them — what the researchers call compliant dragging. The third wrench does the same tracking job, but faster, using visual features to predict where the instruments will be next.
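The re-centering behavior can be sketched the same way. The gain, deadband, and frame size below are assumptions rather than values from the paper; the point is only that a pixel error between the tracked instruments and the middle of the frame becomes a gentle virtual force dragging the camera back over the tools.

```python
# Illustrative sketch of a re-centering wrench: when the tracked instrument
# tip drifts from the middle of the video frame, convert that pixel error
# into a small virtual force that drags the camera back over the tools.
import numpy as np

def recenter_wrench(tip_px, frame_size=(1920, 1080), gain=0.004, deadband_px=60):
    """Map instrument position in the image to a camera-dragging wrench."""
    center = np.asarray(frame_size, dtype=float) / 2.0
    error = np.asarray(tip_px, dtype=float) - center   # pixels off-center (x right, y down)
    if np.linalg.norm(error) < deadband_px:             # ignore small jitter
        return np.zeros(6)
    # Lateral force in the image plane, proportional to how far the tools have drifted.
    force = np.array([gain * error[0], gain * error[1], 0.0])
    return np.concatenate([force, np.zeros(3)])

print(recenter_wrench(tip_px=(1500, 400)))   # instruments drifting right and up
```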
The system runs the repositioning and tracking wrenches alongside the RCM correction through a task-priority scheme. In plain terms, the pivot constraint sits at the top of the hierarchy: the camera-tracking tasks use only whatever motion leaves the incision geometry undisturbed, and the constraint takes over the moment that geometric guardrail is about to be violated. The result is a camera that re-centers itself when instruments drift but immediately yields to constraint-safe positioning if re-centering would otherwise pull on the incision.
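The task-priority mechanism itself is textbook robotics, and the generic version is short enough to show. The sketch below is the standard null-space projection written at the velocity level, with toy Jacobians standing in for the real kinematics; it is not the authors' wrench-level controller, but it captures how a lower-priority tracking command can only use motion the higher-priority constraint leaves free.

```python
# Generic task-priority sketch (textbook null-space projection, not the
# authors' controller): the RCM task is satisfied exactly, and the tracking
# task is projected into whatever joint motion the RCM task does not use.
import numpy as np

def prioritized_velocity(J_rcm, v_rcm, J_track, v_track):
    """Joint velocities: RCM task first, tracking projected into its null space."""
    J_rcm_pinv = np.linalg.pinv(J_rcm)
    q_dot_primary = J_rcm_pinv @ v_rcm                # exactly hold the pivot constraint
    N = np.eye(J_rcm.shape[1]) - J_rcm_pinv @ J_rcm   # motions invisible to the RCM task
    q_dot_secondary = np.linalg.pinv(J_track @ N) @ (v_track - J_track @ q_dot_primary)
    return q_dot_primary + N @ q_dot_secondary

# Toy two-task, six-joint example with random Jacobians just to show the shapes.
rng = np.random.default_rng(0)
J_rcm, J_track = rng.standard_normal((2, 6)), rng.standard_normal((3, 6))
q_dot = prioritized_velocity(J_rcm, np.zeros(2), J_track, np.array([0.1, 0.0, 0.05]))
print(J_rcm @ q_dot)   # ~[0, 0]: the tracking command never disturbs the pivot task
```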
Existing laparoscopic camera systems typically handle each of these jobs with separate, disconnected logic. A foot pedal moves the camera. A separate sensor suite measures forces. A third algorithm handles visual tracking. The new approach integrates all three from the ground up, which the researchers argue makes the behavior more predictable and the system easier to tune. The task-priority architecture is borrowed from established robotics literature, but applying it to the specific geometry of laparoscopic surgery with real force feedback is novel. The paper's own framing of prior work underscores why the unification matters: it describes existing systems as typically treating force and vision as separate input streams rather than translating both into a common mechanical language.
The in vivo porcine validation is notable, and it has precedent in the field — a March 2026 open-source surgical robotics platform also reported in vivo validation, and the 2022 STAR autonomous surgery system, published in Science Robotics, demonstrated that autonomous surgical procedures in a living subject were feasible. What appears genuinely new here is the combination: force-feedback-guided autonomous tracking — where the robot adjusts its camera based on measured push and pull forces, not just video frames — demonstrated in a live animal with the geometric constraints of minimally invasive surgery. The equivalent-wrench framework lets those force signals and visual signals speak the same language inside a single controller, which the paper argues is what makes the difference when the subject is breathing and tissue compliance is unpredictable.

In a phantom or ex vivo setting, the robot can rely on a fixed, unchanging environment. In a live animal, tissue shifts, the breathing cycle distorts the surgical field, and instrument resistance changes moment to moment. Force feedback is what tells the controller something has changed before the visual frame reflects it — and translating that force signal into the same mathematical language as the visual tracking is what lets the camera respond in real time rather than waiting for the visual pipeline to catch up. The paper reports successful multi-task operation in vivo, though specific performance metrics — tracking latency, force reduction percentages, or autonomous re-centering rates — were not available for this story and would have to come from the full paper or the authors.
Zhejiang University has been building toward this incrementally. An August 2025 paper in The International Journal of Medical Robotics and Computer Assisted Surgery, from the same institutional group, described a multi-task compliant control framework for autonomous laparoscope manipulation — establishing that this was not a one-off demo but a research direction with a published prior step. Jin Fang's group appears to be the same team.
The gap between a working animal trial and a commercial surgical system is still wide. No human trials are described. The paper does not address how a surgeon would override or approve the camera's autonomous decisions in real time — there is no disclosed mechanism for surgeon intervention, and that question will matter enormously to any hospital risk committee. The system was tested in a research setting with a specific robot kinematic configuration, and translating those results to the installed base of commercial surgical platforms — including Intuitive Surgical's dominant da Vinci system — would require substantial re-engineering.
That installed base is relevant context. Intuitive Surgical received FDA clearance for its fifth-generation da Vinci system in March 2024, with the company describing more than 10,000 times the computing power of its prior generation. Whether that compute headroom is being directed toward autonomous camera control, and on what timeline, has not been publicly specified. The company did not respond to questions as of publication.
Whether the wrench framework stays specific to laparoscope holders or scales to full surgical arms is the second-order question the paper leaves open. The same equivalent-wrench logic that tracks camera position and instrument forces could theoretically describe any robotic task that combines a physical constraint with force sensing: instrument insertion depth, tissue retraction, suture tension. If the architecture proves robust enough in translation, it could become the standard sensor-fusion substrate for surgical robotics autonomy rather than a single-purpose camera holder. That outcome depends on two things the paper does not settle. The first is whether the architecture can generalize. The second is whether anyone with an installed base picks it up. A commercial surgical robotics company would need access to the underlying control code, the specific kinematic calibration for its robot, and a regulatory pathway that does not yet exist for autonomous camera-guidance tasks. No such partnership is disclosed in the paper. Until one arrives, the framework is a proof of concept inside a proof of concept — live pig validation for an architecture that has not yet been adopted by the platform it would most plausibly run on.
For now, the paper's most direct contribution is conceptual: a unified mechanics-based framework for thinking about what an autonomous surgical camera needs to sense, resist, and track, all expressed in the same mathematical language. Whether that framework becomes the basis for the next generation of laparoscopic robots — or stays in the arXiv preprint record — depends on whether the engineering translation from pig to patient ever arrives.