The automatic discovery of motion patterns and object classes from video could enable the creation of scenario models that are much more detailed than those that could be built by hand. Such models should capture and characterise the activity in a scene sufficiently well to be useful in many application domains. Attempting to do this in an unsupervised fashion from passive observation (e.g. from TV shows or CCTV) presents a major challenge for the field. Indeed, there is a good deal of scepticism about whether this is feasible at all without access to linked sources of non-visual data, or without the ability to act within the world in order to explore how things work.
The talk will review the state of the art in this rapidly developing area of computer vision and demonstrate that useful things can indeed be learnt from passive observation in structured domains (e.g. food preparation, aircraft servicing). It will also examine the synergy that exists between the discovery of object classes and the simultaneous discovery of motion patterns.