Human behavior recognition methods based on visible light imaging are the earliest and most widely used, but their performance is easily affected by factors such as lighting conditions, background composition, and whether the actor is carrying objects. Thermal infrared imaging can resolve or mitigate these problems and has the added advantage of working in all weather conditions. As thermal imaging sensors and related devices have improved and become cheaper, research on thermal infrared human action recognition has developed rapidly. Published results in this area are nevertheless still relatively few, and many key problems and technical difficulties remain to be solved. The main ones are as follows.
(1) Fast and accurate target detection and tracking, an important preprocessing step for infrared human behavior recognition, has not yet been well realized. The main reasons are: ① The signal-to-noise ratio of thermal infrared images is usually low, and their resolution is often much lower than that of visible light images, which makes it difficult to distinguish the edges and details of the target; ② Because of the thermal imaging mechanism, objects with low thermal radiation or low infrared reflectivity in some working scenes (such as packages, balls, or wooden sticks carried by a person) are difficult to image clearly, yet in practice recognizing these carried objects can often provide important clues for behavior recognition. In addition, a scene may contain many objects with high thermal radiation or high infrared reflectivity (such as animals, vehicles, roads, and buildings), which can make it difficult to locate and detect the actors.
(2) There is insufficient research on behavior representations that combine a high recognition rate, strong robustness, and a concise form. Human behavior usually exhibits strong uncertainty. First, human behavior is highly variable, and the same behavior can have different semantics in different scenarios. Different people perform the same behavior in different styles, and even the same person performing the same behavior at different times will show some differences due to changes in physiological state or emotion. Second, the image acquisition process is affected by various factors, which introduces noise into the extracted behavioral features. In addition, there may be differences in viewpoint or scale between the frames of a behavior image sequence. An effective behavior representation should be highly robust while maintaining a given recognition rate. That is, in addition to describing the training samples well, it should capture the "commonality" among samples of the same behavior and the "difference" between different behavior categories, and at the same time achieve a concise expression.
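One common way to quantify the "commonality" and "difference" mentioned above is a Fisher-style separability score: the ratio of between-class scatter to within-class scatter of the feature vectors. The sketch below is only an illustration under that assumption and is not a method proposed in this text; the function names (`mean_vector`, `squared_distance`, `fisher_ratio`) and the toy feature values are hypothetical.

```python
from collections import defaultdict


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fisher_ratio(features, labels):
    """Between-class scatter divided by within-class scatter.

    A larger value means samples of the same behavior lie close together
    ("commonality") while different behaviors lie far apart ("difference").
    """
    by_class = defaultdict(list)
    for vector, label in zip(features, labels):
        by_class[label].append(vector)

    overall_mean = mean_vector(features)
    within = 0.0
    between = 0.0
    for vectors in by_class.values():
        class_mean = mean_vector(vectors)
        # Spread of each sample around its own class mean.
        within += sum(squared_distance(v, class_mean) for v in vectors)
        # Spread of the class mean around the overall mean, weighted by class size.
        between += len(vectors) * squared_distance(class_mean, overall_mean)

    if within == 0.0:
        raise ValueError("within-class scatter is zero; the ratio is undefined")
    return between / within


# Hypothetical 2-D features for two behavior classes (0 and 1).
well_separated = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
overlapping = [[0.0, 0.0], [0.0, 2.0], [1.0, 0.0], [1.0, 2.0]]
labels = [0, 0, 1, 1]

print(fisher_ratio(well_separated, labels))  # 25.0
print(fisher_ratio(overlapping, labels))     # 0.25
```

Under this assumed score, a representation whose features score higher on held-out samples separates behavior categories more cleanly. It does not capture conciseness, which would have to be assessed separately, for example by the feature dimension.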
(3) The performance of classification methods for behavior recognition still needs to be improved. Recognition rate, robustness, and feature sample sparsity are all important indicators of classification performance. Robustness requires that the classification method accurately extract the statistical characteristics of the training samples, while good feature sample sparsity weakens the adverse effect of information redundancy on recognition and significantly reduces the computational complexity of the recognition process.
(4) The scope of human action recognition research is still narrow. Based on the complexity of the behavior, some researchers divide human behavior into four levels: posture, individual behavior, interactive behavior, and group behavior. At present, most research still focuses on the first two levels, and there is relatively little research on the latter two.