We propose a visual-surveillance-based method for detecting person-to-person hostile intent and behavior in elevators. A surveillance camera's view of an elevator typically covers a small, confined space with abrupt changes in illumination caused by the opening and closing of the elevator door. We extract three levels of features in a sequential process to detect violent events. First, as low-level features, foreground blobs are segmented from the background and their motion velocity vectors are extracted by an optical flow method. Second, as a mid-level feature, the number of people inside the elevator is estimated from the number and sizes of the segmented blobs. As further mid-level features, the velocity magnitudes and directions are computed by image-based motion analysis. Person-to-person violence can occur only when more than one person is in the elevator. As the key classifying feature, we consider the average velocity magnitude and direction of each blob. A sequence of image frames is determined to contain a violent event if the average velocity magnitude of any segmented blob exceeds a threshold and the blob's motion is not dominant in a single direction. Experimental results show that the proposed method detects such events effectively and is computationally efficient enough for real-time processing.
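The decision rule described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each blob is already given as a list of optical-flow vectors `(vx, vy)`, applies the rule to a single frame rather than a frame sequence, and measures "not dominant in one direction" with the mean resultant length of the unit direction vectors, which is a choice made here for illustration. The function names, the use of the people count as a gate, and both threshold values are hypothetical.

```python
import math
from typing import Sequence, Tuple

Vector = Tuple[float, float]  # one optical-flow vector (vx, vy), e.g. in pixels/frame


def mean_magnitude(flow: Sequence[Vector]) -> float:
    """Average speed of the flow vectors belonging to one blob."""
    if not flow:
        return 0.0
    return sum(math.hypot(vx, vy) for vx, vy in flow) / len(flow)


def direction_coherence(flow: Sequence[Vector], eps: float = 1e-9) -> float:
    """Mean resultant length of the unit direction vectors.

    Close to 1 when the vectors share one dominant direction,
    close to 0 when directions are scattered. This measure is an
    assumption of this sketch, not taken from the paper.
    """
    sx = sy = 0.0
    n = 0
    for vx, vy in flow:
        mag = math.hypot(vx, vy)
        if mag > eps:  # skip zero vectors, which have no direction
            sx += vx / mag
            sy += vy / mag
            n += 1
    if n == 0:
        return 1.0  # no motion at all: treat as coherent
    return math.hypot(sx, sy) / n


def is_violent_frame(
    blobs: Sequence[Sequence[Vector]],
    people_count: int,
    mag_threshold: float = 4.0,        # hypothetical value
    coherence_threshold: float = 0.6,  # hypothetical value
) -> bool:
    """Flag a frame when any blob moves fast without a dominant direction."""
    if people_count < 2:
        # Person-to-person violence needs more than one person.
        return False
    return any(
        mean_magnitude(flow) > mag_threshold
        and direction_coherence(flow) < coherence_threshold
        for flow in blobs
    )


# Fast motion in one direction (e.g. someone stepping in) vs. fast, scattered motion.
walking = [(5.0, 0.0), (5.2, 0.1), (4.9, -0.1)]
struggling = [(6.0, 0.0), (-6.0, 0.0), (0.0, 6.0), (0.0, -6.0)]
print(is_violent_frame([walking], people_count=2))
print(is_violent_frame([struggling], people_count=2))
```

In practice the thresholds would have to be tuned on recorded elevator footage, and a frame-level flag like this one would be aggregated over several consecutive frames before an event is reported.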