We propose an automatic video indexing system that considers
correspondences between indices derived from texts in the video
and image contents. To realize this, correspondences are considered
separately for each of the 4W attributes. This paper focuses on two
correspondences and the indexing based on them: personal nouns with
character regions, and locational / organizational nouns with
background scenes.
A brief overview and the results of the text and image analyses are
presented, and the actual indexing results are shown. The indexing showed good
performance in some cases, and countermeasures for improvement are
discussed for the cases where it was insufficient.