We present a framework for audio-visual analysis of dance performances towards the goal of music-driven dance synthesis. Dance figures, which are performed synchronously with the musical rhythm, can be analyzed through the audio spectra using spectral and chromatic musical features. In the proposed multimodal dance performance analysis system, dance figures are manually labeled over the video stream and modeled using hidden Markov models (HMMs). The music segments, which correspond to beat and meter boundaries, are used to train HMM structures that learn meter-related temporal audio patterns correlated with the dance figures. Bi-gram based co-occurrences of temporal audio patterns and dance figures are calculated, and the co-occurrence performances of two different audio feature streams are evaluated. In our evaluations, mel-frequency cepstral coefficients (MFCCs) with their first and second derivatives, together with chroma features, are used as the candidate audio feature set. The framework proposed in this thesis can be used towards the analysis and synthesis of audio-driven human body animation.
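As an illustration of the feature pipeline summarized above, the following is a minimal sketch of extracting the two candidate audio feature streams and fitting an HMM to one of them. It assumes librosa for feature computation and hmmlearn for HMM training; the choice of libraries, the file name performance.wav, and all parameter values (13 MFCCs, a 4-state Gaussian HMM) are illustrative assumptions, not the configuration used in the thesis.

```python
# A minimal sketch of the audio feature extraction described above.
# librosa and hmmlearn are assumed here; the thesis does not specify
# a toolkit, and all parameter values below are illustrative.
import numpy as np
import librosa
from hmmlearn import hmm

# Load an audio segment (hypothetical file; in the actual system,
# beat and meter boundaries would define the segment edges).
y, sr = librosa.load("performance.wav", sr=22050)

# Candidate feature stream 1: MFCCs with first and second derivatives.
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
mfcc_d1 = librosa.feature.delta(mfcc)              # first derivative
mfcc_d2 = librosa.feature.delta(mfcc, order=2)     # second derivative
mfcc_stream = np.vstack([mfcc, mfcc_d1, mfcc_d2])  # shape (39, n_frames)

# Candidate feature stream 2: chroma features.
chroma_stream = librosa.feature.chroma_stft(y=y, sr=sr)  # (12, n_frames)

# Fit one HMM to the MFCC stream; a 4-state Gaussian HMM is an
# illustrative choice, not the thesis configuration.
model = hmm.GaussianHMM(n_components=4, covariance_type="diag", n_iter=50)
model.fit(mfcc_stream.T)  # hmmlearn expects (n_samples, n_features)
```

In the full system, one such HMM would be trained per meter-related temporal audio pattern, and the decoded patterns would then be paired with the manually labeled dance figures for the bi-gram co-occurrence analysis.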