This thesis presents techniques for specification-based test case generation and evaluation. The methods combine mutation analysis with model checking to generate tests that systematically check safety properties. We generate two complementary categories of tests for checking system safety properties: safety-passing tests and safety-failing tests. A set of safety coverage criteria is defined to evaluate the tests. To show the feasibility of our method, we developed a toolkit and applied the method to a sample specification. We automatically generated tests and evaluated them both against our safety coverage criteria and on a Java implementation.