PDbSeed Documentation
- Introduction
- Prerequisites
- Command Line Usage
- Python Usage
- Database Operations
- Database Metadata Files
PDbSeed is a database seeding and verification tool written in Python. PDbSeed performs an operation, such as DELETE or INSERT, on a database using an XML dataset file for input/output.
PDbSeed is useful in unit tests that interact with a database. Typically, one would use PDbSeed in the unit test's setup and teardown to ensure that the database's state is consistent across execution of tests.
PDbSeed may be invoked either as a command-line program, or by using the PDbSeed API. When running PDbSeed in either way, one must ensure that the PDbSeed source is in the PYTHONPATH. The $PDBSEED_ROOT/sources directory should be added to the user's PYTHONPATH.
PDbSeed may be invoked from the command line. Usage is as follows:
python dbseed.py [options] dbhost dbname dbuser dbpassword datasetFilename metadataFilename operation [sqlQuery] operation is one of REFRESH, DELETE_ALL, INSERT, DELETE, UPDATE, NONE, COMPARE, DUMP sqlQuery is used for COMPARE and DUMP operations options: --db=db_type selects the type of DB being used defaults to mysql supported types are mysql, kinfx --tableName=table_name table name to be used for COMPARE and DUMP operations if not present, table name is parsed from sqlQuery --columnsToRemove="columns, to, remove" comma delimited list of columns to filter out during COMPARE operation --debug print SQL statements and arguments before they are executed
The arguments are:
- dbhost - the host of the database
- dbname - the name of the database to connect to
- dbuser - the username used to connect to the database
- dbpassword - the password used to connect to the database
- datasetFilename - the name of the dataset file used as input/output
- metadataFilename - the name of the file that contains metadata about the tables in your database (see Database Metadata Files)
- operation - the name of the operation to perform on the database
- sqlQuery (required for COMPARE, DUMP operations) - a SQL query, used by the COMPARE and DUMP operations
- db - The database type to use. This should be one of mysql, kinfx (informix supported is untested at this time). If not present, mysql is the default.
- tableName - The table name to be used for COMPARE and DUMP operations. If not present, the table name is parsed from sqlQuery.
- columnsToRemove - A comma delimited list of columns to filter out during COMPARE operations.
- debug - When this is enabled, print SQL statements and arguments before they are executed.
Here's an example of using the python pdbseed API:
import pdbseed.core.dbseed as dbseed
if __name__ == '__main__':
# print sql statements
dbseed.debugSql = True
# our db connection info
dbhost = 'localhost'
dbname = 'pdbseed'
dbuser = 'root'
dbpassword = 'root'
datasetFilename = 'pdbseed_example.dataset.xml'
metaDataFilename = 'pdbseed_example.metadata.xml'
# create our context
context = dbseed.ContextFactory.createContext('mysql', metaDataFilename, dbhost, dbname, dbuser, dbpassword)
# create our producer from our xml dataset file
xmlDatasetProducer = dbseed.FlatXMLFileRecordProducer(context, datasetFilename)
# run some operations
dbseed.NoneOperation().execute(xmlDatasetProducer)
dbseed.RefreshOperation().execute(xmlDatasetProducer)
dbseed.DeleteOperation().execute(xmlDatasetProducer)
dbseed.InsertOperation().execute(xmlDatasetProducer)
dbseed.UpdateOperation().execute(xmlDatasetProducer)
# get a dataset from the db
sqlQuery = 'select * from simple'
databaseQueryProducer = dbseed.DatabaseQueryProducer(context, sqlQuery)
# dump a dataset from the db into a string
dumpToString = dbseed.DumpToStringOperation()
dumpToString.execute(databaseQueryProducer)
print '\n\n'
print 'got dataset from running query',sqlQuery,':\n\n'
print dumpToString.getString()
# clean things out
dbseed.DeleteAllOperation().execute(xmlDatasetProducer)
# compare what's in the db with what's in our dataset
# these should differ since we deleted everything from the db
print '\n\n'
print 'comparing data from db with dataset:\n\n'
xmlDatasetProducer = dbseed.FlatXMLFileRecordProducer(context, datasetFilename)
print dbseed.DataSetComparer(xmlDatasetProducer, databaseQueryProducer).compareProducers()
PDbSeed supports the following database operations:
- INSERT - attempts to insert each row into the database. Assumes each row is already in the database
- UPDATE - attempts to update each row in the database
- DELETE - attempts to delete each row in the database
- DELETE_ALL - deletes all rows from each table present in the dataset
- REFRESH - attempts to refresh each row in the database by performing either an insert if the row isn't present, or an update if the row is already present
- NONE - processed each row in the dataset, but performs no actually database commands
- COMPARE - used with the --sqlQuery option, compares the contents of the result of running sqlQuery with the dataset specified by datasetFilename, printing out any differences found
- DUMP - used with the --sqlQuery option, dumps the contents of the result of running sqlQuery to the file specified by datasetFilename
Since the Python database API offers no standardized way of gathering metadata about a database, PDbSeed uses an XML file to specify database metadata. The format specifies a set of tables, columns that belong to each table, and each table's primary key columns. Here's an example:
<DatabaseMetaData> <Table name="simple"> <Columns> <Column name="id" type="int"/> <Column name="name" type="text"/> <Column name="d" type="date"/> <Column name="dt" type="datetime"/> <Column name="b" type="blob"/> <Column name="f" type="float"/> <Column name="dbl" type="double"/> </Columns> <Keys> <Primary> <KeyColumn name="id"/> </Primary> </Keys> </Table> </DatabaseMetaData>
Fortunately, there is a metadata generator tool that will generate this file. It uses the Java JDBC metadata API, and requires Java 1.4 and Jython 2.1.
Here's an example of running the metadata generator tool:
# include your JDBC driver in your classpath export CLASSPATH=mysql-connector-java-3.1.12-bin.jar # run jython metadataGenerator.py jython ../sources/dbseed/tools/metadataGenerator.py org.gjt.mm.mysql.Driver # JDBC driver classname jdbc:mysql://localhost:3306/pdbseed # JDBC URL root # DB username root # DB password pdbseed # DB catalog _ # DB schema, _ indicates all