PDbSeed Documentation

Contents
  1. Introduction
  2. Prerequisites
  3. Command Line Usage
  4. Python Usage
  5. Database Operations
  6. Database Metadata Files
1. Introduction

PDbSeed is a database seeding and verification tool written in Python.  PDbSeed performs an operation, such as DELETE or INSERT, on a database using an XML dataset file for input/output.

PDbSeed is useful in unit tests that interact with a database.  Typically, one would use PDbSeed in the unit test's setup and teardown to ensure that the database's state is consistent across execution of tests.

PDbSeed may be invoked either as a command-line program, or by using the PDbSeed API. When running PDbSeed in either way, one must ensure that the PDbSeed source is in the PYTHONPATH. The $PDBSEED_ROOT/sources directory should be added to the user's PYTHONPATH.

2. Prerequisites
  • Python 2.3 or 2.4
  • MySQLdb driver for Python
  • Java 1.4, Jython 2.1 for the Metadata Generator Tool
3. Command Line Usage

PDbSeed may be invoked from the command line.  Usage is as follows:

python dbseed.py [options] dbhost dbname dbuser dbpassword datasetFilename metadataFilename operation [sqlQuery]

operation is one of REFRESH, DELETE_ALL, INSERT, DELETE, UPDATE, NONE, COMPARE, DUMP

sqlQuery is used for COMPARE and DUMP operations

options:
    --db=db_type
   	selects the type of DB being used
   	defaults to mysql
   	supported types are mysql, kinfx
    --tableName=table_name
   	table name to be used for COMPARE and DUMP operations
   	if not present, table name is parsed from sqlQuery
    --columnsToRemove="columns, to, remove"
   	comma delimited list of columns to filter out during COMPARE operation
    --debug
   	print SQL statements and arguments before they are executed

The arguments are:

  1. dbhost - the host of the database
  2. dbname - the name of the database to connect to
  3. dbuser - the username used to connect to the database
  4. dbpassword - the password used to connect to the database
  5. datasetFilename - the name of the dataset file used as input/output
  6. metadataFilename - the name of the file that contains metadata about the tables in your database (see Database Metadata Files)
  7. operation - the name of the operation to perform on the database
  8. sqlQuery (required for COMPARE, DUMP operations) - a SQL query, used by the COMPARE and DUMP operations
The options are:
  1. db - The database type to use.  This should be one of mysql, kinfx (informix supported is untested at this time).  If not present, mysql is the default. 
  2. tableName - The table name to be used for COMPARE and DUMP operations.  If not present, the table name is parsed from sqlQuery.
  3. columnsToRemove - A comma delimited list of columns to filter out during COMPARE operations.
  4. debug - When this is enabled, print SQL statements and arguments before they are executed.

4. Python Usage

Here's an example of using the python pdbseed API:

import pdbseed.core.dbseed as dbseed

if __name__ == '__main__':
    
    # print sql statements
    dbseed.debugSql = True
    
    # our db connection info
    dbhost = 'localhost'
    dbname = 'pdbseed'
    dbuser = 'root'
    dbpassword = 'root'
    datasetFilename = 'pdbseed_example.dataset.xml'
    metaDataFilename = 'pdbseed_example.metadata.xml'
    
    # create our context
    context = dbseed.ContextFactory.createContext('mysql', metaDataFilename, dbhost, dbname, dbuser, dbpassword)

    # create our producer from our xml dataset file
    xmlDatasetProducer = dbseed.FlatXMLFileRecordProducer(context, datasetFilename)
    
    # run some operations
    dbseed.NoneOperation().execute(xmlDatasetProducer)
    dbseed.RefreshOperation().execute(xmlDatasetProducer)
    dbseed.DeleteOperation().execute(xmlDatasetProducer)
    dbseed.InsertOperation().execute(xmlDatasetProducer)
    dbseed.UpdateOperation().execute(xmlDatasetProducer)
    
    # get a dataset from the db
    sqlQuery = 'select * from simple'
    databaseQueryProducer = dbseed.DatabaseQueryProducer(context, sqlQuery)
    
    # dump a dataset from the db into a string
    dumpToString = dbseed.DumpToStringOperation()
    dumpToString.execute(databaseQueryProducer)
    
    print '\n\n'
    print 'got dataset from running query',sqlQuery,':\n\n'
    print dumpToString.getString()

    # clean things out
    dbseed.DeleteAllOperation().execute(xmlDatasetProducer)

    
    # compare what's in the db with what's in our dataset
    # these should differ since we deleted everything from the db
    print '\n\n'
    print 'comparing data from db with dataset:\n\n'
    
    xmlDatasetProducer = dbseed.FlatXMLFileRecordProducer(context, datasetFilename)
    print dbseed.DataSetComparer(xmlDatasetProducer, databaseQueryProducer).compareProducers()
5. Database Operations

PDbSeed supports the following database operations:

6. Database Metadata Files

Since the Python database API offers no standardized way of gathering metadata about a database, PDbSeed uses an XML file to specify database metadata. The format specifies a set of tables, columns that belong to each table, and each table's primary key columns. Here's an example:

<DatabaseMetaData> <Table name="simple"> <Columns> <Column name="id" type="int"/> <Column name="name" type="text"/> <Column name="d" type="date"/> <Column name="dt" type="datetime"/> <Column name="b" type="blob"/> <Column name="f" type="float"/> <Column name="dbl" type="double"/> </Columns> <Keys> <Primary> <KeyColumn name="id"/> </Primary> </Keys> </Table> </DatabaseMetaData>

Fortunately, there is a metadata generator tool that will generate this file. It uses the Java JDBC metadata API, and requires Java 1.4 and Jython 2.1.

Here's an example of running the metadata generator tool:

# include your JDBC driver in your classpath export CLASSPATH=mysql-connector-java-3.1.12-bin.jar # run jython metadataGenerator.py jython ../sources/dbseed/tools/metadataGenerator.py org.gjt.mm.mysql.Driver # JDBC driver classname jdbc:mysql://localhost:3306/pdbseed # JDBC URL root # DB username root # DB password pdbseed # DB catalog _ # DB schema, _ indicates all