cpipilot logo cpipilot logo

Table Of Contents

Previous topic

User Guide

Next topic

Developer Guide

This Page

Administration Guide

After reading the administration guide, it should be possible to download the project from Source-forge and get it set up on your own servers. This guide will give you no indication as to how to modify the inner workings of the project. See the Developer Guide for this. New administrators to this project should first read the User Guide to familiarise themselves with the project before proceeding as this section builds on the first.

General Installation Guide

Below is a description of the steps required to successfully install the CPI Pilot project on your local computer. These instructions are not operating system specific so for each instruction a general explanation of what to do is provided. Working out how to do it for your specific platform will require a little searching. The exception here is for Windows XP where these steps are explained in detail and may give clues as to how to achieve the same results on your system. This procedure only describes the steps necessary to get the Django development server running so that you can access your project website on your localhost. Once the project is working on the development server steps should be taken to deploy it onto a production server <deploying> so that it may be accessed publically.

Downloading and installing dependencies

  1. Download and unpack the latest release of the CPI pilot project. This will unpack to a cpipilot folder which you can place anywhere on your system.
  2. Download and install Python. Django will not work with Python 3.x yet because every Python 3.x is not backward compatible with any Python 2.x (18/08/2009).
  3. Download and install MySQL. Set the port that the database uses to 8889.
  4. Download and install Django.
  5. Download and install MySQLDB.
  6. Download and install NumPy.

Creating a database

Enter the MySQL prompt (mysql -u[admin] -p[admin_password]) and:

  • Create a new database:

    mysql> create database dev;
    
  • Make a new user named user:

    mysql> create user user@localhost identified by "password";
    
  • Grant him local privileges:

    mysql> grant all on *.* to user@localhost identified by "password";
    mysql> flush privileges;
    

Testing the development server

  1. Navigate into the cpipilot folder from a command-line/terminal program. Execute the command python manage.py runserver. This runs the development server and you should get the message:

    Django version 1.0.2 final, using settings 'cpipilot.settings'
    Development server is running at http://127.0.0.1:8000/
    Quit the server with CTRL-BREAK.
    
  2. If you encounter any errors here please review the Django installation instructions. To check your progress so far, visit your localhost (http://127.0.0.1:8000/) in your browser and you should see your instance of CPI Pilot running. The home page will show but you will get errors if you try to navigate to any page that uses the database. This is because the models have not been synced with the database yet and no tables exist in the database. Quit the development server using Ctrl C because the next steps involves populating the database.

Syncing the models and database tables

  1. To synchronise the database with your cpipilot/repository/models.py file navigate into the cpipilot folder and type the command python manage.py syncdb. This will create the required tables in the database. Say yes to creating an auth system and fill in the relevant details. This allows you to use Django’s admin system which can be found at (http://127.0.0.1:8000/admin/). From here you will see the tables that the admin system has access to view and modify. You will also see that the database doesn’t contain any data yet. Django’s admin system allows very easy access for web based database entry modification. The admin interface can be modified by altering cpipilot/repository/admin.py. Additional information on the admin system can be found here.

Populating the database

  1. Load the core data by navigating into cpipilot/core and running the command mysql -u[admin] -p[admin_password] dev < core.sql. This replaces the db and vendor table which store information about external databases and reagent vendors respectively. This core data is not added by any particular experiment and so is kept seperate from any experimental data.
  2. Load data pertaining to a particular experiment. First, go to the experiment page on the server that you wish to take the experiment press the “Download All Experiment Data” button. This download contains three folders:
  • /raw - contains instructions on how to download all the experiment’s raw data.
  • /CPI_data - contains the data in a special intermediate form which can be imported into the database as well as exported from it.
  • /original - contains any files uploaded by the experimenter as well as the scripts required to convert this data into the CPI pilot form.

Extract this download into the cpipilot/experiments folder. Then, navigate to the cpipilot/experiments folder and type the command python import_script.py [experiment_folder_name]/. Don’t forget the trailing slash This imports the experiment data to the database. This takes a while, but once it is done you will see your chosen experiment has been loaded and through the website (http://127.0.0.1:8000/) you can navigate through it and access its data.

  1. The raw data is still accessed from the EBI servers. If you want to download the raw data from the EBI servers following the instructions on downloading an experiment. At the moment this is only possible on an experiment by experiment basis. The root path for accessing the data (which currently points to the EBI servers) can be changed by modifying the DATA_ROOT in the settings.py module to point to somewhere local or to your own servers. This means the the experiments’ raw data can be stored in a different location than that of the cpipilot project folder. Downloading data for all experiments hasn’t been implemented yet.

Before continuing with this admin guide please see this Django-Docs page which explains the usefulness of the manage.py module which is a wrapper for the django-admin.py module that is utilized for command-line administrative tasks.

Windows XP Installation Guide

Below is a description of the steps to follow to successfully install the CPI Pilot project on your local computer running with a Windows operating system. This procedure has been tested and works. These steps can be adapted to install the CPI Pilot project on another OS such as Linux or Mac OS X. This procedure only describes the steps necessary to get the Django development server running so that you can access your project website on your localhost. The procedure follows below:

Download the latest release of the cpipilot project (zip file) from Source-forge.

Unzip the program and delete any folders associated with Mac OS X. Place the downloaded cpipilot folder anywhere on your system.

Download Python for your version of windows. Django will not work with Python 3.x yet because every Python 2.x is not backward compatible with any Python 3.x. ‘Python 2.6.2 Windows binary installer’ has been tested and is compatible with Windows XP.

Follow these instructions to add C:\python26 to your PATH system variable. Type python anywhere at the command prompt as a test and exit() to exit.

Download and install MySQL for windows. Run the MySQL server instance config wizard (these options can be set during installation also) from the start menu and go through the procedure making sure to set the database usage to ‘non-transactional database only’ and the port that the database uses to 8889.

Add the ...\mysql\bin folder to your PATH environment variable so that the mysql command can be accessed from the command line. Follow the same instructions to do this.

From the command prompt run mysql -u[admin] -p[admin-password]. This will allow you to access your databases and give you a mysql> prompt.

From there, create a new database called dev:

mysql> create database dev;

See it that worked:

mysql> show databases;

Make a new user named ‘user’:

mysql> create user user@localhost identified by "password";

and grant him local privileges:

mysql> grant all on *.* to user@localhost identified by "password";
mysql> flush privileges;

Search this page for ‘Installing an official release’ which will let you know how to install Django on a windows system. When doing this, use 7-zip as it will help you extract these and future archived files.

Download and install MySQLDB (MySQL support for Python 2.6) for windows.

Download and unzip django-pagination. Copy the pagination folder into your C:\Python26\Lib\site-packages folder.

Run cmd and navigate into the cpipilot folder. Execute the command python manage.py runserver. This runs the development server and you should get the message:

Django version 1.0.2 final, using settings 'cpipilot.settings'
Development server is running at http://127.0.0.1:8000/
Quit the server with CTRL-BREAK.

If you encounter any errors here please review the Django installation instructions.

Now, visit your localhost (http://127.0.0.1:8000/) in your browser and you should see your instance of CPI Pilot running. The home page will show but you will get errors if you try to navigate to any page that uses the database. This is because the models have not been synced with the database yet and no tables exist in the database.

To synchronise the database with your cpipilot/repository/models.py file run cmd and navigate into the cpipilot folder type the command python manage.py syncdb. This will create the required tables in the database. Say yes to creating an auth system and fill in the relevant details. This allows you to use Django’s admin system which can be found at (http://127.0.0.1:8000/admin/). From here you will see the tables that the admin system has access to view and modify. You will also see that the database doen’t contain any data yet. Django’s admin system allows very easy access to non-batch database modification. The admin interface can be modified by altering cpipilot/repository/admin.py. Additional information on the admin system can be found at here

Download and install NumPy from Source-forge.

Set the PYTHONPATH variable... to do this:

  • Go to Control Panel -> System-> Advanced.
  • Click ‘Environment Variables’ button.
  • In the System Variables panel that appears, click New and enter PYTHONPATH in the Variable Name field. In the Variable Value field enter path to the cpipilot folder and to the python folder directly below it. These path’s must be separated by a semi-colon.

Navigate into .../cpipilot/database and run the command mysql -u[user] -p[password] dev < core.sql. This loads needed core data into the database. To load the data from an experiment (5 were included in the core data when this was written), navigate into the .../cpipilot/experiments and run the command python import_script.py experimentFolderName/. Don’t forget the trailing slash above. This takes a while but once it is done you will see your chosen experiment has been loaded and you can navigate through it and access its data. The raw data is still accessed from the EBI servers.

If you want to download the raw data from the EBI servers following the instructions on downloading an experiment. At the moment this is only possible on an experiment by experiment basis. The datapath in the project’s setting.py file also need to be directed to the location of the downloads.

Before continuing with this admin guide please see this Django-Docs page which explains the usefulness of the manage.py module which is a wrapper for the django-admin.py module that is responsible for command-line administrative tasks.

Database Interaction

Django’s admin service does not do everything. Sometimes you have to resort to custom SQL queries. Below are a few of my most commonly used SQL commands for this project. SQL is well documented and many questions are answered on various forums so normally a single google search is enough to find exactly what you are looking for if it is not already listed below.

From the command line

Saving the contents of a database to the hard-disk:

mysqldump --opt -u[user] -p[password] -h[host-name] -P[port-number] [database-name] > [database-store-file-name].sql

Loading a saved database from the hard-disk:

mysql -u[user] -p[password] -h[host-name] -P[port-number] [database-name] < [database-store-file-name].sql

Sometimes it is useful to drop all the database tables for a database. This allows python manage.py syncdb to create all the tables again without you having to drop the database, recreate it and reassign user permissions to it:

mysqldump -u[user] -p[password] --no-data --add-drop-table [database-name] | grep ^DROP | mysql -u[user] -p[password] [database-name]

Check which port and socket mysql is listening on:

netstat -ln | grep mysql

Log into the MySQL server (see commands in the next section):

mysql -u[user-name] -p[password]

From the MySQL prompt

Create a new user and grant privileges to that user:

mysql> create user [user]@[ip] identified by "[password]";
mysql> grant all privileges on *.* TO [user]@[ip] identified by "[password]";
mysql> flush privileges;

A simple select with optional ordering and limiting:

select * from [table-name] order by [column-name] limit 10;

Changing a model in the models.py file and running python manage.py syncdb doesn’t update the database tables so sometimes it is necessary to add a new column to a database table:

mysql> alter table [table-name] add column [column-name]  [type... example... is varchar(255)];

Remove a column:

alter table [table] drop [column];

Change the name of a preexisting column:

alter table [table-name] change [column-name] [new-column-name] [column properties... example is... int not null auto_increment primary key];

Finding duplicate entries in a single table column, this also gives the number of duplicates:

select count(*), [column-name] from [table-name] group by [column-name] having count(*)>1 order by count(*);

Change a column value in specific rows:

update [table-name] set [column-name] = [new-vaule] where [column-name] = [value];

Doing a join across multiple tables:

select [selection-columns]
from
  [table-name-1] as [tn1 (alias)]
  join [table-name-2] as [tn2 (alias)]
       on tn1.[join-on-column-name-1] = [tn2].[join-on-column-name-2]
  join [table-name-3] as [tn3 (alias]
       on tn2.[join-on-column-name-3] = tn3.[join-on-column-name-4]
where
  tn3.[where-column -name]= [value];

Inserting into one table from another:

insert into [to-table] ([column-2], [column-3], [column-4], [column-5]) select [column-1], [column-2], [column-3], [column-4] FROM [from-table];

Inserting into one table from two others:

insert into [to-table] ([column-2], [column-3], [column-4], [column-5]) select [table-1].[column-name-1], [table-2].[column-name-1] from [table-1], [table-2] where [table-1].[column-name-2]=[table-2.column-name-2];

Creating a copy of a table:

create table [copy-table-name] select * from [original-table-name];

Copy a table from one database to another:

create table [to-DB].[to-table] LIKE [from-DB].[from-table];
insert into [to-DB].[to-table] select * from [from-DB].[from-table];

Show the indexes that a table contains. Indexes are used to speed up look-ups into a table:

show index from [table];

Add an auto-incrementing primary key:

alter table [table] add [column-name... usually 'id'] integer auto_increment not null primary key;

Remove a primary key:

alter table [table] drop primary key;

Delete all records in a table:

truncate table [table];

Find and replace:

update [table] set [column] = [new-value] where [column] = [old-value];

Explain, shows how a query is being executed:

explain [query];

Deploying to a Server

Deployment of this instance of Django to our servers required the assistance of our external services department. They agreed to install an Apache HTTP server called mod_wsgi which has proven to work relatively well up until now with Django. If the code below confuses you as it does myself... here are some further notes on Deploying Django.

For our installation, scanning through the large httpd.conf file shows the following EBI specific deviations from the default:

# set the directory
<Directory "/nfs/research/huber/www/html">
           Options Indexes FollowSymLinks
           AllowOverride None
           Order allow,deny
           Allow from all
</Directory>

# set the document root
DocumentRoot "/nfs/research/huber/www/html"

# set the server root
ServerRoot "/ebi/www/ptest/servers/python_wsgi"

# set an alias for the server specific error messages
Alias /error/ "/var/www/error/"

# indicate where the error logs are kept
ErrorLog "|/usr/sbin/rotatelogs /ebi/www/ptest/servers/python_wsgi/logs/error_log.%Y-%m-%d-%H_%M_%S 86400"

# imports the file described below
Include conf.d_local/*.conf

The entire contents of the wsgi.conf are shown below:

LoadModule wsgi_module modules_local/mod_wsgi.so

WSGIDaemonProcess site-1 user=w3nobody group=w3group threads=25
WSGIProcessGroup site-1
WSGISocketPrefix run/wsgi

# EBI specific
Alias /huber-srv/cpipilot/media/ /nfs/research/huber/www/html/cpipilot/media/

# EBI specific
<Directory /nfs/research/huber/www/html/cpipilot/media>
    Order deny,allow
    Allow from all
</Directory>

# EBI specific (points to the file described below)
WSGIScriptAlias /huber-srv/cpipilot /ebi/www/ptest/python/apache/django.wsgi

# EBI specific
<Directory /ebi/www/ptest/python/apache>
    Order deny,allow
    Allow from all
</Directory>

LogLevel info

The entire contents of django.wsgi are shown below:

import os, sys

# for Django, MySQLdb and pagination
sys.path.insert(1,'/ebi/www/ptest/python/site-packages')
# the parent folder of the cpipilot project on our servers
sys.path.append('/nfs/research/huber/www/html')
# the folder that stores various other utils
sys.path.append('/net/nas10b/vol1/homes/dmurrell/dmurrell/development/python/utils')

os.environ['DJANGO_SETTINGS_MODULE'] = 'cpipilot.settings'

import django.core.handlers.wsgi

application = django.core.handlers.wsgi.WSGIHandler()

Watching the error-log is very useful. Use the following terminal command to keep the last few entries to the error log showing in the terminal (linux/OSX):

tail -f [error.log]

Intermediate Data Storage Format

In order to easily upload data to or save data from the online database for an individual experiment, it is useful to define an intermediate data format which captures the structure of the data as it’s stored online so that scripts can operate on it locally during reanalisis. Below is a list of the comma seperated variable files that make up the intermediate format along with the header involved and some example values.

Data.csv (links to the image or movie files):

Heading Value
Location LT0032_01/174
ReagentExternalID 41652
Vendor Ambion
RawPath LT0032_01–ex2005_10_21–sp2005_05_05-tt17–c4/_movies/174–15–06–(9,12)–41652–AK2-gfp.avi
RawType movie
ViewPath LT0032_01–ex2005_10_21–sp2005_05_05-tt17–c4/_movies/174–15–06–(9,12)–41652–AK2-gfp.avi
ViewType movie
Reporter HIST1H2BJ-GFP
Comments none

rpMapping.csv (reagent/phenotype binary mapping):

Heading Value
ReagentExternalID 135516
Vendor Ambion
MCY_0000001 0
MCY_0000002 1
MCY_0000003 0
MCY_0000004 0
... ...

rtMapping.csv (reagent/target mapping):

Heading Value
ReagentExternalID 41652
Vendor Ambion
TargetExternalIDs ENST00000354858;ENST00000398192;ENST00000373449
GeneHGNC AK2

reagents.csv (list of reagents):

Heading Value
ExternalID 41652
Vendor Ambion
SubmissionID MCO_0000001
Type dsRNA
ForwardSequence GAUUGGCUGAAAACUUCUGTT
ReverseSequence CAGAAGUUUUCAGCCAAUCTG

expinf.csv (experiment information):

Heading Value
Investigation Title Mitocheck primary screen
Experiment Description Genome-wide screen for chromosome segregation defects using siRNAs in Hela cells.
Persons Last Name Heriche
Persons Mid Initials .
Persons First Name Jean-Karim
Persons Email jeankarim.heriche@embl-heidelberg.de
Persons Phone 49 6221 387 8860
URL www.mitocheck.org
Data Root Raw MitocheckScreen/source/movies/
Data Root View MitocheckScreen/source/movies/
Target Database .
Phenotypes File phenotypes.csv
Reagents File reagents.csv
rtMapping File rtMapping.csv
rpMapping File rpMapping.csv
Data File data.csv

phenotypes.csv (phenotype descriptions):

Heading Value
SubmissionID MCY_0000001
Description nuclei stay close together
AnnotationType manual