After reading the administration guide, it should be possible to download the project from Source-forge and get it set up on your own servers. This guide will give you no indication as to how to modify the inner workings of the project. See the Developer Guide for this. New administrators to this project should first read the User Guide to familiarise themselves with the project before proceeding as this section builds on the first.
Below is a description of the steps required to successfully install the CPI Pilot project on your local computer. These instructions are not operating system specific so for each instruction a general explanation of what to do is provided. Working out how to do it for your specific platform will require a little searching. The exception here is for Windows XP where these steps are explained in detail and may give clues as to how to achieve the same results on your system. This procedure only describes the steps necessary to get the Django development server running so that you can access your project website on your localhost. Once the project is working on the development server steps should be taken to deploy it onto a production server <deploying> so that it may be accessed publically.
Enter the MySQL prompt (mysql -u[admin] -p[admin_password]) and:
Create a new database:
mysql> create database dev;Make a new user named user:
mysql> create user user@localhost identified by "password";Grant him local privileges:
mysql> grant all on *.* to user@localhost identified by "password"; mysql> flush privileges;
Navigate into the cpipilot folder from a command-line/terminal program. Execute the command python manage.py runserver. This runs the development server and you should get the message:
Django version 1.0.2 final, using settings 'cpipilot.settings'
Development server is running at http://127.0.0.1:8000/
Quit the server with CTRL-BREAK.
If you encounter any errors here please review the Django installation instructions. To check your progress so far, visit your localhost (http://127.0.0.1:8000/) in your browser and you should see your instance of CPI Pilot running. The home page will show but you will get errors if you try to navigate to any page that uses the database. This is because the models have not been synced with the database yet and no tables exist in the database. Quit the development server using Ctrl C because the next steps involves populating the database.
- /raw - contains instructions on how to download all the experiment’s raw data.
- /CPI_data - contains the data in a special intermediate form which can be imported into the database as well as exported from it.
- /original - contains any files uploaded by the experimenter as well as the scripts required to convert this data into the CPI pilot form.
Extract this download into the cpipilot/experiments folder. Then, navigate to the cpipilot/experiments folder and type the command python import_script.py [experiment_folder_name]/. Don’t forget the trailing slash This imports the experiment data to the database. This takes a while, but once it is done you will see your chosen experiment has been loaded and through the website (http://127.0.0.1:8000/) you can navigate through it and access its data.
Before continuing with this admin guide please see this Django-Docs page which explains the usefulness of the manage.py module which is a wrapper for the django-admin.py module that is utilized for command-line administrative tasks.
Below is a description of the steps to follow to successfully install the CPI Pilot project on your local computer running with a Windows operating system. This procedure has been tested and works. These steps can be adapted to install the CPI Pilot project on another OS such as Linux or Mac OS X. This procedure only describes the steps necessary to get the Django development server running so that you can access your project website on your localhost. The procedure follows below:
Download the latest release of the cpipilot project (zip file) from Source-forge.
Unzip the program and delete any folders associated with Mac OS X. Place the downloaded cpipilot folder anywhere on your system.
Download Python for your version of windows. Django will not work with Python 3.x yet because every Python 2.x is not backward compatible with any Python 3.x. ‘Python 2.6.2 Windows binary installer’ has been tested and is compatible with Windows XP.
Follow these instructions to add C:\python26 to your PATH system variable. Type python anywhere at the command prompt as a test and exit() to exit.
Download and install MySQL for windows. Run the MySQL server instance config wizard (these options can be set during installation also) from the start menu and go through the procedure making sure to set the database usage to ‘non-transactional database only’ and the port that the database uses to 8889.
Add the ...\mysql\bin folder to your PATH environment variable so that the mysql command can be accessed from the command line. Follow the same instructions to do this.
From the command prompt run mysql -u[admin] -p[admin-password]. This will allow you to access your databases and give you a mysql> prompt.
From there, create a new database called dev:
mysql> create database dev;
See it that worked:
mysql> show databases;
Make a new user named ‘user’:
mysql> create user user@localhost identified by "password";
and grant him local privileges:
mysql> grant all on *.* to user@localhost identified by "password";
mysql> flush privileges;
Search this page for ‘Installing an official release’ which will let you know how to install Django on a windows system. When doing this, use 7-zip as it will help you extract these and future archived files.
Download and install MySQLDB (MySQL support for Python 2.6) for windows.
Download and unzip django-pagination. Copy the pagination folder into your C:\Python26\Lib\site-packages folder.
Run cmd and navigate into the cpipilot folder. Execute the command python manage.py runserver. This runs the development server and you should get the message:
Django version 1.0.2 final, using settings 'cpipilot.settings'
Development server is running at http://127.0.0.1:8000/
Quit the server with CTRL-BREAK.
If you encounter any errors here please review the Django installation instructions.
Now, visit your localhost (http://127.0.0.1:8000/) in your browser and you should see your instance of CPI Pilot running. The home page will show but you will get errors if you try to navigate to any page that uses the database. This is because the models have not been synced with the database yet and no tables exist in the database.
To synchronise the database with your cpipilot/repository/models.py file run cmd and navigate into the cpipilot folder type the command python manage.py syncdb. This will create the required tables in the database. Say yes to creating an auth system and fill in the relevant details. This allows you to use Django’s admin system which can be found at (http://127.0.0.1:8000/admin/). From here you will see the tables that the admin system has access to view and modify. You will also see that the database doen’t contain any data yet. Django’s admin system allows very easy access to non-batch database modification. The admin interface can be modified by altering cpipilot/repository/admin.py. Additional information on the admin system can be found at here
Download and install NumPy from Source-forge.
Set the PYTHONPATH variable... to do this:
Navigate into .../cpipilot/database and run the command mysql -u[user] -p[password] dev < core.sql. This loads needed core data into the database. To load the data from an experiment (5 were included in the core data when this was written), navigate into the .../cpipilot/experiments and run the command python import_script.py experimentFolderName/. Don’t forget the trailing slash above. This takes a while but once it is done you will see your chosen experiment has been loaded and you can navigate through it and access its data. The raw data is still accessed from the EBI servers.
If you want to download the raw data from the EBI servers following the instructions on downloading an experiment. At the moment this is only possible on an experiment by experiment basis. The datapath in the project’s setting.py file also need to be directed to the location of the downloads.
Before continuing with this admin guide please see this Django-Docs page which explains the usefulness of the manage.py module which is a wrapper for the django-admin.py module that is responsible for command-line administrative tasks.
Django’s admin service does not do everything. Sometimes you have to resort to custom SQL queries. Below are a few of my most commonly used SQL commands for this project. SQL is well documented and many questions are answered on various forums so normally a single google search is enough to find exactly what you are looking for if it is not already listed below.
Saving the contents of a database to the hard-disk:
mysqldump --opt -u[user] -p[password] -h[host-name] -P[port-number] [database-name] > [database-store-file-name].sql
Loading a saved database from the hard-disk:
mysql -u[user] -p[password] -h[host-name] -P[port-number] [database-name] < [database-store-file-name].sql
Sometimes it is useful to drop all the database tables for a database. This allows python manage.py syncdb to create all the tables again without you having to drop the database, recreate it and reassign user permissions to it:
mysqldump -u[user] -p[password] --no-data --add-drop-table [database-name] | grep ^DROP | mysql -u[user] -p[password] [database-name]
Check which port and socket mysql is listening on:
netstat -ln | grep mysql
Log into the MySQL server (see commands in the next section):
mysql -u[user-name] -p[password]
Create a new user and grant privileges to that user:
mysql> create user [user]@[ip] identified by "[password]";
mysql> grant all privileges on *.* TO [user]@[ip] identified by "[password]";
mysql> flush privileges;
A simple select with optional ordering and limiting:
select * from [table-name] order by [column-name] limit 10;
Changing a model in the models.py file and running python manage.py syncdb doesn’t update the database tables so sometimes it is necessary to add a new column to a database table:
mysql> alter table [table-name] add column [column-name] [type... example... is varchar(255)];
Remove a column:
alter table [table] drop [column];
Change the name of a preexisting column:
alter table [table-name] change [column-name] [new-column-name] [column properties... example is... int not null auto_increment primary key];
Finding duplicate entries in a single table column, this also gives the number of duplicates:
select count(*), [column-name] from [table-name] group by [column-name] having count(*)>1 order by count(*);
Change a column value in specific rows:
update [table-name] set [column-name] = [new-vaule] where [column-name] = [value];
Doing a join across multiple tables:
select [selection-columns]
from
[table-name-1] as [tn1 (alias)]
join [table-name-2] as [tn2 (alias)]
on tn1.[join-on-column-name-1] = [tn2].[join-on-column-name-2]
join [table-name-3] as [tn3 (alias]
on tn2.[join-on-column-name-3] = tn3.[join-on-column-name-4]
where
tn3.[where-column -name]= [value];
Inserting into one table from another:
insert into [to-table] ([column-2], [column-3], [column-4], [column-5]) select [column-1], [column-2], [column-3], [column-4] FROM [from-table];
Inserting into one table from two others:
insert into [to-table] ([column-2], [column-3], [column-4], [column-5]) select [table-1].[column-name-1], [table-2].[column-name-1] from [table-1], [table-2] where [table-1].[column-name-2]=[table-2.column-name-2];
Creating a copy of a table:
create table [copy-table-name] select * from [original-table-name];
Copy a table from one database to another:
create table [to-DB].[to-table] LIKE [from-DB].[from-table];
insert into [to-DB].[to-table] select * from [from-DB].[from-table];
Show the indexes that a table contains. Indexes are used to speed up look-ups into a table:
show index from [table];
Add an auto-incrementing primary key:
alter table [table] add [column-name... usually 'id'] integer auto_increment not null primary key;
Remove a primary key:
alter table [table] drop primary key;
Delete all records in a table:
truncate table [table];
Find and replace:
update [table] set [column] = [new-value] where [column] = [old-value];
Explain, shows how a query is being executed:
explain [query];
Deployment of this instance of Django to our servers required the assistance of our external services department. They agreed to install an Apache HTTP server called mod_wsgi which has proven to work relatively well up until now with Django. If the code below confuses you as it does myself... here are some further notes on Deploying Django.
For our installation, scanning through the large httpd.conf file shows the following EBI specific deviations from the default:
# set the directory
<Directory "/nfs/research/huber/www/html">
Options Indexes FollowSymLinks
AllowOverride None
Order allow,deny
Allow from all
</Directory>
# set the document root
DocumentRoot "/nfs/research/huber/www/html"
# set the server root
ServerRoot "/ebi/www/ptest/servers/python_wsgi"
# set an alias for the server specific error messages
Alias /error/ "/var/www/error/"
# indicate where the error logs are kept
ErrorLog "|/usr/sbin/rotatelogs /ebi/www/ptest/servers/python_wsgi/logs/error_log.%Y-%m-%d-%H_%M_%S 86400"
# imports the file described below
Include conf.d_local/*.conf
The entire contents of the wsgi.conf are shown below:
LoadModule wsgi_module modules_local/mod_wsgi.so
WSGIDaemonProcess site-1 user=w3nobody group=w3group threads=25
WSGIProcessGroup site-1
WSGISocketPrefix run/wsgi
# EBI specific
Alias /huber-srv/cpipilot/media/ /nfs/research/huber/www/html/cpipilot/media/
# EBI specific
<Directory /nfs/research/huber/www/html/cpipilot/media>
Order deny,allow
Allow from all
</Directory>
# EBI specific (points to the file described below)
WSGIScriptAlias /huber-srv/cpipilot /ebi/www/ptest/python/apache/django.wsgi
# EBI specific
<Directory /ebi/www/ptest/python/apache>
Order deny,allow
Allow from all
</Directory>
LogLevel info
The entire contents of django.wsgi are shown below:
import os, sys
# for Django, MySQLdb and pagination
sys.path.insert(1,'/ebi/www/ptest/python/site-packages')
# the parent folder of the cpipilot project on our servers
sys.path.append('/nfs/research/huber/www/html')
# the folder that stores various other utils
sys.path.append('/net/nas10b/vol1/homes/dmurrell/dmurrell/development/python/utils')
os.environ['DJANGO_SETTINGS_MODULE'] = 'cpipilot.settings'
import django.core.handlers.wsgi
application = django.core.handlers.wsgi.WSGIHandler()
Watching the error-log is very useful. Use the following terminal command to keep the last few entries to the error log showing in the terminal (linux/OSX):
tail -f [error.log]
In order to easily upload data to or save data from the online database for an individual experiment, it is useful to define an intermediate data format which captures the structure of the data as it’s stored online so that scripts can operate on it locally during reanalisis. Below is a list of the comma seperated variable files that make up the intermediate format along with the header involved and some example values.
Data.csv (links to the image or movie files):
Heading | Value |
---|---|
Location | LT0032_01/174 |
ReagentExternalID | 41652 |
Vendor | Ambion |
RawPath | LT0032_01–ex2005_10_21–sp2005_05_05-tt17–c4/_movies/174–15–06–(9,12)–41652–AK2-gfp.avi |
RawType | movie |
ViewPath | LT0032_01–ex2005_10_21–sp2005_05_05-tt17–c4/_movies/174–15–06–(9,12)–41652–AK2-gfp.avi |
ViewType | movie |
Reporter | HIST1H2BJ-GFP |
Comments | none |
rpMapping.csv (reagent/phenotype binary mapping):
Heading | Value |
---|---|
ReagentExternalID | 135516 |
Vendor | Ambion |
MCY_0000001 | 0 |
MCY_0000002 | 1 |
MCY_0000003 | 0 |
MCY_0000004 | 0 |
... | ... |
rtMapping.csv (reagent/target mapping):
Heading | Value |
---|---|
ReagentExternalID | 41652 |
Vendor | Ambion |
TargetExternalIDs | ENST00000354858;ENST00000398192;ENST00000373449 |
GeneHGNC | AK2 |
reagents.csv (list of reagents):
Heading | Value |
---|---|
ExternalID | 41652 |
Vendor | Ambion |
SubmissionID | MCO_0000001 |
Type | dsRNA |
ForwardSequence | GAUUGGCUGAAAACUUCUGTT |
ReverseSequence | CAGAAGUUUUCAGCCAAUCTG |
expinf.csv (experiment information):
Heading | Value |
---|---|
Investigation Title | Mitocheck primary screen |
Experiment Description | Genome-wide screen for chromosome segregation defects using siRNAs in Hela cells. |
Persons Last Name | Heriche |
Persons Mid Initials | . |
Persons First Name | Jean-Karim |
Persons Email | jeankarim.heriche@embl-heidelberg.de |
Persons Phone | 49 6221 387 8860 |
URL | www.mitocheck.org |
Data Root Raw | MitocheckScreen/source/movies/ |
Data Root View | MitocheckScreen/source/movies/ |
Target Database | . |
Phenotypes File | phenotypes.csv |
Reagents File | reagents.csv |
rtMapping File | rtMapping.csv |
rpMapping File | rpMapping.csv |
Data File | data.csv |
phenotypes.csv (phenotype descriptions):
Heading | Value |
---|---|
SubmissionID | MCY_0000001 |
Description | nuclei stay close together |
AnnotationType | manual |