.. highlight:: rest .. _admin-guide: ===================== Administration Guide ===================== After reading the administration guide, it should be possible to download the project from `Source-forge `_ and get it set up on your own servers. This guide will give you no indication as to how to modify the inner workings of the project. See the :ref:`developer-guide` for this. New administrators to this project should first read the :ref:`user-guide` to familiarise themselves with the project before proceeding as this section builds on the first. General Installation Guide -------------------------- Below is a description of the steps required to successfully install the CPI Pilot project on your local computer. These instructions are not operating system specific so for each instruction a general explanation of what to do is provided. Working out how to do it for your specific platform will require a little searching. The exception here is for Windows XP where these steps are explained in detail and may give clues as to how to achieve the same results on your system. This procedure only describes the steps necessary to get the Django development server running so that you can access your project website on your localhost. Once the project is working on the development server steps should be taken to `deploy it onto a production server ` so that it may be accessed publically. Downloading and installing dependencies ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ #. Download and unpack the latest release of the `CPI pilot project `_. This will unpack to a ``cpipilot`` folder which you can place anywhere on your system. #. Download and install `Python `_. Django will not work with Python 3.x yet because every Python 3.x is not backward compatible with any Python 2.x (18/08/2009). #. Download and install `MySQL `_. Set the port that the database uses to 8889. #. Download and install `Django `_. #. Download and install `MySQLDB `_. #. Download and install `NumPy `_. Creating a database ^^^^^^^^^^^^^^^^^^^ Enter the MySQL prompt (``mysql -u[admin] -p[admin_password]``) and: * Create a new database:: mysql> create database dev; * Make a new user named ``user``:: mysql> create user user@localhost identified by "password"; * Grant him local privileges:: mysql> grant all on *.* to user@localhost identified by "password"; mysql> flush privileges; Testing the development server ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ #. Navigate into the ``cpipilot`` folder from a command-line/terminal program. Execute the command ``python manage.py runserver``. This runs the development server and you should get the message:: Django version 1.0.2 final, using settings 'cpipilot.settings' Development server is running at http://127.0.0.1:8000/ Quit the server with CTRL-BREAK. #. If you encounter any errors here please review the Django `installation instructions `_. To check your progress so far, visit your localhost (http://127.0.0.1:8000/) in your browser and you should see your instance of CPI Pilot running. The home page will show but you will get errors if you try to navigate to any page that uses the database. This is because the models have not been synced with the database yet and no tables exist in the database. Quit the ``development server`` using Ctrl C because the next steps involves populating the database. Syncing the models and database tables ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ #. To synchronise the database with your ``cpipilot/repository/models.py`` file navigate into the ``cpipilot`` folder and type the command ``python manage.py syncdb``. This will create the required tables in the database. Say yes to creating an auth system and fill in the relevant details. This allows you to use Django's admin system which can be found at (http://127.0.0.1:8000/admin/). From here you will see the tables that the admin system has access to view and modify. You will also see that the database doesn't contain any data yet. Django's admin system allows very easy access for web based database entry modification. The admin interface can be modified by altering ``cpipilot/repository/admin.py``. Additional information on the admin system can be found `here `_. Populating the database ^^^^^^^^^^^^^^^^^^^^^^^ #. Load the core data by navigating into ``cpipilot/core`` and running the command ``mysql -u[admin] -p[admin_password] dev < core.sql``. This replaces the ``db`` and ``vendor`` table which store information about external databases and reagent vendors respectively. This core data is not added by any particular experiment and so is kept seperate from any experimental data. #. Load data pertaining to a particular experiment. First, go to the :ref:`experiment page ` on the server that you wish to take the experiment press the "Download All Experiment Data" button. This download contains three folders: * ``/raw`` - contains instructions on how to download all the experiment's raw data. * ``/CPI_data`` - contains the data in a special :ref:`intermediate ` form which can be imported into the database as well as exported from it. * ``/original`` - contains any files uploaded by the experimenter as well as the scripts required to convert this data into the CPI pilot form. Extract this download into the ``cpipilot/experiments`` folder. Then, navigate to the ``cpipilot/experiments`` folder and type the command ``python import_script.py [experiment_folder_name]/``. Don't forget the trailing slash This imports the experiment data to the database. This takes a while, but once it is done you will see your chosen experiment has been loaded and through the website (http://127.0.0.1:8000/) you can navigate through it and access its data. #. The raw data is still accessed from the EBI servers. If you want to download the raw data from the EBI servers following the instructions on :ref:`downloading an experiment `. At the moment this is only possible on an experiment by experiment basis. The root path for accessing the data (which currently points to the EBI servers) can be changed by modifying the ``DATA_ROOT`` in the ``settings.py`` module to point to somewhere local or to your own servers. This means the the experiments' raw data can be stored in a different location than that of the ``cpipilot`` project folder. Downloading data for all experiments hasn't been implemented yet. Before continuing with this admin guide please see this `Django-Docs `_ page which explains the usefulness of the ``manage.py`` module which is a wrapper for the django-admin.py module that is utilized for command-line administrative tasks. Windows XP Installation Guide ----------------------------- Below is a description of the steps to follow to successfully install the CPI Pilot project on your local computer running with a Windows operating system. This procedure has been tested and works. These steps can be adapted to install the CPI Pilot project on another OS such as Linux or Mac OS X. This procedure only describes the steps necessary to get the Django development server running so that you can access your project website on your localhost. The procedure follows below: Download the latest release of the cpipilot project (zip file) from `Source-forge `_. Unzip the program and delete any folders associated with Mac OS X. Place the downloaded ``cpipilot`` folder anywhere on your system. `Download Python `_ for your version of windows. Django will not work with Python 3.x yet because every Python 2.x is not backward compatible with any Python 3.x. 'Python 2.6.2 Windows binary installer' has been tested and is compatible with Windows XP. Follow these `instructions `_ to add ``C:\python26`` to your ``PATH`` system variable. Type ``python`` anywhere at the command prompt as a test and exit() to exit. Download and install `MySQL for windows `_. Run the MySQL server instance config wizard (these options can be set during installation also) from the start menu and go through the procedure making sure to set the database usage to 'non-transactional database only' and the port that the database uses to 8889. Add the ``...\mysql\bin`` folder to your PATH environment variable so that the mysql command can be accessed from the command line. Follow the same `instructions `_ to do this. From the command prompt run ``mysql -u[admin] -p[admin-password]``. This will allow you to access your databases and give you a ``mysql>`` prompt. From there, create a new database called dev:: mysql> create database dev; See it that worked:: mysql> show databases; Make a new user named 'user':: mysql> create user user@localhost identified by "password"; and grant him local privileges:: mysql> grant all on *.* to user@localhost identified by "password"; mysql> flush privileges; Search `this page `_ for 'Installing an official release' which will let you know how to install Django on a windows system. When doing this, use `7-zip `_ as it will help you extract these and future archived files. Download and install `MySQLDB `_ (MySQL support for Python 2.6) for windows. Download and unzip `django-pagination `_. Copy the pagination folder into your ``C:\Python26\Lib\site-packages`` folder. Run cmd and navigate into the ``cpipilot`` folder. Execute the command ``python manage.py runserver``. This runs the development server and you should get the message:: Django version 1.0.2 final, using settings 'cpipilot.settings' Development server is running at http://127.0.0.1:8000/ Quit the server with CTRL-BREAK. If you encounter any errors here please review the Django `installation instructions `_. Now, visit your localhost (http://127.0.0.1:8000/) in your browser and you should see your instance of CPI Pilot running. The home page will show but you will get errors if you try to navigate to any page that uses the database. This is because the models have not been synced with the database yet and no tables exist in the database. To synchronise the database with your ``cpipilot/repository/models.py`` file run cmd and navigate into the cpipilot folder type the command ``python manage.py syncdb``. This will create the required tables in the database. Say yes to creating an auth system and fill in the relevant details. This allows you to use Django's admin system which can be found at (http://127.0.0.1:8000/admin/). From here you will see the tables that the admin system has access to view and modify. You will also see that the database doen't contain any data yet. Django's admin system allows very easy access to non-batch database modification. The admin interface can be modified by altering ``cpipilot/repository/admin.py``. Additional information on the admin system can be found at `here `_ Download and install `NumPy `_ from Source-forge. Set the ``PYTHONPATH`` variable... to do this: * Go to Control Panel -> System-> Advanced. * Click 'Environment Variables' button. * In the ``System Variables`` panel that appears, click New and enter ``PYTHONPATH`` in the ``Variable Name`` field. In the ``Variable Value`` field enter path to the cpipilot folder and to the python folder directly below it. These path's must be separated by a semi-colon. Navigate into ``.../cpipilot/database`` and run the command ``mysql -u[user] -p[password] dev < core.sql``. This loads needed core data into the database. To load the data from an experiment (5 were included in the core data when this was written), navigate into the ``.../cpipilot/experiments`` and run the command ``python import_script.py experimentFolderName/``. Don't forget the trailing slash above. This takes a while but once it is done you will see your chosen experiment has been loaded and you can navigate through it and access its data. The raw data is still accessed from the EBI servers. If you want to download the raw data from the EBI servers following the instructions on :ref:`downloading an experiment `. At the moment this is only possible on an experiment by experiment basis. The datapath in the project's ``setting.py`` file also need to be directed to the location of the downloads. Before continuing with this admin guide please see this `Django-Docs `_ page which explains the usefulness of the ``manage.py`` module which is a wrapper for the django-admin.py module that is responsible for command-line administrative tasks. Database Interaction -------------------- Django's admin service does not do everything. Sometimes you have to resort to custom SQL queries. Below are a few of my most commonly used SQL commands for this project. SQL is well documented and many questions are answered on various forums so normally a single google search is enough to find exactly what you are looking for if it is not already listed below. From the command line ^^^^^^^^^^^^^^^^^^^^^ Saving the contents of a database to the hard-disk:: mysqldump --opt -u[user] -p[password] -h[host-name] -P[port-number] [database-name] > [database-store-file-name].sql Loading a saved database from the hard-disk:: mysql -u[user] -p[password] -h[host-name] -P[port-number] [database-name] < [database-store-file-name].sql Sometimes it is useful to drop all the database tables for a database. This allows ``python manage.py syncdb`` to create all the tables again without you having to drop the database, recreate it and reassign user permissions to it:: mysqldump -u[user] -p[password] --no-data --add-drop-table [database-name] | grep ^DROP | mysql -u[user] -p[password] [database-name] Check which port and socket mysql is listening on:: netstat -ln | grep mysql Log into the MySQL server (see commands in the next section):: mysql -u[user-name] -p[password] From the MySQL prompt ^^^^^^^^^^^^^^^^^^^^^ Create a new user and grant privileges to that user:: mysql> create user [user]@[ip] identified by "[password]"; mysql> grant all privileges on *.* TO [user]@[ip] identified by "[password]"; mysql> flush privileges; A simple select with optional ordering and limiting:: select * from [table-name] order by [column-name] limit 10; Changing a model in the ``models.py`` file and running ``python manage.py syncdb`` doesn't update the database tables so sometimes it is necessary to add a new column to a database table:: mysql> alter table [table-name] add column [column-name] [type... example... is varchar(255)]; Remove a column:: alter table [table] drop [column]; Change the name of a preexisting column:: alter table [table-name] change [column-name] [new-column-name] [column properties... example is... int not null auto_increment primary key]; Finding duplicate entries in a single table column, this also gives the number of duplicates:: select count(*), [column-name] from [table-name] group by [column-name] having count(*)>1 order by count(*); Change a column value in specific rows:: update [table-name] set [column-name] = [new-vaule] where [column-name] = [value]; Doing a join across multiple tables:: select [selection-columns] from [table-name-1] as [tn1 (alias)] join [table-name-2] as [tn2 (alias)] on tn1.[join-on-column-name-1] = [tn2].[join-on-column-name-2] join [table-name-3] as [tn3 (alias] on tn2.[join-on-column-name-3] = tn3.[join-on-column-name-4] where tn3.[where-column -name]= [value]; Inserting into one table from another:: insert into [to-table] ([column-2], [column-3], [column-4], [column-5]) select [column-1], [column-2], [column-3], [column-4] FROM [from-table]; Inserting into one table from two others:: insert into [to-table] ([column-2], [column-3], [column-4], [column-5]) select [table-1].[column-name-1], [table-2].[column-name-1] from [table-1], [table-2] where [table-1].[column-name-2]=[table-2.column-name-2]; Creating a copy of a table:: create table [copy-table-name] select * from [original-table-name]; Copy a table from one database to another:: create table [to-DB].[to-table] LIKE [from-DB].[from-table]; insert into [to-DB].[to-table] select * from [from-DB].[from-table]; Show the indexes that a table contains. Indexes are used to speed up look-ups into a table:: show index from [table]; Add an auto-incrementing primary key:: alter table [table] add [column-name... usually 'id'] integer auto_increment not null primary key; Remove a primary key:: alter table [table] drop primary key; Delete all records in a table:: truncate table [table]; Find and replace:: update [table] set [column] = [new-value] where [column] = [old-value]; Explain, shows how a query is being executed:: explain [query]; .. _deploying: Deploying to a Server --------------------- Deployment of this instance of Django to our servers required the assistance of our external services department. They agreed to install an Apache HTTP server called `mod_wsgi `_ which has proven to work relatively well up until now with Django. If the code below confuses you as it does myself... here are some further notes on `Deploying Django `_. For our installation, scanning through the large ``httpd.conf`` file shows the following EBI specific deviations from the default:: # set the directory Options Indexes FollowSymLinks AllowOverride None Order allow,deny Allow from all # set the document root DocumentRoot "/nfs/research/huber/www/html" # set the server root ServerRoot "/ebi/www/ptest/servers/python_wsgi" # set an alias for the server specific error messages Alias /error/ "/var/www/error/" # indicate where the error logs are kept ErrorLog "|/usr/sbin/rotatelogs /ebi/www/ptest/servers/python_wsgi/logs/error_log.%Y-%m-%d-%H_%M_%S 86400" # imports the file described below Include conf.d_local/*.conf The entire contents of the ``wsgi.conf`` are shown below:: LoadModule wsgi_module modules_local/mod_wsgi.so WSGIDaemonProcess site-1 user=w3nobody group=w3group threads=25 WSGIProcessGroup site-1 WSGISocketPrefix run/wsgi # EBI specific Alias /huber-srv/cpipilot/media/ /nfs/research/huber/www/html/cpipilot/media/ # EBI specific Order deny,allow Allow from all # EBI specific (points to the file described below) WSGIScriptAlias /huber-srv/cpipilot /ebi/www/ptest/python/apache/django.wsgi # EBI specific Order deny,allow Allow from all LogLevel info The entire contents of ``django.wsgi`` are shown below:: import os, sys # for Django, MySQLdb and pagination sys.path.insert(1,'/ebi/www/ptest/python/site-packages') # the parent folder of the cpipilot project on our servers sys.path.append('/nfs/research/huber/www/html') # the folder that stores various other utils sys.path.append('/net/nas10b/vol1/homes/dmurrell/dmurrell/development/python/utils') os.environ['DJANGO_SETTINGS_MODULE'] = 'cpipilot.settings' import django.core.handlers.wsgi application = django.core.handlers.wsgi.WSGIHandler() Watching the error-log is very useful. Use the following terminal command to keep the last few entries to the error log showing in the terminal (linux/OSX):: tail -f [error.log] .. _intermediate: Intermediate Data Storage Format -------------------------------- In order to easily upload data to or save data from the online database for an individual experiment, it is useful to define an intermediate data format which captures the structure of the data as it's stored online so that scripts can operate on it locally during reanalisis. Below is a list of the comma seperated variable files that make up the intermediate format along with the header involved and some example values. Data.csv (links to the image or movie files): ================== ============ Heading Value ================== ============ Location LT0032_01/174 ReagentExternalID 41652 Vendor Ambion RawPath LT0032_01--ex2005_10_21--sp2005_05_05-tt17--c4/_movies/174--15--06--(9,12)--41652--AK2-gfp.avi RawType movie ViewPath LT0032_01--ex2005_10_21--sp2005_05_05-tt17--c4/_movies/174--15--06--(9,12)--41652--AK2-gfp.avi ViewType movie Reporter HIST1H2BJ-GFP Comments none ================== ============ rpMapping.csv (reagent/phenotype binary mapping): ================== ============ Heading Value ================== ============ ReagentExternalID 135516 Vendor Ambion MCY_0000001 0 MCY_0000002 1 MCY_0000003 0 MCY_0000004 0 ... ... ================== ============ rtMapping.csv (reagent/target mapping): ================== ============ Heading Value ================== ============ ReagentExternalID 41652 Vendor Ambion TargetExternalIDs ENST00000354858;ENST00000398192;ENST00000373449 GeneHGNC AK2 ================== ============ reagents.csv (list of reagents): ================== ============ Heading Value ================== ============ ExternalID 41652 Vendor Ambion SubmissionID MCO_0000001 Type dsRNA ForwardSequence GAUUGGCUGAAAACUUCUGTT ReverseSequence CAGAAGUUUUCAGCCAAUCTG ================== ============ expinf.csv (experiment information): ====================== ============ Heading Value ====================== ============ Investigation Title Mitocheck primary screen Experiment Description Genome-wide screen for chromosome segregation defects using siRNAs in Hela cells. Persons Last Name Heriche Persons Mid Initials . Persons First Name Jean-Karim Persons Email jeankarim.heriche@embl-heidelberg.de Persons Phone 49 6221 387 8860 URL www.mitocheck.org Data Root Raw MitocheckScreen/source/movies/ Data Root View MitocheckScreen/source/movies/ Target Database . Phenotypes File phenotypes.csv Reagents File reagents.csv rtMapping File rtMapping.csv rpMapping File rpMapping.csv Data File data.csv ====================== ============ phenotypes.csv (phenotype descriptions): ================== ============ Heading Value ================== ============ SubmissionID MCY_0000001 Description nuclei stay close together AnnotationType manual ================== ============