.. highlight:: rest

.. _developer-guide:

================
Developer Guide
================

This section is intended to help anyone contributing to, or taking over, the CPI Pilot project. Hopefully the pilot will meet expectations and acquire further funding, allowing it to move into the development stage.

New developers should first read the :ref:`user-guide` and :ref:`admin-guide` to familiarise themselves with the project, as this section builds on both.

.. _conventions:

CPI Pilot conventions
=====================

Online Documentation
--------------------

This documentation is written using `Sphinx `_, a Python documentation generator. All modules, classes and functions are documented using docstrings that follow the conventions described in `PEP 257 Docstring Conventions `_. A Sphinx extension, `sphinx.ext.autodoc `_, pulls in the docstrings from the modules, classes and functions and processes any Sphinx markup found within them. This keeps the documentation in a single location and avoids duplication. It also means that any code change can be followed by an immediate documentation update in the same place as the code change.

Coding Style
------------

Coding follows the `PEP 8 Style Guide for Python Code `_.

.. _django-project:

The Django Project
==================

What is `Django `_?
--------------------------------------------------

There is no need to repeat here what is answered many times over elsewhere. In general, whenever a topic is well covered in the official documentation, it is referred to rather than repeated here.

Django is a high-level `Python `_ web framework that allows rapid web development. Django has extensive `official documentation `_ and a large and growing amount of community support. Webmonkey provides a good guide for `getting started with django `_.

Why use Django?
---------------

For someone with little to no web-development experience like myself, Django seemed very well documented, and its decoupled nature (model, view, controller framework) appealed to my coding sense. My group leader, Wolfgang Huber, also showed interest in using a web framework based on `Python `_, which is object-oriented and well regarded for supporting maintainable code.

Project Layout
--------------

The CPI Pilot project folder (``cpipilot`` - `downloadable here `_) is simply a collection of settings that this instance of Django needs in order to run. The settings include the database configuration and indicate which applications the project uses. These configuration details are stored in a ``settings.py`` module at the root level of the ``cpipilot`` folder. Along with the project configuration, a description of where the browser's URL requests are directed is stored in the ``urls.py`` module at the root level of the ``cpipilot`` folder.

The CPI Pilot project folder includes a custom-built application, ``repository``, which does most of the work. The ``repository`` application consists of four main modules and follows the standard Django application layout. These are summarised below and described in more detail a little further down:

* **The Models** - the ``models.py`` module describes models (Python classes) for every table in the database, together with how these models are related to one another in 1-1, many-1, 1-many or many-many relationships. Each model in the ``models.py`` module describes a single database table.
* **The URLs** - a list of patterns in the ``urls.py`` module indicates which view function should be called when the browser is directed to a specific URL.
* **The Views** - each view (Python function) in the ``views.py`` module is responsible for doing some 'logic' behind the scenes and supplying an :ref:`HTML template ` with a context of variables that the template uses when generating the end user's HTML.
* **The Admin** - the ``admin.py`` module registers models with the built-in `Django admin `_ interface, which was designed to allow a user (in this case a super-user only) to view or modify database entries easily through a web-based interface without having to resort to SQL queries or the Django object-relational mapper (ORM).

Another custom-built application, ``feedback``, was used for a while but has now been taken offline. It was intended for anonymous user feedback. The feedback application was no longer required once the project had been posted on the web-based source code repository, `SourceForge `_, which includes these features at a higher level of detail. The ``feedback`` application may be modified for other purposes, so it has been left inside the project folder.

The project uses an external application called ``pagination`` to split long lists into sublists of objects that can be viewed on a single page. ``pagination`` also takes care of the navigation between these pages. The version of the ``pagination`` application currently in use is actually a hybrid formed from two external pagination applications. The HTML templates from one were preferred, so they were incorporated into the other.

A summarised, high-level description of how the framework does its business follows below:

* the end user points their browser at a particular URL.
* this URL is matched to one of the URL patterns in the ``urls.py`` file and, if need be, variables are extracted from the URL.
* the matched URL pattern calls a view function, passing in the relevant parameters.
* the view function fills some data structures (usually by querying the database) and passes these to an HTML template.
* the HTML template is filled in and generates the HTML that is passed back to the user's browser.

For a more in-depth description of how a user's browser request is handled, see James Bennett's `blog post `_ on how Django processes a request.

Module Descriptions
-------------------

The Configuration Module
^^^^^^^^^^^^^^^^^^^^^^^^

Please consult the full list of the available `Django settings `_ for a deeper understanding of this section.

Some of the more important parts of the configuration module for the CPI Pilot, ``settings.py``, are explained below::

    import os.path
    PROJECT_DIR = os.path.dirname(__file__)

``PROJECT_DIR`` now contains the full path of your project folder and can be used elsewhere in the ``settings.py`` module, so that your project may be moved around the system without you having to worry about changing any troublesome hard-coded paths.

::

    DEBUG = True

turns on debug mode, allowing the browser user to see project settings and temporary variables.

::

    ADMINS = (
        ('Daniel Murrell', 'daniel.murrell@ebi.ac.uk'),
    )

sends all errors from the production server (that is, when ``DEBUG`` is ``False``) to the admin's email address. Note the trailing comma, which makes ``ADMINS`` a tuple of (name, email) pairs rather than a single pair.

::

    DATABASE_ENGINE = 'mysql'
    DATABASE_NAME = 'dev'
    DATABASE_USER = 'user'
    DATABASE_PASSWORD = 'password'
    DATABASE_HOST = 'mysql-cpipilot'
    DATABASE_PORT = '4199'

sets up the options required for Django to connect to your database.

::

    MEDIA_ROOT = os.path.join(PROJECT_DIR, 'media')

tells Django where to find your media files, such as images that the :ref:`HTML templates ` might use.

::

    ROOT_URLCONF = 'cpipilot.urls'

tells Django to start finding URL matches in the ``urls.py`` module in the ``cpipilot`` project folder.

::

    TEMPLATE_DIRS = (
        os.path.join(PROJECT_DIR, 'templates'),
    )

tells Django where to find your HTML template files.

::

    INSTALLED_APPS = (
        'django.contrib.auth',
        'django.contrib.contenttypes',
        'django.contrib.sessions',
        'django.contrib.sites',
        'django.contrib.admin',
        'cpipilot.repository',
        'cpipilot.feedback',
        'pagination',
    )

tells Django which applications (custom and external) to use in your project. The custom applications, ``cpipilot.repository`` and ``cpipilot.feedback``, are stored in the project folder. Along with these custom applications, the project uses several base Django applications (the ``django.contrib.*`` entries above), which support the built-in `admin server `_, as well as the aforementioned ``pagination`` application.

.. _models-description:

The Models Module
^^^^^^^^^^^^^^^^^

Please consult the official documentation on `Models `_ for a deeper understanding of this section.

Each model defined in :ref:`models.py ` corresponds to exactly one table in the database. There are essentially two types of models: models that describe physical entities in the experiments, and models that form relationships between these entity models (at the moment, exclusively many-to-many relationships).

Entity models, together with very brief descriptions, are listed below:

* :ref:`Experiment-model` - a typical high throughput experiment.
* :ref:`Person-model` - a person.
* :ref:`Vendor-model` - a reagent vendor.
* :ref:`Gene-model` - a gene.
* :ref:`Image-model` - a link to an image/movie on the hard disk.
* :ref:`Reagent-model` - a reagent used in the experiment.
* :ref:`Mapping-model` - a mapping of reagent/target associations.
* :ref:`Target-model` - the target of a reagent.
* :ref:`ImageSet-model` - a set of images.
* :ref:`Phenotype-model` - an experiment-defined phenotype.
* :ref:`DB-model` - an external database.

For more in-depth descriptions of these entities, please see :ref:`definitions` or follow the definition link in the individual model descriptions linked to above.

Relationship models include the following:

* :ref:`ExperimentRelPerson-model`
* :ref:`ReagentRelExperiment-model`
* :ref:`ReagentRelPhenotype-model`
* :ref:`RtLink-model`
* :ref:`RtLinkRelMapping-model`

The URLs modules
^^^^^^^^^^^^^^^^

The URL patterns defined for the CPI Pilot project are divided into patterns specific to the project and patterns specific to the applications. For more information on how the pattern matching syntax works or how to write your own URL patterns, please consult Django's `URL Dispatcher `_ documentation.

Project specific URL patterns
"""""""""""""""""""""""""""""

The URL patterns specific to the project are applied in the ``urls.py`` file that is stored in the project directory ``cpipilot``. The code segments that add these URL patterns are short and are shown below::

    urlpatterns = patterns('',
        (r'^debug/$', views.debug),
        (r'^admin/(.*)', admin.site.root),
        (r'^repository/', include('cpipilot.repository.urls')),
    )

In order, these patterns achieve the following:

#. If the site URL (e.g. http://wwwdev.ebi.ac.uk/huber-srv/cpipilot/) is followed by a ``debug`` URL extension, then the debug function in the project's views module is called.
#. If the site URL is followed by an ``admin`` extension, then Django's admin system is invoked.
#. If the site URL is followed by a ``repository`` extension, then the URL patterns for the repository application are included. In effect, this matches any additional URL extensions against the URL patterns defined in the ``urls.py`` file of the repository application.

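The effect of ``include`` in the third pattern can be sketched with the standard ``re`` module alone. This is a simplified illustration, not Django's actual resolver: the ``resolve`` helper, the nested-list representation of an included module and the string view names are all hypothetical.

```python
import re

# Hypothetical stand-ins for two of the repository application's patterns.
REPOSITORY_PATTERNS = [
    (r'^search/$', 'search'),
    (r'^geneSingle/(\d+)/$', 'geneSingle'),
]

# Project-level patterns: a target is either a view name or a nested
# pattern list (the nested list plays the role of include()).
PROJECT_PATTERNS = [
    (r'^debug/$', 'debug'),
    (r'^repository/', REPOSITORY_PATTERNS),
]


def resolve(path, patterns=PROJECT_PATTERNS):
    """Return (view_name, captured_args) for the first match, or None."""
    for regex, target in patterns:
        match = re.search(regex, path)
        if match is None:
            continue
        if isinstance(target, list):
            # Included patterns only see what is left after the prefix.
            nested = resolve(path[match.end():], target)
            if nested is not None:
                return nested
            continue
        return target, match.groups()
    return None


print(resolve('repository/geneSingle/42/'))
```

The key point is that the included patterns never see the ``repository/`` prefix, which is why the application's own patterns start directly with ``^search/`` or ``^geneSingle/``.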
The project ``urls.py`` file also adds the following patterns::

    urlpatterns += patterns('django.views.generic.simple',
        url(r'^$', 'direct_to_template', {'template': 'home.html'}, name="root"),
        url(r'^home/$', 'direct_to_template', {'template': 'home.html'}, name="home"),
        url(r'^downloads/$', 'direct_to_template', {'template': 'downloads.html'}, name="downloads"),
        url(r'^contact/$', 'direct_to_template', {'template': 'contact.html'}, name="contact"),
    )

These are project-specific 'menu' type URL patterns that use Django's `generic views `_ to return specific HTML templates without going through a custom view function. View functions are not needed here because no variables need to be passed through to the HTML templates in these cases.

Repository specific URL patterns
""""""""""""""""""""""""""""""""

The URL patterns specific to the repository application are applied in the ``/repository/urls.py`` file in the repository application folder. The code segment that adds these URL patterns is also short and is shown below::

    urlpatterns = patterns('',
        # search pattern
        url(r'^search/$', views.search, name="search"),
        # gene specific patterns
        url(r'^geneSingle/(\d+)/$', views.geneSingle, name="geneSingle"),
        # target specific patterns
        url(r'^targetSingle/(\d+)/$', views.targetSingle, name="targetSingle"),
        # reagent specific patterns
        url(r'^reagentSingle/(\d+)/$', views.reagentSingle, name="reagentSingle"),
        # imageSet specific patterns
        url(r'^imageSetSingle/(\d+)/$', views.imageSetSingle, name="imageSetSingle"),
        # experiment specific patterns
        url(r'^experimentSingle/(\d+)/$', views.experimentSingle, name="experimentSingle"),
        url(r'^experimentsAll/$', views.experimentsAll, name="experimentsAll"),
        url(r'^experimentDownload/(\d+)$', views.experimentDownload, name="experimentDownload"),
        # phenotype specific patterns
        url(r'^phenotypeToGene/(\d+)/$', views.phenotypeToGene, name="phenotypeToGene"),
        url(r'^phenotypeToReagent/(\d+)/$', views.phenotypeToReagent, name="phenotypeToReagent"),
        url(r'^phenotypeGeneLinkage/(\d+)/(\d+)$', views.phenotypeGeneLinkage, name="phenotypeGeneLinkage"),
    )

The :ref:`search-view` function uses GET data from an HTML form, but in all the other URL patterns an ``id`` parameter for the view function is extracted from the URL. This ``id`` is used to find the relevant database entity by matching it against the automatically incrementing ``id`` column that Django creates in each of its database tables when no primary key is defined in the corresponding model.

As an example, given the full URL ``http://wwwdev.ebi.ac.uk/huber-srv/cpipilot/repository/phenotypeGeneLinkage/89/8496``, the ``repository`` URL extension to the site URL would include the URL patterns of the repository application, and the additional ``phenotypeGeneLinkage/89/8496`` URL extension would be matched by the last pattern in the code shown above. The ``89`` would be extracted and supplied as the first parameter of the :ref:`phenotypeGeneLinkage-view` function, ``phenotypeID``, and the ``8496`` would be extracted and supplied as the second parameter, ``geneID``. The :ref:`phenotypeGeneLinkage-view` can then use these IDs to find all data obtained using a reagent that targets one of the gene's transcripts and that exhibits the phenotype.

The Views module
^^^^^^^^^^^^^^^^

Please consult the official documentation on `Writing Views `_ for a deeper understanding of this section.

The functions defined in :ref:`views.py ` represent the logic behind the web pages. The view functions (called through the URL matching) decide which data structures need to be constructed and sent through to the HTML templates. To do this, each view function uses Django's object-relational mapper (ORM) to query the database, picking out what is needed for any particular page.

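The path from a matched URL to rendered HTML can be sketched without Django, using only the standard library. Everything below is illustrative and assumed: the in-memory ``LINKAGES`` dictionary stands in for ORM query results, ``string.Template`` stands in for an HTML template, and the function bodies are not the code found in ``views.py``.

```python
import re
from string import Template

# Hypothetical data standing in for the database,
# keyed by (phenotypeID, geneID).
LINKAGES = {
    (89, 8496): ['reagent-A', 'reagent-B'],
}

# Stand-in for an HTML template that is filled from a context.
PAGE = Template('<p>Phenotype $phenotype, gene $gene: $count reagent(s)</p>')


def phenotype_gene_linkage(phenotype_id, gene_id):
    """Look up the data, build a context and fill the template."""
    reagents = LINKAGES.get((int(phenotype_id), int(gene_id)), [])
    context = {
        'phenotype': phenotype_id,
        'gene': gene_id,
        'count': len(reagents),
    }
    return PAGE.substitute(context)


def handle(path):
    """Match the path and pass the captured groups on to the view."""
    match = re.search(r'^phenotypeGeneLinkage/(\d+)/(\d+)$', path)
    if match is None:
        return None
    return phenotype_gene_linkage(*match.groups())


print(handle('phenotypeGeneLinkage/89/8496'))
```

Note that the captured groups arrive as strings, so the sketch converts them to integers before using them as lookup keys.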
The Admin Module
^^^^^^^^^^^^^^^^^

Please consult the official documentation on the `Django admin site `_ for a deeper understanding of this section.

The classes defined in :ref:`admin.py ` tell Django which attributes are visible and modifiable from the admin site.

.. _database-design:

Database Design
================

The current database schema is shown below:

.. image:: _static/schema.png
   :width: 1000

This schema shows the model attributes and how the models relate to each other. A line with a single arrow-head indicates a ``ForeignKey`` or Many-to-One relationship between two objects. As an example, an imageSet belongs to only one experiment, but an experiment can contain multiple imageSets. A line with arrow-heads on each end indicates a ``ManyToManyField`` or Many-to-Many relationship between two objects. To give an associated example, a reagent can belong to multiple experiments and an experiment can make use of multiple reagents.

In these Many-to-Many relationships, a relation or ``Rel`` model stores each individual One-to-One pairing on its own row. With the exception of RtLink, models of this type are named after the two models they relate, with ``Rel`` placed between the two names. This way, data that pertains to each individual pairing can be stored alongside it. As an example, the ``ReagentRelPhenotype`` model indicates which reagent relates to which phenotype and contains extra data about this relationship, such as the reproducibility of the phenotype, given by the number of phenotype hits over the number of replicates done with this reagent.

A more complicated example of extra data about a relationship can be seen in the ``RtLink`` model. This model includes information about the relationship between any specific *reagent/target relationship* and the mapping that the *reagent/target relationship* belongs to. This extra relationship is itself of the Many-to-Many form.
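
The idea of a ``Rel`` model that carries extra data about each pairing can be sketched with plain Python classes. The field names ``hits`` and ``replicates`` and the helper ``phenotypes_for_reagent`` are assumptions chosen for illustration and may not match the attributes in ``models.py``.

```python
from dataclasses import dataclass


@dataclass
class ReagentRelPhenotype:
    """One row per reagent/phenotype pairing, plus data about that pairing."""
    reagent_id: int
    phenotype_id: int
    hits: int        # assumed: replicates that showed the phenotype
    replicates: int  # assumed: replicates done with this reagent

    def reproducibility(self):
        # Phenotype hits over the number of replicates.
        if self.replicates == 0:
            return 0.0
        return self.hits / self.replicates


# Three rows express a many-to-many relationship: reagent 1 relates to
# two phenotypes, and phenotype 10 relates to two reagents.
rows = [
    ReagentRelPhenotype(reagent_id=1, phenotype_id=10, hits=3, replicates=4),
    ReagentRelPhenotype(reagent_id=1, phenotype_id=11, hits=1, replicates=4),
    ReagentRelPhenotype(reagent_id=2, phenotype_id=10, hits=4, replicates=4),
]


def phenotypes_for_reagent(reagent_id):
    """Follow the relationship from a reagent to its phenotypes."""
    return [row.phenotype_id for row in rows if row.reagent_id == reagent_id]


print(phenotypes_for_reagent(1), rows[0].reproducibility())
```

Because each pairing has its own row, per-pairing values such as the reproducibility have a natural place to live, which a bare many-to-many link would not provide.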