12.07.2015 Views

View - ResearchGate


first exon in the Ensembl database to be the transcription start site. This approach was determined to be acceptable for in silico genome-wide location analysis (22).

In addition to the promoter sequence for each gene, the PAINT promoter database also contains the cross-reference tables that enable retrieval of promoters using Entrez Gene, the cDNA clone ID, and GenBank accession number. This cross-reference was constructed using information from the UniGene database. This allows for convenient retrieval of the promoter sequences directly from a list of genes marked as significantly varying in expression by any microarray analysis software or other gene-expression analysis methods. The PAINT promoter database is periodically updated when a new version of an annotated genome database is released.

2.3. The Upstreamer Module

The input from the user is a list of identifiers for the genes of interest and the number of base pairs of the upstream sequence needed for analysis. The length count is from the start of the gene toward the upstream (5′) end. The identifier list and parameters are used to query UpstreamDB. The output of the module is the actual genomic upstream sequences of specified length, for the genes that are on the user's list and referenced in the UpstreamDB database. The output is in FAST-ALL (FASTA) format (source: http://www.ebi.ac.uk/fasta/) for further processing by transcription binding motif inspection/discovery software in the TFRetriever module.

The TFRetriever module is envisaged to contain several submodules that can communicate with various local and web-based motif inspection and discovery software such as MATCH (TRANSFAC Public) (23), MatInspector (24), and MEME (25), and so on. A motif is a characteristic sequence of a binding site, and functionally similar motifs are grouped together into families. PAINT 3.5 currently contains only the submodule for interacting with MATCH software.
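The Upstreamer step described above (look up each gene, take the requested number of bases immediately upstream of the gene start, emit FASTA) can be illustrated with a minimal sketch. This is not PAINT's code: the in-memory dictionary `UPSTREAM_DB`, the function names, the gene identifiers, and the FASTA header layout are all hypothetical stand-ins, and the sketch assumes each stored sequence is written 5′ to 3′ and ends at the gene start.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical stand-in for UpstreamDB: gene identifier -> upstream sequence,
# written 5'->3' and assumed to end at the gene start.
UPSTREAM_DB: Dict[str, str] = {
    "geneA": "ACGTACGTAC",
    "geneB": "TTTTGGGGCC",
}


def upstream_records(
    gene_ids: Iterable[str],
    length: int,
    db: Dict[str, str] = UPSTREAM_DB,
) -> List[Tuple[str, str]]:
    """Return (gene_id, upstream_sequence) pairs for genes found in db.

    The length is counted from the gene start toward the 5' end, so the
    last `length` bases of the stored sequence are kept.
    """
    records = []
    for gene_id in gene_ids:
        seq = db.get(gene_id)
        if seq is None:
            continue  # gene is not referenced in the database; skip it
        # Guard length <= 0, because seq[-0:] would return the whole string.
        records.append((gene_id, seq[-length:] if length > 0 else ""))
    return records


def to_fasta(records: Iterable[Tuple[str, str]], width: int = 60) -> str:
    """Format records as FASTA text, wrapping sequence lines at `width`."""
    lines = []
    for gene_id, seq in records:
        lines.append(f">{gene_id} upstream_{len(seq)}bp")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    print(to_fasta(upstream_records(["geneA", "unknownGene", "geneB"], 4)), end="")
```

Genes missing from the lookup table are skipped silently here, mirroring the statement that output is produced only for genes that are both on the user's list and referenced in the database; a real tool would probably report them.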
The set of vertebrate TF families is utilized for promoter inspection. The output of the TFRetriever module is the output from the motif discovery program for each input sequence list. TFRetriever runs MATCH with settings to minimize false positives or to minimize the sum of false positives and false negatives. However, users can filter the results further by specifying a threshold on the "core similarity" and choosing whether or not the Transcriptional Regulatory Element (TRE) occurrences on the complementary sequence are to be considered in further analysis.

The FeasnetBuilder module processes and filters the output from MATCH to construct an interaction matrix (hereinafter termed "Feasnet") representing a candidate set of connections in the regulatory network based on the promoter sequence and TF/TRE information. The columns of the interaction matrix correspond to the TREs and each row corresponds to a gene from the input list. If the parameter for binary counting is set in PAINT, the regulation of a gene is
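As a rough illustration of the filtering and matrix layout described for TFRetriever and FeasnetBuilder, the sketch below filters motif hits by a core-similarity threshold and by strand, then arranges them with one row per gene and one column per TRE. It is only a sketch under stated assumptions: the `Hit` record, the function `build_feasnet`, and the example gene and TRE names are hypothetical, the real MATCH output format is not parsed, and storing the number of surviving hits in each cell is an assumption made here rather than a statement of how PAINT fills the matrix.

```python
from collections import Counter
from typing import List, NamedTuple, Sequence, Tuple


class Hit(NamedTuple):
    """Hypothetical, simplified record of one motif hit on a promoter."""
    gene: str
    tre: str
    strand: str             # "+" for the given strand, "-" for the complement
    core_similarity: float


def build_feasnet(
    hits: Sequence[Hit],
    genes: Sequence[str],
    min_core: float = 0.0,
    include_complement: bool = True,
) -> Tuple[List[str], List[List[int]]]:
    """Return (tre_names, matrix); matrix[i][j] counts hits of TRE j on gene i.

    Hits below the core-similarity threshold are dropped, as are hits on the
    complementary strand when include_complement is False.
    """
    kept = [
        h for h in hits
        if h.core_similarity >= min_core
        and (include_complement or h.strand == "+")
    ]
    counts = Counter((h.gene, h.tre) for h in kept)
    tres = sorted({h.tre for h in kept})          # columns: TREs
    matrix = [[counts[(g, t)] for t in tres] for g in genes]  # rows: genes
    return tres, matrix


if __name__ == "__main__":
    example_hits = [
        Hit("g1", "AP1", "+", 0.95),
        Hit("g1", "AP1", "-", 0.99),
        Hit("g1", "SP1", "+", 0.80),
        Hit("g2", "SP1", "+", 1.00),
    ]
    tres, matrix = build_feasnet(example_hits, ["g1", "g2"], min_core=0.9,
                                 include_complement=False)
    print(tres)
    for gene, row in zip(["g1", "g2"], matrix):
        print(gene, row)
```

Raising `min_core` or excluding the complementary strand shrinks the candidate network, which is the kind of user-side filtering the text describes.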
