Gene Identification
The "Gene Identification" web tool identifies all transcription factor binding sites (TFBS) of a preselected transcription factor (TF) in all Arabidopsis thaliana genes. To use the tool, the user selects a specific TF (Factor-Name) from a list of all annotated TFs. To facilitate selection, the TF family can be selected first, which restricts the selectable factors to the members of that family.
By default, the region searched for each gene extends from 500 bp upstream (-500) to 50 bp downstream (+50) of either the transcription start site or the translation start site, depending on the annotation. These parameters can be changed: a window of up to 6000 bp, at most 2000 bp upstream and 4000 bp downstream, can be selected around either start site. For TFs whose binding sites are determined with positional weight matrices, the minimal threshold can be increased to detect only genes with highly conserved TFBS.
The results can be displayed in two sort modes. "Gene", the default mode, lists the results according to the genome identifier (AGI); "Distance" sorts the results according to the distance of the TFBS to the start site of the gene. The results are displayed in tables below the user interface. These tables identify the gene, the positions of the TFBS and, if applicable, the individual score of each TFBS, as well as the orientation of each TFBS relative to the start site of the gene. The tables also link to the gene and to the genomic positions of the sites.
For some TFs the number of sites to be searched had to be restricted. This applies to thirteen TFs with more than 200,000 putative binding sites. In these cases the score used for screening is displayed in a "table of restriction scores", which can be accessed through a link on the user interface.
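The window limits, the optional score threshold and the two sort modes described above can be illustrated with a minimal Python sketch. It is not the tool's implementation: the `Hit` record, the function names and the tie-breaking rules are assumptions made for illustration, and "Distance" is interpreted here as the absolute distance to the start site. Only the numeric limits (-500/+50 default, 2000 bp upstream and 4000 bp downstream maximum) come from the description.

```python
from dataclasses import dataclass
from typing import Optional

# Limits taken from the description; the constant names are hypothetical.
DEFAULT_UPSTREAM = 500
DEFAULT_DOWNSTREAM = 50
MAX_UPSTREAM = 2000
MAX_DOWNSTREAM = 4000


@dataclass(frozen=True)
class Hit:
    """One binding site (hypothetical record layout)."""
    agi: str                # gene identifier, e.g. "AT1G00001"
    position: int           # relative to the start site; negative = upstream
    strand: str             # orientation relative to the gene, "+" or "-"
    score: Optional[float]  # None for sites found by pattern search


def check_window(upstream=DEFAULT_UPSTREAM, downstream=DEFAULT_DOWNSTREAM):
    """Validate the search window and return it as (start, end) offsets."""
    if not 0 <= upstream <= MAX_UPSTREAM:
        raise ValueError(f"upstream must be between 0 and {MAX_UPSTREAM} bp")
    if not 0 <= downstream <= MAX_DOWNSTREAM:
        raise ValueError(f"downstream must be between 0 and {MAX_DOWNSTREAM} bp")
    return -upstream, downstream


def select_hits(hits, window, min_score=None, sort_mode="Gene"):
    """Keep hits inside the window and above the threshold, then sort them."""
    start, end = window
    kept = [
        h for h in hits
        if start <= h.position <= end
        # Sites without a matrix score cannot be filtered by score.
        and (min_score is None or h.score is None or h.score >= min_score)
    ]
    if sort_mode == "Gene":
        return sorted(kept, key=lambda h: (h.agi, h.position))
    if sort_mode == "Distance":
        return sorted(kept, key=lambda h: (abs(h.position), h.agi))
    raise ValueError(f"unknown sort mode: {sort_mode!r}")
```

Sites without a score are passed through the threshold unchanged in this sketch, on the assumption that a score filter is meaningful only for matrix-based sites.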
Because the TFBS for TBP (TATA box) and CBF (CAAT box) are also positionally defined, the score restrictions do not apply to them. Furthermore, a score restriction cannot be applied to sites determined by pattern search or to combinatorial elements.
For further processing of the results, the binding sites detected around annotated genes can be downloaded as a file containing all sites detected for the selected TF between 2000 bp upstream and 4000 bp downstream of each gene. On the result page, genes potentially regulated by small RNA and miRNA are shown in italics and bold, respectively. Selecting "exclude genes putatively regulated by smallRNA" or "exclude genes putatively regulated by miRNA" excludes these genes from the analysis.
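A downloaded site file could be post-processed along the following lines. The description does not specify the file layout, so the tab-separated format and the column names (`agi`, `position`, `strand`) are purely hypothetical, as are the function names. The gene sets passed to the exclusion step stand in for whatever list of small RNA- or miRNA-regulated genes the user has at hand.

```python
import csv
import io


def load_sites(handle):
    """Read sites from a tab-separated file with a header row.

    The column names are an assumed layout, not the tool's documented format.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        {
            "agi": row["agi"],
            "position": int(row["position"]),
            "strand": row["strand"],
        }
        for row in reader
    ]


def exclude_regulated(sites, small_rna_genes=frozenset(), mirna_genes=frozenset()):
    """Drop sites of genes flagged as putatively regulated by small RNA or miRNA."""
    blocked = set(small_rna_genes) | set(mirna_genes)
    return [site for site in sites if site["agi"] not in blocked]


# Example with an in-memory file; a real download would be opened with open().
example = io.StringIO(
    "agi\tposition\tstrand\n"
    "AT1G00001\t-120\t+\n"
    "AT1G00002\t300\t-\n"
    "AT1G00003\t-1500\t+\n"
)
sites = load_sites(example)
remaining = exclude_regulated(sites, mirna_genes={"AT1G00002"})
```

This mirrors the two exclusion options on the user interface, but applies them after download rather than during the search.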

