2. Starting GenomeComp

Chapter Contents
2.1 Running GenomeComp
2.2 Configuration Window
2.3 Main Window
2.3.1 Overview
2.3.1.1 Color representation
2.3.1.2 Fixed length and color
2.3.1.3 Miscellaneous
2.3.1.3.1 Figure scale
2.3.1.3.2 Length filter
2.3.1.3.3 Island color
2.3.1.3.4 ORFs color
2.3.2 The File Menu
2.3.2.1 Reset
2.3.2.2 Start
2.3.2.3 Save setting
2.3.2.4 Load DEFAULT
2.3.2.5 Save
2.3.2.6 Exit
2.3.3 The Options Menu
2.3.3.1 Configure
2.3.3.2 Compare type
2.3.3.2.1 Self-compare
2.3.3.2.2 Two sequences
2.3.3.2.3 Three sequences
2.3.3.3 Input format
2.3.3.3.1 Fasta format sequence
2.3.3.3.2 Genbank format sequence
2.3.3.3.3 EMBL format sequence
2.3.3.3.4 BLAST output file
2.3.3.4 Anchor style
2.3.3.4.1 Left
2.3.3.4.2 Center
2.3.3.4.3 Right
2.3.3.4.4 Custom
2.3.4 The Search Menu
2.3.4.1 Find ORF
2.3.4.2 Find Next
2.3.4.3 Go to Position
2.3.5 The List Menu
2.3.5.1 Length setting
2.3.5.1.1 All length
2.3.5.1.2 Above 50 bp
2.3.5.1.3 Above 1 Kbp
2.3.5.1.4 Above 5 Kbp
2.3.5.1.5 Custom
2.3.5.2 Specific regions
2.3.6 The Hide Menu
2.3.6.1 Nothing
2.3.6.2 Control panel (left)
2.3.6.3 Display panel (right)
2.3.7 The View Menu
2.3.7.1 Title
2.3.7.2 ORFs
2.3.7.3 ORFs text
2.3.7.4 Ruler
2.3.7.5 Ruler text
2.3.7.6 Island bar
2.3.8 The Help Menu
2.3.8.1 Topic
2.3.8.2 About
2.4 Some Pop-up Windows
2.4.1 Confirm sequence name and length, Select program and parameters Window
2.4.2 The Detail Information Window
2.4.3 Specific regions list Window
2.4.4 The Save Window

 

Running GenomeComp

On Unix/Linux systems, run GenomeComp by executing the main program, "GenomeComp1.3_XXX", with its path on the command line:

> /your_install_path/GenomeComp1.3/GenomeComp1.3_XXX

If the main program does not work, you can run the source code of the program with Perl instead:

> perl /your_install_path/GenomeComp1.3/GenomeComp1.3

If it still does not work, reread the previous part of this manual carefully and make sure that every system requirement is met. If you use the binaries, make sure you have downloaded the package that suits your operating system.

On Windows systems with Perl and the Tk module installed correctly, GenomeComp can be started by double-clicking the "GenomeComp1.3.pl" icon or a shortcut to it.

If all goes well, you will be presented with the GenomeComp Configuration Window and Main Window.

Configuration Window

The configuration window is presented first when GenomeComp runs. It lets you locate some external programs on your system and define the working directory and project name for the comparison that follows.

GenomeComp needs to run the external program BLAST (Altschul et al., 1997) to perform the comparison if the Input format is Fasta, Genbank or EMBL. Recent versions of the stand-alone BLAST programs (binaries for different systems) are available on the anonymous FTP server of NCBI; the package contains all three programs, 'blastall', 'megablast' and 'formatdb'. If an entry is blank, GenomeComp could not find that program automatically in your system path, so you will have to locate it manually by clicking the "Browse..." button.
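Before starting GenomeComp, it can help to check whether the three BLAST programs are on the system path, since that is where GenomeComp looks for them. The following is a minimal Python sketch of such a check. It is not part of GenomeComp; only the three program names come from this manual, and the function name `find_blast_programs` is made up for illustration.

```python
import shutil

# The three program names are taken from the manual; the rest is illustrative.
BLAST_PROGRAMS = ("blastall", "megablast", "formatdb")


def find_blast_programs(names=BLAST_PROGRAMS):
    """Map each program name to its full path, or to None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


# Report which programs would have to be located manually with "Browse...".
for name, path in find_blast_programs().items():
    print(f"{name}: {path if path else 'not found on PATH'}")
```

Any program reported as not found corresponds to an entry that would probably be blank in the configuration window.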

The "Working environment" options in the configuration window define the name and location of GenomeComp's temporary files, as well as the default name and path for saving the result files.

Click the "OK" button when you have finished your configuration. The settings will then be saved and the configuration window will be hidden automatically.

Main Window

Overview

The menu bar for GenomeComp is at the top of the main window (described later on this page). The main window is divided into two main parts, the control panel and the display panel. The control panel is the left part of the main window and contains some custom configuration options; the display panel is the right part. The display panel itself has two components: the top canvas shows the dynamic comparison result, and the bottom message window displays brief information about the items selected in the canvas.

At the top of the control panel are three input entries for locating the input files. The number of active entries changes automatically according to the Compare type and Input format selected in the Options menu.

In the middle of the control panel are three groups of Figure options in a sunken frame. These settings determine how the dynamic comparison result is shown in the top canvas of the display panel.

  • Color representation : It allows users to assign different colors to different length ranges of comparison matches. Both the length ranges and the colors can be user defined. Change the values in the entries directly to customize the length ranges, or click the "Set..." buttons to select another color.

  • Fixed length and color : It allows users to define specific colors for matches of certain fixed lengths (not a range). It might be useful when users are especially interested in matches of particular lengths.

  • Miscellaneous : It contains some unclassified but important options.

      • Figure scale : A scale bar that can be adjusted by dragging with the mouse to set a value from 5 to 500. Set a small scale for a detailed graphical comparison result and a large scale for a global view. If the scale is larger than 20, the names of ORFs will not be displayed. The comparison figure in the canvas can now be zoomed easily with this scale.

      • Length filter : A grooved frame for setting the range of match lengths to be displayed in the canvas. The first entry is the lower limit and the second is the upper limit (empty means infinite). After the comparison figure has been displayed in the canvas, users can change the limit values; the new setting takes effect only after clicking the 'Apply' button.

      • Island color : Click the color region to set the color that represents the unmatched regions in the sequences. The change takes effect immediately in the canvas.

      • ORFs color : Click the color region to set the color that represents the ORFs in the sequences (only when the Input format is Genbank or EMBL). The change takes effect immediately in the canvas.
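The way the Length filter and Color representation options act on a set of matches can be pictured with a short Python sketch. This is an illustration only, not GenomeComp's own code: the function names, the example ranges and colors, and the choice of inclusive limits are all assumptions. The manual states only that an empty upper limit means infinite; treating an empty lower limit as "no limit" is also an assumption.

```python
def passes_length_filter(length, lower=None, upper=None):
    """Return True if length lies within [lower, upper]; None means no limit."""
    if lower is not None and length < lower:
        return False
    if upper is not None and length > upper:
        return False
    return True


def color_for_length(length, ranges, default="gray"):
    """Return the color of the first (low, high, color) range that contains length."""
    for low, high, color in ranges:
        if low <= length and (high is None or length <= high):
            return color
    return default


# Hypothetical length ranges and colors, chosen only for this example.
EXAMPLE_RANGES = [(0, 99, "blue"), (100, 999, "green"), (1000, None, "red")]

match_lengths = [45, 250, 1200, 8000]
shown = [
    (n, color_for_length(n, EXAMPLE_RANGES))
    for n in match_lengths
    if passes_length_filter(n, lower=100, upper=5000)
]
print(shown)
```

With a lower limit of 100 and an upper limit of 5000, the matches of length 45 and 8000 are dropped, and the remaining two are colored by the range they fall into.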

At the bottom of the control panel are three function buttons, which are all equivalents of commands in the File menu and are there just for convenience.

The File Menu

  • Reset : Clear all user modifications in the GenomeComp main window and reset all options to their default values. This command helps users erase incorrect settings. It should be used carefully, since the operation is irreversible. Its equivalent is the button with the same name in the main window, and its keyboard shortcut is "Ctrl+R".

  • Start : Start a comparison project and display the result graphically and dynamically in the canvas at the top right of the main window. Its equivalent is the button with the same name in the main window, and its keyboard shortcut is "Ctrl+T".

  • Save setting : Save the current settings into the '.ibprc' file in the user's home directory, so that they are loaded automatically whenever GenomeComp runs.

  • Load DEFAULT : Replace all of the current settings with the program's preset options (the options in effect when the user first runs the program). This might be helpful when some settings have unexpectedly become invalid.

  • Save : Save the graphical comparison result in the canvas to a local file in PostScript format. For more details about this command, see the part The Save Window. Its keyboard shortcut is "Ctrl+S".

  • Exit : Exit from the GenomeComp program. Its equivalent is the button called "Quit" in the main window, and its keyboard shortcut is "Ctrl+X".

The Options Menu

  • Configure : Reopen the configuration window to view or edit the earlier settings. Its keyboard shortcut is "Ctrl+G".

  • Compare type : It's a cascade menu. Users should select one of the three choices according to their project.

      • Self-compare : Users give just one sequence as input, and GenomeComp performs a self-comparison of it. This command helps to discover structural features, such as repeat sequences, in the given sequence.

      • Two sequences : This is the default choice. Users input two sequences, which are compared with each other.

      • Three sequences : This is an extended function for multi-genome comparison. It allows users to input three sequences, and GenomeComp automatically performs the comparison between the reference sequence and each of the other two sequences, then displays all the results in the canvas synchronously. See the part Three Sequences Comparison for more details.

  • Input format : It's a cascade menu. Users should select one of the four choices according to the format of their sequences or BLAST (blastn or megablast) results.

      • Fasta format sequence : Since inputs in this format do not contain any annotation information about the sequences, the comparison result based on them will be less informative.

      • Genbank format sequence : This is the default format for input files because it is so widely used.

      • EMBL format sequence : Almost as good as the default format.

      • BLAST output file : This is a different kind of choice. It is recommended only for users who cannot run BLAST locally and have to use an existing BLAST output file directly, or for users who do not need the annotation information.

  • Anchor style : It's a cascade menu. Users should select one of the four choices according to the relative lengths of the input sequences or for some specific purpose.

      • Left : This should be the common choice for input sequences of similar length. The two or three input sequences are aligned from their starts (i.e. at the left side of the canvas).

      • Center : This is the default value. The input sequences are aligned at their centers, so the center of every sequence lies on a vertical line in the center of the canvas.

      • Right : This value should be used only for some specific purpose. It aligns the end of every input sequence on a vertical line at the right end of the canvas.

      • Custom : Users can use this option to specify the position adjustment for each input sequence manually, so that any regions of the sequences can be anchored at the same position for better visualization. Selecting it pops up a dialog window for entering the adjustment values.
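The four anchor styles above amount to choosing a start offset for each sequence on a shared horizontal axis. The Python sketch below shows one way to compute such offsets. It is an assumption about the geometry, not GenomeComp's actual drawing code; the function name `anchor_offsets` and the convention that offsets are measured in base pairs from the left edge are hypothetical.

```python
def anchor_offsets(lengths, style="center", custom=None):
    """Return the start offset (in bp from the left edge) of each sequence."""
    widest = max(lengths)
    if style == "left":
        # All sequences start at the left edge.
        return [0 for _ in lengths]
    if style == "center":
        # Centers of all sequences fall on the same vertical line.
        return [(widest - n) / 2 for n in lengths]
    if style == "right":
        # Ends of all sequences fall on the same vertical line.
        return [widest - n for n in lengths]
    if style == "custom":
        # The user supplies one adjustment value per sequence.
        if custom is None or len(custom) != len(lengths):
            raise ValueError("custom needs one adjustment value per sequence")
        return list(custom)
    raise ValueError(f"unknown anchor style: {style}")


# Two sequences of 1000 bp and 600 bp under each built-in style.
for style in ("left", "center", "right"):
    print(style, anchor_offsets([1000, 600], style))
```

For sequences of similar length the three built-in styles give nearly the same picture, which is consistent with Left being described as the common choice in that case.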

The Search Menu

  • Find ORF : This command allows users to search by name for particular genes in the comparison result displayed in the canvas. Its keyboard shortcut is "Ctrl+F".

  • Find Next : This is a convenient command for locating the next ORF with the same name. Its equivalent is the "Next" button in the "Find ORF" pop-up window, and its keyboard shortcut is the function key "F3".

  • Go to Position : This command helps users move the scroll bar of the canvas quickly to a given sequence position. It is very useful when the sequences are very long. Its keyboard shortcut is "Ctrl+P".

The List Menu

  • Length setting : It's a cascade menu, and users should select one of the five choices. This setting defines the length threshold for reporting a specific region found by the sequence comparison. A specific region (a so-called island of a sequence) is defined as a part of a sequence in which the external program BLAST reports no maximal segment pairs (MSPs).

      • All length : Force GenomeComp to report specific regions of all lengths in the specific regions list window.

      • Above 50 bp : Let GenomeComp report only specific regions longer than 50 base pairs. (Default value)

      • Above 1 Kbp : Let GenomeComp report only specific regions longer than 1 kilo base pairs.

      • Above 5 Kbp : Let GenomeComp report only specific regions longer than 5 kilo base pairs.

      • Custom : If selected, it pops up a dialog window that allows users to set a custom threshold for the specific regions list.

  • Specific regions : Get the specific regions list according to the settings above. It pops up the specific regions list window, but the calculation might take a few minutes. Its keyboard shortcut is "Ctrl+E". See the part Specific regions list Window for more details.
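The definition of a specific region given above (a stretch of sequence not covered by any MSP, reported only if it is longer than the threshold) can be illustrated with a small Python sketch. This is not GenomeComp's implementation. The function name, the 1-based inclusive coordinates, and the requirement that each match is given as (start, end) with start <= end are assumptions made for the example.

```python
def specific_regions(seq_length, matches, min_length=50):
    """Return (start, end) intervals not covered by any match.

    Coordinates are 1-based and inclusive. Only regions strictly longer
    than min_length are returned; min_length=0 reports all lengths.
    """
    regions = []
    next_uncovered = 1  # first position not yet covered by a match
    for start, end in sorted(matches):
        if start > next_uncovered:
            regions.append((next_uncovered, start - 1))
        next_uncovered = max(next_uncovered, end + 1)
    if next_uncovered <= seq_length:
        regions.append((next_uncovered, seq_length))
    return [(s, e) for s, e in regions if e - s + 1 > min_length]


# Hypothetical MSP coordinates on a 10000 bp sequence.
example_matches = [(101, 2000), (1500, 3000), (3040, 9000)]
print(specific_regions(10000, example_matches))                # default: above 50 bp
print(specific_regions(10000, example_matches, min_length=0))  # all lengths
```

In this example the 39 bp gap between positions 3001 and 3039 appears only when all lengths are reported, which mirrors the difference between the "All length" and "Above 50 bp" choices.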

The Hide Menu

This is a cascade menu that lets users change the display mode of the main window.

  • Nothing : This is the default mode when GenomeComp starts up, which shows both the control panel and the display panel. Its keyboard shortcut is "Ctrl+N".

  • Control panel (left) : Hide the control panel on the left of the main window, so that only the display panel is shown. This gives users the largest possible view of the canvas in the display panel. Its keyboard shortcut is "Ctrl+L".

  • Display panel (right) : Hide the display panel on the right of the main window, so that only the control panel is shown. Its keyboard shortcut is "Ctrl+D".

The View Menu

This is a cascade menu that lets users choose the items to be displayed in the canvas (all of them are displayed by default). Unmarking any of them removes the corresponding items from the canvas.

  • Title : The sequence names at the left start of the canvas. When the figure is zoomed out, the title text is sometimes covered by other items, so it might be useful to remove the titles if you don't want to redraw the figure.

  • ORFs : The arrows that represent the ORFs in the sequences (available only with GenBank or EMBL format inputs). Unmarking this item automatically unmarks the 'ORFs text' item below.

  • ORFs text : The name text displayed above or below the ORF arrows. Marking this item automatically marks the 'ORFs' item above.

  • Ruler : The black vertical line that marks the sequence length. Unmarking it automatically unmarks the 'Ruler text' item below.

  • Ruler text : The number text above or below the ruler. Marking this item automatically marks the 'Ruler' item above.

  • Island bar : The colored bar representing the matches in each sequence.

The Help Menu

  • Topic : Show the brief online help about GenomeComp. Its keyboard shortcut is "Ctrl+O".

  • About : Show information about the authors. Its keyboard shortcut is "Ctrl+A".

Some Pop-up Windows

While using GenomeComp, users will encounter pop-up dialog windows for entering values or confirming actions, as well as warning or error message windows if GenomeComp runs into a problem. In that case, please check your operations according to the messages.

Confirm sequence name and length, Select program and parameters Window

This window will pop up when users click the "Start" button on the main window or execute the "Start" command in the File menu with input sequences in Fasta, Genbank or EMBL format.

In the left frame of the window, GenomeComp obtains the name and length of each input sequence automatically. The sequence name is just a label used by GenomeComp and displayed in the canvas to distinguish the sequences from each other, so it is not critical and you can change it as you wish. The sequence length, however, is very important for the graphical result display; please correct it if you disagree with the length counted by GenomeComp.
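If you want to check a sequence length independently before confirming it in this window, a short script is enough for Fasta input. The Python sketch below reads the first record of a Fasta file and returns its name and length. It is an illustrative helper, not how GenomeComp itself counts; the function name and the choice to take the first word of the header line as the name are assumptions.

```python
def fasta_name_and_length(lines):
    """Return (name, length) of the first record in an iterable of Fasta lines."""
    name = None
    length = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                break  # a second record starts here; stop counting
            header = line[1:].split()
            name = header[0] if header else ""
        elif name is not None:
            # Count residues, ignoring any whitespace inside the line.
            length += len("".join(line.split()))
    if name is None:
        raise ValueError("no Fasta header line found")
    return name, length


# A tiny in-memory example; for a real file, pass an open file handle instead.
example = [">seq1 test genome", "ACGTACGT", "ACG T", "", ">seq2", "AAAA"]
print(fasta_name_and_length(example))
```

For a file on disk, the same function can be called as `fasta_name_and_length(open(path))`, where `path` is the location of your Fasta file.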

In the top right frame, users should select one of the external programs to carry out the sequence comparison. The default is "megablast", which is suitable for genome-wide sequences (such as sequences several mega base pairs long). Make sure you have specified the location of the program you choose in the configuration window.

In the bottom right frame, users can manually set some parameters for running the external program chosen above. Some common values have been preassigned; please do not change them unless you are familiar with these parameters.

After finishing the settings, click the "OK" button to continue with the comparison, or click "Cancel" to go back to the main window without performing any comparison. The "Reset" button clears all user modifications in this window and resets all options to their default values.

The Detail Information Window

When users view the comparison result in the canvas and move the mouse over a significant part, the comparison or sequence annotation information is briefly reported in the message window below. The pairwise sequence comparison results or detailed information about ORFs can be brought up in a pop-up window by clicking.

Specific regions list Window

Invoking the Specific regions command from the List menu pops up this list window, which lists the specific regions of the two sequences separately in the list box. Note that this window is not available for self-compare. For a three-sequence comparison, the program gives the two comparison results in the same window.

The number of reported specific regions in each input sequence and the current threshold setting are displayed above the list box. Users can quickly locate the listed regions simply by double-clicking them in the list box.

The Save Window

After finishing the sequence comparison, users may want to save the graphical comparison result in a local file for further analysis and other purposes. The Save command in the File menu serves this purpose. The Save window pops up and lets users specify the file name and location, and save either the whole figure in the canvas or just a certain part of it.

The saved file is in PostScript format, which can be viewed using the free Ghostscript program from Aladdin Inc. Note that the figure in this file is a static graph without the dynamic display of the canvas in the GenomeComp main window. It can nevertheless be very useful when you want to present your comparison result in publications.

 

Copyright © 2002 Chinese National Human Genome Center, Beijing
All Rights Reserved