This document describes the interface and underlying software that will sit between the data held in the science archives of the VISTA Data Flow System (VDFS) and the end user. In this context a user might be an astronomer from an ESO member country, a non-ESO astronomer or, to a lesser extent, a member of the public.
The VDFS science archives are the WFCAM Science Archive (WSA) and the VISTA Science Archive (VSA). The Science Requirements Analysis Document (AD01) lays out the conditions for user access to these archives and drives the development of the user-interface. From a user-interface point of view, many of the requirements and methods of implementation are similar or indeed the same for the WSA and VSA. This document will take a joint approach to the archives highlighting any differences as they occur. VDFS design policy was to develop the WSA first and some examples of existing functionality are included.
If users wish to access data still under a proprietary period they will first need to be authenticated in order to verify their access rights to the data. User authentication is outlined in Section 4.
Users of the archives will be mainly interested in access to the stored pixel data (FITS images) and/or the generated object catalogues. The archives will store pixel data in flat files with the meta-data loaded into a Relational DataBase Management System (RDBMS). Object catalogue data are read from FITS catalogue files during the curation procedures and stored directly in the RDBMS (though the FITS catalogue files will also be archived). During curation any enhanced object catalogue products are also produced and loaded into the RDBMS e.g. merged source and neighbour tables.
The way the data are held in the archive and the different software required to process user requests lead us to consider these two types of access in separate sections.
Section 5 discusses the user interface to the pixel data, splitting it up into the different ways in which users will be able to extract image data and the underlying software required.
The user interface providing access to the object catalogue data is similarly broken down and the subject of Section 6.
Conformance of the archive interface to emerging Virtual Observatory standards, and housekeeping issues, are briefly discussed in Sections 7 and 8 respectively.
Section 9 lists the documentation that will be provided to assist the user.
Initial access to information on the WSA and VSA and to the data is via the websites maintained by WFAU. The WSA website is at
The VDFS websites are hosted on Apache web servers running under Linux with CGI enabled. Tomcat is installed on the servers to provide Java servlet functionality. All the sites' pages and forms are HTML 4.01 compliant. This ensures they work with a wide range of browsers on common operating systems. Specifically they have been tested using Microsoft Internet Explorer under Windows and Netscape variants running on Unix/Linux. Cascading Style Sheets (CSS) will also be used to build the website.
Experience with the WSA has shown that JavaScript is needed in some of the web forms to enhance the functionality.
The VDFS websites will consist of
Users wishing to access proprietary data through the user-interface will need to be registered with the archives. For UKIDSS data held in the WSA the list of registered users is maintained by a group of community contacts. These contacts administer their community (adding users, changing passwords etc) through a password-protected interface (Java servlet) provided by WFAU. The list of registered non-survey users who wish to access their open-time data is maintained by WFAU staff. The lists of registered users are held in a database in the archive.
When accessing data under its proprietary period users must first log in using a web form. The login attempt is authenticated against the lists of users in the database. Successful logins are stored in a browser session. The session login status is then used by the access points described below to determine which database(s) the user is allowed to access. The underlying database connections are also based on login status. These connections will fail if a user attempts unauthorised access to a proprietary database. A user's login status is displayed at the top of all the web forms accessing the archive.
Different ways for accessing the pixel data are described below with a schematic representation given in Fig 1.
The primary access route for users will be via web forms on the WSA website. In general terms the forms will action a Java servlet that queries the relevant database pulling out any matching results. The resulting parameters (filenames, extension numbers etc) are passed to CGI scripts to perform any necessary image manipulation. Speed and efficiency are more important in the low level manipulation of the binary data and this is carried out using C code.
Some specific examples of pixel access methods are given below:
Boxes and menus on the form will allow the user to specify the celestial coordinates of the extraction (decimal degrees or sexagesimal), the coordinate system (J2000, B1950, Galactic), the size of the area to extract (x,y in arcmin), the waveband, and type of image to use (normal, stack, interleave etc).
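The form accepts coordinates either as decimal degrees or in sexagesimal notation. The following is a minimal sketch of how such input might be normalised to decimal degrees before a query is built. The function names are hypothetical, and it is assumed here (the form description does not say) that sexagesimal RA is given in hours and that fields are separated by spaces or colons.

```python
def sexagesimal_to_degrees(text: str, is_ra: bool) -> float:
    """Convert 'hh mm ss.s' (RA, assumed hours) or '+dd mm ss.s' (Dec) to degrees."""
    parts = text.replace(":", " ").split()
    if len(parts) != 3:
        raise ValueError(f"expected three fields, got {text!r}")
    # Take the sign from the text itself so that '-00 30 00' stays negative.
    negative = parts[0].startswith("-")
    first = abs(float(parts[0]))
    minutes, seconds = float(parts[1]), float(parts[2])
    if not (0.0 <= minutes < 60.0 and 0.0 <= seconds < 60.0):
        raise ValueError(f"minutes/seconds out of range in {text!r}")
    value = first + minutes / 60.0 + seconds / 3600.0
    if is_ra:
        if negative or value >= 24.0:
            raise ValueError(f"RA out of range in {text!r}")
        return value * 15.0  # hours to degrees
    if value > 90.0:
        raise ValueError(f"Dec out of range in {text!r}")
    return -value if negative else value


def parse_coordinate(text: str, is_ra: bool) -> float:
    """Accept either a single decimal-degree value or a sexagesimal triple."""
    fields = text.replace(":", " ").split()
    if len(fields) == 1:
        return float(fields[0])
    return sexagesimal_to_degrees(text, is_ra)
```

Reading the sign from the first field as text, rather than from its numeric value, avoids losing the sign of declinations between -1 and 0 degrees.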
When submitted this form will action a Java servlet on the server. A summary of the tasks the script will perform is given below:
Small image extractions are completed in real time (under 30 seconds).
Uploading a file of coordinates to this multi-part form provides the user with a batch mode front end to the functionality offered in Section 5.1. Users are asked to supply the path of the upload file, size of extraction, waveband, and a valid email address.
A limit needs to be placed on the total amount of pixel data that can be extracted. For the WSA this is currently set to a total area of 500 sq. arcmin.
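As an illustration of how such a limit might be applied, the sketch below checks a batch request before any extraction starts. The 500 sq. arcmin figure is the WSA value quoted above; the function name is hypothetical, and it is an assumption that the total is simply the number of positions multiplied by the area of one extraction box.

```python
MAX_TOTAL_AREA_SQ_ARCMIN = 500.0  # current WSA limit quoted in the text


def check_batch_area(n_positions: int, width_arcmin: float, height_arcmin: float,
                     limit: float = MAX_TOTAL_AREA_SQ_ARCMIN):
    """Return (allowed, total_area) for a batch of equally sized extractions.

    Assumption: the total is n_positions * width * height; the exact
    accounting used by the archive is not specified in this document.
    """
    if n_positions < 0 or width_arcmin <= 0 or height_arcmin <= 0:
        raise ValueError("positions must be >= 0 and box sizes must be > 0")
    total = n_positions * width_arcmin * height_arcmin
    return total <= limit, total
```

For example, 100 positions with a 2 x 2 arcmin box total 400 sq. arcmin and would be accepted, whereas 200 such positions would be rejected.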
The Java servlet actioned by this form carries out similar steps to those described above in Section 5.1 as it loops through the input file. Output is written to a temporary results file and if a given image cannot be extracted the script will skip to the next one. After initial checking of the input parameters and execution of the SQL query, the actual extractions are run in the background with a message being returned to the browser informing the user that an email will be sent to them when the script has finished.
The email sent on completion provides the user with a URL where the results can be viewed and downloaded. These results include a tar saveset of the extracted FITS files and a PDF file of the jpeg images.
For the WSA, in order to narrow down the number of matching images for a given coordinate, the user is also asked which survey or programme to use. The underlying SQL query is then joined with the relevant mergeLog table which ensures the optimum science frames are returned.
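The skip-on-failure behaviour of the batch loop can be sketched as follows. This is only an outline in Python rather than the servlet itself: `run_batch` and the `extract` callable are hypothetical names, with `extract` standing in for whatever performs a single cut-out.

```python
from typing import Callable, Iterable, List, Tuple

Position = Tuple[float, float]  # (RA, Dec) in decimal degrees


def run_batch(positions: Iterable[Position],
              extract: Callable[[float, float], str]
              ) -> Tuple[List[str], List[Tuple[float, float, str]]]:
    """Extract each position in turn, skipping any that fail.

    Returns the list of successful results and a list of
    (ra, dec, reason) entries for the positions that were skipped.
    """
    done: List[str] = []
    skipped: List[Tuple[float, float, str]] = []
    for ra, dec in positions:
        try:
            done.append(extract(ra, dec))
        except Exception as exc:  # one bad position must not stop the batch
            skipped.append((ra, dec, str(exc)))
    return done, skipped
```

Recording the skipped positions alongside the successful ones would allow the results page to report which inputs could not be extracted.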
This form will enable users to construct an image from multiple detector frames covering an area of sky up to 1.0 degree across. However the underlying mosaicing software will be able to generate arbitrarily large areas.
The inputted parameters will be similar to PF1 (Section 5.1) but with the addition of a text box to enter the scrunch factor (pixel binning) of the returned image and a box to enter the user's email address.
This form will again action a servlet. The main difference in the processing steps outlined in Section 5.1 is that the initial SQL query constructed and sent to the image database table will return the path/filenames of all images held that overlap with the area of sky requested. This list of files together with the size and binning requested for the output image will in turn be passed to a local copy of the CASU mosaicing code that will combine the files together into a single image. As a large amount of pixel re-sampling can be involved the main work will be done in the background with the user being notified by email on completion.
After submitting user inputted values for position and waveband this form will return a list of matching images as part of another web form. Users will then select which of the images should be stacked and supply an email address.
On submission this second form will action a CGI script that inputs the selected images to a local copy of the CASU stacking tool. Again the intensive processing is done as a background task with an email notification being sent on completion.
Once again the form actions a Java servlet that performs the required SQL query on the relevant database. The servlet returns the lists as HTML tables. These tables contain links allowing the user to view the library jpeg images of the multiframe extensions and download the FITS image file and any associated FITS catalogue.
The methods outlined in this section will only access data classed as world readable.
The single and batch image cut-out access methods (PF1 and PF2) have been implemented in the WSA
Access to SuperCOSMOS Sky Surveys (SSS) data has been made available from within GAIA
(under Data-Servers >
The queryable URL functionality described in Section 5.6 has been supplied to the
6dF observers who routinely use it to generate SSS finders for checking target objects (3000-10000
extractions per week).
Screenshots of some of the access methods described above are provided in the Appendix
(Section 10).
Note that the intensive part of any query is carried out by SQL Server. Most of the differences
in functionality between the access methods described below lie in the construction of the
SQL query; the code for executing the query is largely generic and re-usable.
The amount of data returned by a query and written to file needs to be restricted.
For the WSA this is currently set at 15 million cells (e.g. a million rows
with 15 attributes).
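The cell limit translates into a row limit that depends on how many attributes are selected. A small sketch of that arithmetic, using a hypothetical helper name:

```python
MAX_CELLS = 15_000_000  # current WSA limit quoted in the text


def max_rows(n_attributes: int, max_cells: int = MAX_CELLS) -> int:
    """Largest number of rows that keeps rows * attributes within the cell limit."""
    if n_attributes < 1:
        raise ValueError("at least one attribute must be selected")
    return max_cells // n_attributes
```

With 15 attributes this gives one million rows, matching the example above; selecting 100 attributes would reduce the limit to 150,000 rows.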
Again the primary access method is via web forms. A user's login status is used to ensure
they are authorised to query the requested database.
The actioned servlet first converts the inputted coordinates or resolves
the inputted name (using SIMBAD) to an RA and DEC in J2000 decimal degrees.
An SQL query is then formed that efficiently searches through the required table using the
indexes on RA and Dec.
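One common way to let indexes on RA and Dec do most of the work is to select on a bounding box that encloses the search circle and then apply the exact angular cut to the (much smaller) set of candidate rows. The sketch below illustrates that idea only; it is not taken from the WSA code, and the table and column names are hypothetical.

```python
import math


def ra_dec_box(ra_deg: float, dec_deg: float, radius_deg: float):
    """Return (ra_ranges, dec_min, dec_max) enclosing a circle on the sky."""
    dec_min = max(dec_deg - radius_deg, -90.0)
    dec_max = min(dec_deg + radius_deg, 90.0)
    if dec_min == -90.0 or dec_max == 90.0:
        # The circle touches or contains a pole: every RA is possible.
        return [(0.0, 360.0)], dec_min, dec_max
    # Exact half-width in RA of a small circle centred at dec_deg.
    dra = math.degrees(math.asin(math.sin(math.radians(radius_deg))
                                 / math.cos(math.radians(dec_deg))))
    lo, hi = ra_deg - dra, ra_deg + dra
    if lo < 0.0:      # box wraps through RA = 0
        return [(lo + 360.0, 360.0), (0.0, hi)], dec_min, dec_max
    if hi > 360.0:    # box wraps through RA = 360
        return [(lo, 360.0), (0.0, hi - 360.0)], dec_min, dec_max
    return [(lo, hi)], dec_min, dec_max


def angular_separation_deg(ra1, dec1, ra2, dec2) -> float:
    """Great-circle separation in degrees (haversine formula)."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def radial_search_sql(table: str, ra_deg: float, dec_deg: float,
                      radius_arcmin: float,
                      ra_col: str = "ra", dec_col: str = "dec") -> str:
    """Build the box pre-selection; column names here are assumptions."""
    ranges, dmin, dmax = ra_dec_box(ra_deg, dec_deg, radius_arcmin / 60.0)
    ra_clause = " OR ".join(
        f"({ra_col} BETWEEN {lo:.6f} AND {hi:.6f})" for lo, hi in ranges)
    return (f"SELECT * FROM {table} WHERE {dec_col} BETWEEN {dmin:.6f} "
            f"AND {dmax:.6f} AND ({ra_clause})")
```

Rows returned by the box query would then be kept only if `angular_separation_deg` to the search position is within the requested radius.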
The servlet submits the query as a separate thread, with the main
thread keeping the browser connection active and checking that it hasn't
been stopped by the user.
On completion the query thread parses and formats the returned rows, writes
the data to file if requested
and prints any HTML table output and links to files to the browser window.
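The division of labour between the query thread and the main thread can be sketched as below. This is an illustrative outline in Python rather than the servlet code; `run_query`, `is_cancelled` and `keepalive` are hypothetical callables standing in for the database call, the check that the user has not stopped the request, and whatever output keeps the browser connection open.

```python
import threading
from typing import Any, Callable, Dict, Optional


def run_query_with_keepalive(run_query: Callable[[], Any],
                             is_cancelled: Callable[[], bool],
                             keepalive: Callable[[], None],
                             poll_seconds: float = 0.1) -> Optional[Any]:
    """Run the query in a worker thread while the caller keeps the link alive.

    Returns the query result, or None if the user cancelled first.
    """
    result: Dict[str, Any] = {}

    def worker() -> None:
        try:
            result["rows"] = run_query()
        except Exception as exc:  # hand the error back to the main thread
            result["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    while thread.is_alive():
        if is_cancelled():
            # A real implementation would also cancel the database
            # statement; this sketch simply stops waiting for it.
            return None
        keepalive()
        thread.join(poll_seconds)
    if "error" in result:
        raise result["error"]
    return result["rows"]
```

Polling with a short `join` timeout lets the main thread notice a cancelled request promptly without busy-waiting.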
There are two versions of this form. The first uses drop down
menus and text boxes to guide the user in building an SQL query
for execution by the servlet. Users are able to choose/input
values to construct the SELECT, FROM and WHERE clauses of SQL.
An option enables the querying of a second table joined with
the primary table. The secondary table can be one of the non-WFCAM
based tables held in the archive e.g. 2MASS or SDSS. Joins are made
via the neighbours table for a given combination.
The second version is based around a text box into which users with
knowledge of SQL and the contents of the WSA or VSA (which will be documented)
can directly input their SQL query.
Output options for both versions will be as OF1.
Results of long-running queries are sent by email to users.
This multipart form offers the user the ability to upload a list
of coordinates
and match them against tables held in the database.
Users specify the pairing radius and whether they want just the nearest
object or
all matching objects extracted.
Unlike the previous object catalogue forms several SQL statements are executed by the
servlet. The first of these creates a temporary database table, that table
is then populated with the contents of the upload file. Finally the
requested database table and temporary table are paired. The temporary table
is automatically dropped when the JDBC connection is closed.
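The sequence of statements (create a temporary table, populate it from the upload, pair it with the requested table) can be illustrated with a self-contained sketch. It uses Python's built-in `sqlite3` module in place of SQL Server and JDBC, so the SQL dialect differs from the archive's; the table name, the column names and the `sep_arcsec` function are all hypothetical.

```python
import math
import sqlite3
from typing import List, Sequence, Tuple


def sep_arcsec(ra1, dec1, ra2, dec2) -> float:
    """Great-circle separation in arcseconds (haversine formula)."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def cross_match(conn: sqlite3.Connection,
                uploaded: Sequence[Tuple[float, float]],
                table: str, radius_arcsec: float,
                nearest_only: bool) -> List[tuple]:
    """Pair uploaded (ra, dec) positions with rows of `table`.

    Returns (upload_id, source_id, separation_arcsec) tuples.
    """
    if not table.isidentifier():
        raise ValueError("table name must be a plain identifier")
    conn.create_function("sep_arcsec", 4, sep_arcsec)
    # Step 1: temporary table; it is private to this connection.
    conn.execute("CREATE TEMP TABLE upload "
                 "(id INTEGER PRIMARY KEY, ra_deg REAL, dec_deg REAL)")
    try:
        # Step 2: populate it with the contents of the upload file.
        conn.executemany("INSERT INTO upload VALUES (?, ?, ?)",
                         [(i, ra, dec) for i, (ra, dec) in enumerate(uploaded)])
        # Step 3: pair the two tables (a brute-force join, for clarity only).
        rows = conn.execute(
            "SELECT u.id, s.source_id, "
            "sep_arcsec(u.ra_deg, u.dec_deg, s.ra_deg, s.dec_deg) AS sep "
            f"FROM upload AS u JOIN {table} AS s "
            "ON sep_arcsec(u.ra_deg, u.dec_deg, s.ra_deg, s.dec_deg) <= ? "
            "ORDER BY u.id, sep", (radius_arcsec,)).fetchall()
    finally:
        # Dropped explicitly so the connection can be reused; otherwise the
        # table would disappear when the connection is closed.
        conn.execute("DROP TABLE upload")
    if nearest_only:
        nearest = {}
        for row in rows:  # rows are sorted by separation within each id
            nearest.setdefault(row[0], row)
        rows = list(nearest.values())
    return rows
```

The brute-force join compares every uploaded position with every catalogue row and is only suitable for a demonstration; a production pairing would rely on positional indexes in the database.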
As detailed in Section 5.5 the archive listing form provides
a browsable way to reach links to the object catalogues which are stored as
FITS binary tables (generated from and associated with a given FITS image file
by the CASU standard source extraction tool).
The functionality offered in Section 6.1
will be made compatible with the GAIA
and Aladin tools. Some of the options available on the web form will
be hard-wired to sensible defaults for this
implementation (e.g. the parameters returned) as the
interface, especially with GAIA, is not fully configurable.
As previously mentioned in Section 5.6 the underlying queryable
URL can be made accessible from the command line using wget.
Again the methods outlined in this section will only access data
classed as world readable.
http://horus.roe.ac.uk:8080/wsa/region_form.jsp
http://horus.roe.ac.uk:8080/wsa/menu_form.jsp
http://horus.roe.ac.uk:8080/wsa/SQL_form.jsp
http://horus.roe.ac.uk:8080/wsa/crossID_form.jsp
respectively.
Cross-matching an uploaded file of 1000 records with the UKIDSS Large Area Survey takes approximately 30 seconds.
Screenshots of some of the examples described above are provided in the Appendix
(Section 10).
This topic will be covered in detail in AD03. However it is worth noting here that VOTable
format is one of the output options for the object catalogue access.
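For readers unfamiliar with the format, the sketch below writes query results as a bare VOTable skeleton (FIELD definitions followed by TABLEDATA rows). It is illustrative only: the function name and the example columns are hypothetical, namespace and schema declarations are omitted, and the output is not validated against the VOTable schema.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Sequence, Tuple


def rows_to_votable(fields: Sequence[Tuple[str, str, str]],
                    rows: Iterable[Sequence[object]],
                    table_name: str = "results") -> str:
    """Serialise rows as a minimal VOTable document.

    `fields` holds (name, datatype, unit) triples; an empty unit is omitted.
    """
    votable = ET.Element("VOTABLE", version="1.1")
    resource = ET.SubElement(votable, "RESOURCE")
    table = ET.SubElement(resource, "TABLE", name=table_name)
    for name, datatype, unit in fields:
        attrs = {"name": name, "datatype": datatype}
        if unit:
            attrs["unit"] = unit
        ET.SubElement(table, "FIELD", attrs)
    data = ET.SubElement(ET.SubElement(table, "DATA"), "TABLEDATA")
    for row in rows:
        tr = ET.SubElement(data, "TR")
        for value in row:
            ET.SubElement(tr, "TD").text = str(value)
    return ET.tostring(votable, encoding="unicode")
```

A full implementation would also attach a UCD to each FIELD, using the descriptors already recorded for each attribute in the archive schema.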
Log files and database tables
of user access are archived and used to monitor and generate statistics on archive
usage (hits, queries, data volume served). In the two months following the WSA DR1 release
some 2800 SQL queries were made via OF2 and nearly 0.5 billion rows of data were returned. In the
same period just over 900 archived FITS files were downloaded by users via the archive listing form.
6dF : Six-degree field
Issue: 1.0 09/06
The translation was initiated by Nigel Hambly on 2006-09-30
Object catalogue data are stored in SQL Server database tables on a Windows machine
networked to the web server. The basic recipe for access is described below:
OBJECT CATALOGUE DATA
This form enables a user to extract objects within a specified distance of
a supplied position (RA/DEC, Galactic) or object name. Other options passed from the form to the servlet
are: which survey to search, which parameters to extract (all, subset or user
specified), any additional constraints (used to form the SQL WHERE clause), the format
of the output data (HTML, delimited ascii file, binary FITS table, VOTable).
Radial Search - Object Form 1 (OF1)
Similar to the radial search but this time an SQL function is used
that extracts objects bounded by limits in RA and DEC (or the
coordinate system requested).
Rectangular Search - Object Form 2 (OF2)
SQL query - Object Form 3 (OF3)
Cross-matching a catalogue - Object Form 4 (OF4)
Browsable access to catalogue data
Other access to catalogue data
The core functionality described in Sections 6.1, 6.3 and 6.4 is implemented
in the WSA under:
Demonstrations of catalogue access
VIRTUAL OBSERVATORY CONSIDERATIONS
Temporary output files generated by user requests are written to a publicly
(HTTP) accessible area of the file system. A cron job runs daily and deletes
any of these files more than 48 hours old.
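A clean-up task of this kind could look like the following sketch, which removes regular files whose modification time is older than a given age. The function name is hypothetical and the directory is passed in by the caller, since the actual location of the temporary area is not given here.

```python
import time
from pathlib import Path
from typing import List, Optional


def delete_old_files(directory: str, max_age_hours: float = 48.0,
                     now: Optional[float] = None) -> List[str]:
    """Delete regular files in `directory` older than `max_age_hours`.

    Returns the sorted names of the files that were removed.
    Subdirectories are left untouched in this sketch.
    """
    current = time.time() if now is None else now
    cutoff = current - max_age_hours * 3600.0
    removed: List[str] = []
    for path in Path(directory).iterdir():
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

Run once a day from cron, a script calling this function with the default 48 hour age would reproduce the behaviour described above.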
HOUSEKEEPING
The WSA website has, and the VSA website will have, extensive online documentation to help the user including:
USER DOCUMENTATION
The database contents of the archives are documented in detail in the Schema Browser section of the
website. Users can navigate through the database design viewing documentation on schemas, tables, views,
columns and functions. For a given attribute or column the following information is available:
name, type, length, unit, description, default value and Unified Content Descriptor (UCD).
Where necessary more detailed
information on a given attribute is provided via a link to the glossary section of the Schema Browser.
Schema Browser
This section shows screenshots of some of the ways users can access
the archives held at WFAU.
APPENDIX
ACRONYMS & ABBREVIATIONS
6dFGS : Six-degree field Galaxy Survey
ADnn : Applicable Document No nn
CGI : Common Gateway Interface
CASU : Cambridge Astronomical Survey Unit
DBD : Database Driver
DBI : Database Interface
DXS : Deep Extragalactic Survey
JDBC : Java Database Connectivity
LAS : Large Area Survey
LWP : LibWWW-Perl
HTML : HyperText Markup Language
HTTP : Hypertext Transfer Protocol
SOAP : Simple Object Access Protocol
SQL : Structured Query Language
SSS : SuperCOSMOS Sky Surveys
SSA : SuperCOSMOS Science Archive
UDS : Ultra Deep Survey
UKIDSS : UKIRT Infrared Deep Sky Survey
VIRCAM : VISTA InfraRed Camera
VISTA : Visible and Infrared Survey Telescope for Astronomy
VPO : VISTA Project Office
W3C : World-Wide Web Consortium
WFAU : Wide Field Astronomy Unit (Edinburgh)
WFCAM : Wide-Field Camera
XML : eXtensible Markup Language
APPLICABLE DOCUMENTS
AD01 Science Requirements Analysis Document VDF-WFA-VSA-002
AD02 Database Design Document VDF-WFA-VSA-007
AD03 Virtual Observatory integration VDF-WFA-VSA-010
CHANGE RECORD
Issue  Date      Section(s) Affected  Description of Change/Change Request Reference/Remarks
1.0    06/09/06  All                  New document based on VDF-WFS-WSA-008
The following people should be notified by email whenever a new
version of this document has been issued:
NOTIFICATION LIST
WFAU: P Williams, N Hambly
CASU: M Irwin, J Lewis
QMUL: J Emerson
ATC: M. Stewart
JAC: A. Adamson
UKIDSS: S. Warren, A. Lawrence