Questionnaire Guidance Notes

General

The form uses standard HTML and should be useable on most browsers. If you do encounter problems, then please consider using the MS Word or PDF versions. Completed forms can be e-mailed to spacegrid@rl.ac.uk or posted to SpaceGrid SSR Team, R25 1.88, Space Science and Technology Dept., Rutherford Appleton Lab, Oxfordshire, OX11 0QX, UK.

Four types of input element are used within the form:-

  1. A single line text entry box for short answers.
  2. A multi-line text box that you can use to enter free format text. If you enter more than five lines of text then a scroll bar will appear.
  3. A checkbox that allows you to select multiple choices from a list of options.
  4. A radio button, which is similar to the checkbox except that only a single selection can be made.

If you are using Internet Explorer to complete the form, then hitting the Enter key anywhere except in a multi-line text box will cause the partially filled form to be submitted. If this happens, don't worry: just use the Back button on your browser and continue. When you are ready to submit the completed form, use the button at the bottom of the page.

Layout

The questionnaire consists of seven sections:-

    1. Organisation
    2. Grid Technology
    3. The Domain
    4. Looking for standardisation
    5. Collaborative environment
    6. SpaceGRID priorities
    7. Your future involvement

Each of these sections is considered in more detail below.

1. Organisation

This section of the questionnaire tells us something about you, your organisation and which area or areas of the Solar System Research field you are currently working on.

Name:
To avoid any confusion, please include your title, first name (or initials) and last name.

E-mail:
If you are happy for us to follow up any issues with you resulting from the questionnaire then please ensure that you provide a valid e-mail address so that we can contact you.

Number of people in group:
If answering on behalf of a group, please indicate the number of people in the group (including yourself). If answering just for yourself then enter a value of 1 or leave the entry blank.

Institution:
Give the name and location of your institute (e.g. Rutherford Appleton Lab, UK).

Department:
Give the name of the department and/or group for which you work (e.g. Planetary group).

Main area of study:
Three options are given covering the three main areas within Solar System Research. Please select one or more of these options. If you wish to submit a different response for each of the areas, then only select the option that corresponds to the answers that you are giving in this instance of the questionnaire.

Specify main activity(s):
Indicate the types of activity that are undertaken by your group, department, institution or organisation. Multiple entries may be selected, but please concentrate on the activities that are most relevant to your particular work.


2. Grid Technology

This section is intended to help us assess how widespread knowledge about Grid technology is within the SSR community. A list of four possible options is provided, which can be used to indicate your level of involvement in or knowledge of Grid activities. We explain these in a bit more detail below.

"Yes, involved in learning, applying or providing facilities using Grid technology"
Select this option if you are working on clearly defined Grid projects such as AstroGrid, DataGrid or EGSO. You might also select this option if you are working on a project that is using Grid-like technology to provide distributed access to a set of different computational, storage or data resources. Use the text box to supply additional information such as the name(s) of the project or the sorts of Grid technology that you are using and how.

"Know something about it but not currently using it"
This option indicates that you are familiar with the concepts, potential benefits and problems of Grid computing, but not the details of specific Grid implementations. Indicate in the text box how you found out about the Grid (e.g. from reading an article or talking to colleagues).

"Have heard about it but not sure what it all means"
Select this option if you have come across the term 'Grid Technology' and know that it's something about distributed computing but don't really know what it means or what it could do for you.

"No, not heard about Grid technology before"
If this is the first time you have come across the concept of 'Grid Computing' then please select this option.

Further information on Grids can be found via the links page.

 

3. The Domain

The domain section of the questionnaire is used to collect information on the particular area that you or your group are working on within Solar System Research. The section is split into several sub-sections concentrating on different aspects of data use, location and retrieval.

DATA USE

This section deals with what data you use and what you do with the data.

The first question asks you to indicate what your main uses of SSR data are. In the case of research activity you are asked to further qualify your response with the area of study (e.g. lunar evolution, solar flares, magnetopause reconnection).

The next set of questions deals with the sort of data that you use. You are asked to allocate a percentage against each of a set of choices from the three areas within SSR. The total across all three areas should be 100%. If you work in several areas and are submitting separate questionnaires for each, then your response should only cover the applicable data sets and the total should still be 100%. There may be an overlap between the data you are using and the area (for example, in-situ solar wind measurements may be used for both Solar and STP work), in which case the percentages should be allocated according to the use of the data. Other sources may include published lists.
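As an illustration of the 100% rule, the short Python sketch below checks a set of percentage allocations. The area names used as keys and the function name are hypothetical and are not taken from the form itself.

```python
def check_allocation(percentages, tolerance=0.01):
    """Return True if the percentages are non-negative and total 100."""
    values = list(percentages.values())
    if any(v < 0 for v in values):
        return False
    return abs(sum(values) - 100.0) <= tolerance


# Hypothetical example: data allocated according to how it is used.
allocation = {"Solar": 30, "STP": 60, "Planetary": 10}
print(check_allocation(allocation))
```

If you submit a separate questionnaire for each area, the same check applies to each one individually.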

You are then asked to enter up to three commonly used datasets. These should be specified at a fairly general level of mission, instrument and data type (e.g. Cluster, FGM magnetometer, vector time series). Use the next text box to indicate any particular problems accessing the data that you need (e.g. not available online, or poor quality).

The question on combining data provides four options corresponding to different levels of coordination and complexity. You may select any number of the available options. Use the textbox that follows the options to provide more details on the sort of data joining that you do and particular difficulties (e.g. problems with locating data or data formats, getting data onto the same time line etc).

DATA SOURCE

This section allows you to provide information on how you find the data that you need and where you ultimately get your data. This should include published lists (e.g. SDG). One of the activities of the SpaceGRID study is to identify existing facilities that could be connected into the SpaceGRID.

The first question asks you to list some of your common sources of online data. These may be project specific data archives (such as the SOHO archive) or more general facilities (e.g. SEDAT, or the French CDPP). In some cases, archives are mirrored between more than one site. In such cases please indicate the site that you most commonly use or from which you get the largest volume of data.

The next question concerns the location of the data. One of the important features of a Grid-like infrastructure is improved facilities for locating resources. Knowing how data is located now will help us to assess what additional facilities SpaceGRID can provide.

Next you are asked about the accessibility of data (e.g. whether there is access control on the data). This information will feed into the requirements for transparent access to multiple archives (e.g. via single sign-on and delegated authentication).

Finally in this section you are asked to list any data sets that you would like to make available to other users.

DATA REQUEST

In this section you can provide information on the means of requesting and retrieving data.

"What is your most common way of constraining a data request?"
The two most common ways to constrain a request are to limit either the time range of a product or its spatial coverage. Alternatives include limiting the wavelength range of a spectral observation or the energy range of a particle distribution, or selecting via phenomenon or event catalogues. You should also use the "Other" option if you commonly use a combination of constraints.

"Are there any selection criteria you would wish were available?"
This will help us to define a wish list of features that are not currently available and therefore good candidates for implementation in a Grid system.

"What is your preferred means of requesting data?"
Indicate the method that you prefer (or most commonly use) for requesting online data. Other methods may include applications that contain built in clients that can talk directly to a remote data server (e.g. ISDAT).

The next three questions ask about the number and typical volumes of the requests that you make. The question about the volume of a request that is actually required is trying to assess how efficient a typical request is at returning the data that you need. A request may return a complete file containing 10 hours of data where all you wanted was a one hour sub-set.
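The efficiency described above can be expressed as the fraction of the returned data that was actually needed. The following is a minimal sketch of that arithmetic; the function name is hypothetical.

```python
def request_efficiency(hours_needed, hours_returned):
    """Fraction of the returned data that was actually required."""
    if hours_returned <= 0:
        raise ValueError("hours_returned must be positive")
    return hours_needed / hours_returned


# One hour wanted out of a ten-hour file.
print(f"{request_efficiency(1, 10):.0%}")  # prints "10%"
```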

"What is your current preferred means of data/product delivery?"
Indicate the method that you currently prefer (or most commonly use) for product delivery. Other methods may include applications that contain built in clients that can talk directly to a remote data server (e.g. ISDAT).

"Would you benefit from being able to process data remotely before downloading the result?"
If you want to process very large datasets it may be more efficient if some or all of the processing can be applied at the archive prior to the data being transferred. Alternatively you may prefer to download the data so that you have step-by-step control over the processing chain.

The final questions ask about the data delivery time. For example if you are providing a space weather prediction service then access to near real-time data may be an essential requirement.

DATA PROCESSING

"What hardware and operating system(s) do you use?"
This will help us to identify the systems that need to be supported by the SpaceGRID system. Your answer should include the platform and the operating system (e.g. Sun/Solaris and PC/Linux). Please only list the systems that you most commonly use.

"What data formats do you commonly use"
Indicate the data formats that you use within your local processing environment (including formats like IDL Savesets) and also the formats that you use to share data with others.

"What processing do you need to apply to retrieved data"
In many cases retrieved data will need to undergo some form of manipulation before it can be used with existing data analysis applications. This post processing may involve reformatting of the data to get it into a form that the application can understand or actual processing of the data itself to combine parameters or apply calibration factors.

The final four questions ask about the processing software and tools that you use. These are split between commercial packages (such as IDL, Matlab and Excel), community-supported applications and more general publicly available packages such as those provided by GNU.

4. Looking for standardisation

Standardisation is an important aspect of interoperability between different datasets. It is not the aim of SpaceGRID to try to enforce a common set of standards on the SSR community. Instead we are looking at ways to facilitate the exchange of data between the different standards that are already in use. In this section of the questionnaire we are looking for your views on what aspects of your work suffer from the lack of standardisation and in particular how this affects your use of multi-instrument data sets.

5. Collaborative environment

Collaboration has always been an important aspect of Solar System Research. This reflects the diverse and complex nature of the data handled within the domain. In this section we investigate the level of collaboration within the different SSR areas and the means used to facilitate the collaboration.

6. SpaceGRID priorities

The penultimate section deals with the prioritisation of potential features that SpaceGRID may provide to the SSR community. You should enter a value between 1 and 5 indicating how important you think the feature may be to improving your productivity. If you do not want to rate a particular feature, then leave the entry blank.

a) "Improved facilities for locating online sources of data based on a general query"
This feature will simplify the process of locating distributed resources via knowledge based search and query tools.

b) "Standardisation in the format and content of delivered data descriptions (metadata)"
The standardisation of data descriptions will assist in the production of uniform responses to queries and handling of data within analysis and visualization applications.

c) "Standardisation in the delivery format of data from different sources"
The standardisation of the data delivery format will reduce the post processing required to allow the end user to actually use the data. As with b) it will simplify the development of analysis software. However, it is not considered realistic for SpaceGRID to define a single data format for use by all domains. A more likely solution is to provide translation layers that will support conversion between the most common formats.

d) "Compatibility with other national and international programmes"
How important is it that the facilities provided by SpaceGRID are integrated, or at least compatible, with similar facilities provided at national and international levels? For example, this could be a high priority if you are regularly using data obtained from an archive in the US.

e) "Improved facilities for querying the catalogues of a single data archive"
Most archives provide some mechanism for accessing data based on information contained in a set of catalogues. This functionality is often limited to picking a particular data set and time range. How useful would you find extended catalogue search capabilities? An example would be the ability to request periods for which data is held for two or more data sets within the archive.
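To make the example concrete, the sketch below finds the periods covered by two data sets from their catalogue entries. It assumes each catalogue can be reduced to a sorted list of non-overlapping (start, end) pairs; the function name and data layout are hypothetical and do not describe any particular archive.

```python
def common_periods(coverage_a, coverage_b):
    """Return the periods during which both data sets hold data.

    Each argument is a list of (start, end) tuples sorted by start time,
    with no overlaps within a single list.
    """
    result = []
    i = j = 0
    while i < len(coverage_a) and j < len(coverage_b):
        start = max(coverage_a[i][0], coverage_b[j][0])
        end = min(coverage_a[i][1], coverage_b[j][1])
        if start < end:
            result.append((start, end))
        # Advance whichever interval finishes first.
        if coverage_a[i][1] < coverage_b[j][1]:
            i += 1
        else:
            j += 1
    return result


# Times given as hours of the day for brevity; datetime objects also work.
print(common_periods([(0, 5), (8, 12)], [(3, 9), (11, 15)]))
# prints [(3, 5), (8, 9), (11, 12)]
```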

f) "The ability to apply queries on the data within a single data archive"
This feature differs from e) in that a query can be applied to the value of a parameter within the data archive and not just to the information contained in the catalogues. A request might ask for the times when values of one or more parameters lie within specified ranges.
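A query of this kind might look like the following sketch, which returns the times at which every listed parameter lies within its range. The parameter names, sample values and function name are hypothetical and are used for illustration only.

```python
def times_in_range(samples, ranges):
    """Return times at which every named parameter lies within its range.

    samples: list of dicts, each with a "time" key plus parameter values.
    ranges:  dict mapping parameter name to (low, high), inclusive.
    """
    return [
        s["time"]
        for s in samples
        if all(low <= s[name] <= high for name, (low, high) in ranges.items())
    ]


samples = [
    {"time": "00:00", "b_total": 4.0, "density": 6.0},
    {"time": "00:01", "b_total": 12.5, "density": 9.0},
    {"time": "00:02", "b_total": 15.0, "density": 2.0},
]
print(times_in_range(samples, {"b_total": (10, 20), "density": (5, 10)}))
# prints ['00:01']
```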

g) "The ability to query transparently the catalogues of multiple distributed data archives"
This is the same as feature e) but with the added functionality of being able to automatically apply a query across a number of distributed data archives.

h) "The ability to query the data across multiple distributed data archives"
In the same way that g) was the extension of e) to the case of multiple distributed archives, this is the extension of the feature described in f) to allow queries to be applied to the actual data held in multiple archives. A distinction is made between f) and h) because of the complexity of implementing distributed data operations.

i) "The ability to manipulate and process data remotely prior to download"
Some of the new data sets within the Solar System Research field are expected to be very large. The ability to apply some level of processing at the data source and then just return the results may be much more efficient than downloading the data for local processing. This will be particularly true of large scale statistical studies that may involve processing of a significant fraction of the total archive.

j) "A web portal to access distributed resources from a single web site"
This feature offers simple web-based entry points into the SpaceGRID system. The advantage is that access can be provided to a range of facilities without the need for the user to download and install specialized software.

k) "A Grid server application that allows users to link their own data into SpaceGRID"
Many SSR missions are PI based with the individual instrument teams being responsible for the reduction and dissemination of their data. A Grid server application (perhaps based on existing web based technologies) would allow individual teams to link their local data into the SpaceGRID infrastructure.

l) "A software library that can be used to allow programs access to SpaceGRID facilities"
This feature would allow existing tools and analysis software to be Grid-enabled. Such applications would then have access to the data and resources provided by the SpaceGRID infrastructure. In the simplest case this might just involve the replacement of an open file procedure with a Grid version that would allow access to a remote file.
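The simplest case can be sketched as a wrapper around the ordinary file-open call. Everything here is hypothetical: no such SpaceGRID library is described in these notes, and the "grid" scheme, the handler registry and the example location are assumptions made purely for illustration.

```python
import io
from urllib.parse import urlparse

# Hypothetical registry mapping URL schemes to functions that return a
# file-like object. A real Grid library would register its own handler.
_REMOTE_HANDLERS = {}


def register_handler(scheme, handler):
    """Associate a URL scheme with a function(path, mode) -> file object."""
    _REMOTE_HANDLERS[scheme] = handler


def grid_open(path, mode="r"):
    """Drop-in replacement for open() that also accepts remote locations."""
    scheme = urlparse(path).scheme
    if scheme in _REMOTE_HANDLERS:
        return _REMOTE_HANDLERS[scheme](path, mode)
    return open(path, mode)  # fall back to an ordinary local file


# Stand-in handler for illustration only: it returns canned text
# instead of contacting a server.
register_handler("grid", lambda path, mode: io.StringIO("remote data\n"))

with grid_open("grid://archive.example/cluster/fgm.txt") as f:
    print(f.read())
```

An existing program would then only need to call grid_open where it previously called open; local file names continue to work unchanged.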

m) "An online collaborative working environment – virtual SpaceGRID meeting facility"
A collaborative environment draws together some of the other features described above into a system that can be used for collaborative analysis. This includes features such as the near real time distribution of data and visualization.

7. Your future involvement

This section allows you to provide any further relevant information. It also provides the opportunity for you to suggest possible projects or applications that could be included in the SpaceGRID prototyping activities.

