In the pre-SAMD scenario, if datasets needed intensive analysis beyond the capacity of a desktop PC, researchers had to log in to an Athens-enabled dataset, download the datasets to their PC several times, and then physically take these files to their nearest High Performance Computer.

The SAMD model provides an integrated environment for managing this workflow. Its single sign-on feature gives access to both the data and the compute resources in one environment, saving the time and effort otherwise spent negotiating a potentially confusing technical environment.

The following comparison sets the pre-SAMD model against the Grid model.

Pre-SAMD model

Fig. 1 shows the user having to access resources individually.

  • The diagram (Fig. 1) shows that social scientists need to access resources individually.

  • Without single sign-on to provide standard authentication and authorisation, access to the various datasets and High Performance Computers (HPCs) frequently requires separate login accounts. There may be several usernames and passwords to remember over the whole operation.

  • The distributed resources appear to the user as entirely separate resources. For example, the researcher may have to download some of the data in their office and some from another PC in a separate location. The data may then need to be combined and sent to a large UNIX computer for high-speed analysis. These operations require separate passwords and a basic knowledge of UNIX commands.

  • There is no integrated front-end to these operations, so the social scientist has to learn several different interfaces, including the UNIX command line.

  • Along with the data analysis, this all took the best part of a day.

Grid model

Fig. 2 shows the single sign-on Grid model.

  • The diagram (Fig. 2) shows that social scientists can access multiple resources with a single sign-on.

  • In an e-science Grid, a single sign-on gives you access to all the databanks to which you have an entitlement. For example, this may be a databank at MIMAS to which you are entitled through your university, or an individual registration for the French census at the Institut National de la Statistique et des Études Économiques (INSEE).

  • Distributed resources appear as a single resource. This means users can search across all the databanks in a single search, wherever they are located. The network also includes High Performance Computing facilities for data analysis, and the data can be moved directly from the databank to the HPC.

  • The integrated front-end (the GUI) designed by the SAMD technical team makes searching for data and compute resources much easier.

  • Data collection and analysis time was reduced to less than an hour.
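
The difference between the two models can be illustrated with a small sketch. The Python below is not the SAMD implementation; it is a toy model, assuming a hypothetical entitlement registry and made-up dataset titles, of how one sign-on can carry every entitlement so that a single search covers all the databanks a user is allowed to see. The names `Databank`, `Session`, `sign_on` and `federated_search` are invented for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Databank:
    """A hypothetical data provider holding a list of dataset titles."""
    name: str
    datasets: list[str]


@dataclass
class Session:
    """Result of one sign-on: the user plus the databanks they may use."""
    user: str
    entitlements: set[str] = field(default_factory=set)


# Illustrative registry only. In practice an entitlement would come from the
# user's university or from an individual registration with the provider.
ENTITLEMENTS = {
    "alice": {"MIMAS", "INSEE"},
    "bob": {"MIMAS"},
}

# Placeholder dataset titles, invented for the example.
DATABANKS = [
    Databank("MIMAS", ["Census sample A", "Labour survey A"]),
    Databank("INSEE", ["Census extract B"]),
]


def sign_on(user: str) -> Session:
    """Authenticate once and attach every entitlement to the session."""
    return Session(user=user, entitlements=set(ENTITLEMENTS.get(user, set())))


def federated_search(session: Session, databanks: list[Databank],
                     term: str) -> list[tuple[str, str]]:
    """Search every entitled databank in one call, wherever it is hosted."""
    hits = []
    for bank in databanks:
        if bank.name not in session.entitlements:
            continue  # no entitlement, so this databank is skipped
        for title in bank.datasets:
            if term.lower() in title.lower():
                hits.append((bank.name, title))
    return hits


# One sign-on per user, then one search across all entitled databanks.
alice_hits = federated_search(sign_on("alice"), DATABANKS, "census")
bob_hits = federated_search(sign_on("bob"), DATABANKS, "census")
```

In this sketch the user never handles per-databank passwords: the session created by `sign_on` is the only credential, and the search simply skips any databank outside the user's entitlements. In the pre-SAMD model the equivalent loop would need a separate login step inside it for every resource.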