IDA FAQ

This version of the IDA FAQ is no longer updated. For the up-to-date version, please see http://openscience.fi/ida-faq

  1.    What is the status of the service?
  2.    Can a university or individual faculty also be granted user access to research data in the IDA storage service if
         necessary? How can I access data if a researcher leaves a project?

  3.    Can a researcher or university purchase the services IDA, Etsin, AVAA and PAS from CSC?
  4.    Can a partner from a country outside Finland receive user access to IDA?
  5.    What kind of data is IDA intended for?
  6.    Will data stored in IDA be openly accessible to all? How can data be shared with the services IDA, Etsin and
         AVAA?

  7.    Why is IDA not suitable for storing research data containing sensitive material?
  8.    Why does Windows 7 say my file is too big to transfer to/from Windows 7 network directory?
  9.    Why does the network directory sometimes behave differently from my local directories/disks?
10.    What are the small files with strange names starting with a dot?
11.    Why do some transfers to the network directory end up being 0 sized?
12.    The size of the Windows network directory is smaller than it should be / A file won't fit into Windows network
         directory even if it should

13.    File copying ends before all is copied in Linux/Gnome
14.    The progress of the file transfer seems to vary a lot
15.    Adding metadata fails with Windows imeta command even when the metadata is correct UTF-8
16.    The file size is 0 for a file copied to Linux/davfs2 network directory
17.    A file won't fit into Linux/davfs2 network directory although it should
18.    Linux/Gnome copying a directory from IDA creates just one file
19.    I want to transfer large datasets or multiple files to IDA. How do I speed up the transfer?

 

 

1. What is the status of the service?
IDA has been in production since 2012 and may currently be used to store research data. Once a long-term preservation solution for research data becomes available, the role of IDA may be restructured.

 

2. Can a university or individual faculty also be granted user access to research data in the IDA storage service if necessary? How can I access data if a researcher leaves a project?
The owner of the data determines the user rights. Although ownership does not change, it is highly recommended to agree on the authorship/ownership rights at the beginning of a project, so that all involved parties know what to do if someone leaves the project.

 

3. Can a researcher or university purchase the services IDA, Etsin, AVAA and PAS from CSC?
Under the Act on Public Contracts, higher education institutions may purchase CSC's own services without bidding up to EUR 30,000. However, IDA, Etsin, AVAA and PAS are support services offered by the Ministry of Education and Culture to higher education institutions and are therefore not available for purchase. CSC is the technical provider of TTA services but it cannot grant storage space. The Ministry of Education and Culture decides on the quotas for higher education institutions, which in turn decide how much storage space they grant to their own researchers.

 

4. Can a partner from a country outside Finland receive user access to IDA?
A foreign partner in a research project working within the Finnish research system may be granted user rights, provided that the project is part of a Finnish higher education institution or other project entitled to use IDA. Special usage arrangements (e.g. user federations) will however not be made.

 

5. What kind of data is IDA intended for?
IDA is not intended to be a disk service for heavy-duty computing, because IDA cannot guarantee the speed required for this kind of parallel access. Nor is it ideal to connect IDA directly to an instrument that is constantly pushing data onto the disk. In IDA it is possible to store raw data and new datasets, as well as published data. What is relevant is how IDA is used, not what kind of data is being stored. IDA is not suitable for sensitive data.

 

6. Will data stored in IDA be openly accessible to all? How can data be shared with the services IDA, Etsin and AVAA?
The metadata can be viewed by all users within IDA, but access to the data itself is, by default, restricted to members of the project group. A project can also, if desired, add its own metadata field, but this will also be accessible to all users. Metadata may be added through the SUI option Manage metadata and with the i-command imeta add.

Data stored in IDA may be shared more widely if so wished. The file must be stored in an IDA folder whose metadata field metamodel is set to the value Etsin. The published folder is preset to this metadata setting. Automated harvesting from IDA to Etsin is currently being developed.

Sharing a file through AVAA
The open data publishing platform AVAA enables openly sharing data via IDA and Etsin. Make sure the file is stored in an IDA folder with the correct setting specified above. It's easiest to save the file in the folder named "published". The file to be shared also needs to have the metadata attribute availability set to direct_download. An open HTTP download link for the file is then automatically created in the AVAA service, with the following syntax:

http://avaa.tdata.fi/openida/dl.jsp?pid=<Identifier.series of the IDA file>


e.g. http://avaa.tdata.fi/openida/dl.jsp?pid=urn:nbn:fi:csc-ida2015011501152s
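
If you need links for several files, the syntax above can be applied in a small script. The following is only a sketch in Python: the function name avaa_download_url is hypothetical and not part of IDA or AVAA, and it assumes the identifier is appended unescaped, as in the example link above.

```python
AVAA_DOWNLOAD_BASE = "http://avaa.tdata.fi/openida/dl.jsp"


def avaa_download_url(identifier: str) -> str:
    """Return the open AVAA download link for an IDA file identifier.

    Hypothetical helper: it only applies the link syntax given in this FAQ.
    """
    identifier = identifier.strip()
    if not identifier:
        raise ValueError("the IDA file identifier must not be empty")
    # The identifier is appended as-is, matching the example link in this FAQ.
    return f"{AVAA_DOWNLOAD_BASE}?pid={identifier}"


print(avaa_download_url("urn:nbn:fi:csc-ida2015011501152s"))
```

The link only works if the file's folder and metadata settings described above are in place.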

Sharing dataset information and file through Etsin
The metadata of research datasets is shown to anyone in Etsin, the research data finder. In the future, metadata of data stored in IDA will be shown automatically in Etsin if so specified by the user.

Currently, users wishing to share data via Etsin can do so manually by describing their data in Etsin. If the actual data file can be shared, that can also be done via Etsin. The file and folder metadata must be modified as described above. In the Etsin description, indicate in the section "Use information" that the data is directly downloadable and enter the abovementioned AVAA link in the field "Web address for downloading the dataset".

 

7. Why is IDA not suitable for storing research data containing sensitive material?
IDA is not suitable for storing research data containing sensitive material (sensitive personal data as defined in law http://www.finlex.fi/fi/laki/ajantasa/1999/19990523 or other sensitive material) because command line usage and Cyberduck/iRODS do not encrypt network traffic. The other user interfaces do encrypt network traffic.

For this reason IDA is not at the raised security level according to government regulation https://www.vahtiohje.fi/web/guest/keskeiset-vaatimukset-ict-ympariston-tietoturvallisuuden-toteuttamiseksi but at the basic level.

 

 

8. Why does Windows 7 say my file is too big to transfer to/from Windows 7 network directory?
The default maximum file size for file transfer in Windows 7 WebDAV network drives is 50 MB.

You can raise this limit for a Windows 7 WebDAV network directory by changing a parameter in the Windows registry. You need admin rights to do so, and changing the registry is at the user's own risk.

You can change registry values with, for instance, the regedit program, which is part of the normal Windows 7 distribution. You can start it either by
   1.    opening the Start menu
   2.    typing regedit into the text field
   3.    pressing <enter>

   or by

   1.    opening the directory <startup-disk>\Windows
   2.    double-clicking the regedit program

The parameter that limits the transfer file size is:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\WebClient\Parameters\FileSizeLimitInBytes


The default value is 50000000 (decimal). Changing it to 4294967295 raises the limit to 4 GB.
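
The two numbers are easier to relate when you see where they come from. The sketch below is plain arithmetic in Python and does not touch the registry. It assumes the parameter is stored as a 32-bit DWORD, which is why 4294967295 is the largest value it can take; the helper name as_regedit_hex is hypothetical.

```python
DEFAULT_LIMIT = 50_000_000      # default FileSizeLimitInBytes (decimal), from this FAQ
MAX_LIMIT = 4_294_967_295       # suggested new value, from this FAQ


def as_regedit_hex(value: int) -> str:
    """Format a value as 8 hex digits, as shown when regedit is in hexadecimal mode.

    Assumption: the parameter is a 32-bit DWORD, so it must fit in 32 bits.
    """
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a DWORD must fit in 32 bits")
    return f"{value:08x}"


# 4294967295 is 2**32 - 1, the largest value a 32-bit DWORD can hold.
assert MAX_LIMIT == 2**32 - 1

print("default:", as_regedit_hex(DEFAULT_LIMIT), f"({DEFAULT_LIMIT / 2**20:.1f} MiB)")
print("maximum:", as_regedit_hex(MAX_LIMIT), f"({MAX_LIMIT / 2**30:.2f} GiB)")
```

When entering the value in regedit, check whether the dialog is set to decimal or hexadecimal, since the same digits mean different limits in each mode.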

 

9. Why does the network directory sometimes behave differently from my local directories/disks?
It's because the network directory is not a disk. The network directory just gives you a disk-like user interface for storing files to and retrieving files from IDA.

The normal disk file interface is in use only between the local user interface software (for instance Windows Explorer, OS X Finder, Gnome Nautilus) and the local WebDAV implementation (for instance Windows WebDAV MiniRedirector, Linux davfs2). These WebDAV implementations also cache files more or less successfully.

The protocol between the user's machine and IDA is WebDAV, which is an extension of the HTTP protocol.

[Figure: the basic blocks when using IDA as a network directory]

[Figure: the corresponding blocks of a remote disk service (for instance NFS, iSCSI)]


10. What are the small files with strange names starting with a dot?
They are probably utility files created by the Mac OS X Finder.

Finder stores user interface selections (such as whether the files are listed as icons or as a list) into a file named ".DS_Store". You should prevent the creation of ".DS_Store" files. This can be done in the Terminal application with the command:

defaults write com.apple.desktopservices DSDontWriteNetworkStores true

When Mac OS X stores files into a file system other than the native "HFS Plus", it creates utility files with names starting with "._". For instance, a file named "MyFile" also gets a companion file named "._MyFile". These utility files contain the additional data specific to the Mac file system that cannot be stored in a file on another type of file system. This additional data can be, for instance, the resource fork, Finder flags, file type or creator type.
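
If such files have already ended up among your data, you can list them before deciding what to do with them. The following Python sketch is only an illustration: the function name find_mac_utility_files is hypothetical, and it relies purely on the two naming patterns described above (".DS_Store" and names starting with "._").

```python
import os


def find_mac_utility_files(root: str) -> list[str]:
    """Return paths under root whose names match the Finder utility file patterns."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # ".DS_Store" holds Finder view settings; "._<name>" holds extra Mac file data.
            if name == ".DS_Store" or name.startswith("._"):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

Calling find_mac_utility_files on a local directory returns the matching paths without deleting anything, so you can review the list first.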
 


11. Why do some transfers to the network directory end up being 0 sized?
The actual transfer has failed for some reason. At the beginning of the operation either the user's machine or IDA created a 0 sized placeholder for the file and then failed to remove it.

You can minimize the chance of ending up with 0 sized files after failures in a Windows network directory by disabling file locking in the Windows WebDAV. You need admin rights to change the Windows registry, and changing the registry is at the user's own risk.

You can change registry values with, for instance, the regedit program, which is part of the normal Windows 7 distribution. You can start it either by
   1.    opening the Start menu
   2.    typing regedit into the text field
   3.    pressing <enter>

   or by

   1.    opening the directory <startup-disk>\Windows
   2.    double-clicking the regedit program

The parameter that defines locking is:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\WebClient\Parameters\SupportLocking

The default value is 1, locking enabled. Change it to 0 to disable locking.
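
To see whether a transfer has left 0 sized files behind, you can scan the target directory after copying. The Python sketch below is an assumption-laden illustration, not an IDA tool: the function name find_empty_files is hypothetical, and it cannot tell a failed transfer from a file that was empty to begin with.

```python
from pathlib import Path


def find_empty_files(root: str) -> list[Path]:
    """Return regular files under root whose size is 0 bytes."""
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.stat().st_size == 0
    )
```

Compare the result against your source files: a file that is empty in the network directory but not locally is a candidate for re-transfer.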

 

12. The size of the Windows network directory is smaller than it should be / A file won't fit into Windows network directory even if it should
Windows incorrectly shows the size of the network directory to be the same as the size of the startup disk of the user's machine. It also needs free space on the startup disk equal to the size of the file to be copied to the network directory, so the copy to IDA will work once you create enough free space on your startup disk.
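
You can check the free space before starting a large copy. This is a minimal Python sketch using only the standard library; the function name has_room_for is hypothetical, and staging_dir stands for any directory on the disk that needs the free space (here, the startup disk).

```python
import os
import shutil


def has_room_for(file_path: str, staging_dir: str) -> bool:
    """Return True if the disk holding staging_dir has at least as many
    free bytes as file_path is large."""
    needed = os.path.getsize(file_path)
    free = shutil.disk_usage(staging_dir).free
    return free >= needed
```

The check is only a snapshot: other programs may use up the free space while the copy is running.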

 

13. File copying ends before all is copied in Linux/Gnome
Copying one big file or multiple files in Gnome/Nautilus/gvfs fails with, for instance, the error:

Error in stream protocol: End of stream

This WebDAV implementation uses main memory as a cache and does not free it during a single operation, so all files in the operation need to fit into memory.

You can tackle this by selecting fewer files in one copy operation or by using, for instance, the iput command line copy instead of Nautilus/gvfs.
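
One way to pick "fewer files" systematically is to split the file list into batches by total size and copy one batch at a time. The Python sketch below only plans the batches; it does not copy anything. The function name batch_by_size and the byte budget are assumptions for illustration, since this FAQ does not state how much memory the cache may use.

```python
import os


def batch_by_size(paths, max_batch_bytes):
    """Group paths into consecutive batches whose total size stays within
    max_batch_bytes. A single file larger than the budget gets a batch of its own."""
    batches, current, current_size = [], [], 0
    for path in paths:
        size = os.path.getsize(path)
        # Start a new batch when adding this file would exceed the budget.
        if current and current_size + size > max_batch_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

A file that ends up alone in a batch because it exceeds the budget may still fail in Nautilus/gvfs; for such files the command line copy mentioned above is the safer choice.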

 

14. The progress of the file transfer seems to vary a lot
The user interface does not react correctly when the actual transfer is performed in two steps, first into the local WebDAV cache and then over the network. The illustrations in question "9. Why does the network directory sometimes behave differently from my local directories/disks?" show these steps.

For instance, when copying two big files in Windows 7, this is what typically happens: the transfer seems to progress fast up to the halfway point and then almost stops. After that it progresses faster again, and at the end the last seconds that the interface reports as remaining may actually take a minute. During the fast progress in the beginning Windows is just copying the first file into the local WebDAV cache, and the real transfer over the network has not started at all. Once the real transfer starts, progress seems to almost stop compared to the fast start. This is then repeated with the second file.

 

15. Adding metadata fails with Windows imeta command even when the metadata is correct UTF-8
The error is:

ERROR: rcModAVUMetadata failed with error -806000 CAT_SQL_ERR

This error text may appear for other reasons too.

Windows imeta cannot handle some UTF-8 characters correctly, such as Scandinavian characters (åäöÅÄÖ). It changes them to something else before the data is sent out from the Windows host.

You can work around this problem by using the browser interface for metadata, by running the imeta command on some other OS, or by using the Java-based Windows Jmeta, available on the integration page.
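
Before running imeta on Windows, you can check whether a metadata value contains characters that might be affected. The Python sketch below flags every non-ASCII character; treating all of them as risky is an assumption made here for simplicity, since this FAQ only names the Scandinavian characters as examples. The function name non_ascii_characters is hypothetical.

```python
def non_ascii_characters(value: str) -> list[str]:
    """Return the distinct non-ASCII characters in value, in order of first appearance."""
    seen = []
    for ch in value:
        # Code points above 127 are outside ASCII and need multi-byte UTF-8 encoding.
        if ord(ch) > 127 and ch not in seen:
            seen.append(ch)
    return seen
```

If the returned list is not empty, consider adding that metadata value through one of the alternatives listed above instead.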

 

16. The file size is 0 for a file copied to Linux/davfs2 network directory
Disable locking. The davfs2 configuration file, for instance /etc/davfs2/davfs2.conf, needs the line:

use_locks  0

 

17. A file won't fit into Linux/davfs2 network directory even if it should
davfs2 needs free space in your local mount point equal to the size of the file to be copied to the network directory. The copy to IDA will work once you create enough free space there.

18. Linux/Gnome copying a directory from IDA creates just one file
Some versions of Nautilus (the software that handles network directories by default) may not download directories correctly, and the result is just one file instead of a directory. If you run into this problem, you can transfer files one by one, try a different version of Nautilus, or use for instance gnome-commander or the command line interface.

19. I want to transfer large datasets or multiple files to IDA. How do I speed up the transfer?
It is highly recommended to package your data before transferring it to IDA. Please use these scripts and wrappers to do so: https://github.com/ida-csc and https://github.com/ida-csc/iput_wrapper
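
To illustrate what packaging means here, the sketch below packs a directory into a single tar archive with Python's standard tarfile module. It is a generic example, not the scripts or the iput_wrapper from the repositories linked above, and the function name package_directory is hypothetical.

```python
import os
import tarfile


def package_directory(src_dir: str, archive_path: str) -> str:
    """Pack src_dir into a single uncompressed tar archive and return the archive path."""
    src_dir = os.path.abspath(src_dir)
    with tarfile.open(archive_path, "w") as archive:
        # arcname keeps only the directory's own name, not the full local path.
        archive.add(src_dir, arcname=os.path.basename(src_dir))
    return archive_path
```

The archive should be written outside the directory being packed. The resulting single file can then be transferred in one operation instead of one operation per file.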