Minutes by Akos (thanks!)
Formatted slightly by Paul.
Please send corrections to Paul

Ad-hoc meeting for file event notifications
-------------------------------------------

2008-04-23 10:00-11:00

Participants: Paul Millar (dCache), David Smith (DPM), Luca Magnoni (StoRM),
Shaun De Witt (Castor), Giuseppe Lo Presti (Castor), Ricardo Britoda,
Akos Frohner, Vincent Garonne (ATLAS), Miguel Branco (ATLAS),
Birger Koeblitz (ATLAS), Patrick Fuhrmann (dCache), Tigran Mkrtchyan (dCache),
Flavia Donno (GSSD).

The basic idea is that SEs would gather file state changes and provide
these as events to interested clients, such as central file catalogs or
experiment data management systems. The system would also include
details such as whether the file is online.

Questions:
 - latency, reliability requirements.
 - Implementation details.
 - security/privacy: who should get these events?

Benefits:
 - there would be less polling by the clients

Disadvantage:
 - SE implementations have to provide event notifications

Shaun: why is it less polling than doing an (SRM) 'ls'?
Miguel/Vincent: yes
?: could you really make use of out-of-band events?
Miguel: yes
Shaun: there would be a *lot* of events coming out of Castor!
Birger: DQ2 is able to cope with that.

What happens if a file is lost? How can an SE notify the client about
this loss?
David: it cannot, unless there is a scanning tool comparing file catalogs
  with the SE content
Vincent: that is what is happening today. However, if a file is lost today,
  it is only discovered through errors when the file is accessed.
Shaun: I could send an e-mail...
Akos: is e-mail a bad format, given that there is a fixed format?

Paul (summary): two basic modes of events:
 - error notifications
 - notifications of the events in normal operational mode.
Let's focus on the events and leave the transport for later.

Patrick: you can do an 'ls' today to sync with a catalog; however, in two
  years that will take a week.
Birger: DQ2 uses a daily dump of the Castor disk pool file information
  to build up a full picture of their files, do quotas, space
  management, etc.
Patrick: what about security?
Birger: it is public at the moment
Akos: it should not be a reason to make it public
Miguel: dumps of the filesystem namespace would be a useful starting point.
Patrick: hoping for a standard format, accessible via HTTP
Birger: full dump of only the disk pool is hundred MBs.
(discussion on whether XML would be a bloated format)
Miguel: ideally this could be the same person writing these

Birger: what kind of information would be interesting
  (only disk so far, tape too!)
needed:
 - filename (SURL or logical filename)
 - status (online, nearline, deleted)
wish list:
 - last access (general comment: it won't happen)
 - owner (also group?) --- for accounting/quota
 - which space (space token) it is in
 - size
 - checksum (Vincent asked for it)
Patrick: what should be the checksum algorithm?
Patrick: this is becoming more than the original proposal

Periodicity: per day, per week? --> per day
Giuseppe: dumping the whole DB of ATLAS would mean dumping 5M entries.
  That would not take only half an hour! It should be restricted to a VO
  or even to a subdirectory.

... the idea is to start with DB dumps, maybe move to more frequent
updates, and arrive at event notifications in the end.

Birger: talking about space management
Akos: so they would like to have a quota system?
Birger: they already have something like that
Miguel: quota would be nice in the future, but it is not in the SEs yet
Birger: knowing that, out of 150TB, 100TB is already on disk would help
  in planning the reprocessing
Patrick/Miguel: isn't that what space tokens are for, that the
  experiments actually tell where the file should be?
Birger: by the time reprocessing starts, files might already be archived
  to tape, so one wouldn't know if a file is really on disk
Miguel: ... not sure if this information is really useful
Patrick: providing the service class in dCache doesn't tell whether the
  file is actually on disk or has been migrated to tape at the moment.

TODO:
 - Paul will figure out the format for DB dumps.
 - Patrick will think about the online/nearline status.
 - The discussion on event notifications is postponed.
   --> e-mail discussion with a few examples
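
As a starting point for the e-mail discussion, here is a minimal sketch of what a per-file dump record with the fields listed above might look like, and how two consecutive daily dumps could be compared to derive coarse events. Nothing here was agreed at the meeting: the dump format is still to be defined (see TODO), and the JSON-lines encoding, the names DumpRecord, to_line, from_line and diff_dumps, and the event labels are all assumptions made for illustration. Last access is left out, following the general comment that it won't happen.

```python
import json
from dataclasses import dataclass, asdict
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class DumpRecord:
    """One file entry in a hypothetical daily SE dump."""
    surl: str                          # needed: SURL or logical filename
    status: str                        # needed: "online", "nearline" or "deleted"
    owner: Optional[str] = None        # wish list: for accounting/quota
    space_token: Optional[str] = None  # wish list: which space the file is in
    size: Optional[int] = None         # wish list: size in bytes
    checksum: Optional[str] = None     # wish list: algorithm left open


def to_line(rec: DumpRecord) -> str:
    """Serialise a record as one JSON line (assumed encoding, not agreed)."""
    return json.dumps(asdict(rec), sort_keys=True)


def from_line(line: str) -> DumpRecord:
    """Parse one JSON line back into a record."""
    return DumpRecord(**json.loads(line))


def diff_dumps(old: Iterable[DumpRecord],
               new: Iterable[DumpRecord]) -> List[Tuple[str, str]]:
    """Compare two dumps and return (surl, event) pairs, sorted by SURL.

    The event labels are illustrative: "created", "missing", or
    "<old status>-><new status>" when only the status changed.
    """
    old_by_surl = {r.surl: r for r in old}
    new_by_surl = {r.surl: r for r in new}
    events = []
    for surl in sorted(set(old_by_surl) | set(new_by_surl)):
        before = old_by_surl.get(surl)
        after = new_by_surl.get(surl)
        if before is None:
            events.append((surl, "created"))
        elif after is None:
            events.append((surl, "missing"))
        elif before.status != after.status:
            events.append((surl, before.status + "->" + after.status))
    return events
```

A client holding yesterday's and today's dump could call diff_dumps on the two record lists to learn which files appeared, disappeared, or moved between online and nearline, without issuing an 'ls' per directory.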