Points for Discussion about the BOS-C++ Interface

The BOS-C++ interface presented here is only a first trial. There are quite a few questions that should be discussed before a "production strength" implementation of something based on these ideas should be undertaken.

These are the points that come to my mind. Mail me (Benno List, blist@mail.desy.de) your opinions or further questionable issues!

Style Guide

When many people work on software, a style guide is always a good thing to have, and some steps have already been taken towards a FORTRAN style guide for H1 (e.g. the H1 software note 54 and Stephan Egli's example code in his DST guide).
For a complicated and new language like C++, a style guide is much more important. It should help people to avoid pitfalls, avoid writing non-portable code, and produce reliable and easy-to-read code.
BaBar has already taken steps in that direction: Look at their BABAR Programming Guidelines!
There also exists a much larger style guide from the Ellemtel corporation, though some of its points are debatable because of their strictness. However, it does not cover some issues connected with more modern features of C++ such as exceptions and namespaces.
Here are just a few topics that came to my mind and that should be addressed.

Wrapper files: C or C++?

Thanks mainly to Martine Charlet, a large collection of C wrapper files for H1 software FORTRAN routines already exists. This is surely very valuable.
Nevertheless, one should consider providing special C++ header files, which can then use function overloading and default arguments to provide sensible shortcuts.
Eventually, C and C++ wrappers could be combined in one file by using the predefined __cplusplus macro.
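A combined wrapper header could look roughly like the following minimal sketch. The routine name h1_nlink_c, its signature, and the shortcut name nlink are hypothetical and only serve as illustration; the stand-in function body at the end exists only so that the example links.

```cpp
/* h1wrap.h: sketch of a header usable from both C and C++.
   h1_nlink_c and its signature are hypothetical. */
#ifdef __cplusplus
#include <string>
extern "C" {
#endif

/* Plain C interface, visible to C and C++ alike. */
int h1_nlink_c(const char *name, int number);

#ifdef __cplusplus
}  /* end of extern "C" */

// C++-only shortcuts: a default bank number and a std::string overload.
inline int nlink(const char *name, int number = 0) {
    return h1_nlink_c(name, number);
}
inline int nlink(const std::string &name, int number = 0) {
    return h1_nlink_c(name.c_str(), number);
}
#endif

// Stand-in implementation so that the sketch links; the real routine would
// live in the C wrapper library. It returns a made-up index.
extern "C" int h1_nlink_c(const char *name, int number) {
    return (name[0] != '\0') ? 100 + number : 0;
}
```

A C compiler sees only the plain declaration, while a C++ compiler additionally gets the overloaded shortcuts, so nlink("QTRA") and nlink("QTRA", 0) mean the same thing.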

Using the STL?

The ANSI standard defines a standard template library. I think it's just too powerful to ignore.

Exceptions

Locking mechanism/associated structures/parallel banks

A widespread problem in HEP analyses is that we are offered a large collection of reconstructed objects which are just hypotheses (e.g. track hypotheses, electron candidates).
A kind of flagging and locking mechanism is therefore often needed. In H1 software this is often done by using banks which are parallel to one another.

Syntax of StrBank

In my proposal, a structured bank's elements can be accessed via the "dot" operator, as if the StrBank object was the bank itself:
  struct qtrarow {
  // Conditions for good central track:
    float th_min_c;  // min. theta [degrees] 
    float th_max_c;  // max. theta [degrees] 
    float rs_max_c;  // max. start radius [cm] 
    float re_min_c;  // min. end radius [cm] 
    float rl_min_c;  // min. radial length [cm] 
    float pt_min_c;  // min. p_T [GeV] 
  // and much more... 
  };
  
  void f () {
    StrBank<qtrarow> qtra ("QTRA", 0);
  
    float thet_min_cen = qtra.th_min_c;  // qtra acts like structure
  
    qtrarow cuts = *qtra;                // qtra acts like pointer
  
  } 
So, on the one hand, the object qtra is treated as if it actually contained the members th_min_c, th_max_c, and so on; on the other hand, it acts like a pointer. This is somewhat inconsistent.
The syntax might be more consistent if qtra was always viewed as a sort of pointer:
  void f () {
    StrBank<qtrarow> qtra ("QTRA", 0);
  
    float thet_min_cen = qtra->th_min_c;  // qtra acts like pointer
    //                    ^^ not "." anymore!
  
    qtrarow cuts = *qtra; // stays
  } 
This would also be more consistent with the treatment of tables:
A Table object is syntactically similar to an array, so it's very close to a pointer, and indexing an array is somewhat equivalent to dereferencing a pointer.
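The pointer-like view can be sketched with overloaded operators. This is only an illustration under simplifying assumptions: the bank lookup is left out, the constructors take a pointer to already located data (which is not the interface of the proposal), and qtrarow is cut down to two columns.

```cpp
#include <cstddef>

// Reduced, hypothetical row type with only two columns.
struct qtrarow {
    float th_min_c;
    float th_max_c;
};

// A structured bank viewed consistently as a pointer to one row.
template <class Row>
class StrBank {
public:
    explicit StrBank(Row *data) : data_(data) {}
    bool is_open() const { return data_ != 0; }
    Row *operator->() const { return data_; }   // qtra->th_min_c
    Row &operator*() const { return *data_; }   // *qtra
private:
    Row *data_;
};

// A table viewed as an array of rows; indexing plays the role of dereferencing.
template <class Row>
class Table {
public:
    Table(Row *rows, std::size_t n) : rows_(rows), n_(n) {}
    std::size_t size() const { return n_; }
    Row &operator[](std::size_t i) const { return rows_[i]; }
private:
    Row *rows_;
    std::size_t n_;
};
```

With these definitions qtra->th_min_c, *qtra and table[i].th_min_c all follow the ordinary pointer and array rules, so no access through the dot operator is needed.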

Should bank name and number be kept in the "Bank" object?

Currently, only the index of a bank is stored in the Bank object, not the name or number of the bank the user wanted to open.
Therefore, when a bank was not found on the BOS common, this name and number are not available anymore, e.g. for use in error messages. Also code like this would not be possible:
  void f () {
    StrBank<qtrarow> qtra ("QTRA", 0); // try to open bank
  
    if (!qtra.is_open()) {
       // do something to get bank, e.g. fetch it from database
       qtra.reopen();                  // perform again nlink ("QTRA", 0)
    }
  } 
So the answer is probably "yes". But then further questions arise: should a method name() return the stored name, or take the name from the BOS common?
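A minimal sketch of a Bank that remembers what was requested might look as follows. The BOS common and nlink are replaced by a small in-memory map (fake_bos, fake_nlink), and the accessor names requested_name() and requested_number() are assumptions made for this example; they sidestep the question above by making explicit that the stored values are returned.

```cpp
#include <map>
#include <string>
#include <utility>

// Stand-in for the BOS common: maps (name, number) to a bank index.
// Real code would call the BOS lookup routine instead.
typedef std::map<std::pair<std::string, int>, int> FakeBos;
inline FakeBos &fake_bos() { static FakeBos b; return b; }
inline int fake_nlink(const std::string &name, int number) {
    FakeBos::const_iterator it = fake_bos().find(std::make_pair(name, number));
    return it == fake_bos().end() ? 0 : it->second;   // 0 means "not found"
}

class Bank {
public:
    Bank(const std::string &name, int number)
        : name_(name), number_(number), index_(fake_nlink(name, number)) {}

    bool is_open() const { return index_ != 0; }
    void reopen() { index_ = fake_nlink(name_, number_); }  // repeat the lookup

    // What the user asked for, available even if the bank was not found.
    const std::string &requested_name() const { return name_; }
    int requested_number() const { return number_; }
    int index() const { return index_; }

private:
    std::string name_;
    int number_;
    int index_;
};
```

The cost is one string and one integer per Bank object, in exchange for usable error messages and a reopen() that needs no arguments.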

Should the "row" class contain a default name and number of a bank?

Currently, the user must always provide the name of the bank he wants to open, although using a predefined structure generally implies a certain bank name.
Code like this might be nice:
  void f () {
    Table<dtrarow> dtra; // open "DTRA" bank with number 0
  } 
For this one would need a default bank name (and possibly number) in the class "dtrarow". The static members have to be declared in each row class itself: if they were inherited from a common base class, all row classes would share one name and one number. This could be done the following way:
  // file dstbanks.h
  class dtrarow {
  public:
    static const char name [5];  // default bank name
    static const int number;     // default bank number
  
    float ptinv_tr;
    // and all the rest...
  };
  
This would be used in the Table template:
  // file banks.h
  template <class row>
  class Table: public Bank {
  public:
    Table (): Bank (row::name, row::number) {
      // check miniheader;
    }
  // and all the rest...
  };
  
Then one would have, in a special file, the definitions of name and number:
  // file dstbanks.C
  #include "banks.h"
  #include "dstbanks.h"
  
  const char dtrarow::name [5] = "DTRA";
  const int  dtrarow::number   = 0;
  
This is somewhat inconvenient, but at the moment it has to be done anyway as soon as one wants to use TablePointers, and I see no way to avoid it there.
Also, the above initializations could be automated in exactly the same way as the transition from DDL to structures/classes.

Banks with wrong format

One common case of banks with a "wrong" format is that some new columns have been added to a bank, and one now runs over old data.
In this particular case it might be useful to have a default treatment of banks with too few columns:
If the constructor of a Table observes that columns are missing, it might create a new Table with the right number of columns, use the default constructor of the "row" type to initialize all elements, and then copy the elements present in the table which is too small.
This would not work for pathological cases like DMIS, where the sequence of columns has been changed, but it would work for many other cases.
The alternative would be to leave it to the user to catch the exception thrown in such a case, perform the same steps, and continue. This might also be OK; it makes the user aware of the problem and of the fact that the default solution may not work.
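The default treatment could be sketched as below. The assumptions are that all columns are 32-bit floats stored row by row without padding, that new columns were only appended at the end, and that the row type has a default constructor supplying values for missing columns. The function name widen and the column names theta_tr and new_col are invented for this example.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical row type whose last column was added later.
struct dtrarow {
    float ptinv_tr;
    float theta_tr;
    float new_col;                        // not present in old data
    dtrarow() : ptinv_tr(0.0f), theta_tr(0.0f), new_col(-1.0f) {}
};

// Copy a table stored with only `old_cols` columns per row into full rows.
template <class Row>
std::vector<Row> widen(const float *old_data, std::size_t nrows,
                       std::size_t old_cols) {
    std::vector<Row> out(nrows);          // default constructor fills every column
    const std::size_t bytes = old_cols * sizeof(float);
    if (bytes > sizeof(Row)) return out;  // more columns than expected: give up
    for (std::size_t i = 0; i < nrows; ++i) {
        // Overwrite only the leading columns that exist in the old data.
        std::memcpy(static_cast<void *>(&out[i]), old_data + i * old_cols, bytes);
    }
    return out;
}
```

Columns missing in the old data keep their default value, here -1, which the user can test for.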

"Free format" list-like banks

Especially in H1SIM, banks are often used that have a list-like structure, i.e. they consist of variable-length rows, each of which starts with the number of elements of the row, followed by a sequence of elements whose layout typically depends on the second element of the row.
The structure of an individual row is thus similar to a union (or a variant record in Pascal) with a selection field.
It might make sense to try to define templates which can take such unions as row descriptors, and do not use miniheaders, but provide singly-linked list type iterators for access to the bank.
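A forward iterator over such a bank might be sketched as follows. The layout is an assumption made for this example: the bank is an array of 32-bit words, word 0 of each row is the row length in words including the length word itself, and word 1 is the selection field. The class name ListBankIterator is hypothetical, and the mapping of the payload onto a union is left out.

```cpp
// Singly-linked-list style iterator over variable-length rows.
class ListBankIterator {
public:
    ListBankIterator(const int *pos, const int *end) : pos_(pos), end_(end) {}

    // Stop at the end of the bank or at a row with a nonsensical length.
    bool at_end() const { return pos_ >= end_ || pos_[0] <= 0; }

    int length() const { return pos_[0]; }       // words in this row
    int selector() const { return pos_[1]; }     // decides the row layout
    const int *payload() const { return pos_ + 2; }

    // Advance to the next row by skipping the current row's length.
    ListBankIterator &operator++() { pos_ += pos_[0]; return *this; }

private:
    const int *pos_;
    const int *end_;
};
```

A loop then reads for (ListBankIterator it(first, last); !it.at_end(); ++it), with a switch on it.selector() to interpret it.payload().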

Use of int32 and float32

My implementation of Banks generally assumes that int and float are 32-bit numbers. Probably it would be better to define
  typedef int int32;
  typedef float float32;
  
and use these types wherever one relies on this assumption. Otherwise the transition to a 64 bit compiler could mean lots of code rewriting!
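The assumption can also be checked by the compiler. The following sketch uses static_assert, which requires C++11 and was therefore not available when this page was written; on such compilers the standard header <cstdint> also offers std::int32_t as a ready-made 32-bit integer type.

```cpp
#include <climits>

typedef int   int32;
typedef float float32;

// Compilation fails on a platform where the 32-bit assumption does not hold.
static_assert(sizeof(int32) * CHAR_BIT == 32, "int32 must be 32 bits wide");
static_assert(sizeof(float32) * CHAR_BIT == 32, "float32 must be 32 bits wide");
```

On a platform with a different int size only the two typedefs would have to change, and the assertions would show immediately whether the new choice is right.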

Garbage collection

When FORTRAN modules are called from the C++ program, garbage collection cannot be controlled or detected automatically by the C++ program. After a garbage collection, all bank indices potentially point to the wrong location.
A similar situation arises from dropping a bank.
Several "solutions" to this problem may be considered:

Work banks

Currently, work banks are not used (and probably not usable) in the proposed scheme. An important hurdle is that BOS takes the address of a workbank pointer and changes its value during garbage collection. Therefore workbank indices must be stored in common blocks.
The C++ solution would be to have a workbank class that keeps and manages a static array of workbank indices. This probably adds little overhead, but it requires quite some thought on how to do it. The STL standard container classes are probably very useful for that.
Probably the best would be to have a base class "WorkObject" which could be inherited by "WorkBank", "WorkStrBank", and "WorkTable".
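One way to keep the indices at fixed addresses is sketched below, using a plain static array rather than an STL container. A std::vector would need care here, since it may move its elements when it grows, which would invalidate any address already handed to BOS. The pool size of 64 and the member names are arbitrary choices for this example, and the actual BOS calls are left out.

```cpp
// Base class that hands out work bank index slots from a static array.
// The slots never move, so their addresses can safely be given to BOS.
class WorkObject {
public:
    enum { capacity = 64 };                        // arbitrary pool size

    WorkObject() : slot_(acquire()) {}
    ~WorkObject() {
        if (slot_) {
            *slot_ = 0;
            used()[slot_ - pool()] = false;        // give the slot back
        }
    }

    bool valid() const { return slot_ != 0; }      // false if the pool is full
    int *index_address() const { return slot_; }   // address to pass to BOS
    int index() const { return slot_ ? *slot_ : 0; }

private:
    WorkObject(const WorkObject &);                // not copyable:
    WorkObject &operator=(const WorkObject &);     // one owner per slot

    static int *pool() { static int p[capacity] = {0}; return p; }
    static bool *used() { static bool u[capacity] = {false}; return u; }
    static int *acquire() {
        for (int i = 0; i < capacity; ++i) {
            if (!used()[i]) { used()[i] = true; return pool() + i; }
        }
        return 0;                                  // no free slot left
    }

    int *slot_;
};
```

Classes such as WorkBank or WorkTable would derive from WorkObject and always read the current index through index(), so a value changed by BOS during garbage collection is picked up automatically.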

Database banks

Wouldn't it be nice to have a "database bank" class which issues a "UGTBNK" for you every time you want to open the bank?
It would be nice to have some sort of base class "DatabaseBased" which can be inherited by "DatabaseBank", "DatabaseStrBank", and "DatabaseTable".