Questions and answers about SOM

1.1: Are there plans for debugging SOM objects?

response date: 5/30/92

Although SOM objects are in general multi-lingual (i.e., the instance variables and method procedures encapsulated by an object may be implemented by different languages), the newly introduced instance variables and methods for each derived class of SOM objects are always implemented by a single language, and the API for class creation can include specification of this language. Thus, the information necessary to invoke the correct debugger for tracing through individual methods can be made available. This suggests the possibility of providing a framework within which different debuggers can be dynamically called to trace method calls as execution proceeds.

Currently, we use C language debuggers for debugging SOM classes implemented using C. As they become available, we may use C++ debuggers for SOM classes implemented using C++.

1.2: Does SOM support garbage collection?

response date: 5/30/92

This is an area of some importance, and we may work on this in the future.

Garbage collection in a multi-language environment is a hard problem, though. Even the single language case is not straightforward; take the problem of automatic garbage collection for C language programs, for example.

But it is possible that enhancements to the current SOM API might be able to provide information upon which a GC capability could be based.

1.3: If SOM doesn't provide GC, then is there SOM support for freeing objects when multiple threads have access to them?

response date: 5/30/92

A service or thread that creates an object (which is given to others but which may later be "deleted") has the responsibility of defining an appropriate protocol for client registration to support the object's use. SOM doesn't provide any specific support for this. Critical regions, for example, are provided by the usual operating system interfaces, not through any primitive SOM objects or classes.

It should be possible to provide such facilities through SOM classes, allowing inheritance to provide uniform support of protocols. Although such classes are not provided by the basic SOM runtime, they can be provided as application frameworks.
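One possible shape for such a registration protocol is a client count: each client registers before it keeps a pointer to the object, releases when it is done, and the last release frees the object. The following is only a sketch of that idea and is not part of SOM. The names SharedObj, shared_create, shared_register and shared_release are hypothetical, and C11 atomics stand in for whatever critical-region facility the operating system provides.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical shared object; not a SOM type. */
typedef struct SharedObj {
    atomic_int clients;   /* number of registered users, creator included */
    int payload;          /* stand-in for the object's real state */
} SharedObj;

/* The creating thread starts out as the only registered client. */
SharedObj *shared_create(int payload) {
    SharedObj *obj = malloc(sizeof *obj);
    if (obj == NULL) return NULL;
    atomic_init(&obj->clients, 1);
    obj->payload = payload;
    return obj;
}

/* A client registers before it keeps a pointer to the object. */
void shared_register(SharedObj *obj) {
    atomic_fetch_add(&obj->clients, 1);
}

/* A client releases when done. Returns 1 if this call freed the object. */
int shared_release(SharedObj *obj) {
    /* atomic_fetch_sub returns the value held before the decrement. */
    if (atomic_fetch_sub(&obj->clients, 1) == 1) {
        free(obj);
        return 1;
    }
    return 0;
}
```

With this protocol the creator calls shared_create, every other thread calls shared_register when it is handed the pointer, and each party calls shared_release exactly once. Whichever release happens last frees the storage.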

1.4: What are the Costs/Benefits of runtime class objects?

response date: 5/30/92

The cost is minimal, since there will be only one class object, no matter how many of its instances exist.

The benefits are great, since the class object can respond dynamically to requests concerning object and class information that is only known at runtime. In addition, specializations of SOMClass can create metaclasses that provide additional facilities, or override SOMClass methods to provide tailored behavior for classes of objects.

Since class objects are created at runtime, it is even possible to support adaptable systems in which the desired classes of objects may be discovered or may change as the system evolves. In advanced, interactive systems, the ability to create "new" kinds of objects in response to changing circumstances can be useful.

1.5: What are the Costs/Benefits of always requiring derivation from SOMObject class?

response date: 5/30/92

There is no cost, since SOMObject introduces no instance variables.

But there are many benefits, since this allows us to provide important functionality for all SOM objects.

As an example of useful functionality, every SOM object knows its class, and can return its class object. Also, SOMObject methods allow introducing defaults for routines (like dumpSelf) that can be specialized by individual classes. There are other, deeper examples as well, related to support for dynamic method dispatching.
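A rough model of these two points, in plain C rather than in the SOM API: every instance carries a pointer to a single shared class object, and a default dump routine supplied at the root can be replaced by a derived class. All names here (Class, Object, get_class, dump_self, Point, Plain) are hypothetical, and the layout is an assumption made for illustration, not the layout SOM uses.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct Class Class;
typedef struct Object Object;

/* One class object per class, shared by all of its instances. */
struct Class {
    const char *name;
    const Class *parent;
    size_t instance_size;
    /* Writes a description of obj into buf and returns buf. */
    const char *(*dump_self)(const Object *obj, char *buf, size_t len);
};

/* The root introduces no variables of its own; the single hidden pointer
   is how an object finds its class object. */
struct Object {
    const Class *cls;
};

/* Default dump routine, available to every class. */
static const char *object_dump(const Object *obj, char *buf, size_t len) {
    snprintf(buf, len, "<%s instance>", obj->cls->name);
    return buf;
}
const Class ObjectClass = { "Object", NULL, sizeof(Object), object_dump };

/* A derived class that specializes the dump routine. */
typedef struct { Object base; int x, y; } Point;
static const char *point_dump(const Object *obj, char *buf, size_t len) {
    const Point *p = (const Point *)obj;
    snprintf(buf, len, "<Point %d,%d>", p->x, p->y);
    return buf;
}
const Class PointClass = { "Point", &ObjectClass, sizeof(Point), point_dump };

/* A derived class that simply inherits the default. */
typedef struct { Object base; } Plain;
const Class PlainClass = { "Plain", &ObjectClass, sizeof(Plain), object_dump };

/* Generic instance creation, driven entirely by the class object. */
Object *new_instance(const Class *cls) {
    Object *obj = calloc(1, cls->instance_size);
    if (obj != NULL) obj->cls = cls;
    return obj;
}

/* Analogue of asking an object for its class. */
const Class *get_class(const Object *obj) { return obj->cls; }
```

Creating a thousand Point instances still leaves exactly one PointClass in memory, which is the sense in which the cost of class objects does not grow with the number of instances.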

1.6: What SOM Class libraries are there?

response date: 5/30/92

Currently the Workplace Shell and the SOM run-time classes are available for use on OS/2. Various additional class libraries are planned for the future that cover topics such as replication, persistence, event management, GUI, CORBA Interface Repository, foundation classes, etc.

2.1: What are the SOM method resolution mechanisms?

response date: 5/30/92

Offset resolution is based on knowing a SOM object type that includes the desired method, and using this knowledge to access a corresponding "method token" from a global data structure. The object and the method token are passed to somResolve (the SOM procedure for offset method resolution), and the appropriate procedure pointer is returned. This pointer is then used to invoke the procedure and perform the method. Language bindings based on static information declared using OIDL are used to hide the details of this procedure.
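The mechanism can be modelled in a few lines of plain C. This is a sketch under assumptions, not the SOM runtime: sketch_resolve stands in for somResolve, CounterClassData stands in for the global structure that holds method tokens, and the Counter class with its two macros is a hypothetical example of what generated bindings might hide.

```c
#include <stddef.h>

typedef void (*Proc)(void);     /* generic procedure pointer */
typedef size_t MethodToken;     /* a "logical offset" into a method table */

typedef struct Class { Proc *mtab; size_t nmethods; } Class;
typedef struct Object { const Class *cls; } Object;

typedef struct { Object base; int value; } Counter;

/* Global structure holding the tokens for methods introduced by Counter.
   It is filled in statically here; a runtime could equally fill it in
   when the class object is created. */
struct { MethodToken increment; MethodToken get; } CounterClassData = { 0, 1 };

/* Method procedures for the Counter class. */
static void counter_increment(Counter *self, int by) { self->value += by; }
static int  counter_get(Counter *self) { return self->value; }

static Proc counter_mtab[] = { (Proc)counter_increment, (Proc)counter_get };
const Class CounterClass = { counter_mtab, 2 };

/* Stand-in for offset resolution: object + token -> procedure pointer. */
Proc sketch_resolve(const Object *obj, MethodToken tok) {
    if (tok >= obj->cls->nmethods) return NULL;
    return obj->cls->mtab[tok];
}

/* What binding macros might hide: fetch the token from the global
   structure, resolve it, cast to the statically known signature, call. */
#define Counter_increment(o, by) \
    (((void (*)(Counter *, int)) \
      sketch_resolve((Object *)(o), CounterClassData.increment))((o), (by)))
#define Counter_get(o) \
    (((int (*)(Counter *)) \
      sketch_resolve((Object *)(o), CounterClassData.get))((o)))
```

The caller needs a static declaration for Counter (here, the macros and CounterClassData) but never names the method procedures themselves, so a subclass can substitute its own table entries without the caller being recompiled.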

Name resolution is based on knowing the arguments for a desired method. The method name is passed to the somFindMethod method of the object's class object, and the result is the appropriate procedure pointer. Name resolution is available for any SOM object, irrespective of its class. This is because offset resolution can be used on any object to find its class object (the somGetClass method is in the SOMObject type), and can then be used on the resulting class object to do name resolution (the somFindMethod method is in the SOMClass type).
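A minimal model of this lookup in plain C follows. It is a sketch, not the SOM API: get_class and find_method are hypothetical stand-ins for somGetClass and somFindMethod, the real signatures of which are not reproduced here, and Shape is an invented class for which the client is assumed to have no static type.

```c
#include <stddef.h>
#include <string.h>

typedef void (*Proc)(void);     /* generic procedure pointer */

typedef struct { const char *name; Proc proc; } MethodEntry;
typedef struct Class { const MethodEntry *methods; size_t nmethods; } Class;
typedef struct Object { const Class *cls; } Object;

/* Available for any object: the root type guarantees a class pointer. */
const Class *get_class(const Object *obj) { return obj->cls; }

/* Ask the class object for a method by name.
   Returns NULL when the class does not support the method. */
Proc find_method(const Class *cls, const char *name) {
    for (size_t i = 0; i < cls->nmethods; i++)
        if (strcmp(cls->methods[i].name, name) == 0)
            return cls->methods[i].proc;
    return NULL;
}

/* A class the client has no static declaration for. */
typedef struct { Object base; int sides; } Shape;
static int shape_sides(Object *self) { return ((Shape *)self)->sides; }
static const MethodEntry shape_methods[] = { { "sides", (Proc)shape_sides } };
const Class ShapeClass = { shape_methods, 1 };

/* Client code: knows only "some object" and the signature int(Object *).
   Returns -1 if the object does not support the named method. */
int call_int_method(Object *obj, const char *name) {
    Proc p = find_method(get_class(obj), name);
    if (p == NULL) return -1;
    return ((int (*)(Object *))p)(obj);
}
```

The client still has to know the argument list in order to write the cast in call_int_method, which is exactly the assumption that name resolution makes.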

The two methods above assume that it is possible to statically construct within a program a procedure call that applies the "resolved" procedure pointer to the appropriate arguments - either explicitly, within code written by the programmer, or implicitly, within "language bindings." If the programming language does not support procedure pointers, or the appropriate arguments are not known until runtime, then dispatch resolution can be used.

Dispatch resolution accepts as parameters a method name and a dynamically constructed argument list (class objects can be used to discover what the appropriate arguments for a method call are), and is available as an offset method call on all objects (since the dispatch methods are included in the SOMObject type).

The different kinds of method resolution techniques thus correspond to successively greater uncertainty about the class of an object and the methods it supports.

In the case of offset resolution, we have an OIDL declaration for a class that includes the desired method.

In the case of name resolution, we don't have a static type for the object more specific than, often, SOMObject. But knowing only this type, we are able to use offset resolution to access the object's class, check whether the object supports a desired method (whose arguments we know), and then call the appropriate procedure.

In the case of dispatch resolution, we can't use offset resolution because we don't have an appropriate type for the object, and we can't use name resolution because we don't know what the appropriate arguments are. But, as described above, the information required for dispatch resolution is always available dynamically.

response date: 5/30/92

Name Resolution method:

SOMClass has a method called "somFindMethod". By using this interface it is possible to get a method pointer if the object supports the specified method. This method pointer can be used to make a dynamic method call. It is a useful mechanism for optimization. Another application is that if the only information about the method call is its signature, we could use this to make the method call.

Dispatch Resolution:

This mechanism in SOM was created to support dynamic languages. SOMObject provides "somDispatch" interfaces which can be overridden by providing dynamic-language supported "somDispatch" routines. The "somDispatch" interfaces accept a variable argument list for the method parameters. Dynamic languages can construct a variable argument list and use this interface to send messages to SOM objects. (We use this mechanism to support subclassing of SOM classes in OREXX, Object-Oriented REXX, and vice versa.)
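The general idea of a dispatcher that takes a method name and a variable argument list can be sketched in plain C with stdarg. This is an illustration only: sketch_dispatch, DispatchFn and the Accumulator class are hypothetical names, not the SOM interfaces, and the argument list is built at the call site by the C varargs mechanism rather than by a dynamic-language interpreter.

```c
#include <stdarg.h>
#include <string.h>

typedef struct Object Object;

/* Every object carries a dispatch routine taking a name and a va_list. */
typedef long (*DispatchFn)(Object *self, const char *method, va_list args);
struct Object { DispatchFn dispatch; };

/* Entry point: a method name plus whatever arguments the caller supplies. */
long sketch_dispatch(Object *obj, const char *method, ...) {
    va_list args;
    va_start(args, method);
    long result = obj->dispatch(obj, method, args);
    va_end(args);
    return result;
}

typedef struct { Object base; long total; } Accumulator;

/* The class's dispatcher decodes the arguments according to the method.
   Returns -1 for an unknown method (ambiguous with a real -1 total; a
   fuller design would report errors separately). */
static long accumulator_dispatch(Object *self, const char *method, va_list args) {
    Accumulator *acc = (Accumulator *)self;
    if (strcmp(method, "add") == 0) {        /* add(long amount) */
        acc->total += va_arg(args, long);
        return acc->total;
    }
    if (strcmp(method, "addTwo") == 0) {     /* addTwo(long a, long b) */
        long a = va_arg(args, long);
        long b = va_arg(args, long);
        acc->total += a + b;
        return acc->total;
    }
    if (strcmp(method, "total") == 0)        /* total() */
        return acc->total;
    return -1;
}

/* Install the dispatcher; a subclass could install a different one. */
void accumulator_init(Accumulator *acc) {
    acc->base.dispatch = accumulator_dispatch;
    acc->total = 0;
}
```

Because the dispatcher is reached through a pointer stored in the object, a class implemented in another language could install its own routine and interpret the name and arguments however that language requires.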

Example program:

student.csc file:

student.c file:

Client program:

Output from this program:

2.2: What might language bindings for C++ look like? Because SOM seems to omit a large subset of C++ constructs, it seems that C++ bindings may be little more than the current C bindings, save maybe the hassle of dealing with the 'this' pointer. Is this true? Will I have to give up exception handling, templates, default method arguments, operator overloading, etc.? Will I just be basically coding in C, but compiling with C++? If these kinds of features will be available under the C++ support, will I have to use convoluted macros, or will I be able to code using the C++ semantics? If a construct is supported under one SOM language binding, will all other language bindings also have to support it? Why not make the SOM OIDL a superset of the functions provided by the popular OO languages, with some neat SOM bonus features?

response date: 5/30/92

SOM is not a programming language, and does not attempt to support all of the features of all object-oriented languages. This is probably impossible and self-contradictory. SOM does provide an extremely flexible and powerful framework for advanced object-oriented systems development. The central focus and contribution of SOM is support for binary distribution of language-neutral class libraries. In contrast, language-centric class libraries are limited in utility (they are useful only from a single language), and require publication of source code (different compilers for the same object-oriented language cannot be guaranteed to use the same object layouts in memory, so a binary class library produced by one compiler will generally be useless to applications developed with a different compiler). SOM solves these problems.

In general, we don't expect that SOM will be used to implement all of the objects and classes important to an application. Many objects created and used by an application will be strictly internal to that application's implementation - with no need for being accessed by multiple languages or applications. Perhaps these internal objects will be SOM objects (in those cases where a SOM class library provides useful functionality), or perhaps they will be objects provided by some other class library (e.g., a Smalltalk or C++ class library), depending on the language used to program the application.

Only objects whose implementation is to be provided or used by non-object-oriented programming languages, or objects intended to be useful across application and language boundaries with upwards binary compatibility, need to be SOM objects. We believe such objects can be valuable; supporting such objects is SOM's objective.

The purpose of C++ bindings for SOM would be to make it easy for a C++ programmer to use SOM objects, and to make it easy for a C++ programmer to define new SOM object classes.

There are many ways of accomplishing these two goals. We have identified at least three different approaches to C++ bindings, which may be characterized as follows: (1) Instantiable (2) Subclassable (3) Compiler-Supported

Instantiable is based on in-line expansion of C++ method calls into SOM offset resolution method calls. The approach requires no C++ instance data, since the pointer to a C++ object that "wraps" a SOM object IS the pointer to the wrapped SOM object.

Subclassable is based on C++ instance data that holds a pointer to a wrapped SOM object, and uses a C++ virtual function table.

Either approach could be supported by a special C++ compiler front-end, which would provide the corresponding functionality, while optimizing performance.

What are the functionalities associated with (1) and (2)?

With (1), Instantiable:

All SOM objects (including class objects) are available to a C++ programmer as C++ objects of a corresponding C++ class. Such SOM objects can be created using the C++ "new" operation on the C++ class name. The SOM class hierarchy is faithfully mirrored in the C++ class hierarchy. As a result, the C++ compiler provides static typechecking for SOM object use. A C++ programmer using the SOM bindings to access SOM objects thus has access to all of C++ in his code. (The important qualification here is the use of the word "object" instead of the word "class" - see the next paragraph).

If it is desired to implement an OIDL-specified class using C++, an OIDL specification for the desired SOM class is provided as input to the SOM compiler, and C++ implementation bindings for the class are generated and used by the C++ programmer. The C++ programmer uses (unrestricted) C++ code to implement procedures supporting the newly-introduced methods of the class. Thus, individual C++ procedures (not C++ methods) are defined to support the methods introduced and overridden by the OIDL-specified class.

Of course, using "this" within such procedures is meaningless. The C++ methods in which "this" is used contain the (inline) SOM offset resolution method calls that are used to invoke the supporting procedures. The primary difference from the C implementation bindings is that the new class can introduce C++ instance variables (including C++ objects), and the method procedures used to implement new and overridden methods can make use of local C++ variables (again, including C++ objects).

Thus, subclassing is done in OIDL - not C++ - and all parent classes must themselves be SOM classes. The C++ classes that mirror SOM classes, and whose instances are actually SOM objects, cannot be subclassed in C++. Only the mirrored SOM classes themselves can be subclassed (using OIDL and some choice of implementation bindings).

With (2), Subclassable:

As with the Instantiable approach, SOM objects (including classes) are available to a C++ programmer as an instance of a C++ class, and the SOM class hierarchy is faithfully mirrored in the C++ class hierarchy. As a result, the C++ compiler provides static typechecking for SOM object use.

But subclassing can now be done in C++. A C++ subclass can be derived (in C++) from the C++ representative of a SOM class, and can introduce new instance variables and methods, and override inherited methods. This allows a new C++ class to be derived from both C++ and SOM classes.

Unfortunately, using language bindings to allow C++ to override inherited SOM methods and pass C++ objects as arguments to SOM methods introduces performance penalties. Some of these can be avoided in a compiler-supported approach.

Using a dual approach, C++ classes can be made available as SOM classes, through an interface that provides the SOM benefits. Again, there are performance tradeoffs.

2.3: How about inter-process method calls?

response date: 5/30/92

This capability is planned. Currently, SOM can be used with shared memory for this purpose (this is not a supported capability, but some users are doing it). But SOM is being enhanced to support the CORBA standards for object method invocation in a distributed open system environment. As an initial step in this direction, we will be supporting calls across address spaces on a single machine. Also, classes being developed as class libraries may provide automated support for replicated objects across address spaces (i.e., one "logical" object; many "physical" copies).

2.4: Will SOM and the WPS be supported on multiple platforms (other than AIX)?

response date: 6/1/92

We're working on getting SOM available on AIX. As for the WPS, we don't know if there are plans or a desire to put it on AIX. As for other platforms for SOM, there don't appear to be any technical barriers.

2.5: What does it take for another language to be supported under SOM? Once a language is supported, does this mean that any standard compiler for that language will do?

response date: 5/30/92

To allow a given language to use and define classes of SOM objects, we decide how to map the types and objects of SOM into this other language. There may be many ways to do this; we decide on one, and then write emitters that automate the process. Then all that is needed is a standard compiler for the language, plus use of the emitters to produce SOM class bindings. This has currently been done for C and C++. We are working on other languages as well.

The above approach assumes that it is possible to make use of the SOM API from within the language for which bindings are desired. The SOM API was designed with the desire to require as little as possible for its use. The SOM API basically requires the ability to make external procedure calls using system-defined linkage, and the ability to construct simple data structures in memory.

Where these capabilities are available at the language level, SOM bindings can be source level; where they are not, SOM bindings will require compiler support, in addition to any source-level bindings that might be used.

2.6: How does SOM implement language independence?

response date: 5/30/92

Discussion of language independence in SOM ultimately relates to the ability of different languages to use SOM objects and define new SOM classes (which, really, is simply a use of SOM objects).

It is not SOM's most direct purpose to allow C++ programmers to invoke virtual functions on Smalltalk objects, or allow C++ to subclass Smalltalk classes. SOM does allow both C++ and Smalltalk (as well as other languages) to use SOM objects and to implement new classes of SOM objects by inheriting and enhancing the implementation of other classes of SOM objects. As a result of this capability, SOM does reduce an order n-squared interface problem to an order n interface problem. In addition, SOM does provide facilities that are useful in this regard.

Multi-language capability is a primary objective for SOM. Achieving this objective has influenced SOM's design in many ways, so there is no single answer to this question. As we produce new language bindings for SOM, we are likely to iterate the SOM design (while maintaining binary compatibility with previous versions) to provide better support for this.

The SOM API for invoking methods on SOM objects is ultimately based on calls to external procedures defined by the SOM runtime, a small set of simple types (including pointers to data and pointers to procedures) and a few simple data structures. Any language with these capabilities can use SOM directly. Where these facilities are not available directly at the language level, it is possible they might be provided by "helper functions," provided by a compiler-specific procedure library, or they might be directly supported by compiler modifications.

When a given language is used to implement a new SOM class, the instance variables that are introduced, and the procedures used to support introduced and overridden methods need not be useful (or comprehensible) to the languages used to implement subsequently derived classes. This is one reason why SOM does not allow inherited access to instance variables. As a result, multi-language objects are possible, in which instance variables and method procedures are contributed by different programming languages. When a method is invoked on a SOM object, the appropriate procedure is called, and this procedure then can directly access only those instance variables that were introduced by the class the procedure supports.

For obvious reasons, we have initially focused on the cases in which the SOM API is useful without compiler-specific support. But as interest in SOM grows, it is likely that we will also be investigating other possibilities. Due to the simplicity of the SOM API, compiler modifications necessary for its support should be straightforward. For example, a compiler vendor for a non-object-oriented language without the ability to directly use the SOM API might consider the ability to use SOM objects a strategic advantage worth the small effort involved in providing this capability through compiler modifications.

3.1: Is there a problem when two unrelated classes introduce methods of the same name, and a client code wants to deal with objects of both classes? What if the methods take a different number of arguments?

response date: 6/10/92

This would not be a problem at all with C++ bindings (since C++ classes provide method scoping). The C bindings for SOM, however, are based on macros, so method "collision" in this case, as a result of different macros being #included into C source that deals with different classes of objects, is something that can happen. When this does happen, however, it is not a serious problem (it is an annoyance, and the following describes how to deal with it).

In the C bindings, corresponding to any given method on a SOM object are two different invocation macros: a "short" form in which the macro name contains only the method name (e.g., _foo for the method foo); and a "long" form, in which the macro name includes both the class that introduced the method and, in addition, the method name. Certainly, if class A and class B both define a foo method, and some client code includes both A.h and B.h, then the short forms of the macros will "collide." This will result in either warning messages or error messages - depending on the particular C compiler being used.

For example, assume that the foo method available on A instances was introduced by the class, Z, an ancestor of A, and that the foo method available on B instances was introduced by the class, B. Then the following code should be used.

The only important thing is this: if you see a warning that a method invocation macro has been redefined, then it is absolutely essential that the long form of the method invocation macros be used. Note that in the above example, the methods take different numbers of arguments, as in your question. The #undef's prevent the warning (or error) messages, and also prevent accidental use of the short form.
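The situation can be reproduced in a single self-contained file. This is a sketch under assumptions, not output of the SOM compiler: the comments mark what the usage bindings A.h and B.h and the client would each contribute, the long-form spellings Z_foo and B_foo are guesses at "introducing class plus method name", and the method procedures and argument lists are invented for illustration.

```c
#include <stddef.h>

typedef struct { int last; } A;   /* instance of class A, a descendant of Z */
typedef struct { int last; } B;   /* instance of the unrelated class B */

/* Stand-ins for the resolved method procedures. */
static int z_foo_proc(A *self, int x)        { return self->last = x; }
static int b_foo_proc(B *self, int x, int y) { return self->last = x + y; }

/* ---- contributed by "A.h": foo was introduced by ancestor Z ---- */
#define Z_foo(obj, x)      z_foo_proc((obj), (x))     /* long form  */
#define _foo               Z_foo                      /* short form */

/* ---- client, right after including A.h ---- */
#undef _foo    /* drop the short form so B.h does not redefine it */

/* ---- contributed by "B.h": foo was introduced by B itself ---- */
#define B_foo(obj, x, y)   b_foo_proc((obj), (x), (y))  /* long form  */
#define _foo               B_foo                        /* short form */

/* ---- client, right after including B.h ---- */
#undef _foo    /* drop it again so it cannot be used by accident */

/* Client code uses only the long forms, one per introducing class. */
int use_both(A *a, B *b) {
    return Z_foo(a, 1) + B_foo(b, 2, 3);
}
```

Without the first #undef, the second definition of _foo would redefine a macro with a different body, which is what produces the warning or error. Without the second, a stray _foo(a, 1) would silently expand to the B form.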

Again, note: the C++ bindings would not have this problem, because they would be based not on macros but on inline C++ method expansions.

3.2: Why do the usage bindings for a class automatically include usage bindings for its parent (which, in turn include the usage bindings for its parent, and so on until SOMObject is reached)? The result is that if a client includes the usage bindings for a given class, then the usage bindings for all of that class's ancestors are included as well.

response date: 6/10/92

First off, this should not be a surprise to anyone familiar with C++, since any client of a C++ class must include the class's structure definition, and defining a class's structure can only be done after defining the structure of the parent classes. So a very similar sort of chaining effect occurs within any code that uses a C++ class.

Why is it necessary in SOM, though?

As in C++, it is required for implementing inheritance of interface. The method invocation macros that support the methods introduced by any given class initially appear in the usage bindings for the introducing class, and these must be "inherited" by the usage bindings for classes derived from this class. Also, actually accomplishing a static method call ultimately requires evaluating an expression that accesses the ClassData structure of the class that introduced the method. And this data structure is defined in the usage bindings for the introducing class. If, for example, the usage bindings for SOMObject were not included by some derived class's usage bindings, then none of the methods introduced by SOMObject would be available statically on instances of the derived class.

3.3: Why isn't access to data inherited?

response date: 6/10/92

First it is necessary to understand that the SOM API does allow direct (uninterpreted, offset-based) access to the instance variables of SOM objects. As a matter of policy, however, the usage bindings for SOM classes do not automatically provide an interface to this capability. Some reasons for this are explained below. However, the SOM API is always available directly to users, so direct access to instance variables is possible if an application is willing to suffer the consequences: changes in implementation of the class whose introduced instance variables are accessed may require recompilation of the application's source code. Within tightly-coupled "friend" classes, this may be reasonable.

SOM simply doesn't provide this facility as a default. Instead, SOM provides a mechanism whereby a SOM class developer decides which instance variables introduced by that class should be visible to clients, and for these instance variables (and only these) is an interface provided by the usage bindings for that class.

The following are two reasons for our approach.

(1) SOM is intended to support arbitrary languages, allowing a subclass to be implemented independently of the language(s) used to implement its ancestor classes. Typically, a class C implemented using language X will introduce instance variables to support its methods. Now, let's say that language Y is used to implement a subclass of C named S. S will also introduce instance variables in support of its methods. Certainly, the code that implements these methods should have direct access to the instance variables that were introduced for its support. But there is no reason whatsoever to expect that the instance variables that make sense to language X (used in support of C's method procedures) will also make sense to language Y (used in support of S's method procedures). Given SOM's desire to support multi-language objects, there seems very little reason for believing that inherited access to instance variables introduced by ancestor classes makes sense in general. Thus, the procedures that implement methods introduced by a class have direct access only to those instance variables introduced by the class.

(2) There is another reason related to binary compatibility. SOM's objective is that changing the implementation for a class of objects should not require re-compiling any code that depends on this class. Code that depends on a class includes both client code, and code that implements subclasses. Now, if the code that implements a subclass were allowed direct access to inherited instance variables, this code would certainly require recompilation (and more likely, redesign) when implementation of an ancestor class changes. SOM does provide access mechanisms for instance variables, but the decision to "publish" an instance variable for access by others is a decision of the class implementor - not the class user. So the class implementor knows what portions of the implementation others could depend on, and what portions can be changed without concern for users. As mentioned above, it is possible to defeat this approach, but it must be done with knowledge of the SOM API - the language bindings (which hide the API) don't provide direct support for it.
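The binary-compatibility argument can be made concrete with a small model in plain C. It is a sketch under assumptions and not the SOM runtime: instance variables are modelled as slots of type long, the position of each class's slots is decided when the class is created at run time, and data_resolve is a hypothetical stand-in for looking up a class's own variables through a data token. The names C_class, S_class, setup and the S_* procedures are invented for illustration.

```c
#include <stdlib.h>

typedef struct Class {
    const struct Class *parent;
    size_t first_slot;    /* data token: where this class's variables start */
    size_t total_slots;   /* slots per instance, ancestors included */
} Class;

typedef struct Object {
    const Class *cls;
    long slots[];         /* variables of all classes, back to back */
} Object;

/* Class creation: the runtime, not the compiler, decides the layout. */
void class_init(Class *cls, const Class *parent, size_t own_slots) {
    cls->parent = parent;
    cls->first_slot = parent ? parent->total_slots : 0;
    cls->total_slots = cls->first_slot + own_slots;
}

Object *object_new(const Class *cls) {
    Object *obj = calloc(1, sizeof *obj + cls->total_slots * sizeof(long));
    if (obj != NULL) obj->cls = cls;
    return obj;
}

/* Object + introducing class -> that class's own variables. */
long *data_resolve(Object *obj, const Class *introducing) {
    return &obj->slots[introducing->first_slot];
}

/* Class C (parent) and its subclass S, both created at run time. */
Class C_class, S_class;

/* S's method procedures touch only the one variable S introduced. */
void S_set_count(Object *self, long n) { data_resolve(self, &S_class)[0] = n; }
long S_get_count(Object *self)         { return data_resolve(self, &S_class)[0]; }

/* c_slots models how many variables C's current implementation uses. */
void setup(size_t c_slots) {
    class_init(&C_class, NULL, c_slots);
    class_init(&S_class, &C_class, 1);
}
```

Calling setup(2) and later setup(5) models a new release of C that uses more instance data. S_set_count and S_get_count contain no compiled-in knowledge of C's size, so they keep working unchanged, which would not be true if S's code reached into C's variables at fixed offsets.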

3.4: Is the interface to published methods a macro or a function?

response date: 6/10/92

Just as with access to instance data, access to the procedure that supports a given method is based on a function call. The two situations are very similar. To access data (using the SOM API directly), one calls somDataResolve passing an object and a data token. To access the procedure that supports a method, one calls somResolve passing an object and a method token. Both data and method tokens can be thought of as "logical" offsets. They provide a very fast way of accessing the desired information, and are statically available via usage bindings. As with data access, usage bindings try to hide the details. In the C bindings, this is done with a macro; in the C++ bindings, this may be done by defining C++ methods as expanded inline. The result is the same, since the same function call is compiled inline in both cases.

In drastic contrast, name lookup first requires getting the class of the object, and then invoking methods on the class to get a method token. This is rather more expensive than simply retrieving a method token from a known, globally available location (which is what the usage bindings do for static method invocations). Once the method token is available, somResolve is used to return a pointer to the procedure that supports the desired method on the object.

3.5: There is a problem with the _New macro that does not arise with the New macro (note no underscore prefix): using the first may create errors, since it assumes that the class object exists.

response date: 6/10/92

The underscore form of the macro is certainly not intended for use by class clients. It is defined within the usage bindings solely to support the non-underscore form, which is the (only) form documented in the SOM user's guide. This could cause a problem, though, since the short form for method invocation starts with an underscore, and a SOM user thus might forget and use the underscore when requesting instance creation. This has been fixed. (I.e., the underscore form of the macro is no longer defined.) In general, the macro-based C usage bindings are open to a number of similar criticisms. The C++ bindings would not suffer from this problem, since they would not be macro-based.
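The difference between the two forms can be illustrated with a small model in plain C. This is a sketch, not the generated SOM bindings: Widget is an invented class, WidgetNewClass and class_new_instance are hypothetical helpers, and Widget_unchecked_new stands in for the kind of macro that assumes the class object has already been created.

```c
#include <stdlib.h>

typedef struct Class { const char *name; size_t instance_size; } Class;
typedef struct Object { const Class *cls; } Object;

/* Class object pointer; NULL until the class has been created. */
static Class *WidgetClassObject = NULL;

/* Creates the class object on first use and returns it thereafter. */
Class *WidgetNewClass(void) {
    static Class storage;
    if (WidgetClassObject == NULL) {
        storage.name = "Widget";
        storage.instance_size = sizeof(Object);
        WidgetClassObject = &storage;
    }
    return WidgetClassObject;
}

/* Instance creation as an operation on an existing class object. */
Object *class_new_instance(const Class *cls) {
    Object *obj = calloc(1, cls->instance_size);
    if (obj != NULL) obj->cls = cls;
    return obj;
}

/* Unsafe form: dereferences a null pointer if the class object does
   not exist yet. Shown only for contrast. */
#define Widget_unchecked_new() class_new_instance(WidgetClassObject)

/* Safe form: makes sure the class object exists before using it. */
#define WidgetNew() class_new_instance(WidgetNewClass())

/* Helper for observing whether the class object has been created. */
int widget_class_exists(void) { return WidgetClassObject != NULL; }
```

A client that only ever writes WidgetNew() never has to think about when the class object comes into existence, which is why exposing only that form is the safer binding.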

4.1: Is there a version of SOM that is OMG (CORBA) compliant?

response date: 6/12/92

We are currently working on a CORBA-compliant version of SOM. This will include an IDL compiler (with suitable enhancements to accept SOM Class Implementation details) and all SOM runtime classes in IDL.

5.1: For example, if I had a "C++ binding" for the workplace shell classes, what might a C++ programmer write in order to subclass "WPAbstract"?

response date: 6/30/92


 * Instantiable

First of all, with the "Instantiable" approach, it is not necessary to have a C++ binding for WPAbstract in order to implement a subclass of WPAbstract using C++. All that is needed is an OIDL class description (e.g., the .sc file) for WPAbstract.

Assuming that the .sc file for WPAbstract is available, a C++ programmer would (1) use OIDL to declare a new SOM class whose parent is WPAbstract, (2) generate Instantiable C++ implementation bindings for the resulting subclass, and (3) fill out the resulting default method procedure templates for overridden and newly-introduced methods with C++ code. As can be seen, this is essentially the same approach used for the C implementation bindings.

How is this different from using C implementation bindings to implement a new subclass of WPAbstract? The two primary differences are that the instance variables introduced by the new subclass can include C++ objects, and the method procedures used by the C++ programmer to implement the overridden and newly introduced methods for the new class can make use of local variables that are C++ objects.

Of course, C++ usage bindings for a SOM class should be a great improvement over the C usage bindings, since the C++ usage bindings can be based on inline method expansion instead of macros, resulting in the elimination of any macro "collision" problems of the C bindings. Also a C++ compiler could then type check SOM object usage.


 * Subclassable

With the Subclassable approach, a C++ programmer would use a C++ class representing the SOM class WPAbstract (provided by the Subclassable C++ bindings for WPAbstract) as a base class for declaring and defining (in C++ - not OIDL) a new derived C++ class whose parent is the C++ class provided by the C++ usage bindings for WPAbstract. This new derived C++ class is not a SOM class. However, if it is desired to publish this new subclass as a SOM class (thereby making it available and useful to languages other than C++), then it is possible to do this - using essentially the same technique as was used to create the Subclassable C++ bindings for WPAbstract.

The Subclassable bindings would be more complex and present additional execution-time tradeoffs. We have created prototype emitters to demonstrate that the process of generating these bindings can be automated, and we are currently evaluating tradeoffs.