Updates from March, 2012

  • Andre Pareis 10:24 on 31.03.2012

    Java Like Programming in C++ 

    C++ is a nice and very flexible language, but this comes at the cost that it forces you to think about many programming details before you can even think about solving your actual problem. Examples would be:

    • object oriented or generic programming
    • when to use references, value types and pointers
    • memory management rules
    • casting rules, const correctness, virtual methods
    • STL? boost?

    The list could go on almost endlessly. I am in the same boat as every other C++ programmer, but having grasped some of the look and feel of other languages like Java and Objective-C (in its Cocoa incarnation), or even Qt (as a good example of C++ OO style), I continuously and unconsciously weigh the pros and cons of each. Recently I thought: wouldn’t it be nice to be able to program in C++ but make it look like Java?

    Sure it would! Even if it were only for fun…

    So, I came up with a working code sample which looks quite like Java but is actually C++. I show you the code sample first before I elaborate on any details. First come some classes which you can consider the “framework”:

    #include <cassert>
    #include <iostream>
    #include <sstream>
    #include <string>   // std::string is used by String::Impl
    #include <vector>
    
    
    size_t globalRefCount = 0;
    
    class Object;
    class String;
    std::ostream &operator<<(std::ostream&, Object&);
    std::ostream &operator<<(std::ostream &, const String&);
    
    
    
    class Object {    
    protected:
        
        struct Impl {
            size_t _refCount;
            Impl() : _refCount(0) {}
            virtual ~Impl() {}
        } *data;
        
        inline void retain() {
            if(data) {
                data->_refCount++;
                globalRefCount++;
            }
        }
        inline void release() {
            if(data) {
                data->_refCount--;
                globalRefCount--;
                if(data->_refCount == 0) {
                    delete data;
                    data = 0;
                }
            }
        }
        
        Object(): data(new Impl) {
            retain();
        }
        Object(Impl *imp) : data(imp) {
            retain();
        }
        Object(const Object& other): data(0) {
            operator=(other);
        }
        
    public:
        virtual ~Object() {
            release();
        }
        virtual const char *type() const { return "Object"; }
        
        void operator=(const Object& other) {
            if(data!=other.data) {
                release();
                data = other.data;
                retain();
            }
        }
        
        // make a heap clone of this object for usage in containers
        virtual Object *clone() const {
            return new Object(*this);
        }
        
        virtual String toString() const;
            
    };
    
    
    
    
    class String : public Object {
    protected:
        struct Impl: public Object::Impl {
            std::string str;
        };
    public:
        String(): Object(new Impl) {}
        String(const char *s): Object(new Impl) {
            static_cast<Impl*>(data)->str = s;
        }
        String(const String &other): Object(other) {}
        String operator+(const String& s) const {
            String result;
            static_cast<Impl*>(result.data)->str = static_cast<Impl*>(data)->str;
            static_cast<Impl*>(result.data)->str += static_cast<Impl*>(s.data)->str;
            return result;
        }
        String operator+(const Object& s) const {
            String result;
            static_cast<Impl*>(result.data)->str = static_cast<Impl*>(data)->str;
            static_cast<Impl*>(result.data)->str += static_cast<Impl*>(s.toString().data)->str;
            return result;
        }
        String operator+(long l) const {
            std::ostringstream oss;
            oss << static_cast<Impl*>(data)->str << l;
            String result;
            static_cast<Impl*>(result.data)->str = oss.str();
            return result;
        }
        const char *c_str() const {
            return static_cast<Impl*>(data)->str.c_str();
        }
        bool operator==(const String &other) const {
            return static_cast<Impl*>(data)->str == static_cast<Impl*>(other.data)->str;
        }
        
        // must have's
        const char *type() const { return "String"; }
        Object *clone() const { return new String(*this); }
        
        String toString() const {
            return *this;
        }
    };
    
    std::ostream &operator<<(std::ostream &os, const String& s) {
        os << s.c_str();
        return os;
    }
    
    String Object::toString() const {
        std::ostringstream os;
        os << this->type() << "@" << (void *)this << "[" << (data ? data->_refCount : 0) << "]";
        return String(os.str().c_str());
    }
    
    std::ostream &operator<<(std::ostream &os, Object& o) {
        os << o.toString().c_str();
        return os;
    }
    
    
    
    
    class ClassCastException: public Object {
        struct Impl: public Object::Impl {
            String message;
        };
    public:
        ClassCastException() : Object(new Impl) {}
        ClassCastException(const String& msg) : Object(new Impl) {
            static_cast<Impl*>(data)->message = msg;
        }
        const char *type() const { return "ClassCastException"; }
        Object *clone() const { return new ClassCastException(*this); }
        String message() const {
            return static_cast<Impl*>(data)->message;
        }
        String toString() const {
            return message();
        }
    
    };
    
    
    
    class ArrayList: public Object {
        struct Impl: public Object::Impl {
            std::vector<Object*> _data;
            // The shared Impl owns the clones, so they are deleted exactly once,
            // when the last ArrayList proxy releases it (not once per proxy).
            ~Impl() {
                for (std::vector<Object*>::iterator it = _data.begin(); it!=_data.end(); ++it) {
                    delete *it;
                }
            }
        };
        
    public:
        ArrayList(): Object(new Impl) {}
        void add(const Object& element) {
            static_cast<Impl*>(data)->_data.push_back(element.clone());
        }
        size_t size() const {
            return static_cast<Impl*>(data)->_data.size();
        }
        
        Object &at(size_t index) const {
            return *static_cast<Impl*>(data)->_data.at(index);
        }
        template<class T> const T &at(size_t index) const {
            Object *o = static_cast<Impl*>(data)->_data.at(index);
            T *t = dynamic_cast<T*>(o);
            if(t) return *t;
            throw ClassCastException(o->type());
        }
        
        const char *type() const { return "ArrayList"; }
        Object *clone() const { return new ArrayList(*this); }
    
        String toString() const {
            std::ostringstream oss;
            oss << Object::toString() << "(";
            for (size_t i = 0; i<size(); i++) {
                oss << at(i).toString();
                if(i+1<size()) {
                    oss << ",";
                }
            }
            oss << ")";
            return oss.str().c_str();
        }
    
    };
    
    
    class OutputStream: public Object {
        struct Impl: public Object::Impl {
            std::ostream &stream;
            Impl(std::ostream &os): stream(os) {}
        };
        OutputStream() {}
    public:
        
        OutputStream(std::ostream& os): Object(new Impl(os)) {}
        OutputStream(const OutputStream &other): Object(other) {}
    
        void println(const Object &object) {
            static_cast<Impl*>(data)->stream << object.toString() << std::endl;
        }
        
        const char *type() const { return "OutputStream"; }
        Object *clone() const { return new OutputStream(*this); }
    
    };
    
    

    Now let’s see how to use it in client code. I supply only a main() here but I have some anonymous blocks to show the effects of scoping:

    
    struct system {
        OutputStream out;
        system(): out(std::cout) {}
    };
    
    struct system System;
    
    
    int main (int argc, const char * argv[]) {
        
        assert(globalRefCount==1); // 1 is for System.out
        
        {
            String s;
            assert(globalRefCount==2);
        }
        assert(globalRefCount==1);
        
        
        {
            String s = "Connecting...";
            System.out.println(String("s = ") + s);
            
            String dots = "the dots";
            String t = s + " " + dots + ".";
            System.out.println(String("t = ") + t);
            
            ArrayList l;
            l.add(s);
            assert(l.size() == 1);
            assert(globalRefCount==6);
            
            l.add(t);
            l.add(ArrayList());
            System.out.println(String("l = ") + l);
            assert(globalRefCount==8);
            
            try {
                System.out.println(String("l[0] as String = ") + l.at<String>(0)); // ok
                System.out.println(String("l[1] as String = ") + l.at<String>(1)); // ok
                System.out.println(String("l[2] as String = ") + l.at<String>(2)); // this throws!
                assert(false);
            } catch(const ClassCastException &e) { // catch by const reference to avoid an extra copy
                System.out.println(String("ClassCastException: ") + e);
                l.add(e);
            }
            
            
            // adding to the list in other scope will keep the object valid
            {
                String other = "Created in other scope";
                l.add(other);
            }
            System.out.println(String("l[3] as String = ") + l.at(3));
            assert(l.size()==5);
            assert(l.at<String>(4) == "Created in other scope");
            System.out.println(String("l = ") + l);
        }
        assert(globalRefCount == 1);
        
        return 0;
    }
    

    Pretty much like Java, isn’t it?

    I have written this just as a proof of concept. As such, it is fully working and I like it so far. It could serve as a good starting point for a complete implementation. Here’s the output when executed:

    s = Connecting...
    t = Connecting... the dots.
    l = ArrayList@0x7fff5fbff8c0[1](Connecting...,Connecting... the dots.,ArrayList@0x100100b00[1]())
    l[0] as String = Connecting...
    l[1] as String = Connecting... the dots.
    l[2] as String = ClassCastException: ClassCastException@0x100100cb8[1]
    l[3] as String = ClassCastException@0x100100b20[1]
    l = ArrayList@0x7fff5fbff8c0[1](Connecting...,Connecting... the dots.,ArrayList@0x100100b00[1](),ClassCastException@0x100100b20[1],Created in other scope)
    Program ended with exit code: 0
    

    Fundamental Design

    Java is (with the exception of primitive types like int, double etc. and their array forms) an object-oriented language. Practically every class in Java or Objective-C derives from a common and well-known base class. So there is a base class Object in this example too, along with an equivalent of the platform type String and a sample collection called ArrayList, which holds plain Objects.

    One thing is very important: there is no need for pointers, as memory management is part of the solution! For instance, the ClassCastException that is thrown can be added to the ArrayList without having to worry about leaking memory afterwards. It’s not complete yet though, as the notion of “weak” pointers is still missing (think: ARC), but the strong part is fully working here.

    The whole idea of the implementation is based on the well-known PIMPL idiom, mixed with the idea of a reference-counting smart pointer, but it goes further than both. Firstly, all behavior and data are strictly separated, not just the private members. Each conceptual class like String, ArrayList or ClassCastException has a functional class that implements behavior only and holds no data of its own; it acts as a fully functional proxy to the data. This makes it possible to clone (copy-assign) these proxy objects very cheaply, because they consist of only 2 pointers (data and the vtable pointer). The actual data is implemented in the nested class “Impl”. There is one specialized Impl class for each conceptual class (1:1 mapping). The conceptual classes and the Impl classes thus form two parallel type hierarchies. As a picture says more than 1000 words, here it is:

    The base Impl (Object::Impl) implements smart-pointer-like reference counting (_refCount). I have explicitly added two methods, Object::retain() and Object::release(), to express the similarity to Objective-C’s NSObject, but both are called only internally, during construction, copy construction, assignment and destruction.
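
    To make the counting visible in isolation, here is a minimal sketch of the same retain/release pattern. The Handle class and sharedCountDemo() are hypothetical names invented for this illustration, not part of the code above, and the sketch leaves out the virtual functions and the global counter.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical miniature of the Object/Impl pair: the handle holds a single
// pointer, the shared Impl holds the data and the reference count.
class Handle {
    struct Impl {
        std::size_t refCount;
        int value;
        explicit Impl(int v) : refCount(0), value(v) {}
    };
    Impl *data;

    void retain() { if (data) ++data->refCount; }
    void release() {
        if (data && --data->refCount == 0) delete data;
        data = 0;
    }

public:
    explicit Handle(int v) : data(new Impl(v)) { retain(); }
    Handle(const Handle &other) : data(other.data) { retain(); }
    Handle &operator=(const Handle &other) {
        if (data != other.data) {   // also covers self-assignment
            release();
            data = other.data;
            retain();
        }
        return *this;
    }
    ~Handle() { release(); }

    int value() const { return data->value; }
    std::size_t refCount() const { return data->refCount; }
};

// Returns the count observed while three handles share one Impl.
std::size_t sharedCountDemo() {
    Handle a(42);
    Handle b = a;   // copy construction retains the shared Impl
    Handle c(7);
    c = a;          // assignment releases c's old Impl, then retains a's
    assert(b.value() == 42 && c.value() == 42);
    return a.refCount();
}
```

    Calling sharedCountDemo() yields 3, one count per live handle; when the function returns, the three destructors bring the count back to zero and the Impl is deleted.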

    Conclusion

    I have yet to decide whether an approach like this is generally feasible. What I like is being able to clone good concepts and class library designs from other languages like Java or Objective-C into C++ and to continue coding without having to worry about the aforementioned detailed C++ design decisions that trouble me every day.

    Of course, I would have to implement ARC-style memory management completely before it can be used, otherwise cyclical references would leak. Also, I’d like to mention that I’m fully aware that a coding style like this leads to immediate code bloat. But so does the PIMPL idiom. To mitigate that, I have a flexible code generator at hand which lets me do most of the actual coding in UML. If I had to hand-craft it all, I would definitely think twice or more before traveling down this road…
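
    The cycle problem can be reproduced with any strong reference count. The sketch below uses C++11’s std::shared_ptr and std::weak_ptr as a stand-in for the Object class above (an assumption made for brevity; the post’s code has no weak references yet). StrongNode, WeakNode and both functions are hypothetical names.

```cpp
#include <cassert>
#include <memory>

// Two objects that hold strong references to each other never reach count 0.
struct StrongNode {
    static int alive;
    std::shared_ptr<StrongNode> other;   // strong back reference
    StrongNode() { ++alive; }
    ~StrongNode() { --alive; }
};
int StrongNode::alive = 0;

// Same shape, but the reference is weak and therefore does not retain.
struct WeakNode {
    static int alive;
    std::weak_ptr<WeakNode> other;       // weak back reference
    WeakNode() { ++alive; }
    ~WeakNode() { --alive; }
};
int WeakNode::alive = 0;

// Builds a two-node cycle and returns how many nodes survive the scope.
int leakedAfterStrongCycle() {
    {
        std::shared_ptr<StrongNode> a = std::make_shared<StrongNode>();
        std::shared_ptr<StrongNode> b = std::make_shared<StrongNode>();
        a->other = b;
        b->other = a;
    }   // both handles are gone, yet each node still retains the other
    return StrongNode::alive;
}

int leakedAfterWeakCycle() {
    {
        std::shared_ptr<WeakNode> a = std::make_shared<WeakNode>();
        std::shared_ptr<WeakNode> b = std::make_shared<WeakNode>();
        a->other = b;
        b->other = a;
    }   // weak references do not keep the nodes alive
    return WeakNode::alive;
}
```

    On a first call, leakedAfterStrongCycle() reports two surviving nodes (the leak is deliberate), while leakedAfterWeakCycle() reports none.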

    View the full source: https://gist.github.com/2279561

     
  • Andre Pareis 16:08 on 13.03.2012

    Multiple Inheritance in Objective-C / Core Data vs C++ 

    Multiple inheritance is hard. In fact it is so hard that only very few programming languages support it fully. Objective-C, for instance, limits its support for multiple inheritance to conformance to @protocols. Behavior can only be inherited from one single base class.

    If you need multiple inheritance in Objective-C you have several options to choose from. Most of the time, when you look for answers to the question of how to do multiple inheritance in Objective-C the right way, you will be pointed in one of two directions: #1 don’t use it because it implies a flawed design. #2 do it using composition (via delegates).

    Option #1 I do not like at all. There are cases where MI is very well suited, and just because Objective-C doesn’t support it doesn’t mean it is bad. It just means that the language designers considered it too complicated to implement for the benefit of the few cases where it makes sense.

    For instance, my current use case with a strong need for multiple inheritance support is the UML specification. UML makes heavy use of multiple inheritance, and if you study the UML model you will find that the abstractions in there make a lot of sense, because they eliminate redundancy and the need to explain what’s going on. All those abstractions are basically orthogonal classifications which can be combined in a subclass to express things very precisely and in a type-safe manner.

    So, if you are forced to deal with multiple inheritance in your program you can do so with option #2 in Objective-C. However, in my opinion, this has limitations. I will give you an example: Imagine a model like this:

    Let’s say we map this to the following physical implementation in Objective-C. Here, the greenish elements are Objective-C @protocol and the yellowish elements are Objective-C @class:

    It follows the often heard recommendation to map inheritance using composition. Here, the Class part of an AssociationClass is mapped to a delegate called “theClassImpl”, whereas the Association base class is mapped to plain Objective-C inheritance.

    Suppose now we want to map this structure to CoreData. We need to model NSManagedEntity with NSManagedProperty. CoreData does not work on top of @protocols but on actual @classes. Therefore, we have one physical implementation of the association between Class and Property (owningClass-properties).

    But here comes the big BUT: this can only work if we have full control over the OR mapping! CoreData, on the other hand, does not rely on interfaces but on the actual implementations. That means we must publish the otherwise internal composition mapping of theClassImpl to CoreData. If we then have a client of Class (for instance: Property::owningClass), it will not be possible to downcast a Class obtained from the persistence layer to an AssociationClass. Instead, it would be necessary to navigate backwards from the Class to the actual AssociationClass. But this kind of “alternative” cast cannot be implemented transparently using Objective-C language constructs. An [aClass isKindOfClass:[AssociationClassImpl class]] would yield a technical “NO” and it’s not possible to extend the language to make it yield “YES”.

    Such an MI -> SI mapping scenario can only work if every consumer relies solely on the interfaces and makes no assumptions about internal structures. This would imply that the ORM uses factories instead of instantiating from its meta information. In CoreData, this is not the case.
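
    The failing cast is not specific to Objective-C. The sketch below restates the composition mapping in C++, purely for illustration and under the assumption that dynamic_cast plays the role of isKindOfClass:. The class names follow the model above; canDowncastClassPart() is a hypothetical helper.

```cpp
#include <cassert>

struct Element { virtual ~Element() {} };
struct Association : Element {};
struct Class : Element {};

// Composition mapping: AssociationClass inherits from Association only and
// merely owns its Class part as a member (the "delegate").
struct AssociationClass : Association {
    Class theClassImpl;
};

// A consumer that was handed the Class part (as a persistence layer would)
// tries to get back to the AssociationClass it belongs to.
bool canDowncastClassPart() {
    AssociationClass ac;
    Class *asSeenByConsumer = &ac.theClassImpl;
    // The member is a complete object of type Class, so the run-time type
    // check fails even though it conceptually "is" the AssociationClass.
    return dynamic_cast<AssociationClass*>(asSeenByConsumer) != 0;
}
```

    canDowncastClassPart() returns false: the type system knows nothing about the conceptual relationship, so every consumer would need an explicit back pointer instead of a cast.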

    This is why you can pretty much ignore advice on how to map multiple inheritance at the language level if you don’t also consider the APIs you’re dealing with: in practice, those APIs quickly render the easy-sounding solution useless. In my case, I was forced to implement the model part of the system in C++ because C++ has awesomely good support for multiple inheritance. The problems of mapping multiple inheritance to the microprocessor architecture were solved by Bjarne Stroustrup long ago, when he added the feature to the language. Read here why and how: http://drdobbs.com/184402074

    Here’s how the dreaded diamond from the example above would be implemented in C++:

    #include <vector>   // needed for Class::properties
    
    class Element {
    };
    
    class Association: public virtual Element {
    };
    
    class Class: public virtual Element {
    public:
        std::vector<class Property*> properties;
    };
    
    class Property: public Element {
    public:
        Class *owningClass;
    };
    
    class AssociationClass: public Association, public Class {
    };
    

    A straight 1:1 mapping of the concept to the language. Here, class “AssociationClass” fully inherits Class::properties without the need to implement anything special. It just works. Admittedly, compared to Objective-C, the C++ implementation lacks support for Core Data. But so does multiple inheritance in Objective-C with CoreData! So there is no real difference here.
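
    A quick way to check that claim is the sketch below. It repeats the diamond as structs and adds virtual destructors so that dynamic_cast is available (both are assumptions added for the test); diamondBehavesAsExpected() is a hypothetical helper.

```cpp
#include <cassert>
#include <vector>

struct Element { virtual ~Element() {} };
struct Association : virtual Element {};
struct Property;
struct Class : virtual Element {
    std::vector<Property*> properties;
};
struct Property : Element {
    Class *owningClass;
    Property() : owningClass(0) {}
};
struct AssociationClass : Association, Class {};

bool diamondBehavesAsExpected() {
    AssociationClass ac;
    Property p;

    // Wire the association by hand through the Class interface.
    Class *c = &ac;
    p.owningClass = c;
    c->properties.push_back(&p);

    // The downcast that composition cannot offer works here.
    bool downcastWorks = dynamic_cast<AssociationClass*>(p.owningClass) == &ac;

    // Virtual inheritance leaves exactly one shared Element subobject.
    Element *viaAssociation = static_cast<Association*>(&ac);
    Element *viaClass = static_cast<Class*>(&ac);
    bool singleElement = viaAssociation == viaClass;

    return downcastWorks && singleElement && ac.properties.size() == 1;
}
```

    All three checks hold, so the function returns true: the AssociationClass is reachable from its Class part, shares one Element, and exposes the inherited properties directly.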

    Conclusion

    Multiple inheritance with CoreData is close to impossible except for very simple cases. With C++, despite all its ugliness and controversy, you at least get multiple inheritance in the language itself, usually in a quality of implementation that saves you from wasting time working around the whole concept.

     
  • Andre Pareis 13:40 on 22.07.2011

    Transparent Bi-directional Associations in Play 

    Do you find writing code like this cumbersome and error prone?

    forum1.posts.remove(post);
    post.forum = forum2;
    forum2.posts.add(post);
    

    Wouldn’t it be much easier if you could just write

    post.forum = forum2;
    

    and the system handles all the wiring and rewiring for you? With the benefit of eliminating errors like “detached entity passed to persist” and such?

    I have created a Play module that does just that. Whenever you invoke an operation that changes one part of the association, from whichever side, the module completes the operation on all other affected parts. This includes not only the new target and its opposite reference; it also unlinks the target object from the object it is currently associated with. All hassle-free and safe. It works on @OneToOne, @OneToMany and @ManyToMany associations.

    No dependencies are introduced into your code. The module enhancer works on all properties of your @Entity classes that have a “mappedBy” attribute on the @OneToOne, @OneToMany or @ManyToMany annotation. You do not need to declare anything else. The presence of the module is sufficient.

    You can check it out at https://github.com/pareis/associations and, as soon as it is available, in the Play modules repository.
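
    The module itself works on Java entities; how it does so internally is not shown here. The rewiring logic alone, independent of Play and JPA, could be sketched as follows in C++ (the language used for the other examples on this page). Forum, Post, setForum() and rewiringDemo() are hypothetical names for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Post;

struct Forum {
    std::vector<Post*> posts;   // the "many" side
};

struct Post {
    Forum *forum;               // the "one" side
    Post() : forum(0) {}

    // One call maintains both ends of the association.
    void setForum(Forum *target) {
        if (forum == target) return;
        if (forum) {
            // unlink from the forum currently holding this post
            std::vector<Post*> &old = forum->posts;
            old.erase(std::remove(old.begin(), old.end(), this), old.end());
        }
        forum = target;
        if (forum) forum->posts.push_back(this);   // link on the opposite side
    }
};

bool rewiringDemo() {
    Forum forum1, forum2;
    Post post;
    post.setForum(&forum1);
    post.setForum(&forum2);     // replaces the three manual statements
    return forum1.posts.empty()
        && forum2.posts.size() == 1
        && forum2.posts[0] == &post
        && post.forum == &forum2;
}
```

    After the second call the post is gone from forum1 and present exactly once in forum2, which is the invariant the three manual statements were meant to keep.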

     
  • Andre Pareis 13:44 on 23.02.2010

    Advanced Object Lifecycle in OO Systems applied to implementations in C++ 

    C++ certainly provides sophisticated mechanisms to create and destroy objects via its constructors and destructors. There are, however, certain aspects of these which make them hard to use in advanced OO systems. First and foremost, a C++ constructor implementation lacks one very important feature: polymorphic calls! In a constructor, a call to a virtual function of the object under construction is not a polymorphic call but a call to the method as overridden at the level of the class whose constructor is currently running! This makes it impossible for the more concrete class to give information to the more abstract constructor. Imagine you have a UI element that needs size information during construction, but that size information can only be provided by its specific subclasses; then you will need to implement a 2-phase approach with construction first and use (displaying, in this case) second.
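
    The effect is easy to reproduce. In the sketch below, Widget, Button, sizeHint() and the two helper functions are hypothetical names chosen for the UI example just mentioned.

```cpp
#include <cassert>
#include <string>

struct Widget {
    std::string sizeSeenInConstructor;
    std::string sizeSeenInInit;

    // While this constructor runs, the object is still "only" a Widget,
    // so the call below resolves to Widget::sizeHint.
    Widget() { sizeSeenInConstructor = sizeHint(); }
    virtual ~Widget() {}

    virtual std::string sizeHint() const { return "default"; }

    // Second phase: called on the fully constructed object, so the call
    // is dispatched polymorphically.
    void init() { sizeSeenInInit = sizeHint(); }
};

struct Button : Widget {
    virtual std::string sizeHint() const { return "button"; }
};

std::string constructorView() {
    Button b;
    return b.sizeSeenInConstructor;
}

std::string initView() {
    Button b;
    b.init();
    return b.sizeSeenInInit;
}
```

    constructorView() returns "default" although a Button was created, whereas initView() returns "button": only the second phase sees the override.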

    In contrast to Java, for instance, C++ constructors (before C++11 introduced delegating constructors) also lack the possibility of calling another constructor of the same class. You can only call constructors of base classes. This makes it harder to initialize a large number of members, because the initializers have to be implemented redundantly in every constructor. Without a code generator this is very error-prone.
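
    For readers on a newer compiler: since C++11, which postdates this post, one constructor can delegate to another, which removes the redundancy. A minimal sketch, with Label and its default values as hypothetical examples:

```cpp
#include <cassert>
#include <string>

class Label {
    std::string text_;
    int width_;
    int height_;

public:
    // The only constructor that initializes members.
    Label(const std::string &text, int width, int height)
        : text_(text), width_(width), height_(height) {}

    // C++11 delegating constructors: forward to the constructor above.
    Label() : Label("", 100, 20) {}
    explicit Label(const std::string &text) : Label(text, 100, 20) {}

    const std::string &text() const { return text_; }
    int width() const { return width_; }
    int height() const { return height_; }
};

bool delegationDemo() {
    Label a;
    Label b("ok");
    return a.text().empty() && a.width() == 100 && a.height() == 20
        && b.text() == "ok" && b.width() == 100 && b.height() == 20;
}
```

    Every member is initialized in exactly one place, so adding a member means touching one constructor instead of all of them.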

    There is another problem, which is the rather weird logic of which members get initialized and which do not. A pointer member that no constructor initializes is left uninitialized rather than set to 0. A pointer inside a plain struct member, however, can end up as 0, but only if that struct is value-initialized (for instance via empty parentheses in the initializer list); otherwise it is left uninitialized as well.

    So, a general approach to get around all these problems would be to not use constructors at all and to delegate all responsibilities to some generic virtual methods. Let’s first look at the requirements for an OO lifecycle in general:
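
    The well-defined half of that rule can be shown directly; the other half cannot be tested safely, because reading an uninitialized pointer is undefined behavior. Inner, ValueInit, DefaultInit and valueInitZeroes() are hypothetical names.

```cpp
#include <cassert>

struct Inner {
    int *p;                          // plain struct, no constructor at all
};

struct ValueInit {
    Inner inner;
    int *q;
    // Empty parentheses value-initialize: inner.p and q both become null.
    ValueInit() : inner(), q() {}
};

struct DefaultInit {
    Inner inner;
    int *q;
    // Nothing is mentioned here, so for a local object inner.p and q hold
    // indeterminate values. This sketch never reads them.
    DefaultInit() {}
};

bool valueInitZeroes() {
    ValueInit v;
    return v.inner.p == 0 && v.q == 0;
}
```

    valueInitZeroes() returns true. Whether a DefaultInit object happens to contain zeros depends on where it lives and on luck, which is exactly the kind of detail the lifecycle described next tries to take out of the programmer’s hands.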

    Creation

    • Create all mandatory child objects upon creation of the parent
    • If an instance requiring a parent element is created, the parent is already present

    Transfer Ownership

    • Take a partial tree out of an object tree and connect it to a different object tree (reparenting)

    Deletion

    • Isolate a subtree and destroy it

    These requirements can be translated into a general lifecycle model in OO systems:

    Once allocated via operator new(), objects are in an intermediate state “allocated”, in which they can be attached to a parent object, i.e., the parent becomes known to the child and can be used in the subsequent init() call, which initializes the object structure in its depth. Instead of being deleted via operator delete(), an object is destroyed by the destroy() method. In the destroy() method, all child objects get destroyed and all destroyed objects are returned to the object pool as isolated instances. Only the pool deletes these instances if needed. Instances are never created via operator new() directly; they are always obtained from the pool via the factory method instance() of the class. This factory method creates a new instance if none can be recycled from the pool. Both the default constructor and the destructor are declared private to ensure the sole usage of the factory method.

    This would further lead to the 2 virtual methods init() and destroy() to be implemented on each class level. These methods would have the following behavioral characteristics:

    init()

    • is called only after business parent has been set
    • is the constructor equivalent
    • init() calls super::init() first, just like a constructor
    • init() initializes members of the class
    • init() can be called repeatedly
    • memory allocation is a separated concern
    • init() is virtual
    • init() can call virtual methods

    destroy()

    • is called before business parent is removed
    • is the destructor equivalent
    • destroy() first resets own members
    • destroy() calls super::destroy() second like a destructor
    • destroy() can be called repeatedly
    • memory deallocation is a separated concern
    • destroy() is virtual
    • destroy() can call virtual methods
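
    The lifecycle described above might be sketched as follows. This is one possible reading, not a reference implementation: Node, Folder, attachTo(), recycle(), pooled() and lifecycleDemo() are hypothetical names, the pool is kept per concrete class, and the pool never deletes its instances in this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

class Node {
    Node *parent_;
    std::vector<Node*> children_;
    std::string path_;
protected:
    Node() : parent_(0) {}                  // memory allocation only
    virtual ~Node() {}
    virtual std::string name() const = 0;   // supplied by the concrete class
    virtual void recycle() = 0;             // hands the instance to its pool
public:
    // "allocated" -> attached: the parent is known before init() runs
    void attachTo(Node &parent) {
        parent_ = &parent;
        parent.children_.push_back(this);
    }
    // constructor equivalent: may use the parent and call virtual methods
    virtual void init() {
        path_ = (parent_ ? parent_->path_ + "/" : std::string()) + name();
    }
    // destructor equivalent: isolates the subtree and returns it to the pool
    virtual void destroy() {
        if (parent_) {
            std::vector<Node*> &s = parent_->children_;
            s.erase(std::remove(s.begin(), s.end(), this), s.end());
        }
        std::vector<Node*> children;
        children.swap(children_);
        for (std::size_t i = 0; i < children.size(); ++i) {
            children[i]->parent_ = 0;
            children[i]->destroy();
        }
        path_.clear();
        parent_ = 0;
        recycle();
    }
    const std::string &path() const { return path_; }
};

class Folder : public Node {
    static std::vector<Folder*> &pool() {
        static std::vector<Folder*> p;
        return p;
    }
    std::string name_;
    Folder() {}                             // private: use instance()
    ~Folder() {}
protected:
    std::string name() const { return name_; }
    void recycle() { pool().push_back(this); }
public:
    // factory method: recycle an instance if possible, else allocate one
    static Folder *instance(const std::string &name) {
        Folder *f;
        if (pool().empty()) {
            f = new Folder;
        } else {
            f = pool().back();
            pool().pop_back();
        }
        f->name_ = name;
        return f;
    }
    static std::size_t pooled() { return pool().size(); }
};

std::string lifecycleDemo() {
    Folder *root = Folder::instance("root");
    root->init();
    Folder *child = Folder::instance("child");
    child->attachTo(*root);                 // parent first, then init()
    child->init();
    std::string path = child->path();
    root->destroy();                        // child and root go to the pool
    assert(Folder::pooled() == 2);
    Folder *again = Folder::instance("again");   // recycled, not allocated
    assert(Folder::pooled() == 1);
    again->init();
    again->destroy();
    return path;
}
```

    lifecycleDemo() returns "root/child": init() on the child could use the parent’s state and the virtual name(), which a constructor could not have done, and destroying the root returned both instances to the pool for reuse.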

    Conclusion

    • Constructors/destructors are not needed any more, except for memory allocation
    • Instances are created in the pool only (i.e., in the class)
    • Objects either have a business parent or are referenced by the pool during destroy()
    • Objects are returned to the pool
    • If objects do not have a business parent, then a parent is also not needed for initialization

    These features improve the handling of complex object structures in a meta-driven OO environment applied to C++. In fact, they are so generic that they can easily be adapted to a completely different programming and runtime environment. There are, of course, some drawbacks, chiefly the more complex initialization protocol and the need to implement the object pool.

     