Distributed Data Transfer
Over the last few months we've been putting the finishing touches on our new Distributed Class Library at work. I'm glad to say that we are starting to see some good results.

Anyway, over the next few weeks I'm going to describe some of the problems I ran into making the distributed stuff work, some of my experiences working around them, and hopefully some solutions.

This week I'm going to start with one of the most complicated topics, Distributed Data Transfer. Judging by the number of times I get asked about it, it should provide some food for thought. In a previous tip I showed some statistics for data transfer using DPB. If you have not read them, take the time to do so now, as they are the basis for some of my decisions. (I'll take the time).

So after running those time trials it became obvious that strings were the way to go. Strings do bring a few problems of their own, such as a 60k limit under 16-bit, but it should be rare in transactional systems for the data to exceed that limit. If you need more volume then you can fall back to structures and the boat anchor that comes with them. For high-speed transaction systems you have to go with strings. Fortunately for me we run Windows NT on all of our 12,000+ workstations, so the limit is not a problem.

When we were building the new class library our Prime Directive was that it should make the developer's life easier than using straight PowerBuilder, or at least not more difficult. If it did not, then we redesigned it. So how do you make distributed business entities transfer data without making it more complicated than a simple retrieve statement?

On the Application Server our developers create and connect their datastores to the database in the constructor event of the business entity. Then we gave them a virtual method in which to put their retrieve statements. The retrieval should be encapsulated in its own function for many reasons which I'll get into in another article.

In the overridden Retrieve method the developer retrieves all the data into their datastores using straight simple PowerBuilder retrieves. We used a simple generic argument structure to pass the arguments to the method.
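As a rough illustration of that hook, here is a small sketch in Python rather than PowerScript. The names RetrieveArgs, BusinessEntity, CustomerEntity and FAKE_DB are made up for the example, and an in-memory dictionary stands in for the database, so treat it as a model of the idea and not as our class library.

```python
from dataclasses import dataclass, field


@dataclass
class RetrieveArgs:
    """Hypothetical stand-in for the generic argument structure."""
    values: list = field(default_factory=list)


class BusinessEntity:
    """Illustrative ancestor: owns the datastores and declares the hook."""

    def __init__(self):
        # Plays the role of the constructor event, where the datastores
        # are created and connected to the database.
        self.datastores = {}

    def retrieve(self, args):
        """Virtual method: descendants put their retrieve statements here."""
        raise NotImplementedError


class CustomerEntity(BusinessEntity):
    # Assumption: a fake in-memory table instead of a real database.
    FAKE_DB = {42: [("42", "Acme")], 7: [("7", "Globex")]}

    def retrieve(self, args):
        customer_id = args.values[0]
        self.datastores["Header"] = list(self.FAKE_DB.get(customer_id, []))
        return len(self.datastores["Header"])  # number of rows retrieved
```

The developer only writes the body of retrieve(); everything about moving the rows to the client stays in the ancestor and the services around it.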

The transfer trick comes next. We have a service on the client that calls the retrieve on the server. Because of the PB5 limitation the server cannot send the data to the client, so the client has to pull the data from the server.

The datawindow objects that are used on the server are also used on the client. The client version is painted first, then all the visual controls are removed and the result is saved as the server version. This helps to make the datastore faster and ensures that the result set matches exactly between the client and the server.

To automate the data transfer between the client and the server, the client datawindows are registered with our service in the open event of the window. On the server the datastores are registered with the business entity. This is just a simple unbounded array; we did not use a doubly linked list, as the array does not normally change size dynamically. We could then have just used the datawindow DataObject attribute to match the client and the server, but instead we added an attribute called is_Identity to the Datawindow and Datastore ancestors. This identity is filled out by the developer when the datawindows and datastores are created. You could go with the DataObject name, but adding your own attribute gives you more flexibility and resistance to changes in PowerBuilder. It also makes the code easier to read for future maintenance.
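The registration idea can be modelled in a few lines. This is a Python sketch with invented names (DataBuffer, Registry); only the identity attribute corresponds to something in the text, namely is_Identity.

```python
class DataBuffer:
    """Stand-in for a datawindow or datastore ancestor."""

    def __init__(self, identity, data_object):
        self.identity = identity        # plays the role of is_Identity
        self.data_object = data_object  # plays the role of DataObject
        self.rows = []


class Registry:
    """Hypothetical registration service backed by a plain growable array."""

    def __init__(self):
        self._items = []

    def register(self, buf):
        self._items.append(buf)

    def count(self):
        return len(self._items)

    def identity_at(self, index):
        # 1-based on purpose, to mirror PowerScript array indexing.
        return self._items[index - 1].identity

    def find(self, identity):
        # Linear search is fine: the list is small and rarely changes size.
        for buf in self._items:
            if buf.identity == identity:
                return buf
        return None
```

Because the match is made on the identity, the client and server objects are free to use different DataObject names, for example a visual version on the client and a stripped one on the server.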

To transfer the data the client asks the server how many datastores were retrieved. Then, for one to the number of datastores, it asks for the identity of the datastore, matches the identity with the client datawindow, and asks the server to return the data for that datastore. We use a string return value to transfer the data as this is faster than passing a string by reference. This is because you have to empty the string in the loop before calling the server again, and even passing an empty string still takes longer than passing nothing.

Then we use RETURN datastore.Describe( 'datawindow.data' ) to return the data, and datawindow.ImportString( lbp_Customer.Data( 'Header' ) ) to push the data into the client.

Lastly we must run through the datawindow buffer on the client and call ResetUpdate() to reset the update flags on the rows. Otherwise PowerBuilder will think all the rows are new.
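Put together, the pull side of the transfer looks roughly like the following. This is a Python simulation, not PowerScript: Server, ClientWindow and pull_all are invented names, the status strings are simplified labels, and a tab-separated, one-row-per-line string stands in for what Describe and ImportString exchange.

```python
class Server:
    """Hypothetical proxy for the business entity on the application server."""

    def __init__(self, datastores):
        # identity -> list of rows, each row a list of column strings
        self._stores = datastores
        self._order = list(datastores)

    def datastore_count(self):
        return len(self._order)

    def identity(self, index):
        return self._order[index - 1]  # 1-based, like the loop in the text

    def data(self, identity):
        # The data travels as the return value, not a by-reference argument.
        return "\n".join("\t".join(row) for row in self._stores[identity])


class ClientWindow:
    """Stand-in for a registered client datawindow."""

    def __init__(self):
        self.rows = []
        self.status = []

    def import_string(self, text):
        for line in (text.split("\n") if text else []):
            self.rows.append(line.split("\t"))
            self.status.append("New")  # imported rows look new until reset
        return len(self.rows)

    def reset_update(self):
        self.status = ["NotModified"] * len(self.rows)


def pull_all(server, windows):
    """Client-side loop: ask for the count, then identity and data per store."""
    for i in range(1, server.datastore_count() + 1):
        identity = server.identity(i)
        window = windows.get(identity)
        if window is None:
            continue  # nothing registered on the client for this identity
        window.import_string(server.data(identity))
        window.reset_update()
```

Each pass through the loop makes two calls to the server, one for the identity and one for the data, and nothing is passed in by reference.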

The Return Journey

The return of the data from the client to the server is a similar process. On the client, for one to the number of datawindows registered with the service, we ask the datawindow if it has any changes. If it does, we run through the Delete buffer and pull all the deleted data, then through the Primary buffer looking for new rows, and finally through the Primary buffer again looking for modified rows.

Unfortunately PowerBuilder does not offer an easy way to extract a single row in ImportString format, so we wrote a custom routine. I posted this question on the newsgroup and got some responses, but none were as fast as my custom routine. If anyone knows a fast or easy way, let me know.
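To show the shape of the problem (this is not the routine mentioned above), here is a Python sketch of turning one row into an import-style string. It assumes tab-separated columns with one row per line, and the function names are invented for the example.

```python
def row_to_import_string(row):
    """Join one row's column values with tabs; None becomes an empty field."""
    fields = []
    for value in row:
        fields.append("" if value is None else str(value))
    return "\t".join(fields)


def rows_to_import_string(rows):
    """Join several rows, one per line, into a single transfer string."""
    return "\r\n".join(row_to_import_string(row) for row in rows)
```

The sketch ignores values that themselves contain tabs or line breaks, which a real routine would have to deal with.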

So we extract the data from the client and send it to the server with a flag ( 'I', 'U', 'D' ) to tell it how to apply the data, plus the identity of the datawindow that has the changes. We only send changes back to the server, so the volume of data is quite low. In fact most times there will only be a single call per datawindow, so the overhead of separate calls for each type of modification is not really a problem.
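The client-side collection step can be sketched like this, again in Python with made-up names. The primary buffer is modelled as a list of (status, row) pairs and the delete buffer as a plain list of rows, and the status labels are simplified.

```python
def collect_changes(identity, primary, deleted):
    """Build one (identity, flag, data) batch per kind of change.

    primary: list of (status, row) pairs; deleted: list of rows.
    Rows are lists of column strings.
    """
    def pack(rows):
        return "\n".join("\t".join(row) for row in rows)

    new_rows = [row for status, row in primary if status == "New"]
    modified_rows = [row for status, row in primary if status == "Modified"]

    batches = []
    # Same order as in the text: deletes, then inserts, then updates.
    for flag, rows in (("D", deleted), ("I", new_rows), ("U", modified_rows)):
        if rows:  # unchanged datawindows produce no calls at all
            batches.append((identity, flag, pack(rows)))
    return batches
```

A datawindow where the user edited a single row therefore costs exactly one call, carrying one row.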

On the server, Inserts are handled using ImportString() to add the data to the matching datastore. Now comes the tricky part. To delete and update existing rows we perform the same ImportString() into the datastore, making sure the data lands at the end of the datastore buffer, and we extract the key definition out of the datawindow. Then, for one to the number of rows imported, we construct a find statement based on the imported record and the key definition to uniquely identify the row that existed in the datastore before the import of data from the client. For Deletes we delete the row that we found and use RowsDiscard() to remove the imported row from the buffer. For modifications we discard the old row and mark the imported row as DataModified for the row and all columns. You could check each column for modifications, but we decided this would be too slow for the benefit of sending a little bit less data back to the database server.
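The import-then-find technique is easier to follow in code. Below is a Python model of it, with an invented Datastore class, key columns given as column indexes, and a simple list standing in for the delete buffer. It illustrates the steps described above and is not the PowerScript we run.

```python
class Datastore:
    """Hypothetical server-side buffer keyed on one or more columns."""

    def __init__(self, rows, key_columns):
        self.rows = [list(r) for r in rows]
        self.status = ["NotModified"] * len(self.rows)
        self.key_columns = key_columns  # column indexes that form the key
        self.deleted = []               # stand-in for the delete buffer

    def _find(self, imported, limit):
        """Find the pre-existing row (index below limit) with the same key."""
        key = [imported[c] for c in self.key_columns]
        for i in range(limit):
            if [self.rows[i][c] for c in self.key_columns] == key:
                return i
        return -1

    def apply(self, flag, data):
        imported = [line.split("\t") for line in data.split("\n")]
        boundary = len(self.rows)  # rows below this index pre-date the import
        self.rows += imported      # imported rows land at the end of the buffer
        self.status += ["New"] * len(imported)
        if flag == "I":
            return
        i = boundary
        while i < len(self.rows):
            old = self._find(self.rows[i], boundary)
            stale = None
            if old >= 0:
                stale = self.rows.pop(old)  # take the old copy out
                self.status.pop(old)
                boundary -= 1
                i -= 1
            if flag == "D":
                if stale is not None:
                    self.deleted.append(stale)  # the found row is the one deleted
                self.rows.pop(i)                # discard the imported copy
                self.status.pop(i)
            else:  # "U": keep the imported row and mark it modified
                self.status[i] = "DataModified"
                i += 1
```

In this model an update whose key is not found is simply kept and marked as modified, and a delete whose key is not found is dropped. How to handle those cases is a design decision the text does not cover.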


The actual routines we wrote are even more complicated, as we do not hold any state on the server and thus clear out the server datastores after the initial retrieve. When the client wants to perform an update we re-retrieve the data from the database. This may seem excessive, but it is much better to limit the amount of RAM used by the objects than to save a quick retrieve from the database. After all, the database and server should be connected with a big network connection. Lastly, some people asked why we need all the data back on the server: could we not just pass and process the client updates? Well, that works fine if you have no business logic, but if you do, you need to reconstruct the full result set before you can apply your business logic.
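A small Python sketch shows why the full result set matters. The function handle_update and its parameters are invented for illustration, rows are plain dictionaries keyed on an "id" field, and the business rule is an arbitrary example.

```python
def handle_update(retrieve, changes, business_rule):
    """Stateless update: re-retrieve, apply client changes, then validate.

    retrieve: callable returning fresh rows (dicts with an "id" key).
    changes: list of (flag, row) pairs with flag in 'I', 'U', 'D'.
    business_rule: callable taking the full result set, returning True/False.
    """
    # Nothing is cached between calls; the rows are rebuilt every time.
    rows = {row["id"]: dict(row) for row in retrieve()}
    for flag, row in changes:
        if flag == "D":
            rows.pop(row["id"], None)
        else:  # 'I' and 'U' both leave the client's version in place
            rows[row["id"]] = dict(row)
    result = list(rows.values())
    if not business_rule(result):
        raise ValueError("business rule failed")
    return result
```

With a rule such as "the total quantity on an order may not exceed 10", a single changed row cannot be judged on its own. The rows the client never touched have to be present as well.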



Ken Howe 2011