lwio watch: Week 7 ending February 28, 2009

In the course of building a new software system, there are inflection points when everything comes together. Last week was one such inflection point. Here are the highlights:
– Windows Explorer, dir, Acrobat Reader, and thumbnail views in Explorer all work
– copying files, deleting files, drag and drop, xcopies are all smooth and seamless
– Word now works – we can click on a Word document on a share and we can open the document and edit it
– Multiple connections and multiple large file downloads work smoothly
– We’ve set up an internal file server that hosts over half a terabyte of data (with full ISO OS images and VMs) and are using this as our internal dog food server.
– The MMC share management snap-in and wizard work smoothly; we can point MMC at our Linux server and create, delete, and manage shares on the server.
– The installation is very simple. There is next to nothing to configure or set up. Install the bits and you have a file server available.

The month of March promises to be very interesting for us. I’ll write another post on what we plan to accomplish by the end of March.

Thanks for reading

lwio watch: Week 6 continued (ends today, February 20th, 2009)

It has been a super productive week.

First the highlights. The lwio SMB File Server has advanced hugely. Most operations from the Windows XP command line work. Copying files, xcopying, making directories, net use, net use /del, and single sign-on are all fully operational. In addition, Windows Explorer is almost completely functional. We’ve yet to support server-side file change notifications (i.e., changes to the remote directory are not immediately visible in Windows Explorer; the simple workaround is to hit the refresh button). You can view thumbnails of pictures and photographs, and you can bring up Notepad on a file directly. In all, we have a pretty useful file server.

Today we plan on starting internal dog-fooding. We will be hosting our own server and placing on it the large OS ISO images that we use, so that everyone on the engineering team can upload and download files. We’re putting all our VMs on the server as well and will subject it to a significant amount of stress.

The day is not yet done. It is 8:38 hrs PST out here in Bellevue, WA. We plan to have the file share management work done by end of day today. This will allow us to create file shares on the Likewise SMB File Server via MMC’s Add Share Wizard. We also plan to add a File Management plugin to the Likewise Administrators Console that will allow Linux administrators to graphically manage their own Linux Likewise File Servers from Linux desktops.

It’s been a good week!

lwmsg – an elegant IPC mechanism for building client-server applications

As you may deduce, I’m a huge fan of unmanaged remote procedure call systems. They serve as the building blocks for most distributed systems problems. They provide an easy programming model – synchronous function calls. They have a small memory footprint (unlike managed systems like Java and C#). DCE/RPC and Sun RPC are the two widespread models: the former is the foundational technology of the Windows operating systems, and the latter is at least the basis of the NFS infrastructure.

DCE/RPC on UNIX platforms has been in the wilderness for almost a decade. Hopefully Likewise’s rehabilitated DCE infrastructure can bring about some form of renaissance for an unmanaged, easy-to-use remote procedure call infrastructure. But I digress. I wanted to talk about lwmsg.

In the Windows world of systems programming, user mode clients and servers are built around a particular paradigm. Let’s say we are building the foo client-server system. Foo will have a client DLL – fooclient.dll – that exposes DLL entry points for a calling client application to link to. The calling client program has no

 

When we started on lwis, we wanted to build that Windows-style programming pattern. We could have used DCE/RPC over ncalrpc, but at the time I was worried about a chicken-and-egg effect. We wanted API-style communication with the lsassd daemon, but one of the primary interfaces would be our gss-ntlm client and our gss-ntlm server. The DCE/RPC system would use NTLM as one of its authentication mechanisms, so wouldn’t that mean we would be using DCE/RPC to bootstrap itself? So we decided that we needed an effective authenticated remote procedure call mechanism to handle communication between the lsass client and the lsass server for each of the lsass server interfaces – the UNIX id mapper interface and the GSS NTLM interface.

Initially, we hand-marshalled these IPC constructs, but as the system grew more sophisticated, the number of hand-marshalled functions grew to the point that adding a new function became tedious. So Brian Koropoff, with his extensive background in compiler and language design, decided to build a data-driven (think IDL-like) runtime engine for client-server systems. This is lwmsg.

lwmsg allows you to specify function definitions as request-response messages. A remote procedure call function comprises a request message and a pair of response messages (on SUCCESS and FAILURE). What is so compelling about lwmsg is that you can specify an extensive class of C datatypes to marshal across the wire. Pretty much the entire DCE/RPC IDL can be represented using lwmsg specifications. A calling client-server program specifies its “idl” as a set of global arrays, and the lightweight runtime handles all of the marshalling and unmarshalling for the calling program.

LWIS is using lwmsg in three subsystems – the lsass subsystem, the netlogon subsystem and our newest lsmb subsystem. lwmsg builds very easily, has no dependencies, and we’ve just released it on trunk. It’s licensed under the LGPL, so feel free to give it a whirl.
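
To make the “idl as global arrays” idea concrete, here is a minimal sketch of a data-driven marshaller in C. It is not the lwmsg API: the names field_spec, open_request, marshal and unmarshal, and the wire layout (big-endian 32-bit integers, length-prefixed strings) are all assumptions invented for illustration. The point is only that one generic routine walks a table describing the struct, so adding a new message means adding a table rather than writing more marshalling code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical field kinds for this sketch; not the real lwmsg type specs. */
typedef enum { F_UINT32, F_STRING, F_END } field_kind;

typedef struct {
    field_kind kind;
    size_t offset;  /* where the member lives inside the C struct */
    size_t size;    /* member size in bytes (buffer capacity for strings) */
} field_spec;

/* Request message of a made-up "open" call. */
typedef struct {
    uint32_t flags;
    char path[64];
} open_request;

/* The "idl": a global array that describes open_request field by field. */
static const field_spec open_request_spec[] = {
    { F_UINT32, offsetof(open_request, flags), sizeof(uint32_t) },
    { F_STRING, offsetof(open_request, path), 64 },
    { F_END, 0, 0 }
};

static void put_u32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24); p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);  p[3] = (uint8_t)v;
}

static uint32_t get_u32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Walk the spec and serialize obj into buf.
 * Returns the number of bytes written, or 0 if buf is too small.
 * String members must be NUL-terminated. */
static size_t marshal(const field_spec *spec, const void *obj,
                      uint8_t *buf, size_t cap)
{
    size_t n = 0;
    for (; spec->kind != F_END; spec++) {
        const uint8_t *field = (const uint8_t *)obj + spec->offset;
        if (spec->kind == F_UINT32) {
            uint32_t v;
            if (cap - n < 4) return 0;
            memcpy(&v, field, sizeof v);
            put_u32(buf + n, v);
            n += 4;
        } else {  /* F_STRING: 4-byte length prefix, then the bytes */
            size_t len = strlen((const char *)field);
            if (cap - n < 4 || cap - n - 4 < len) return 0;
            put_u32(buf + n, (uint32_t)len);
            memcpy(buf + n + 4, field, len);
            n += 4 + len;
        }
    }
    return n;
}

/* Walk the same spec to rebuild obj from the wire bytes.
 * Returns 0 on success, -1 on malformed or truncated input. */
static int unmarshal(const field_spec *spec, void *obj,
                     const uint8_t *buf, size_t len)
{
    size_t n = 0;
    for (; spec->kind != F_END; spec++) {
        uint8_t *field = (uint8_t *)obj + spec->offset;
        uint32_t v;
        if (len - n < 4) return -1;
        v = get_u32(buf + n);
        n += 4;
        if (spec->kind == F_UINT32) {
            memcpy(field, &v, sizeof v);
        } else {  /* v is the string length; it must fit the member */
            if (v >= spec->size || len - n < v) return -1;
            memcpy(field, buf + n, v);
            field[v] = '\0';
            n += v;
        }
    }
    return n == len ? 0 : -1;
}
```

A caller would fill an open_request, pass it with open_request_spec to marshal, send the bytes, and have the peer call unmarshal with the same table. A response message pair (success and failure) would simply be two more tables of the same shape.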

A History of MS-RPC and open source equivalents

When people talk of Windows-Linux interoperability, just about everyone forgets the most important piece of technology that is the foundation for Windows-to-Windows interoperability: RPC, or Remote Procedure Calls. In fact, if you ever get hold of the first “Inside Windows NT” book by Helen Custer, you will find that Chuck Lenzmeier, one of the most influential engineers on the NT operating system, makes the exact same assertion: “RPC is one of the most important pieces of technology in Windows NT.”

Microsoft RPC, known as MSRPC, is actually an implementation of the OSF DCE/RPC framework. If I recall correctly, Microsoft licensed the DCE/RPC code base but then rewrote it substantially. In 1993, I was hired at Microsoft to work on the NT print spooler. The print spooler was one of the heavier users of RPC. The key developers on the RPC team were Bharat Shah (who wrote the runtime, rpcrt4.dll) and Vibhas Chandorkar (who wrote the midl compiler).

From Wikipedia:

“DCE/RPC was commissioned by the Open Software Foundation in a “Request for Technology”. One of the key companies that contributed was Apollo Computer, who brought in NCA – “Network Computing Architecture” which became Network Computing System (NCS) and then a major part of DCE/RPC itself. The naming convention for transports that can be designed (as architectural plugins) and then made available to DCE/RPC echoes these origins, e.g. ncacn_np (SMB Named Pipes transport); ncacn_tcp (DCE/RPC over TCP/IP) and ncacn_http to name a small number.

DCE/RPC’s history is such that it’s sometimes cited as an example of design by committee. It is also frequently noted for its complexity, however this complexity is often a result of features that target large distributed systems and which are often unmatched by more modern RPC implementations such as SOAP.”

And since Microsoft hired Paul Leach, one of the founders of Apollo, I suspect Paul brought DCE/RPC into Microsoft.

Recall that Microsoft really did not embrace IP protocols until NT 4.0/Win95, which means circa 1996. The first thing Microsoft did with its RPC implementation was to retrofit it to run on named pipes. The protocol sequences in DCE/RPC were ncacn_ip_tcp and ncadg_ip_udp, but Microsoft added the ncacn_np protocol sequence – connection-oriented semantics over SMB named pipes. Security for RPC over named pipes was done with named pipe transport security.

Named pipe transport security meant NTLM authentication, which was soon roundly trashed in the industry as a weak security mechanism. And that was it – pretty much any NTLM-secured protocol, such as SMB, was also trashed, and by extension MSRPC was hammered too. In general, any traffic on port 139 or port 445 was immediately denounced as badness.

Later versions of MSRPC have full GSS-secured implementations – which means you can do Kerberos and NTLM security over ncacn_ip_tcp.

In the meantime, HP and IBM both ported DCE/RPC to their respective operating systems. HP-UX runs DCE/RPC and so does AIX, at least up to AIX 5.3. However, nobody did an SMB stack, so there is no ncacn_np support on any of these platforms. Sun never supported DCE/RPC – NFS uses Sun RPC, so there was no DCE/RPC support on Solaris.
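
For readers who have not seen these protocol sequence names in use, the following C sketch shows how they typically appear in a DCE/RPC-style string binding of the form protseq:host[endpoint]. It uses only snprintf and strncmp and does not call any real RPC runtime API; the helper names compose_binding and is_connection_oriented, and the host name “fileserver”, are made up for illustration. The prefix convention (ncacn_ for connection-oriented, ncadg_ for datagram) matches the names mentioned above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "protseq:host[endpoint]" into out.
 * Returns 0 on success, -1 if the result does not fit in cap bytes.
 * Hypothetical helper; a real runtime offers its own compose call. */
static int compose_binding(char *out, size_t cap, const char *protseq,
                           const char *host, const char *endpoint)
{
    int n = snprintf(out, cap, "%s:%s[%s]", protseq, host, endpoint);
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}

/* Classify a protocol sequence by its prefix:
 * 1 for connection-oriented (ncacn_), 0 for datagram (ncadg_), -1 otherwise. */
static int is_connection_oriented(const char *protseq)
{
    if (strncmp(protseq, "ncacn_", 6) == 0) return 1;
    if (strncmp(protseq, "ncadg_", 6) == 0) return 0;
    return -1;
}
```

With these helpers, compose_binding(buf, sizeof buf, "ncacn_ip_tcp", "fileserver", "135") would yield the string ncacn_ip_tcp:fileserver[135]. For ncacn_np the endpoint is a pipe name rather than a port number.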

OSF in the meantime totally missed the boat and did not get DCE/RPC ported to Linux or BSD. Only recently have they open sourced the DCE environment, and it has an absolutely horrific build environment. It’s pretty tragic: there is some really good stuff in there, but except for the RPC framework, the rest of it (Kerberos, a DFS, the directory service, the NTP server) has better and more current alternatives that use a sane build system. The LGPLed DCE environment’s build system is insane.

For a long time, there were no decent open source implementations of DCE/RPC. Early in 2007, I began looking at the state of affairs for DCE/RPC and I found three options.

While the proprietary OS vendors were incorporating non-interoperable versions of DCE/RPC (thanks to Microsoft’s ncacn_np), the Samba project was steadily working on building an interoperable suite of the SMB protocols. In all versions of Samba up to 3.2, Samba was systematically hand marshalling DCE/RPC PDUs. Only after a significant while did they realize that they were synthesizing DCE/RPC. In early 2003, Andrew Tridgell began work on Samba 4 and implemented a DCE/RPC IDL compiler in Perl – pidl. Over time, Samba 4 synthesized the IDL files for all of the Windows RPC services. Samba 4 does have an RPC IDL compiler and a runtime that is Windows compatible. However, Samba 4 does not seem to have implemented an IDL syntax that is fully compatible with DCE/RPC. In addition, core to the DCE/RPC framework is the RPC API, and I’ve not yet been able to find a DCE/RPC compatible API as part of the Samba 4 suite. Since APIs are really syntactic sugar, one hopes that in time Samba 4 releases a DCE/RPC compatible API and IDL compiler.

A surprising development was Novell releasing its DCE/RPC libraries under a BSD license early in 2007. Novell had acquired the rights to PADL Ltd’s XAD product line and had integrated it with eDirectory to build Domain Services for Windows. Domain Services for Windows was an AD proxy built on eDirectory. It allowed Windows clients to connect to eDirectory thinking that they were talking to an Active Directory Domain Controller. I’m speculating that PADL had possibly licensed the DCE/RPC libraries from OSF under a BSD license and used them for its XAD product line. At the time of their release, the Novell libraries were rather difficult to use. One of the more challenging problems was that the threading libraries were non-portable and made use of glibc pthread internal data structures.
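
To give a sense of what “hand marshalling DCE/RPC PDUs” involves, here is a minimal C sketch that writes the 16-byte common header of a connection-oriented DCE/RPC PDU byte by byte. The field order follows the common header as laid out in the DCE 1.1 RPC specification (version, packet type, flags, data representation, fragment length, auth length, call id). The function name write_common_header and its parameter choices are assumptions for illustration; this is not code from Samba or any other project. An IDL compiler exists precisely so that nobody has to write this kind of code for every structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    PDU_HEADER_LEN = 16,
    PTYPE_REQUEST  = 0,   /* request PDU */
    PTYPE_BIND     = 11   /* bind PDU */
};

static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
}

static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;         p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16); p[3] = (uint8_t)(v >> 24);
}

/* Hand-marshal the common header into buf.
 * Returns the header length, or 0 if buf is too small.
 * This sketch always emits an unfragmented, unauthenticated,
 * little-endian PDU. */
static size_t write_common_header(uint8_t *buf, size_t cap, uint8_t ptype,
                                  uint16_t frag_length, uint32_t call_id)
{
    if (cap < PDU_HEADER_LEN) return 0;
    buf[0] = 5;                       /* rpc_vers */
    buf[1] = 0;                       /* rpc_vers_minor */
    buf[2] = ptype;                   /* PTYPE */
    buf[3] = 0x03;                    /* pfc_flags: first and last fragment */
    buf[4] = 0x10;                    /* drep: little-endian integers, ASCII */
    buf[5] = 0; buf[6] = 0; buf[7] = 0;
    put_le16(buf + 8, frag_length);   /* total length of this fragment */
    put_le16(buf + 10, 0);            /* auth_length: no auth trailer */
    put_le32(buf + 12, call_id);      /* matches a response to its request */
    return PDU_HEADER_LEN;
}
```

Every request, response, and bind body that follows this header needs the same kind of offset-by-offset code when written by hand, which is why a generated stub becomes attractive once the number of interfaces grows.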

Thus by mid 2007, there were 3 possible choices for building DCE/RPC applications natively on Linux/UNIX platforms.

Around the same time, at Likewise we’d been discussing what it would take to build real Windows interoperability from the Linux side. And inarguably the conclusion was that without an API compatible DCE/RPC framework running natively on the Linux side, any interoperability effort would be futile. We selected the PADL libraries that Novell had open sourced for two reasons.

First and most important, the lineage was from OSF. That meant there were RPC APIs that were compatible with the MSRPC APIs, and the grammar that the IDL compiler parsed was fully compatible with midl (the Microsoft IDL compiler). This would mean that ISVs that had written RPC applications on Windows could cleanly port their infrastructure from Windows to Linux.

Second and less important, the licensing was BSD-style. I am of the view that open source is all goodness, but ISVs still like to protect their IP. In order to evangelize and resurrect a technology, it would be important to get a lot of ISVs to write applications for this platform. So the ideal choice for us would be an LGPL or BSD-style license for the libraries. We could have picked up the recently released OSF source code, but the build system, as I noted, is insane. Any OSF developers out there, if you ever read this, do yourselves a huge favor and fix your build system.

Once we picked this technology, we found out that the code simply did not work. After some really gnarly work and several changes, we got these libraries into a working state. The two most important pieces of work were redoing the threading library and adding client-side named pipe support.

Remember that our goal was to build true Linux-Windows interoperability. The one-sentence definition of Linux-Windows interoperability is to have the Windows Net and DSys APIs supported natively on Linux. Since, as I noted earlier in this post, RPC is the pervasive foundation for the Windows Net and DSys APIs, we began building equivalents of these APIs on Linux. In December of 2007, after a whirlwind 5 months of work, we had these APIs running natively on Linux. Since then, this work has expanded significantly in scope.

I want to conclude this piece with the following assertion.

“If seamless Windows-Linux interoperability is to be a reality and if we really want to see Linux systems as first class citizens in a Windows environment, then it is supremely important that an MS compatible DCE/RPC framework be a first class citizen on every flavor of Linux/UNIX/Mac”

The shunning of DCE/RPC on the Linux platform is probably one of the greater ironies in the evolution of Linux as a mainstream platform. Microsoft took an otherwise open programming platform and based all of its distributed systems technologies on it. The open source community and the industry around it denounced RPC as a “Microsoft only” technology and considered it insecure because of its association with a security mechanism that was not part of RPC. And this shunning resulted in an “open technology” becoming “proprietary”, because the only mainstream implementation ended up being Microsoft’s.

Hopefully, we can fix this by making a mainstream, open source, Microsoft-compatible DCE/RPC framework available to the community at large.