Old Vintage Computing Research: beos


Friday, December 16, 2022

The strange case of BeOS, SRS and the silent Power Mac 6500

Tonight's story time: the Power Macintosh that wouldn't make any sound in BeOS R5, how I figured out the problem, and how I hacked the sound driver to fix it. (Download link at the end.)

My favourite beige Power Mac is the Power Macintosh 7300 and its relatives. They're compact, capable, upgradable and easy to work on. For as much as people raved about the pull-down side door of the Yosemite G3 and the Power Mac G4, they owe their design to their fold-out Outrigger Power Mac ancestors which did it all and did it horizontally — and in some ways did it better.

However, when it came time to set up a second PowerPC BeOS system to go with my 133MHz BeBox (the two run the same applications), I decided to get a tower Mac for space reasons, even though the Outriggers are well supported in BeOS 5. Since NuBus, G3 and New World Macs were out (not compatible with BeOS), there are only a few choices, namely the 8500, 8600, 9500, 9600 and the 6400 and 6500. I despise the 8500 case (I only tolerate my clock-chipped Quadra 800 in the same style because it runs A/UX so well), the 8600 is bulky, and while all beige Power Macs have succumbed to the general price inflation that has afflicted every corner of vintage computing, the 9500 and 9600 6-slot Power Macs have really taken it on the chin (and the 9600 is bulky too).

That left the Insta-Towers, or what I like to call the Stormtrooper Macs:

Going where BeOS NetPositive hasn't gone before: NetPositive+

BeOS browser:
TLS apocalypse
Won't keep it off line.

(How do you pronounce BeOS?)

This is a real 133MHz BeBox running otherwise stock BeOS R5, surfing Hacker News and Lobste.rs using a modified, bug-fixed NetPositive wired to offload encryption to an onboard copy of Crypto Ancienne (see my notes on the BeOS port). NetPositive is the only known browser on the PowerPC ports of BeOS — it's probably possible to compile Lynx 2.8.x with BeOS CodeWarrior, but I've only seen it built for Intel, and Mozilla and Opera were definitely Intel/BONE-only. With hacks for self-hosted TLS bolted on, NetPositive's not fast but it works, and supports up to TLS 1.2 currently due to BeOS stack limitations.

Crypto Ancienne 2.0 now brings TLS 1.3 to the Internet of Old Things (except BeOS)

Who says you can't teach an old box new tricks? We did it before and we're doing it again. Crypto Ancienne ("Cryanc") is a TLS implementation for pre-C99 beasts and monstrosities featuring carl, a simple curl-like utility that serves as a demonstration command line tool and even as an HTTPS-over-HTTP proxy for suitably configurable browsers. Many operating systems are supported and a number of compilers too (not only gcc going back to version 2.5 and the egcs days, but also clang, MIPSpro, Compaq C and even Metrowerks CodeWarrior). Now, after a lot of late night hacking, screaming and unspeakable acts of programming, tons of bugs are fixed (including a long-standing big-endian issue with ChaCha20Poly1305) and the core has been significantly upgraded such that almost all of the supported platforms now support TLS 1.3.

And what are those supported platforms? Why, here's some of them as they were being cruelly whipped to perform like beaten dogs for your entertainment:

When you have too much memory for SheepShaver

When I first got my 133MHz BeBox (not new, sadly), it had "only" 32MB of memory and it had four more SIMM slots to fill. While Be only officially supported 256MB of RAM, I was blissfully ignorant of that, bought an additional 256MB of memory in four equally sized 72-pin SIMMs and installed it for 288MB of RAM. (It can actually take up to 1GB, I later learned.) Nice, I said! And then SheepShaver never worked again.

SheepShaver is a desperate pun and an unusual emulator: much like Classic on PowerPC Mac OS X, on big-endian PowerPC most of the MacOS and its applications run natively on the processor, in a form analogous to KVM-PR. In fact, SheepShaver on Leopard is pretty much the best way to run Classic applications on Power Macs that must run Leopard, though it also runs on Tiger and presents certain advantages there as well. It existed first on BeOS as a paid product before becoming open source, though multiple later forks fix various problems on modern platforms.

My original theory was that I had somehow broken something in the update or some other installation, and so I never did much with it (especially since I have plenty of real Power Macs around here). But while I was doing other work on the machine, after a game of BeOS Doom I accidentally double clicked on its icon on the desktop and ... it started up! What could have restored it, I feverishly wondered? Did something monkey around with the memory map? (Foreshadowing music plays here.) It only ran the one time, however, and I spent hours trying to retrace my steps to see if I could make it work again and I never could.

But this at least told me that the install was fine and the problem lay elsewhere. I had never closely looked at it in a debugger. Perhaps it was time.

The BeOS debugger isn't gdb, but you get the idea. The offending instruction was an stbu (store byte with update), but the effective address was ... really weird. It looks like it's wrapped around the entire address space back to 0! How did this program even work?

In the source code, for all supported platforms, SheepShaver (and Basilisk II, a 68K emulator it shares substantial code with) has a SIGSEGV handler for trapping segmentation faults; here is BeOS's. My initial thought was that somehow the handler wasn't being installed, but a couple debug printfs in the handler showed that not only was the handler being triggered, it was actually passing the segfault along to the system handler apparently on purpose.

A partial explanation appears in the Darwin (Mac OS X) port:

Under Mach there is very little assumed about the memory map of object files. It is the job of the loader to create the initial memory map of an executable. In a Mach-O executable there will be numerous loader commands that the loader must process. Some of these will create the initial memory map used by the executable. Under Darwin the static object file linker, ld, automatically adds the __PAGEZERO segment to all executables. The default size of this segment is the page size of the target system and the initial and maximum permissions are set to allow no access. This is so that all programs fault on a NULL pointer dereference. Arguably this is incorrect and the maximum permissions should be rwx so that programs can change this default behavior. Then programs could be written that assume a null string at the null address, which was the convention on some systems. In our case we need to have 8K mapped at zero for the low memory globals and this program modifies the segment load command in the basiliskII [sic] executable so that it can be used for data.

So, the handler expects actual memory to be mapped at an effective address of zero for the MacOS's low memory globals, a holdover from the 68K days (and if I'd read the Basilisk technical notes, I would have realized that sooner). Since such a fault should never have gotten to the handler in the first place, it just passes it along and crashes. That kind of significant address space remapping clearly could not come from a user-level executable on BeOS; there had to be some sort of system component doing that remapping.

Turns out SheepShaver did in fact install a couple system extensions:

$ find /boot/home/config/add-ons -name 'sheep*' -print
/boot/home/config/add-ons/kernel/drivers/bin/sheep
/boot/home/config/add-ons/kernel/drivers/dev/sheep
$ find /boot/beos/system/add-ons -name 'sheep*' -print
/boot/beos/system/add-ons/net_server/sheep_net

The last one is used for tunneling emulated networking through the host machine; the sheep driver is the one we want (the two sheep drivers are actually the same file; the dev/ one is a symlink to the actual file in bin/). After a little digging in the source tree, I found the C source for it. It became rapidly obvious after a cursory readthrough that it manipulates the PowerPC page tables.

On PowerPC (prior to POWER9 which introduces a higher-performance radix MMU), the mapping between virtual addresses and physical addresses is maintained by a set of hashed page tables, divided into page table entry groups, or PTEGs. (There is an alternate pathway using block address translation "BAT" registers but I'm going to ignore that for the purposes of this discussion.) The low memory globals region is 8K in size, so (with 32-bit PowerPC) we need two 4K memory pages to map to 0x0000 and 0x1000, which needn't be contiguous in real memory since we'll set up mappings for each page individually. The driver allocates three pages with malloc() and takes a page-aligned slice of two pages within it, then tries to find where in physical memory those pages got mapped to using get_memory_map(). Now we want to make those pages' effective address mapping in SheepShaver point to 0x0000 and 0x1000 instead.
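The "allocate three pages, use two" step comes down to a little address arithmetic. Here is a minimal sketch of it, not the driver's actual code: align_up_to_page() and aligned_slice_fits() are names made up for illustration, and plain unsigned long is used so it compiles with pre-C99 compilers.

```c
#include <assert.h>

#define PAGE_SIZE_4K 0x1000UL  /* 32-bit PowerPC page size */

/* Round an address up to the next 4K boundary.
   An address that is already aligned is returned unchanged. */
unsigned long align_up_to_page(unsigned long addr)
{
    return (addr + (PAGE_SIZE_4K - 1)) & ~(PAGE_SIZE_4K - 1);
}

/* Check that a buffer of pages_allocated pages starting at base still
   holds pages_needed whole pages once its start is aligned.
   Allocating one page more than needed always makes this true. */
int aligned_slice_fits(unsigned long base, unsigned long pages_allocated,
                       unsigned long pages_needed)
{
    unsigned long start = align_up_to_page(base);
    unsigned long end = base + pages_allocated * PAGE_SIZE_4K;

    assert(pages_needed <= pages_allocated);
    return start + pages_needed * PAGE_SIZE_4K <= end;
}
```

Fed the malloc() addresses from the PortLogger output further down (0x0202b228 and 0x01587bb8), the rounding gives the same aligned addresses the driver logged.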

To find a real address in 32-bit PowerPC, the top four bits of the effective address select one of 16 segment registers, each mapping a 256MB block of effective addresses. The segment register's low 24 bits (the Virtual Segment ID) are combined with the effective address's 16-bit page number and 12-bit byte offset within that page to generate a 52-bit virtual address. The VSID and the page number then get hashed and combined with the storage description register SDR1 to yield the address of the PTEG, the correct PTE is found within it, and the real page number within it then becomes the upper 20 bits of the resulting real address. We're going to work this in a similar fashion to find the PTEG that would contain the mapping for these lowest page addresses.
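The hashing step is easier to follow in code. This is a simplified sketch under stated assumptions rather than the driver's source: the function names are invented, and instead of decoding SDR1 it takes the table's base address and PTEG count directly (the hardware works from a physical base and a mask held in SDR1).

```c
#include <assert.h>

/* Primary hash: the low 19 bits of the VSID XORed with the
   16-bit page index (the effective address with the top four
   segment bits and the low 12 offset bits removed). */
unsigned long primary_hash(unsigned long vsid, unsigned long ea)
{
    unsigned long page_index = (ea >> 12) & 0xFFFFUL;

    return (vsid & 0x7FFFFUL) ^ page_index;
}

/* Secondary hash: the one's complement of the primary hash. */
unsigned long secondary_hash(unsigned long hash)
{
    return ~hash & 0x7FFFFUL;
}

/* Address of a PTEG: each group is 64 bytes, and the hash is wrapped
   to the number of groups, which must be a power of two. */
unsigned long pteg_address(unsigned long table_base,
                           unsigned long pteg_count,
                           unsigned long hash)
{
    assert(pteg_count != 0 && (pteg_count & (pteg_count - 1)) == 0);
    return table_base + ((hash & (pteg_count - 1)) << 6);
}
```

With a table base of 0x30000000 and 65536 PTEGs, VSID 0xc70 and effective address 0 give 0x30031c00 and 0x303ce3c0, which are the PTEG1 and PTEG2 addresses in the second PortLogger run below.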

The recommended number of PTEGs is half the number of real pages to be accessed. The next power of two above this BeBox's 288MB is 512MB, or 2²⁹ addressable bytes, which divided by the 4K (2¹²) page size gives 2¹⁷ pages. Halving that yields 2¹⁶, or 65536, 64-byte PTEGs for a total size of 4MB. BeOS has a specific memory area for this, appropriately named pte_table, that we can look up with find_area() (thus giving us the effective address of the page table pointed to by SDR1). We find the relevant PTEG for each page by doing the same hashing steps the processor would do to resolve the address. In that PTEG, each PTE's highest bit is whether it's valid, followed by the 24-bit VSID, one bit for the hash type flag, six bits of the effective address called the Abbreviated Page Index, the 20-bit Real Page Number, and protection and access control fields.
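Both the sizing arithmetic and the PTE layout can be written down compactly. The sketch below is illustrative only: the struct and function names are assumptions, not anything from the SheepShaver source. In the first word the API occupies the low six bits; in the second word the R, C, WIMG and PP fields sit below the Real Page Number.

```c
#include <assert.h>

/* Fields of a 32-bit PowerPC page table entry (two 32-bit words). */
struct pte_fields {
    unsigned long valid;      /* V: entry is in use */
    unsigned long vsid;       /* 24-bit Virtual Segment ID */
    unsigned long hash_id;    /* H: 0 = primary hash, 1 = secondary */
    unsigned long api;        /* 6-bit Abbreviated Page Index */
    unsigned long rpn;        /* 20-bit Real Page Number */
    unsigned long referenced; /* R */
    unsigned long changed;    /* C */
    unsigned long wimg;       /* 4-bit cache control */
    unsigned long pp;         /* 2-bit page protection */
};

struct pte_fields decode_pte(unsigned long word0, unsigned long word1)
{
    struct pte_fields f;

    f.valid      = (word0 >> 31) & 0x1UL;
    f.vsid       = (word0 >> 7) & 0xFFFFFFUL;
    f.hash_id    = (word0 >> 6) & 0x1UL;
    f.api        = word0 & 0x3FUL;
    f.rpn        = (word1 >> 12) & 0xFFFFFUL;
    f.referenced = (word1 >> 8) & 0x1UL;
    f.changed    = (word1 >> 7) & 0x1UL;
    f.wimg       = (word1 >> 3) & 0xFUL;
    f.pp         = word1 & 0x3UL;
    return f;
}

/* Recommended PTEG count: round RAM up to a power of two, divide by
   the 4K page size, then halve. Multiply by 64 for the table size. */
unsigned long recommended_pteg_count(unsigned long ram_bytes)
{
    unsigned long size = 1;

    assert(ram_bytes > 0 && ram_bytes <= 0x80000000UL);
    while (size < ram_bytes)
        size <<= 1;
    return (size >> 12) / 2;
}
```

For 288MB this gives 65536 PTEGs, or 4096KB at 64 bytes each, matching the table size PortLogger reports below. Decoding the pair 80063800 045a8012 from the second run yields VSID c70 and RPN 45a8.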

We won't know the VSID without looking at the segment registers, but we can just walk the entire page table instead since we only have to set this mapping up once. When we find a valid PTE that matches the API, then we know this is a candidate PTEG and derive the VSID from that. We can then either directly modify an existing PTE within it or take advantage of the fact that each PTEG essentially offers up to eight hash collision resolution slots to add a PTE of our own. If we do this to the first place the CPU will look, we will take over that memory mapping for the life of the process.
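The "modify an existing PTE or claim a free slot" choice might look something like the following. This is a hedged sketch, not the driver's logic verbatim: struct pte, make_pte_word0(), make_pte_word1() and choose_pte_slot() are hypothetical names, and the fixed 0x12 in the second word (WIMG with only M set, PP of 2) is simply what the driver's log shows being written.

```c
#include <assert.h>

#define PTES_PER_PTEG 8

struct pte {
    unsigned long word0; /* V, VSID, H, API */
    unsigned long word1; /* RPN, R, C, WIMG, PP */
};

/* First word of a valid primary-hash PTE for the given VSID and API. */
unsigned long make_pte_word0(unsigned long vsid, unsigned long api)
{
    return 0x80000000UL | ((vsid & 0xFFFFFFUL) << 7) | (api & 0x3FUL);
}

/* Second word: the real page number plus the low bits seen in the log. */
unsigned long make_pte_word1(unsigned long rpn)
{
    return ((rpn & 0xFFFFFUL) << 12) | 0x12UL;
}

/* Pick a slot in one PTEG: an existing entry with the same first word
   wins, otherwise the first invalid (free) entry. Returns -1 if the
   group is full of other mappings. */
int choose_pte_slot(const struct pte *pteg, unsigned long wanted_word0)
{
    int i, free_slot = -1;

    assert(pteg != 0 && (wanted_word0 & 0x80000000UL) != 0);
    for (i = 0; i < PTES_PER_PTEG; i++) {
        if (pteg[i].word0 == wanted_word0)
            return i;
        if (free_slot < 0 && (pteg[i].word0 & 0x80000000UL) == 0)
            free_slot = i;
    }
    return free_slot;
}
```

Earlier slots in the primary PTEG are searched first, which is why landing there is enough to take over the mapping.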

The memory mapper conveniently has debug logging support for a simple tool called PortLogger that I patched up for BeOS R5. I compiled it with debugging on, restarted, ran PortLogger, started SheepShaver (it crashed, of course) and looked at the output:

$ ./PortLogger 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    PortLogger version 0.4.1
   Cameron Kaiser    - 14/02/21
   Simon Thornington - 14/02/97
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
init_hardware()
init_driver(3)
control(10000) data 0xfd001bb8, len 00000000
3 pages malloc()ed at 0x0202b228
Address aligned to 0x0202c000
Memory locked
get_memory_map returned 0
PTE table seems to be at 0x30000000
PTE table size: 4096KB
Found page 0  PtePos 58b84 V1 VSID c70 H0 API 08 RPN b11a R0 C0 WIMG2 PP0 
Found page 1d PtePos 58ba4 V1 VSID c70 H0 API 08 RPN b11b R0 C0 WIMG2 PP0 
Found page 1d PtePos 178580 V1 VSID d37 H0 API 2c RPN b11b R1 C1 WIMG2 PP0 
Found page 0  PtePos 1785a0 V1 VSID d37 H0 API 2c RPN b11a R1 C1 WIMG2 PP0 
Trying to map EA 0x00000000 -> RA 0x0b11a000
PTEG1 at 0x30034dc0, PTEG2 at 0x303cb200
 found 80069b80 00000010
 existing PTE found (PTEG1)
 written 80069b80 0b11a012 to PTE
Trying to map EA 0x00001000 -> RA 0x0b11b000
PTEG1 at 0x30034d80, PTEG2 at 0x303cb240
 found 80069b80 00001010
 existing PTE found (PTEG1)
 written 80069b80 0b11b012 to PTE

The driver seemed to properly reserve memory and find the real address (and thus real page number) for its mapping, and was able to resolve and walk the page table. But one problem jumped out immediately: we only have two pages (here 0 and 1d). Why is it that it found four? Notice that the "fraternal twin" pages have matching RPNs, but the VSIDs are different and we don't know which VSID is right. Did our algorithm effectively cause its own hash collision?

Continuing on, when we look at the existing PTE we found, the RPN is the first through fifth hex digits in the second word and both effective addresses match their real ones (80069b80 00000010 and 80069b80 00001010). That seems hinky.

My first thought was maybe we had a stale TLB and our PTE change didn't stick, because on the PowerPC 603 and 603e the code doesn't do a tlbsync to synchronize the translation lookaside buffer (which caches all this work) and this BeBox has two 603e CPUs. However, despite the code and Metrowerks saying it's 604-only, tlbsync is listed as a valid instruction in my copy of the 603e User's Manual Appendix A. I forced it to do a tlbsync by commenting out the check, compiled it again, restarted, ran PortLogger and started SheepShaver. Unfortunately, while it didn't do anything worse, it didn't work either.

My next guess was to see if maybe we were working on the wrong "twin." Assuming we really did have two sets of colliding hashes, what if we used the other one? A line of code to stop the search at the first page pair rather than the second was added and I tried again:

$ ./PortLogger 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    PortLogger version 0.4.1
   Cameron Kaiser    - 14/02/21
   Simon Thornington - 14/02/97
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
init_driver(3)
control(10000) data 0xfd001bb8, len 00000000
3 pages malloc()ed at 0x01587bb8
Address aligned to 0x01588000
Memory locked
get_memory_map returned 0
PTE table seems to be at 0x30000000
PTE table size: 4096KB
Found page 0  PtePos 33f04 V1 VSID c70 H0 API 05 RPN 45a8 R1 C1 WIMG2 PP0 
Found page 1f PtePos 33f24 V1 VSID c70 H0 API 05 RPN 45cd R0 C0 WIMG2 PP0 
Trying to map EA 0x00000000 -> RA 0x045a8000
PTEG1 at 0x30031c00, PTEG2 at 0x303ce3c0
 found 80069b80 00147010
 found 80076280 082b5190
 found 00000000 00000000
 free PTE found (PTEG1)
 written 80063800 045a8012 to PTE
Trying to map EA 0x00001000 -> RA 0x045cd000
PTEG1 at 0x30031c40, PTEG2 at 0x303ce380
 found 80069b80 00146010
 found 80076280 082b4190
 found 00000000 00000000
 free PTE found (PTEG1)
 written 80063800 045cd012 to PTE

Success! Now we actually have a free PTE, instead of modifying a questionable one, and we alter that. The mapping now takes precedence over anything else for that effective address and SheepShaver starts and runs normally. It also fixed Basilisk II, which would not run for the same reason, though SheepShaver seems to run 68K applications rather better than Basilisk II does.

Why was this never noticed? Well, like I say, Be never advertised support for more than 256MB in the BeBox, and in 1997 that would have been a significant amount of memory (my Power Mac 7300 in 2000 had 192MB and I thought that was a lot). Most PowerPC systems running BeOS probably had substantially less. As with many other bugs triggered by clock speed and RAM, no one ever dreamed future users would have such a surfeit of either.

The patched binary and source code are on Be-Power.

Sunday, March 7, 2021

Over the weekend: Commodore 128DCR keyboard extension, updates to Be-Power

Just some potpourri over the weekend. While working on other stuff and doing laundry, I finally put that keyboard extension on my Commodore 128DCR so I can put the keyboard in my lap if I want to. Any straight-thru DB-25 will work, so I used a 3-foot parallel printer extension cable.

Also, I was pointed to some additional BeOS PowerPC files, so I deduped and sorted them, and they are on Be-Power. Among other things there are now many more apps and also some of the OS updates. I also heard from someone who claims to have the old ftp.be.com archive completely mirrored, though for obvious reasons I'm mostly just interested in the PowerPC stuff. I haven't heard back from him yet after asking for more details, but that's very encouraging for having a nearly complete mirror.

Sunday, February 14, 2021

Make the BeBox great again: TLS 1.2, inetd and more for PowerPC BeOS R5

Many nerds are at least historically aware of the BeOS, which died an untimely death two decades ago this year (when parent Be, Inc. sold out to Palm in 2001 and self-liquidated). Established in 1993 by former Apple exec Jean-Louis Gassée, Be's new OS was meant as a media-savvy alternative to MacOS and Windows, but with POSIX compatibility (largely), a command-line shell option and pervasive cheap multithreading, which is probably its most notable technical feature. It survives on modern PCs in recreated spirit, if not as a direct descendant, as the perennially-beta Haiku.

A few nerds, however, will recall that BeOS didn't originally run on x86. In fact, its original architecture was one almost nobody remembers, the AT&T Hobbit, a strange stack-oriented CPU specialized for running C programs. The Hobbit had few takers due to its cost and various technical issues (Apple eventually rejected it for the Newton, leading to the rise of ARM), and when AT&T decided to kill the project in 1993 it nearly killed Be as well, who were using it for their dual-processor prototype wonderbox. After all, the best way to show off your all-singing, all-dancing, all-threading new operating system is with extra CPUs to power it.

Be regrouped around the PowerPC 603, which led to some unique technical issues of its own because the 603 has only three cache coherence states (MEI), making it notionally insufficient for multiprocessing. (This was carried over to the G3 as well, which is really just an evolved 603; the 604 offered a fourth state, and the G4 the full five MERSI states.) With little choice to get a product out the door, Be had to get around this problem with extra hardware to forcibly keep the processor caches synchronized. Be ended up making around 2,000 of the striking blue-and-beige PowerPC BeBoxes, deliberately targeted at technical users, over half of them in the slower dual 66MHz version and later a 133MHz version in the minority. Touches like the zooming LED load meters on the front, built-in MIDI and the customizable Geek Port made them beloved machines by their few owners: author Neal Stephenson, famous for Snow Crash, wrote the essay In The Beginning Was The Command Line with his own BeBox in mind. Pointedly, he declares in the essay that "[w]hat holds Be back in this country is that the smart people are afraid to look like suckers."

Naturally, there's a BeBox here too at Floodgap, a dual 133MHz model with 288MB of RAM running BeOS R5, the last release for PowerPC. And with a little hacking to get around its non-POSIXisms, it now has its own port of Crypto Ancienne with TLS 1.2. The screenshot is what's on the monitor (just press Print Screen anytime and a Targa file is dumped).

The Power Macintosh 7300 under the monitor isn't running BeOS, though it could (not sure if I'd need to remove its 800MHz G4 upgrade card, but it's basically compatible). Aside from PowerBooks (maybe the 3400 could be tricked into booting), PowerPC BeOS would run on pretty much any PCI beige Mac with a 603 or 604 CPU, including the clones. It even boots on systems with aftermarket G3 upgrades. It wouldn't run on an actual beige G3, however, and it wouldn't work on any New World Mac that came after.

And that's the reason why PowerPC BeOS withered after R5: Apple wouldn't provide technical documentation on future models, and Be didn't want to make the company dependent on reverse engineering them. By 1997 the BeBox, only ever a niche product for a niche OS, was discontinued. While Power Computing and other vendors still offered BeOS with their Power Mac clones, the Mac clones were themselves dying out and Be proceeded full speed ahead on an x86-compatible BeOS, releasing the dual-architecture R3 in 1998. PowerPC users became quickly neglected: BeOS never released a "try before you buy" personal edition of BeOS for the Power Macs, and unlike the situation with NeXTSTEP where fat binaries for all architectures were the rule for most software, the majority of developers simply wrote for x86 alone. There was never another browser for PowerPC BeOS other than Be's own NetPositive (while x86 had Opera and Mozilla), which is why I didn't show any BeOS browsers magically empowered by Cryanc in the screenshot, and when BeOS R5.1d0 "Dano" was leaked after Be's demise featuring the improved BeOS Networking Environment (BONE), there was no PowerPC release. At the time LowEndMac observed, "If you feel like Macs are treated like second class citizens, wait until you switch to BeOS — you might soon get the feeling of a fourth class citizen."

Nowadays I'd beg to be a fourth-class citizen. All of the old ftp.be.com archives appear to be gone, along with most of their games and freeware ports. A few packages developed by third parties survive in their original locations, and a few more in the Wayback Machine. There was an egcs port to PowerPC BeOS, but it seems to have evaporated completely, leaving BeIDE and Metrowerks C/C++ as your only development choice. I don't have many software packages but what little I do have for PowerPC BeOS I put on the Floodgap gopher server.

And no Intel crap. Twenty years later x86 has Haiku, which on 32-bit can run all your old x86 R5 apps and new ones besides, so x86 BeOS doesn't need our help. Instead, let's make the BeBox (and PowerPC BeOS generally) great again. And, hey, any of the Hobbit BeBoxes still out there too (I'm personally aware of a couple). (Especially if anyone wants to send me theirs.)

In future posts I want to talk about some of the other things I've been doing on this BeBox, including patching the SheepShaver Power Mac emulator (fun with page table entries) and writing a gopher client in BeIDE. But today, let's talk about porting Crypto Ancienne to BeOS, writing the only currently existing inetd-like environment for PowerPC BeOS, and why I say R5 is only mostly POSIX compliant.

Crypto Ancienne's core crypto library, ultimately derived from TLSe and libtomcrypt, is written in pre-C99 C. In fact, version 1.5, the current release, not only adds support for BeOS but also Tru64, IRIX 6.5 and SunOS 4, plus contributed builds for 68K NeXTSTEP, Professional MachTen, Haiku and Solaris 9 along with its previous support for Mac OS X (PowerPC and Intel), AIX, A/UX, Power MachTen, PA-RISC NeXTSTEP and of course Linux and the contemporary BSDs. While gcc 2.x is the most common compiler on these platforms, we also added support for MIPSpro on IRIX, Compaq C on Tru64 and Metrowerks C on BeOS. The core is generally the easiest portion to compile once you find the way the OS likes types and prototypes specified, and Metrowerks C had a good reputation for standards compliance, so other than adding a hack to get function-local variable allocations under 32K (!) that much was uneventful.

The tricky part turned out to be carl, the Crypto Ancienne Resource Loader, the Cryanc demo application and a desperate pun. BeOS has some unusual aspects to its POSIX support, all of which were rectified in Haiku, which built with the default code pretty much unmodified. The needed hacks boil down to the fact that, like the Windows API, standard input, standard output and standard error aren't "normal" filehandles. Let's say you want to check if there's input on an arbitrary file descriptor. There are no less than three non-interchangeable ways in BeOS:

If it's a socket, you can use select() like normal right-thinking people. There is no poll(), but overall this works like you think it should. This is also true for Winsock.
If it's a file or pipe, however, you can't. Instead, while this isn't well documented, you can make it non-blocking (something like (void)fcntl(fileno(stdin), F_SETFL, O_NONBLOCK);), and then busywait on the descriptor (return (read(fileno(stdin), &throwaway_char, 0) >= 0); will tell you if input is present). This is somewhat like PeekNamedPipe() in Win32, except that BeOS seems to lack any bespoke function for this purpose, and both require a similar combination of timeouts and alternating calls if you're waiting on a network socket and standard input.
But, if it's a TTY, it all goes out the window because there's an even more poorly documented ioctl you have to use instead (ioctl(fd, 'ichr', &numcharswaiting)). Haiku even preserves this ioctl for compatibility, though it is obviously discouraged. The non-blocking read() trick might also work but I ended up having to do a combination of both approaches, and even that doesn't work quite right.

For carl's loop where it transmits data from standard input and receives data from the socket, that had to be modified to check a utility function (stdin_pending()) and time-limit select() so that it could go back and forth between the two descriptors. This is ugly but it works, and the successful result is what you see in the screenshot (I grepped some lines from the HTML from lobste.rs as proof-of-doesn't-suck).
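The socket half of that back-and-forth is an ordinary time-limited select(). What follows is a minimal POSIX-flavoured sketch, not carl's actual code: socket_readable() is a made-up name, and the headers are the ones a modern Unix wants (BeOS R5 keeps its socket declarations elsewhere, so the includes would need adjusting there).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Wait up to timeout_ms for data on a socket.
   Returns 1 if it is readable, 0 if the timeout expired, and -1 on
   error (including being interrupted by a signal, in which case the
   caller can simply try again). */
int socket_readable(int sock, long timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int rc;

    assert(sock >= 0 && timeout_ms >= 0);
    FD_ZERO(&readfds);
    FD_SET(sock, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(sock + 1, &readfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return (rc > 0 && FD_ISSET(sock, &readfds)) ? 1 : 0;
}
```

A loop built on this would call socket_readable() with a short timeout, then check stdin_pending(), and repeat, so neither descriptor can starve the other.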

On the Crypto Ancienne web browser demo we showed that those computers could self-host their own carl in proxy mode so that they were their own "crypto proxies," assuming a suitable level of web browser support (or coercibility). NetPositive, your only choice on PPC BeOS, resolutely insists on using its own state-of-the-art 40-bit encryption over SSL; I'll see about hacking that later. Still, carl doesn't listen on sockets itself and relies on inetd or inetd-like environments such as xinetd (hi, Rob!) and Jef Poskanzer's micro_inetd, my personal favourite mini-inetd. We demonstrated running it as a proxy with micro_inetd on pretty much every other one of the OSes Cryanc supports, so it would be nice for BeOS to do the same.

Well, it won't come as a surprise to you that BeOS R5 works with none of these. Back in the day, it was even argued it might not be possible to implement inetd at all because sockets aren't shared across fork() (typically, for most inetd-like environments, they fork(), connect the socket to the standard filehandles and launch the dependent program, but this approach is unpossible in BeOS for that reason). Furthermore, you might think that net_server, the team (i.e., process) responsible for sockets in BeOS, would implement something of the sort and you would be wrong; the telnetd and ftpd in R5 are implemented differently. BONE does have a classic inetd but only because it fixes this problem as part of the other significant underlying changes in Dano, none of which were made available for PowerPC.

So this post also introduces inetb (kneeslaps and guffaws), less a port than a heavily multithreaded reimplementation of micro_inetd. Near as I can determine, this is the only inetd-like system that can run on a pre-BONE system. How can we do this if we can't pass the socket to the process we fork() to? Easy: don't pass the socket with fork()! Download it from Be-Power, or follow along with this gist:

We start up inetb with the port number and the dependent program. Let's use ./inetb 8765 awk '/quit/{exit}{print $1+1}' as a nice interactive example: this takes input, quits if it's quit, and otherwise tries to coerce it to a number and add one to it.
We listen on the port, initialize our array of iothreadstates (a struct we use to track sockets in flight), set up signal handlers for SIGCHLD and SIGPIPE, and go into an accept() loop. So far, so standard.
When we get a connection, we assign a new iothreadstate and then use an implementation of popen2() to fork() the dependent process but using pipes, not the socket.
Now for the BeOS magic. With the dependent process now running and its standard filehandles connected to pipes, we then start two threads, one to read from the process and write to the socket, and another to read from the socket and write to the process. (I have intentionally not implemented standard error: it's convenient to see it for debugging in the terminal you're running inetb from. Exercise left for the reader, but it would be a third set of pipes and a third thread.) The main thread goes back to its accept() loop to take more requests.
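The pipe-based launch in the third step can be sketched in portable POSIX C. This is an illustration under assumptions, not inetb's source: the popen2() signature here is invented for the example, and the two pump threads are left out because they depend on BeOS's own thread calls.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Start argv[0] with its standard input and output attached to pipes
   instead of a socket. On success returns the child's PID and stores
   the parent's write end in *to_child and read end in *from_child.
   Returns -1 on failure. Standard error is deliberately left alone. */
pid_t popen2(char *const argv[], int *to_child, int *from_child)
{
    int in_pipe[2];  /* parent writes [1], child reads [0] */
    int out_pipe[2]; /* child writes [1], parent reads [0] */
    pid_t pid;

    assert(argv != NULL && argv[0] != NULL);
    assert(to_child != NULL && from_child != NULL);

    if (pipe(in_pipe) < 0)
        return -1;
    if (pipe(out_pipe) < 0) {
        close(in_pipe[0]);
        close(in_pipe[1]);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(in_pipe[0]);
        close(in_pipe[1]);
        close(out_pipe[0]);
        close(out_pipe[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: wire the pipes to stdin/stdout, then run the program. */
        dup2(in_pipe[0], STDIN_FILENO);
        dup2(out_pipe[1], STDOUT_FILENO);
        close(in_pipe[0]);
        close(in_pipe[1]);
        close(out_pipe[0]);
        close(out_pipe[1]);
        execvp(argv[0], argv);
        _exit(127); /* only reached if exec failed */
    }

    /* Parent: keep only its own ends of the two pipes. */
    close(in_pipe[0]);
    close(out_pipe[1]);
    *to_child = in_pipe[1];
    *from_child = out_pipe[0];
    return pid;
}
```

One thread would then copy from the socket to *to_child and the other from *from_child to the socket, which is how the dependent program ends up talking to the network without ever holding the socket itself.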

In the normal case, let's say that we quit this miniature awk session properly with the quit command. How do the threads react?

awk terminates, sending a SIGCHLD to the main thread and triggering the signal handler.
The signal handler reaps the process and based on the PID finds its iothreadstate. It then launches a cleanup thread for that iothreadstate, and goes back to the accept() loop to take more requests.
The cleanup thread now has to make the threads quit cleanly, since killing them leaves a mess in net_server (killing teams cleans up resources, but not individual threads within a team). It does this by sending a message to both the read-from-process and write-to-process threads. Any message will make them quit.
For the read-from-process thread, this is sufficient to interrupt its blocking read(). It sees there is a message, and exits gracefully.
For the write-to-process thread, this is a little more complicated. Even though the socket read should be blocking, in practice signals regularly interrupt it, so we use a select() on the socket to ensure we really do have data to read. There appears to be a bug in BeOS, however, where sending a message sometimes doesn't interrupt select(). We get around this problem by having the select() time out every 10ms so it can look in its queue, which is less elegant, but better than a tight loop. Anyway, it too sees there is a message, and exits.
After waiting for both threads to exit, the cleanup thread flushes the socket and closes everything, returns the now spent iothreadstate to the pool and exits itself. Meanwhile, the main thread has already gone on to service other requests. Ain't multithreading great?

What happens if the user just disconnects?

As in standard POSIX, the write-to-process thread sees that the socket is ready but there is no data. Assuming a signal hadn't arrived, this is treated as a disconnect. It kills the dependent process (this is an entire team, so it's safe) and quits.
awk has just been killed, so a SIGCHLD goes to the main thread, triggering the signal handler.
The signal handler reaps the process, finds the iothreadstate, and starts the cleanup thread as it returns to the accept() loop.
The cleanup thread takes down the read thread as well by sending it a message, flushes the socket, closes everything and terminates. Meanwhile, the main thread has already gone on to service other requests. Another stupendous day in Cheap Thread Land!

It's BeOS' ability to effortlessly spawn huge numbers of threads even on constrained systems (even in 1996 a dual 133MHz wasn't stonking fast, and certainly not the 66MHz version) that makes this arrangement work effectively. Want to handle something asynchronously instead of busywaiting? Make a thread! The thread can block (usually)! Perfect for UI or network! Even the cleanup is asynchronous in inetb so that as little happens on the main thread as possible. The kernel handles all this messing around for you as long as you play by the rules.

BeOS isn't perfect, though, as that last sentence will attest. During my testing of inetb I unsettled net_server a lot. You can restart networking from its preference window, but it seemed bad that I had to do this as often as I did. In fact, as an unrelated note, I was able to pretty much wreck the machine every time if I accidentally started CDBurner. I don't have a burner and you'd think it would handle that circumstance, but it doesn't. The machine goes haywire if I'm lucky; it locks up if I'm not. I eventually had to remove it from the Applications menu. More generally, the notion of uids and gids is a veneer and you're pretty much doing everything as the superuser. That means wrong moves hurt.

But don't forget that Mac OS X had its own weird problems in its earliest versions. BeOS, at least superficially, gives you a similar experience of a POSIX-alike underpinning with better multitasking and memory management, and it was definitely lighter on system resources than early OS X was, too. What NeXT had was Steve Jobs and a longer history with Apple than Jean-Louis Gassée, and while it is variously said that Be's demanded purchase price is what turned Apple away from buying them, I've always thought it was just a cover story for the real deal to get an original Apple founder back. And that worked out handsomely for Apple. But I think BeOS could have served as the next Mac OS at least as well.

Our next BeOS entry will talk about SheepShaver, which you can think of as "Classic" for BeOS. It even runs PowerPC code natively for surprisingly useable performance. But it started crashing incessantly after I upgraded the RAM in my BeBox. Can we fix that? Of course! Find out how next time!
