
20-Nov-2007: Fumbling Forward


I started my VAX career as a government employee. One of the benefits of that was that I had no problem getting sent out of town on training. Also, money was available for me to take part in national conferences. As a consequence, I met and got to know many people working with VMS systems from across the country. The first few years these people were as enthusiastic as I was about the potential of the machines we worked with. We swapped war stories, especially about encounters with the Bad Guys who, in our books, were the stodgy uptight protocol-encumbered IBMers. Instead of three-piece suits, we wore blue jeans and cotton T-shirts. In a way, VAXes were straight out of the 1960s rebellions. We had taken on the establishment and were winning the war.

A downside to being an employee of the Government of Canada was the in-place hierarchy. In my position as systems manager I was classified as a CS2. Not a bad classification. It meant that I had to report to someone in a more senior job classification, which was fair enough, but I could not report directly to anyone more than one step above me. That meant that my "real" boss, the chief of operations, had to appoint someone as my "manager" who could report to, and participate in meetings of, those with higher job classifications than mine. The result was an absurd situation where I had a manager sitting at the desk next to mine who had very limited computer skills and very little knowledge of VMS, and whose job it was to carry information to and from the higher-ups in the organization. Seriously.

And, worse, the groups she was reporting to had vested interests in maintaining old, inefficient, and expensive technologies that required their expertise. The information flow between me and the other groups in systems support was sketchy, at best. For example, the batch job flow process on the mainframes was controlled by operators sitting at huge consoles. To prevent log jams during the day, there was a discount system in place so that users could prioritize their jobs to run in the evenings or overnight. The operators would then decide, based on how much demand there was for computer cycles, when these "P2" jobs would run. So, my manager comes back from a meeting excited because they had decided that the VAXes I was managing should have a "P2" job queue.

Okay, and why? I asked. My manager patiently explained the discount system to me, even though I was well versed in it. After she finished, I again asked why. Flustered, she explained that we had to offer a batch job turnaround time of overnight. She then told me, enthusiastically, that it would give the midnight operators some exposure to the VAX world. Now she really had me puzzled. I simply couldn't comprehend what she was talking about until she told me that they would have to check the system and decide when this batch queue could start running. I told her that all the VAX users were off the system by five o'clock—this was the government, after all. She then snapped, Fine, but someone will have to start the queue running. You guessed it: Why? By now she was convinced she had a total idiot on her hands, so, to let her off the hook, I told her that I could schedule a queue to start executing at a given time automatically. The VAX didn't need operators to maintain its functionality. She now wasn't sure whether to be angry or to be relieved that I had finally seen the light. Fine, she snapped. I knew she really didn't understand what I was talking about, but she had given up.

Now, I continued. Why call it something meaningless like P2? We could call it something like The Overnight Queue. She lost her cool and ranted at me for about 15 minutes, calling me the most arrogant person she had ever met. Just who did I think I was? Etc., etc. So, I created a batch queue called "P2" that would start execution at midnight. No one ever used it because in the VAX world it wasn't necessary. Batch jobs typically whizzed through the system in a couple of seconds and were gone before anyone noticed. Or, if it was a larger job, it would execute whenever there were free cycles, especially when the department went to coffee or lunch breaks, and from five o'clock until eight the next morning it could continue on its merry way without being interrupted by interactive users. There was never any need for manual interference with the operations of the VMS job scheduler, and the times when I saw operators doing that at sites where I was consulting, they were doing more harm than good.

Fortunately, my manager decided to move on and was replaced with someone who saw the potential in what I was doing. And she was right: I was being deliberately obtuse. One, I didn't like her much and two, the arrogance and ignorance of those higher up the food chain really irritated me.

Here's a story that demonstrates, I think, how new all this stuff was. It was decided that we needed a program that would log out users if their terminals were inactive for thirty minutes. The reasons were: to make a data line available to someone who needed to do some work, and, secondarily and never really expressed, to provide a small measure of security so that someone couldn't use an abandoned, logged-in terminal. So, I reasoned in designing the job, I want this program to be as fast and efficient as possible. Otherwise, it would be counterproductive, taking computer cycles away from those who need them. (Some day soon I will tell you about the failure of a $300,000,000 project because someone didn't think the way I did.) The most efficient programming language was Assembler. With Assembler you were working at the level of the CPU, using only the instructions that were hard-wired into it. You can't get faster than that. It was a difficult slog for me, because I had had no training or experience with Assembler, but I worked my way through it one step at a time. When I was done, I had a slick little twenty-line program that scanned all the processes in a system, identified the ones that were interactive, and checked them against previous entries. If a process was not already in my minuscule database, the program created a new entry. If the process already existed, it compared the elapsed CPU time with what it had previously recorded for that process. If the difference was less than 0.02 seconds, it incremented a counter; if more than 0.02 seconds, it reset the counter to 0. The job ran every five minutes, so when the counter reached six, my program would order the process to commit suicide. By the way, it would send a warning message at the second-to-last cycle. It worked like a charm. It could handle a system with 200 logged-in users in less than 0.01 seconds every five minutes.
(Aside: you might want to remember those figures when I get to the story of the $300,000,000 boondoggle.)
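
For readers who like to see the logic laid out, here is a rough sketch of that idle check in Python. It is not the original Assembler, which I can't reproduce here, only an illustration of the bookkeeping described above. The names (scan, Entry, the snapshot dictionary of process IDs and CPU seconds) are made up for the example, and dropping entries for processes that have already logged out is my own assumption about the housekeeping.

```python
from dataclasses import dataclass

IDLE_CPU_THRESHOLD = 0.02  # CPU seconds between scans, the figure from the story
WARN_AT = 5                # second-to-last cycle: send a warning
LOGOUT_AT = 6              # six five-minute scans = thirty minutes idle


@dataclass
class Entry:
    cpu_seconds: float     # CPU time recorded at the previous scan
    idle_scans: int = 0    # consecutive scans with almost no CPU use


def scan(table, processes):
    """Run one five-minute scan.

    table     -- the small database: process id -> Entry (hypothetical layout)
    processes -- snapshot of interactive processes: process id -> CPU seconds
    Returns a list of (process id, "warn" or "logout") actions.
    """
    actions = []

    # Assumption: forget processes that have logged out on their own.
    for pid in list(table):
        if pid not in processes:
            del table[pid]

    for pid, cpu in processes.items():
        entry = table.get(pid)
        if entry is None:
            # First sighting: just record it.
            table[pid] = Entry(cpu)
            continue

        if cpu - entry.cpu_seconds < IDLE_CPU_THRESHOLD:
            entry.idle_scans += 1
        else:
            entry.idle_scans = 0
        entry.cpu_seconds = cpu

        if entry.idle_scans >= LOGOUT_AT:
            actions.append((pid, "logout"))
            del table[pid]
        elif entry.idle_scans == WARN_AT:
            actions.append((pid, "warn"))

    return actions
```

Calling scan once per five-minute cycle with a fresh snapshot is all there is to it; the caller decides how to deliver the warning and how to stop the process.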

There is a point to this story. There were also, at that time, many versions of interactive timeout programs available. My objection to all of them was that they were written in DCL, a command-line interpreted language. That meant every line of code in the program had to be deciphered by the system and turned into machine code that the CPU could process. It had to do this for every line every time it ran. My program could run hundreds of times before one of these clumsy elephants could lumber through a system. There was a 9-track tape containing such programs that VMS system managers across the country passed around. I submitted mine. There was another program on the tape called "Watch Dog." It was a monster of a DCL "program." It checked everything you could possibly think of to determine if a session was really unattended. It checked the amount of I/O and the amount of memory used; it kept track of interrupts and swapping activity. All in DCL. And all of the things it was checking required CPU time: ta dah! Mine had never logged out someone who was actually working, even though it was bare-bones. But guess which program systems managers grabbed for their own use? That's right: the one they could read and understand because it was written in their everyday working language, DCL.

Today there exists a very large corporation that writes and sells system management tools for all types of systems. It's called "Computer Associates." Google it. They are really big, though if you are not in the system management business you probably haven't heard of them. And you know the name of one of the earliest system management tools they marketed? That's right: Watchdog, folded into their PolyCenter product, now called CA UniCenter Solutions. That same clumsy over-written inefficient interpretative language piece of code that had been circulating for free along with my program. Mercifully, it has been rewritten (I hope in Assembler) and greatly expanded in capability, but the name is still there: CA System Watchdog for OpenVMS. Here's a link.

If quality counted for anything back in the early 1980s, well... maybe I'd be in my villa in Southern France counting my millions, though they could have bought the program from me for a lot less.