Working with Client-Server Programs

Back when I was starting out as a software developer, webapps weren’t really a thing. Not as much as they are now anyway. My company provided training to new hires, but I didn’t get any web development training, even though they already had a few web development projects in play at the time.

Instead my initial training mostly involved development of so-called client-server software. This was software that was installed and run on the client machine but connected to a remote database server. Up until the early 2000s most enterprisey-type systems used these kinds of programs.

I was trained in using mostly two tools: Borland Delphi and Oracle Forms and Reports. These were the sort of tools that would have been billed as reducing the need for specialized programmers or developers. They featured drag and drop user interfaces to design forms or report layouts. The Oracle tools featured robust database integration that let you drag and drop to associate form fields to database tables and fields. Supposedly even people with minimal training would be able to do the work of application programmers.

In practice of course, the only people patient enough to work with these tools in-depth were the programmers themselves. And client requirements always eclipsed the capabilities of these tools, such that expert programmers were still needed to push the tools to their limits and beyond.

Tools like Oracle Forms and Reports did not play well with modern software development practices. For one thing, the “source” files weren’t actually text files. They were mostly binary format files with some PL/SQL interspersed, and could only be opened in Oracle’s own proprietary IDE.

Binary formats meant that while they could reap some of the benefits of source control (namely version history and such), actually figuring out the differences between revisions involved opening up both versions in the IDE and comparing each item/script/setting in the two files. Doing things like global find and replace was also a pain in the ass, especially in systems that had hundreds of forms. It was so bad I remember spending some time trying to reverse engineer the binary format so that I could attempt to make some of this work easier. No dice.

Another issue was that Oracle’s IDEs were notoriously buggy. Their Reports IDE in particular often had me incensed. If you dragged the wrong thing to the other wrong thing, you would crash the IDE entirely. In fact, we had a list of things to avoid doing because they would crash the IDE. The built-in text editors were so bad, I often had an entirely separate text editor open in a separate window so I could copy-paste PL/SQL code there for any complicated edits. (Back then we were for some reason fond of Crimson Editor, although it wasn’t as full-featured as UltraEdit or as reliable with updates as Notepad++.) This was actually also sometimes a problem, as some versions of the Reports IDE also had a crash on paste.

Most of my first two years of work involved maintenance of Oracle Forms-based systems (with some Delphi work thrown in occasionally). I didn’t get an introduction to web development until late into the second year of my career. I was so impressed with web development I made a mock-up of one of the really complicated screens from one of our Oracle Forms projects and it seemed so pretty. I secretly hoped we could convince our clients to port them over to a web application. (1) No dice again; and (2) I probably would have regretted such a thing.

I know I would have regretted it because some number of years later, we got a project to migrate an existing client-server system to a web application. The original application was written in PowerBuilder. I figured it was a fairly straightforward reverse-engineer and implement as a webapp, but nooooo. One of the higher-ups decided that to save on costs, we should look into attempting to automate the porting process somehow. We were to write translation software that would take the PowerBuilder source and convert it into the appropriate web application screens.

This was ridiculous for a good number of reasons, but as a tech lead, I had to look into it and had to prove it was in fact not viable (Spoiler alert: it was not). The first concern was that PowerBuilder also had a binary format. This was solved when I found that someone had already written a tool to export PowerBuilder binaries into some kind of text format. It was a bit crude, but it was a way forward.

The second concern was that client-server form behaviors do not map well to web applications. My favorite example of this is editable grids. These are Excel-like grids that were commonplace in client-side systems. And when users upgrade their legacy systems to web apps, they inevitably expect that their editable grids should work the exact same way on the web as they did on the client-side forms.

Back then JavaScript-backed editable grids were still in their infancy, and each one we tried had its own set of problems and limitations which would lead to some number of client complaints. Some clients even wanted editable grids that could accept copy-pastes from Excel! Yeah, we had to develop Google Sheets! I personally had to roll my own editable grid framework from scratch at least once too. I ranted so often against clients’ editable grid requirements that a few developers often quoted something I said in one of our internal chat rooms:

    <jaeger> you say “editable grid” i hear “problems”

Anyway yeah, porting client-side behavior to a web framework would not have been straightforward. At least not to the way our internal web development framework liked to do things. In order to have any kind of reasonable mapping, I would have needed to build some kind of JavaScript rich-client framework (or see which ones of the emerging ones we could use). That also meant rewriting a lot of the backend code to support a bunch of new operations. It was a lot more work than the expected output of “a series of steps to easily or automatically migrate PowerBuilder source code to web applications.”

When I raised that the idea was not viable, I was asked to give a more detailed explanation. This involved taking one of the more complex PowerBuilder source files, putting it into an Excel spreadsheet, and explaining line by line how and why each line could or could not be translated into a web application. As an alternative to an automated porting program, they wanted a guide or process so simple that even junior developers with minimal experience could follow it and port the programs quickly. I think they eventually went with this approach. (Which was closer to our normal reverse-engineer-and-reimplement methodology.)

These days web applications are the norm and client-server programs are a thing of the past. They were relics of a different age. The browser is our universal client now, on the desktop at least. On mobile, “apps” are effectively client-server programs, though they have a different set of advantages and limitations. Maybe in the future mobile apps will converge as well, into a unified browser client – although probably mobile browsers need to be more feature-rich before this convergence can take place. My historical disdain for client-server programs carries over to mobile apps though – I don’t like to use too many of them, although that is a story for another blog post.

Handling unexpected errors in web applications

So after so many months of development you deployed your webapp to production and it’s up and running and everything is fine and you celebrate and your work is done right?

Not really.

Two days later you get an urgent support call in the middle of the night. (Your clients are halfway across the world.) They’re asking why the website is inaccessible. You check via your browser and sure enough there’s an error 500. You have to ssh into the server. Half-asleep, you have to quickly comb through the log files to figure out if there’s something you can do about it right now or you have to sadly tell the client that the team has to handle it in the morning.

Nobody expects software errors. And nobody can predict how they will happen. (If we could do that, there wouldn’t be any errors!) But in general you should stay ahead of future headaches during development time by having robust error handling and reporting principles in place for your development team to follow.

There are two classes of errors:

  • Validation errors – checks on user inputs. These errors are expected, and the most important thing here is to present clear error messages that tell the user what was wrong and how they can correct it.
  • Everything else is an unexpected error, which we can break down further according to how the development team would react:
    • “That shouldn’t happen!” – the broadest category, usually caused by human errors in coding/configuration or hardware problems. Examples include: database failure, typos in configuration files, different data formats used by different developers, unexpected system crashes and so on.
    • “Hmm, oh yeah, we didn’t think of that.” – i.e., design flaws or unhandled cases. These are errors caused by program flow or usage that was not anticipated in the program’s design but are considered valid use.
    • “That can’t possibly happen!”, also popularly known as “We have no clue how that happened.” Most unexpected errors start out in this category; then, after some detective work (some developers use the term “debugging”), they move to one of the above categories.

Unexpected errors may also be either recoverable (the user can either attempt to repeat the operation to continue or use other functions in the system) or catastrophic (the entire system or subsystem is completely unusable.)

Ideally, what happens when an unexpected error occurs? There are a number of things that probably need to be considered:

  1. The user needs to be aware that it’s not an error on his part and that the problem needs to be reported to support or the system administrator.
  2. The user needs to be presented with enough information to give to the support team to allow them to determine the cause and fix the error.
  3. The program needs to capture what information it can to help the support team find and fix the error.

For client-side errors (typically JavaScript problems, but can also be problems in HTML/CSS rendering), all of the above are difficult, since the mechanisms by which you present #1 and #2 and by which you can do #3 may be compromised by the error. Typically for any client-side errors, you are almost entirely reliant on the user’s description of the error and the steps that led to it – not the most reliable source of information, but you make do with what you have. If you are unable to reproduce the error during debugging, you are in for a whole lot of speculation and trial and error.

Server-side errors are a bit easier to deal with than client-side errors. The first two items are typically handled by having a catch-all error page in the application that gives an error code and message for unexpected errors.
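A minimal sketch of such a catch-all, written in Python for illustration. The `ValidationError` class, the `handle` wrapper, the `divide` handler and the response dictionaries are all hypothetical names, not part of any particular web framework. The point is only that validation problems get a specific message, while everything else gets a generic message plus a reference code the user can pass on to support.

```python
import uuid


class ValidationError(Exception):
    """Expected error: the user can fix the input and try again."""


def handle(request_handler, request):
    """Run a handler and turn any failure into a response dictionary."""
    try:
        return {"status": 200, "body": request_handler(request)}
    except ValidationError as err:
        # Expected: tell the user what was wrong.
        return {"status": 400, "body": str(err)}
    except Exception:
        # Unexpected: not the user's fault, so give them a code to report.
        reference = uuid.uuid4().hex[:8]
        return {
            "status": 500,
            "reference": reference,
            "body": "Something went wrong on our side. "
                    f"Please contact support and quote error code {reference}.",
        }


def divide(request):
    # Hypothetical handler used only to trigger both kinds of error.
    if "divisor" not in request:
        raise ValidationError("Please enter a divisor.")
    return 10 / request["divisor"]
```

Calling `handle(divide, {})` produces the validation message, while `handle(divide, {"divisor": 0})` produces the generic page with a reference code. In practice the same reference code would also be written to the log so that support can find the matching entry.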

For the last item, it generally means logging, either to a file or a database-backed audit trail (the file is more reliable). Generally we log the error/exception encountered, including the stack trace, and some log messages on each request/page access. A majority of the time, having the stack trace is sufficient. The access logs allow us to back-trace previous steps which might have contributed to the problem. Any special conditions encountered in the code should also be logged – this helps to identify errors that can only happen under some combination of special conditions.
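As a rough illustration of the logging described above, here is a sketch using Python's standard `logging` module. The logger name, the `process` function and the request fields are made up for the example; `logger.exception` is the standard call that records a message at ERROR level together with the current stack trace.

```python
import io
import logging

# Log to an in-memory stream so the example is self-contained;
# a real application would use logging.FileHandler with a log file path.
log_stream = io.StringIO()
handler = logging.StreamHandler(log_stream)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

logger = logging.getLogger("webapp.example")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False


def process(request):
    # One line per request lets support back-trace the user's earlier steps.
    logger.info("request user=%s path=%s", request["user"], request["path"])
    try:
        return 100 / request["quantity"]
    except Exception:
        # logger.exception logs at ERROR level and appends the stack trace.
        logger.exception("unexpected error on path=%s", request["path"])
        return None


process({"user": "alice", "path": "/orders", "quantity": 4})
process({"user": "alice", "path": "/orders/split", "quantity": 0})
log_text = log_stream.getvalue()
```

After the two calls, `log_text` holds one access line per request, followed by the error message and the full traceback for the failing one.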

You don’t want to log too much information (what to log might need to be a separate post all on its own), but just enough to help the support team determine the cause of the problem.

Sometimes, the error is specific to the data being used by the users, or happens only in the environment they are using. And the support team investigating the problem will often not have access to that data or that environment, so your logging and the error reports have to be good enough to allow them to deduce the conditions that caused the error.

Error handling for deployed systems is already hard – so make it easier on the future support team by having standard error handling and logging mechanisms.

Problems in Large Software Dev Teams

Hopefully by now most developers and project managers are well aware of The Mythical Man-Month and Brooks’ Law:

Adding manpower to a late software project makes it later

The idea is that communications overhead scales up quickly as you add more people to a project. Oftentimes it is counter-intuitively not worthwhile to keep adding more people to try to catch up. Some implications of larger team/project size may not be immediately obvious. Some problems scale up faster as team/project size grows:

  1. Lower productivity due to increased overhead as mentioned above.
    • Meetings will tend to involve more people and take longer
    • There will be a lot more emails
    • Project management effort scales up quickly too
    • More people need to be allocated to maintaining builds and servers
    • More time needs to be spent on task prioritization, bug triage, etc
    • More people asking WTF happened to their code (LOL)
    • Any decision making that requires consensus building takes longer
    • It becomes more difficult to find the right person to ask things
  2. Simply due to the number of people, there are more things that could go wrong
    • Developers breaking the build happens more often
    • People going on sick days will happen more often
    • Server performance becomes much more important since any delay or downtime affects more people
    • Schedule delays or other unexpected problems will be more likely
  3. Maintainability becomes more important
    • Technical debt becomes more burdensome and poor code is more likely to come back and bite you in the ass in the future
    • The need for good coding and development standards increases
    • Higher likelihood of code duplication (“I didn’t know that Developer R already wrote a function that does X!”)
    • More important for code to be well-decoupled, to reduce the likelihood of one developer breaking a lot of things
  4. Source control gets harder to use, with so many people making so many changes.
    • The team needs to develop standards for commit messages and linking commits to bug reports, to make it easier to track and monitor changes
    • Source control commit comments need to be a lot more helpful or descriptive.
    • The more commits happening in the same amount of time, the more you need to be constantly updating from the repository.
    • Merging is more likely to become difficult and complicated (may be made easier by modern source control systems)
    • More important to use more, smaller files instead of fewer large files (less likely to produce conflicts)
    • Need better coding/programming standards. Otherwise you have the problem of changes/commits being difficult to track, for example if one developer uses a different autoformatting standard (his commits will have many small reformats)
  5. Having consistent rules for naming, UI, and other things becomes more important
    • The more developers you have, the more likely that they will have different ways of thinking. Conflicts are far more likely among a team of 8-10 developers than among 2-3 developers.
    • It becomes more important to have a standard or plan for where different kinds of files should be placed. Otherwise you run into problems like different developers using different folders for their css or different package naming conventions, etc.
    • Consistency and standards more difficult to enforce (since there are more devs)
    • Need to keep things consistent on all levels: databases, code, UI, and so on.
  6. Documentation becomes more important
    • Tribal knowledge is often spread out among multiple developers
    • Undocumented things are less likely to be passed on to new developers
    • Developers unaware of undocumented things are more likely to have difficulties or to break things
    • Becomes a lot more difficult to absorb new developers into the team in times of urgency
    • Documentation more likely to quickly become out of date due to rapid pace of changes
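
To put rough numbers on the communication overhead behind Brooks’ Law: if every pair of people on a team may need to talk to each other, the number of possible pairs in a team of n people is n(n-1)/2. A small sketch in Python (the function name is just for illustration):

```python
def communication_paths(team_size):
    """Number of distinct pairs of people in a team of the given size."""
    return team_size * (team_size - 1) // 2


for size in (3, 10):
    print(size, "people:", communication_paths(size), "pairs")
```

A team of 3 has 3 possible pairs, while a team of 10 has 45, so roughly tripling the headcount multiplies the potential lines of communication by fifteen.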

Anything you want to add?

Qualities to Look for in a Software Developer

Just a list I’ve been maintaining for a while:

(Disclaimer: This list in no way implies that developers who don’t exhibit all of these attributes are terrible human beings who don’t deserve to live. But working with developers who exhibit many of these traits will probably result in a better experience over the course of your developer career.)

  1. Laziness, Impatience and Hubris – from the well-known (notorious?) Larry Wall quote
  2. Communicates well; is able to explain and communicate his ideas clearly, especially to nontechnical people; able to write good documentation
  3. Understands the concerns with scheduling and project management and communicates clearly with the team to avoid problems. This means: willing to speak up as soon as any problem is encountered that introduces any kind of risk; not bloating estimates or pretending that tasks take longer than they really do; not cutting estimates to make managers happy
  4. Cares about writing elegant code; understands the risks involved with code that is complicated or difficult to maintain; Understands the importance of data structures, algorithms and design patterns
  5. An attitude towards learning and self-improvement; owns up to his own faults; ready and willing (and often excited) to pick up new domains or technologies (and advises you of the appropriate schedule risk); Can easily pick up and learn new technologies and programming languages; recognizes and understands programming principles and able to carry them across to different domains or technologies; Able to study or learn new topics with minimal guidance
  6. Understands engineering tradeoffs; able to tell you the differences in performance, storage, etc among different options.
  7. Works well with others: Willing to help with other people’s work when possible/needed; Has an open mind and willing to consider other people’s suggestions; Doesn’t take criticism personally; Chill AF
  8. Able to think logically and sequentially; able to break down a problem into a discrete set of solvable tasks; Able to investigate and find the cause of problems with minimal info; Able to think outside the box when necessary; Able to point out problems or logical inconsistencies with program requirements
  9. Able to read and understand and maintain other people’s code; Can update code with the minimal possible changes to avoid breaking things

Any other suggestions?


The Simplest Code That Can Do The Job

So the other day I was reworking a Python script that I had been using for years on my home PC to manage and categorize some downloaded files for me. This time I wanted to add some smarter behavior to make it more able to figure out when to group files into folders without constantly needing manual intervention from me. To do this, I needed to persist some data between runs – so that the script remembers how it categorized previous files and is able to group similar files together.

Now since my software development career has largely been as an enterprise-y kind of developer, my first thought was to just use a database to store the data. I already had a MySQL installation on my machine so that was fine, I just needed Python to interface with it. After looking up how to do it, I balked at having to install a new Python library just to connect to MySQL and reconsidered.

As programmers, we have a tendency sometimes to over-engineer solutions because that’s what we’re used to doing. Did I really need a database for this? The data won’t be very big, and I won’t need to do any sort of maintenance on it, so maybe a simpler solution was in order.

I ended up just using pickle, which was already built into Python:

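import pickle

DATABASE_FILE = 'series.pickle'  # placeholder; the real path is not shown here

def load_db():
	# Start with an empty map if nothing has been saved yet (first run).
	try:
		with open(DATABASE_FILE, 'rb') as handle:
			return pickle.load(handle)
	except FileNotFoundError:
		return {}

def save_db(all_series):
	with open(DATABASE_FILE, 'wb') as handle:
		pickle.dump(all_series, handle, protocol=pickle.HIGHEST_PROTOCOL)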

(Above code probably gives you an idea what kind of files I’m sorting…)

As an added benefit, I didn’t need to design any database schemas or tables or whatnot; pickle just lets me serialize the map as-is and reload it later from disk without any hassle.
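
For illustration, a run of the script might use those two functions roughly like this. This is a self-contained sketch, not the actual script: the functions here take the file path as a parameter and fall back to an empty dictionary when no file exists yet, and the series name and folder path are invented for the example.

```python
import os
import pickle
import tempfile


def load_db(path):
    """Return the saved dictionary, or an empty one if nothing was saved yet."""
    try:
        with open(path, "rb") as handle:
            return pickle.load(handle)
    except FileNotFoundError:
        return {}


def save_db(path, all_series):
    """Write the whole dictionary to disk in one go."""
    with open(path, "wb") as handle:
        pickle.dump(all_series, handle, protocol=pickle.HIGHEST_PROTOCOL)


db_path = os.path.join(tempfile.mkdtemp(), "series.pickle")

# First run: nothing saved yet, so we start empty and record a decision.
all_series = load_db(db_path)
all_series["Some Show"] = "Videos/Some Show"
save_db(db_path, all_series)

# Later run: the earlier decision is still there.
remembered = load_db(db_path)
```

The whole “database” is one dictionary that gets read at the start of a run and written back at the end, which is plenty for data this small.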

I guess my lesson here was: don’t over-complicate things when something simple will work fine. Write the simplest code that can do the job.

Client and Server Validation in Web applications

Because of the nature of the web and the fact that you should never trust user input, all the validation in a web application should be done on the server side. You can additionally provide validation on the client side (via JavaScript), but this is only a concession towards a better user experience and should not be used as a substitute for server-side validation.

One would think that anyone with a basic understanding of how HTTP works would understand the above easily and any failure to practice it should be considered amateur hour. But in shops where most of the testing is done manually, developers can easily fall into the habit of adding the client-side validations (since failing to do so would earn them a bug report) and forgetting the server-side validations altogether.

The main problems are that (a) HTTP requests can be spoofed, they do not need to have come from a form submitted via a browser; and (b) even for forms submitted via a browser, the JavaScript validations may have been tampered with on the client-side.

For explicit validations for which you wrote out some logic (for example: email address must be so-and-so format), it is obvious that you need that on the server side. But for some classes of validation you may forget to handle them especially if they do not explicitly generate errors in the webpage on the client-side.
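
As a sketch of what an explicit server-side check might look like, here is a Python example with a deliberately simple pattern. The field name `email`, the function name and the error wording are assumptions for the example, and a real application would likely lean on its framework's own validators.

```python
import re

# Deliberately loose: something@something.something, with no spaces.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_signup(form):
    """Return a dict of field name -> error message; empty means valid."""
    errors = {}
    email = form.get("email", "").strip()
    if not email:
        errors["email"] = "Email address is required."
    elif not EMAIL_PATTERN.match(email):
        errors["email"] = "Email address must look like name@example.com."
    return errors
```

The check runs on whatever values arrive in the request, regardless of whether the browser already ran the same rule in JavaScript.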

First example: when the contents of a drop-down list are dependent on some other value on the form. On the client-side you probably already restrict the choices such that the user is unable to select an invalid combination so it doesn’t look like a check is needed. But on the server-side, you still have to check that the choice submitted for the drop-down field is a valid value given the other values submitted in the request.
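
A minimal sketch of that server-side check in Python. The country and city fields and the `CITIES_BY_COUNTRY` lookup are hypothetical; in a real system the valid combinations would probably come from the database.

```python
# Hypothetical lookup: which cities are valid for each country.
CITIES_BY_COUNTRY = {
    "PH": {"Manila", "Cebu"},
    "MY": {"Kuala Lumpur", "Penang"},
}


def validate_city(form):
    """Return an error message, or None if the combination is valid."""
    country = form.get("country")
    city = form.get("city")
    if country not in CITIES_BY_COUNTRY:
        return "Unknown country."
    if city not in CITIES_BY_COUNTRY[country]:
        return "City is not valid for the selected country."
    return None
```

A request with country `PH` and city `Penang` can never come from the unmodified form, but the server still has to reject it.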

Second example: when you hide or disable certain fields in the web page depending on some other value on the form. Same as above, you don’t need to add a specific check on the client-side since the user is already prevented from doing so by the UI. But on the server-side, you have to make sure not to save or process any values from those hidden/disabled fields if the other values on the form indicate they shouldn’t be processed.
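
One way to sketch this in Python: build the data to be saved from only those fields that apply, instead of copying everything that was submitted. The `delivery_method` and `address` fields are invented for the example.

```python
def clean_order(form):
    """Keep only the fields that apply given the other submitted values."""
    cleaned = {"delivery_method": form.get("delivery_method")}
    if cleaned["delivery_method"] == "courier":
        # The address field is only shown (and only meaningful) for courier.
        cleaned["address"] = form.get("address", "")
    # For any other method, a submitted "address" is ignored, not saved.
    return cleaned
```

With this shape, an `address` value smuggled into a pickup order is simply dropped before anything is saved or processed.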

Weak validations on the server side are dangerous because at the very least they will create bad data in your system and at the very worst may expose you to security vulnerabilities.

Cleaning up your Code

In one of my most recent projects, a large system that had gone through a relatively long and unstable period of many, many changes due to sales demonstrations, different clients and whatnot, one of the “fun buffer tasks” I always kept around for devs was code cleanup. Because of the unstable nature of the project, there was always a lot of duplication, unused/unnecessary/obsolete classes/functions/files and so on. Unnecessarily large CSS files where most of the selectors were no longer really needed or JS libraries that weren’t actually used. That kind of thing.

It’s one of those things that you’ll never get official approval from management to do, so you have to somehow sneak it in during your daily tasks. But it’s important for a couple of reasons:

  • Having too much cluttered code makes your system a lot harder to grok. That means new developers will have a much higher learning curve, and existing developers will find it difficult to be assigned tasks in modules and functions they’re not familiar with. Lower understanding means more bugs, lower quality and so on.
  • Having a lot of unused files, classes, functions, etc. bloats the build process (making build times longer, extending development cycles) and makes build files bigger (extending deployment times)

A lot of developers prefer not to throw away old code, for fear that “we might need it later”. They would prefer to just comment it out in large blocks (making the code a lot more unreadable) or just leave dangling functions/classes unused. The reason is hogwash of course, since you should be using source control, and source control means never being afraid to delete old code. (Of course, you should make sure what you’re deleting is really no longer in use!)
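
As a rough first pass at checking whether something is still in use, here is a sketch in Python that lists functions defined in a piece of source code but never referenced by name within it. It uses the standard `ast` module; the function name and the sample source are made up, and it only looks at a single file, so anything called from other files or looked up dynamically will show up as a false positive.

```python
import ast


def unused_functions(source):
    """Return names of functions defined in source but never referenced in it."""
    tree = ast.parse(source)
    defined = set()
    referenced = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            defined.add(node.name)
        elif isinstance(node, ast.Name):
            referenced.add(node.id)
        elif isinstance(node, ast.Attribute):
            referenced.add(node.attr)
    return sorted(defined - referenced)


example = '''
def used():
    return 1

def forgotten():
    return 2

print(used())
'''

candidates = unused_functions(example)
```

Here `candidates` contains only `forgotten`. Treat the result as a list of things to investigate rather than a list of things that are safe to delete.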


Learning a New Programming Language

Related: Learning new skills

While many people working as programmers/software developers are happy enough specializing in a single programming language or platform, I generally consider it a better idea to have a wider toolset and the ability to easily pick up new programming languages as needed. The benefits should be obvious: when you have a wide variety of tools under your belt and are able to quickly learn to use a new tool, the number of work options you have increases greatly.

Happily, programming languages share a lot of similar constructs. Only your first programming language (when you first learn programming) should provide you with any difficulty – once you’ve cleared that hurdle, learning additional programming languages shouldn’t be too much of a concern.

You typically start with syntax, variable declarations, function declarations and program flow (loops, conditionals and so on). Some languages may have a strange syntax that doesn’t share much in common with other programming languages, but that’s pretty much not a concern as long as you have access to a modern-day compiler that will tell you when you stray from the desired syntax.

I find that learning a programming language is best done the same way you actually program – iteratively. Learn something new, try it out, modify it a bit, try it out, and so on. I once had to help someone prepare for a programming interview where he would be expected to know C++. He had learned it before in college but hadn’t used it for a few years, and for some reason he stuck to reading up on it instead of taking my suggestion to install a compiler and actually try things out. (He ended up passing the interview, but that was luckily because they didn’t ask too much about syntax.)

It also helps a lot to learn about theory and terms and whatnot. Like for object-oriented programming, it helps to know and understand the concepts behind polymorphism, inheritance, abstract classes and so on. When studying different programming languages, you can then easily compare how they are done in one language vs another, making it easier for you to carry over design concepts to the new language you are learning. Being familiar with the terms and vocabulary also helps you communicate better with mentors, teachers, and fellow learners studying the same language.

Building small toy applications with the new programming language is a great way to learn too. Or maybe if you had a certain project in mind that you wanted to do, you could use the new language with it. Slightly related: When I was consulting with a startup on developing a new product, I asked the CTO whether he preferred to use a technology stack that the developers were more familiar with (to reduce learning costs) or whether he wanted to try this different language that neither of us had tried before, and he told me that one of the great things about being in a startup was that he could choose to do new projects with new technologies to expand their horizons and not have to listen to higher-ups shut it down for fear of increased costs.

For the longest time, I tried to learn at least one new programming language/platform a year. For 2016, it was Unity/C#, although I’ve also started studying Node.js in the past month or so. I hope it’s something I’m able to keep up, even as I’m trying to explore new skills other than programming.

Generalists and Specialists in Dev Teams

In any reasonably large software project, the system will be so large that no one developer will have a good grasp of the details of every function in the codebase.

The tendency is for developers to specialize – that is, developers tend to focus only on certain parts of the codebase and become more familiar with that part, while not having much knowledge about the other parts. This tendency is self-reinforcing – once it becomes known that the developer is an “expert” in the given module, there is a tendency that he will be assigned the most difficult and urgent tasks or fixes related to that module, further cementing his expertise. Thus, the developer becomes a sort of specialist within the system.

In contrast to a specialist, you will also sometimes have developers who prefer to be generalists. That is, they are comfortable working with any part of the system, although their familiarity and knowledge are probably not as deep as the specialist for any given module.

Both generalists and specialists are valuable in different situations. If you need a complicated change done quickly on a particular module with minimum impact, it’s best to have a specialist who is very familiar with how everything works. On the other hand, generalists are very useful from a resource management perspective, since they can jump in to help at any time in any part of the codebase. Say, if your specialist is sick or out of town and you urgently need to do a small change, the generalist can probably take it on no problem.

Ideally, you train more than one specialist per module of interest in your system, through some sort of mentoring or maybe pair programming, but not all dev teams have that luxury (mostly due to schedule or resource constraints). It’s best for software dev teams to find the right mix of generalists and specialists that their particular development process requires.

Power Distance in Software Development

I was in a meeting once with my boss (literally the CEO, a Malaysian) and some representatives of another company (Americans) where we were discussing the technical details of a possible future partnership. At one point, one of the Americans said to my boss that he was pleasantly surprised that I was openly speaking up independently of my boss and willing to correct him on some points when he didn’t quite get the technical details right. It seems they were used to working with some Indian outsourcing firms, where due to cultural differences, the tendency was for the Indian guys to accept everything the Americans asked for without question and deliver it exactly as requested, even if there were obvious problems.

The concept is called Power Distance. Cultures with a higher power distance are more likely to accept the authority of “higher-ups” without question, while in cultures with lower power distance, people feel less of a gap with people of “higher” status and are thus more willing to speak up openly.

I believe that I live and work in a country with a high power distance. It is typical of workers here to have an exchange like:

“Why are we doing this, isn’t it kind of dumb?”

“Because the boss says so.”

“Oh, ok”

Not just with people in “higher” positions, but especially with foreigners. I witnessed this firsthand when I observed how other people behaved when they first had to work with our project managers who were based in another country; many would be hesitant to raise their concerns directly with their foreign counterparts.

In an industry where users and clients and management often do not really understand the finer technical details of what exactly they want to happen, being able and willing to raise concerns regardless of differences in position or status is not only a distinct advantage, it may very well be an important aspect of the job. All the best developers I’ve worked with are the ones who are willing to call out problems, and it’s a trait I personally encourage in anyone I work with.