I mentioned before that as an engineer, I’m not fond of marketing. Image credit: Dilbert.com (Disclaimer: Liking the Dilbert comics is not an endorsement of Scott Adams’ politics) It’s not that I can’t be good at salesmanship either. I have a good grasp of communication skills and think I have a decent chance of writing good copy. My main issue is that I’ve been exposed many times to sales/marketing practices that just seem downright dishonest or scummy.
If I could give some advice to someone starting out in their software development career, it would be this: Don’t stay in the same place too long. The first company I worked at, I stayed with them for thirteen years, which I now feel was way too long. I have to admit, the work was hard and challenging, but I was young and had a lot of energy and was willing to work the long hours.
Another repost from my Quora answers, this time some info for anyone looking to move into programming. What are the pros and cons of making your career in programming? Pros: It is a very rewarding career financially. Software development often ranks in the top 10 highest-earning careers in most countries. There is a lot of scope - you could be developing web applications, mobile applications, embedded applications, client-side, server-side, data analysis, artificial intelligence, games, etc. It is very difficult to be bored.
Given my recent misgivings about Quora, I thought it might be a good idea to cross-post some of my answers from there into this blog, with some edits even. So here’s the first one! (stuff in italics were added during the cross-post) How can you read and study a large software project source code? Attacking a large, existing codebase that you are unfamiliar with can be a daunting endeavor.
I had some free time the other day so I randomly decided to post in the PH subreddit’s regular afternoon random discussion thread, asking for questions about software development. I ended up typing some longish answers, I thought I’d copy them over to the blog in case anyone was interested. TBH I meant more like StackOverflow type questions with specific technical problems, but I ended up answering mostly career-related questions, which is fine, but disclaimer: I don’t claim to be an expert, these are just my opinions on things.
Last October I participated in #Hacktoberfest, sponsored by DigitalOcean and Github. It’s a “celebration” to promote open source activity, and basically you just need to submit 5 pull requests to any github repository, and they give away swag to anyone who completes the activity. Microsoft held a [counterpart celebration] where they only require you to submit 1 pull request to any Microsoft repository. I’ve always wanted to start participating in Open Source, but it’s a bit difficult to find a good place to contribute (other than logging issues of course).
A while back I started a Twitter trivia bot as a weekend project. That bot is still up and running on Twitter, you can check it out there! But today, I thought I’d write about the answer-checking mechanism used by the bot. It was a bit interesting to me because it was the first nontrivial use I had for Django’s unit testing framework. I’m not too keen on unit testing web functionality (something I still have to learn), but this seemed an appropriate first use of a unit test framework for several reasons:
Someone responded to my post on things to learn in 2019 by asking how one finds the inspiration to learn all of the things. Well, my first answer was that those are just things I find interesting and may look into, but that’s not really an answer for the inspiration part. Software development is a very wide field, one where the amount of things you can learn increases daily, so it’s almost impossible to keep up with everything.
A while back we were tasked with helping a client’s internal dev team to migrate their repositories from Subversion to Git. The distributed VCS seemed ideal for their situation - they had a very small in-house dev team managing contributions from external subcontractors. The main rationale was that their process of merging contributions from the external developers was extremely complicated and often resulted in conflicts that were challenging to merge. Before this, I hadn’t actually used Git too deeply myself (aside from cloning stuff from Github), and especially not in a team setting, so the training one of our other engineers gave them was a good opportunity for me to become familiar with Git as well.
One of the things about self-identifying as a “Full Stack Developer” or “Solution Architect” is that there’s no shortage of things to learn, and oftentimes it’s good for you career-wise to at least have some passing knowledge of a bunch of technologies. It helps that I really like the field as well. I try to make sure I study or learn at least one new programming language or framework every year (though I am willing to stretch that definition as needed).
Although I still primarily identify as a “Full Stack Developer”, during the past few years I’ve also found myself in a role called “Solution Architect”. The thing about being a solution architect is that there isn’t really a clear definition of the role, what it involves, or the scope of responsibility. I suppose it depends largely on the organization and the project. The role mostly involves making technical decisions on a larger scale, like project-wide or organization-wide, rather than the micro day-to-day technical decisions involved in typical software development.
This year I had the dubious privilege of having to work with a C++ project again. Although my college education was in C, that was a completely different animal. I did self-study C++ for a bit back even before I was working, mostly because I was interested in game development even back then. I remember trying some OpenGL and/or DirectX stuff back with good old Borland Turbo-C++ during the DOS days and using the Dev-C++ IDE when I shifted to Windows.
For any nontrivial software project of at least moderate team size, there can be a significant cost to onboarding a new team member, especially at later stages when you are rushing to meet deadlines. The most significant cost is of course the communication overhead as described in The Mythical Man-Month. Fun story, the CEO of a company once told me they would add more developers to a delayed project to meet the deadline and when I pointed out the increased overhead he said to me that it wasn’t a problem because they would just assign modules to those devs that have minimal dependencies so they don’t have to communicate so much.
Text editors (and by extension IDEs) are a programmer’s best friend. I thought I’d look back at a number of text editors I’ve used over the years. (I grew up with Windows, so I won’t list vim/emacs/nano here, even though I’m at least a bit proficient with vim by now. That is, I know how to exit vim.) Notepad – of course, the default editor in Windows. The one we turn to when all else fails.
According to Malcolm Gladwell’s book Outliers, you need 10,000 hours of continuous sustained practice to become an expert. There are 168 hours in a week. If you never sleep and you eat as you practice, you can become an expert in 60 weeks. (Around 14 months) If you sleep 8 hours a day, you only have 112 hours in a week. If you eat as you practice, you can become an expert in 90 weeks.
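A quick sketch to double-check the arithmetic above. The function name and the choice to round up to whole weeks are just assumptions made for illustration.

```python
import math

TARGET_HOURS = 10_000      # the practice figure cited from Outliers
HOURS_PER_WEEK = 7 * 24    # 168 hours in a week


def weeks_to_expert(practice_hours_per_week: float) -> int:
    """Whole weeks needed to accumulate TARGET_HOURS, rounded up."""
    return math.ceil(TARGET_HOURS / practice_hours_per_week)


# Never sleeping: 10000 / 168 is about 59.5, so 60 weeks.
no_sleep = weeks_to_expert(HOURS_PER_WEEK)

# Sleeping 8 hours a day leaves 168 - 56 = 112 hours,
# and 10000 / 112 is about 89.3, so 90 weeks.
with_sleep = weeks_to_expert(HOURS_PER_WEEK - 8 * 7)

print(no_sleep, with_sleep)
```

Changing the hours subtracted per week (meals, a day job, and so on) shows how fast the number of weeks grows.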
SCM (Software Configuration Management) doesn’t just refer to version control for the software you’re building. It also means controlling the versions of software you depend on. This includes operating system and programming runtimes. Sometimes even minor version differences can cause issues in running your software. I have two example stories to share: One of our clients asked us for help with upgrading their production servers from CentOS 6.4 to 6.
Systemic change is difficult. I’m talking about software projects/systems, but there are a lot of parallels with societal systems too, like governments or states. I’ve been in large projects with hundreds of thousands of LOC where a lot of the code was painful to read and full of code smells and so on. It happens over time as projects get maintained by different developers and teams or different enhancements or changes are made.
Ten years ago this month, I started studying Django by trying to build my own blog application. I found the code lying around while I was going through some backups lately. It’s way out of date, it uses an early version of django. I thought of bringing it up to speed, but that didn’t seem practical. Instead, for archival purposes, I cleaned it up a bit and uploaded the code to a github repo.
Malcolm Gladwell, in an article from 1996 discussing the Challenger disaster, tells us: This kind of disaster is what the Yale University sociologist Charles Perrow has famously called a “normal accident.” By “normal” Perrow does not mean that it is frequent; he means that it is the kind of accident one can expect in the normal functioning of a technologically complex operation. Modern systems, Perrow argues, are made up of thousands of parts, all of which interrelate in ways that are impossible to anticipate.
Rockstar was in the gaming news recently because they mentioned that some of them had worked 100-hour weeks on their massive sequel to Red Dead Redemption coming out soon (no idea if I’ll play this). The idea of 100 hour weeks seemed insane to me, and it got me thinking: I’ve done some serious overtime before, have I ever gotten close to that amount of work in a week? Luckily, I didn’t have to speculate too much, because I had data (I love data).
Mentoring is one of those tasks that’s to be expected of anyone in a senior software development role. This usually involves reviewing other people’s code, helping them with tough technical issues, and even giving career advice. I’m not sure how good I am when it comes to mentoring other software developers. When I first became technical lead on projects, I got some evaluations from junior developers saying I can be “intimidating”.
As a programmer, I’ve always been a big fan of StackOverflow. I asked my first question there and also wrote my first answer in September 2008, which was the month the site launched, so I was pretty much there from the beginning. The site was a huge boon to programmers when it first came out, because the internet as a venue for asking questions and answers back then was a horrible fragmented landscape of small forums and mailing lists and sites like Experts Exchange, all of which were terribly designed.
While browsing through my old blog posts, I found one about my setup from 2010. I figured it was a good time to do an update. I like doing posts like these because it provides an easy reference for me to look back and see what I was working with at a certain point in time. What Hardware Do I Use? Desktop. I bought a new desktop rig back in late 2015, here are the specs:
I was thinking about my typical approach to coding. When writing a new feature, I tend to implement in the direction of where the data flows, starting from the user interface then to the backend and back to the frontend and wherever else that goes. I will build incrementally, using debugging tools or console printouts to ensure that each step is working correctly. As an example, here’s how I did a recent web-based function:
Random thoughts while walking at night: The structure of government can be a bit analogous to the structure of a software development project. The Constitution is like the requirements for a project. It’s kind of high-level and (I believe) shouldn’t be too detailed. Supposedly the requirements are written by the client. For a country like the Philippines the client is “we the sovereign Filipino people”. Slight tangent: I used to know this guy who was one of those rabid “we need to amend the constitution” types and he asked me to review a “mathematical model to track the budget as a function of tax collection and monetary policy” that he wanted to include in a proposed new constitution.
I recently attended a few training sessions for MarkLogic held at an office in a nearby business center. Now, I'll forgive you for not knowing what MarkLogic is, as even I hadn't heard of it before six months ago. MarkLogic is (apparently) the leading Enterprise NoSQL provider. NoSQL is big and sexy right now because of the supposed advantages in handling big data, and large web companies like Google and Facebook use a lot of NoSQL in the backend.
Back in 2004, I signed up for the Google Code Jam for the first time. Unfortunately I didn’t make it past the qualifying round. I was a bit luckier in 2008 and 2010, making it to round 2 both times. In fact in 2008 as I recall I was one of only two participants from the Philippines who made it to round 2, which allowed me to jokingly brag about being the #2 programmer in the country.
Recently, a developer needed to undergo a tech interview at US immigration:1 This may surprise some people I’ve worked with, but I didn’t have formal computer science training in school. I’m not actually a computer science major. Yet I’ve worked as a software developer for more than a decade now. Literally zero times have I needed to write a sorting function or balance a BST. I have a rudimentary understanding of some sorting algorithms (mostly just bubble sort and selection sort), and I have some idea of how to balance a BST.
I’ve been hesitant to try Python 3.x because it’s not backward compatible with Python 2.x which I’ve been using for scripting since forever. But recently I found out that since Python 3.3, they’ve included a launcher in the Windows version that supports having both versions installed. You can use the launcher to specify the Python version to use at the command line (it defaults to whichever version was installed first):
I had been meaning to try writing a Twitter bot for a while now. I figured a trivia bot would be pretty easy to implement, so I spent some time a couple of weekends to rig one together. It’s (mostly) working now, the bot is active as triviastorm on Twitter, with a supporting webapp deployed on http://trivia.roytang.net/. The bot tweets out a trivia question once every hour. It will then award points to the first five people who gave the correct answer.
There are a few things that one should consider when using and integrating an open source library into your application: What are the licensing terms for the library? There are some liberal licenses that mostly let you do anything you want. The MIT license is an example of a very permissive license. Other licenses may provide a number of restrictions. Can you integrate with closed-source software? Can you distribute binaries without the source?
Back when I was starting out as a software developer, webapps weren’t really a thing. Not as much as they are now anyway. My company provided training to new hires, but I didn’t get any web development training at the time, even though they already had a few web development projects in play at the time. Instead my initial training involved mostly development of so-called client-server software. This was software that was installed and run on the client machine but they would connect to a remote database server.
So after so many months of development you deployed your webapp to production and it’s up and running and everything is fine and you celebrate and your work is done right? Not really. Two days later you get an urgent support call in the middle of the night. (Your clients are halfway across the world.) They’re asking why the website is inaccessible. You check via your browser and sure enough there’s an error 500.
Hopefully by now most developers and project managers are well aware of The Mythical Man-Month and Brooks’ Law: “Adding manpower to a late software project makes it later.” The idea is that communications overhead scales up quickly as you add more people to a project. Oftentimes it is counter-intuitively not worthwhile to keep adding more people to try to catch up. Some implications of larger team/project size may not be immediately obvious.
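One common way to illustrate how communication overhead scales is to count the possible pairwise communication channels in a team, which is n(n-1)/2 for n people. The sketch below assumes every pair of team members may need to talk to each other, which is a simplification, and the function name is made up for illustration.

```python
def communication_channels(team_size: int) -> int:
    """Number of distinct pairs in a team of the given size: n * (n - 1) / 2."""
    if team_size < 0:
        raise ValueError("team_size must not be negative")
    return team_size * (team_size - 1) // 2


# Doubling the team roughly quadruples the number of possible channels.
for n in (2, 5, 10, 20):
    print(n, communication_channels(n))
```

Going from 5 to 10 people takes the count from 10 to 45 channels, which is why adding people late can cost more than it gains.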
Just a list I’ve been maintaining for a while: (Disclaimer: This list in no way implies that developers who don’t exhibit all of these attributes are terrible human beings who don’t deserve to live. But working with developers who exhibit many of these traits will probably result in a better experience over the course of your developer career.) Laziness, Impatience and Hubris – from the well-known (notorious?) Larry Wall quote Communicates well; is able to explain and communicate his ideas clearly, especially to nontechnical people; able to write good documentation Understands the concerns with scheduling and project management and communicates clearly with the team to avoid problems.
So the other day I was reworking a Python script that I had been using for years on my home PC to manage and categorize some downloaded files for me. This time I wanted to add some smarter behavior to make it more able to figure out when to group files into folders without constantly needing manual intervention from me. To do this, I needed to persist some data between runs – so that the script remembers how it categorized previous files and is able to group similar files together.
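As a minimal sketch of persisting data between runs, one simple option is a JSON file stored next to the script. This is only one possible approach under that assumption, not necessarily what the script described here ended up using, and the function names and file layout are hypothetical.

```python
import json
from pathlib import Path


def load_state(path: Path) -> dict:
    """Return the saved filename-to-folder mapping, or an empty dict on first run."""
    if path.exists():
        with path.open(encoding="utf-8") as f:
            return json.load(f)
    return {}


def save_state(path: Path, state: dict) -> None:
    """Write to a temporary file first, then swap it in to avoid a half-written file."""
    tmp = path.with_suffix(path.suffix + ".tmp")
    with tmp.open("w", encoding="utf-8") as f:
        json.dump(state, f, indent=2, sort_keys=True)
    tmp.replace(path)


def remember(path: Path, filename: str, folder: str) -> None:
    """Record which folder a file was categorized into (hypothetical helper)."""
    state = load_state(path)
    state[filename] = folder
    save_state(path, state)
```

On the next run, `load_state` returns the earlier decisions, so similar files can be matched against them before asking for manual input.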
In one of my most recent projects, a large system that had gone through a relatively long and unstable period of many, many changes due to sales demonstrations, different clients and whatnot, one of the “fun buffer tasks” I always kept around for devs was code cleanup. Because of the unstable nature of the project, there was always a lot of duplication, unused/unnecessary/obsolete classes/functions/files and so on. Unnecessarily large CSS files where most of the selectors were no longer really needed or JS libraries that weren’t actually used.
Related: Learning new skills While many people working as programmers/software developers are happy enough specializing in a single programming language or platform, I generally consider it a better idea to have a wider toolset and the ability to easily pick up new programming languages as needed. The benefits should be obvious: when you have a wide variety of tools under your belt and are able to quickly learn to use a new tool, the number of work options you have increases greatly.
In any reasonably large software project, the system will be so large that no one developer will have a good grasp of the details of every function in the codebase. The tendency is for developers to specialize – that is, developers tend to focus only on certain parts of the codebase and become more familiar with that part, while not having much knowledge about the other parts. This tendency is self-reinforcing – once it becomes known that the developer is an “expert” in the given module, there is a tendency that he will be assigned the most difficult and urgent tasks or fixes related to that module, further cementing his expertise.
I was in a meeting once with my boss (literally the CEO, a Malaysian) and some representatives of another company (Americans) where we were discussing the technical details of a possible future partnership. At one point, one of the Americans said to my boss that he was pleasantly surprised that I was openly speaking up independently of my boss and willing to correct him on some points when he didn’t quite get the technical details right.
This is a story of something I consider to be one of my worst mistakes in software product development. Some years ago I was asked whether it was feasible to write software that would be integrated with Software X that allowed us to export that software’s output into a format that was compatible with Standard Y. I took a look and after a while came back with “Well sure. We could use Programming Language M that has an API that lets us integrate into Software X so we can export the output data.
“Button for non-service floor does not light up.” For more than a decade I regularly went to an office building where the elevators verbally spouted this nonsense message whenever you tried to go to a floor that the current elevator car did not service. For context, the elevators in the building were zoned programmatically – this means that they only service a particular subset of the floors that are provided on the elevator panel itself.
There was this project we had where there was a strange bug. The developer working on it found that the problem only appears when the record ID was 12. When it was 11 or less, everything was fine. When it was 13 or more, everything was also fine. After some investigation, it was found that there was some code that executed with a condition of “if record id == 12”, which was already a WTF.
The software development process is already difficult mainly because a lot of it is so imprecise. Requirements are often only vague wishes that the client has, with no regard to the sheer number of instructions needed to implement those requirements. Throughout the entire process it’s important to use feedback loops to determine whether development is on the right path. And like all feedback loops, their effectiveness often hinges on how quickly we are able to turn around and give and incorporate feedback into future iterations.
In Tagalog: “Madali lang naman diba?” Probably one of the most annoying things a programmer can hear, especially from a client or a manager who has no appreciation of how complex software development is. It’s presumptuous at best and actively damaging to schedule and morale at worst. We already know estimation is hard, there is no need to make it more complicated by automatically assuming the best-case scenario (or in many cases, an impossible scenario)
“Composition over inheritance” is an object-oriented programming principle that I’m sad to say many devs I’ve encountered aren’t too familiar with. Composition provides greater flexibility, modularity, and extensibility in large software systems as compared to inheritance, especially for statically typed languages like Java that don’t support multiple inheritance. The most common examples of the problems caused by too much inheritance involve generic objects, such as the game objects example in the wikipedia page linked above.
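To make the idea concrete, here is a minimal sketch of composition using game objects. All class and attribute names here are hypothetical and chosen only for illustration. Instead of building a deep class hierarchy, each object is handed the behaviors it needs.

```python
from dataclasses import dataclass, field


class Movable:
    """Behavior: moves the owning object along x each update."""

    def __init__(self, speed: int) -> None:
        self.speed = speed

    def update(self, obj: "GameObject") -> None:
        obj.x += self.speed


class Visible:
    """Behavior: records a draw call for the owning object."""

    def update(self, obj: "GameObject") -> None:
        obj.log.append(f"draw {obj.name} at x={obj.x}")


@dataclass
class GameObject:
    name: str
    x: int = 0
    behaviors: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def update(self) -> None:
        # Delegate to each composed behavior in order.
        for behavior in self.behaviors:
            behavior.update(self)


# A player moves and is drawn; a tree is only drawn. No subclasses needed.
player = GameObject("player", behaviors=[Movable(speed=2), Visible()])
tree = GameObject("tree", x=5, behaviors=[Visible()])
player.update()
tree.update()
print(player.log, tree.log)
```

Adding a new kind of object (say, something movable but invisible) is just a different list of behaviors, rather than another branch in an inheritance tree.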