Does Source Maintenance Need to Be So Difficult?


How can we significantly improve the maintenance process?

 

If someone could do a time, motion, and thought study of what maintenance programmers actually do, what might be learned, and how could the maintenance process be improved? That would be a two-step process, and actually, the first step has been done to a significant degree. Researchers have studied what maintainers actually do and think. But the lessons learned have not yet been fully exploited to improve the maintenance process.

 

So what can be learned from a time, motion and thought study of maintenance programmers? One of the more interesting studies was documented in "An Exploratory Study of How Developers Seek, Relate, and Collect Relevant Information During Software Maintenance Tasks" by Andrew Ko, Brad Myers, et al. This study was interesting in that it was a lab experiment with well-conceived maintenance tasks and programmer-monitoring techniques. The results track with those of other studies but are provided in more detail and analysis than most.

 

Key Concepts

 

Before I go into the results, let me first explain a few key concepts about software maintenance and the process of program comprehension that programmers go through:

 

  • Program modeling is the programmer's attempt to understand the control flow and data flow of the program; in large programs, the programmer may limit this activity to what are perceived as relevant areas of code. Programmers follow various strategies in doing this.

  • Feature location is the process of finding the code pertaining to a given feature that needs to be understood or maintained in some way. Programmers follow various searching strategies, relying on tool facilities, their own knowledge, the use of beacons to point the way to relevant code, etc.

  • Dependency graphing is the process of mapping various types of dependencies; call dependencies concern calls to and from various modules, direct dependencies regard finding where items are defined, and indirect dependencies regard finding impacting or impacted code from a given location.

 

Depending on circumstances, a programmer may limit his or her attempt to build the mental program model to only the relevant sections of code. In large programs, it may not be feasible to model the entire program. Typically, the programmer will attempt to find all relevant feature locations and then go through a confirmation process that all relevant code has been found and understood.

 

If the programmer is following such a strategy of not modeling the entire program, then he or she often launches right into the process of feature location (finding the relevant code).

 

The Study

 

In this particular study, five maintenance tasks were given to a sampling of developers using a fairly recent version of Eclipse to maintain Java code; some tasks were defect corrections, and some were enhancements. These tasks required the typical processes of program modeling, feature location, etc. The authors also contrived a means of regularly interrupting programmers to simulate the typical pattern of interruptions found in a work environment. Here's the average breakdown of how the developers divided their time performing the five assignments:

  • 27 percent: searching and navigating code

  • 21 percent: reading code

  • 20 percent: editing code

  • 12 percent: testing changes

  • 20 percent: switching between applications, reading technical documentation, reading task instructions, etc.

 

You can see from these numbers that the programmers spent half their time (48 percent) just navigating and reading code, trying to understand it. Other studies have found similar results. Let me emphasize this point so it sinks in: maintenance programmers spend half their time trying to understand code.
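The arithmetic behind that 48 percent figure can be checked in a few lines. This is just a tally of the percentages listed above; the dictionary keys are my own shorthand labels, not terms from the study.

```python
# Average share of time per activity, in percent, as listed above.
time_share = {
    "searching and navigating code": 27,
    "reading code": 21,
    "editing code": 20,
    "testing changes": 12,
    "other (switching applications, documentation, instructions)": 20,
}

# The five categories should account for all of the measured time.
total = sum(time_share.values())

# "Understanding the code" here means navigating plus reading.
comprehension = (
    time_share["searching and navigating code"] + time_share["reading code"]
)

print(f"total accounted for: {total} percent")
print(f"navigating and reading: {comprehension} percent")
```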

 

Now, let me pause at this point and pose a rhetorical question to you, the reader: do the source tools that you use work in support of this fact? Are they fully oriented to the primary tasks, searching and navigating, that maintenance developers do?

 

Given that this study was done using a leading source tool, Eclipse, to maintain Java, we in the RPG world should not feel particularly slighted if we have to answer "no." However, it is pretty easy to imagine that the searching and navigating numbers are substantially higher for anyone still using green-screen source tools.

 

What else was noteworthy from this study? Here's something that jumped out at me: developers were significantly prone to making errors in work that was interrupted. If they did not complete a coding task to a "save point," then they were easily misled about where they had left off, and in a number of cases they proceeded as if they had never done the unsaved work. By the way, the authors of the study used a statistic on programmer interruptions supported by two workplace studies: programmers in particular are interrupted, either by themselves or by others, every three minutes. Combine that with the fact that they easily lose track of their mental state when interrupted, and you've got a good opportunity for process improvement right there.

 

What about all the searching and navigating the programmers did? That was the single most time-consuming measured task: 27 percent! What were they doing? The short answer is that they were searching for relevant information. You know what? An average of 88 percent of the searches led to irrelevant information! Why would that be? To understand that, you've got to look at what prompts a developer to initiate a search. (Typical searches might be looking for a subroutine, finding the definition of a field or data structure, tracking a key list, tracing call and data dependencies, etc.) Developers initiate a search when they think there is relevant code in another location; they do this in response to clues they pick up from such things as variable names, comments, knowledge of standards or patterns, etc. What the researchers found is that the developers were often acting on bad clues to initiate searches. The lack of good clues leads to wasted searches and wasted time. An average of 36 percent of the time spent inspecting code at search destinations was completely wasted because the developer did not immediately realize the search was irrelevant.

 

Another interesting fact about navigation: 28 percent of navigations were performed to return to a recent previous location. In fact, developers often have to search all over again to get back to where they were one or two steps prior. This is clearly a very unproductive use of time.
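To make the cost concrete, here is a minimal sketch of the kind of location history a source tool could keep so that going back is one step instead of a new search. The class name, method names, and the member/line pairs are all hypothetical; nothing here comes from the study or from any particular editor.

```python
class NavigationHistory:
    """Remembers visited (member, line) locations so the tool can step back."""

    def __init__(self):
        self._back = []       # earlier locations, oldest first
        self._current = None  # where the developer is now

    def visit(self, member, line):
        """Record a jump to a new location."""
        if self._current is not None:
            self._back.append(self._current)
        self._current = (member, line)

    def back(self):
        """Return to the previous location; stay put if there is none."""
        if self._back:
            self._current = self._back.pop()
        return self._current


history = NavigationHistory()
history.visit("PGMA", 120)   # hypothetical source members and line numbers
history.visit("PGMB", 40)
history.visit("PGMA", 300)
first_back = history.back()
second_back = history.back()
third_back = history.back()  # nothing earlier is recorded, so no move
```

A real tool would likely also want a forward stack and some persistence across sessions; this sketch covers only the step back.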

 

Is There a Better Way?

 

But here's a question: is it really necessary for the developer to do a search to see what he or she is looking for? Can't the source tool be intelligent enough to simply show the developer information about whatever is under the cursor, perhaps in a tool tip or a side panel? You can see how much time, and mental overhead, would be saved if the developer had this information immediately available and displayed without searching.

 

Stepping back from the details a bit and focusing on the bigger picture, a lot of developers' work is about feature location. A given feature in a system may be implemented across multiple subroutines and multiple programs. The developer must locate and comprehend all the code that comprises a feature. To visualize it a little more, a feature may consist of 10 lines of code in PgmA-SubrM, 10 lines of code in PgmA-SubrT, 20 lines of code in PgmB-SubrX, and so on. When the developer is attempting to comprehend the feature's code, how does the source tool support that effort? Unfortunately, most source tools do nothing to support this, even though it is a vital activity of software maintenance. The developer should be able to assemble all feature-related code into some kind of view that is at once comprehensible and persistable.

 

As developers attempt to find the source code for a given feature, they engage in an activity that has been called program slicing. What this means is that they attempt to slice away the sections of the program that are not relevant to the feature, in the end leaving (mentally) just the code for the feature. Going through this process involves a lot of navigation to build the dependency graph model in the developer's mind. These dependencies, again, occur in two primary flavors: control flow and data flow. Control flow is about program, subroutine, and procedure calls, and at the lowest level, the control of statement execution through IF and DO types of statements. Data flow is about the path the data follows from files through variables, on to other variables, and back to files, and so on. Data flow is a logical subset of control flow, but it includes additional operations as well.
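As a rough illustration of slicing, the sketch below computes a backward slice over straight-line assignments only: it keeps the statements that can affect one variable and drops the rest. It is a deliberately simplified model, assuming no branches, loops, or calls, and the statement list and variable names are invented for the example.

```python
def backward_slice(statements, variable):
    """Return the 1-based numbers of statements that can affect `variable`.

    `statements` is a list of (target, [sources]) pairs in execution order.
    Only straight-line code is modeled; control flow is ignored.
    """
    relevant = {variable}
    kept = []
    # Walk backward from the end, following each relevant value to its origin.
    for index in range(len(statements) - 1, -1, -1):
        target, sources = statements[index]
        if target in relevant:
            kept.append(index + 1)
            relevant.discard(target)   # this statement defines the value
            relevant.update(sources)   # so its inputs become relevant instead
    return sorted(kept)


# Hypothetical statements: (variable assigned, variables read).
STATEMENTS = [
    ("subtotal", ["qty", "price"]),           # 1
    ("discount", ["subtotal", "pct"]),        # 2
    ("shipDate", ["orderDate", "leadDays"]),  # 3
    ("total", ["subtotal", "discount"]),      # 4
    ("label", ["custName"]),                  # 5
]

slice_for_total = backward_slice(STATEMENTS, "total")
print(slice_for_total)
```

Statements 3 and 5 are sliced away because nothing that feeds `total` depends on them, which is the mental pruning the developer otherwise does by hand.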

 

A great deal of developer time is spent trying to understand the relevant control and data flows. And again, I know this is getting repetitive, but what do source tools do to facilitate this fundamental maintenance process? Well, how about a less discouraging, more helpful question: what could they do? To begin with, side-by-side source views have been shown to be very effective and heavily used by developers when available. A typical need is to view a calling section of code side by side with the called section of code; that's a start, and some tools provide that. If, for example, you are looking at an EXSR statement, shouldn't you be able to tell the source tool--with, say, one click--to open the called subroutine in a side-by-side panel? No navigation; just do it. Or the reverse: you're looking at a BEGSR statement and want to see the code that calls it without a lot of searching. Can't this be done? Or have a quick and effective means to see the ENDs of IFs and the IFs of ENDs and so on? Or see all the IF, ELSE, and DO conditions that apply to a given statement? Couldn't an intelligent source tool save a developer a great deal of time in tracking down all these things? This functionality would go a long way toward helping the developer build the mental model of control flow.
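The EXSR and BEGSR lookups need very little machinery. Here is a minimal sketch, assuming free-format RPG where each statement sits on its own line (`exsr Name;`, `begsr Name;`); the function name and the sample source are made up for illustration, and fixed-format source or statements split across lines would need a real parser.

```python
import re
from collections import defaultdict

# Matches free-format "begsr Name;" and "exsr Name;", ignoring case.
_SUBROUTINE = re.compile(
    r"^\s*(begsr|exsr)\s+([A-Za-z_#$@][\w#$@]*)\s*;", re.IGNORECASE
)


def index_subroutines(lines):
    """Map each subroutine name to its BEGSR line and to its EXSR call sites."""
    definitions = {}
    callers = defaultdict(list)
    for number, text in enumerate(lines, start=1):
        match = _SUBROUTINE.match(text)
        if not match:
            continue
        opcode = match.group(1).lower()
        name = match.group(2).upper()  # RPG names are not case-sensitive
        if opcode == "begsr":
            definitions[name] = number
        else:
            callers[name].append(number)
    return definitions, dict(callers)


# Hypothetical source member, one statement per line.
SAMPLE = [
    "exsr CalcTax;",        # 1
    "if total > 0;",        # 2
    "  exsr PrintLine;",    # 3
    "endif;",               # 4
    "exsr CalcTax;",        # 5
    "begsr CalcTax;",       # 6
    "  tax = total * rate;",  # 7
    "endsr;",               # 8
    "begsr PrintLine;",     # 9
    "endsr;",               # 10
]

definitions, callers = index_subroutines(SAMPLE)
```

With that index in hand, "open the called subroutine" is a lookup in `definitions`, and "show me who calls this" is a lookup in `callers`; neither requires the developer to search.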

 

The same applies to data flow. Couldn't a great deal of time be saved if the source tool showed both upstream and downstream data flows, i.e., where MOVE or EVAL operations act on a given variable? Couldn't all that be immediately available at the developer's fingertips?
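A sketch of what that could look like for one variable follows. It assumes free-format assignments of the form `target = expression;` (with an optional leading `eval`), one per line, and treats every name in the expression as a source. MOVE operations, built-in functions, data structure subfields, and file I/O are not handled, and the sample lines are invented.

```python
import re

# A simple assignment: optional "eval", a target name, "=", an expression, ";".
_ASSIGN = re.compile(
    r"^\s*(?:eval\s+)?([A-Za-z_#$@][\w#$@]*)\s*=\s*(.+?)\s*;\s*$",
    re.IGNORECASE,
)
_NAME = re.compile(r"[A-Za-z_#$@][\w#$@]*")


def data_flow(lines, variable):
    """Return (upstream, downstream) uses of `variable` in simple assignments.

    upstream:   (line, [names read]) where the variable receives a value
    downstream: (line, target) where the variable's value feeds another one
    """
    wanted = variable.upper()
    upstream, downstream = [], []
    for number, text in enumerate(lines, start=1):
        match = _ASSIGN.match(text)
        if not match:
            continue
        target = match.group(1).upper()
        sources = {name.upper() for name in _NAME.findall(match.group(2))}
        if target == wanted:
            upstream.append((number, sorted(sources)))
        if wanted in sources:
            downstream.append((number, target))
    return upstream, downstream


# Hypothetical calculation, one statement per line.
SAMPLE = [
    "subtotal = qty * price;",   # 1
    "tax = subtotal * rate;",    # 2
    "total = subtotal + tax;",   # 3
    "if total > limit;",         # 4
    "  total = limit;",          # 5
    "endif;",                    # 6
    "invTotal = total;",         # 7
]

upstream, downstream = data_flow(SAMPLE, "total")
```

The point is not the parsing, which is crude here, but that both directions of the flow can be computed once and shown next to the cursor instead of being rediscovered by repeated searches.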

 

The Art of Maintenance

 

This article is not meant to be all-inclusive of the ways in which source tools can be vastly improved by being oriented to the maintenance process, but it should give a good introduction to the concept. I think the industry has long been excessively enamored of new development, overlooking the fact that the majority of the programming work being done today is maintenance work.

 

Over the last 20 years, there has in fact been a fair amount of academic research done on the subject, and much of it remains to be applied to real-world practices. Getting software maintenance tools up to speed with what maintenance programmers actually do could lead to substantial savings for IT departments. Once the question of what maintenance developers actually do is understood and acted on, an even larger question will remain: what should maintenance developers do?
