Thinking of your code as data with ASTs

ASTs (Abstract Syntax Trees) are an appealing tool for developers. If you don't know what they are, see my references below; it's worth getting a basic understanding so that you'll know when they could help you. One much-discussed use of ASTs is automatically refactoring code with so-called codemods.
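As a quick taste of what "code as data" means, here is a minimal sketch using Python's built-in `ast` module. Python is only a stand-in here (it ships with a parser, so nothing needs installing), and the helper name `node_types` is my own invention for illustration.

```python
import ast


def node_types(source):
    """Parse source code and return the type name of every node in its AST."""
    tree = ast.parse(source)
    return [type(node).__name__ for node in ast.walk(tree)]


# One line of code becomes a small tree: an assignment whose value
# is a binary operation on a name and a constant.
print(node_types("total = price * 2"))
```

Once the code is a tree of typed nodes like this, questions such as "where is this name used?" become ordinary data queries.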

Another compelling application for them is to gain insights into your code or to help speed up tasks that your team needs to do every day. For example, a tool currently being worked on at Accelo is a component-to-URL tracker. What's the use case? Let's say a dev is fixing a bug, and in the process, they make a modification to the component `CoolComponent`. Given the name of this component, the tool gives a series of URLs that use the component. The URLs can be used for testing, either by the dev or passed on to our QA team.

It's worth mentioning that the tool is highly specific to our codebase, as it has to deal with all of the following:

  • Perl and Template Toolkit render some of the HTML.
  • The rest is mostly Angular 1.x code.
  • There is a broad spread of styles within our Angular code.
  • There is some jQuery in dark corners.
  • We've recently started adopting React.
  • The tool also understands our file structure.

All of which means there would be little benefit in sharing the tool.

Without the tool, regression testing a component directly in our app is either guesswork that is likely incomplete, or a gruelling task of chasing all the links through files to find each use. The latter option is more robust, and it is what the tool now does for us. What we've done is take developer knowledge and embed it in the tool so that it no longer exists only in our heads.

Here lies a subtle benefit of such a tool: it becomes a kind of living documentation. If I were instead tasked with documenting the steps a dev needs to take to trace a component to its URLs, I would be hesitant. That kind of documentation tends to be forgotten, go out of date and become a source of misinformation. Embedding the knowledge in a tool keeps it up to date for as long as the tool is used. And given everything our tool has to deal with (multiple frameworks, mixed styles and so on), there is now a lot of dev knowledge bundled into it.
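To make the link-chasing idea concrete, here is a rough sketch of how it could work. This is not our actual tool: the component names, template names and URLs below are all made up, and I'm assuming an earlier AST pass has already produced a "used by" map. The sketch just walks that map upwards from a component until it reaches pages that have URLs.

```python
from collections import deque

# Hypothetical "is used by" edges, the kind of data an AST pass might emit.
USED_BY = {
    "CoolComponent": ["InvoicePanel", "ContactCard"],
    "InvoicePanel": ["invoice_page.tt"],
    "ContactCard": ["contact_page.tt", "InvoicePanel"],
}

# Hypothetical map from page templates to the URLs that render them.
URLS = {
    "invoice_page.tt": "/invoices/view",
    "contact_page.tt": "/contacts/view",
}


def urls_for(component, used_by, urls):
    """Breadth-first walk up the "used by" graph, collecting reachable URLs."""
    seen = {component}
    queue = deque([component])
    found = set()
    while queue:
        current = queue.popleft()
        if current in urls:
            found.add(urls[current])
        for parent in used_by.get(current, []):
            if parent not in seen:  # guards against cycles and repeat visits
                seen.add(parent)
                queue.append(parent)
    return sorted(found)


print(urls_for("CoolComponent", USED_BY, URLS))
```

The hard part in practice is building the "used by" map, since that is where all the framework-specific knowledge lives. The traversal itself stays small.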

So that's our use-case. You could do the same for your codebase too, or here are some other ideas that might be useful to your team:

  • A tool that counts the uses of each component, to keep track of the most used ones.
  • A tool that warns if a component is used more than X times (say, 3) without sufficient tests.
  • A tool that alerts you when a component is no longer used (depending on how advanced you want it to be, it could even delete the component and submit a pull request).
  • A tool that warns you if a name is used twice (you might want to avoid this for components, for example).

Please note the above is probably biased towards web development (as that's what I do). The point is to think of code as a source of data, to pull useful information out of it and, in the process, to document knowledge.

AST tooling exists for most modern programming languages (JavaScript, Python, Go and Rust, to name a few), so go forth and parse your codebase. Fun fact: Perl does not have an AST (I know this because I work with Perl). If you've ever heard the joke that Perl is a write-only language, it seems that extends to parsing tools too; they can't read it. That means there are only three things in the world that understand Perl code: you, for a five-minute window immediately after you wrote it, the Perl engine, and probably Larry Wall.

References
A good place to start for ASTs
Accelo
Perl's lack of an AST
