The limits of hand-coding in the Big Data era

Traditionally, when the time comes to make a technology decision, organisations face two options: a custom-coded approach or a tool-based one. The widespread adoption of big data technologies, especially Hadoop, is forcing a similar decision. By François Mero, Senior VP of Sales EMEA, Talend.

Friday, 6th January 2017. Posted by Phil Alsop.
Those rare companies that decide to engage in hand-coding do it either because they lack the required information (they simply don't know there are packaged solutions out there) or, frankly, in my opinion, because they are chasing a pipe dream. They often mistakenly believe that all a big data project takes is to feed data into a Hadoop cluster. They figure that developing a dedicated on-premise infrastructure to feed and analyse big data will mean a 20% saving over a packaged solution. However, according to Gartner, this approach results in a 200% increase in maintenance costs. And that's not all. Our experience over the past ten years with several integration projects, now in production, gives us insight into the limits of a custom-coded approach. In reality, by developing in-house, companies expose themselves to three types of risk: operational, strategic, and economic.
 
Keeping pace with technology changes
 
Hadoop is an open source framework without licensing fees. While it may be tempting to bring the technology in-house and develop your own infrastructure, believing it will save you money, in reality this strategy is often doomed to failure.
 
On the one hand, the big data ecosystem is constantly changing. Technology platforms are renewed at a steady pace as the needs of companies change. Within a few years, these platforms have gone from MapReduce to Storm to Spark and, maybe soon, to Apex. The innovation velocity of the big data ecosystem represents an unprecedented challenge for application maintenance and for those using the technologies. If the infrastructure set up to collect, enrich, and handle data is not sustainable and cannot ensure ease of maintenance and upgradability, the projects that rely on it are at risk. Sooner or later, they'll have to be dropped due to obsolescence, high maintenance costs, or the eventual realisation that high-value IT resources are spending increasing amounts of time on tasks that don’t add value.
 
On the other hand, the proliferation of data sources makes governance and management of data even more complex. A non-tooled approach becomes tedious and time-consuming at best, and impossible at worst. But more than that, security, governance (for tracking movements and transformations applied to the data), data quality, testing, deployment, regulatory obligations, etc. are all areas that benefit from packaged solutions designed and thought through for the most frequent company use cases. Our industry invented these tools to help customers make their projects more secure, ensure good governance, and boost the productivity of IT teams. Why would these good practices, which are so crucial to the success of IT projects, disappear in the world of big data?
 
Other economic and strategic risks
 
We've seen it time and time again: maintenance costs rise as hand-coded projects multiply. Moreover, the developers, already known for the rarity of their skills and thus their high cost, are assigned to maintenance and upgrade tasks, so they can't be put to better use elsewhere, on new projects. This stifles innovation. After all, the whole point of packaged solutions is the flexibility and speed they bring to development. So companies then have to recruit new developers to launch new projects, or even outsource them, and all of this comes at a high price.
 
Hand-coding, as discussed above, makes companies highly dependent on experts qualified in these technologies. And what happens if they decide to leave the company? How do you make sure strategic systems remain sustainable then?
 
Keep an eye on the project size
 
Hand-coding is an option as long as the projects are simple and very specific and don't require a lot of maintenance. Very rarely, it may even be the only way to go if there's no solution on the market able to meet a specific need.
 
But if the size of a project requires multiple specialists working within the company, packaged solutions offer more guarantees: a graphical programming environment designed for ease of use by business users, the option to reuse previous development work, vendor support for setup, etc. And most of all, companies benefit from the finely tuned skills of vendors who were among the very first companies to adopt Hadoop.
 
Gartner, in the study mentioned above, tells us that only 11% of companies remain supporters of hand-coded development. Hopefully, I’ve helped highlight why that’s the case.