Ten years ago, engineer and investor Marc Andreessen wrote in his now-famous Wall Street Journal article that "software is eating the world." These days industry visionaries are predicting a future where artificial intelligence (AI) will eat software. But what does this really mean? Is software really just lunch for AI?
One thing is certain: we’re at the beginning of a profound change in the way software is developed. Let's examine this technological transition, explore the advantages and challenges of the Software 2.0 way of doing things and take a look at the opportunities it is opening up.
A lot of our code is already in the process of being transitioned from Software 1.0 (code written by humans) to Software 2.0 (code written by AI, typically in the form of deep learning).
Deep learning is one of the most significant technological breakthroughs we’ve seen in the last decade. Today, many of the most widely used software applications use deep learning to generate real value, and neural networks represent the beginning of a fundamental shift in how we develop software and what it can do.
The models that drive AI don’t require the sort of predetermined structure of traditional software. In fact, one of the things that they are best known for is transforming disorganized and unstructured data into structured data.
Software 2.0 is very effective at working with data sources like images, video, text and audio, and in recent years just about every important advancement in these areas has come from Software 2.0. Hand-crafted algorithms are no longer competitive in these cases because AI can do better.
Programmers have traditionally built systems by carefully and painstakingly instructing the computer exactly what to do: if this happens, then do that. They give computers this information in a language that is unforgivingly logical and deterministic.
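To make this concrete, here is a minimal sketch of that Software 1.0 style: a hypothetical spam filter whose every rule is written out by hand (the word list and threshold are illustrative assumptions, not a real product's logic).

```python
# A hand-written (Software 1.0) spam filter: the programmer spells out
# every rule explicitly, and the behavior is fully deterministic.
SUSPICIOUS_WORDS = {"winner", "free", "urgent", "prize"}

def is_spam(message: str) -> bool:
    words = message.lower().split()
    hits = sum(1 for w in words if w.strip(".,!?") in SUSPICIOUS_WORDS)
    # Rule: flag as spam if two or more suspicious words appear.
    return hits >= 2

print(is_spam("You are a winner! Claim your free prize now"))  # True
print(is_spam("Meeting moved to 3pm"))                         # False
```

Every behavior here was decided in advance by a person; the program can never handle a case its author didn't anticipate.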
The world has built a huge amount of sophisticated tools that assist humans in dealing with the many unique challenges and opportunities that exist when writing code. Integrated Development Environments (IDEs) offer features like syntax highlighting, debugging, profiling, git integration and many other features that have come to be standard and expected components of the software developer’s toolkit.
The programming process is slow, tedious and error-prone; anyone who has written computer code knows the experience of sitting in front of a computer screen for days staring at a program that should work, but doesn’t. Even more frustrating, many of us have had the experience of a trusted and well-functioning program that suddenly falls apart when it encounters some slightly unexpected “corner case.”
With Software 2.0, we don’t really write code anymore. Instead, we program by example. Programs are generated by analyzing large amounts of data, identifying patterns in this data and creating models of this data based on these patterns. We collect many examples of what we want the program to do and what not to do, label them appropriately and train a model to interpret new inputs based on this information.
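The same idea can be sketched in miniature with a toy model learned entirely from labeled examples. This is not any particular framework's API; it is a hypothetical nearest-centroid classifier on made-up fruit measurements, chosen only to show "programming by example" in a few lines.

```python
# Software 2.0 in miniature: instead of writing rules, we collect labeled
# examples and let the model infer the decision boundary from the data.
# Hypothetical task: classify fruit as "citrus" or "berry" from
# (diameter_cm, rind_thickness_mm) measurements.

def train_nearest_centroid(examples):
    """Learn one centroid per label from (features, label) pairs."""
    sums, counts = {}, {}
    for features, label in examples:
        counts[label] = counts.get(label, 0) + 1
        s = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            s[i] += x
    return {label: [x / counts[label] for x in s] for label, s in sums.items()}

def predict(centroids, features):
    """Assign the label of the closest centroid (squared Euclidean distance)."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(features, centroids[label]))
    return min(centroids, key=dist)

training_data = [
    ([8.0, 4.0], "citrus"), ([7.5, 5.0], "citrus"),
    ([1.5, 0.2], "berry"),  ([2.0, 0.3], "berry"),
]
model = train_nearest_centroid(training_data)
print(predict(model, [7.0, 4.5]))   # citrus
print(predict(model, [1.8, 0.25]))  # berry
```

The "program" here is the learned centroids: add more labeled examples and the behavior changes, with no code edited at all.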
Since data is the backbone of teaching these models, Software 2.0 developers need help with sourcing, accumulating, cleaning, labeling and visualizing these datasets. They not only use this data to train the model, but must also be able to measure model performance, explainability and drift.
So what would a “Software 2.0 IDE” look like?
It's significantly easier to collect data than to explicitly write programs. This is one of the key insights driving the adoption of Software 2.0. But software engineering is not going away anytime soon. For the foreseeable future, both Software 1.0 and 2.0 need to co-exist.
Traditional programming tools and methodologies are well understood. The systems for building data pipelines and deploying machine learning (ML) systems, on the other hand, are under active development. AI platform companies are applying knowledge gleaned from agile software development to a new paradigm of agile model development.
We are uncovering better ways of developing software powered by AI/ML models. Software, data and modeling teams are brought together at key interaction points, collaborating on cyclical phases of an “AI Lifecycle” to take full advantage of AI/ML-enabled software deployment.
What parts of software programming can be moved to the deep-learning-based Software 2.0 framework, and what should remain in the traditional 1.0 framework?
In the new paradigm of Software 2.0, much of the attention of a software developer shifts from designing an explicit programming algorithm to designing and curating large datasets.
However, these systems are only as good as the training data they are learning from. In many cases, machine learning systems are limited by human-caused flaws in the training data. Improving a model’s performance frequently involves implementing a solid deployment environment, as well as maintaining a high quality stream of training data.
Developers need a monitoring system to ensure that the code which is written actually works. We know that deep learning neural networks do well in supervised learning settings, and by supervised we (mostly) mean supervised by people. If human beings can provide training data with both good and bad examples – or at least review and edit the ones generated by machines – these models can learn the patterns and provide correct outputs.
Software development has had job roles such as business analyst, systems analyst, architect, developer, tester and development-operations (DevOps). These roles reflect the scoping, design, development, operations and maintenance phases of the software development lifecycle.
With the emergence of machine learning models and the paradigm of Software 2.0, we see a number of new responsibilities arising. Data Scientists, Data Engineers, ML Engineers and MLOps are just a few of the most common titles that you'll see. These roles are a hybrid of software engineering, software operations, statistics, machine learning and data management.
The existing body of engineering talent must start looking at the world differently, and there is a familiar set of resource problems faced in the shift to Software 2.0: lack of skilled people, trouble finding the right use cases and (especially in this case) difficulty finding data. In fact, in this scenario the role of software engineer might morph into “data curator” or “data enabler.”
Software 2.0 will change the way that software is developed. Just like the way your phone checks spelling and suggests words based on the context of your conversation, when you’re writing code, errors can be highlighted and suggestions for whole lines or entire functions can be offered as substitutions.
These tools will have a direct impact on the way that software developers do their jobs and will also greatly expand the scope of what software can accomplish in the first place. As we've already mentioned, much progress has been made in working with unstructured data types like images, video, text and audio. These data types can be “understood” by computers in ways that were never possible before. Some of the most exciting use cases in the field include:
AI models are being used every day to filter harmful image, video, text and audio content out of user-generated content streams. Advertisers are able to find off-brand or poor-quality content, profanity and toxic speech in text posts; even inappropriate text embedded in images can be detected and moderated.
A wide range of use cases are based on identifying faces, comparing faces, searching faces and verifying identity based on faces. Facial recognition technology is being used to provide secure access to schools, airports and offices.
Airlines, manufacturers and agricultural businesses are using computer vision technology to save maintenance and inspection costs and increase the lifespan of capital assets. Equipment monitoring, maintenance scheduling, asset planning and asset efficiency are all well-positioned to see significant benefits from Software 2.0.
Software 2.0 can fail in unintuitive and unexpected ways. Since AI is “made out of data” in a certain sense, there is an ongoing need to examine the quality of the data used in training models. Understanding this data can be challenging when datasets consist of millions of observations and models contain millions of parameters.
Diversity and bias
Addressing the risks of ML and AI will require cross-functional teams. These teams need to include different kinds of expertise – security, privacy, compliance, ethics, design and domain expertise – and they will also need to include people from different social and cultural backgrounds. One socio-cultural group might accept a given scenario without thinking twice, while another group would find it completely unacceptable.
Machine learning raises the question of explainability; you may not be able to explain why your software does what it does, and there are many application domains (for example, medicine and law) where explainability is essential. Software 2.0 can be used to automate many processes that were traditionally done by people, and it's important to be able to explain why a system made a given decision.
Finally, it's simply not possible to build a machine learning system that is 100% accurate. If you train a system to label your digital assets, how many of that system’s decisions will be incorrect? These systems might make fewer errors than a human, but we’re more forgiving of the errors humans make.
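The question "how many decisions will be incorrect?" is itself answerable by measurement: compare the model's predictions against a human-reviewed sample. The labels below are made up purely to illustrate the arithmetic.

```python
# Estimating a labeling model's error rate against a human-reviewed sample.
# (All labels and predictions here are invented for illustration.)
human_labels = ["cat", "dog", "cat", "bird", "dog", "cat", "bird", "dog"]
model_labels = ["cat", "dog", "dog", "bird", "dog", "cat", "cat",  "dog"]

errors = sum(h != m for h, m in zip(human_labels, model_labels))
error_rate = errors / len(human_labels)
print(f"{errors} errors out of {len(human_labels)} ({error_rate:.0%})")  # 2 errors out of 8 (25%)
```

Whether a 25 percent (or 2 percent) error rate is acceptable depends entirely on the application, which is exactly why these systems need ongoing human review.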
Will machines replace software engineers altogether? It's more likely that Software 2.0 will get us 90 percent of the way there, and even this kind of achievement is still a long way off. Neural networks aren’t a silver bullet; rather, we need to design these AI tools to work with other solutions.
Many aspects of software development will work well with deep learning, and others won't. In the short term, Software 2.0 will become increasingly prevalent in any situation where data is plentiful and cheap, and where the algorithm itself is difficult to design explicitly.
Software 2.0 is driving a fundamental shift in the way that we interact with computers, and new tools and development methodologies will be needed to manage this process. Human resources will need to follow suit to ensure that their companies are hiring the right people, in order to keep up with technological progress.